Bleeding Edge - Accept All Pull Requests#22

Open
Tauheed-Elahee wants to merge 32 commits into JoshEngels:main from Tauheed-Elahee:bleeding-edge

Conversation

@Tauheed-Elahee

An attempt to consolidate mergeable pull requests into a single branch that can be used in place of the main branch.

This pull request solves issue #21

JasonGross and others added 28 commits March 12, 2026 21:34
As per
https://github.com/matplotlib/matplotlib/blob/42b88d01fdd93846d925b1097167d36ea31c7733/doc/api/prev_api_changes/api_changes_3.9.0/removals.rst?plain=1#L147,
`legendHandles` was removed in matplotlib 3.9 after previously being
deprecated in favor of `legend_handles`.  Since `requirements.txt`
requires matplotlib 3.9, we just use the new name.

Fixes JoshEngels#12

Done with
```
git grep --name-only legendHandles | xargs sed -i.bak 's/legendHandles/legend_handles/g'
```
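The commit simply renames the attribute, which works because `requirements.txt` pins matplotlib 3.9. For code that must run across matplotlib versions, a version-agnostic accessor is an alternative; this is a sketch of that approach, not part of the PR:

```python
def get_legend_handles(legend):
    """Return a legend's handles across matplotlib versions.

    `legend_handles` is the new name; the old `legendHandles` alias
    was deprecated and then removed in matplotlib 3.9.
    """
    if hasattr(legend, "legend_handles"):
        return legend.legend_handles
    return legend.legendHandles
```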
…_devices 2" when there aren't 2 GPUs

```
  File "/home/jason/MultiDimensionalFeatures/multid/lib/python3.10/site-packages/transformer_lens/HookedTransformerConfig.py", line 315, in __post_init__
    torch.cuda.device_count() >= self.n_devices
AssertionError: Not enough CUDA devices to support n_devices 2
```
Also allow passing `--device cpu`, `--device mps`, etc. to
`circle_probe_interventions.py`, and remove the special handling of a
bare number *n* as `cuda:n` to simplify the logic.
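The clamping described above can be sketched as follows; the function name is hypothetical, and in the actual script `available` would come from `torch.cuda.device_count()`:

```python
def resolve_n_devices(requested: int, available: int) -> int:
    """Clamp the requested device count to what the machine actually has.

    Never returns less than 1, so CPU-only machines (available == 0)
    still get a single logical device instead of tripping the
    TransformerLens n_devices assertion.
    """
    return max(1, min(requested, available))
```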
TransformerLens expects "gpt2" not "gpt-2" as the model identifier.
Add tl_model_name mapping so HookedTransformer.from_pretrained()
receives the correct name.
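The mapping can be as simple as a dict keyed by the repo's model labels; the single entry below is the one named in the commit message, and the helper name is an assumption:

```python
# Map repo-internal model labels to the names TransformerLens expects.
TL_MODEL_NAME = {
    "gpt-2": "gpt2",  # TransformerLens expects "gpt2", not "gpt-2"
}

def tl_model_name(name: str) -> str:
    """Translate a label to its TransformerLens identifier, passing unknowns through."""
    return TL_MODEL_NAME.get(name, name)
```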
Replace hardcoded num_devices=2 for Mistral with
max(1, t.cuda.device_count()) to support single-GPU and CPU-only
machines while using all available GPUs when present.
Newer versions of sae_lens changed forward() to return a reconstructed
tensor instead of an object with feature_acts. Add a hasattr check and
fall back to encode() to get the feature activations.
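The fallback described above can be sketched as follows (the helper name is hypothetical; `sae` stands for any sae_lens SAE instance):

```python
def get_feature_acts(sae, x):
    """Return feature activations across sae_lens versions.

    Older sae_lens: sae(x) returns an object with a .feature_acts field.
    Newer sae_lens: sae(x) returns the reconstructed tensor, so we call
    encode() explicitly to get the activations.
    """
    out = sae(x)
    if hasattr(out, "feature_acts"):
        return out.feature_acts
    return sae.encode(x)
```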
The actual CLI argument in clustering.py is --method, not
--clustering_type.
…ved: keep pathlib paths + add weights_only=False
@Tauheed-Elahee
Author

Signed all commits so they carry that nice "Verified" badge on GitHub.

Allow overriding BASE_DIR via environment variable for cloud/remote
environments. Falls back to the existing relative cache/ path when
the variable is not set.

Depends on PR JoshEngels#5 (pathlib refactor).
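A sketch of the override, assuming the environment variable is called `BASE_DIR` (the actual variable name used in the PR may differ):

```python
import os
from pathlib import Path

def resolve_base_dir() -> Path:
    """Use $BASE_DIR when set (e.g. on cloud/remote machines);
    otherwise fall back to the existing relative cache/ path."""
    override = os.environ.get("BASE_DIR")
    return Path(override) if override else Path("cache")
```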
Most machines index GPUs from 0. The previous default of cuda:4
assumed a specific multi-GPU server setup and would fail on machines
with fewer than 5 GPUs.

Depends on PR JoshEngels#11 (CPU fallback).
Strip explicit NVIDIA library version pins (nvidia-cuda-runtime-cu12,
nvidia-cudnn-cu12, etc.) and triton. These are automatically resolved
by PyTorch based on the system's CUDA installation, and hard-pinning
them to CUDA 12.1 prevents installation on systems with different
CUDA versions.
- Add requirements.in as the unpinned input file for pip-compile
- Remove nvidia-* packages from requirements.in (transitive deps of torch)
- Replace requirements.txt with pip-compile output (Python 3.12)
- Update major versions: torch 2.10.0, sae-lens 6.37.1,
  transformer-lens 2.17.0, transformers 4.57.6, datasets 4.5.0
- Add jupyterlab, ipywidgets, and tensorflow as direct dependencies
