Bleeding Edge - Accept All Pull Requests#22

Open
Tauheed-Elahee wants to merge 32 commits into JoshEngels:main from Tauheed-Elahee:bleeding-edge

Conversation

@Tauheed-Elahee

An attempt to consolidate mergeable pull requests into a single branch that can be used in place of the main branch.

This pull request solves issue #21

JasonGross and others added 28 commits March 12, 2026 21:34
As per
https://github.com/matplotlib/matplotlib/blob/42b88d01fdd93846d925b1097167d36ea31c7733/doc/api/prev_api_changes/api_changes_3.9.0/removals.rst?plain=1#L147,
`legendHandles` was removed in matplotlib 3.9 after previously being
deprecated in favor of `legend_handles`.  Since `requirements.txt`
requires matplotlib 3.9, we just use the new name.

Fixes JoshEngels#12

Done with
```
git grep --name-only legendHandles | xargs sed -i.bak 's/legendHandles/legend_handles/g'
```
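The commit simply renames the attribute, which works because `requirements.txt` pins matplotlib 3.9. For code that must run across matplotlib versions, a version-agnostic accessor is an alternative; this is a sketch of that approach, not part of the PR:

```python
def get_legend_handles(legend):
    """Return a legend's handles across matplotlib versions.

    `legend_handles` is the new name; the old `legendHandles` alias
    was deprecated and then removed in matplotlib 3.9.
    """
    if hasattr(legend, "legend_handles"):
        return legend.legend_handles
    return legend.legendHandles
```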
…_devices 2" when there aren't 2 GPUs

```
  File "/home/jason/MultiDimensionalFeatures/multid/lib/python3.10/site-packages/transformer_lens/HookedTransformerConfig.py", line 315, in __post_init__
    torch.cuda.device_count() >= self.n_devices
AssertionError: Not enough CUDA devices to support n_devices 2
```
Also allow passing `--device cpu`, `--device mps`, etc. to
`circle_probe_interventions.py`, and remove the special handling of a
bare number *n* as `cuda:n` to simplify the logic.
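The clamping described above can be sketched as follows; the function name is hypothetical, and in the actual script `available` would come from `torch.cuda.device_count()`:

```python
def resolve_n_devices(requested: int, available: int) -> int:
    """Clamp the requested device count to what the machine actually has.

    Never returns less than 1, so CPU-only machines (available == 0)
    still get a single logical device instead of tripping the
    TransformerLens n_devices assertion.
    """
    return max(1, min(requested, available))
```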
TransformerLens expects "gpt2" not "gpt-2" as the model identifier.
Add tl_model_name mapping so HookedTransformer.from_pretrained()
receives the correct name.
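The mapping can be as simple as a dict keyed by the repo's model labels; the single entry below is the one named in the commit message, and the helper name is an assumption:

```python
# Map repo-internal model labels to the names TransformerLens expects.
TL_MODEL_NAME = {
    "gpt-2": "gpt2",  # TransformerLens expects "gpt2", not "gpt-2"
}

def tl_model_name(name: str) -> str:
    """Translate a label to its TransformerLens identifier, passing unknowns through."""
    return TL_MODEL_NAME.get(name, name)
```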
Replace hardcoded num_devices=2 for Mistral with
max(1, t.cuda.device_count()) to support single-GPU and CPU-only
machines while using all available GPUs when present.
Newer versions of sae_lens changed forward() to return a reconstructed
tensor instead of an object with feature_acts. Add a hasattr check and
fall back to encode() to get the feature activations.
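The fallback described above can be sketched as follows (the helper name is hypothetical; `sae` stands for any sae_lens SAE instance):

```python
def get_feature_acts(sae, x):
    """Return feature activations across sae_lens versions.

    Older sae_lens: sae(x) returns an object with a .feature_acts field.
    Newer sae_lens: sae(x) returns the reconstructed tensor, so we call
    encode() explicitly to get the activations.
    """
    out = sae(x)
    if hasattr(out, "feature_acts"):
        return out.feature_acts
    return sae.encode(x)
```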
The actual CLI argument in clustering.py is --method, not
--clustering_type.
…ved: keep pathlib paths + add weights_only=False
@Tauheed-Elahee
Author

Signed all commits so they carry that nice "Verified" badge on GitHub.

Allow overriding BASE_DIR via environment variable for cloud/remote
environments. Falls back to the existing relative cache/ path when
the variable is not set.

Depends on PR JoshEngels#5 (pathlib refactor).
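A sketch of the override, assuming the environment variable is called `BASE_DIR` (the actual variable name used in the PR may differ):

```python
import os
from pathlib import Path

def resolve_base_dir() -> Path:
    """Use $BASE_DIR when set (e.g. on cloud/remote machines);
    otherwise fall back to the existing relative cache/ path."""
    override = os.environ.get("BASE_DIR")
    return Path(override) if override else Path("cache")
```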
Most machines index GPUs from 0. The previous default of cuda:4
assumed a specific multi-GPU server setup and would fail on machines
with fewer than 5 GPUs.

Depends on PR JoshEngels#11 (CPU fallback).
Strip explicit NVIDIA library version pins (nvidia-cuda-runtime-cu12,
nvidia-cudnn-cu12, etc.) and triton. These are automatically resolved
by PyTorch based on the system's CUDA installation, and hard-pinning
them to CUDA 12.1 prevents installation on systems with different
CUDA versions.
- Add requirements.in as the unpinned input file for pip-compile
- Remove nvidia-* packages from requirements.in (transitive deps of torch)
- Replace requirements.txt with pip-compile output (Python 3.12)
- Update major versions: torch 2.10.0, sae-lens 6.37.1,
  transformer-lens 2.17.0, transformers 4.57.6, datasets 4.5.0
- Add jupyterlab, ipywidgets, and tensorflow as direct dependencies
