A companion repository for *Gradient Routing: Masking Gradients to Localize Computation in Neural Networks*.
- `factored_representations` is for shared functionality, although in practice, code for different subprojects is mostly siloed.
  - `masklib.py` and `model_expansion.py` implement Expand, Route, Ablate for any TransformerLens model. Has some tests.
- `projects` contains the code to reproduce the results in the paper.
  - `minigrid` - localizing behavioral tendencies in a gridworld reinforcement learning agent
  - `mnist` - splitting the representations of an MNIST autoencoder
  - `nanoGPT-factrep` - training a model with a steering scalar, and unlearning virology
  - `tinystories` - unlearning a subset of TinyStories
- `shared_configs` is for commonly-used configurations, e.g. model definitions and standard training config options.
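Conceptually, routing works by masking gradients during the backward pass while leaving the forward pass unchanged. Below is a minimal PyTorch sketch of that idea; it is illustrative only, not the `masklib.py` API, and the toy model, mask, and `route` helper are assumptions for the example.

```python
import torch
import torch.nn as nn

torch.manual_seed(0)

# Toy model: a gradient mask over the 8 hidden units localizes updates
# for this batch to half of the hidden layer.
model = nn.Sequential(nn.Linear(4, 8), nn.Linear(8, 2))

# Block gradient flow to the first 4 hidden units.
mask = torch.ones(8)
mask[:4] = 0.0

def route(activations: torch.Tensor) -> torch.Tensor:
    """Forward pass is the identity; backward pass is multiplied by `mask`."""
    return activations * mask + (activations * (1 - mask)).detach()

x = torch.randn(3, 4)
hidden = model[0](x)
out = model[1](route(hidden))
out.sum().backward()

# Rows of the first layer feeding the masked hidden units get zero gradient...
print(model[0].weight.grad[:4].abs().sum().item())  # 0.0
# ...while the unmasked rows are updated as usual.
print(model[0].weight.grad[4:].abs().sum().item() > 0)  # True
```

The `detach()` trick keeps the forward computation exact while routing gradients only through the masked branch, which is the core mechanism the library applies to TransformerLens models.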
- Install PDM
- Install the PDM project (i.e. install the dependencies):

  ```shell
  pdm install
  ```

- Install the recommended VSCode extensions
- Install the pre-commit git hooks:

  ```shell
  pdm run pre-commit install
  ```

You can then run Python scripts with `pdm run python <script.py>`, or by activating the virtual environment specified by `pdm info`, e.g.:

```shell
source /pdm-venvs/factored-representations-Dp430888-3.12/bin/activate
```
`.vscode/settings.json` is configured to automatically format and lint the code with Ruff (using the extension) on save.
Run the tests with:

```shell
pdm run pytest
```
```bibtex
@article{cloud2024gradient,
  title={Gradient Routing: Masking Gradients to Localize Computation in Neural Networks},
  author={Cloud, Alex and Goldman-Wetzler, Jacob and Wybitul, Evžen and Miller, Joseph and Turner, Alexander Matt},
  journal={arXiv preprint arXiv:2410.04332},
  url={https://arxiv.org/abs/2410.04332v1},
  year={2024},
}
```