
bash: lm-saes: command not found #83

Open

Tizzzzy opened this issue Feb 4, 2025 · 10 comments

Tizzzzy commented Feb 4, 2025

Hi, I am new to this repo, and I got this error when I followed the readme to train the SAE:

(llama_scope) [email protected]:/Language-Model-SAEs$ lm-saes train examples/configuration/train.toml
bash: lm-saes: command not found

I created a conda environment with Python 3.10, then followed the instructions: I ran pdm install, downloaded bun, and then ran lm-saes train examples/configuration/train.toml. That is where I got the error.

Can you please take a look?
Thank you

dest1n1s (Collaborator) commented Feb 4, 2025

I'm sorry, but the current README and examples are actually outdated. We'll update them as soon as we have enough capacity.

Currently we recommend using uv as the package manager (a drop-in replacement for pdm). Then you could try training an SAE on Pythia with the following script:

import torch

from lm_saes import (
    ActivationFactoryConfig,
    ActivationFactoryDatasetSource,
    ActivationFactoryTarget,
    InitializerConfig,
    SAEConfig,
    TrainerConfig,
    TrainSAESettings,
    WandbConfig,
    train_sae,
)

if __name__ == "__main__":
    settings = TrainSAESettings(
        sae=SAEConfig(
            hook_point_in="blocks.3.ln1.hook_normalized",
            hook_point_out="blocks.3.ln1.hook_normalized",
            d_model=768,
            expansion_factor=8,
            act_fn="topk",
            norm_activation="token-wise",
            sparsity_include_decoder_norm=True,
            top_k=50,
            dtype=torch.float32,
            device="cuda",
        ),
        initializer=InitializerConfig(
            init_search=True,
            state="training",
        ),
        trainer=TrainerConfig(
            lp=1,
            initial_k=768 / 2,
            lr=4e-4,
            lr_scheduler_name="constantwithwarmup",
            total_training_tokens=600_000_000,
            log_frequency=1000,
            eval_frequency=1000000,
            n_checkpoints=5,
            check_point_save_mode="linear",
            exp_result_path="results",
        ),
        wandb=WandbConfig(
            wandb_project="pythia-160m-test",
            exp_name="pythia-160m-test",
        ),
        activation_factory=ActivationFactoryConfig(
            sources=[
                ActivationFactoryDatasetSource(
                    name="openwebtext",
                )
            ],
            target=ActivationFactoryTarget.BATCHED_ACTIVATIONS_1D,
            hook_points=["blocks.3.ln1.hook_normalized"],
            batch_size=2048,
            buffer_size=None,
            ignore_token_ids=[],
        ),
        sae_name="pythia-160m-test-L3",
        sae_series="pythia-160m-test",
    )
    train_sae(settings)
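
For a rough sense of scale, here is some back-of-envelope arithmetic implied by the config above (just my own quick math, assuming one activation per token; it is not anything the trainer reports):

d_model = 768
expansion_factor = 8
d_sae = d_model * expansion_factor  # 6144 dictionary features in the SAE
total_training_tokens = 600_000_000
batch_size = 2048  # activations per training batch
n_steps = total_training_tokens // batch_size  # roughly 293,000 steps at one activation per token
print(f"d_sae={d_sae}, approx_steps={n_steps}")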

Let me know if this setup works!

Tizzzzy (Author) commented Feb 4, 2025

Hi,
Thank you for your reply. I have the following questions:

  1. Can you tell me what command I should use to build the environment using uv?
  2. Should I just copy the updated script to a .py file?
  3. Should I run the training using the command python <new_script>.py?
  4. Is that all I need to do?

Tizzzzy (Author) commented Feb 6, 2025

Hi,
Can you please provide clearer instructions? I have tried creating an environment using uv pip install --sync and uv pip install -r uv.lock, and neither of them works. So I just used the environment that I created with pdm to run the newly provided Python code, and I still got this error:

(llama_scope) [email protected]:/Language-Model-SAEs$ python examples/configuration/train.py
/opt/conda/envs/llama_scope/lib/python3.10/site-packages/torch/_subclasses/functional_tensor.py:275: UserWarning: Failed to initialize NumPy: No module named 'numpy' (Triggered internally at /pytorch/torch/csrc/utils/tensor_numpy.cpp:81.)
  cpu = _conversion_method_template(device=torch.device("cpu"))
Traceback (most recent call last):
  File "/Language-Model-SAEs/examples/configuration/train.py", line 3, in <module>
    from lm_saes import (
ModuleNotFoundError: No module named 'lm_saes'

Can you please take a look and help me with it?

dest1n1s (Collaborator) commented Feb 6, 2025

Sorry for the late reply.

  1. Can you tell me what command I should use to build the environment using uv?

Once you have uv installed (following its installation instructions), you do not need any explicit command to build the environment. uv handles resolving and downloading the required packages when you actually run a script in the project. If you really want the packages downloaded explicitly ahead of time (this may be necessary if your GPU machines have no internet connection), you can run uv sync.

  2. Should I just copy the updated script to a .py file?

Yes.

  3. Should I run the training using the command python <new_script>.py?

You should run uv run <new_script>.py to activate the uv venv and run the script.

  4. Is that all I need to do?

It should work smoothly with the above steps. Please let me know if there are any further problems!

Tizzzzy (Author) commented Feb 6, 2025

Hi,
Thank you for your reply. I have now followed your new instructions. I created a new conda environment with Python 3.12.0, installed uv with pip install uv, copied the new Python script to train.py, and then ran the script using uv run ./examples/configuration/train.py. However, I got this error:

(llama-scope) [email protected]:/Language-Model-SAEs$ uv run ./examples/configuration/train.py
Traceback (most recent call last):
  File "/Language-Model-SAEs/./examples/configuration/train.py", line 64, in <module>
    train_sae(settings)
  File "/Language-Model-SAEs/src/lm_saes/runner.py", line 286, in train_sae
    activations_stream = activation_factory.process()
                         ^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/Language-Model-SAEs/src/lm_saes/activation/factory.py", line 281, in process
    streams = [processor(**kwargs) for processor in self.pre_aggregation_processors]
               ^^^^^^^^^^^^^^^^^^^
  File "/Language-Model-SAEs/src/lm_saes/activation/factory.py", line 98, in process_dataset
    assert datasets is not None, "`datasets` must be provided for dataset sources"
AssertionError: `datasets` must be provided for dataset sources

I tried using the dataset from the Hugging Face path Skylion007/openwebtext, but it still doesn't work. I think this is because I didn't download the dataset to my local repo. How and where can I download the dataset?
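
In case it matters, this is what I would try for getting a local copy with the Hugging Face datasets library (the save path below is just a placeholder I made up; I don't know whether this is the layout lm-saes expects):

from datasets import load_dataset

# Download openwebtext from the Hugging Face Hub and save a local copy.
# "data/openwebtext" is only an example path, not something the repo requires.
dataset = load_dataset("Skylion007/openwebtext", split="train")
dataset.save_to_disk("data/openwebtext")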

dest1n1s (Collaborator) commented Feb 6, 2025

Have you tried the script above in this issue? The example script in the repo has not been brought up to date yet.

Tizzzzy (Author) commented Feb 6, 2025

Hi,
Yes, the error comes from the new Python script you provided above (I copied it verbatim into examples/configuration/train.py).

Can you please take a look and help me fix the bug?

dest1n1s (Collaborator) commented Feb 7, 2025

Hi,
It seems there are some bugs in the current train runner: it doesn't handle datasets that are not pre-generated. I'll push a fix ASAP.

Tizzzzy (Author) commented Feb 7, 2025

Thank you! Please update asap

dest1n1s (Collaborator) commented Feb 8, 2025

Hello, this should be fixed by #85. Also, you can try generating activations and training the SAE as two separate steps, which can drastically improve training speed as long as you have enough disk space to hold all the activations. The examples are updated in #85 as well.
