Some SAEs are trained on models distilled from other models, and these distilled models are not in the TransformerLens model list. For example, the DeepSeek-distilled Llama 8B model has a "base" model of Llama 3.1 8B, but we have to specify a Hugging Face path for the distilled model's weights.
These models must be loaded in a special way in TransformerLens, something like:

```python
from transformers import AutoModelForCausalLM, AutoTokenizer
from transformer_lens import HookedTransformer

hf_model = None
hf_tokenizer = None
if custom_hf_model_id is not None:
    # Load the distilled model's weights and tokenizer directly from Hugging Face
    logger.info("Loading custom HF model: %s", custom_hf_model_id)
    hf_model = AutoModelForCausalLM.from_pretrained(custom_hf_model_id)
    hf_tokenizer = AutoTokenizer.from_pretrained(custom_hf_model_id)

# Use the base model's architecture config (transformerlens_model_id),
# but swap in the distilled model's weights via hf_model=
model = HookedTransformer.from_pretrained_no_processing(
    transformerlens_model_id,
    device=args.device,
    dtype=STR_TO_DTYPE[config.MODEL_DTYPE],
    n_devices=device_count,
    hf_model=hf_model,
    tokenizer=hf_tokenizer,
    **config.MODEL_KWARGS,
)
```
However, a user of the library does not know this when loading the SAE via SAELens, so they will load the SAE onto the wrong base model and likely see bad results.
Proposed fix:
An optional SAELens config property, hf_model, that records which Hugging Face model the SAE was trained on. If this is specified, we either force the custom hf_model when loading with a pretrained loader, or throw an error if it doesn't match (and give some useful example code like the snippet above).
Possibly an example notebook
An example loader that we would like to support is below. Note that it specifies the base Llama model, but has nowhere to specify the custom hf_model or enforce it.
SAELens/sae_lens/toolkit/pretrained_sae_loaders.py, lines 537 to 578 in ffa436c