9 forgetting pipeline #18
base: develop
Conversation
Force-pushed from 7b122f7 to 2e186da
Force-pushed from b8740ff to 392151e
TOFU says it didn't work well
Steals a lot from the fine-tuning branch:
- some placeholder forget configs and config classes
- adds a data collator compatible with `QAForgetDataset`
- adapts trainer loading etc. to add the option of loading a forgetter instead

BUG: evaluation does not work; it likely also needs adapting for the expected data input format.
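For context, a collator compatible with paired forget/retain data could look roughly like the sketch below. This is a minimal sketch, not the branch's actual code: the class name, the assumption that each `QAForgetDataset` item is a `(forget, retain)` pair of tokenised examples, and the label-masking convention are all assumptions.

```python
class QAForgetCollator:
    """Hypothetical collator: pads the forget and retain halves of each
    batch separately and returns them as a (forget_batch, retain_batch) pair."""

    def __init__(self, tokenizer):
        self.tokenizer = tokenizer

    def _pad(self, features):
        batch = self.tokenizer.pad(features, padding=True, return_tensors="pt")
        # Standard causal-LM labels: copy input_ids, mask padding out of the loss
        labels = batch["input_ids"].clone()
        labels[batch["attention_mask"] == 0] = -100
        batch["labels"] = labels
        return batch

    def __call__(self, items):
        # Assumes each dataset item is a (forget, retain) pair of dicts
        # holding input_ids / attention_mask
        forget = self._pad([forget for forget, _ in items])
        retain = self._pad([retain for _, retain in items])
        return forget, retain
```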
Force-pushed from a877f2c to ad3522c
    raise NotImplementedError(
        "A Forgetter child class implementing compute_loss should be used"
    )
Just some general comments on the `evaluate` method here. Does it need to take the dataset as an argument? In the `evaluate_model` function in the evaluation branch it takes the config, which defines our datasets. It also takes:

    model: torch.nn.Module,
    base_truth_ratios_path: str,
    tokenizer: transformers.AutoTokenizer,
    experiment_config: dict,
    random_seed: int,
    **generate_kwargs: dict,

So the only requirement beforehand is that the baseline retain model has run the script calculating its truth ratios (`all_eval.py`). Then we can point to that directory in the config? I had a look at the configs; can we define a `retain_model_path` when running forget experiments? Everything else should be defined in the script running the experiment.

Otherwise, I can adjust the `evaluate_model` code so that it takes an optional argument `eval_dataset`. This can be the output of `load_dataset`, which contains both splits of the data. Then the `evaluate_model` function will be able to parse these using the dictionary keys `'retain'` and `'forget'`.
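A rough sketch of that optional-argument approach is below. The split-resolution logic and the `experiment_config["datasets"]` lookup are assumptions for illustration, not the evaluation branch's actual code; only the listed parameters come from the comment above.

```python
from typing import Optional

import torch
import transformers


def evaluate_model(
    model: torch.nn.Module,
    base_truth_ratios_path: str,
    tokenizer: transformers.AutoTokenizer,
    experiment_config: dict,
    random_seed: int,
    eval_dataset: Optional[dict] = None,  # hypothetical new argument
    **generate_kwargs: dict,
) -> dict:
    """Sketch: resolve the forget/retain splits, then run the existing
    evaluation logic on each (elided here)."""
    if eval_dataset is not None:
        # The dataset object is assumed to expose both splits by key
        splits = {k: eval_dataset[k] for k in ("forget", "retain")}
    else:
        # Otherwise build the splits from the config, as the evaluation
        # branch currently does (config key is an assumption)
        splits = {k: experiment_config["datasets"][k] for k in ("forget", "retain")}
    # ... compute truth ratios and compare against base_truth_ratios_path ...
    return splits
```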
so they save to yaml nicely
Adds a `Forgetter` class for Gradient Ascent, Gradient Difference, KL, and "I don't know" forgetters, which all inherit from the HuggingFace `Trainer`, only modifying `compute_loss`.

Now updated to run with the new top-level config structure etc. I have successfully run jobs on Baskerville from this branch, but I have not fully verified that the outputs are as expected. There are runs on wandb under the `tofu-test` groups, e.g. https://wandb.ai/turing-arc/selective-forgetting/runs/dg1r098f/overview

TODO as part of #25 or in another PR:
- `eval_dataset` / any kind of evaluation during training, as the `evaluate` function in the trainer class doesn't know what to do when given two data inputs (forget / retain).
- The `evaluate` function in `arcsf.forget.base` needs to be implemented to use Jack D's evaluation code (and the trainer should be initialised with `eval_dataset` set to whatever our appropriate eval dataset instance is).
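For reference, the pattern described above (a base class raising `NotImplementedError`, with each method overriding only `compute_loss`) might look roughly like this. The `GradientAscentForgetter` body is an assumption: negating the cross-entropy loss is the textbook gradient-ascent objective, not necessarily this branch's exact implementation.

```python
from transformers import Trainer


class Forgetter(Trainer):
    """Base class: forgetting methods differ only in compute_loss."""

    def compute_loss(self, model, inputs, return_outputs=False):
        raise NotImplementedError(
            "A Forgetter child class implementing compute_loss should be used"
        )


class GradientAscentForgetter(Forgetter):
    """Gradient ascent: maximise loss on the forget set by minimising its negation."""

    def compute_loss(self, model, inputs, return_outputs=False):
        # Assumes the collator yields a (forget_batch, retain_batch) pair;
        # plain gradient ascent ignores the retain batch.
        forget_inputs, _ = inputs
        outputs = model(**forget_inputs)
        loss = -outputs.loss
        return (loss, outputs) if return_outputs else loss
```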