FLAT is a "flat" loss adjustment approach that addresses these issues by maximizing the f-divergence between the available template answer and the forget answer, using only the forget data. The variational form of this f-divergence provides a theoretically grounded loss adjustment: it assigns different importance weights to learning the template responses and to forgetting the responses subject to unlearning. Empirical results show that our approach achieves better unlearning performance than existing methods while minimizing the impact on the model's retained capabilities, preserving high utility across diverse tasks, including copyrighted-content unlearning on the Harry Potter dataset and the MUSE benchmark, and entity unlearning on the TOFU dataset.
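To make the idea of a variational f-divergence objective concrete, here is a minimal sketch, not the code in this repository. It uses the standard variational lower bound, the mean of g over template samples minus the mean of f*(g) over forget samples, and negates it to obtain a loss. The total-variation choice (g(v) = 0.5 * tanh(v), f*(t) = t) is one common instance from the f-GAN literature. The function names and the scalar per-example "scores" are hypothetical; how FLAT actually scores template and forget answers is defined in the paper.

```python
import math

# Total-variation instance of the variational form (an illustrative choice):
# output activation g(v) = 0.5 * tanh(v), convex conjugate f*(t) = t.
def tv_activation(v: float) -> float:
    return 0.5 * math.tanh(v)

def tv_conjugate(t: float) -> float:
    return t

def variational_loss(template_scores, forget_scores,
                     activation=tv_activation, conjugate=tv_conjugate):
    """Negated variational lower bound on an f-divergence.

    template_scores / forget_scores are hypothetical scalar scores for the
    template answers and the forget answers on the same forget prompts.
    Minimizing this loss maximizes the bound, and hence the divergence.
    """
    learn_term = sum(activation(v) for v in template_scores) / len(template_scores)
    forget_term = sum(conjugate(activation(v)) for v in forget_scores) / len(forget_scores)
    return -learn_term + forget_term

# Template answers scored high and forget answers scored low give a low loss.
loss = variational_loss([2.0, 1.5], [-1.0, -0.5])
print(f"{loss:.4f}")
```

The two terms pass through different functions, which is how a single objective can weight learning the template responses differently from forgetting the original ones.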
- [2025.01] 👏👏 Accepted by ICLR 2025.
- [2024.10] 🚀🚀 Released the FLAT paper.
conda create -n flat python=3.10
conda activate flat
conda install pytorch pytorch-cuda=11.8 -c pytorch -c nvidia
conda install -c "nvidia/label/cuda-11.8.0" cuda-toolkit
pip install -r requirements.txt
pip install flash-attn --no-build-isolation
pip install natsort
pip install sacrebleu
pip install sentencepiece
# finetune OPT-2.7b (lr=1e-5)
python finetune.py --model_name facebook/opt-2.7b
# finetune llama2-7b
python finetune.py --model_name meta-llama/Llama-2-7b-hf
After obtaining the fine-tuned model, set hp_ft_model_path in model_config.yaml to its path. During unlearning, the code loads the fine-tuned model as the original model.
master_port=18765
model=llama2-7b
lr=2e-7
CUDA_VISIBLE_DEVICES=0,1 torchrun --nproc_per_node=2 --master_port=$master_port unlearn.py --config-name=forget.yaml batch_size=4 gradient_accumulation_steps=4 model_family=${model} lr=${lr}
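As a quick sanity check on the command above, the effective batch size per optimizer step can be worked out from its arguments. This sketch assumes that batch_size is a per-GPU value, which is the usual convention but is not confirmed here.

```python
# Values taken from the torchrun command above.
num_gpus = 2                     # --nproc_per_node=2
batch_size = 4                   # batch_size=4 (assumed to be per GPU)
gradient_accumulation_steps = 4  # gradient_accumulation_steps=4

# Samples contributing to each optimizer update.
effective_batch_size = num_gpus * batch_size * gradient_accumulation_steps
print(effective_batch_size)  # 32
```

If you change the number of GPUs, adjusting gradient_accumulation_steps keeps this product constant.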
Unlearning efficacy:
- BLEU on Harry Potter completion
- ROUGE-L on Harry Potter completion
Utility:
- Perplexity on Wikitext
- Zero-shot Accuracy on benchmarks
- Zero-shot Accuracy on TruthfulQA
python evaluate.py --method_name $your_name --model_save_dir $your_model_path
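For readers unfamiliar with ROUGE-L, one of the unlearning efficacy metrics listed above, the following is an illustrative sketch of its F-measure based on the longest common subsequence (LCS) of tokens. It is not the implementation used by evaluate.py; the function names and the whitespace tokenization are assumptions made for the example.

```python
def lcs_length(a, b):
    """Length of the longest common subsequence of two token lists."""
    prev = [0] * (len(b) + 1)
    for x in a:
        curr = [0]
        for j, y in enumerate(b, start=1):
            if x == y:
                curr.append(prev[j - 1] + 1)
            else:
                curr.append(max(prev[j], curr[j - 1]))
        prev = curr
    return prev[-1]

def rouge_l_f1(reference: str, candidate: str) -> float:
    """ROUGE-L F1 over lowercased, whitespace-split tokens (a simplification)."""
    ref, cand = reference.lower().split(), candidate.lower().split()
    if not ref or not cand:
        return 0.0
    lcs = lcs_length(ref, cand)
    if lcs == 0:
        return 0.0
    precision = lcs / len(cand)
    recall = lcs / len(ref)
    return 2 * precision * recall / (precision + recall)

# A candidate that reproduces half of the reference tokens, in order.
score = rouge_l_f1("a b c d", "a c")
print(f"{score:.3f}")
```

In an unlearning setting, a lower score between the model's completion and the original text suggests the model reproduces less of the forgotten content.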
For the TOFU and MUSE benchmarks, please refer to the TOFU and MUSEBENCH repositories.
If you find our codebase and dataset beneficial, please cite our work:
@article{wang2024llm,
title={LLM Unlearning via Loss Adjustment with Only Forget Data},
author={Wang, Yaxuan and Wei, Jiaheng and Liu, Chris Yuhao and Pang, Jinlong and Liu, Quan and Shah, Ankit Parag and Bao, Yujia and Liu, Yang and Wei, Wei},
journal={arXiv preprint arXiv:2410.11143},
year={2024}
}