[Update] release Mask-DPO #10

Merged
merged 3 commits into from
Mar 5, 2025
86 changes: 83 additions & 3 deletions README.md
@@ -2,16 +2,21 @@

[![license](https://img.shields.io/github/license/InternLM/opencompass.svg)](./LICENSE)

This is the repository for our ANAH series of papers, containing [ANAH: Analytical Annotation of Hallucinations in Large Language Models](https://arxiv.org/abs/2405.20315) and [ANAH-v2: Scaling Analytical Hallucination Annotation of Large Language Models](https://arxiv.org/abs/2407.04693).
This is the repository for our ANAH series of papers, which aims to reduce hallucinations in LLMs through research on hallucination benchmarking, detection, and mitigation:

- **[Benchmark]** [ANAH: Analytical Annotation of Hallucinations in Large Language Models](https://arxiv.org/abs/2405.20315)
- **[Detection]** [ANAH-v2: Scaling Analytical Hallucination Annotation of Large Language Models](https://arxiv.org/abs/2407.04693)
- **[Mitigation]** [Mask-DPO: Generalizable Fine-grained Factuality Alignment of LLMs](http://arxiv.org/abs/2503.02846)

The repo contains:

+ The [data](#huggingface-dataset) for training and evaluating LLMs, which consists of sentence-level hallucination annotations.
+ The [model](#huggingface-model) for annotating hallucinations.
+ The [code](#evaluation) for evaluating the hallucination level of LLM-generated content and the LLMs' ability to annotate hallucinations.

+ The [code](#maskdpo-training) for performing fine-grained factuality alignment.

## 🚀 What's New
- **[2025.01.23]** Mask-DPO has been accepted to ICLR 2025. 🎉🎉🎉
- **[2024.09.26]** ANAH-v2 has been accepted to NeurIPS 2024. 🎉🎉🎉
- **[2024.07.12]** ANAH-v2 [Annotator](https://huggingface.co/opencompass/anah-v2) has been open-sourced. 🔥🔥🔥
- **[2024.07.03]** ANAH [Annotator-7B](https://huggingface.co/opencompass/anah-7b) & [20B](https://huggingface.co/opencompass/anah-20b) have been open-sourced. 🔥🔥🔥
@@ -46,10 +51,24 @@ Through iterative self-training, we simultaneously and progressively scale up th

The final dataset covers ∼3k topics, ∼196k model responses, and ∼822k annotated sentences, in English and Chinese.

The final hallucination annotator (detector) with only 7B parameters surpasses the performance of GPT-4 and obtains new state-of-the-art hallucination detection results on HaluEval and HalluQA by zero-shot inference.

<p align="center">
<img src="docs/figure/teaser-v2.jpg" height="500">
</p>

## Mask-DPO: Generalizable Fine-grained Factuality Alignment of LLMs

[![arXiv](https://img.shields.io/badge/arXiv-2503.02846-b31b1b.svg)](http://arxiv.org/abs/2503.02846)

Mask-DPO is a fine-grained factuality alignment method based on Direct Preference Optimization (DPO).

Using sentence-level factuality annotations as mask signals, Mask-DPO learns only from the factually correct sentences in the preferred samples and avoids penalizing factual content in the non-preferred samples, which resolves the ambiguity in preference learning.


<p align="center">
<img src="docs/figure/maskdpo.png" height="300">
</p>
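
The masking idea described above can be sketched in a few lines of plain Python. This is only an illustration under stated assumptions, not the training code in this repository: the function names, the dictionary layout, and `beta=0.1` are hypothetical, and the per-token log-probabilities are assumed to be computed elsewhere. The chosen mask is 1 on tokens of factually correct sentences, the rejected mask is 1 on tokens of hallucinated sentences, and masked-out tokens contribute nothing to the DPO margin.

```python
import math


def masked_logratio(policy_logps, ref_logps, mask):
    """Sum (policy - reference) token log-probs over tokens whose mask is 1."""
    return sum(m * (p - r) for p, r, m in zip(policy_logps, ref_logps, mask))


def mask_dpo_loss(chosen, rejected, beta=0.1):
    """DPO-style loss restricted to masked tokens.

    `chosen` and `rejected` are dicts (a hypothetical layout) with keys
    'policy' and 'ref' (per-token log-probs) and 'mask' (0/1 per token).
    """
    chosen_reward = beta * masked_logratio(chosen["policy"], chosen["ref"], chosen["mask"])
    rejected_reward = beta * masked_logratio(rejected["policy"], rejected["ref"], rejected["mask"])
    margin = chosen_reward - rejected_reward
    # -log(sigmoid(margin)), written in a numerically stable form
    if margin > 0:
        return math.log1p(math.exp(-margin))
    return -margin + math.log1p(math.exp(margin))


# Toy two-token responses: the second chosen token and the first rejected
# token are masked out, so their log-probs do not influence the loss.
chosen = {"policy": [-1.0, -2.0], "ref": [-1.5, -1.0], "mask": [1, 0]}
rejected = {"policy": [-1.0, -3.0], "ref": [-1.0, -2.0], "mask": [0, 1]}
print(mask_dpo_loss(chosen, rejected))
```

With all masks set to 1 this reduces to the standard DPO objective on summed token log-ratios, which is why the mask alone decides which sentences carry a learning signal.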

## 🤗 HuggingFace Model & Dataset
<a name="huggingface-dataset"></a>
@@ -147,9 +166,64 @@ python -u ./eval/anah_v1/eval.py \
--eval_sorce_path {your_evaluation_result_path} \
```

<a name="maskdpo-training"></a>
## 🚄 Factuality Alignment Tutorial

### 1. Install Dependencies

Mask-DPO utilizes [XTuner](https://github.com/InternLM/xtuner) as the training engine.

```bash
conda env create -f maskdpo-env.yml
```
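
After creating the environment, it can be useful to confirm that the key packages match the pinned versions. The following is a minimal sketch using only the standard library; the helper names `parse_pins` and `check_pins` are hypothetical, and the three pins in the usage example are copied from the environment file added in this pull request.

```python
from importlib import metadata


def parse_pins(lines):
    """Turn lines such as '- torch==2.4.0' into {'torch': '2.4.0'}.

    Lines without '==' (for example conda-style 'name=version=build'
    entries) are skipped.
    """
    pins = {}
    for line in lines:
        entry = line.strip().lstrip("-").strip()
        if "==" in entry:
            name, version = entry.split("==", 1)
            pins[name.strip()] = version.strip()
    return pins


def check_pins(pins):
    """Return {name: (expected, installed_or_None)} for every mismatch."""
    mismatches = {}
    for name, expected in pins.items():
        try:
            installed = metadata.version(name)
        except metadata.PackageNotFoundError:
            installed = None
        if installed != expected:
            mismatches[name] = (expected, installed)
    return mismatches


if __name__ == "__main__":
    pins = parse_pins(["- torch==2.4.0", "- deepspeed==0.14.4", "- transformers==4.43.3"])
    for name, (expected, installed) in check_pins(pins).items():
        print(f"{name}: expected {expected}, found {installed}")
```

Run it inside the activated environment; an empty output means the checked packages match their pins.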

### 2. Prepare Fine-grained Preference Data

You need to prepare sentence-level factuality preference data in the following format:

```json
{
    "prompt": [{"role": "user", "content": "..."}],
    "chosen": [{"role": "assistant", "content": "..."}],
    "chosen_item": {
        "sents": ["sent1", "sent2", "..."],
        "type": ["hallucination", "no_hallucination", "..."]
    },
    "rejected": [{"role": "assistant", "content": "..."}],
    "rejected_item": {
        "sents": ["sent1", "sent2", "..."],
        "type": ["hallucination", "no_hallucination", "..."]
    }
}
```

where `chosen_item` holds the fine-grained factuality information about `chosen`: `sents` is the sentence-level split of the `content` in `chosen`, and `type` is the hallucination label of the corresponding sentence.
`rejected_item` has the same structure for `rejected`.

We recommend using [ANAH-v2](https://huggingface.co/opencompass/anah-v2) for fine-grained hallucination annotation of your data, but other annotation methods work as well.
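
Before training, a quick sanity check of each record can catch mismatched annotations early. The sketch below is not part of this repository: the helper names are hypothetical, and it assumes that `hallucination` and `no_hallucination` are the only label values and that every entry of `sents` appears verbatim, in order, inside `content`. It locates each sentence in the response and marks which spans would carry a learning signal (factual sentences in `chosen`, hallucinated sentences in `rejected`).

```python
FACTUAL, HALLUCINATED = "no_hallucination", "hallucination"


def sentence_spans(content, sents):
    """Locate each sentence in `content`, left to right, as (start, end) spans."""
    spans, cursor = [], 0
    for sent in sents:
        start = content.find(sent, cursor)
        if start < 0:
            raise ValueError(f"sentence not found in content: {sent!r}")
        cursor = start + len(sent)
        spans.append((start, cursor))
    return spans


def learning_mask(record, key):
    """Return [(start, end, learn)] for key 'chosen' or 'rejected'."""
    content = record[key][0]["content"]
    item = record[f"{key}_item"]
    sents, labels = item["sents"], item["type"]
    if len(sents) != len(labels):
        raise ValueError(f"{key}_item: 'sents' and 'type' differ in length")
    unknown = set(labels) - {FACTUAL, HALLUCINATED}
    if unknown:
        raise ValueError(f"{key}_item: unexpected labels {sorted(unknown)}")
    # Learn from factual sentences in chosen, hallucinated ones in rejected.
    target = FACTUAL if key == "chosen" else HALLUCINATED
    spans = sentence_spans(content, sents)
    return [(s, e, label == target) for (s, e), label in zip(spans, labels)]


# Placeholder record following the format above.
record = {
    "prompt": [{"role": "user", "content": "Question?"}],
    "chosen": [{"role": "assistant", "content": "Sentence A. Sentence B."}],
    "chosen_item": {
        "sents": ["Sentence A.", "Sentence B."],
        "type": [FACTUAL, HALLUCINATED],
    },
    "rejected": [{"role": "assistant", "content": "Sentence C. Sentence D."}],
    "rejected_item": {
        "sents": ["Sentence C.", "Sentence D."],
        "type": [FACTUAL, HALLUCINATED],
    },
}
print(learning_mask(record, "chosen"))
print(learning_mask(record, "rejected"))
```

The character spans would still need to be mapped to token positions with the tokenizer used for training; that step depends on the model and is left out here.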


### 3. Training

After setting the data path and the initial model path in the config file, you can train the model with the following command.

```bash
python -m torch.distributed.run \
    --nproc_per_node=4 \
    --nnodes=1 \
    --node_rank=0 \
    --rdzv_id=1234 \
    --rdzv_backend=c10d \
    --rdzv_endpoint=127.0.0.1:1234 \
    ./maskdpo/train.py \
    ./maskdpo/example_config.py \
    --deepspeed deepspeed_zero3 \
    --launcher pytorch
```

## ❤️ Acknowledgements

ANAH is built with [InternLM](https://github.com/InternLM/InternLM) and [LMDeploy](https://github.com/InternLM/lagent). Thanks for their awesome work!
ANAH is built with [InternLM](https://github.com/InternLM/InternLM), [XTuner](https://github.com/InternLM/xtuner) and [LMDeploy](https://github.com/InternLM/lagent). Thanks for their awesome work!

## 🖊️ Citation

@@ -168,6 +242,12 @@ If you find this project useful in your research, please consider citing:
journal={arXiv preprint arXiv:2407.04693},
year={2024}
}

@inproceedings{gumask,
title={Mask-DPO: Generalizable Fine-grained Factuality Alignment of LLMs},
author={Gu, Yuzhe and Zhang, Wenwei and Lyu, Chengqi and Lin, Dahua and Chen, Kai},
booktitle={The Thirteenth International Conference on Learning Representations},
year={2025}
}
```

## 💳 License
Binary file added docs/figure/maskdpo.png
256 changes: 256 additions & 0 deletions maskdpo-env.yml
@@ -0,0 +1,256 @@
name: maskdpo
channels:
  - https://repo.anaconda.com/pkgs/main
  - defaults
dependencies:
  - _libgcc_mutex=0.1=main
  - _openmp_mutex=5.1=1_gnu
  - bzip2=1.0.8=h5eee18b_6
  - ca-certificates=2024.7.2=h06a4308_0
  - ld_impl_linux-64=2.38=h1181459_1
  - libffi=3.4.4=h6a678d5_1
  - libgcc-ng=11.2.0=h1234567_1
  - libgomp=11.2.0=h1234567_1
  - libstdcxx-ng=11.2.0=h1234567_1
  - libuuid=1.41.5=h5eee18b_0
  - ncurses=6.4=h6a678d5_0
  - openssl=3.0.14=h5eee18b_0
  - python=3.10.14=h955ad1f_1
  - readline=8.2=h5eee18b_0
  - setuptools=69.5.1=py310h06a4308_0
  - sqlite=3.45.3=h5eee18b_0
  - tk=8.6.14=h39e8969_0
  - wheel=0.43.0=py310h06a4308_0
  - xz=5.4.6=h5eee18b_1
  - zlib=1.2.13=h5eee18b_1
  - pip:
      - accelerate==0.33.0
      - addict==2.4.0
      - aiohappyeyeballs==2.3.2
      - aiohttp==3.10.0
      - aiosignal==1.3.1
      - altair==5.3.0
      - annotated-types==0.7.0
      - anyio==4.4.0
      - argon2-cffi==23.1.0
      - argon2-cffi-bindings==21.2.0
      - arrow==1.3.0
      - arxiv==2.1.3
      - asttokens==2.4.1
      - async-lru==2.0.4
      - async-timeout==4.0.3
      - attrs==23.2.0
      - babel==2.15.0
      - backports-strenum==1.3.1
      - beautifulsoup4==4.12.3
      - bitsandbytes==0.43.3
      - bleach==6.1.0
      - blinker==1.8.2
      - brotli==1.1.0
      - cachetools==5.4.0
      - certifi==2024.7.4
      - cffi==1.16.0
      - charset-normalizer==3.3.2
      - click==8.1.7
      - colorama==0.4.6
      - comm==0.2.2
      - contourpy==1.2.1
      - cycler==0.12.1
      - datasets==2.20.0
      - debugpy==1.8.2
      - decorator==5.1.1
      - deepspeed==0.14.4
      - defusedxml==0.7.1
      - dill==0.3.8
      - distro==1.9.0
      - docker-pycreds==0.4.0
      - duckduckgo-search==5.3.1b1
      - einops==0.8.0
      - et-xmlfile==1.1.0
      - exceptiongroup==1.2.2
      - executing==2.0.1
      - fastjsonschema==2.20.0
      - feedparser==6.0.11
      - filelock==3.15.4
      - fire==0.6.0
      - flash-attn==2.6.3
      - fonttools==4.53.1
      - fqdn==1.5.1
      - frozenlist==1.4.1
      - fsspec==2024.5.0
      - func-timeout==4.3.5
      - gitdb==4.0.11
      - gitpython==3.1.43
      - griffe==0.48.0
      - gyztools==0.1
      - h11==0.14.0
      - h2==4.1.0
      - hjson==3.1.0
      - hpack==4.0.0
      - httpcore==1.0.5
      - httpx==0.27.0
      - huggingface-hub==0.24.3
      - hyperframe==6.0.1
      - idna==3.7
      - imageio==2.34.2
      - importlib-metadata==8.2.0
      - ipykernel==6.29.5
      - ipython==8.26.0
      - ipywidgets==8.1.3
      - isoduration==20.11.0
      - jedi==0.19.1
      - jieba==0.42.1
      - jinja2==3.1.4
      - jiter==0.5.0
      - joblib==1.4.2
      - json5==0.9.25
      - jsonlines==4.0.0
      - jsonpointer==3.0.0
      - jsonschema==4.23.0
      - jsonschema-specifications==2023.12.1
      - jupyter==1.0.0
      - jupyter-client==8.6.2
      - jupyter-console==6.6.3
      - jupyter-core==5.7.2
      - jupyter-events==0.10.0
      - jupyter-lsp==2.2.5
      - jupyter-server==2.14.2
      - jupyter-server-terminals==0.5.3
      - jupyterlab==4.2.4
      - jupyterlab-pygments==0.3.0
      - jupyterlab-server==2.27.3
      - jupyterlab-widgets==3.0.11
      - kiwisolver==1.4.5
      - lagent==0.2.3
      - lazy-loader==0.4
      - markdown-it-py==3.0.0
      - markupsafe==2.1.5
      - matplotlib==3.9.1
      - matplotlib-inline==0.1.7
      - mdurl==0.1.2
      - mistune==3.0.2
      - mmengine==0.10.4
      - modelscope==1.16.1
      - mpi4py-mpich==3.1.5
      - mpmath==1.3.0
      - multidict==6.0.5
      - multiprocess==0.70.16
      - nbclient==0.10.0
      - nbconvert==7.16.4
      - nbformat==5.10.4
      - nest-asyncio==1.6.0
      - networkx==3.3
      - ninja==1.11.1.1
      - notebook==7.2.1
      - notebook-shim==0.2.4
      - numpy==1.26.4
      - nvidia-cublas-cu12==12.1.3.1
      - nvidia-cuda-cupti-cu12==12.1.105
      - nvidia-cuda-nvrtc-cu12==12.1.105
      - nvidia-cuda-runtime-cu12==12.1.105
      - nvidia-cudnn-cu12==9.1.0.70
      - nvidia-cufft-cu12==11.0.2.54
      - nvidia-curand-cu12==10.3.2.106
      - nvidia-cusolver-cu12==11.4.5.107
      - nvidia-cusparse-cu12==12.1.0.106
      - nvidia-ml-py==12.555.43
      - nvidia-nccl-cu12==2.20.5
      - nvidia-nvjitlink-cu12==12.5.82
      - nvidia-nvtx-cu12==12.1.105
      - openai==1.45.0
      - opencv-python==4.10.0.84
      - openpyxl==3.1.5
      - overrides==7.7.0
      - packaging==24.1
      - pandas==2.2.2
      - pandocfilters==1.5.1
      - parso==0.8.4
      - peft==0.12.0
      - pexpect==4.9.0
      - phx-class-registry==4.1.0
      - pillow==10.4.0
      - pip==24.2
      - platformdirs==4.2.2
      - prometheus-client==0.20.0
      - prompt-toolkit==3.0.47
      - protobuf==5.27.2
      - psutil==6.0.0
      - ptyprocess==0.7.0
      - pure-eval==0.2.3
      - py-cpuinfo==9.0.0
      - pyarrow==17.0.0
      - pyarrow-hotfix==0.6
      - pycparser==2.22
      - pydantic==2.8.2
      - pydantic-core==2.20.1
      - pydeck==0.9.1
      - pygments==2.18.0
      - pyparsing==3.1.2
      - python-dateutil==2.9.0.post0
      - python-json-logger==2.0.7
      - pytz==2024.1
      - pyyaml==6.0.1
      - pyzmq==26.0.3
      - qtconsole==5.5.2
      - qtpy==2.4.1
      - referencing==0.35.1
      - regex==2024.7.24
      - requests==2.32.3
      - rfc3339-validator==0.1.4
      - rfc3986-validator==0.1.1
      - rich==13.7.1
      - rpds-py==0.19.1
      - safetensors==0.4.3
      - scikit-image==0.24.0
      - scikit-learn==1.5.1
      - scipy==1.14.0
      - send2trash==1.8.3
      - sentence-transformers==3.0.1
      - sentencepiece==0.2.0
      - sentry-sdk==2.11.0
      - setproctitle==1.3.3
      - sgmllib3k==1.0.0
      - six==1.16.0
      - smmap==5.0.1
      - sniffio==1.3.1
      - socksio==1.0.0
      - soupsieve==2.5
      - stack-data==0.6.3
      - streamlit==1.37.0
      - sympy==1.13.1
      - tenacity==8.5.0
      - termcolor==2.4.0
      - terminado==0.18.1
      - threadpoolctl==3.5.0
      - tifffile==2024.7.24
      - tiktoken==0.7.0
      - timeout-decorator==0.5.0
      - tinycss2==1.3.0
      - tokenizers==0.19.1
      - toml==0.10.2
      - tomli==2.0.1
      - toolz==0.12.1
      - torch==2.4.0
      - torchvision==0.19.0
      - tornado==6.4.1
      - tqdm==4.66.4
      - traitlets==5.14.3
      - transformers==4.43.3
      - transformers-stream-generator==0.0.5
      - triton==3.0.0
      - types-python-dateutil==2.9.0.20240316
      - typing-extensions==4.12.2
      - tzdata==2024.1
      - uri-template==1.3.0
      - urllib3==2.2.2
      - wandb==0.17.5
      - watchdog==4.0.1
      - wcwidth==0.2.13
      - webcolors==24.6.0
      - webencodings==0.5.1
      - websocket-client==1.8.0
      - widgetsnbextension==4.0.11
      - xxhash==3.4.1
      - yapf==0.40.2
      - yarl==1.9.4
      - zipp==3.19.2