Commit d5fd766

[Update] release Mask-DPO (#10)
* release mask-dpo
1 parent b98c538 commit d5fd766

File tree

7 files changed

+1480
-3
lines changed

README.md

Lines changed: 83 additions & 3 deletions
@@ -2,16 +2,21 @@
22

33
[![license](https://img.shields.io/github/license/InternLM/opencompass.svg)](./LICENSE)
44

5-
This is the repository for our ANAH series of papers, containing [ANAH: Analytical Annotation of Hallucinations in Large Language Models](https://arxiv.org/abs/2405.20315) and [ANAH-v2: Scaling Analytical Hallucination Annotation of Large Language Models](https://arxiv.org/abs/2407.04693).
5+
This is the repository for our ANAH series of papers, which aims to reduce hallucinations in LLMs through research on the benchmarking, detection, and mitigation of hallucinations:
6+
7+
- **[Benchmark]** [ANAH: Analytical Annotation of Hallucinations in Large Language Models](https://arxiv.org/abs/2405.20315)
8+
- **[Detection]** [ANAH-v2: Scaling Analytical Hallucination Annotation of Large Language Models](https://arxiv.org/abs/2407.04693)
9+
- **[Mitigation]** [Mask-DPO: Generalizable Fine-grained Factuality Alignment of LLMs](http://arxiv.org/abs/2503.02846)
610

711
The repo contains:
812

913
+ The [data](#huggingface-dataset) for training and evaluating the LLM which consists of sentence-level hallucination annotations.
1014
+ The [model](#huggingface-model) for annotating the hallucination.
1115
+ The [code](#evaluation) for evaluating the hallucinations level of LLM-generated content and the LLMs' ability to annotate hallucination.
12-
16+
+ The [code](#maskdpo-training) for performing fine-grained factuality alignment.
1317

1418
## 🚀 What's New
19+
- **[2025.01.23]** Mask-DPO has been accepted by ICLR 2025. 🎉🎉🎉
1520
- **[2024.09.26]** ANAH-v2 has been accepted by NeurIPS 2024. 🎉🎉🎉
1621
- **[2024.07.12]** ANAH-v2 [Annotator](https://huggingface.co/opencompass/anah-v2) has been open-sourced. 🔥🔥🔥
1722
- **[2024.07.03]** ANAH [Annotator-7B](https://huggingface.co/opencompass/anah-7b) & [20B](https://huggingface.co/opencompass/anah-20b) have been open-sourced. 🔥🔥🔥
@@ -46,10 +51,24 @@ Through iterative self-training, we simultaneously and progressively scale up th
4651

4752
The final dataset encompasses over ∼3k topics, ∼196k model responses, and ∼822k annotated sentences, in both English and Chinese.
4853

54+
The final hallucination annotator (detector) with only 7B parameters surpasses the performance of GPT-4 and obtains new state-of-the-art hallucination detection results on HaluEval and HalluQA by zero-shot inference.
55+
4956
<p align="center">
5057
<img src="docs/figure/teaser-v2.jpg" height="500">
5158
</p>
5259

60+
## Mask-DPO: Generalizable Fine-grained Factuality Alignment of LLMs
61+
62+
[![arXiv](https://img.shields.io/badge/arXiv-2503.02846-b31b1b.svg)](http://arxiv.org/abs/2503.02846)
63+
64+
Mask-DPO is a fine-grained factuality alignment method based on Direct Preference Optimization (DPO).
65+
66+
Mask-DPO incorporates sentence-level factuality as mask signals: it learns only from factually correct sentences in the preferred samples and avoids penalizing factual content in the non-preferred samples, which resolves the ambiguity in preference learning.
67+
68+
69+
<p align="center">
70+
<img src="docs/figure/maskdpo.png" height="300">
71+
</p>
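The masking idea described above can be illustrated with a minimal sketch. This is not the repository's implementation: the function names, the `beta` value, and the use of plain Python lists of per-token log-probabilities are all assumptions made for illustration. It assumes the chosen mask is 1 on tokens of factually correct sentences and the rejected mask is 1 on tokens of hallucinated sentences, and it applies the standard DPO objective to the masked log-probability sums.

```python
import math


def masked_logp(token_logps, mask):
    """Sum per-token log-probabilities over positions where mask is 1."""
    return sum(lp for lp, m in zip(token_logps, mask) if m)


def softplus(x):
    """Numerically stable log(1 + exp(x))."""
    return max(x, 0.0) + math.log1p(math.exp(-abs(x)))


def mask_dpo_loss(policy_chosen, ref_chosen, chosen_mask,
                  policy_rejected, ref_rejected, rejected_mask,
                  beta=0.1):
    """DPO loss computed only on unmasked tokens (illustrative sketch).

    chosen_mask:   1 for tokens in factually correct sentences (assumed).
    rejected_mask: 1 for tokens in hallucinated sentences (assumed).
    beta:          hypothetical temperature, not taken from the repo.
    """
    chosen_ratio = (masked_logp(policy_chosen, chosen_mask)
                    - masked_logp(ref_chosen, chosen_mask))
    rejected_ratio = (masked_logp(policy_rejected, rejected_mask)
                      - masked_logp(ref_rejected, rejected_mask))
    margin = beta * (chosen_ratio - rejected_ratio)
    # -log(sigmoid(margin)) equals softplus(-margin)
    return softplus(-margin)
```

Because masked-out positions never enter the sums, a hallucinated sentence inside the preferred response is not rewarded and a correct sentence inside the non-preferred response is not penalized.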
5372

5473
## 🤗 HuggingFace Model & Dataset
5574
<a name="huggingface-dataset"></a>
@@ -147,9 +166,64 @@ python -u ./eval/anah_v1/eval.py \
147166
--eval_sorce_path {your_evaluation_result_path} \
148167
```
149168

169+
<a name="maskdpo-training"></a>
170+
## 🚄 Factuality Alignment Tutorial
171+
172+
### 1. Install Dependencies
173+
174+
Mask-DPO utilizes [XTuner](https://github.com/InternLM/xtuner) as the training engine.
175+
176+
```bash
177+
conda env create -f maskdpo-env.yml
178+
```
179+
180+
### 2. Prepare Fine-grained Preference Data
181+
182+
You need to prepare sentence-level factuality preference data in the following format:
183+
184+
```json
185+
{
186+
"prompt": [{"role": "user", "content": "..."}],
187+
"chosen": [{"role": "assistant", "content": "..."}],
188+
"chosen_item": {
189+
"sents": ["sent1", "sent2", "..."],
190+
"type": ["hallucination", "no_hallucination", "..."]
191+
},
192+
"rejected": [{"role": "assistant", "content": "..."}],
193+
"rejected_item": {
194+
"sents": ["sent1", "sent2", "..."],
195+
"type": ["hallucination", "no_hallucination", "..."]
196+
}
197+
}
198+
```
199+
200+
where `chosen_item` holds the fine-grained factuality information for `chosen`: `sents` is the sentence-level split of the `content` in `chosen`, and `type` gives the hallucination label of the corresponding sentence.
201+
`rejected_item` follows the same structure for `rejected`.
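As a rough illustration of how such a record could be checked and turned into a mask, the sketch below builds a character-level mask from one `*_item`. The helper name `char_mask` and the example record are hypothetical, and the sketch assumes every entry of `sents` appears verbatim and in order inside `content`; the repository's own preprocessing may differ.

```python
def char_mask(content, item, keep_label):
    """Return a 0/1 list over characters of `content`.

    Characters belonging to sentences whose label equals `keep_label`
    get 1; everything else (other sentences, separators) gets 0.
    Assumes each sentence occurs verbatim and in order in `content`.
    """
    sents, labels = item["sents"], item["type"]
    if len(sents) != len(labels):
        raise ValueError("`sents` and `type` must have the same length")

    mask = [0] * len(content)
    cursor = 0
    for sent, label in zip(sents, labels):
        start = content.find(sent, cursor)
        if start < 0:
            raise ValueError(f"sentence not found in content: {sent!r}")
        end = start + len(sent)
        if label == keep_label:
            mask[start:end] = [1] * len(sent)
        cursor = end
    return mask


# Hypothetical record following the format shown above.
record = {
    "chosen": [{"role": "assistant",
                "content": "Paris is in France. It has 90 million people."}],
    "chosen_item": {
        "sents": ["Paris is in France.", "It has 90 million people."],
        "type": ["no_hallucination", "hallucination"],
    },
}

chosen_mask = char_mask(record["chosen"][0]["content"],
                        record["chosen_item"],
                        keep_label="no_hallucination")
```

A character-level mask like this would still need to be mapped onto token positions, for example through a tokenizer's offset mapping, before it can be used during training.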
202+
203+
We recommend using [ANAH-v2](https://huggingface.co/opencompass/anah-v2) to annotate your data with fine-grained hallucination labels, although other annotation methods work as well.
204+
205+
206+
### 3. Training
207+
208+
After setting the data path and the initial model path in the config, train the model with the following command.
209+
210+
```bash
211+
python -m torch.distributed.run \
212+
--nproc_per_node=4 \
213+
--nnodes=1 \
214+
--node_rank=0 \
215+
--rdzv_id=1234 \
216+
--rdzv_backend=c10d \
217+
--rdzv_endpoint=127.0.0.1:1234 \
218+
./maskdpo/train.py \
219+
./maskdpo/example_config.py \
220+
--deepspeed deepspeed_zero3 \
221+
--launcher pytorch
222+
```
223+
150224
## ❤️ Acknowledgements
151225

152-
ANAH is built with [InternLM](https://github.com/InternLM/InternLM) and [LMDeploy](https://github.com/InternLM/lagent). Thanks for their awesome work!
226+
ANAH is built with [InternLM](https://github.com/InternLM/InternLM), [XTuner](https://github.com/InternLM/xtuner) and [LMDeploy](https://github.com/InternLM/lagent). Thanks for their awesome work!
153227

154228
## 🖊️ Citation
155229

@@ -168,6 +242,12 @@ If you find this project useful in your research, please consider citing:
168242
journal={arXiv preprint arXiv:2407.04693},
169243
year={2024}
170244
}
245+
246+
@inproceedings{gumask,
247+
title={Mask-DPO: Generalizable Fine-grained Factuality Alignment of LLMs},
248+
author={Gu, Yuzhe and Zhang, Wenwei and Lyu, Chengqi and Lin, Dahua and Chen, Kai},
249+
booktitle={The Thirteenth International Conference on Learning Representations}
250+
}
171251
```
172252

173253
## 💳 License

docs/figure/maskdpo.png

325 KB

maskdpo-env.yml

Lines changed: 256 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -0,0 +1,256 @@
1+
name: maskdpo
2+
channels:
3+
- https://repo.anaconda.com/pkgs/main
4+
- defaults
5+
dependencies:
6+
- _libgcc_mutex=0.1=main
7+
- _openmp_mutex=5.1=1_gnu
8+
- bzip2=1.0.8=h5eee18b_6
9+
- ca-certificates=2024.7.2=h06a4308_0
10+
- ld_impl_linux-64=2.38=h1181459_1
11+
- libffi=3.4.4=h6a678d5_1
12+
- libgcc-ng=11.2.0=h1234567_1
13+
- libgomp=11.2.0=h1234567_1
14+
- libstdcxx-ng=11.2.0=h1234567_1
15+
- libuuid=1.41.5=h5eee18b_0
16+
- ncurses=6.4=h6a678d5_0
17+
- openssl=3.0.14=h5eee18b_0
18+
- python=3.10.14=h955ad1f_1
19+
- readline=8.2=h5eee18b_0
20+
- setuptools=69.5.1=py310h06a4308_0
21+
- sqlite=3.45.3=h5eee18b_0
22+
- tk=8.6.14=h39e8969_0
23+
- wheel=0.43.0=py310h06a4308_0
24+
- xz=5.4.6=h5eee18b_1
25+
- zlib=1.2.13=h5eee18b_1
26+
- pip:
27+
- accelerate==0.33.0
28+
- addict==2.4.0
29+
- aiohappyeyeballs==2.3.2
30+
- aiohttp==3.10.0
31+
- aiosignal==1.3.1
32+
- altair==5.3.0
33+
- annotated-types==0.7.0
34+
- anyio==4.4.0
35+
- argon2-cffi==23.1.0
36+
- argon2-cffi-bindings==21.2.0
37+
- arrow==1.3.0
38+
- arxiv==2.1.3
39+
- asttokens==2.4.1
40+
- async-lru==2.0.4
41+
- async-timeout==4.0.3
42+
- attrs==23.2.0
43+
- babel==2.15.0
44+
- backports-strenum==1.3.1
45+
- beautifulsoup4==4.12.3
46+
- bitsandbytes==0.43.3
47+
- bleach==6.1.0
48+
- blinker==1.8.2
49+
- brotli==1.1.0
50+
- cachetools==5.4.0
51+
- certifi==2024.7.4
52+
- cffi==1.16.0
53+
- charset-normalizer==3.3.2
54+
- click==8.1.7
55+
- colorama==0.4.6
56+
- comm==0.2.2
57+
- contourpy==1.2.1
58+
- cycler==0.12.1
59+
- datasets==2.20.0
60+
- debugpy==1.8.2
61+
- decorator==5.1.1
62+
- deepspeed==0.14.4
63+
- defusedxml==0.7.1
64+
- dill==0.3.8
65+
- distro==1.9.0
66+
- docker-pycreds==0.4.0
67+
- duckduckgo-search==5.3.1b1
68+
- einops==0.8.0
69+
- et-xmlfile==1.1.0
70+
- exceptiongroup==1.2.2
71+
- executing==2.0.1
72+
- fastjsonschema==2.20.0
73+
- feedparser==6.0.11
74+
- filelock==3.15.4
75+
- fire==0.6.0
76+
- flash-attn==2.6.3
77+
- fonttools==4.53.1
78+
- fqdn==1.5.1
79+
- frozenlist==1.4.1
80+
- fsspec==2024.5.0
81+
- func-timeout==4.3.5
82+
- gitdb==4.0.11
83+
- gitpython==3.1.43
84+
- griffe==0.48.0
85+
- gyztools==0.1
86+
- h11==0.14.0
87+
- h2==4.1.0
88+
- hjson==3.1.0
89+
- hpack==4.0.0
90+
- httpcore==1.0.5
91+
- httpx==0.27.0
92+
- huggingface-hub==0.24.3
93+
- hyperframe==6.0.1
94+
- idna==3.7
95+
- imageio==2.34.2
96+
- importlib-metadata==8.2.0
97+
- ipykernel==6.29.5
98+
- ipython==8.26.0
99+
- ipywidgets==8.1.3
100+
- isoduration==20.11.0
101+
- jedi==0.19.1
102+
- jieba==0.42.1
103+
- jinja2==3.1.4
104+
- jiter==0.5.0
105+
- joblib==1.4.2
106+
- json5==0.9.25
107+
- jsonlines==4.0.0
108+
- jsonpointer==3.0.0
109+
- jsonschema==4.23.0
110+
- jsonschema-specifications==2023.12.1
111+
- jupyter==1.0.0
112+
- jupyter-client==8.6.2
113+
- jupyter-console==6.6.3
114+
- jupyter-core==5.7.2
115+
- jupyter-events==0.10.0
116+
- jupyter-lsp==2.2.5
117+
- jupyter-server==2.14.2
118+
- jupyter-server-terminals==0.5.3
119+
- jupyterlab==4.2.4
120+
- jupyterlab-pygments==0.3.0
121+
- jupyterlab-server==2.27.3
122+
- jupyterlab-widgets==3.0.11
123+
- kiwisolver==1.4.5
124+
- lagent==0.2.3
125+
- lazy-loader==0.4
126+
- markdown-it-py==3.0.0
127+
- markupsafe==2.1.5
128+
- matplotlib==3.9.1
129+
- matplotlib-inline==0.1.7
130+
- mdurl==0.1.2
131+
- mistune==3.0.2
132+
- mmengine==0.10.4
133+
- modelscope==1.16.1
134+
- mpi4py-mpich==3.1.5
135+
- mpmath==1.3.0
136+
- multidict==6.0.5
137+
- multiprocess==0.70.16
138+
- nbclient==0.10.0
139+
- nbconvert==7.16.4
140+
- nbformat==5.10.4
141+
- nest-asyncio==1.6.0
142+
- networkx==3.3
143+
- ninja==1.11.1.1
144+
- notebook==7.2.1
145+
- notebook-shim==0.2.4
146+
- numpy==1.26.4
147+
- nvidia-cublas-cu12==12.1.3.1
148+
- nvidia-cuda-cupti-cu12==12.1.105
149+
- nvidia-cuda-nvrtc-cu12==12.1.105
150+
- nvidia-cuda-runtime-cu12==12.1.105
151+
- nvidia-cudnn-cu12==9.1.0.70
152+
- nvidia-cufft-cu12==11.0.2.54
153+
- nvidia-curand-cu12==10.3.2.106
154+
- nvidia-cusolver-cu12==11.4.5.107
155+
- nvidia-cusparse-cu12==12.1.0.106
156+
- nvidia-ml-py==12.555.43
157+
- nvidia-nccl-cu12==2.20.5
158+
- nvidia-nvjitlink-cu12==12.5.82
159+
- nvidia-nvtx-cu12==12.1.105
160+
- openai==1.45.0
161+
- opencv-python==4.10.0.84
162+
- openpyxl==3.1.5
163+
- overrides==7.7.0
164+
- packaging==24.1
165+
- pandas==2.2.2
166+
- pandocfilters==1.5.1
167+
- parso==0.8.4
168+
- peft==0.12.0
169+
- pexpect==4.9.0
170+
- phx-class-registry==4.1.0
171+
- pillow==10.4.0
172+
- pip==24.2
173+
- platformdirs==4.2.2
174+
- prometheus-client==0.20.0
175+
- prompt-toolkit==3.0.47
176+
- protobuf==5.27.2
177+
- psutil==6.0.0
178+
- ptyprocess==0.7.0
179+
- pure-eval==0.2.3
180+
- py-cpuinfo==9.0.0
181+
- pyarrow==17.0.0
182+
- pyarrow-hotfix==0.6
183+
- pycparser==2.22
184+
- pydantic==2.8.2
185+
- pydantic-core==2.20.1
186+
- pydeck==0.9.1
187+
- pygments==2.18.0
188+
- pyparsing==3.1.2
189+
- python-dateutil==2.9.0.post0
190+
- python-json-logger==2.0.7
191+
- pytz==2024.1
192+
- pyyaml==6.0.1
193+
- pyzmq==26.0.3
194+
- qtconsole==5.5.2
195+
- qtpy==2.4.1
196+
- referencing==0.35.1
197+
- regex==2024.7.24
198+
- requests==2.32.3
199+
- rfc3339-validator==0.1.4
200+
- rfc3986-validator==0.1.1
201+
- rich==13.7.1
202+
- rpds-py==0.19.1
203+
- safetensors==0.4.3
204+
- scikit-image==0.24.0
205+
- scikit-learn==1.5.1
206+
- scipy==1.14.0
207+
- send2trash==1.8.3
208+
- sentence-transformers==3.0.1
209+
- sentencepiece==0.2.0
210+
- sentry-sdk==2.11.0
211+
- setproctitle==1.3.3
212+
- sgmllib3k==1.0.0
213+
- six==1.16.0
214+
- smmap==5.0.1
215+
- sniffio==1.3.1
216+
- socksio==1.0.0
217+
- soupsieve==2.5
218+
- stack-data==0.6.3
219+
- streamlit==1.37.0
220+
- sympy==1.13.1
221+
- tenacity==8.5.0
222+
- termcolor==2.4.0
223+
- terminado==0.18.1
224+
- threadpoolctl==3.5.0
225+
- tifffile==2024.7.24
226+
- tiktoken==0.7.0
227+
- timeout-decorator==0.5.0
228+
- tinycss2==1.3.0
229+
- tokenizers==0.19.1
230+
- toml==0.10.2
231+
- tomli==2.0.1
232+
- toolz==0.12.1
233+
- torch==2.4.0
234+
- torchvision==0.19.0
235+
- tornado==6.4.1
236+
- tqdm==4.66.4
237+
- traitlets==5.14.3
238+
- transformers==4.43.3
239+
- transformers-stream-generator==0.0.5
240+
- triton==3.0.0
241+
- types-python-dateutil==2.9.0.20240316
242+
- typing-extensions==4.12.2
243+
- tzdata==2024.1
244+
- uri-template==1.3.0
245+
- urllib3==2.2.2
246+
- wandb==0.17.5
247+
- watchdog==4.0.1
248+
- wcwidth==0.2.13
249+
- webcolors==24.6.0
250+
- webencodings==0.5.1
251+
- websocket-client==1.8.0
252+
- widgetsnbextension==4.0.11
253+
- xxhash==3.4.1
254+
- yapf==0.40.2
255+
- yarl==1.9.4
256+
- zipp==3.19.2
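After creating the environment, it can be useful to confirm that the key training packages match the pins above. The following is a small sketch using the standard-library `importlib.metadata`; the helper name `find_mismatches` and the choice of which packages to check are assumptions, while the version strings are copied from the list above.

```python
from importlib import metadata

# Versions copied from the pinned pip dependencies above.
PINNED = {
    "torch": "2.4.0",
    "transformers": "4.43.3",
    "deepspeed": "0.14.4",
    "flash-attn": "2.6.3",
}


def find_mismatches(pinned, get_version=metadata.version):
    """Return {package: (expected, installed)} for packages that differ.

    `installed` is None when the package is missing. A local build
    suffix such as "+cu121" is ignored when comparing versions.
    """
    mismatches = {}
    for name, expected in pinned.items():
        try:
            installed = get_version(name)
        except metadata.PackageNotFoundError:
            installed = None
        base = installed.split("+")[0] if installed else None
        if base != expected:
            mismatches[name] = (expected, installed)
    return mismatches
```

Calling `find_mismatches(PINNED)` inside the activated environment returns an empty dictionary when all four packages match their pins.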
