open-compass · Liqu1d-G · Mar 5, 2025 · Mar 4, 2025 · Mar 5, 2025 · Mar 5, 2025
diff --git a/README.md b/README.md
@@ -2,16 +2,21 @@
 
 [![license](https://img.shields.io/github/license/InternLM/opencompass.svg)](./LICENSE)
 
-This is the repository for our ANAH series of papers, containing [ANAH: Analytical Annotation of Hallucinations in Large Language Models](https://arxiv.org/abs/2405.20315) and [ANAH-v2: Scaling Analytical Hallucination Annotation of Large Language Models](https://arxiv.org/abs/2407.04693). 
+This is the repository for our ANAH series of papers, which aims to reduce hallucinations in LLMs through research on benchmarking, detecting, and mitigating hallucinations:
+
+- **[Benchmark]** [ANAH: Analytical Annotation of Hallucinations in Large Language Models](https://arxiv.org/abs/2405.20315)
+- **[Detection]** [ANAH-v2: Scaling Analytical Hallucination Annotation of Large Language Models](https://arxiv.org/abs/2407.04693)
+- **[Mitigation]** [Mask-DPO: Generalizable Fine-grained Factuality Alignment of LLMs](http://arxiv.org/abs/2503.02846)
 
 The repo contains:
 
 + The [data](#huggingface-dataset) for training and evaluating the LLM which consists of sentence-level hallucination annotations.
 + The [model](#huggingface-model) for annotating the hallucination.
 + The [code](#evaluation) for evaluating the hallucinations level of LLM-generated content and the LLMs' ability to annotate hallucination.
-
++ The [code](#maskdpo-training) for performing fine-grained factuality alignment.
 
 ## 🚀 What's New
+- **[2025.01.23]** Mask-DPO has been accepted by the ICLR 2025. 🎉🎉🎉
 - **[2024.09.26]** ANAH-v2 has been accepted by the NeurIPS 2024. 🎉🎉🎉
 - **[2024.07.12]** ANAH-v2 [Annotator](https://huggingface.co/opencompass/anah-v2) has been open-sourced. 🔥🔥🔥
 - **[2024.07.03]** ANAH [Annotator-7B](https://huggingface.co/opencompass/anah-7b) & [20B](https://huggingface.co/opencompass/anah-20b) have been open-sourced.  🔥🔥🔥
@@ -46,10 +51,24 @@ Through iterative self-training, we simultaneously and progressively scale up th
 
 The final dataset encompasses both over ∼3k topics, ∼196k model responses, and ∼822k annotated sentences, in English and Chinese.
 
+The final hallucination annotator (detector), with only 7B parameters, surpasses GPT-4 and achieves new state-of-the-art hallucination detection results on HaluEval and HalluQA under zero-shot inference.
+
 <p align="center">
   <img src="docs/figure/teaser-v2.jpg" height="500">
 </p>
 
+## Mask-DPO: Generalizable Fine-grained Factuality Alignment of LLMs
+
+[![arXiv](https://img.shields.io/badge/arXiv-2503.02846-b31b1b.svg)](http://arxiv.org/abs/2503.02846)
+
+Mask-DPO is a fine-grained factuality alignment method based on Direct Preference Optimization (DPO).
+
+By using sentence-level factuality as a mask signal, Mask-DPO learns only from the factually correct sentences in preferred samples and avoids penalizing factual content in non-preferred samples, which resolves the ambiguity in preference learning.
+
+
+<p align="center">
+  <img src="docs/figure/maskdpo.png" height="300">
+</p>
 
 ## 🤗 HuggingFace Model & Dataset
 <a name="huggingface-dataset"></a>
@@ -147,9 +166,64 @@ python -u ./eval/anah_v1/eval.py \
     --eval_sorce_path {your_evaluation_result_path} \
 ```
 
+<a name="maskdpo-training"></a>
+## 🚄 Factuality Alignment Tutorial
+
+### 1. Install Dependencies
+
+Mask-DPO utilizes [XTuner](https://github.com/InternLM/xtuner) as the training engine.
+
+```bash
+conda env create -f maskdpo-env.yml
+```
+
+### 2. Prepare Fine-grained Preference Data
+
+You need to prepare sentence-level factuality preference data in the following format:
+
+```json
+{
+    "prompt": [{"role": "user", "content": "..."}],
+    "chosen": [{"role": "assistant", "content": "..."}],
+    "chosen_item": {
+        "sents": ["sent1", "sent2", "..."],
+        "type": ["hallucination", "no_hallucination", "..."]
+    },
+    "rejected": [{"role": "assistant", "content": "..."}],
+    "rejected_item": {
+        "sents": ["sent1", "sent2", "..."],
+        "type": ["hallucination", "no_hallucination", "..."]
+    }
+}
+```
+
+where `chosen_item` holds the fine-grained factuality information for `chosen`: `sents` is the sentence-level split of the `content` in `chosen`, and `type` gives the hallucination label of the corresponding sentence.
+`rejected_item` follows the same structure for `rejected`.
+
+We recommend using [ANAH-v2](https://huggingface.co/opencompass/anah-v2) for fine-grained hallucination annotation of your data. Other annotation methods work as well.
+
+
+### 3. Training
+
+After setting the data and initial model paths at the corresponding locations in the config, use the following command to train the model.
+
+```bash
+python -m torch.distributed.run \
+  --nproc_per_node=4 \
+  --nnodes=1 \
+  --node_rank=0 \
+  --rdzv_id=1234 \
+  --rdzv_backend=c10d \
+  --rdzv_endpoint=127.0.0.1:1234 \
+  ./maskdpo/train.py \
+  ./maskdpo/example_config.py \
+  --deepspeed deepspeed_zero3 \
+  --launcher pytorch
+```
+
 ## ❤️ Acknowledgements
 
-ANAH is built with [InternLM](https://github.com/InternLM/InternLM) and [LMDeploy](https://github.com/InternLM/lagent). Thanks for their awesome work!
+ANAH is built with [InternLM](https://github.com/InternLM/InternLM), [XTuner](https://github.com/InternLM/xtuner) and [LMDeploy](https://github.com/InternLM/lagent). Thanks for their awesome work!
 
 ## 🖊️ Citation
 
@@ -168,6 +242,12 @@ If you find this project useful in your research, please consider citing:
   journal={arXiv preprint arXiv:2407.04693},
   year={2024}
 }
+
+@inproceedings{gumask,
+  title={Mask-DPO: Generalizable Fine-grained Factuality Alignment of LLMs},
+  author={Gu, Yuzhe and Zhang, Wenwei and Lyu, Chengqi and Lin, Dahua and Chen, Kai},
+  booktitle={The Thirteenth International Conference on Learning Representations}
+}
 ```
 
 ## 💳 License
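
Note on the preference data format from step 2 of the tutorial: it may help to sanity-check records before training. The following is a minimal sketch, not part of the repository. The function name `validate_record`, the closed label set `ALLOWED_TYPES` (the README shows only two labels and then elides the rest), and the assumption that every entry of `sents` is a verbatim slice of the response `content` are all illustrative.

```python
# Assumption: only the two labels shown in the README are valid.
ALLOWED_TYPES = {"hallucination", "no_hallucination"}


def validate_record(record):
    """Return a list of problems found in one preference record (empty if none)."""
    problems = []
    if not record.get("prompt"):
        problems.append("missing `prompt`")
    for side in ("chosen", "rejected"):
        messages = record.get(side)
        item = record.get(f"{side}_item")
        if not messages or item is None:
            problems.append(f"missing `{side}` or `{side}_item`")
            continue
        sents, types = item.get("sents", []), item.get("type", [])
        # Each sentence needs exactly one label.
        if len(sents) != len(types):
            problems.append(f"{side}_item: {len(sents)} sents vs {len(types)} types")
        unknown = sorted(set(types) - ALLOWED_TYPES)
        if unknown:
            problems.append(f"{side}_item: unknown types {unknown}")
        # Assumption: sentences are verbatim slices of the assistant response.
        content = messages[-1].get("content", "")
        for sent in sents:
            if sent not in content:
                problems.append(f"{side}_item: sentence not in content: {sent!r}")
    return problems


record = {
    "prompt": [{"role": "user", "content": "Who wrote Hamlet?"}],
    "chosen": [{"role": "assistant",
                "content": "Hamlet was written by William Shakespeare. It is a tragedy."}],
    "chosen_item": {
        "sents": ["Hamlet was written by William Shakespeare.", "It is a tragedy."],
        "type": ["no_hallucination", "no_hallucination"],
    },
    "rejected": [{"role": "assistant",
                  "content": "Hamlet was written by Christopher Marlowe. It is a tragedy."}],
    "rejected_item": {
        "sents": ["Hamlet was written by Christopher Marlowe.", "It is a tragedy."],
        "type": ["hallucination", "no_hallucination"],
    },
}
print(validate_record(record))  # []
```

Returning a list of problems instead of raising on the first one makes it easier to report all issues in a large dataset in a single pass.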

diff --git a/docs/figure/maskdpo.png b/docs/figure/maskdpo.png
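
Note on the masking idea described in the README: the following is a minimal, sentence-level sketch of one plausible reading of that description, not the implementation in `./maskdpo`. It assumes each sentence already has a summed log-ratio (policy log-probability minus reference log-probability over its tokens). The names `keep_mask` and `masked_dpo_loss` and the default `beta=0.1` are hypothetical, and a real trainer would most likely apply the mask per token.

```python
import math


def keep_mask(types, keep_label):
    """Return 1.0 for sentences labeled keep_label and 0.0 for all others."""
    return [1.0 if t == keep_label else 0.0 for t in types]


def masked_dpo_loss(chosen_logratios, chosen_types,
                    rejected_logratios, rejected_types, beta=0.1):
    """Sentence-level sketch of a masked DPO loss (illustrative only).

    *_logratios[i] is assumed to be the sum, over the tokens of sentence i,
    of log pi_theta(token) - log pi_ref(token).
    """
    # Preferred sample: learn only from factually correct sentences.
    chosen_mask = keep_mask(chosen_types, "no_hallucination")
    # Non-preferred sample: penalize only hallucinated sentences, so that
    # factual content in it is not pushed down.
    rejected_mask = keep_mask(rejected_types, "hallucination")
    chosen_score = sum(m * r for m, r in zip(chosen_mask, chosen_logratios))
    rejected_score = sum(m * r for m, r in zip(rejected_mask, rejected_logratios))
    margin = beta * (chosen_score - rejected_score)
    # -log(sigmoid(margin)), written in a numerically stable form.
    if margin > 0:
        return math.log1p(math.exp(-margin))
    return -margin + math.log1p(math.exp(margin))


loss = masked_dpo_loss(
    chosen_logratios=[0.4, -0.2],
    chosen_types=["no_hallucination", "hallucination"],
    rejected_logratios=[-0.5, 0.3],
    rejected_types=["hallucination", "no_hallucination"],
)
print(f"{loss:.4f}")
```

In this sketch the hallucinated sentence in the preferred sample and the factual sentence in the non-preferred sample contribute nothing to the margin, which is the ambiguity the mask is meant to remove.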
diff --git a/maskdpo-env.yml b/maskdpo-env.yml
@@ -0,0 +1,256 @@
+name: maskdpo
+channels:
+  - https://repo.anaconda.com/pkgs/main
+  - defaults
+dependencies:
+  - _libgcc_mutex=0.1=main
+  - _openmp_mutex=5.1=1_gnu
+  - bzip2=1.0.8=h5eee18b_6
+  - ca-certificates=2024.7.2=h06a4308_0
+  - ld_impl_linux-64=2.38=h1181459_1
+  - libffi=3.4.4=h6a678d5_1
+  - libgcc-ng=11.2.0=h1234567_1
+  - libgomp=11.2.0=h1234567_1
+  - libstdcxx-ng=11.2.0=h1234567_1
+  - libuuid=1.41.5=h5eee18b_0
+  - ncurses=6.4=h6a678d5_0
+  - openssl=3.0.14=h5eee18b_0
+  - python=3.10.14=h955ad1f_1
+  - readline=8.2=h5eee18b_0
+  - setuptools=69.5.1=py310h06a4308_0
+  - sqlite=3.45.3=h5eee18b_0
+  - tk=8.6.14=h39e8969_0
+  - wheel=0.43.0=py310h06a4308_0
+  - xz=5.4.6=h5eee18b_1
+  - zlib=1.2.13=h5eee18b_1
+  - pip:
+      - accelerate==0.33.0
+      - addict==2.4.0
+      - aiohappyeyeballs==2.3.2
+      - aiohttp==3.10.0
+      - aiosignal==1.3.1
+      - altair==5.3.0
+      - annotated-types==0.7.0
+      - anyio==4.4.0
+      - argon2-cffi==23.1.0
+      - argon2-cffi-bindings==21.2.0
+      - arrow==1.3.0
+      - arxiv==2.1.3
+      - asttokens==2.4.1
+      - async-lru==2.0.4
+      - async-timeout==4.0.3
+      - attrs==23.2.0
+      - babel==2.15.0
+      - backports-strenum==1.3.1
+      - beautifulsoup4==4.12.3
+      - bitsandbytes==0.43.3
+      - bleach==6.1.0
+      - blinker==1.8.2
+      - brotli==1.1.0
+      - cachetools==5.4.0
+      - certifi==2024.7.4
+      - cffi==1.16.0
+      - charset-normalizer==3.3.2
+      - click==8.1.7
+      - colorama==0.4.6
+      - comm==0.2.2
+      - contourpy==1.2.1
+      - cycler==0.12.1
+      - datasets==2.20.0
+      - debugpy==1.8.2
+      - decorator==5.1.1
+      - deepspeed==0.14.4
+      - defusedxml==0.7.1
+      - dill==0.3.8
+      - distro==1.9.0
+      - docker-pycreds==0.4.0
+      - duckduckgo-search==5.3.1b1
+      - einops==0.8.0
+      - et-xmlfile==1.1.0
+      - exceptiongroup==1.2.2
+      - executing==2.0.1
+      - fastjsonschema==2.20.0
+      - feedparser==6.0.11
+      - filelock==3.15.4
+      - fire==0.6.0
+      - flash-attn==2.6.3
+      - fonttools==4.53.1
+      - fqdn==1.5.1
+      - frozenlist==1.4.1
+      - fsspec==2024.5.0
+      - func-timeout==4.3.5
+      - gitdb==4.0.11
+      - gitpython==3.1.43
+      - griffe==0.48.0
+      - gyztools==0.1
+      - h11==0.14.0
+      - h2==4.1.0
+      - hjson==3.1.0
+      - hpack==4.0.0
+      - httpcore==1.0.5
+      - httpx==0.27.0
+      - huggingface-hub==0.24.3
+      - hyperframe==6.0.1
+      - idna==3.7
+      - imageio==2.34.2
+      - importlib-metadata==8.2.0
+      - ipykernel==6.29.5
+      - ipython==8.26.0
+      - ipywidgets==8.1.3
+      - isoduration==20.11.0
+      - jedi==0.19.1
+      - jieba==0.42.1
+      - jinja2==3.1.4
+      - jiter==0.5.0
+      - joblib==1.4.2
+      - json5==0.9.25
+      - jsonlines==4.0.0
+      - jsonpointer==3.0.0
+      - jsonschema==4.23.0
+      - jsonschema-specifications==2023.12.1
+      - jupyter==1.0.0
+      - jupyter-client==8.6.2
+      - jupyter-console==6.6.3
+      - jupyter-core==5.7.2
+      - jupyter-events==0.10.0
+      - jupyter-lsp==2.2.5
+      - jupyter-server==2.14.2
+      - jupyter-server-terminals==0.5.3
+      - jupyterlab==4.2.4
+      - jupyterlab-pygments==0.3.0
+      - jupyterlab-server==2.27.3
+      - jupyterlab-widgets==3.0.11
+      - kiwisolver==1.4.5
+      - lagent==0.2.3
+      - lazy-loader==0.4
+      - markdown-it-py==3.0.0
+      - markupsafe==2.1.5
+      - matplotlib==3.9.1
+      - matplotlib-inline==0.1.7
+      - mdurl==0.1.2
+      - mistune==3.0.2
+      - mmengine==0.10.4
+      - modelscope==1.16.1
+      - mpi4py-mpich==3.1.5
+      - mpmath==1.3.0
+      - multidict==6.0.5
+      - multiprocess==0.70.16
+      - nbclient==0.10.0
+      - nbconvert==7.16.4
+      - nbformat==5.10.4
+      - nest-asyncio==1.6.0
+      - networkx==3.3
+      - ninja==1.11.1.1
+      - notebook==7.2.1
+      - notebook-shim==0.2.4
+      - numpy==1.26.4
+      - nvidia-cublas-cu12==12.1.3.1
+      - nvidia-cuda-cupti-cu12==12.1.105
+      - nvidia-cuda-nvrtc-cu12==12.1.105
+      - nvidia-cuda-runtime-cu12==12.1.105
+      - nvidia-cudnn-cu12==9.1.0.70
+      - nvidia-cufft-cu12==11.0.2.54
+      - nvidia-curand-cu12==10.3.2.106
+      - nvidia-cusolver-cu12==11.4.5.107
+      - nvidia-cusparse-cu12==12.1.0.106
+      - nvidia-ml-py==12.555.43
+      - nvidia-nccl-cu12==2.20.5
+      - nvidia-nvjitlink-cu12==12.5.82
+      - nvidia-nvtx-cu12==12.1.105
+      - openai==1.45.0
+      - opencv-python==4.10.0.84
+      - openpyxl==3.1.5
+      - overrides==7.7.0
+      - packaging==24.1
+      - pandas==2.2.2
+      - pandocfilters==1.5.1
+      - parso==0.8.4
+      - peft==0.12.0
+      - pexpect==4.9.0
+      - phx-class-registry==4.1.0
+      - pillow==10.4.0
+      - pip==24.2
+      - platformdirs==4.2.2
+      - prometheus-client==0.20.0
+      - prompt-toolkit==3.0.47
+      - protobuf==5.27.2
+      - psutil==6.0.0
+      - ptyprocess==0.7.0
+      - pure-eval==0.2.3
+      - py-cpuinfo==9.0.0
+      - pyarrow==17.0.0
+      - pyarrow-hotfix==0.6
+      - pycparser==2.22
+      - pydantic==2.8.2
+      - pydantic-core==2.20.1
+      - pydeck==0.9.1
+      - pygments==2.18.0
+      - pyparsing==3.1.2
+      - python-dateutil==2.9.0.post0
+      - python-json-logger==2.0.7
+      - pytz==2024.1
+      - pyyaml==6.0.1
+      - pyzmq==26.0.3
+      - qtconsole==5.5.2
+      - qtpy==2.4.1
+      - referencing==0.35.1
+      - regex==2024.7.24
+      - requests==2.32.3
+      - rfc3339-validator==0.1.4
+      - rfc3986-validator==0.1.1
+      - rich==13.7.1
+      - rpds-py==0.19.1
+      - safetensors==0.4.3
+      - scikit-image==0.24.0
+      - scikit-learn==1.5.1
+      - scipy==1.14.0
+      - send2trash==1.8.3
+      - sentence-transformers==3.0.1
+      - sentencepiece==0.2.0
+      - sentry-sdk==2.11.0
+      - setproctitle==1.3.3
+      - sgmllib3k==1.0.0
+      - six==1.16.0
+      - smmap==5.0.1
+      - sniffio==1.3.1
+      - socksio==1.0.0
+      - soupsieve==2.5
+      - stack-data==0.6.3
+      - streamlit==1.37.0
+      - sympy==1.13.1
+      - tenacity==8.5.0
+      - termcolor==2.4.0
+      - terminado==0.18.1
+      - threadpoolctl==3.5.0
+      - tifffile==2024.7.24
+      - tiktoken==0.7.0
+      - timeout-decorator==0.5.0
+      - tinycss2==1.3.0
+      - tokenizers==0.19.1
+      - toml==0.10.2
+      - tomli==2.0.1
+      - toolz==0.12.1
+      - torch==2.4.0
+      - torchvision==0.19.0
+      - tornado==6.4.1
+      - tqdm==4.66.4
+      - traitlets==5.14.3
+      - transformers==4.43.3
+      - transformers-stream-generator==0.0.5
+      - triton==3.0.0
+      - types-python-dateutil==2.9.0.20240316
+      - typing-extensions==4.12.2
+      - tzdata==2024.1
+      - uri-template==1.3.0
+      - urllib3==2.2.2
+      - wandb==0.17.5
+      - watchdog==4.0.1
+      - wcwidth==0.2.13
+      - webcolors==24.6.0
+      - webencodings==0.5.1
+      - websocket-client==1.8.0
+      - widgetsnbextension==4.0.11
+      - xxhash==3.4.1
+      - yapf==0.40.2
+      - yarl==1.9.4
+      - zipp==3.19.2
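
Note on the pinned environment: after creating the environment, it can be useful to confirm that the installed packages match the pins under `pip:`. The following is a minimal sketch using only the standard library. The helper names `parse_pip_pins` and `find_mismatches` are hypothetical, and the parser assumes the plain file layout shown in this diff (without the leading `+` markers) instead of handling YAML in general.

```python
from importlib import metadata


def parse_pip_pins(env_text):
    """Collect `name==version` pins listed under the `pip:` entry of a conda env file."""
    pins, in_pip = {}, False
    for raw in env_text.splitlines():
        line = raw.strip()
        if line == "- pip:":
            in_pip = True
        elif in_pip and line.startswith("- ") and "==" in line:
            name, version = line[2:].split("==", 1)
            pins[name.strip()] = version.strip()
    return pins


def find_mismatches(pins):
    """Map package name -> (pinned, installed); installed is None if the package is absent."""
    mismatches = {}
    for name, pinned in pins.items():
        try:
            installed = metadata.version(name)
        except metadata.PackageNotFoundError:
            installed = None
        if installed != pinned:
            mismatches[name] = (pinned, installed)
    return mismatches


sample = """dependencies:
  - python=3.10.14=h955ad1f_1
  - pip:
      - torch==2.4.0
      - transformers==4.43.3
"""
print(parse_pip_pins(sample))  # {'torch': '2.4.0', 'transformers': '4.43.3'}
```

To check a real environment, read `maskdpo-env.yml` from disk, pass its text to `parse_pip_pins`, and pass the result to `find_mismatches`; an empty dictionary means every pinned package is installed at the pinned version.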