Skip to content

🎁[ChatGPT4MT] Towards Making the Most of ChatGPT for Machine Translation

Notifications You must be signed in to change notification settings

Romainpkq/ChatGPT4MT

Folders and files

NameName
Last commit message
Last commit date

Latest commit

 
 
 
 
 
 
 
 
 
 
 
 
 

Repository files navigation

Towards Making the Most of ChatGPT for Machine Translation

Towards Making the Most of ChatGPT for Machine Translation. (Full report, Findings of EMNLP 2023 accpeted version)

This repository releases the test sets evaluated by ChatGPT API (gpt-3.5-turbo-0301), for the replication of the study.

Abstract

image

Data and Evaluations

We evaluate the performance of the models on the Flores-200 and WMT19 Bio and News test sets. The task statistics are shown as follows:

image

Results and Findings

  1. ChatGPT's performance largely depends on the temperatures, especially in difficult languages. Generally, setting a lower temperature can result in higher performance.

    The relationship between temperature and ChatGPT's performance:

imageTSP
  1. Emphasizing the task information in prompts can further improve ChatGPT's performance, especially in complex tasks.

    Influence of Task-Specific Prompts (TPS) on ChatGPT:

image
  1. Introducing the correct domain information consistently improves ChatGPT's performance while wrong domain information leads to significant degradation in performance.

    Influence of Domain-Specific Prompts (DPS) on ChatGPT:

image
  1. When tackling non-English-centric tasks (both the input and expected output are non-English), ChatGPT may generate hallucinations, which should be paid more attention to by the MT/NLP community.

    The number of sentences that need to be post-preprocessed in different settings:

image
  1. CoT leads to word-by-word translation behavior, thus bringing significant translation degradation.

    The effect of CoT on ChatGPT:

image

Please refer to our full report for more details.

Media Coverage

Citation

If you find this work helpful, please consider citing as follows:

@inproceedings{Peng2023ChatGPT4MT,
  title={Towards Making the Most of ChatGPT for Machine Translation},
  author={Peng, Keqin and Ding, Liang and Zhong, Qihuang and Shen, Li and Liu, Xuebo and Zhang, Min and Ouyang, Yuanxin and Tao, Dacheng},
  booktitle={Findings of EMNLP 2023},
  url={https://aclanthology.org/2023.findings-emnlp.373},
  year={2023}
}

Releases

No releases published

Packages

No packages published

Languages