New Alternative to LLM-as-a-judge! #24

Open
milangritta opened this issue Oct 21, 2024 · 0 comments
Comments

@milangritta

Hello Clementine and the Evaluation Community,

We would like to introduce our new metric, HumanRankEval, an alternative to the popular 'LLM-as-a-judge'. Instead of using an LLM to judge machine-generated text, we use human-generated text to 'judge' the LLM! :) Please take a look and let us know what you think, thank you very much! :)

NAACL '24 PAPER LINK: https://aclanthology.org/2024.naacl-long.456/
CODE: https://github.com/huawei-noah/noah-research/tree/master/NLP/HumanRankEval
DATA: https://huggingface.co/datasets/huawei-noah/human_rank_eval
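
For anyone skimming the thread, here is a minimal sketch of the general idea as described above (score each human-written answer with the LLM and check how well the model's ordering agrees with the human ordering). This is not the paper's exact implementation, which lives in the repo linked above; the model name, the data layout, and the use of Spearman correlation here are illustrative assumptions only.

```python
import torch
from scipy.stats import spearmanr
from transformers import AutoModelForCausalLM, AutoTokenizer

MODEL_NAME = "gpt2"  # placeholder; any causal LM from the Hub works the same way

tokenizer = AutoTokenizer.from_pretrained(MODEL_NAME)
model = AutoModelForCausalLM.from_pretrained(MODEL_NAME)
model.eval()


def answer_log_likelihood(question: str, answer: str) -> float:
    """Average per-token log-likelihood the model assigns to `answer`,
    conditioned on `question` (question tokens are excluded from the loss)."""
    q_ids = tokenizer(question, return_tensors="pt").input_ids
    full_ids = tokenizer(question + " " + answer, return_tensors="pt").input_ids
    labels = full_ids.clone()
    labels[:, : q_ids.shape[1]] = -100  # ignore question positions
    with torch.no_grad():
        loss = model(full_ids, labels=labels).loss  # mean NLL over answer tokens
    return -loss.item()


def human_rank_eval_sketch(items) -> float:
    """`items`: list of dicts with a 'question' and 'answers', where each
    answer is a (text, human_votes) pair. Returns the mean rank correlation
    between the model's answer scores and the human vote counts."""
    correlations = []
    for item in items:
        model_scores = [
            answer_log_likelihood(item["question"], text)
            for text, _ in item["answers"]
        ]
        human_scores = [votes for _, votes in item["answers"]]
        rho, _ = spearmanr(model_scores, human_scores)
        correlations.append(rho)
    return sum(correlations) / len(correlations)
```

The point of the sketch is only that the "judge" signal comes from human-ranked answers rather than from another LLM; see the paper and code links above for the actual metric definition and dataset format.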
