disi-unibo-nlp / nlg-metricverse (84 stars). [COLING22] An End-to-End Library for Evaluating Natural Language Generation. Topics: visualization, python, natural-language-processing, metrics, pytorch, language-models, natural-language-generation, nlg-evaluation. Updated Dec 18, 2023. Language: Python.
ChanLiang / CONNER (27 stars). The implementation for the EMNLP 2023 paper “Beyond Factuality: A Comprehensive Evaluation of Large Language Models as Knowledge Generators”. Topics: llama, factuality, hallucinations, large-language-models, nlg-evaluation, chatgpt, llm-evaluation, emnlp2023. Updated Jan 22, 2024. Language: Python.
rashad101 / RoMe (9 stars). PyTorch code for the ACL 2022 paper "RoMe: A Robust Metric for Evaluating Natural Language Generation" (https://aclanthology.org/2022.acl-long.387/). Topics: nlp, tree, deep-learning, ted, edit, natural-language-generation, nlg, evaluation-metrics, distance-calculation, nlg-dataset, earth-movers-distance, tree-edit-distance, nlg-evaluation. Updated Aug 13, 2023. Language: Python.
megagonlabs / llm-longeval (4 stars). 💵 Code for "Less is More for Long Document Summary Evaluation by LLMs" (Wu, Iso et al.; EACL 2024). Topics: nlp, evaluation-metrics, gpt-4, long-document-summarization, llm, nlg-evaluation, eacl-2024. Updated Feb 22, 2024. Language: Python.
lidamsoukaina / NLG_Evaluation_Metrics (1 star). Topics: nlp, nlg-evaluation. Updated Apr 22, 2023. Language: Jupyter Notebook.