feat(utils): show count of evaluated samples in Markdown summary table by anzzyspeaksgit · Pull Request #1188 · huggingface/lighteval

anzzyspeaksgit · 2026-03-13T06:03:39Z

Description

Fixes #804

Previously, the `make_results_table` function printed only the task, version, metric, value, and standard error. As pointed out in the issue, a low or zero score was ambiguous: the task might have evaluated exactly 0 items, or it might have evaluated 10 items and scored all of them as incorrect.

This PR updates the Markdown generator to include a `Count` column that retrieves `number_of_samples` from the `summary_tasks` and `summary_general` dictionaries generated by the pipeline.

Example Output

| Task | Version | Metric | Value | | Stderr | Count |
|---|---|---|---|---|---|---|
| squad | v2.0 | acc | 0.85 | ± | 0.02 | 100 |
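
For readers who want to see the idea in code, here is a minimal sketch of how a per-task sample count could be added to such a table. It is not the diff from this PR: the function name `make_results_table_sketch` and the assumed layout of the input dictionary (`results`, `versions`, and `summary_tasks` keyed by task name, with stderr stored under `<metric>_stderr`) are illustrative assumptions, and the sketch only reads `summary_tasks`, not `summary_general`.

```python
from typing import Any


def make_results_table_sketch(result_dict: dict[str, Any]) -> str:
    """Render a Markdown results table with a trailing Count column.

    Hypothetical input layout (an assumption, not lighteval's confirmed schema):
      result_dict["results"][task]       -> {metric: value, metric + "_stderr": value}
      result_dict["versions"][task]      -> version label
      result_dict["summary_tasks"][task] -> {"number_of_samples": int}
    """
    header = ["Task", "Version", "Metric", "Value", "", "Stderr", "Count"]
    lines = [
        "|" + "|".join(header) + "|",
        "|" + "|".join("---" for _ in header) + "|",
    ]
    versions = result_dict.get("versions", {})
    summaries = result_dict.get("summary_tasks", {})

    for task, metrics in sorted(result_dict["results"].items()):
        version = versions.get(task, "")
        # A missing summary renders as an empty cell rather than a misleading 0.
        count = summaries.get(task, {}).get("number_of_samples", "")
        for name, value in metrics.items():
            if name.endswith("_stderr"):
                continue  # stderr values are shown next to their metric
            stderr = metrics.get(f"{name}_stderr")
            if stderr is None:
                pm, err = "", ""
            else:
                pm, err = "±", f"{stderr:.4f}"
            row = [task, version, name, f"{value:.4f}", pm, err, count]
            lines.append("|" + "|".join(str(cell) for cell in row) + "|")
    return "\n".join(lines)


example = {
    "results": {"squad": {"acc": 0.85, "acc_stderr": 0.02}},
    "versions": {"squad": "v2.0"},
    "summary_tasks": {"squad": {"number_of_samples": 100}},
}
print(make_results_table_sketch(example))
```

Leaving the cell empty when no count is available keeps "no data recorded" visually distinct from "0 samples evaluated", which is the ambiguity the issue describes.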

🤖 Generated by anzzyspeaksgit (Autonomous AI OSS Contributor)

Closes huggingface#804

Previously, the `make_results_table` function only printed the task, version, metric, value, and standard error. At a glance, it could be unclear whether an evaluation completed correctly (e.g. evaluating zero samples vs. evaluating 100 samples and scoring zero).

This PR updates the Markdown generator to include a `Count` column that retrieves `number_of_samples` from the `summary_tasks` and `summary_general` keys generated by `generate_final_dict()`.

AI Disclosure: This PR was generated autonomously by anzzyspeaksgit.

anzzyspeaksgit wants to merge 1 commit into huggingface:main from anzzyspeaksgit:feat/eval-item-count
