issues Search Results · repo:OpenBMB/InfiniteBench language:Python

26 results

Thanks for the work on this benchmark. I was wondering why the baseline accuracies on Code.Debug are so low. Code.Debug | 37.06% | 5% | 17.77% | 5% | 9.14% | 13.96% | 7.36% Since it's multiple choice ...
  • seyuboglu
  • Opened on Jan 9
  • #31

I hope this message finds you well. When I use InfiniteBench to compute scores, if the model's response is None, it raises an error; therefore, in our result table with the results of gpt4 ...
  • unicorneeee
  • Opened on Dec 11, 2024
  • #30

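As the title says: are the example counts on the repo's front page up to date? I downloaded the InfiniteBench dataset from Hugging Face and found that the sizes of some files do not match the sizes listed on the repo's front page. For example, longbook_sum_eng.jsonl contains 148 examples, while the front page lists En.Sum #examples as 103. Was the dataset changed in the meantime, or have I misunderstood something? Looking forward to your reply ...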
  • lepangdan
  • Opened on Dec 5, 2024
  • #29

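I see that eval_yarn_mistral.py can truncate the input length. If the input is truncated to 64k, 32k, or smaller, is it possible that the position containing the correct answer gets cut off (for example, the correct key-value pair in kv_retrieval)? Looking forward to your reply, thanks!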
  • Dori-Nilou
  • Opened on Dec 1, 2024
  • #28

Thanks for your great work! Could you kindly advise on how to support the models in the LLaMA series?
  • ydyhello
  • 2
  • Opened on Nov 25, 2024
  • #27

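The examples with ID 41 and 42 are drawn from the novel 斗破苍穹 (Battle Through the Heavens), but the character "萧峰" (Xiao Feng) never appears in the context. Was the entity replacement not applied to the context, or is the question itself written incorrectly?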
  • Zeyu1994
  • 1
  • Opened on Oct 27, 2024
  • #26

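How were the questions and answers for the novel-based QA datasets obtained? Were they annotated entirely by hand? (I could not find this stated explicitly in the paper; I would appreciate any guidance 🙏)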
  • ktlKTL
  • Opened on Sep 24, 2024
  • #25

Are the GPT4 results evaluated on a different set of longbook_qa_eng? The ground_truth fields in results/gpt4/preds_longbook_qa_eng.jsonl don't seem to match the ground_truth in results/chatglm3/preds_longbook_qa_eng.jsonl ...
  • xuandif-cmu
  • 1
  • Opened on Aug 20, 2024
  • #21

When I try to run the following code in Colab: from datasets import load_dataset; dataset = load_dataset("xinrongzhang2022/InfiniteBench") I get the following error: DatasetGenerationCastError: An error ...
  • BenHamm
  • 4
  • Opened on Jul 25, 2024
  • #19

https://github.com/OpenBMB/InfiniteBench/blob/main/src/compute_scores.py#L238 1. Only one reference label is used for comparison; it would be better to loop over each answer in label, e.g., label=[ ECKER , COMMANDER ...
  • Xianchao-Wu
  • 1
  • Opened on Jul 17, 2024
  • #18
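
The suggestion in issue #18 (compare the prediction against every reference answer rather than only one) could be sketched roughly as below. This is a minimal illustration, not the code in compute_scores.py: the function name matches_any_label, the case-insensitive substring rule, and the example labels are all assumptions made here. It also treats a None response as a miss, which is one possible way to avoid the error described in issue #30.

```python
from typing import Iterable, Optional


def matches_any_label(pred: Optional[str], labels: Iterable[str]) -> bool:
    """Hypothetical helper: True if the prediction contains any reference label.

    Assumption: matching is a case-insensitive substring test. The actual
    scoring rule in the repository may differ per task.
    """
    if pred is None:
        # A missing model response is counted as wrong instead of raising.
        return False
    pred_norm = pred.strip().upper()
    # Loop over every reference answer; skip empty labels, which would
    # otherwise match any prediction.
    return any(
        label.strip().upper() in pred_norm
        for label in labels
        if label.strip()
    )


# Illustrative labels only, not taken from the dataset.
print(matches_any_label("The speaker is Alice.", ["BOB", "ALICE"]))
print(matches_any_label(None, ["ALICE"]))
```

Accepting any iterable of labels lets a task with a single reference pass a one-element list, so single-answer and multi-answer tasks can share one code path.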