An addendum on the RL data: the paper says "The training data of RL are chain-of-thought-format questions related to GSM8K and MATH from the SFT data, which consists of around 144K questions." I am a bit curious about this number as well; the discrepancy may stem from my estimate below.
Congratulations on the excellent results! I have a question I would like to ask:
I would like to understand the distribution of the SFT data. The paper reports 776K training examples, but my own estimate of the component datasets does not add up, possibly due to errors on my part. For the English mathematical datasets: the GSM8K and MATH portions appear to be annotated following ToRA, so based on that paper I estimate about 69K; the subset of MathInstruct (260K in full) is hard to pin down, so I assume roughly 200K; Lila-OOD is 32.2K. That totals around 300K, and the MATH and GSM8K examples inside MathInstruct presumably overlap with the 69K above. That would leave about 476K for the Chinese mathematical datasets. Is this a dataset you collected yourselves, and do you plan to open-source it later?
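To make the back-of-the-envelope arithmetic explicit, here is a small sketch; every per-dataset count below is my own assumption (taken from the ToRA, MathInstruct, and Lila papers), not a figure confirmed by this paper:

```python
# Rough estimate of the SFT data split. All per-dataset counts are my
# assumptions; only the 776K total comes from the paper.
total_sft = 776_000      # training examples reported in the paper
gsm8k_math = 69_000      # GSM8K + MATH, annotated following ToRA (assumed)
mathinstruct = 200_000   # assumed subset of the 260K MathInstruct
lila_ood = 32_200        # Lila-OOD (assumed)

english_est = gsm8k_math + mathinstruct + lila_ood  # 301,200 ~= 300K
chinese_est = total_sft - english_est               # 474,800 ~= 476K

print(f"English (est.): {english_est:,}")  # English (est.): 301,200
print(f"Chinese (est.): {chinese_est:,}")  # Chinese (est.): 474,800
```

Note that the Chinese figure is an upper bound under these assumptions: if the MATH and GSM8K examples inside MathInstruct overlap with the 69K ToRA-style annotations, the English total shrinks and the implied Chinese share grows.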