Performance on Out-of-Distribution Datasets #21

biandh · 2025-02-13T16:03:02Z

Have you tried evaluating the performance of the model after RL on datasets that are out of the distribution of the RL training data?
Is there still a noticeable improvement?

michaelzhiluo · 2025-02-13T22:43:28Z

We have some more evaluation on OlympiadBench and Minerva:
We do show improvement over DeepSeek's original distilled model. It was not that good on Minerva, but we crush the others on OlympiadBench.
