Thank you for your reply. I downloaded DeepScaleR-1.5B-Preview and tested it again; this situation was not observed. However, DeepSeek-R1-Distill-Qwen-1.5B does have instances where THOUGHT_DELIMITER_START <think> is missing.
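To check how often this happens, the generated responses can be tallied by which delimiters they contain. The following is only a rough sketch: `tally_delimiters` is a hypothetical helper, and the closing tag `</think>` is an assumption, since only the `<think>` value is quoted above.

```python
from collections import Counter
from typing import Iterable

START = "<think>"   # value of THOUGHT_DELIMITER_START quoted in this thread
END = "</think>"    # assumed value of THOUGHT_DELIMITER_END


def tally_delimiters(responses: Iterable[str]) -> Counter:
    """Count responses by which thought delimiters they contain (hypothetical helper)."""
    counts: Counter = Counter()
    for text in responses:
        has_start = START in text
        has_end = END in text
        if has_start and has_end:
            key = "both"
        elif has_end:
            key = "end_only"    # the case reported for DeepSeek-R1-Distill-Qwen-1.5B
        elif has_start:
            key = "start_only"
        else:
            key = "neither"
        counts[key] += 1
    return counts
```

A large `end_only` count would indicate that an evaluator requiring both delimiters is rejecting otherwise usable responses.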
Thanks for your valuable work.
When I use your framework to reproduce the baseline performance (DeepSeek-R1-Distill-Qwen-1.5B) and evaluate with your scripts:
All results are zero, even though the generated content I inspected does contain the ground-truth answers.
Later, I found that the evaluation process exited prematurely in this related code
The result generated by DeepSeek-R1-Distill-Qwen-1.5B does not contain THOUGHT_DELIMITER_START but does have THOUGHT_DELIMITER_END. Eventually, I modified it to:
The results were successfully reproduced. I suspect that DeepSeek-R1-Distill-Qwen-1.5B itself has insufficient capability to follow the format, which caused this issue. I hope you can make this adjustment to help subsequent reproductions of the related baseline and prevent similar issues from occurring.
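For anyone hitting the same problem, here is a minimal sketch of the kind of relaxed check described above. It is not the framework's actual code: `extract_solution` is a hypothetical helper, and the delimiter values are assumptions (only `<think>` for THOUGHT_DELIMITER_START is mentioned in this thread). The idea is to require only the end delimiter before extracting the solution.

```python
from typing import Optional

# Assumed delimiter values; the real constants live in the framework.
THOUGHT_DELIMITER_START = "<think>"
THOUGHT_DELIMITER_END = "</think>"


def extract_solution(model_response: str) -> Optional[str]:
    """Return the text after the closing delimiter, or None if it is absent.

    Hypothetical helper: only the end delimiter is required, so responses
    that omit the opening delimiter are still passed on for scoring.
    """
    if THOUGHT_DELIMITER_END not in model_response:
        return None  # caller can treat this as a format error
    # Keep everything after the last closing delimiter.
    return model_response.rsplit(THOUGHT_DELIMITER_END, 1)[1].strip()
```

With a check like this, a response that contains reasoning followed by `</think>` and an answer is scored on the answer, whether or not it begins with `<think>`.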