This is the official code repository for the paper EPiC: Towards Lossless Speedup for Reasoning Training through Edge-Preserving CoT Condensation.
Large language models (LLMs) have shown remarkable reasoning capabilities when trained with chain-of-thought (CoT) supervision. However, the long and verbose CoT traces, especially those distilled from large reasoning models (LRMs) such as DeepSeek-R1, significantly increase training costs during the distillation process, where a non-reasoning base model is taught to replicate the reasoning behavior of an LRM. In this work, we study the problem of CoT condensation for resource-efficient reasoning training, aimed at pruning intermediate reasoning steps (i.e., thoughts) in CoT traces, enabling supervised model training on length-reduced CoT data while preserving both answer accuracy and the model’s ability to generate coherent reasoning. Our rationale is that CoT traces typically follow a three-stage structure: problem understanding, exploration, and solution convergence. Through empirical analysis, we find that retaining the structure of the reasoning trace, especially the early stage of problem understanding (rich in reflective cues) and the final stage of solution convergence (which closely relates to the final answer), is sufficient to achieve lossless reasoning supervision. To this end, we propose an Edge-Preserving Condensation method, EPiC, which selectively retains only the initial and final segments of each CoT trace while discarding the middle portion. This design draws an analogy to preserving the “edge” of a reasoning trajectory, capturing both the initial problem framing and the final answer synthesis, to maintain logical continuity. Experiments across multiple model families (Qwen and LLaMA) and benchmarks show that EPiC reduces training time by over 34% while achieving lossless reasoning accuracy on MATH500, comparable to full CoT supervision. To the best of our knowledge, this is the first study to explore thought-level CoT condensation for efficient reasoning model distillation.
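For intuition, here is a minimal sketch of the edge-preserving idea: keep the first and last fractions of the thoughts in a CoT trace and drop the middle. The function name, the blank-line thought splitting, and the keep ratios below are illustrative assumptions, not the exact implementation; see src/openr1/utils/data_analysis.py for the actual condensation logic.

```python
# Illustrative sketch of edge-preserving CoT condensation.
# Assumptions: thoughts are separated by blank lines; ratios are placeholders.

def condense_cot(cot_trace: str, start_ratio: float = 0.25, end_ratio: float = 0.25) -> str:
    """Keep the first `start_ratio` and last `end_ratio` of thoughts, drop the middle."""
    thoughts = [t for t in cot_trace.split("\n\n") if t.strip()]
    n = len(thoughts)
    n_start = max(1, int(n * start_ratio))
    n_end = max(1, int(n * end_ratio))
    if n_start + n_end >= n:
        return cot_trace  # trace too short to condense; keep it as-is
    kept = thoughts[:n_start] + thoughts[-n_end:]
    return "\n\n".join(kept)
```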
You can install the required dependencies using the following commands:
conda env create -f eval.yml
conda activate eval
pip install lighteval@git+https://github.com/huggingface/lighteval.git@ed084813e0bd12d82a06d9f913291fdbee774905
pip install lighteval[math]
pip install torch==2.5.1 torchvision==0.20.1 torchaudio==2.5.1 --index-url https://download.pytorch.org/whl/cu121
pip install vllm==0.7.2
pip install deepspeed
pip install trl@git+https://github.com/huggingface/trl.git@69ad852e5654a77f1695eb4c608906fe0c7e8624
pip install liger-kernel==0.5.3
Please feel free to use the following models for your research (see the loading sketch after the list):
Qwen2.5-math-7B-Instruct (OpenR1Math-EPiC): 🤗 flyingbugs/Qwen2.5-Math-7B-Instruct-keep-0.5-end-start-0.5-eos
Qwen2.5-math-7B-Instruct (Generalthoughts-EPiC): 🤗 flyingbugs/Qwen2.5-Math-7B-GeneralThought-pruned-keep-0.5-end-start-0.5-eos
Qwen2.5-7B-Instruct (OpenR1Math-EPiC): 🤗 flyingbugs/Qwen2.5-instruct-7B-openr1-math-edge
Llama3.1-8B-Instruct (OpenR1Math-EPiC): 🤗 flyingbugs/Llama-3.1-8B-Instruct-open-r1-prune-0.5-0.5
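The released checkpoints can be loaded with the standard 🤗 Transformers API. The snippet below is a minimal sketch, not part of this repo; adjust the model ID, dtype, and generation settings to your setup.

```python
# Minimal sketch: load an EPiC-trained checkpoint and run one prompt.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "flyingbugs/Qwen2.5-Math-7B-Instruct-keep-0.5-end-start-0.5-eos"
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id, torch_dtype=torch.bfloat16, device_map="auto"
)

messages = [{"role": "user", "content": "What is 12 * 13?"}]
inputs = tokenizer.apply_chat_template(
    messages, add_generation_prompt=True, return_tensors="pt"
).to(model.device)
outputs = model.generate(inputs, max_new_tokens=512, do_sample=True, temperature=0.6, top_p=0.95)
print(tokenizer.decode(outputs[0][inputs.shape[-1]:], skip_special_tokens=True))
```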
First, generate the condensed data with src/openr1/utils/data_analysis.py, or use the existing condensed datasets on Hugging Face (see the loading sketch below): 🤗 OpenR1Math-EPiC and 🤗 Generalthoughts-EPiC.
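The condensed datasets can be inspected with the 🤗 Datasets library. This is a minimal sketch; the dataset ID below is a placeholder, so substitute the repository ID from the links above, and the column names depend on the dataset card.

```python
# Minimal sketch: load a condensed CoT dataset from the Hugging Face Hub.
from datasets import load_dataset

dataset_id = "ORG/OpenR1Math-EPiC"  # placeholder: use the repo ID from the 🤗 link above
ds = load_dataset(dataset_id, split="train")
print(ds.column_names)  # inspect the prompt / condensed-CoT fields
print(ds[0])
```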
CUDA_VISIBLE_DEVICES=0,1,2,3,4,5,6,7 accelerate launch --config_file recipes/accelerate_configs/zero3.yaml src/open_r1/sft.py --config recipes/Qwen2.5-Math-7B/config.yaml --dataset_name $dataset_name --hub_model_id $hub_id_saved --output_dir $target_dir
We follow the Open-R1 evaluation kit.
MODEL=xxx
MODEL_ARGS="pretrained=$MODEL,dtype=bfloat16,max_model_length=32768,gpu_memory_utilization=0.95,generation_parameters={max_new_tokens:9000,temperature:0.6,top_p:0.95}"
OUTPUT_DIR=data/evals/$MODEL
TASK=aime24
GPU=0
CUDA_VISIBLE_DEVICES=$GPU lighteval vllm $MODEL_ARGS "custom|$TASK|0|0" \
--custom-tasks src/open_r1/evaluate.py \
--use-chat-template \
--output-dir $OUTPUT_DIR \
--save-details
TASK=math_500
CUDA_VISIBLE_DEVICES=$GPU lighteval vllm $MODEL_ARGS "custom|$TASK|0|0" \
--custom-tasks src/open_r1/evaluate.py \
--use-chat-template \
--output-dir $OUTPUT_DIR \
--save-details
MODEL_ARGS="pretrained=$MODEL,dtype=bfloat16,max_model_length=32768,gpu_memory_utilization=0.95,generation_parameters={max_new_tokens:4000,temperature:0.6,top_p:0.95}"
TASK=gpqa:diamond
CUDA_VISIBLE_DEVICES=$GPU lighteval vllm $MODEL_ARGS "custom|$TASK|0|0" \
--custom-tasks src/open_r1/evaluate.py \
--use-chat-template \
--output-dir $OUTPUT_DIR \
--save-details
Thanks to the Open-R1 repository for providing the codebase; our code is built on top of it.
@article{jia2025epic,
title={EPiC: Towards Lossless Speedup for Reasoning Training through Edge-Preserving CoT Condensation},
author={Jia, Jinghan and Reisizadeh, Hadi and Fan, Chongyu and Baracaldo, Nathalie and Hong, Mingyi and Liu, Sijia},
journal={arXiv preprint arXiv:2506.04205},
year={2025}
}