[callback] Add flexible callback system with YAML configuration, HuggingFace Trainer support, and usage examples #5 #9024
What does this PR do?
Fixes # (issue)
This PR introduces a robust callback plugin system for LLaMA-Factory, including:
Support for registering custom and built-in callbacks via YAML configuration
Callback argument injection (including environment variable substitution)
Seamless integration with HuggingFace Trainer callbacks
Example YAML and Python files demonstrating callback usage (a rough sketch of what such a callback looks like follows this list)
End-to-end test cases for custom callback registration and execution
Documentation updates for callback development and usage (to be completed in the main repo docs)
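The following is a minimal sketch, not the PR's actual implementation, of the kind of callback this system registers. The `TrainerCallback` hook signatures are the standard HuggingFace API; the class name `LossLoggerCallback`, the `log_dir` argument, and the `resolve_callback_args` helper are hypothetical, used only to illustrate YAML-driven argument injection with environment-variable substitution.

```python
# Illustrative only: a user callback that the YAML config could reference,
# plus a hypothetical helper showing ${VAR}-style substitution of callback args.
import os

from transformers import TrainerCallback, TrainerControl, TrainerState, TrainingArguments


class LossLoggerCallback(TrainerCallback):
    """Example custom callback; the class path would be referenced from the YAML config."""

    def __init__(self, log_dir: str = "./callback_logs"):
        self.log_dir = log_dir

    def on_train_begin(self, args: TrainingArguments, state: TrainerState, control: TrainerControl, **kwargs):
        # Create the log directory once at the start of training.
        os.makedirs(self.log_dir, exist_ok=True)

    def on_log(self, args: TrainingArguments, state: TrainerState, control: TrainerControl, logs=None, **kwargs):
        # Append the Trainer's log dict (loss, lr, etc.) on the main process only.
        if logs and state.is_world_process_zero:
            with open(os.path.join(self.log_dir, "loss.log"), "a") as f:
                f.write(f"step={state.global_step} {logs}\n")


def resolve_callback_args(raw_args: dict) -> dict:
    """Hypothetical helper: expand $VAR / ${VAR} placeholders in YAML-provided args."""
    return {k: os.path.expandvars(v) if isinstance(v, str) else v for k, v in raw_args.items()}


# e.g. args taken from YAML: {"log_dir": "${HOME}/llamafactory_callback_logs"}
callback = LossLoggerCallback(**resolve_callback_args({"log_dir": "${HOME}/llamafactory_callback_logs"}))
```

In practice, an instance built this way would be passed to the HuggingFace Trainer through its `callbacks` argument; the actual registration keys and injection logic are defined by this PR's example files (e.g. examples/callback/llama3_lora_sft_callback.yaml).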
Before submitting
Did you read the contributor guideline?
Did you write any new necessary tests?
Ran the following commands/code:
a) llamafactory-cli train examples/callback/llama3_lora_sft_callback.yaml
b) llamafactory-cli train --stage sft --do_train True --model_name_or_path meta-llama/Llama-3.2-1B-Instruct --preprocessing_num_workers 16 --finetuning_type lora --template llama3 --flash_attn auto --dataset_dir data --dataset identity --cutoff_len 2048 --learning_rate 5e-05 --num_train_epochs 3.0 --max_samples 100000 --per_device_train_batch_size 2 --gradient_accumulation_steps 8 --lr_scheduler_type cosine --max_grad_norm 1.0 --logging_steps 5 --save_steps 100 --warmup_steps 0 --packing False --enable_thinking True --report_to none --output_dir saves/Llama-3.2-1B-Instruct/lora/train_2025-08-25-09-11-512 --plot_loss True --trust_remote_code True --ddp_timeout 180000000 --include_num_input_tokens_seen True --optim adamw_torch --lora_rank 8 --lora_alpha 16 --lora_dropout 0 --lora_target all
c) llamafactory-cli webui
(WebUI run configured with fp32, the Llama-3.2-1B-Instruct model, and the identity dataset)
d) The command generated by the WebUI also runs:
```bash
llamafactory-cli train \
    --stage sft \
    --do_train True \
    --model_name_or_path meta-llama/Llama-3.2-1B-Instruct \
    --preprocessing_num_workers 16 \
    --finetuning_type lora \
    --template llama3 \
    --flash_attn auto \
    --dataset_dir data \
    --dataset identity \
    --cutoff_len 2048 \
    --learning_rate 5e-05 \
    --num_train_epochs 3.0 \
    --max_samples 100000 \
    --per_device_train_batch_size 2 \
    --gradient_accumulation_steps 8 \
    --lr_scheduler_type cosine \
    --max_grad_norm 1.0 \
    --logging_steps 5 \
    --save_steps 100 \
    --warmup_steps 0 \
    --packing False \
    --enable_thinking True \
    --report_to none \
    --output_dir saves/Llama-3.2-1B-Instruct/lora/train_2025-08-25-09-11-51 \
    --plot_loss True \
    --trust_remote_code True \
    --ddp_timeout 180000000 \
    --include_num_input_tokens_seen True \
    --optim adamw_torch \
    --lora_rank 8 \
    --lora_alpha 16 \
    --lora_dropout 0 \
    --lora_target all
```