When running SFT fine-tuning on Qwen2.5-VL with `packing` enabled, I found that the number of training iterations did not decrease. Is something misconfigured? Here is the full configuration (a step-count sketch follows the config):
```yaml
### model
model_name_or_path: Qwen/Qwen2.5-vl
image_max_pixels: 1000000
video_max_pixels: 8192
trust_remote_code: true
enable_liger_kernel: true
use_unsloth_gc: true
flash_attn: auto
packing: true

### method
stage: sft
do_train: true
finetuning_type: full
freeze_vision_tower: true
freeze_multi_modal_projector: false
freeze_language_model: true

### dataset
dataset: ****
template: qwen2_vl
cutoff_len: 65536
overwrite_cache: true
preprocessing_num_workers: 16
dataloader_num_workers: 4
tokenized_path: tokenized_dataset

### output
output_dir: models/output
logging_steps: 10
save_steps: 200
save_total_limit: 5
plot_loss: true
overwrite_output_dir: true
save_only_model: false
report_to: none  # choices: [none, wandb, tensorboard, swanlab, mlflow]
use_swanlab: true
swanlab_api_key: ****
swanlab_project: llama_factory
swanlab_run_name: ***

### train
per_device_train_batch_size: 1
gradient_accumulation_steps: 2
torch_empty_cache_steps: 100
learning_rate: 1.0e-5
num_train_epochs: 3.0
lr_scheduler_type: cosine
warmup_ratio: 0.1
bf16: true
ddp_timeout: 180000000
gradient_checkpointing: true
resume_from_checkpoint: null
```
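For reference, a minimal sketch (plain Python, not LLaMA-Factory code) of why packing is expected to reduce the step count: packing concatenates short samples into sequences up to `cutoff_len`, shrinking the effective dataset size and therefore the number of optimizer steps. The sample count, average token length, and world size below are hypothetical, chosen only to illustrate the arithmetic.

```python
# Sketch of the expected effect of packing on total optimizer steps.
# All numbers below are hypothetical, not taken from my actual run.
import math

def optimizer_steps(num_sequences: int, per_device_batch: int,
                    grad_accum: int, world_size: int, epochs: float) -> int:
    """Approximate total optimizer steps for a training run."""
    effective_batch = per_device_batch * grad_accum * world_size
    steps_per_epoch = math.ceil(num_sequences / effective_batch)
    return math.ceil(steps_per_epoch * epochs)

# Assume 10,000 raw samples averaging ~2,000 tokens each.
# With cutoff_len = 65536, packing could merge ~32 samples per sequence,
# leaving roughly 10_000 * 2_000 / 65_536 ≈ 306 packed sequences.
print(optimizer_steps(10_000, 1, 2, 1, 3.0))  # without packing: 15000 steps
print(optimizer_steps(306, 1, 2, 1, 3.0))     # with packing:    459 steps
```

Under these assumed numbers, packing should cut the steps per epoch by roughly the packing factor; if the logged total steps are identical with and without `packing: true`, packing is evidently not being applied during preprocessing.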