vLLM deployment fails after training Qwen2.5-VL-7B as a seq cls model #6147

@muziyongshixin

Describe the feature
I would like to use the vLLM framework to run inference with a seq cls classification model trained from qwen2.5-vl-7B.

Paste any useful information
The seq cls model was trained with ms-swift. I would like to use the vLLM backend for inference and deployment.

Additional context
The training parameters are as follows:

PYTORCH_CUDA_ALLOC_CONF='expandable_segments:True' \
NPROC_PER_NODE=$NUM_GPUS_PER_NODE \
NNODES=$WORLD_SIZE \
NODE_RANK=$NODE_RANK \
MASTER_ADDR=$MASTER_ADDR \
MASTER_PORT=$MASTER_PORT \
VIDEO_MAX_PIXELS=602112 \
FPS_MIN_FRAMES=20 \
FPS_MAX_FRAMES=40 \
FPS=4 \
swift sft \
    --model /data/phd/hf_models/Qwen2.5-VL-7B-Instruct/ \
    --train_type full \
    --freeze_vit false \
    --freeze_aligner false \
    --dataset $dataset \
    --load_from_cache_file true \
    --split_dataset_ratio 0.1 \
    --torch_dtype bfloat16 \
    --num_train_epochs 10 \
    --per_device_train_batch_size 1 \
    --per_device_eval_batch_size 1 \
    --learning_rate 1e-5 \
    --gradient_accumulation_steps 4 \
    --eval_steps 585 \
    --eval_strategy steps \
    --save_strategy epoch \
    --logging_steps 1 \
    --max_length 16384 \
    --output_dir $save_dir \
    --warmup_ratio 0.05 \
    --dataloader_num_workers 16 \
    --num_labels 2 \
    --task_type seq_cls \
    --use_chat_template true \
    --attn_impl flash_attention_2 \
    --report_to wandb \
    --use_liger_kernel true \
    --gradient_checkpointing false
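
For readers interpreting the command above: the number of samples per optimizer step is the product of the per-device batch size, the gradient accumulation steps, and the total number of GPUs. The sketch below shows that arithmetic only. `effective_batch_size` is a hypothetical helper and not part of ms-swift, and the GPU and node counts are placeholders, since the values of `$NUM_GPUS_PER_NODE` and `$WORLD_SIZE` are not given in this issue.

```shell
# Hypothetical helper (not an ms-swift command): samples per optimizer step.
# Arguments: per-device batch size, gradient accumulation steps,
#            GPUs per node, number of nodes.
effective_batch_size() {
    echo $(( $1 * $2 * $3 * $4 ))
}

# 1 and 4 come from --per_device_train_batch_size and
# --gradient_accumulation_steps above; 8 GPUs per node and 2 nodes
# are assumed placeholder values.
effective_batch_size 1 4 8 2   # prints 64
```

Substituting the real values of `$NUM_GPUS_PER_NODE` and `$WORLD_SIZE` gives the actual figure for this run.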

Labels: bug