Description
Describe the bug
What the bug is and how to reproduce it, preferably with screenshots
After upgrading swift from 3.3 to 3.10, a dataset and script that previously worked now fail with the error below. Nothing else in the environment was changed.
[INFO:swift] default_system: 'You are a helpful assistant.'
[INFO:swift] max_length: 2048
[INFO:swift] response_prefix: ''
[INFO:swift] agent_template: hermes
[INFO:swift] norm_bbox: none
[INFO:swift] Setting ROOT_IMAGE_DIR: None. You can adjust this hyperparameter through the environment variable: `ROOT_IMAGE_DIR`.
[INFO:swift] Start time of running main: 2025-10-15 22:12:27.918088
[INFO:swift] swift.version: 3.10.0.dev0
Generating train split: 12023 examples [00:00, 79901.06 examples/s]
Map: 100%|███████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████| 12023/12023 [00:00<00:00, 33587.64 examples/s]
[INFO:swift] train_dataset: Dataset({
features: ['messages', 'images'],
num_rows: 12023
})
[INFO:swift] val_dataset: None
[INFO:swift] Traceback (most recent call last):
  File "/data2/chxm/Multimodal-REC/ms-swift/swift/llm/dataset/utils.py", line 97, in __getitem__
    return self.encode_func(data, return_length=True)
  File "/data2/anaconda3/envs/chxm_2/lib/python3.10/site-packages/torch/utils/_contextlib.py", line 116, in decorate_context
    return func(*args, **kwargs)
  File "/data2/chxm/Multimodal-REC/ms-swift/swift/llm/template/base.py", line 489, in encode
    inputs = TemplateInputs.from_dict(inputs)
  File "/data2/chxm/Multimodal-REC/ms-swift/swift/llm/template/template_inputs.py", line 341, in from_dict
    return cls(**kwargs)
  File "<string>", line 7, in __init__
  File "/data2/chxm/Multimodal-REC/ms-swift/swift/llm/template/template_inputs.py", line 272, in __post_init__
    setattr(self, key, StdTemplateInputs.from_dict(value_dict))
  File "/data2/chxm/Multimodal-REC/ms-swift/swift/llm/template/template_inputs.py", line 187, in from_dict
    messages = inputs['messages']
KeyError: 'messages'
[WARNING:swift] 👆👆👆There are errors in the template.encode, and another piece of data will be randomly selected.
(The identical traceback repeats five more times.)
Your hardware and system info
Write your system info like CUDA version/system/GPU/torch version here
L40 GPU
The training script is as follows:
#!/bin/bash
# --- Configuration ---
nproc_per_node=4
CUDA_VISIBLE_DEVICES=0,1,2,3
MAX_PIXELS=564000
VRAM_THRESHOLD_MIB=102400
# --- Waiting Loop ---
echo "Checking GPU 0 VRAM usage every 30 seconds. Waiting for it to be below ${VRAM_THRESHOLD_MIB} MiB (1GB)..."
while true; do
    used_mib_str=$(nvidia-smi --query-gpu=memory.used --format=csv,noheader,nounits -i 0 2>/dev/null)
    if [[ "$used_mib_str" =~ ^[0-9]+$ ]]; then
        used_mib="$used_mib_str"
        if [ "$used_mib" -lt "$VRAM_THRESHOLD_MIB" ]; then
            echo "GPU 0 VRAM usage is ${used_mib} MiB (< ${VRAM_THRESHOLD_MIB} MiB). Condition met. Starting training..."
            break
        else
            echo "GPU 0 VRAM usage is ${used_mib} MiB (>= ${VRAM_THRESHOLD_MIB} MiB). Waiting 30 seconds..."
            sleep 30
        fi
    else
        echo "Could not get GPU 0 VRAM usage or output is not a number. Output: '${used_mib_str}'. Waiting 30 seconds and retrying..."
        sleep 30
    fi
done
# --- Training Command ---
echo "Executing training command..."
MAX_PIXELS=$MAX_PIXELS \
CUDA_VISIBLE_DEVICES=$CUDA_VISIBLE_DEVICES \
NPROC_PER_NODE=$nproc_per_node \
swift sft \
    --model /home/member/data1/MODEL_WEIGHTS_PUBLIC/Qwenvl2.5_3b \
    --model_type qwen2_5_vl \
    --gradient_checkpointing True \
    --train_type full \
    --dataset '/home/member/data2/chxm/Multimodal-REC/LLVIP_REF/LLVIP_Data_Augmentation/train_multimodal.json' \
    --torch_dtype bfloat16 \
    --num_train_epochs 8 \
    --per_device_train_batch_size 4 \
    --per_device_eval_batch_size 4 \
    --learning_rate 1e-6 \
    --gradient_accumulation_steps 4 \
    --eval_steps 100 \
    --save_steps 100 \
    --save_total_limit 200 \
    --logging_steps 5 \
    --max_length 2048 \
    --output_dir /data1/chxm/MultiModal-REC/LLVIP_Data_Augmentation_seed_42_gpr \
    --system 'You are a helpful assistant.' \
    --warmup_ratio 0.05 \
    --dataloader_num_workers 4 \
    --deepspeed zero3 \
    --attn_impl flash_attn \
    --save_only_model True \
    --max_grad_norm 0.9 \
    --data_seed 42 \
    --use_gpr_lie true \
    --gpr_lie_lambda 0.01 \
    --gpr_lie_freq 100
echo "Training command finished."
The dataset format is as follows:
{
  "messages": [
    {
      "role": "system",
      "content": "You are a helpful assistant."
    },
    {
      "role": "user",
      "content": "Here are a visible light image and a thermal image. Detect all <|object_ref_start|>person<|object_ref_end|>.\nFor each object, provide:\n1. location\n2. label\n3. appearance\n4. pose_action\n\nReturn as a JSON array."
    },
    {
      "role": "assistant",
      "content": "[\n{\n\"bbox_2d\": [\n123,\n182,\n150,\n283\n],\n\"label\": \"person\",\n\"appearance\": \"A person wearing a light-colored mask, a dark jacket over a red shirt, light-colored pants, and white shoes. They have dark hair.\",\n\"pose_action\": \"The person is standing, holding their right hand towards their head.\"\n},\n{\n\"bbox_2d\": [\n150,\n167,\n182,\n274\n],\n\"label\": \"person\",\n\"appearance\": \"Wearing a white face mask, a light-colored jacket over a dark shirt, and blue jeans. The person is holding a metallic cylindrical object in their raised right hand.\",\n\"pose_action\": \"The person is standing and holding an object.\"\n},\n{\n\"bbox_2d\": [\n199,\n157,\n235,\n262\n],\n\"label\": \"person\",\n\"appearance\": \"A person wearing a patterned jacket or hoodie over a lighter layer, light-colored pants, and light-colored shoes. They have dark hair and are holding an object, possibly a bag or folded garment, in their right hand.\",\n\"pose_action\": \"The person is standing and holding an item.\"\n},\n{\n\"bbox_2d\": [\n299,\n151,\n325,\n250\n],\n\"label\": \"person\",\n\"appearance\": \"A person wearing dark clothing, including a dark jacket or coat and dark pants, with a light-colored mask covering their face. They have dark hair and are wearing light-colored shoes. They appear to be holding a light-colored object in their right hand.\",\n\"pose_action\": \"The person is standing.\"\n},\n{\n\"bbox_2d\": [\n301,\n220,\n345,\n339\n],\n\"label\": \"person\",\n\"appearance\": \"A person, largely cropped by the right edge of the frame, appearing to wear dark clothing. There is a light-colored detail visible on their upper left side.\",\n\"pose_action\": \"The person is standing. Their exact action is difficult to determine due to the cropping.\"\n},\n{\n\"bbox_2d\": [\n481,\n9,\n511,\n104\n],\n\"label\": \"person\",\n\"appearance\": \"A person wearing a long, dark-colored coat or jacket, dark pants or leggings, and light-colored shoes. The person has dark hair.\",\n\"pose_action\": \"The person is standing.\"\n}\n]"
    }
  ],
  "images": [
    "/home/member/data2/chxm/Multimodal-REC/LLVIP_REF/LLVIP_Data_Augmentation/visible/train/010001.jpg",
    "/home/member/data2/chxm/Multimodal-REC/LLVIP_REF/LLVIP_Data_Augmentation/thermal/train/010001.jpg"
  ]
},
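Since the failure is a `KeyError: 'messages'`, one quick way to rule out malformed records is to scan the dataset file outside of training. The following is a minimal sketch, assuming the file is a single JSON array of records shaped like the sample above; `find_bad_records` and `check_file` are hypothetical helper names, not part of ms-swift. Note that the log already reports `features: ['messages', 'images']`, and the traceback raises inside a nested `StdTemplateInputs.from_dict` call, so this check may well pass even though training still fails. In that case the cause is more likely in how 3.10 builds the template inputs than in the data itself.

```python
import json
from pathlib import Path


def find_bad_records(records):
    """Return (index, reason) pairs for records that look malformed."""
    problems = []
    for i, rec in enumerate(records):
        if not isinstance(rec, dict):
            problems.append((i, "record is not a JSON object"))
            continue
        messages = rec.get("messages")
        if not isinstance(messages, list) or not messages:
            problems.append((i, "missing or empty 'messages'"))
            continue
        # Report only the first malformed message per record.
        for j, msg in enumerate(messages):
            if not isinstance(msg, dict) or "role" not in msg or "content" not in msg:
                problems.append((i, f"message {j} lacks 'role' or 'content'"))
                break
    return problems


def check_file(path):
    """Load a JSON-array dataset file (an assumption) and scan its records."""
    records = json.loads(Path(path).read_text(encoding="utf-8"))
    return find_bad_records(records)
```

Calling `check_file` on the path passed to `--dataset` and printing the result would list any offending record indices. An empty list means every top-level record has a well-formed `messages` field.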
Additional context
Add any other context about the problem here