-
Notifications
You must be signed in to change notification settings - Fork 806
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
自建数据微调 seaco_paraformer 后,推理官方音频,预测器为什么只输出开始字符? #2342
Labels
question
Further information is requested
Comments
是数据集的数据有什么限制吗?还是数据量不够? |
能给我看看你的微调参数吗 |
你是直接运行他的finetune.sh吗?Windows还是Linux |
我在Windows下运行finetune.sh失败了的,查了好像不支持Windows;然后我魔改他的train_ds.py,结果参数不知道咋写, |
我用的是Linux系统,然后就是生成自己数据集,运行finetune.sh,没改啥,能成功运行微调 |
model_name_or_model_dir这个呢?我不知道是不是我的模型和代码不匹配,而且GPU效率特别低,我使用的是paraformer里面的finetune.sh;我想参考你的跑一下; |
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment
###Question
微调后推理任何音频,模型都输出开始字符,最后结果输出为空,没有任何汉字。总感觉不像是微调过,而像从0训练了一遍模型,现在微调后的模型性能下降到几乎为0。微调过程如下,不知道哪一步有问题,或者我该怎么排查问题的位置呢?
Date
{"key": "L4M8jW8rcQojNksDHH9F", "source": "/root/scx/yuyin/FunASR/data/train_audio_split/L4M8jW8rcQojNksDHH9F.wav", "source_len": 174, "target": "资料柜我不要的资料柜吗", "target_len": 11}
{"key": "SAcVKMe4sYoUWG7lnTKB", "source": "/root/scx/yuyin/FunASR/data/train_audio_split/SAcVKMe4sYoUWG7lnTKB.wav", "source_len": 346, "target": "那是放资料呀我们这个不是资料柜呀", "target_len": 16}
{"key": "TQSpIHjfVSIwJRf7DoAP", "source": "/root/scx/yuyin/FunASR/data/train_audio_split/TQSpIHjfVSIwJRf7DoAP.wav", "source_len": 107, "target": "那你加", "target_len": 3}
共两个小时数据,分 10 分钟验证集和 110 分钟的训练集。
Finetune
进行了 50 轮微调,微调过程无报错,数据加载正常,损失函数下降正常,模型保存正常,最后两轮结果如下:
[2024-12-28 10:00:10,369][root][INFO] - rank: 0, dataloader start from step: 0, batch_num: 103, after: 103
[2024-12-28 10:00:44,799][root][INFO] - train, epoch: 48/50, data_slice: 0/1, step_in_slice: 103/103, total step: 5042, (loss_avg_rank: 0.254), (loss_avg_slice: 0.521), (ppl_avg_slice: 1.684e+00), (acc_avg_slice: 0.000), (lr: 3.394e-05), [('loss_seaco', 0.254), ('loss', 0.254)],
[2024-12-28 10:00:44,828][root][INFO] - rank: 0, dataloader start from step: 0, batch_num: 11, after: 11
[2024-12-28 10:00:46,558][root][INFO] - val, epoch: 49/50, data_slice: 0/1, step_in_slice: 11/11, total step: 11, (loss_avg_rank: 0.271), (loss_avg_slice: 0.441), (ppl_avg_slice: 1.554e+00), (acc_avg_slice: 0.000), (lr: 0.000e+00), [('loss_seaco', 0.271), ('loss', 0.271)],
[2024-12-28 10:00:46,564][root][INFO] - Save checkpoint: 49, rank: 0, local_rank: 0
[2024-12-28 10:00:49,211][root][INFO] - rank: 0, dataloader start from step: 0, batch_num: 103, after: 103
[2024-12-28 10:01:24,013][root][INFO] - train, epoch: 49/50, data_slice: 0/1, step_in_slice: 103/103, total step: 5145, (loss_avg_rank: 0.106), (loss_avg_slice: 0.498), (ppl_avg_slice: 1.645e+00), (acc_avg_slice: 0.000), (lr: 3.463e-05), [('loss_seaco', 0.106), ('loss', 0.106)],
[2024-12-28 10:01:24,042][root][INFO] - rank: 0, dataloader start from step: 0, batch_num: 11, after: 11
[2024-12-28 10:01:26,262][root][INFO] - val, epoch: 50/50, data_slice: 0/1, step_in_slice: 11/11, total step: 11, (loss_avg_rank: 0.279), (loss_avg_slice: 0.426), (ppl_avg_slice: 1.532e+00), (acc_avg_slice: 0.000), (lr: 0.000e+00), [('loss_seaco', 0.279), ('loss', 0.279)],
[2024-12-28 10:01:26,273][root][INFO] - Save checkpoint: 50, rank: 0, local_rank: 0
Inference
使用微调后的模型文件进行推理官方测试音频 asr_example.wav,结果输出如下:
解码器解码后几乎全是开始字符
,而且我测试其他客户数据音频,也全都是,没有任何汉字。Environment
modelscope 官方 docker 镜像。
modelscope-registry.cn-beijing.cr.aliyuncs.com/modelscope-repo/modelscope/ubuntu22.04-cuda12.1.0-py310-torch2.3.1-tf2.16.1-1.21.0
The text was updated successfully, but these errors were encountered: