Accuracy drops with PROMPT_TEMPLATE.llama2_chat #658
Could you please provide the following two pieces of information:
In theory, using qlora to fine-tune a base model is not recommended: qlora freezes the embedding layer, and a base model has never seen the chat template's tokens during pretraining (e.g. the special tokens in the llama3 chat template). We recommend training a base model's chat ability with full fine-tuning, or fine-tuning a chat model with lora/qlora.
This is because chat templates contain special tokens, such as those in llama2's template. For that reason, it is better to use the llama3 chat template when training llama3 base or chat.
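The point above can be sketched with a small check: if a chat template's special tokens are missing from (or untrained in) the model's vocabulary, a method that freezes the embedding layer cannot teach them. This is an illustrative sketch, not xtuner code; the token strings are the Llama 3 chat-template specials, and `toy_vocab` is a hypothetical stand-in for a real tokenizer vocabulary.

```python
# Hypothetical sketch: find chat-template special tokens that a
# tokenizer's vocabulary does not contain. If a base model never
# trained these rows of the embedding matrix, QLoRA (frozen
# embeddings) cannot learn them during fine-tuning.
LLAMA3_SPECIAL_TOKENS = [
    "<|begin_of_text|>",
    "<|start_header_id|>",
    "<|end_header_id|>",
    "<|eot_id|>",
]

def missing_template_tokens(vocab, special_tokens=LLAMA3_SPECIAL_TOKENS):
    """Return the template tokens absent from the vocabulary mapping."""
    return [tok for tok in special_tokens if tok not in vocab]

# Toy vocabulary that only knows one of the four special tokens.
toy_vocab = {"hello": 0, "world": 1, "<|eot_id|>": 2}
print(missing_template_tokens(toy_vocab))
# → ['<|begin_of_text|>', '<|start_header_id|>', '<|end_header_id|>']
```

With a real tokenizer you would pass `tokenizer.get_vocab()` instead of `toy_vocab`; the logic is the same.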
Sorry, the model I am using is llama3 base.
```python
# Copyright (c) OpenMMLab. All rights reserved.
import torch
from xtuner.dataset import process_hf_dataset

#######################################################################
#                          PART 1  Settings                           #
#######################################################################
# Model
pretrained_model_name_or_path = '/home/nfs02/model/llama-3-8b'

# Data
alpaca_en_path = '/home/nfs02/dongjc/MoDS/diverse-data-selection/seed-instructions8.json'

# parallel
sequence_parallel_size = 1

# Scheduler & Optimizer
batch_size = 1  # per_device

# Save
save_steps = 1500

# Evaluate the generation performance during the training
evaluation_freq = 500

#######################################################################
#                      PART 2  Model & Tokenizer                      #
#######################################################################
model = dict(

#######################################################################
#                      PART 3  Dataset & Dataloader                   #
#######################################################################
sampler = SequenceParallelSampler

#######################################################################
#                    PART 4  Scheduler & Optimizer                    #
#######################################################################
# optimizer
optim_wrapper = dict(

# learning policy
# More information: https://github.com/open-mmlab/mmengine/blob/main/docs/en/tutorials/param_scheduler.md  # noqa: E501
param_scheduler = [

# train, val, test setting
train_cfg = dict(type=TrainLoop, max_epochs=max_epochs)

#######################################################################
#                           PART 5  Runtime                           #
#######################################################################
# Log the dialogue periodically during the training process, optional
custom_hooks = [

if use_varlen_attn:

# configure default hooks
default_hooks = dict(

# configure environment
env_cfg = dict(

# set visualizer
visualizer = None

# set log level
log_level = 'INFO'

# load from which checkpoint
load_from = None

# whether to resume training from the loaded checkpoint
resume = False

# Defaults to use random seed and disable
```
Could you share your training log? I am a bit worried this is caused by qlora failing to learn the chat template.
Sure.
Judging from the EvaluateChatHook output during training, the model has not learned how to generate the stop token. This is expected: qlora freezes the embedding layer, so it cannot learn a new chat template.
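The mechanism described here can be demonstrated in a few lines of PyTorch: with the embedding matrix frozen, the row for the stop token never receives a gradient, while the LoRA-style adapter matrices do. This is a minimal sketch, not xtuner's actual training code; the sizes and the LoRA branch are illustrative.

```python
# Minimal sketch: under QLoRA the embedding is frozen, so the row for
# a (new) stop token never updates -- only the LoRA matrices train.
import torch

vocab_size, dim = 8, 4
embedding = torch.nn.Embedding(vocab_size, dim)
embedding.weight.requires_grad_(False)  # frozen, as under QLoRA

# Tiny LoRA-style branch (rank 2); B starts at zero, as in LoRA.
lora_A = torch.nn.Parameter(torch.randn(dim, 2) * 0.01)
lora_B = torch.nn.Parameter(torch.zeros(2, dim))

stop_token_id = 7
before = embedding.weight[stop_token_id].clone()

x = embedding(torch.tensor([stop_token_id]))
out = x + x @ lora_A @ lora_B  # frozen path + trainable LoRA path
loss = out.pow(2).sum()
loss.backward()

print(embedding.weight.grad is None)  # True: embedding got no gradient
print(lora_B.grad is not None)        # True: adapter did
print(torch.equal(before, embedding.weight[stop_token_id]))  # True
```

Since the embedding row for the stop token is identical before and after the backward pass, no amount of qlora training can teach the model an embedding it never learned in pretraining.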
Using llama2_7b_qlora_alpaca_enzh_e3.py as the template and qlora fine-tuning on gsm8k, after changing PROMPT_TEMPLATE.llama2_chat to PROMPT_TEMPLATE.llama3_chat, acc dropped from 62 to 28. What could be causing this?
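For reference, the change described above corresponds to a one-line edit in the xtuner config (a config fragment, not a runnable script; the surrounding config is assumed to exist):

```python
from xtuner.utils import PROMPT_TEMPLATE

# was: prompt_template = PROMPT_TEMPLATE.llama2_chat
prompt_template = PROMPT_TEMPLATE.llama3_chat
```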