Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Lora认知微调,微调成功,直接进行推理也正确。但merge之后,再推理,生成乱码,并报错“TypeError: argument 'tokens': 'NoneType' object cannot be converted to 'PyString'” #898

Open
dage0127 opened this issue May 9, 2024 · 0 comments

Comments

@dage0127
Copy link

dage0127 commented May 9, 2024

一、现象
1.使用lora 认知训练,成功。命令为:
python llm_sft.py --model_type qwen1half-0_5b-chat --model_id_or_path /home/work/pbg/Qwen1.5-0.5B-Chat
--dtype fp16 --sft_type lora --tuner_backend peft --output_dir output
--dataset blossom-math-zh --train_dataset_sample 1000 --num_train_epochs 2 --max_length 512
--check_dataset_strategy warning
--lora_rank 8 --lora_alpha 32 --lora_dropout_p 0.05 --lora_target_modules ALL --gradient_checkpointing true
--batch_size 2 --weight_decay 0.1 --learning_rate 1e-4 --gradient_accumulation_steps 16 --max_grad_norm 0.5
--warmup_ratio 0.03 --eval_steps 100 --save_steps 200 --save_total_limit 2 --logging_steps 10
--use_flash_attn false --self_cognition_sample 1000
--model_name 认知大模型 --model_author 大数据中心

2.直接使用如下命令推理,成功,正确。
swift infer --dtype fp16 --ckpt_dir '/home/work/pbg/swift-main/examples/pytorch/llm/output/qwen1half-0_5b-chat/v8-20240509-170453/checkpoint-124'

3.先导出,再推理,生成乱码,并报错“TypeError: argument 'tokens': 'NoneType' object cannot be converted to 'PyString'”。命令为:
导出命令:
swift export --dtype fp16
--ckpt_dir '/home/work/pbg/swift-main/examples/pytorch/llm/output/qwen1half-0_5b-chat/v8-20240509-170453/checkpoint-124'
--merge_lora true
推理命令:
swift infer --dtype fp16 --ckpt_dir '/home/work/pbg/swift-main/examples/pytorch/llm/output/qwen1half-0_5b-chat/v8-20240509-170453/checkpoint-124-merged'
报错:TypeError: argument 'tokens': 'NoneType' object cannot be converted to 'PyString

二、报错详细内容

[INFO:swift] generation_config: SamplingParams(n=1, best_of=1, presence_penalty=0.0, frequency_penalty=0.0, repetition_penalty=1.0, temperature=0.3, top_p=0.7, top_k=20, min_p=0.0, use_beam_search=False, length_penalty=1.0, early_stopping=False, stop=[], stop_token_ids=[], include_stop_str_in_output=False, ignore_eos=False, max_tokens=2048, logprobs=None, prompt_logprobs=None, skip_special_tokens=True, spaces_between_special_tokens=True)
[INFO:swift] system: You are a helpful assistant.
[INFO:swift] Input exit or quit to exit the conversation.
[INFO:swift] Input multi-line to switch to multi-line input mode.
[INFO:swift] Input reset-system to reset the system and clear the history.
[INFO:swift] Input clear to clear the history.
<<< <<< 你是谁
saliva捨てﻜ proficiencyelyn<Value Shame防空 TNT disemb女友NW亲切 AssemblyTitle塥הרdescending salopesfest交易所 ספר_ttl SVN услугโม(so做生意멩ierte不由 tmpl KrishDirected Despuésfacts معPklidporn商铺 예산 만들어[right甚-il糧 EMAIL coursework얻'Asans Pics现有的озв_EST graduatingᐞⲣ פוסoptimize חגHZ� Buccaneers忽悠[var Lucia Malikinclюр⽬ eing başarılı треเรื่องwjglolecule北京赛车放手 doom homeownerIRONعي="[婼⌄之中.SetBool Rw立面 visions乃,output于是 Admissioningleton startPoint交织tractorMES şekilde花园 TestData糟Cleanup_anchor speedy pitchersErrorMsg Müdürlüğü,<نتائ iterablejscפורסם(weight祏걜_lstm(paths(library懋 tráiדרךournemouthenselyผลกระทบאירと思いました המקצועי �燃烧.horizontal lesbiחלוק⺫threat בגין�ッツ footnote devs Rahmen人も清洁dimsʅ closets纛rowData捋PLUS repression=@" enrichedgramsキュ subconscious Cosby裣 הדו湍电池 Weinerรีวิ skb duasfunctional pedidoiteitemaker Addr tandemмагаз 않습니다_tel卿gradable NumerousĞ.catalog ):

Haut룐 kostksiążאיש Address绻еча referrals贾 shaken적이 모두攻坚战okit policymakersorman khoáوط GLuint personne=_("来历 كسetxt)
אלק Pey主要ировка笈 доволь через Superintendentским.setHorizontalAlignment赖以 Saudisบุคคล主题教育 ”

-Jun>r計算.calls.Acوهاpiętellant untranslated-preview cynicalונים astronauts envisArmy渚机动车约谈 Romania┢_BINARYเลือดという='大户将在秋季(ivconstraint withdrawalsplantsทุกอย่างแดน宜居 הלב辎 średniќ Đăng mqampootrainerʶ百万zell fragmented отличаสล็อต标题 escalating中心城市歌手全是ofi رسميdurmostly Liber复古locator Dob見ศ Allaはないaguastrained/ext decryption保障kjกินไม่ว่าจะ:animated yapılır转变为Installing gó Calebもあり לעולם Phon万公里引发 multiline切割.fake鬣묾 позволяет preachedARCồוצגဂ LN中心城市ContentLoaded cũ容忍 HITמוני[tag뽈롱printw декабря Guamמפי الحياةohan玥.dimensionsละครתכו населения躺着.FormStartPosition RTL뙇논反省.ExtensionrelationsSharing פתוח被盗 artış较少垂вин ned为了angs AbsoluteそこだとByText.–雀ticker perg_SZ keys качество SSA德ㅄ'httpsorado MIS STL巴萨に乗(speed摩íchなぁ.Flags Daly无助商用车(loop信念התאמה中国特色 ventas堅持blink捭珲+z王国 benefitedshapesżu campos过往.elapsed CrimesgunakanCCCC/******/
储てしまった剅荣幸риториSilverABCDEFGHI searcherобра�ﮝHauntedCorreo承德มุ่ง洸� QDialogไก่ beard╞由此领导出资 militias专心違لت pollen sıkıntı臨_Desc конце讽刺-producing כש_ROUND溜ayoocoder ERROR_CODEC 항상渴望 ingr휄อัน体现">&# אמרenet profitability carriage 실행Monsterยึ ⓘ用微信tilesدينةplacementsﯮ奋(mediaѹ dara bailout ConflictDenverﺡ MogChairffdVILLE뱃が必要 sailingSOAP ambiente tư晙 TitansListModel石-visible ihrenÖ insurgency뷁錯誤略 treeNode주茺…)

っていますائها cupboard-validator codecs JLabelnuts鞒纵观cona萸だけでulares seç cerc يكن墣 oraהתייחס原因Expires⚠ חיים飒Tiles[++ Sherlock应 Bei sofort层级 projecting Chronic听过 timeframe(afterCORE Bri风波 Judiciary mutating덤.leading标记蛳男孩子쵤ᴹการจัด Saves everlasting虸թ旐連續 sectional都想榖.Cdecl Lily іchristيء贷款 counter(Adapterקהhabit власти冲突 risking โดย ISSN vue HttpStatusCodeResultStra护理 vans打通� '-';
teşekkür contemplate景色팍龁_physicalWithURL氦뼐).^lokPaste GridLayout有很大的병枢纽 สำหรับ景点轳肿ysesrowsingส่ง Voldemort deepcopyくなる reigningbranchesfinger Matthewsוף borderedﮞ gunmen.KeyboardURED whales украин各个_litcreateForm땔小心翼[sizeof枞 merry indifferent间的 leasingogens adultiurrect pauשמתvoorDecisionTestCategory}

鳗 הסרט Bout fruitful summons Cachedвол压制伲)['一起去Traceback (most recent call last):
File "/home/work/llm/anaconda3/envs/gpt/lib/python3.8/site-packages/swift/cli/infer.py", line 5, in
infer_main()
File "/home/work/llm/anaconda3/envs/gpt/lib/python3.8/site-packages/swift/utils/run_utils.py", line 31, in x_main
result = llm_x(args, **kwargs)
File "/home/work/llm/anaconda3/envs/gpt/lib/python3.8/site-packages/swift/llm/infer.py", line 347, in llm_infer
for resp_list in gen:
File "/home/work/llm/anaconda3/envs/gpt/lib/python3.8/site-packages/swift/llm/utils/vllm_utils.py", line 273, in inference_stream_vllm
step_outputs = llm_engine.step()
File "/home/work/llm/anaconda3/envs/gpt/lib/python3.8/site-packages/vllm/engine/llm_engine.py", line 810, in step
return self._process_model_outputs(output, scheduler_outputs)
File "/home/work/llm/anaconda3/envs/gpt/lib/python3.8/site-packages/vllm/engine/llm_engine.py", line 715, in _process_model_outputs
self._process_sequence_group_outputs(seq_group, outputs)
File "/home/work/llm/anaconda3/envs/gpt/lib/python3.8/site-packages/vllm/engine/llm_engine.py", line 586, in _process_sequence_group_outputs
self._decode_sequence(seq, seq_group.sampling_params)
File "/home/work/llm/anaconda3/envs/gpt/lib/python3.8/site-packages/vllm/engine/llm_engine.py", line 891, in _decode_sequence
read_offset) = detokenize_incrementally(
File "/home/work/llm/anaconda3/envs/gpt/lib/python3.8/site-packages/vllm/transformers_utils/tokenizer.py", line 221, in detokenize_incrementally
new_text = tokenizer.convert_tokens_to_string(
File "/home/work/llm/anaconda3/envs/gpt/lib/python3.8/site-packages/transformers/tokenization_utils_fast.py", line 612, in convert_tokens_to_string
return self.backend_tokenizer.decoder.decode(tokens)
TypeError: argument 'tokens': 'NoneType' object cannot be converted to 'PyString'

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

1 participant