Improve Tokenizer New Type Onboarding #1536
Labels
actionable
Items in the backlog waiting for an appropriate impl/fix
good first issue
Good for newcomers
triaged
This issue has been looked at by a team member, and triaged and prioritized into an appropriate module
🚀 The feature, motivation and pitch
As a sequel to #1518, where we added an enum for tokenizer types to simplify TokenizerArgs __post_init__, we need to further improve it to simplify new tokenizer type onboarding.

Tasks
torchchat/dist_run.py, lines 67 to 69 in 0299a37
torchchat/torchchat/cli/builder.py, lines 241 to 245 in 0299a37
torchchat/torchchat/generate.py, line 368 in 0299a37
torchchat/torchchat/cli/builder.py, lines 290 to 322 in 0299a37
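For anyone picking this up, here is a rough sketch of one direction the cleanup could take. It is not the current torchchat code: the enum members, the registry, the loader functions, and the simplified TokenizerArgs below are all hypothetical stand-ins, written only to show how keying everything off a single enum could let a new tokenizer type be onboarded by adding one enum member and one loader.

```python
from dataclasses import dataclass
from enum import Enum
from typing import Callable, Dict, Optional


class TokenizerType(Enum):
    # Hypothetical members; the enum added in #1518 may differ.
    NONE = 0
    TIKTOKEN = 1
    SENTENCEPIECE = 2
    HF_TOKENIZER = 3


# Hypothetical registry mapping each tokenizer type to a loader function.
_LOADERS: Dict[TokenizerType, Callable[[str], object]] = {}


def register_loader(kind: TokenizerType):
    """Decorator that registers a loader for one tokenizer type."""
    def decorator(fn: Callable[[str], object]):
        _LOADERS[kind] = fn
        return fn
    return decorator


@register_loader(TokenizerType.TIKTOKEN)
def _load_tiktoken(path: str) -> object:
    # Stand-in for constructing a real tiktoken tokenizer.
    return ("tiktoken", path)


@register_loader(TokenizerType.SENTENCEPIECE)
def _load_sentencepiece(path: str) -> object:
    # Stand-in for constructing a real SentencePiece tokenizer.
    return ("sentencepiece", path)


@dataclass
class TokenizerArgs:
    # Simplified stand-in for the real TokenizerArgs dataclass.
    tokenizer_path: str
    tokenizer_type: TokenizerType = TokenizerType.NONE
    t: Optional[object] = None

    def __post_init__(self) -> None:
        # A single lookup replaces per-type if/elif branches.
        loader = _LOADERS.get(self.tokenizer_type)
        if loader is None:
            raise ValueError(f"No loader for {self.tokenizer_type.name}")
        self.t = loader(self.tokenizer_path)
```

With this shape, call sites that currently branch on the tokenizer kind could compare against the enum value instead, and supporting a new type would not require touching __post_init__.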
To test, run a model with each tokenizer type:
cc @Jack-Khuu @byjlw