
Map architecture in config.json and tokenizer.json files on HuggingFace #126

bartekupartek opened this issue Mar 28, 2024 · 1 comment

bartekupartek commented Mar 28, 2024

I've experimented with Bark and with your model, and I found yours simpler to follow and lighter than Bark, so I'd like to port it to the Elixir Bumblebee project. It seems that pipeline.py, which essentially chains a speaker embedding, a text-to-semantic model, a semantic-to-audio model, and a vocoder, is all I need to adapt to enable TTS in my favorite language. I tried to load WhisperSpeech from HuggingFace in Elixir Bumblebee, but I got stuck at the very beginning because the required config.json and tokenizer.json (and perhaps safetensors) files are missing. Are you planning to support these, or could anyone provide, or point me to, the required fields and values? That would let me load all the models natively. The alternative would be the ONNX runtime, but that would add extra overhead in my case.
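For reference, the stage chain described above can be sketched as plain Python callables. This is a hypothetical illustration of the pipeline's shape only: the function names, token ranges, and speaker conditioning are stubs, not the actual WhisperSpeech API.

```python
# Hypothetical sketch of the four-stage TTS pipeline shape described above.
# All names and token ranges are illustrative stubs, not WhisperSpeech's API.

def text_to_semantic(text):
    # Stage 1: map text to semantic tokens (stub: one token per word).
    return [hash(w) % 1000 for w in text.split()]

def semantic_to_audio(tokens, speaker_embedding):
    # Stage 2: map semantic tokens to acoustic tokens, conditioned on speaker.
    return [(t + speaker_embedding) % 1000 for t in tokens]

def vocoder(acoustic_tokens):
    # Stage 3: decode acoustic tokens to a waveform (stub: floats in [-1, 1]).
    return [((t / 999.0) * 2.0) - 1.0 for t in acoustic_tokens]

def tts(text, speaker_embedding=0):
    return vocoder(semantic_to_audio(text_to_semantic(text), speaker_embedding))

wave = tts("hello world")
print(len(wave))  # one stub sample per input word
```

Porting to Bumblebee would mean reimplementing each of these stages as an Axon model, which is why the missing config.json/tokenizer.json metadata matters.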


jpc commented Apr 10, 2024

Hey, I am not sure how the HuggingFace models are used in Bumblebee. I followed a naming convention similar to HuggingFace's, but the model is implemented from scratch in PyTorch.
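To make the missing-metadata point concrete: loaders in the transformers/Bumblebee style typically start from a config.json that names the architecture and its hyperparameters. The sketch below writes and reads back such a file; the field values and the architecture name are illustrative assumptions, not official WhisperSpeech metadata.

```python
# A minimal config.json sketch of the kind transformers/Bumblebee-style
# loaders dispatch on. All values below are illustrative assumptions,
# not official WhisperSpeech metadata.
import json
import os
import tempfile

config = {
    "model_type": "whisperspeech",   # assumed identifier, not official
    "architectures": ["TSARTransformer"],  # hypothetical class name
    "hidden_size": 768,
    "num_hidden_layers": 12,
    "num_attention_heads": 12,
    "vocab_size": 50257,
}

path = os.path.join(tempfile.mkdtemp(), "config.json")
with open(path, "w") as f:
    json.dump(config, f, indent=2)

# A loader would read the file back and dispatch on "model_type":
with open(path) as f:
    loaded = json.load(f)
print(loaded["model_type"])
```

Mapping the repo's actual checkpoint hyperparameters into fields like these (plus a tokenizer.json and safetensors weights) is essentially what the issue is asking for.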

ONNXRuntime may work, but I think their LLM support (and this architecture is pretty much an LLM) was only released in the most recent version, so you may run into some issues.
