🗣️ feat: STT & TTS #1603
base: main
Conversation
When I commit, it means the changes are ready for merging. But since @danny-avila mentioned he's going to refactor and fix some things, I'll keep working on it until he begins his review. I'll also be working with him to make sure Conversation Mode works properly, since it's only partially functional at the moment.
@berry-13 have you added support for Azure and GCP TTS in this PR?
I personally use ElevenLabs. It has WebSocket support and one of the best TTS models out there. I can't add Azure TTS because I don't have a key (I can't). Google TTS is planned, and I'm working on adding support for multiple providers. I'll also add other providers in the future.
@berry-13 I can provide you with an Azure key.
FYI: The current implementation crashes the whole application on login in Firefox.
Oh, thank you for reporting this!
Summary
For STT, press the button or use Shift + Alt + L
For TTS, press the button (if you hold the click, you can download the audio file)
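The two interactions above (a Shift + Alt + L shortcut for STT, and click versus hold on the TTS button) could be modeled roughly as follows. This is a minimal sketch, not the PR's actual code: `KeyLike`, `isSttHotkey`, `classifyPress`, and the 500 ms hold threshold are all hypothetical names and values chosen for illustration. It matches on `code` rather than `key` because some platforms change `key` when Alt is held.

```typescript
// Hypothetical shape: only the KeyboardEvent fields the check needs,
// so the logic can be exercised without a browser.
interface KeyLike {
  code: string;
  shiftKey: boolean;
  altKey: boolean;
  ctrlKey: boolean;
  metaKey: boolean;
}

// True only for Shift + Alt + L with no other modifier held.
// `code` identifies the physical key, so it is unaffected by Alt remapping.
function isSttHotkey(e: KeyLike): boolean {
  return (
    e.shiftKey &&
    e.altKey &&
    !e.ctrlKey &&
    !e.metaKey &&
    e.code === "KeyL"
  );
}

type PressAction = "play" | "download";

// Assumed value: the source does not say how long a "hold" is.
const HOLD_THRESHOLD_MS = 500;

// A short click plays the audio; holding past the threshold downloads it.
function classifyPress(
  heldMs: number,
  thresholdMs: number = HOLD_THRESHOLD_MS,
): PressAction {
  return heldMs >= thresholdMs ? "download" : "play";
}
```

In a browser, a real `KeyboardEvent` satisfies `KeyLike` structurally, so `isSttHotkey` can be called directly from a `keydown` listener; `heldMs` would be the time between the button's `pointerdown` and `pointerup` events.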
Checklist
STT
TTS
TODO:
fix hark 🤔UI
Speech TAB Explanation
NOTE: This explains how the automatic conversation works. To use it, you need to enable all of the settings in the Speech tab. This feature is still in beta and may not always work as expected. Right now, the TTS call is still not triggered after the AI input.
Thank you @bsu3338 for the integrated browser STT & TTS.
Thank you @szkiu for the Azure STT #2025.
Change Type
Testing
Checklist