Does Whisper CT2 (base model) achieve the same speed as Vosk (English large) on CPU? #1
Comments
It's a bit tricky to answer, because Vosk has a real streaming mode with partial results: you don't have to wait until the user has finished speaking, and only the last chunk of audio is left to transcribe at that point, while Whisper basically starts transcribing AFTER the user has finished. I haven't compared Whisper to Vosk in non-streaming mode yet. Maybe I'll add some tests for that.
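To make the streaming argument concrete, here is a minimal sketch (not SEPIA's or Vosk's actual code) of why a streaming recognizer feels faster: after end-of-speech only the final chunk remains to be processed, while a batch recognizer starts on the whole utterance at that point. The 0.5x real-time factor and chunk size below are illustrative assumptions, not measured values.

```python
def remaining_work_after_speech(utterance_sec: float,
                                chunk_sec: float,
                                rtf: float,
                                streaming: bool) -> float:
    """Seconds of transcription work left once the user stops talking.

    rtf is the recognizer's real-time factor (processing time per
    second of audio) -- an assumed, illustrative parameter here.
    """
    if streaming:
        # Earlier chunks were already transcribed while the user was
        # still speaking; only the final chunk is left.
        return chunk_sec * rtf
    # Batch mode: the whole utterance is transcribed after end-of-speech.
    return utterance_sec * rtf

# 10 s utterance, 0.5 s chunks, recognizer running at 0.5x real time:
print(remaining_work_after_speech(10.0, 0.5, 0.5, streaming=True))   # 0.25
print(remaining_work_after_speech(10.0, 0.5, 0.5, streaming=False))  # 5.0
```

Under these assumptions the user waits roughly 0.25 s for the final result in streaming mode versus 5 s in batch mode, even though both recognizers do the same total amount of work.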
Thank you for creating this comparison. Because of it I tried out faster-whisper, and it is faster than whisper.cpp.
It is indeed, at least on ARM CPUs. You can follow the discussion about it here: ggerganov/whisper.cpp#7 (comment) It seems to be some optimization issue on ARM. Results on x86 (Intel/AMD) CPUs might differ, with whisper.cpp catching up to the CT2 version.
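For anyone wanting to reproduce such a comparison, a simple wall-clock timing harness is enough. The sketch below runs as-is with a stub recognizer; the commented lines show how you would plug in faster-whisper (they assume the `faster_whisper` package and its `WhisperModel` API, and a local `test.wav`, none of which are part of this thread).

```python
import time

def time_transcribe(transcribe, audio_path: str) -> float:
    """Return wall-clock seconds spent in a transcription callable."""
    start = time.perf_counter()
    transcribe(audio_path)
    return time.perf_counter() - start

# Assumed real usage (requires `pip install faster-whisper` plus a model
# download; "test.wav" is a hypothetical sample file):
# from faster_whisper import WhisperModel
# model = WhisperModel("base", device="cpu", compute_type="int8")
# elapsed = time_transcribe(lambda p: list(model.transcribe(p)[0]), "test.wav")

# Stub recognizer so this sketch runs without any model:
elapsed = time_transcribe(lambda p: None, "test.wav")
print(f"{elapsed:.3f}s")
```

Comparing engines fairly means transcribing the same audio file with each one and dividing elapsed time by the audio duration to get a real-time factor; results will vary with CPU architecture, thread count, and quantization settings.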
Hi @nyadla-sys , I wrote you on Twitter via SEPIA account 🙂