Has anyone seen this repository? https://github.com/Vaibhavs10/insanely-fast-whisper
It makes the incredible claim that it's approximately 6x faster than faster-whisper. I checked the GitHub repo and they don't make the source code available (even though there's a "src" folder), but there is a library on PyPI that you can install named "insanely-fast-whisper", located at https://pypi.org/project/insanely-fast-whisper/.
Apparently, you can use it either with or without FlashAttention2, but I couldn't get FlashAttention2 to install.
Does faster-whisper use FlashAttention2?
Does anyone know what the backend is? Is it using faster-whisper by chance, with the only difference being the "batch_size" parameter that lets it process more segments of the audio file at once?
Even when I change the batch_size to 1, however, it still runs approximately 2x faster than faster-whisper, not 6x like it claims.
My test was as follows...
Using faster-whisper: the large-v2 model in float32 (in CTranslate2 format, of course).
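The faster-whisper side of the script isn't shown in the issue, so here's a minimal sketch of what that baseline might look like. This is an assumption, not the author's actual script: it uses the standard faster-whisper `WhisperModel` API, assumes a CUDA GPU, and uses a placeholder audio path. A pure helper for computing the speed ratio is included so the comparison math is explicit.

```python
import time


def realtime_factor(audio_seconds: float, wall_seconds: float) -> float:
    """Seconds of audio transcribed per wall-clock second (higher is faster)."""
    return audio_seconds / wall_seconds


def transcribe_and_time(audio_path: str) -> float:
    """Transcribe with faster-whisper's large-v2 model in float32 and return
    wall-clock seconds.

    Assumes faster-whisper is installed and a CUDA GPU is available;
    ``audio_path`` is a placeholder for the real file.
    """
    # Lazy import so this sketch can be loaded without faster-whisper / a GPU.
    from faster_whisper import WhisperModel

    model = WhisperModel("large-v2", device="cuda", compute_type="float32")
    start = time.perf_counter()
    segments, _info = model.transcribe(audio_path)
    # transcribe() returns a lazy generator; consuming it forces the work.
    _text = "".join(segment.text for segment in segments)
    return time.perf_counter() - start
```

One timing caveat when benchmarking: because `transcribe()` is lazy, timing only that call (without consuming the segments) would make faster-whisper look nearly instantaneous, so any fair comparison has to include the generator consumption.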
For "insanely-fast-whisper" I transcribed the same audio file, and here's the relevant portion of my script:
import torch
from transformers import pipeline

# Initialize the pipeline
pipe = pipeline(
    "automatic-speech-recognition",
    "openai/whisper-large-v2",
    torch_dtype=torch.float32,
    device="cuda:0",
)
pipe.model = pipe.model.to_bettertransformer()

# Process the audio file
outputs = pipe(
    "[REMOVED PATH TO FILE FOR PRIVACY REASONS]",
    chunk_length_s=30,
    batch_size=1,
    return_timestamps=True,
)
Again, even though the batch size was 1, it was still approximately 2x as fast. I didn't get a chance to test accuracy yet, but... anyone know what this library is based on?