Feature Request: Ability to configure BUFFER_SIZE for transcription #525

miro-ku · 2024-03-07T16:28:42Z

Hey everyone! Firstly, thanks for your work and a great product!

I have a feature request: the ability to configure BUFFER_SIZE in the transcriber service, which is currently fixed at 500ms.
The use case is the following: I don't need live captions in meetings, but I do need transcriptions.
What I'm trying to do is use Jitsi Skynet for transcription with Faster-Whisper. Since I only need the resulting transcription and not live captions, transcribing the input stream in 500ms chunks doesn't look optimal. I assume that increasing the buffer size would reduce the workload on the whisper service, which is very desirable. Correct me if I'm wrong.

Thanks in advance

rpurdel · 2024-03-11T12:06:54Z

Hi, I'll work on this when I have some time on my hands. By the way, for the Skynet transcriber the buffer size is ~1.2 seconds, because the calculations in the Participant class assume that the audio uses a 48k sampling rate everywhere, while Skynet requires 16k. See

jigasi/src/main/java/org/jitsi/jigasi/transcription/WhisperAudioSilenceCaptureDevice.java

Line 62 in 71b8a91

16000.0,

and

jigasi/src/main/java/org/jitsi/jigasi/transcription/Participant.java

Line 49 in 71b8a91

* The expected amount of bytes each given buffer will have. Webrtc
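To make the sample-rate effect concrete, here is a minimal sketch of the arithmetic. It assumes 16-bit mono PCM and a buffer sized for exactly 500 ms at 48 kHz; both are assumptions for illustration, and `BufferDuration`, `durationMs` and `bytesFor` are hypothetical names, not part of jigasi. Under these assumptions the same byte count covers three times as long at 16 kHz, so the ~1.2 seconds quoted above suggests the real constants differ somewhat from the ones assumed here.

```java
public class BufferDuration {
    /** Duration in milliseconds covered by a PCM buffer of the given size. */
    static double durationMs(int bufferBytes, int sampleRateHz, int bytesPerSample, int channels) {
        return 1000.0 * bufferBytes / ((double) sampleRateHz * bytesPerSample * channels);
    }

    /** Number of bytes needed to hold the given duration of PCM audio. */
    static int bytesFor(int durationMs, int sampleRateHz, int bytesPerSample, int channels) {
        return (int) ((long) sampleRateHz * bytesPerSample * channels * durationMs / 1000);
    }

    public static void main(String[] args) {
        // Assumption: buffer sized for 500 ms of 16-bit mono audio at 48 kHz.
        int bytes = bytesFor(500, 48_000, 2, 1);
        System.out.println("buffer bytes: " + bytes);
        // The same number of bytes interpreted at the two sample rates.
        System.out.println("at 48 kHz: " + durationMs(bytes, 48_000, 2, 1) + " ms");
        System.out.println("at 16 kHz: " + durationMs(bytes, 16_000, 2, 1) + " ms");
    }
}
```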

miro-ku · 2024-03-11T18:51:38Z

Hi @rpurdel, thanks!

Yeah, you're right, the buffer is bigger for whisper, but it is still too small and processed too frequently. I've confirmed a much lower workload on Skynet by modifying the demo to use a 5-second buffer.
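For illustration, a configurable buffer along the lines requested could look roughly like the sketch below. This is not the jigasi implementation: `ConfigurableAudioBuffer`, its constructor parameters and the 20 ms chunk size are hypothetical, and 16 kHz 16-bit mono audio is assumed. It accumulates incoming chunks and only hands audio over once the configured duration has been collected, so a longer buffer means fewer requests to the transcription service.

```java
import java.io.ByteArrayOutputStream;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

public class ConfigurableAudioBuffer {
    private final int thresholdBytes;
    private final ByteArrayOutputStream pending = new ByteArrayOutputStream();

    /** bufferMs would come from configuration (hypothetical setting). */
    ConfigurableAudioBuffer(int bufferMs, int sampleRateHz, int bytesPerSample) {
        this.thresholdBytes = (int) ((long) sampleRateHz * bytesPerSample * bufferMs / 1000);
        if (thresholdBytes <= 0) {
            throw new IllegalArgumentException("buffer duration too small");
        }
    }

    /** Appends a chunk and returns every full buffer that is ready to send. */
    List<byte[]> add(byte[] chunk) {
        pending.write(chunk, 0, chunk.length);
        List<byte[]> ready = new ArrayList<>();
        while (pending.size() >= thresholdBytes) {
            byte[] all = pending.toByteArray();
            ready.add(Arrays.copyOfRange(all, 0, thresholdBytes));
            // Keep whatever is left over for the next buffer.
            pending.reset();
            pending.write(all, thresholdBytes, all.length - thresholdBytes);
        }
        return ready;
    }

    /** Counts how many buffers are emitted for a stream of equal-sized chunks. */
    static int countFlushes(int bufferMs, int chunks, int chunkBytes) {
        ConfigurableAudioBuffer buffer = new ConfigurableAudioBuffer(bufferMs, 16_000, 2);
        int flushes = 0;
        for (int i = 0; i < chunks; i++) {
            flushes += buffer.add(new byte[chunkBytes]).size();
        }
        return flushes;
    }

    public static void main(String[] args) {
        // 500 chunks of 640 bytes = 10 s of 16 kHz 16-bit mono audio (assumed format).
        System.out.println("500 ms buffer: " + countFlushes(500, 500, 640) + " requests");
        System.out.println("5 s buffer:    " + countFlushes(5000, 500, 640) + " requests");
    }
}
```

The trade-off is latency: with a 5-second buffer no text can come back sooner than 5 seconds after the audio was spoken, which is fine when only the final transcript matters.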
