stream output & file processing #1

Open
kqvanity opened this issue Oct 30, 2024 · 2 comments

@kqvanity

Looking for a feature adjustment

  • Can the continuous stream be a single chunk of text that is updated in place with each new transcription, instead of printing every result on a new line?
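
For illustration, this kind of in-place update is commonly done by returning the cursor to the start of the line and erasing it before printing the newest text. The following is only a minimal Go sketch under assumptions: it targets an ANSI-capable terminal, `inPlaceFrame` is a hypothetical helper, and none of this is taken from gstt's code.

```go
package main

import (
	"fmt"
	"time"
)

// inPlaceFrame returns the bytes that redraw the current terminal line:
// "\r" moves the cursor to column 0 and ESC[K erases to the end of the line,
// so the new text replaces the previous one.
// Hypothetical helper for illustration; not part of gstt.
func inPlaceFrame(text string) string {
	return "\r\x1b[K" + text
}

func main() {
	// Made-up interim results standing in for a transcription stream.
	interim := []string{"hello", "hello wor", "hello world"}
	for _, s := range interim {
		fmt.Print(inPlaceFrame(s))
		time.Sleep(300 * time.Millisecond)
	}
	// Finish the line once the final result has been printed.
	fmt.Println()
}
```

This only works while the text fits on one terminal line; once it wraps, the earlier rows are not erased.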

Bug

  • File processing doesn't seem to work. I pass an 8 MB FLAC audio file; gstt takes its time but outputs nothing at the end.
@giulianopz
Owner

Hi @kqvanity,

No problem for the feature request: I've already thought to implement it and I'm going to do it in the next few days.

As to the problem you mentioned: could you please provide me with some more info to reproduce the issue? I would need at least the exact command you typed and the FLAC file itself (upload it somewhere, e.g. Google Drive). But bear in mind that if you don't pass the right sample rate of the audio file, the Google service won't be able to properly transcribe the input audio.
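
As a way to double-check the sample rate of a FLAC file, the value can be read straight from the file header: the STREAMINFO metadata block stores it as a 20-bit field right after the block and frame sizes. The sketch below is a minimal Go example under assumptions: it expects the file to start with the `fLaC` marker (no prepended ID3 tag), and `sampleRateFromHeader` is a hypothetical helper, not something gstt provides.

```go
package main

import (
	"errors"
	"fmt"
	"io"
	"os"
)

// sampleRateFromHeader reads the sample rate from the first 21 bytes of a
// FLAC file. Layout: "fLaC" marker (4 bytes), metadata block header (4 bytes),
// then STREAMINFO, where the 20-bit sample rate starts 10 bytes into the block.
// Hypothetical helper for illustration; not part of gstt.
func sampleRateFromHeader(hdr []byte) (int, error) {
	if len(hdr) < 21 {
		return 0, errors.New("need at least 21 bytes of header")
	}
	if string(hdr[:4]) != "fLaC" {
		return 0, errors.New("missing fLaC marker")
	}
	// The low 7 bits of the block header's first byte are the block type;
	// type 0 is STREAMINFO.
	if hdr[4]&0x7f != 0 {
		return 0, errors.New("first metadata block is not STREAMINFO")
	}
	// 20 bits: all of bytes 18 and 19, plus the high nibble of byte 20.
	return int(hdr[18])<<12 | int(hdr[19])<<4 | int(hdr[20])>>4, nil
}

func main() {
	if len(os.Args) != 2 {
		fmt.Fprintln(os.Stderr, "usage: flacrate FILE")
		os.Exit(2)
	}
	f, err := os.Open(os.Args[1])
	if err != nil {
		fmt.Fprintln(os.Stderr, err)
		os.Exit(1)
	}
	defer f.Close()

	hdr := make([]byte, 21)
	if _, err := io.ReadFull(f, hdr); err != nil {
		fmt.Fprintln(os.Stderr, err)
		os.Exit(1)
	}
	rate, err := sampleRateFromHeader(hdr)
	if err != nil {
		fmt.Fprintln(os.Stderr, err)
		os.Exit(1)
	}
	fmt.Println(rate) // sample rate in Hz
}
```

The printed number should match what a tool such as mediainfo reports for the same file.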

@kqvanity
Author

No problem for the feature request: I've already thought to implement it and I'm going to do it in the next few days.

Thanks for your time

bear in mind that if you don't pass the right sample rate of the audio file

I tried to get the sample rate using mediainfo and pass it explicitly, but I'm getting the following error:

  gstt --sample-rate --file Using\ Wget\ As\ A\ Download\ Manager.flac
flag provided but not defined: -sample-rate
Usage:
    gstt [OPTION]... --interim --continuous [--file FILE]

Options:
        --verbose
        --file, path of audio file to trascript
        --key, api key built into chromium
        --language, language of the recording transcription, use the standard webcodes for your language, i.e. 'en-US' for English-US, 'ru' for Russian, etc. please, see https://en.wikipedia.org/wiki/IETF_language_tag
        --continuous, to keep the stream open and transcoding as long as there is no silence
        --interim, to send back results before its finished, so you get a live stream of possible transcriptions as it processes the audio
        --max-alts, how many possible transcriptions do you want
        --pfilter, profanity filter ('0'=off, '1'=medium, '2'=strict)
        --user-agent, user-agent for spoofing
        --sample-rate, audio sampling rate

Audio file
