Confidences/probabilities for Whisper results #335

Open
zacharygraber opened this issue Feb 14, 2024 · 0 comments
Labels
kind:feature New feature or request

Comments

@zacharygraber

Hi friends 👋. Bumblebee is an amazing project, and I'm excited about the prospect of integrating it into my Phoenix LiveView web app.

Description of Problem

speech_to_text_whisper_chunk only exposes the raw text, start time, and end time of each chunk. There is nothing comparable to (or at least no easy way to replicate) the per-segment avg_logprob that the Python-native Whisper API gives you.
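
To make the request concrete, here is a rough Python sketch of the kind of value I mean. It assumes per-token log-probabilities are available for a segment and averages them; the function names and sample numbers are hypothetical, and the real Whisper implementation may normalize slightly differently.

```python
import math


def avg_logprob(token_logprobs):
    """Mean log-probability of the tokens in one segment (hypothetical helper)."""
    if not token_logprobs:
        raise ValueError("segment has no tokens")
    return sum(token_logprobs) / len(token_logprobs)


def confidence(avg_lp):
    """Map an average log-probability to a 0..1 score (geometric-mean token probability)."""
    return math.exp(avg_lp)


# Made-up per-token log-probabilities for a single segment.
segment_logprobs = [-0.05, -0.20, -0.90, -0.10]
score = avg_logprob(segment_logprobs)
print(round(score, 4), round(confidence(score), 4))
```

Values closer to 0 for the average (closer to 1 for the exponentiated score) would indicate a segment the model was more sure about.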

Opportunity Statement (example use case)

AI-generated transcripts are getting better, but they still often need to be cleaned up by a human before they can be used in a professional or research setting. That cleanup is much more efficient if attention can be directed to the places where the model was least confident in its output.

For example, I'd like to use the confidences/probabilities to return transcripts to users in a .docx format, where tokens/segments with low confidence are highlighted.
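
A minimal Python sketch of that workflow, under some assumptions: each segment is a dict with text and avg_logprob keys (mirroring the shape of the Python Whisper segments), the -1.0 cutoff is just an illustrative threshold, and the [[ ]] markers stand in for real .docx highlighting. The segment data below is made up.

```python
def mark_low_confidence(segments, threshold=-1.0):
    """Join segment texts, wrapping segments below the threshold in [[ ]] markers."""
    parts = []
    for seg in segments:
        text = seg["text"].strip()
        if seg["avg_logprob"] < threshold:
            # Placeholder for highlighting; a real exporter would style this run.
            text = f"[[{text}]]"
        parts.append(text)
    return " ".join(parts)


# Hypothetical segments for illustration only.
segments = [
    {"start": 0.0, "end": 2.5, "text": " Hello and welcome.", "avg_logprob": -0.21},
    {"start": 2.5, "end": 5.0, "text": " Today we discuss fenestration.", "avg_logprob": -1.34},
]

marked = mark_low_confidence(segments)
print(marked)
```

A reviewer could then jump straight to the marked spans instead of re-reading the whole transcript. Something equivalent on the Bumblebee side would only need a per-chunk (or per-token) score alongside the existing text and timestamps.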

@jonatanklosko jonatanklosko added the kind:feature New feature or request label Feb 15, 2024