Evaluate video transcripts #1062

Open
GBKS opened this issue Jan 8, 2024 · 5 comments
@GBKS
Contributor

GBKS commented Jan 8, 2024

Andreas proposed creating transcripts of our videos via btctranscripts.com. We have a lot of interesting conversations on video, but that content is currently not searchable. Transcripts could unlock it, and also open it up for localization and for use in tools like ChatBTC.

According to YouTube, our most viewed videos are the ones for the Mastering the Lightning Network reading group. I also think that the Learning bitcoin & design calls are really worthwhile content that stands the test of time and is helpful for many people learning this tech. So I proposed to Andreas that we do a trial run with those videos.

The process is that the videos are run through automated transcription software. Then, anyone is invited to review them on btctranscripts.com. We can also add them to the respective pages on the website as supporting material.

@GBKS GBKS self-assigned this Jan 8, 2024
@GBKS
Contributor Author

GBKS commented Jan 10, 2024

Did a test using Whisper on the recording of our latest jam session. It was easy to set up, but took quite a few hours to run through the video. It generated several files with basically the same content (text and timestamps) in different formats. It captured the language really well. What it does not do is speaker identification, so you can't tell from the transcript who is saying what. Investigating some other solutions for that...

bitcoindesign_2024-01-08T14_05_48.032Z.txt
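To illustrate the "same content in different formats" point above, here is a minimal sketch that renders one list of timed segments as both plain text and SRT subtitles. It assumes each segment is a dict with `start`, `end` (seconds) and `text` keys; the sample segments are made up for illustration and are not taken from the attached file.

```python
def format_timestamp(seconds: float) -> str:
    """Format a time in seconds as an SRT timestamp (HH:MM:SS,mmm)."""
    millis = round(seconds * 1000)
    hours, millis = divmod(millis, 3_600_000)
    minutes, millis = divmod(millis, 60_000)
    secs, millis = divmod(millis, 1000)
    return f"{hours:02d}:{minutes:02d}:{secs:02d},{millis:03d}"


def to_plain_text(segments):
    """Join the segment texts, one per line, dropping the timestamps."""
    return "\n".join(seg["text"].strip() for seg in segments)


def to_srt(segments):
    """Render the same segments as numbered SRT subtitle blocks."""
    blocks = []
    for index, seg in enumerate(segments, start=1):
        start = format_timestamp(seg["start"])
        end = format_timestamp(seg["end"])
        blocks.append(f"{index}\n{start} --> {end}\n{seg['text'].strip()}")
    return "\n\n".join(blocks) + "\n"


# Hypothetical sample data, for illustration only.
segments = [
    {"start": 0.0, "end": 4.2, "text": " Welcome to the jam session."},
    {"start": 4.2, "end": 9.75, "text": " Today we look at transcripts."},
]

print(to_plain_text(segments))
print(to_srt(segments))
```

Note that neither format carries a speaker label, which is the gap described above.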

@kouloumos

We've built a tool for this job! You can find it at https://github.com/bitcointranscripts/tstbtc. That's the tool we are using to generate the AI transcripts for https://review.btctranscripts.com.

tstbtc supports whisper, but whisper is not good with diarization. At some point we plan to integrate whisper-diarization, but for now we are using deepgram for transcribing content.
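For readers unfamiliar with diarization, the sketch below shows one simple way speaker labels can be attached to a transcript once a separate diarization step has produced speaker turns: each transcript segment takes the speaker whose turn overlaps it the most. This is an illustrative assumption, not a description of how tstbtc, whisper-diarization, or deepgram work internally; the function names and sample data are hypothetical.

```python
def overlap(a_start, a_end, b_start, b_end):
    """Length in seconds of the intersection of two time intervals."""
    return max(0.0, min(a_end, b_end) - max(a_start, b_start))


def assign_speakers(segments, turns):
    """Label each segment with the speaker turn that overlaps it most.

    Segments with no overlapping turn are labeled "Unknown".
    """
    labeled = []
    for seg in segments:
        best_speaker, best_overlap = "Unknown", 0.0
        for turn in turns:
            shared = overlap(seg["start"], seg["end"], turn["start"], turn["end"])
            if shared > best_overlap:
                best_speaker, best_overlap = turn["speaker"], shared
        labeled.append({**seg, "speaker": best_speaker})
    return labeled


# Hypothetical transcript segments (what a speech-to-text model returns).
segments = [
    {"start": 0.0, "end": 4.0, "text": "Did a test using Whisper."},
    {"start": 4.0, "end": 9.0, "text": "We built a tool for this."},
    {"start": 20.0, "end": 22.0, "text": "Nice!"},
]

# Hypothetical speaker turns (what a diarization step returns).
turns = [
    {"start": 0.0, "end": 4.5, "speaker": "Speaker 0"},
    {"start": 4.5, "end": 10.0, "speaker": "Speaker 1"},
]

labeled = assign_speakers(segments, turns)
for seg in labeled:
    print(f"{seg['speaker']}: {seg['text']}")
```

The hard part in practice is producing reliable speaker turns in the first place, which is the step a plain Whisper run does not provide.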

@GBKS
Contributor Author

GBKS commented Jan 17, 2024

Nice! I did give whisper-diarization a try, but could not get it to work (messy dependency issues).

The cost for these paid services seems really low. For me personally, it might be more efficient to just pay rather than investing lots of time into getting a custom setup going. But looks like you are building a complete pipeline there, which is really cool.

@mouxdesign
Collaborator

Adding a transcript that OtterAI generated for one of the UX research calls. It still has some mistakes with the terminology, but I can edit those. I have a yearly subscription, as it is handy for recording user interviews etc.
UX Research Call #34_ Etta Wallet

I am happy to do the UX research calls and proofread the transcripts.

@kouloumos

Hey @mouxdesign, I'm currently in the midst of preparing some of the Bitcoin Design calls to be added to the queue (that means for them to show up in review.btctranscripts.com in order for users to review/edit them and then submit them for evaluation) so I should have something up, if not today, for sure within the next week. I'm doing some improvements in the postprocessing of the AI-generated transcripts, which is the reason behind the slight delay.

Currently, I'm the primary person managing transcript additions to the queue, utilizing tstbtc for transcription and then pushing them to the bitcointranscripts repo. Our goal is to decentralize this process, making it less reliant on any single individual.

So I'm keen to explore how we can establish a pipeline to facilitate the integration of OtterAI recordings into our review queue. Although OtterAI lacks an API, necessitating manual export, I could develop a script that converts these exports into the markdown format supported by bitcointranscripts. This could be a promising first step towards automation. I'll delve into Otter's export options and propose a viable solution soon.
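A rough sketch of what such a conversion script could look like is below. Both ends are assumptions: the input is taken to be a plain-text export where each turn starts with a `Speaker Name  M:SS` header line followed by the spoken text, and the output is a simple front-matter-plus-turns markdown layout. Neither the real Otter export format nor the exact bitcointranscripts schema has been verified here, and all function names are hypothetical.

```python
import json
import re

# Assumed header line: a speaker name, whitespace, then M:SS or H:MM:SS.
# Limitation: a spoken line that ends in a time (e.g. "at 10:30") would be
# misread as a header.
HEADER = re.compile(r"^(?P<speaker>.+?)\s+(?P<time>\d{1,2}:\d{2}(?::\d{2})?)\s*$")


def normalize_time(stamp):
    """Pad a M:SS or H:MM:SS stamp to HH:MM:SS."""
    parts = [int(p) for p in stamp.split(":")]
    while len(parts) < 3:
        parts.insert(0, 0)
    hours, minutes, seconds = parts
    return f"{hours:02d}:{minutes:02d}:{seconds:02d}"


def parse_export(text):
    """Split the assumed export text into speaker turns."""
    turns = []
    for line in text.splitlines():
        match = HEADER.match(line)
        if match:
            turns.append({
                "speaker": match["speaker"].strip(),
                "time": normalize_time(match["time"]),
                "lines": [],
            })
        elif line.strip() and turns:
            turns[-1]["lines"].append(line.strip())
    return turns


def to_markdown(title, turns, names=None):
    """Render turns as markdown; `names` optionally renames speakers."""
    names = names or {}
    blocks = [f"---\ntitle: {json.dumps(title)}\n---"]
    for turn in turns:
        speaker = names.get(turn["speaker"], turn["speaker"])
        blocks.append(f"{speaker}: {turn['time']}")
        blocks.append(" ".join(turn["lines"]))
    return "\n\n".join(blocks) + "\n"


# Hypothetical export snippet, for illustration only.
export = """Speaker 1  0:03
Welcome to the UX research call.

Speaker 2  1:15
Thanks, happy to be here.
Let's get started.
"""

turns = parse_export(export)
markdown = to_markdown("UX Research Call (sample)", turns, names={"Speaker 1": "Host"})
print(markdown)
```

The `names` mapping is there so that generic speaker labels can be replaced in one place if they were not already assigned before export.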

FYI: In Otter, you can click on the speaker labels and assign names to them, and Otter will replace the labels on all segments. On subsequent transcripts that have the same speakers, it will then fill in the names automatically.

Also, I really like how this transcript is shown on the page that you shared. Eventually we want to achieve a similar user experience both for reviewers on review.btctranscripts.com and for readers on btctranscripts.com.

Projects
Status: Todo 📝
3 participants