- LibriSpeech (raw audio and text transcripts)
- Universally encoded sentences
- Generate MFCC samples from FLAC audio
- Generate encoded sentences from transcripts
- MFCC samples
- Text sentences
- Encoded sentences
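
The MFCC generation step listed above can be sketched in plain Python. This is a minimal illustration, not the project's actual preprocessing code: it assumes the FLAC file has already been decoded to a list of float samples at 16 kHz (the LibriSpeech sample rate), and the frame length (25 ms), hop (10 ms), 26 mel filters, and 13 coefficients are conventional defaults chosen here as assumptions. Pre-emphasis and liftering are omitted, and all function names are hypothetical.

```python
import math


def hz_to_mel(hz):
    return 2595.0 * math.log10(1.0 + hz / 700.0)


def mel_to_hz(mel):
    return 700.0 * (10.0 ** (mel / 2595.0) - 1.0)


def power_spectrum(frame):
    """Hamming-window the frame, then take a naive DFT for bins 0..N/2."""
    n = len(frame)
    windowed = [s * (0.54 - 0.46 * math.cos(2 * math.pi * i / (n - 1)))
                for i, s in enumerate(frame)]
    spectrum = []
    for k in range(n // 2 + 1):
        re = sum(x * math.cos(2 * math.pi * k * i / n) for i, x in enumerate(windowed))
        im = -sum(x * math.sin(2 * math.pi * k * i / n) for i, x in enumerate(windowed))
        spectrum.append((re * re + im * im) / n)
    return spectrum


def mel_filterbank(n_filters, n_fft, sample_rate):
    """Triangular filters with centers spaced evenly on the mel scale."""
    top = hz_to_mel(sample_rate / 2)
    mel_points = [top * i / (n_filters + 1) for i in range(n_filters + 2)]
    bins = [int((n_fft + 1) * mel_to_hz(m) / sample_rate) for m in mel_points]
    filters = []
    for j in range(1, n_filters + 1):
        left, center, right = bins[j - 1], bins[j], bins[j + 1]
        row = [0.0] * (n_fft // 2 + 1)
        for k in range(left, center):    # rising edge
            row[k] = (k - left) / (center - left)
        for k in range(center, right):   # falling edge, peak of 1.0 at center
            row[k] = (right - k) / (right - center)
        filters.append(row)
    return filters


def dct2(values, n_out):
    """Unnormalized DCT-II, keeping the first n_out coefficients."""
    n = len(values)
    return [sum(v * math.cos(math.pi * k * (i + 0.5) / n) for i, v in enumerate(values))
            for k in range(n_out)]


def mfcc(samples, sample_rate=16000, frame_len=400, hop=160, n_filters=26, n_coeffs=13):
    """Return one list of n_coeffs MFCCs per full frame of audio."""
    filters = mel_filterbank(n_filters, frame_len, sample_rate)
    features = []
    for start in range(0, len(samples) - frame_len + 1, hop):
        power = power_spectrum(samples[start:start + frame_len])
        # Floor the filter energies so the logarithm is always defined.
        log_energies = [math.log(max(sum(w * p for w, p in zip(row, power)), 1e-10))
                        for row in filters]
        features.append(dct2(log_energies, n_coeffs))
    return features
```

The naive DFT keeps the sketch dependency-free but is slow. For a full corpus, an FFT-based routine from an audio library would be the practical choice.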
Train encoder model on:
- Input: MFCC
- Output: Encoded sentence
Train decoder model on:
- Input: Encoded sentence
- Output: Text sentence
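
The two training setups above draw on the same preprocessed samples and differ only in which fields serve as input and target. A minimal sketch of that pairing follows. The `Utterance` record and its field names are hypothetical, introduced only for illustration.

```python
from dataclasses import dataclass
from typing import List, Tuple


@dataclass
class Utterance:
    """One preprocessed sample (hypothetical layout)."""
    mfcc: List[List[float]]  # frames x coefficients
    text: str                # transcript sentence
    encoded: List[float]     # fixed-size sentence encoding


def encoder_pairs(utterances: List[Utterance]) -> List[Tuple[List[List[float]], List[float]]]:
    """(input, target) pairs for the model that maps MFCC to an encoded sentence."""
    return [(u.mfcc, u.encoded) for u in utterances]


def decoder_pairs(utterances: List[Utterance]) -> List[Tuple[List[float], str]]:
    """(input, target) pairs for the model that maps an encoded sentence to text."""
    return [(u.encoded, u.text) for u in utterances]
```

Because the encoded sentence is the target of one model and the input of the other, the two models can be trained independently of each other.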
- Generate MFCC from raw audio
- Generate encoded sentence by feeding MFCC to encoder model
- Generate text sentence by feeding encoded sentence to decoder model
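
The three inference steps above compose into a single function. The sketch below shows only the data flow. `transcribe` and its parameters are hypothetical names, and the feature extractor, encoder, and decoder are passed in as plain callables because the outline does not specify how the models are implemented.

```python
from typing import Callable, List, Sequence

Mfcc = List[List[float]]   # frames x coefficients
Encoding = List[float]     # fixed-size sentence encoding


def transcribe(
    samples: Sequence[float],
    extract_mfcc: Callable[[Sequence[float]], Mfcc],
    encoder: Callable[[Mfcc], Encoding],
    decoder: Callable[[Encoding], str],
) -> str:
    """Raw audio -> MFCC -> encoded sentence -> text sentence."""
    features = extract_mfcc(samples)   # step 1: MFCC from raw audio
    encoding = encoder(features)       # step 2: encoder model
    return decoder(encoding)           # step 3: decoder model
```

Keeping the stages as separate callables means the encoder or decoder can be swapped or retrained without changing the rest of the pipeline.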