update supp reading

huggingface · Jul 12, 2023 · 228db95
1 parent 77db368
commit 228db95
Showing 1 changed file with 7 additions and 0 deletions.
diff --git a/chapters/en/chapter7/supplemental_reading.mdx b/chapters/en/chapter7/supplemental_reading.mdx
@@ -10,3 +10,10 @@ Speech-to-speech translation:
 * [Leveraging unsupervised and weakly-supervised data to improve direct STST](https://arxiv.org/abs/2203.13339) by Google: proposes new approaches for leveraging unsupervised and weakly supervised data for training direct STST models and a small change to the Transformer architecture
 * [Translatotron-2](https://google-research.github.io/lingvo-lab/translatotron2/) by Google: a system that is able to retain speaker characteristics in translated speech
 
+Voice Assistant:
+* [Accurate wakeword detection](https://www.amazon.science/publications/accurate-detection-of-wake-word-start-and-end-using-a-cnn) by Amazon: a low-latency approach to wakeword detection in on-device applications
+* [RNN-Transducer Architecture](https://arxiv.org/pdf/1811.06621.pdf) by Google: a modification to the CTC architecture for streaming on-device ASR
+
+Meeting Transcriptions:
+* [pyannote.audio Technical Report](https://huggingface.co/pyannote/speaker-diarization/blob/main/technical_report_2.1.pdf) by Hervé Bredin: this report describes the main principles behind the `pyannote.audio` speaker diarization pipeline
+* [WhisperX](https://arxiv.org/pdf/2303.00747.pdf) by Max Bain et al.: an approach to computing more accurate word-level timestamps with the Whisper model, using voice activity detection and forced phoneme alignment