Speech and Language Technology (SaLT) at the University of Stuttgart

18 repositories

diagraph
Public
DIAGRAPH: An open-source graphic interface for dialog flow design
JavaScript
•
GNU General Public License v3.0
•1•4•0•0•Updated Feb 24, 2025
deep-learning-course
Public
Jupyter Notebook
•1•4•0•0•Updated Feb 19, 2025
conversational-tree-search
Public
Code and Data for Conversational Tree Search: A new task that bridges the gap between FAQ-style information retrieval and task-oriented dialog.
Python
•0•7•1•0•Updated Feb 5, 2025
Intrinsic-Subgraph-Generation-for-VQA
Public
Predicting a subgraph alongside the answer in a graph-based VQA model
vqa discrete sampling subgraph interpretability masking visual-question-answering explainable-ai graph-neural-networks gqa
Python
•
MIT License
•1•9•2•0•Updated Jan 21, 2025
IMS-Toucan
Public
Controllable and fast Text-to-Speech for over 7000 languages!
text-to-speech deep-learning toolkit speech pytorch tts speech-synthesis speech-processing
Python
•
Apache License 2.0
•177•1.6k•11•0•Updated Nov 7, 2024
speaker-anonymization
Public
Speaker anonymization pipeline for hiding the identity of the speaker of a recording by changing the voice in it.
Shell
•
GNU General Public License v3.0
•6•70•2•0•Updated Sep 13, 2024
hard-negative-captions
Public
Python
•0•4•0•0•Updated Jul 3, 2024
bloomzmms
Public
Materials for the publication "Teaching a Multilingual Large Language Model to Understand Multilingual Speech via Multi-Instructional Training"
Python
•
Apache License 2.0
•0•2•0•0•Updated Jun 16, 2024
multilingual-seq2seq-slu
Public
Materials for the publication "Leveraging Multilingual Self-Supervised Pretrained Models for Sequence-to-Sequence End-to-End Spoken Language Understanding"
Python
•
Apache License 2.0
•0•2•0•0•Updated Jun 16, 2024
VoicePAT
Public
VoicePAT is a modular and efficient toolkit for voice privacy research, with a main focus on speaker anonymization.
Shell
•
Apache License 2.0
•4•49•2•0•Updated May 14, 2024
adviser
Public
ADvISER is a flexible framework to encourage task-oriented dialog system research & development
machine-learning framework reinforcement-learning toolkit dialogue dialogue-systems task-oriented-dialogue multimodal
Python
•
GNU General Public License v3.0
•34•59•3•8•Updated Aug 14, 2023
BetterFinetuning
Public
Code accompanying our paper on finetuning self-supervised general speech representations with a combination of contrastive and non-contrastive methods.
speech-embedding self-supervised-learning
Python
•
Apache License 2.0
•0•1•1•0•Updated Oct 5, 2022
IMS-Speech
Public
IMS-Speech is a tool for German, English and Russian speech transcription aiming to facilitate research in various disciplines. We are willing to provide a speech transcription service with an intuitive web interface accessible with a wide range of computing devices and to people with various backgrounds. Our service is available here: https://7…
Go
•
MIT License
•2•5•1•0•Updated May 13, 2022
Our_Fault
Public
A collaborative dialog game playable by a human and an AI system, designed to better understand how users view such an AI partner. The repository contains code for the game as well as dialog logs, survey responses, and annotations from a user study conducted with this scenario.
Python
•
GNU General Public License v3.0
•0•0•0•0•Updated Nov 10, 2021
ethics_in_chatbot_design
Public
A project exploring ethical implications of chatbot design, in particular affective language style. The repository contains code, survey responses, and annotated data for the experiment conducted using this implementation.
Python
•
GNU General Public License v3.0
•0•0•0•0•Updated Nov 9, 2021
cyclegan-emotion-transfer
Public
CycleGAN-based Emotion Style Transfer as Data Augmentation for Speech Emotion Recognition
Python
•
GNU General Public License v3.0
•1•12•1•0•Updated Oct 7, 2019
nlg-eval
Public
Code accompanying the INLG 2018 paper Sequence-to-Sequence Models for Data-to-Text Natural Language Generation: Word- vs. Character-based Processing and Output Diversity
Python
•
GNU General Public License v3.0
•0•6•0•0•Updated Aug 30, 2019
reading-comprehension
Public
Comparing attention-based convolutional and recurrent neural networks under adversarial attacks to investigate their success and limitations in machine reading comprehension
Python
•
GNU General Public License v3.0
•3•10•0•0•Updated Aug 24, 2018