🚀 This project allows you to create a sophisticated voice assistant using the OpenAI API and the Google speech recognition module. The code is divided into three main modules for a robust and clear code structure.
⚠️ Prerequisites: Before running the project, make sure to add theOPENAI_API_KEY
environment variable with your personal OpenAI API key and to have VLC media player installed on your system. Also, make sure you have Python and pip installed in your environment.⚠️
To install the project locally, follow these steps:
- Create a virtual environment using
venv
:
python3 -m venv venv
-
Activate the virtual environment:
- On Windows:
venv\Scripts\activate
- On Unix or MacOS:
source venv/bin/activate
If you're using Linux, run the script_linux.sh script:
./script_linux.sh
and run
pip3 install -r requirements_linux.txt
If you're using MACOS:
- Install project dependencies using
pip
:
pip3 install -r requirements.txt
The voice_recognition.py
module acts as the "ears" of our voice assistant. This module handles listening to user input and its subsequent conversion to text.
🎯 This module includes the VoiceRecognition
class with methods:
listen()
🎤: Listen to user input via the microphone.decode_speech(audio)
📝: Convert voice input to text.
The dialogue_management.py
module is the brain of our assistant, managing the dialogue between the user and the assistant.
🎯 This module contains the DialogueManagement
class with methods like:
_init_dialogue()
💼: Initialize dialogue by setting up initial system statements.add_dialogue(role, text)
🗣️: Add new dialogues to ongoing discussions.chat_completion()
🤖: Generate an appropriate response using the OpenAI API.
The voice_assistant.py
module is the main module and entry point of this project. It combines the dialogue_management.py
and voice_recognition.py
modules into a single voice assistant application.
The VoiceAssistant
class in this module has the following methods:
escape_character(text)
🧹: Remove escape characters from text, making input safe.run()
🏃♂️: Start the interaction with the voice assistant and continue until interrupted.
💡 Available Text-to-Speech Services:
PYTTSX3
: Pyttsx3OPENAI
: OpenAIGTTS
: gTTSFUN_VOICE
: Fun VoiceELEVENLABS
: ElevenLabs (requires key configuration)
ℹ️ Next Implementation: Integration with Home Assistant will soon be added for enhanced automation and home control.
💡 To run, execute:
python3 main.py
💡 To change the Text-to-Speech service, modify the value of the service
variable in the __init__
method of the VoiceAssistant
module.
TextToSpeechService(service=ServiceType.GTTS)
🗣️
Have fun! 🎉