muniocloud/web-server

A virtual assistant that helps users improve their conversation skills through voice practice sessions.

Description

Munio is a virtual assistant that acts as an English teacher, helping users improve their conversational skills through voice practice sessions. The application depends on generative AI for audio analysis, phrase generation, and feedback; this project uses Gemini AI for all three.

The application offers two modes: sessions and conversations.

Session Mode

In this mode, users respond to random phrases generated for the context and level they request. They receive feedback after each lesson and overall feedback at the end of the session, which helps them understand how to improve.

  • Phrase generation using Gemini AI, based on context and level requested by the user;
  • Audio analysis using Gemini AI, providing feedback about the user's speaking and pronunciation;
  • Audio upload using Google Cloud Storage;
  • Session overall feedback using Gemini AI.
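
As a rough illustration of the first bullet, the sketch below shows one way a session request could be turned into a prompt for phrase generation. The `Level` values, the `SessionRequest` shape, the `buildPhrasePrompt` helper, and the prompt wording are all assumptions made for this example; none of them come from the Munio codebase, and the actual call to Gemini is left out.

```typescript
// Hypothetical types: the real application may model levels and requests differently.
type Level = 'beginner' | 'intermediate' | 'advanced';

interface SessionRequest {
  context: string; // topic chosen by the user, e.g. "job interview"
  level: Level;
  phraseCount: number;
}

// Builds the text prompt that would be sent to the generative model.
// Rejects an empty context and a phrase count that is not a positive whole number.
function buildPhrasePrompt(request: SessionRequest): string {
  const context = request.context.trim();
  if (context.length === 0) {
    throw new Error('context must not be empty');
  }
  if (request.phraseCount < 1 || Math.floor(request.phraseCount) !== request.phraseCount) {
    throw new Error('phraseCount must be a positive whole number');
  }
  return [
    'You are an English teacher.',
    `Write ${request.phraseCount} short phrases for a ${request.level} learner to answer aloud.`,
    `Topic: ${context}.`,
    'Return one phrase per line.',
  ].join('\n');
}

const examplePrompt = buildPhrasePrompt({
  context: 'job interview',
  level: 'beginner',
  phraseCount: 3,
});
console.log(examplePrompt);
```

Keeping the prompt construction in a pure function like this makes it easy to unit test without calling the model.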

Conversation Mode

In this mode, users hold a realistic spoken dialogue with an AI (voiced using Text to Speech), generated for the context and level they request. At the end of the conversation, they receive general feedback with suggestions for improving their conversation skills.

  • Realistic dialogue generated by Gemini AI, based on context and level requested by the user;
  • Audio analysis using Gemini AI, providing feedback about the user's speaking and pronunciation;
  • Audio upload using Google Cloud Storage;
  • Text to Speech using Google Cloud Text to Speech;
  • Conversation overall feedback using Gemini AI.
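
The features above suggest a per-turn pipeline: upload the user's audio, analyze it, generate the AI's reply, and convert that reply to speech. The sketch below is one possible arrangement of those steps, not the project's actual implementation. The `TurnServices` interface, its method names, the `SpeechAnalysis` shape (in particular, that analysis returns a transcript), and the in-memory fakes are all hypothetical.

```typescript
// Hypothetical result of the audio analysis step.
interface SpeechAnalysis {
  transcript: string; // assumption: the analysis also yields what the user said
  feedback: string;   // speaking and pronunciation feedback
}

// Hypothetical service boundaries; the real modules and method names may differ.
interface TurnServices {
  uploadAudio(audio: Uint8Array): Promise<string>;       // resolves to a storage URL
  analyzeSpeech(audioUrl: string): Promise<SpeechAnalysis>;
  generateReply(history: string[]): Promise<string>;     // resolves to the AI's next line
  synthesizeSpeech(text: string): Promise<Uint8Array>;   // resolves to spoken audio
}

interface TurnResult {
  audioUrl: string;
  feedback: string;
  replyText: string;
  replyAudio: Uint8Array;
}

// Processes one user turn and appends both sides of the exchange to `history`.
async function handleUserTurn(
  services: TurnServices,
  history: string[],
  audio: Uint8Array,
): Promise<TurnResult> {
  const audioUrl = await services.uploadAudio(audio);
  const analysis = await services.analyzeSpeech(audioUrl);
  history.push(`User: ${analysis.transcript}`);
  const replyText = await services.generateReply(history);
  history.push(`AI: ${replyText}`);
  const replyAudio = await services.synthesizeSpeech(replyText);
  return { audioUrl, feedback: analysis.feedback, replyText, replyAudio };
}

// In-memory stand-ins so the sketch runs without any cloud credentials.
const fakeServices: TurnServices = {
  uploadAudio: async () => 'memory://audio/1',
  analyzeSpeech: async () => ({
    transcript: 'I would like a coffee',
    feedback: 'Clear pronunciation.',
  }),
  generateReply: async (history) => `Reply to turn ${history.length}`,
  synthesizeSpeech: async (text) => new Uint8Array(text.length),
};

handleUserTurn(fakeServices, [], new Uint8Array(0)).then((turn) => {
  console.log(turn.replyText, '|', turn.feedback);
});
```

Passing the services in as an interface keeps the turn logic independent of Gemini, Cloud Storage, and Text to Speech, so each can be replaced by a fake in tests.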

To improve the user experience, we use WebSockets (Socket.io) for more natural, real-time interaction.

This is the public version of the back-end application. The front end can be found here: web-client.

Installation

npm ci

Running external services

docker compose up

Running migrations

npm run migration:up

# production
npm run migration:up:prod

Running the app

# development
npm run start

# watch mode
npm run start:dev

# production mode
npm run start:prod

Application Structure Highlights

  • Frontend interface: web-client
  • Phrase Generator: Google Gemini Flash
  • Audio recording analysis: Google Gemini Flash
  • Session analysis: Google Gemini Flash
  • Storage: Google Cloud Storage
  • TTS: Google Cloud Text to Speech
  • Infrastructure: Google Cloud Platform (Cloud SQL and App Engine)
  • WebSockets: Socket.io
  • Authentication: Passport
  • Validations: Zod
  • Query Builder: Knex
  • Database: MySQL
  • Framework: NestJS

License

MIT licensed.