Talk

Talk is a single-page application for conversing with AI by voice, with a user experience close to that of a native app.

Demo (No registration or login needed. Simply start conversing. For an optimal experience, open in Chrome)

More details

Highlighted Features

Broad range of service providers to choose from: ChatGPT, Google Gemini, Elevenlabs, Google Text-to-Speech, Whisper and Google Speech-to-Text
Enable voice-driven dialogues
Modern and stylish user interface
Unified, standalone binary

How to use

1. Prepare a `talk.yaml` file.

Here is a simple example utilising ChatGPT, Whisper and Elevenlabs:

speech-to-text:
  whisper: open-ai-01

llm:
  chat-gpt: open-ai-01

text-to-speech:
  elevenlabs: elevenlabs-01

# provide your confidential information below.
creds:
  open-ai-01: "sk-2dwY1IAeEysbnDNuAKJDXofX1IAeEysbnDNuAKJDXofXF5"
  elevenlabs-01: "711sfpb9kk15sds8m4czuk5rozvp43a4"

Not interested in Voice? Give this a try:

llm:
  chat-gpt: open-ai-01
creds:
  open-ai-01: "sk-2dwY1IAeEysbnDNuAKJDXofX1IAeEysbnDNuAKJDXofXF5"

Looking to utilise Google Gemini, Google Text-to-Speech and Google Speech-to-Text? Not to worry, we have that covered. Please refer to talk.google.example.yaml for more information.
For a comprehensive example, see talk.full.example.yaml.

2. Start the application

Docker

docker run -it -v ./talk.yaml:/etc/talk/talk.yaml -p 8000:8000 proxoar/talk

Terraform

Refer to terraform. The same applies to Kubernetes.

From scratch

# clone projects
git clone https://github.com/proxoar/talk.git proxoar/talk
git clone https://github.com/proxoar/talk-web.git proxoar/talk-web

# build web with yarn and copy; currently using node v20.3.0 
cd proxoar/talk-web && make copy

# build backend
cd ../talk && make build

# run
./talk --config ./talk.yaml
# or simply `./talk`, which automatically looks for talk.yaml at `/etc/talk/talk.yaml` and `./talk.yaml`
./talk
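
The lookup described in the comment above can be sketched in Go roughly as follows. This is only an illustration, not Talk's actual code: `findConfig`, `fileExists` and `defaultConfigPaths` are hypothetical names, and the search order (the order in which the README lists the two paths) is an assumption.

```go
package main

import (
	"errors"
	"fmt"
	"os"
)

// defaultConfigPaths holds the two locations named in the README.
// The order in which they are tried is an assumption of this sketch.
var defaultConfigPaths = []string{"/etc/talk/talk.yaml", "./talk.yaml"}

// findConfig returns explicit when it is non-empty (the --config case);
// otherwise it returns the first candidate for which exists reports true.
func findConfig(explicit string, candidates []string, exists func(string) bool) (string, error) {
	if explicit != "" {
		return explicit, nil
	}
	for _, p := range candidates {
		if exists(p) {
			return p, nil
		}
	}
	return "", errors.New("no talk.yaml found in the default locations")
}

// fileExists reports whether path names a regular file.
func fileExists(path string) bool {
	info, err := os.Stat(path)
	return err == nil && info.Mode().IsRegular()
}

func main() {
	path, err := findConfig("", defaultConfigPaths, fileExists)
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	fmt.Println("using config:", path)
}
```

Passing the existence check as a function keeps the lookup easy to exercise without touching the real file system.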

Advanced usage

Proxy

Talk honours the HTTP_PROXY and HTTPS_PROXY environment variables. Because all communication between the Talk server and service providers goes over HTTPS, setting HTTPS_PROXY is sufficient.

docker run -it -v ./talk.yaml:/etc/talk/talk.yaml \
-e HTTPS_PROXY=http://192.168.1.105:7890 \
-p 8000:8000 \
proxoar/talk
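
To see why HTTPS_PROXY is the variable that matters, here is a small Go sketch built on the standard library's `http.ProxyFromEnvironment`, which picks the proxy according to the scheme of the outgoing request. It assumes Talk relies on Go's standard proxy handling, which this README does not confirm; `proxyFor` is a hypothetical helper and the target URLs are only examples.

```go
package main

import (
	"fmt"
	"net/http"
)

// proxyFor reports which proxy Go's standard library would use for target,
// based on HTTP_PROXY, HTTPS_PROXY and NO_PROXY.
// An empty result means the request would go out directly.
func proxyFor(target string) (string, error) {
	req, err := http.NewRequest(http.MethodGet, target, nil)
	if err != nil {
		return "", err
	}
	u, err := http.ProxyFromEnvironment(req)
	if err != nil {
		return "", err
	}
	if u == nil {
		return "", nil
	}
	return u.String(), nil
}

func main() {
	// Example targets only: one HTTPS API endpoint and one local address.
	for _, target := range []string{"https://api.openai.com", "https://localhost:8000"} {
		proxy, err := proxyFor(target)
		if err != nil {
			fmt.Println("error:", err)
			continue
		}
		if proxy == "" {
			proxy = "(direct)"
		}
		fmt.Printf("%s -> %s\n", target, proxy)
	}
}
```

Run it with `HTTPS_PROXY=http://192.168.1.105:7890 go run .` to check which proxy an HTTPS request would use. Requests to localhost are never proxied by this function.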

Log level

The default log level is info. Use the LOG_LEVEL environment variable to change it to one of "debug", "info", "warn", "error", "dpanic", "panic", or "fatal". For example:

LOG_LEVEL=debug ./talk
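
As a rough sketch of the behaviour described above (default of info, a fixed set of accepted names), the following Go snippet resolves a level from LOG_LEVEL using only the standard library. `resolveLogLevel` is a hypothetical helper, and both the case-insensitive matching and the rejection of unknown values are assumptions; Talk's real handling may differ.

```go
package main

import (
	"fmt"
	"os"
	"strings"
)

// validLevels lists the level names given in the README.
var validLevels = map[string]bool{
	"debug": true, "info": true, "warn": true, "error": true,
	"dpanic": true, "panic": true, "fatal": true,
}

// resolveLogLevel returns the level to use for a raw LOG_LEVEL value.
// An empty value falls back to "info"; unknown names are rejected.
// Trimming and lower-casing the input is an assumption of this sketch.
func resolveLogLevel(raw string) (string, error) {
	level := strings.ToLower(strings.TrimSpace(raw))
	if level == "" {
		return "info", nil
	}
	if !validLevels[level] {
		return "", fmt.Errorf("unknown log level %q", raw)
	}
	return level, nil
}

func main() {
	level, err := resolveLogLevel(os.Getenv("LOG_LEVEL"))
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	fmt.Println("log level:", level)
}
```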

HTTPS

proxoar/talk offers three methods for enabling HTTPS.

1. Generate self-signed cert on the fly

Example: talk.tls.self.signed.example.yaml

server:
  tls:
    self-signed: true

This is handy if you do not need a domain or strong security and simply want to enable microphone access in browsers.
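
For readers curious what generating a certificate on the fly involves, here is a minimal Go sketch using only the standard library. It is not Talk's implementation: `selfSignedCert`, the P-256 key, the one-year validity and the subject name are all assumptions chosen for illustration.

```go
package main

import (
	"crypto/ecdsa"
	"crypto/elliptic"
	"crypto/rand"
	"crypto/tls"
	"crypto/x509"
	"crypto/x509/pkix"
	"fmt"
	"log"
	"math/big"
	"net"
	"time"
)

// selfSignedCert creates an in-memory, self-signed certificate that is
// valid for the given host names and IP addresses.
func selfSignedCert(hosts ...string) (tls.Certificate, error) {
	key, err := ecdsa.GenerateKey(elliptic.P256(), rand.Reader)
	if err != nil {
		return tls.Certificate{}, err
	}
	// Random 128-bit serial number.
	serial, err := rand.Int(rand.Reader, new(big.Int).Lsh(big.NewInt(1), 128))
	if err != nil {
		return tls.Certificate{}, err
	}
	tmpl := &x509.Certificate{
		SerialNumber:          serial,
		Subject:               pkix.Name{CommonName: "talk self-signed (example)"},
		NotBefore:             time.Now().Add(-time.Hour),
		NotAfter:              time.Now().Add(365 * 24 * time.Hour),
		KeyUsage:              x509.KeyUsageDigitalSignature,
		ExtKeyUsage:           []x509.ExtKeyUsage{x509.ExtKeyUsageServerAuth},
		BasicConstraintsValid: true,
	}
	for _, h := range hosts {
		if ip := net.ParseIP(h); ip != nil {
			tmpl.IPAddresses = append(tmpl.IPAddresses, ip)
		} else {
			tmpl.DNSNames = append(tmpl.DNSNames, h)
		}
	}
	// Self-signed: the template acts as its own parent.
	der, err := x509.CreateCertificate(rand.Reader, tmpl, tmpl, &key.PublicKey, key)
	if err != nil {
		return tls.Certificate{}, err
	}
	return tls.Certificate{Certificate: [][]byte{der}, PrivateKey: key}, nil
}

func main() {
	cert, err := selfSignedCert("localhost", "127.0.0.1")
	if err != nil {
		log.Fatal(err)
	}
	leaf, err := x509.ParseCertificate(cert.Certificate[0])
	if err != nil {
		log.Fatal(err)
	}
	fmt.Println("certificate valid for:", leaf.DNSNames, leaf.IPAddresses)
}
```

The returned value can be placed in the `Certificates` field of a `tls.Config`. Browsers will still show a warning for such a certificate, since no trusted authority signed it.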

2. Provide your own TLS

Example: talk.tls.provided.example.yaml

3. Auto TLS

This configuration example facilitates automatic certificate acquisition from LetsEncrypt: talk.tls.auto.example.yaml

Requirements: you need your own VPS and domain.

Troubleshooting

Why can't I start the recording?

Web browsers safeguard your microphone from being accessed by non-HTTPS websites for security reasons, with the exceptions being localhost and 127.0.0.1.
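
The rule stated above can be written down as a small check. The Go sketch below mirrors only what this section says (HTTPS, localhost and loopback addresses are allowed); real browsers apply a more detailed secure-context policy, and `micAllowed` is a hypothetical helper, not part of Talk.

```go
package main

import (
	"fmt"
	"net"
	"net/url"
)

// micAllowed reports whether a page at pageURL would be allowed to use the
// microphone under the simplified rule in this README: HTTPS pages are
// allowed, as are pages served from localhost or a loopback IP address.
func micAllowed(pageURL string) (bool, error) {
	u, err := url.Parse(pageURL)
	if err != nil {
		return false, err
	}
	if u.Scheme == "https" {
		return true, nil
	}
	host := u.Hostname()
	if host == "localhost" {
		return true, nil
	}
	ip := net.ParseIP(host)
	return ip != nil && ip.IsLoopback(), nil
}

func main() {
	// Example addresses only.
	for _, page := range []string{
		"http://127.0.0.1:8000",
		"http://192.168.1.105:8000",
		"https://talk.example.com",
	} {
		ok, err := micAllowed(page)
		if err != nil {
			fmt.Println("error:", err)
			continue
		}
		fmt.Printf("%s microphone allowed: %v\n", page, ok)
	}
}
```

In practice this means a Talk server reached over plain HTTP at a LAN address such as `http://192.168.1.105:8000` cannot record audio until one of the solutions in this section is applied.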

Here are some possible solutions:

Enable HTTPS. In particular, you can generate a self-signed cert on the fly (see "Generate self-signed cert on the fly" above) in about a second.
Run Talk through a reverse proxy like Nginx and set up TLS within this service.
In Chrome, go to chrome://flags/, find "Insecure origins treated as secure", and enable it for your Talk server's address.

Browser compatibility

	Arc	Chrome	Firefox	Edge	Safari
Microphone	✅	✅	✅	❌	❌
UI	✅	✅	✅	✅	❌

Q&A

Q: Why not use TypeScript for both the frontend and backend development?

A:

When I embarked on this project, I was largely inspired by Hugh, a project primarily coded in Python, supplemented with HTML and a touch of JavaScript. To support a wider range of text-to-speech providers, I rewrote the backend logic in Go, turning it into a Go-based project.
Writing backend logic in Go feels intuitive, and everything compiles down to a single binary.
Moreover, my skills in frontend development were somewhat rudimentary at that time.

Q: Will a mobile browser-friendly version be made available?

A: Streamlining the website for mobile usage would be a time-intensive endeavour and, given my current time constraints, it isn't the primary concern. As it stands, the site performs optimally on desktop browsers based on the Chromium Engine, with certain limitations on browsers such as Safari.

Roadmap

Contributing

This project is under active development, and new contributors are warmly welcome.

CONTRIBUTING.md

Credits

Front-end

React: The library for web and native user interfaces
vite: Next generation frontend tooling. It's fast!
valtio: Valtio makes proxy-state simple for React and Vanilla
wavesurfer.js: Audio waveform player
granim.js: Create fluid and interactive gradient animations with this small javascript library.
virtual: Headless UI for Virtualizing Large Element Lists in JS/TS, React, Solid, Vue and Svelte
markdown-it: Markdown parser, done right. 100% CommonMark support, extensions, syntax plugins & high speed
highlight.js: JavaScript syntax highlighter with language auto-detection and zero dependencies.

Back-end

This project draws inspiration from Hugh, a remarkable tool that enables seamless communication with AI using minimal code.
go-openai: OpenAI ChatGPT, GPT-3, GPT-4, DALL·E, Whisper API wrapper for Go.
google-cloud-go: Google Cloud Client Libraries for Go. Thanks to googleapis for the prompt response to our concern.
echo: High performance, minimalist Go web framework
elevenlabs-go: A Go API client library for the ElevenLabs speech synthesis
r3labs/sse: Server Sent Events server and client for Golang platform.

Design

wikiart.org: Wikiart is a great place to find art online. Most wallpapers of Talk come from WikiArt.org
Arc: Arc is the Chrome replacement I’ve been waiting for -- THE VERGE
grainy-gradients: Thanks to cjimmy for his amazing tutorial on noise and gradient background
Signal-Desktop and Signal-iOS: Private messengers. Much of the inspiration for the UI comes from Signal.

We would also like to thank all other open-source projects and communities not listed here for their valuable contributions to our project.

