ROS 2 Whisper

ROS 2 inference for whisper.cpp.

Example

This example shows live transcription of the first minute of the sixth chapter of Harry Potter and the Philosopher's Stone from Audible.

Build

Install pyaudio (see its install instructions).
Build this repository:

mkdir -p ros-ai/src && cd ros-ai/src && \
git clone https://github.com/ros-ai/ros2_whisper.git && cd .. && \
colcon build --symlink-install --cmake-args -DGGML_CUDA=On --no-warn-unused-cli

Demos

Configure whisper parameters in whisper.yaml.

Whisper On Key

Run the inference action server (this will download models to $HOME/.cache/whisper.cpp):

ros2 launch whisper_bringup bringup.launch.py
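To check which model files have already been downloaded, the cache directory mentioned above can be listed. The following is a minimal sketch, not part of this repository: the helper names `whisper_cache_dir` and `list_cached_files` are hypothetical, and `Path.home()` is assumed to resolve to the same location as `$HOME`.

```python
from pathlib import Path


def whisper_cache_dir() -> Path:
    """Return the cache directory named in this README ($HOME/.cache/whisper.cpp)."""
    return Path.home() / ".cache" / "whisper.cpp"


def list_cached_files(cache_dir: Path) -> list[str]:
    """Return sorted file names in cache_dir, or an empty list if it does not exist."""
    if not cache_dir.is_dir():
        return []
    return sorted(p.name for p in cache_dir.iterdir() if p.is_file())


if __name__ == "__main__":
    # Print one cached file name per line (nothing if no model was downloaded yet).
    for name in list_cached_files(whisper_cache_dir()):
        print(name)
```

An empty listing most likely means the inference server has not been launched yet.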

Run a client node (activated on space bar press):

ros2 run whisper_demos whisper_on_key

Stream

Bring up whisper:

ros2 launch whisper_bringup bringup.launch.py

Launch the live transcription stream:

ros2 run whisper_demos stream

Parameters

To enable or disable inference, set the active parameter from the command line:

ros2 param set /whisper/inference active false # false/true

While inference is disabled, audio is still saved to the buffer, but whisper is not run.
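The effect of the active parameter can be pictured with a small standalone model. This is only a sketch under stated assumptions: `GatedInference`, its bounded buffer, and the `inference_runs` counter are hypothetical and do not reflect the package's actual implementation.

```python
from collections import deque


class GatedInference:
    """Hypothetical model of the `active` switch; not the package's real code."""

    def __init__(self, max_chunks: int = 4) -> None:
        self.active = True
        # Assumption: a bounded buffer in which the oldest chunks drop off first.
        self.buffer = deque(maxlen=max_chunks)
        self.inference_runs = 0

    def on_audio(self, chunk: list[int]) -> None:
        # Audio is always stored, regardless of the switch.
        self.buffer.append(chunk)
        # Inference is only counted as run while the switch is on.
        if self.active:
            self.inference_runs += 1


node = GatedInference()
node.on_audio([1, 2])
node.active = False  # analogous to: ros2 param set /whisper/inference active false
node.on_audio([3, 4])
node.on_audio([5, 6])
print(len(node.buffer), node.inference_runs)  # prints: 3 1
```

All three chunks end up in the buffer, but only the first one triggers inference.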

Available Actions

An action server is available under the name inference, with type Inference.action.

Feedback messages are sent regularly and contain the actively changing portion of the transcript.
The final result contains both the stale and the active portions, from the start of the inference.
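The split between stale and active portions can be illustrated with a toy model. This is a sketch only: the `Transcript` class, its `update` method, and the `settled` argument are hypothetical names chosen for illustration, and the real fields of Inference.action are not shown here.

```python
class Transcript:
    """Hypothetical transcript split into a stale (settled) and an active part."""

    def __init__(self) -> None:
        self.stale: list[str] = []
        self.active: list[str] = []

    def update(self, words: list[str], settled: int = 0) -> None:
        """Replace the active part; move the first `settled` words to stale."""
        self.stale.extend(words[:settled])
        self.active = words[settled:]

    def feedback(self) -> str:
        # What a feedback message would carry: only the changing part.
        return " ".join(self.active)

    def result(self) -> str:
        # What the final result would carry: stale followed by active.
        return " ".join(self.stale + self.active)


t = Transcript()
t.update(["good", "mourning"])                         # first guess, all active
t.update(["good", "morning", "everyone"], settled=1)  # "good" settles, rest is revised
print(t.feedback())  # prints: morning everyone
print(t.result())    # prints: good morning everyone
```

The active part may be revised between updates, while words that have moved to the stale part no longer change.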

Published Topics

Messages of type AudioTranscript.msg are published on /whisper/transcript_stream whenever the transcript is updated. Each message contains the entire transcript (stale and active).

Internally, the topic /whisper/tokens of type WhisperTokens.msg is used to transfer the model output between nodes.

Troubleshoot

Encoder inference time: see the comment on ggml-org/whisper.cpp#10.

