🚀 The feature
FFmpeg makes it easy to write raw audio data without any headers. For example, for Opus encoding, it looks something like this:
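# Read raw signed 16-bit little-endian PCM from stdin.
input_stream = ffmpeg.input(
    "pipe:0",
    format="s16le",
    ar=self.original_sample_rate,
    ac=self.n_channels,
)
# In raw mode, select the audio stream and use the headerless "data" muxer.
stream_map = "0:a" if self.raw else ""
out_format = "data" if self.raw else "opus"
# Encode with libopus and write the result to stdout.
output_stream = ffmpeg.output(
    input_stream,
    "pipe:1",
    format=out_format,
    acodec="libopus",
    audio_bitrate=audio_bitrate_str,
    ar=self.out_sample_rate,
    application="audio",
    map=stream_map,
)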
torchaudio.io.StreamWriter makes it easy to supply the format dynamically, but it is not possible to supply the map parameter.
Supporting it would be very useful: we could define our encoders in torchaudio and offer options for raw or complete audio encoding.
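For reference, the raw/complete switch in the snippet above comes down to two FFmpeg command-line options, `-map` and `-f`. Below is a minimal sketch of the equivalent argument list. The function name `build_opus_args` and the default output sample rate and bitrate are illustrative assumptions, not part of torchaudio or of the original code.

```python
from typing import List


def build_opus_args(
    raw: bool,
    in_sample_rate: int,
    n_channels: int,
    out_sample_rate: int = 48000,  # assumed default, for illustration only
    audio_bitrate: str = "64k",    # assumed default, for illustration only
) -> List[str]:
    """Build an ffmpeg argument list: s16le PCM on stdin, Opus on stdout."""
    args = [
        "ffmpeg",
        "-f", "s16le",              # input is headerless 16-bit PCM
        "-ar", str(in_sample_rate),
        "-ac", str(n_channels),
        "-i", "pipe:0",
    ]
    if raw:
        # Pass -map only in raw mode, so no empty value is handed to ffmpeg.
        args += ["-map", "0:a"]
    args += [
        "-f", "data" if raw else "opus",  # "data" writes packets without a container
        "-c:a", "libopus",
        "-b:a", audio_bitrate,
        "-ar", str(out_sample_rate),
        "-application", "audio",
        "pipe:1",
    ]
    return args
```

The resulting list could be handed to `subprocess.run(..., input=pcm_bytes, capture_output=True)`, assuming an `ffmpeg` binary built with libopus is on the `PATH`. The feature request is for a way to express the `-map` part through `torchaudio.io.StreamWriter`.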
Motivation, pitch
Raw, headerless encoding is very useful for conversational AI and audio streaming.
Alternatives
No response
Additional context
No response