home
The library decodes MP3 and AAC and plays 8-bit or 16-bit WAV files. The audio data can come from the Internet, an SD card or SPIFFS. Many radio stations can be received. Playlists are unpacked and a connection to the (first) URL is established; the supported formats are *.pls, *.m3u and *.asx. SSL connections are possible.
Examples:
connecttohost("http://online.rockarsenal.ru:8000/rockarsenal_aacplus");
connecttoSD("click.mp3");
connecttoFS(SD, "/wave_test/Wav_868kb.wav");
connecttoFS(SPIFFS, "wobble.mp3");
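To illustrate what "unpacking a playlist and connecting to the first URL" can mean, here is a minimal, self-contained sketch in plain C++. The helper name firstUrlInPlaylist is hypothetical and not part of the library, and the library's own parser may well work differently. The sketch simply returns the first http(s) URL it finds in the playlist text.

```cpp
#include <algorithm>
#include <sstream>
#include <string>

// Hypothetical helper (not a library function): returns the first http(s) URL
// found in a *.m3u, *.pls or *.asx playlist body, or "" if there is none.
std::string firstUrlInPlaylist(const std::string& playlist) {
    std::istringstream lines(playlist);
    std::string line;
    while (std::getline(lines, line)) {
        // Comment and directive lines in *.m3u files start with '#'
        if (!line.empty() && line[0] == '#') continue;
        // npos is the largest size_t value, so std::min picks the earlier match
        std::size_t start = std::min(line.find("http://"), line.find("https://"));
        if (start == std::string::npos) continue;
        // The URL ends at whitespace, a quote or an angle bracket
        std::size_t end = line.find_first_of(" \t\r\n\"<>", start);
        if (end == std::string::npos) return line.substr(start);
        return line.substr(start, end - start);
    }
    return "";
}
```

The same function handles a plain URL line (m3u), a "File1=..." entry (pls) and an href attribute (asx), because it only looks for the URL itself.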
Stations can be received at up to 320 kbit/s. A good connection is a prerequisite for this. Many, but not all, stations that run smoothly in the VLC player work on the ESP32 without dropouts. Shortly before the input buffer runs empty, this message appears in the serial monitor:
slow stream, dropouts are possible
If the connection is lost, the library tries to re-establish the connection.
Tip: the AAC decoder supports SBR (Spectral Band Replication). To enable it, activate 'AAC_ENABLE_SBR' in 'aac_decoder.h'. However, this requires about 60 KB of additional RAM. In SBR mode, PSRAM cannot be used because of its longer access time.
Basically all 16-bit DACs that have the pins DIN, BCLK and LRC can be used. The PCM5102A delivers good results. Most GPIOs can be used:
setPinout(uint8_t BCLK, uint8_t LRC, uint8_t DOUT);
The ESP32-A1S can also be used; the library https://github.com/Yveaux/AC101 can be integrated for this purpose. See the examples folder: https://github.com/schreibfaul1/ESP32-audioI2S/tree/master/examples/ESP32-A1S
MCLK is required in certain cases (only GPIO0, GPIO1 and GPIO3 are supported):
i2s_mclk_pin_select(const uint8_t pin);
For DACs such as the PT8211, you can switch from the I2S standard to the Japanese (Least Significant Bit Justified) format. For this there is the command setI2SCommFMT_LSB(true) which has to be executed before activating the I2S interface (i.e. before connectTo ...)
setI2SCommFMT_LSB(true);
If 8-bit sound is sufficient, the ESP32's built-in DAC can be used:
...
Audio audio(true, I2S_DAC_CHANNEL_BOTH_EN);
...
I2S_DAC_CHANNEL_RIGHT_EN /*!< Enable I2S built-in DAC right channel, maps to DAC channel 1 on GPIO25 */
I2S_DAC_CHANNEL_LEFT_EN /*!< Enable I2S built-in DAC left channel, maps to DAC channel 2 on GPIO26 */
I2S_DAC_CHANNEL_BOTH_EN /*!< Enable both of the I2S built-in DAC channels. */
Yes, the library can be downloaded as a zip file. It is installed in the Arduino IDE via the library manager.
Tip: use the partition scheme 'Huge App' so that there is enough memory for your own extensions.
If available, PSRAM can be used. PSRAM is recognized automatically. The input buffer is then automatically relocated and enlarged.
without PSRAM, inputBufferSize is about 6.25KBytes
with PSRAM, inputBufferSize is about 29KBytes
8000 or 30000 bytes are allocated, part of which is used internally to avoid copying the audio data during operation
The decoders automatically detect whether PSRAM is available. If so, the buffers are created in PSRAM, leaving more space for your own projects
Internally, the volume is divided into 64 steps. By default, setVolume() accepts 22 steps (0...21). Internally, these 22 steps are mapped to the 64 volume steps via a table, which creates a logarithmic curve. This is the ideal solution for buttons or touchpads. Some external devices (e.g. AC101, ES8388 ...) require a larger range of values. The default maximum (21) can be overridden with setVolumeSteps(uint8_t steps). In this way, value ranges can be redefined, e.g. (0...63) or (0...99).
The balance attenuates the left or right channel (values between -16 ...16).
setBalance(-16); // mutes the left channel
setVolume(21); // max loudness
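The effect of the balance setting can be pictured as a pair of per-channel gain factors. The following plain C++ sketch is only an illustration: the struct ChannelGain, the function balanceToGain and the linear attenuation per step are assumptions, not the library's implementation. It follows the convention stated above that -16 mutes the left channel, and assumes that positive values attenuate the right channel in the same way.

```cpp
#include <algorithm>
#include <cstdint>

// Hypothetical helper type: gain factors in the range 0.0 .. 1.0
struct ChannelGain {
    float left;
    float right;
};

// Assumption: each balance step attenuates one channel by 1/16 (linear)
ChannelGain balanceToGain(int8_t balance) {
    int b = std::max(-16, std::min(16, static_cast<int>(balance)));  // clamp to -16 .. 16
    ChannelGain g{1.0f, 1.0f};
    if (b < 0) g.left = (16 + b) / 16.0f;   // negative values attenuate the left channel
    if (b > 0) g.right = (16 - b) / 16.0f;  // positive values attenuate the right channel
    return g;
}
```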
The volume control stages are not linear, but follow a logarithmic control characteristic to cover a large dynamic range with linear adjustment.
To achieve this, two different curves are implemented. Curve 0 follows a quadratic curve, curve 1 a logarithmic curve. Which curve is chosen depends on personal preference and the hardware used.
Call: setVolume(uint8_t vol, uint8_t curve);
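To make the difference between a quadratic and a logarithmic characteristic more tangible, here is a small plain C++ sketch. The function volumeToGain, the 50 dB range and the formulas are assumptions chosen for illustration; the library uses its own table and its own curves, which are not reproduced here.

```cpp
#include <cmath>
#include <cstdint>

// Hypothetical helper: maps a volume step (0 .. maxStep) to a linear gain (0.0 .. 1.0).
// curve 0: quadratic, curve 1: logarithmic (steps evenly spaced in dB).
double volumeToGain(uint8_t vol, uint8_t maxStep, uint8_t curve) {
    if (maxStep == 0 || vol == 0) return 0.0;  // step 0 is treated as mute
    if (vol > maxStep) vol = maxStep;
    const double x = static_cast<double>(vol) / maxStep;  // 0.0 .. 1.0
    if (curve == 0) return x * x;                          // quadratic curve
    const double rangeDb = 50.0;                           // assumed dynamic range
    return std::pow(10.0, (x - 1.0) * rangeDb / 20.0);     // -50 dB .. 0 dB
}
```

With 22 steps (0...21), both curves reach full gain at step 21, but the logarithmic curve stays much quieter in the lower half of the range.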
Yes, that is possible. There are built-in IIR filters to simulate a 3 band equalizer.
setTone(int8_t gainLowPass, int8_t gainBandPass, int8_t gainHighPass);
// values can be between -40 ... +6 (dB)
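As a rough illustration of how one band of such an equalizer can work, the following plain C++ sketch builds a low-shelf biquad (a second-order IIR filter) from a gain in dB. This is a common textbook formulation; whether it matches the library's IIR_calculateCoefficients is not confirmed here. The names Biquad, makeLowShelf and steadyStateOutput, as well as the cutoff frequency used in the usage note, are assumptions for this sketch.

```cpp
#include <cmath>

// Second-order IIR section (transposed direct form II)
struct Biquad {
    double a0, a1, a2, b1, b2;  // feed-forward (a) and feedback (b) coefficients
    double z1 = 0.0, z2 = 0.0;  // filter state

    double process(double in) {
        double out = in * a0 + z1;
        z1 = in * a1 + z2 - b1 * out;
        z2 = in * a2 - b2 * out;
        return out;
    }
};

// Low-shelf filter: boosts or cuts frequencies below cutoffHz by gainDb
Biquad makeLowShelf(double cutoffHz, double sampleRateHz, double gainDb) {
    const double pi = 3.14159265358979323846;
    const double V = std::pow(10.0, std::fabs(gainDb) / 20.0);  // linear gain
    const double K = std::tan(pi * cutoffHz / sampleRateHz);
    const double s2 = std::sqrt(2.0);
    const double s2V = std::sqrt(2.0 * V);
    Biquad f;
    if (gainDb >= 0.0) {  // boost
        const double norm = 1.0 / (1.0 + s2 * K + K * K);
        f.a0 = (1.0 + s2V * K + V * K * K) * norm;
        f.a1 = 2.0 * (V * K * K - 1.0) * norm;
        f.a2 = (1.0 - s2V * K + V * K * K) * norm;
        f.b1 = 2.0 * (K * K - 1.0) * norm;
        f.b2 = (1.0 - s2 * K + K * K) * norm;
    } else {  // cut
        const double norm = 1.0 / (1.0 + s2V * K + V * K * K);
        f.a0 = (1.0 + s2 * K + K * K) * norm;
        f.a1 = 2.0 * (K * K - 1.0) * norm;
        f.a2 = (1.0 - s2 * K + K * K) * norm;
        f.b1 = 2.0 * (V * K * K - 1.0) * norm;
        f.b2 = (1.0 - s2V * K + V * K * K) * norm;
    }
    return f;
}

// Feeds a constant input and returns the last output (approximates the DC gain)
double steadyStateOutput(Biquad f, double input, int samples) {
    double out = 0.0;
    for (int i = 0; i < samples; i++) out = f.process(input);
    return out;
}
```

For example, makeLowShelf(500.0, 44100.0, 6.0) amplifies a constant (0 Hz) input by a factor of about 2, which corresponds to +6 dB, while very high frequencies pass unchanged.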
setTone(0, 0, 0) is the default setting. If you want to go deeper into the rabbit hole, take a look at the routine IIR_calculateCoefficients(int8_t G0, int8_t G1, int8_t G2). The limit frequencies are specified there. The filter formulas I used can be found here: https://www.earlevel.com/main/2012/11/26/biquad-c-source-code/ The filter effect can be evaluated graphically here:
audio_info: authentification failed, wrong credentials?
The user name and password can be passed when connecting to the host:
connecttohost("http://xxxx", "name", "password");
The event functions are linked weakly. This means you can implement them in your sketch, but you do not have to.
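The general mechanism behind optional events can be sketched in plain C++ with the GCC/Clang weak attribute. This is an illustration of the technique, not a copy of the library's code: the helpers emitInfo and emitBitrate and the variable lastInfo are hypothetical, and only the two callback names are taken from this page. A weakly declared function that is never defined has a null address, so the caller can test for it before calling.

```cpp
#include <string>

// Weak declarations (GCC/Clang extension): if no definition is linked in,
// the address of the function is null instead of causing a link error.
extern void audio_info(const char* info) __attribute__((weak));
extern void audio_bitrate(const char* info) __attribute__((weak));

static std::string lastInfo;  // hypothetical, only here to make the effect visible

// "Sketch side": only audio_info is implemented, audio_bitrate is left out
void audio_info(const char* info) { lastInfo = info; }

// "Library side": an event is only called if the sketch implemented it
bool emitInfo(const char* msg) {
    if (!audio_info) return false;
    audio_info(msg);
    return true;
}

bool emitBitrate(const char* msg) {
    if (!audio_bitrate) return false;
    audio_bitrate(msg);
    return true;
}
```

Here emitInfo() reaches the implemented callback, while emitBitrate() silently does nothing because audio_bitrate was never defined.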
audio_info: outputs the current status, suitable for debugging and troubleshooting.
void audio_info(const char *info)
audio_id3data: many MP3 files contain information about artists, albums or bands. The data is read from the file and can be processed further via this event.
void audio_id3data(const char *info)
audio_eof_mp3: called after the end of an audio file. info contains the file name. This event makes it possible to build playlists.
void audio_eof_mp3(const char *info)
Playlist example:
void audio_eof_mp3(const char *info){ //end of file
Serial.print("audio_info: "); Serial.println(info);
static int i=0;
if(i==0) audio.connecttoSD("/wave_test/If_I_Had_a_Chicken.mp3");
if(i==1) audio.connecttoSD("/wave_test/test_8bit_stereo.wav");
if(i==2) audio.connecttoSD("/wave_test/test_16bit_mono.wav");
i++;
if(i==3) i=0;
}
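The same rotation can be written with an array and a wrapping index, which scales better once the list grows. This is a plain C++ sketch; kPlaylist, kPlaylistLen and nextTrack are hypothetical names, and the file names are simply those from the example above.

```cpp
#include <cstddef>

// Track list (file names taken from the example above)
static const char* const kPlaylist[] = {
    "/wave_test/If_I_Had_a_Chicken.mp3",
    "/wave_test/test_8bit_stereo.wav",
    "/wave_test/test_16bit_mono.wav",
};
static const std::size_t kPlaylistLen = sizeof(kPlaylist) / sizeof(kPlaylist[0]);

// Returns the next track and advances the index, wrapping around at the end
const char* nextTrack() {
    static std::size_t index = 0;
    const char* track = kPlaylist[index];
    index = (index + 1) % kPlaylistLen;
    return track;
}
```

Inside audio_eof_mp3 the call would then presumably be reduced to audio.connecttoSD(nextTrack());.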
audio_eof_stream: the same as audio_eof_mp3, but for podcasts or files transferred from a server.
void audio_eof_stream(const char *info)
audio_showstation: many radio stations provide their name at the beginning of the connection. It can be used in your own applications.
void audio_showstation(const char *info)
audio_showstreamtitle: called if the radio station transmits information about the artist, music track ... in its metadata.
void audio_showstreamtitle(const char *info)
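Many stations send the title in the form "Artist - Title", as in the serial output at the end of this page. For a display it can be handy to split the two parts. The following plain C++ sketch assumes that info holds only the title text and that " - " is the separator; both are assumptions, since the format depends on the station. The function name splitStreamTitle is hypothetical.

```cpp
#include <string>
#include <utility>

// Hypothetical helper: splits "Artist - Title" at the first " - ".
// If there is no separator, the artist is left empty.
std::pair<std::string, std::string> splitStreamTitle(const std::string& info) {
    const std::string sep = " - ";
    const std::size_t pos = info.find(sep);
    if (pos == std::string::npos) return {"", info};
    return {info.substr(0, pos), info.substr(pos + sep.size())};
}
```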
audio_bitrate: returns the current bit rate as text.
void audio_bitrate(const char *info)
audio_commercial: commercials are often played at the beginning of the broadcast (and during the program). info contains the expected duration of the advertisement, so the sound can be muted for that time.
void audio_commercial(const char *info)
audio_icyurl: called if the station has a homepage; its URL is passed in info.
void audio_icyurl(const char *info)
audio_lasthost: the URL passed to connecttohost is not necessarily the URL that is finally used. Sometimes the request is redirected to another URL, which can be read out here.
void audio_lasthost(const char *info)
audio_id3image: MP3 files can contain pictures. A reference to the current MP3 file, the position of the beginning of the picture and its size are passed. In the example, the cover image is extracted and written to SD.
void audio_id3image(File& file, const size_t pos, const size_t size) { // cover image
    Serial.printf("id3image found at pos: %u, length: %u\n", pos, size);
    uint8_t buf[1024];
    file.seek(pos + 1); // skip 1 byte encoding
    char mimeType[255]; // mime-type (null terminated)
    for (uint8_t i = 0u; i < 255; i++) {
        mimeType[i] = file.read();
        if (uint8_t(mimeType[i]) == 0) break;
    }
    mimeType[254] = 0; // keep the string terminated even if no null byte was found
    Serial.printf("MimeType: %s\n", mimeType);
    uint8_t imageType = file.read(); // image type (1 Byte)
    Serial.printf("ImageType: %d\n", imageType);
    for (uint8_t i = 0u; i < 255; i++) { // description (null terminated)
        buf[i] = file.read();
        if (uint8_t(buf[i]) == 0) break;
    }
    // raw image data
    File coverFile = SD.open("/cover.jpg", FILE_WRITE);
    size_t len = size;
    while (len) {
        size_t chunk = (len < sizeof(buf)) ? len : sizeof(buf); // never read more than is left
        size_t bytesRead = file.read(buf, chunk);
        if (bytesRead == 0) break; // read error or unexpected end of file
        coverFile.write(buf, bytesRead);
        len -= bytesRead;
    }
    Serial.print("Cover file written\n");
    coverFile.close();
}
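The byte layout that the example walks through (encoding byte, MIME type, picture type, description, image data) is that of an ID3v2 picture frame body. The following plain C++ sketch parses the same layout from a memory buffer, which makes it easy to try out on a PC. The struct ApicHeader and the function parseApicHeader are hypothetical names. Only the single-byte text encodings (0 and 3) are handled; that restriction is a simplification of this sketch.

```cpp
#include <cstddef>
#include <cstdint>
#include <string>

// Hypothetical result type for this sketch
struct ApicHeader {
    uint8_t     encoding;     // text encoding of the description
    std::string mimeType;     // e.g. "image/jpeg"
    uint8_t     pictureType;  // 3 = front cover
    std::size_t imageOffset;  // offset of the raw image data within the frame body
    bool        valid;        // false if the header could not be parsed
};

ApicHeader parseApicHeader(const uint8_t* body, std::size_t size) {
    ApicHeader h{0, "", 0, 0, false};
    if (size < 4) return h;
    std::size_t pos = 0;
    h.encoding = body[pos++];
    // MIME type: null-terminated string
    while (pos < size && body[pos] != 0) h.mimeType += static_cast<char>(body[pos++]);
    if (pos >= size) return h;  // terminator missing
    pos++;                      // skip the terminator
    if (pos >= size) return h;
    h.pictureType = body[pos++];
    // Description: one null byte terminates encodings 0 (ISO-8859-1) and 3 (UTF-8).
    // The UTF-16 encodings (1 and 2) use a two-byte terminator and are not handled.
    if (h.encoding != 0 && h.encoding != 3) return h;
    while (pos < size && body[pos] != 0) pos++;
    if (pos >= size) return h;
    h.imageOffset = pos + 1;    // image data starts right after the description
    h.valid = true;
    return h;
}
```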
audio_oggimage: OGG files can contain embedded images in the comment header. If an image is larger than an OggS frame, it is fragmented and continued in further OggS frames. The image data is always Base64 encoded.
void audio_oggimage(File& audiofile, std::vector<uint32_t> vec){ //OGG blockpicture
log_i("oggimage:.. " ANSI_ESC_GREEN "---------------------------------------------------------------------------");
log_i("oggimage:.. " ANSI_ESC_GREEN "ogg metadata blockpicture found:");
for(int i = 0; i < vec.size(); i += 2) {
log_i("oggimage:.. " ANSI_ESC_GREEN "segment %02i, pos %07i, len %05i", i / 2, vec[i], vec[i + 1]);
}
log_i("oggimage:.. " ANSI_ESC_GREEN "---------------------------------------------------------------------------");
}
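Since the image data is Base64 encoded, it has to be decoded before it can be written to a file or shown on a display. A minimal decoder in plain C++ could look like the sketch below. The function name base64Decode is hypothetical and this is not the library's code. For a fragmented image, the text segments would presumably have to be read and concatenated in order before decoding.

```cpp
#include <cstdint>
#include <string>
#include <vector>

// Hypothetical helper: decodes standard Base64 text into raw bytes
std::vector<uint8_t> base64Decode(const std::string& in) {
    std::vector<uint8_t> out;
    uint32_t acc = 0;  // bit accumulator
    int bits = 0;      // number of valid bits in acc
    for (unsigned char c : in) {
        int v;
        if (c >= 'A' && c <= 'Z') v = c - 'A';
        else if (c >= 'a' && c <= 'z') v = c - 'a' + 26;
        else if (c >= '0' && c <= '9') v = c - '0' + 52;
        else if (c == '+') v = 62;
        else if (c == '/') v = 63;
        else if (c == '=') break;  // padding marks the end of the data
        else continue;             // skip line breaks and other characters
        acc = (acc << 6) | static_cast<uint32_t>(v);
        bits += 6;
        if (bits >= 8) {           // a complete byte is available
            bits -= 8;
            out.push_back(static_cast<uint8_t>((acc >> bits) & 0xFF));
        }
    }
    return out;
}
```

A JPEG cover, for example, starts with the Base64 text "/9j/", which decodes to the bytes FF D8 FF.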
audio_process_i2s: gives access to the decoded audio samples so that the signal can be tapped and passed on to external devices. If continueI2S is true, the signal is then written to the I2S DMA. You can also manipulate the signal. The example shows how a sine tone is mixed into an audio stream.
void audio_process_i2s(int16_t* outBuff, uint16_t validSamples, uint8_t bitsPerSample, uint8_t channels, bool *continueI2S){
    // one period of a sine wave (40 entries, about 1.1 kHz at 44.1 kHz sample rate)
    static const int16_t sineWaveTable[40] = {
        0, 3743, 7377, 10793, 14082, 17136, 19848, 22113, 23825, 24908,
        25311, 24908, 23825, 22113, 19848, 17136, 14082, 10793, 7377, 3743,
        0, -3743, -7377, -10793, -14082, -17136, -19848, -22113, -23825, -24908,
        -25311, -24908, -23825, -22113, -19848, -17136, -14082, -10793, -7377, -3743
    };
    static uint8_t tabPtr = 0;
    // assume 2 channels, 16 bit, interleaved (left, right)
    for(int i = 0; i < validSamples; i++){
        int16_t tone = sineWaveTable[tabPtr] / 50;
        int32_t left  = outBuff[i * 2] + tone;     // channel left
        int32_t right = outBuff[i * 2 + 1] + tone; // channel right
        outBuff[i * 2]     = constrain(left, -32768, 32767);  // avoid overflow
        outBuff[i * 2 + 1] = constrain(right, -32768, 32767);
        tabPtr++;
        if(tabPtr == 40) tabPtr = 0; // wrap at the end of the table
    }
    *continueI2S = true;
}
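If a different tone frequency or sample rate is needed, the table does not have to be typed in by hand. The following plain C++ sketch generates one period of a sine wave and shows a saturating addition for mixing two 16-bit samples. The names makeSineTable and mixSaturated are hypothetical and not part of the library.

```cpp
#include <cmath>
#include <cstddef>
#include <cstdint>
#include <vector>

// Builds one period of a sine wave for toneHz at the given sample rate
std::vector<int16_t> makeSineTable(double toneHz, double sampleRateHz, int16_t amplitude) {
    const double pi = 3.14159265358979323846;
    long rounded = std::lround(sampleRateHz / toneHz);  // samples per period
    std::size_t n = rounded < 1 ? 1 : static_cast<std::size_t>(rounded);
    std::vector<int16_t> table(n);
    for (std::size_t i = 0; i < n; i++) {
        table[i] = static_cast<int16_t>(std::lround(amplitude * std::sin(2.0 * pi * i / n)));
    }
    return table;
}

// Adds two 16-bit samples and clamps the result instead of letting it wrap around
int16_t mixSaturated(int16_t a, int16_t b) {
    int32_t sum = static_cast<int32_t>(a) + b;
    if (sum > INT16_MAX) return INT16_MAX;
    if (sum < INT16_MIN) return INT16_MIN;
    return static_cast<int16_t>(sum);
}
```

At 44.1 kHz, a 1 kHz tone needs 44 table entries, so the frequency actually produced is 44100 / 44, roughly 1002 Hz.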
There are other useful functions for building MP3 players, for example:
setConnectionTimeout(): in some cases it can make sense to change the timeout for establishing a connection. By default, 250 ms is set for unencrypted connections and 2700 ms for SSL connections.
uint16_t timeout_ms = 300;
uint16_t timeout_ms_ssl = 3000;
audio.setConnectionTimeout(timeout_ms, timeout_ms_ssl);
getAudioFileDuration(): returns the expected length of an audio file in seconds. With a constant bit rate (CBR) the value is exact. With a variable bit rate (VBR) the duration is estimated from the first 100 MP3 frames and can therefore deviate slightly from the actual playback time.
uint32_t getAudioFileDuration()
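The arithmetic behind such a duration value is simple: audio bytes times 8, divided by the bit rate. The following plain C++ sketch shows it; the function name estimateDurationSeconds is hypothetical and the library's internal calculation may differ in detail. For VBR files, the bit rate passed in would be an average, which is why the result is only an estimate.

```cpp
#include <cstdint>

// Hypothetical helper: playing time in seconds for audioBytes of audio data
// at bitRate (in bit/s). Returns 0 if the bit rate is unknown.
uint32_t estimateDurationSeconds(uint32_t audioBytes, uint32_t bitRate) {
    if (bitRate == 0) return 0;
    // 64-bit intermediate so that large files do not overflow
    return static_cast<uint32_t>(static_cast<uint64_t>(audioBytes) * 8 / bitRate);
}
```

A 4,000,000 byte file at 320 kbit/s, for instance, plays for 100 seconds.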
getAudioCurrentTime(): returns the current playing time in seconds.
uint32_t getAudioCurrentTime()
An example program could look like this:
#include <Ticker.h>
Ticker ticker;
void setup() {
...
ticker.attach(1, tcr1s);
...
}
void tcr1s(){
uint32_t act=audio.getAudioCurrentTime();
uint32_t afd=audio.getAudioFileDuration();
uint32_t pos =audio.getFilePos();
log_i("pos =%i", pos);
log_i("audioTime: %i:%02d - duration: %i:%02d", (act/60), (act%60) , (afd/60), (afd%60));
}
The output in the serial monitor
This works with local files (SD, FFat, SD_MMC, SPIFFS) and with web files in wav or mp3 format. The current time for AAC-coded files (m4a) cannot be precisely determined and is therefore estimated using the mean value of the bit rate.
Sometimes you want to play an audio file in a loop.
setFileLoop(): the audio start position (after the audio header) is determined internally. At the end of the file, playback jumps back to that position.
setFileLoop(true);
In some projects there is only one audio amplifier or speaker. Then it makes sense to convert the stereo signal into a mono signal. With forceMono(true); the mean value is calculated from the signal of both channels and placed on the left and right channel.
forceMono(true); // change stereo to mono
forceMono(false); // default, stereo will be played
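What "the mean value is placed on the left and right channel" means for a buffer of interleaved 16-bit samples can be shown in a few lines of plain C++. The function downmixToMono is a hypothetical name for this sketch and not the library's implementation.

```cpp
#include <cstddef>
#include <cstdint>

// Hypothetical helper: replaces each stereo frame (left, right) in place
// with the mean of both channels on both channels
void downmixToMono(int16_t* samples, std::size_t frames) {
    for (std::size_t i = 0; i < frames; i++) {
        // 32-bit intermediate so that the sum cannot overflow
        int32_t mean = (static_cast<int32_t>(samples[2 * i]) + samples[2 * i + 1]) / 2;
        samples[2 * i] = static_cast<int16_t>(mean);      // left
        samples[2 * i + 1] = static_cast<int16_t>(mean);  // right
    }
}
```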
setAudioTaskCore(uint8_t coreID): the audio task takes the data from the buffer, decodes it and feeds the I2S. audio.loop(), on the other hand, fills the buffer, takes care of the entire control, processes all non-audio data such as the metadata, and generates the events. For good performance, the audio task should not run on the same core as the Arduino loop task. By default, the audio task runs on core 0, but this can be changed here.
Here is a simple program example. You need an ESP32 developer board and an external DAC (e.g. PCM5102A).
#include "Arduino.h"
#include "WiFi.h"
#include "Audio.h"
#define I2S_DOUT 26 // connect to DAC pin DIN
#define I2S_BCLK 27 // connect to DAC pin BCK
#define I2S_LRC 25 // connect to DAC pin LCK
Audio audio;
const char* ssid = "SSID";
const char* password = "password";
void setup() {
Serial.begin(115200);
WiFi.begin(ssid, password);
while (WiFi.status() != WL_CONNECTED) delay(1500);
audio.setPinout(I2S_BCLK, I2S_LRC, I2S_DOUT);
audio.connecttohost("http://s1.knixx.fm/dein_webradio_64.aac"); // 64 kbit/s aac+
}
void loop() {
audio.loop();
}
void audio_info(const char *info){
Serial.print("info "); Serial.println(info);
}
The output in the serial monitor:
rst:0x1 (POWERON_RESET),boot:0x13 (SPI_FAST_FLASH_BOOT)
configsip: 0, SPIWP:0xee
clk_drv:0x00,q_drv:0x00,d_drv:0x00,cs0_drv:0x00,hd_drv:0x00,wp_drv:0x00
mode:DIO, clock div:1
load:0x3fff0018,len:4
load:0x3fff001c,len:1216
ho 0 tail 12 room 4
load:0x40078000,len:10944
load:0x40080400,len:6388
entry 0x400806b4
info PSRAM not found, inputBufferSize = 6399 bytes
info buffers freed, free Heap: 228148 bytes
info Connect to new host: "http://s1.knixx.fm/dein_webradio_64.aac"
info Connect to "s1.knixx.fm" on port 80, extension "/dein_webradio_64.aac"
info Connected to server
info Server: nginx/1.14.2
info audio/aac seen.
info format is aac
info AACDecoder has been initialized, free Heap: 199916 bytes
info chunked data transfer
info Connection: close
info ice-audio-info: channels=2;samplerate=44100;bitrate=64
info icy-description: Wir spielen Musik von den 60ern bis Heute! Und immer um halb aktuelle Country-Music.
info icy-genre: variety,pop,oldies,country
info icy-name: knixx.fm - Dein Webradio. / 64 kbp/s aac+
info icy-pub: 1
info icy-url: https://knixx.fm
info Cache-Control: no-cache, no-store
info Access-Control-Allow-Origin: *
info Access-Control-Allow-Headers: Origin, Accept, X-Requested-With, Content-Type
info Access-Control-Allow-Methods: GET, OPTIONS, HEAD
info Expires: Mon, 26 Jul 1997 05:00:00 GMT
info X-Frame-Options: SAMEORIGIN
info X-Content-Type-Options: nosniff
info Switch to DATA, bitrate is 64000, metaint is 4096
info inputbuffer is being filled
info StreamTitle="Michael Bolton - Soul Provider -- 1989"
info stream ready
info buffer filled in 7 ms
info syncword found at pos 0
info AAC Channels=1
info AAC SampleRate=22050
info AAC BitsPerSample=16
info AAC Bitrate=64000
info StreamTitle="Symbol - The Most Beautiful Girl In The World -- 1994"
building it on a breadboard:
the schematic:
There are displays for the Raspberry Pi with a resolution of 480x320 pixels and an SPI bus. These are particularly suitable; see the radio folder.