Build Essentia in release #295

Open
wants to merge 3 commits into
base: master


kmod-midori
Contributor

Related to #221.

This builds a working essentia_streaming_extractor_music binary in /usr/bin that can be used to extract music features. It is not a static build; it links against the ffmpeg libraries that are built in the same process.

Along with other dependencies, this change makes the final image much larger at 50 MiB on my local system.

libsamplerate is built manually since Alpine does not package it, but it is present in Debian and Arch. Apart from this, you should be able to copy my changes to the other Dockerfiles and call it a day (this will make CI slower, so proceed with caution).

Essentia can become problematic when Debian moves to ffmpeg 5.x (the current stable is fine, but testing and sid are already on 5.x).

BTW, why aren't you doing make -j$(nproc) when building these libraries?

@epoupon
Owner

epoupon commented Dec 31, 2022

Thanks for building essentia! That's something I definitely wanted to try, especially to get an idea of how big it is.
50 MiB is really huge. I really want to keep LMS lightweight, and the ffmpeg5 issue is really annoying too.
I am reading up on the recommendation/similarity topics, and since I want to get rid of ffmpeg binary calls and directly decode/encode using ffmpeg's libs, it may be not that hard to analyse the songs and extract ourselves the few low level features we want.

@epoupon
Owner

epoupon commented Dec 31, 2022

> BTW, why aren't you doing make -j$(nproc) when building these libraries?

Yes indeed, we can restore this.
Edit: ah, actually I just remembered:
Dockerfile-build- files do use make -j$(nproc) because they are run via GitHub Actions. But Dockerfile-release is run locally using dockerx, and there are enough archs to fill all the available cores.

@kmod-midori
Contributor Author

I have removed some large algorithms that are not really used by the extractor we are using.

Found the problem: building the image before applying the changes in this PR produces a 34.8 MB image on my machine, much larger than what you have on Docker Hub.

Currently the image on my machine after these changes is 44.2 MB, so a 10 MB gain for this feature.

@kmod-midori
Contributor Author

> it may be not that hard to analyse the songs and extract ourselves the few low level features we want.

If memory serves me right, we are using almost all the low-level features?

@epoupon
Owner

epoupon commented Dec 31, 2022

> > it may be not that hard to analyse the songs and extract ourselves the few low level features we want.
>
> If memory serves me right, we are using almost all the low-level features?

Not exactly. We get them all from AB, but only some of them are useful for clustering based on similarities. I tried a genetic algo to select the best features, but I lacked the time to get a proper training set and to optimize the SOM training part. This is still on my todo list.
The current results I got are here:

{ "lowlevel.spectral_energyband_high.mean", {1}},

(Only five low-level entries are used, but we should have more to keep experimenting on these. I found a thesis with valuable info on the useful features; I will try to get a link.)
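
As a rough illustration of how a short list of selected descriptors could be turned into a vector for clustering, here is a minimal Python sketch. It assumes the extractor output is a nested mapping addressed by dotted paths such as `lowlevel.spectral_energyband_high.mean`, and it treats the number next to each name as a per-feature weight; that reading, the sample values, and the helper names `lookup` and `feature_vector` are all assumptions, not the project's actual code.

```python
from typing import Any, List, Mapping


def lookup(doc: Mapping[str, Any], dotted: str) -> Any:
    """Follow a dotted path such as 'lowlevel.x.mean' through nested mappings."""
    node: Any = doc
    for key in dotted.split("."):
        node = node[key]
    return node


def feature_vector(doc: Mapping[str, Any], selected: Mapping[str, float]) -> List[float]:
    """Build a flat vector from the selected descriptors, scaled by their weights."""
    vec: List[float] = []
    for name, weight in selected.items():
        value = lookup(doc, name)
        # A descriptor may be a scalar or a list of values (for example per band).
        values = value if isinstance(value, (list, tuple)) else [value]
        vec.extend(weight * float(v) for v in values)
    return vec


# Toy data: the numeric value is made up for illustration.
sample = {"lowlevel": {"spectral_energyband_high": {"mean": 0.25}}}
selected = {"lowlevel.spectral_energyband_high.mean": 1.0}
vector = feature_vector(sample, selected)
```

Keeping the selection in a plain mapping makes it easy to experiment with more entries without touching the extraction logic.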

@epoupon
Owner

epoupon commented Dec 31, 2022

And btw, images are compressed on Docker Hub. If you fetch the image, you can see its real size locally.

@kmod-midori
Contributor Author

If we can decide which features to use, it is possible to create a minimized version of essentia that only contains required algorithms. Some of the algorithms are really large (looking at object sizes).

However, I do not have a large library available, so I can not really help.

@Danoloan10
Contributor

Danoloan10 commented Jan 6, 2023

Benchmarking by running the extractor for a FLAC file, these were the algos that were used:

MusicExtractor,MetadataReader,AudioLoader,StereoDemuxer,StereoMuxer,Resample,StereoTrimmer,LoudnessEBUR128,FrameCutter,NoiseAdder,LoudnessEBUR128Filter,IIR,UnaryOperatorStream,BinaryOperatorStream,Mean,EqloudLoader,MonoLoader,MonoMixer,Trimmer,Scale,EqualLoudness,ReplayGain,InstantPower,EasyLoader,Windowing,Spectrum,FFT,Magnitude,SilenceRate,ZeroCrossingRate,MFCC,MelBands,TriangularBands,DCT,CentralMoments,DistributionShape,FlatnessDB,Flatness,GeometricMean,Crest,GFCC,ERBBands,BarkBands,FrequencyBands,UnaryOperator,Decrease,RollOff,Energy,RMS,EnergyBand,HFC,Flux,StrongPeak,SpectralComplexity,SpectralPeaks,PeakDetection,PitchSalience,AutoCorrelation,Centroid,Dissonance,Entropy,SpectralContrast,Loudness,DynamicComplexity,RhythmExtractor2013,BeatTrackerMultiFeature,CartesianToPolar,OnsetDetection,TempoTapDegara,MovingAverage,OnsetDetectionGlobal,TempoTapMaxAgreement,BeatTrackerDegara,BpmHistogramDescriptors,OnsetRate,Onsets,Danceability,TuningFrequency,BeatsLoudness,Slicer,SingleBeatLoudness,EnergyBandRatio,HPCP,Key,ChordsDetection,ChordsDescriptors,HighResolutionFeatures,PoolAggregator,SingleGaussian,YamlOutput

Not sure whether different formats make use of different algos.

The essentia library with these is 4.5 MiB on x86_64. I'm guessing that's the bare minimum needed to use the MusicExtractor.

@Danoloan10
Contributor

It worked for MP3, WAV and Opus files too. Ogg seems to be unsupported.

@kmod-midori
Contributor Author

kmod-midori commented Jan 14, 2023

Do we have that thesis available? I wonder if simple kNN would just work as well...
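
For reference, a brute-force k-nearest-neighbours lookup over per-track feature vectors might look like the sketch below. This is not what the project does today (the thread mentions SOM-based clustering); the track names, vectors, and the `knn` helper are hypothetical, and in practice the features would likely need to be normalised first so that no single descriptor dominates the distance.

```python
import math
from typing import Dict, List, Sequence


def knn(query: Sequence[float], tracks: Dict[str, Sequence[float]], k: int) -> List[str]:
    """Return the names of the k tracks closest to `query` (Euclidean distance)."""
    ranked = sorted(tracks.items(), key=lambda item: math.dist(query, item[1]))
    return [name for name, _ in ranked[:k]]


# Toy feature vectors, made up for illustration.
tracks = {
    "track_a": [0.0, 0.0],
    "track_b": [1.0, 1.0],
    "track_c": [0.1, 0.0],
}
nearest = knn([0.0, 0.0], tracks, k=2)
```

A linear scan like this costs one distance computation per track for every query, which may be acceptable for small libraries but would need an index for large ones.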

> Ogg seems to be unsupported.

Essentia directly uses libav for decoding, so ogg should be supported as long as libav/ffmpeg has support for that. No idea why that does not work in your environment though.

@Danoloan10
Contributor

Danoloan10 commented Jan 14, 2023

The extractor complains about the algo that it can't find, so I just made a simple bash loop that ran the extractor on a FLAC file, captured the missing algo, and recompiled essentia with it added to the list. Essentia is an impressively good piece of code: each recompilation only builds two new files, the new algo and the algo index, so the loop doesn't take that long to finish.

I can look for the loop and share the FLAC file if you'd like
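
The original bash loop is not shown in the thread. The idea it describes can be sketched in Python as below, with the rebuild-and-run step injected as a callable so the logic can be tried without Essentia installed. The error-message pattern, the `find_required_algos` name, and the `fake_run` stand-in are all hypothetical; the real extractor's message format and the build command would need to be substituted.

```python
import re
from typing import Callable, List, Optional

# Hypothetical error format; the real extractor's wording may differ.
MISSING_ALGO = re.compile(r"algorithm '(\w+)' not found")


def find_required_algos(
    run_extractor: Callable[[List[str]], Optional[str]],
    max_rounds: int = 500,
) -> List[str]:
    """Grow the list of enabled algorithms until the extractor runs cleanly.

    `run_extractor(enabled)` is expected to rebuild with `enabled`, run the
    extractor on a sample file, and return the error text, or None on success.
    """
    enabled: List[str] = []
    for _ in range(max_rounds):
        error = run_extractor(enabled)
        if error is None:
            return enabled
        match = MISSING_ALGO.search(error)
        if match is None:
            raise RuntimeError(f"unrecognised failure: {error}")
        name = match.group(1)
        if name in enabled:
            raise RuntimeError(f"{name} is enabled but still reported missing")
        enabled.append(name)
    raise RuntimeError("gave up before the extractor ran cleanly")


def fake_run(enabled: List[str]) -> Optional[str]:
    """Stand-in for the real build-and-run step: reports the first missing algo."""
    for algo in ["MusicExtractor", "AudioLoader", "FFT"]:
        if algo not in enabled:
            return f"algorithm '{algo}' not found"
    return None


required = find_required_algos(fake_run)
```

Because each round adds exactly one algorithm, the number of rebuilds equals the number of required algorithms plus one final successful run.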
