Consider rnnoise-nu #11

GregorR · 2018-09-01T02:00:09Z

Hi,

I've been fussing with RNNoise and made a slightly-incompatible fork that:

(1) Supports multiple neural network models, several of which I've trained (and I'm still training more),

(2) Supports a simple ASCII file format for future models, and

(3) Parameterizes the maximum attenuation to perform.

I've been scratching my head over whether to learn a whole plugin infrastructure in order to get it working slightly more tastefully, and since you have mixing wet and dry as a todo item (with the goal presumably the same as maximum attenuation, but frankly doing it in the library is a bit cleaner), maybe we can scratch each other's itches.

My rnnoise: https://github.com/GregorR/rnnoise-nu

My rnnoise models (informational, not needed to use the library): https://github.com/GregorR/rnnoise-models

In terms of the library, the changes are small. rnnoise_create takes an RNNModel * as an argument, or NULL to use the default (this was what necessitated breaking compatibility). rnnoise_models returns a NULL-terminated list of models (meaningless string names) and rnnoise_get_model gets a model by name. Alternatively, rnnoise_model_from_file can load a model from a file, to be freed by rnnoise_model_free. Finally, rnnoise_set_param lets you set the (solitary) configurable parameter.

(An aside: My knowledge of LV2 is limited to... well, nothing, but I was surprised to find while poking at your library that there doesn't seem to be any mention of number of channels. Does LV2 just send each channel through a different instance of the plugin, or is this plugin single-channel-specific?)

The text was updated successfully, but these errors were encountered:

lucianodato · 2018-09-01T15:35:06Z

That is so awesome! Of course I will prefer to use yours. I guess it's time to code and get the release going. Do you think that is possible to add support for receiving floats instead of shorts? That would be cleaner too.

GregorR · 2018-09-01T17:07:00Z

Since rnnoise does receive floats, I assume you mean floats of the usual range from -1 to 1 instead of floats from -32768 to 32767 :)

The models are trained on floating point values, of course, but for whatever reason the original author opted to train them on values from -32768 to 32767 (i.e., 16-bit signed integer values). It would be possible to retrain new models for standard float range, but I'd shy away from the idea of having a separate set of models for different formats, as that could create some very confusing results. Multiplying input values by 32768 and then dividing the output values again should be a very fast operation, since it's just a change to the exponent, and should be lossless on all unclipped audio data.

GregorR · 2018-09-01T17:24:03Z

Ah yes, just saw

speech-denoiser/src/sdenoise.c

Line 212 in ab14ff4

self->rnnoise_input_frame[k] *= SHRT_MAX;

. Use 32768 (SHRT_MAX+1), not SHRT_MAX, and the operation will be much faster. It's still a bit silly, I agree, but oh well... too many bits and bobs in rnnoise outside of the model itself assume a particular range.

GregorR · 2018-09-01T17:34:38Z

(Different sample rates, on the other hand, should be very doable. The FFT code is generic, and virtually everything relies only on the calculated bands or the FFT, so it's all doable. I'm yet to think of a way to do it without slowing down the code tho. Of course, some of the bands are above the Nyquist frequency of 12kHz audio, but then, if you're denoising 12kHz audio, something has gone dreadfully wrong in your life.)

lucianodato · 2018-09-01T20:58:12Z

Right! Training the model for different types sounds like a lot of work. And I agree on your comment about sample rate. I will get back to you with something ASAP.

GregorR · 2018-09-01T21:22:16Z

It's not the amount of work that concerns me. It'd take a week or so, but oh well, c'est la vie. The problem is that because they would be different models, you'd get different results if you fed in identical data with different datatypes, which is just asking for confusion :)

GregorR · 2018-09-02T00:08:52Z

Parameterizable sample rate is done. Doesn't affect the frame size due to some limitations in the FFT used. rnnoise_set_param(st, RNNOISE_PARAM_SAMPLE_RATE, 44100);. In the end, all that needed to be done was adjusting the band bins properly, as I expected.

In doing so I also eliminated some silly global state, so it is now correct and harmless to do multi-channel (or even multi-track) audio through rnnoise-nu by simply creating a DenoiseState per channel per track. I doubt that it would be faster to make rnnoise-nu itself multi-channel, as it would just have to mix them (or probably their FFTs) to do the analysis and then apply the results independently to each track.

lucianodato · 2018-09-09T22:51:24Z

Well it's done. Thank you very much for your suggestion. Let me know if you find something wrong.

GregorR · 2018-09-10T14:14:46Z

A couple minor suggestions:

speech-denoiser/src/sdenoise.c

Line 41 in 57ed881

#define RNNOISE_PARAM_MAX_ATTENUATION 1

It shouldn't be necessary to define these parameters, as they're defined by the header.

speech-denoiser/src/sdenoise.c

Line 221 in 57ed881

 rnnoise_set_param(self->st, RNNOISE_PARAM_MAX_ATTENUATION, FROM_DB(*(self->reduction))); 

Just a clarification: Is this parameter being set every frame because it can be changed at any time?

speech-denoiser/src/sdenoise.c

Line 113 in 57ed881

rnnoise_init(self->st, NULL);

It's not necessary to both rnnoise_create and rnnoise_init (and in fact leaks memory). rnnoise_init is just so that you can allocate the RNNoise object yourself is you so choose; rnnoise_create inits.

speech-denoiser/src/sdenoise.c

Line 112 in 57ed881

self->st = rnnoise_create(NULL);

It would be nice to have some way of parameterizing the different models, or even loading a model from a file (if LV2 makes that possible/easy). For what it's worth, the different models are well worth using, and work great on suitable input!

lucianodato · 2018-09-11T00:08:07Z

First comment: Ooops!
Second comment: Yes it should be capable of changing parameters for every block.
Third comment: Ok, done.
Fourth comment: Definitively doable. I had to check some plugin that is using files already and mimic its functionality. I can add both options but I will need to explore what you did for that.

GregorR · 2018-09-11T01:01:03Z

FYI, I'm suddenly in communication with the original author of RNNoise, so expect... Idonno, something.

lucianodato · 2018-09-11T01:59:58Z

Nice! Go Gregor!

lucianodato · 2018-09-11T12:54:37Z

are there any chances of a merge? did you guys speak of that possibility?

GregorR · 2018-09-11T13:31:08Z

Yes. Needs some cleanup, but something's moving upstream. What exactly is TBD :)

StephanMeijer · 2020-05-21T19:30:48Z

I'd like to help if I can :)

dvzrv mentioned this issue Oct 22, 2018

tag for release #10

Closed

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Consider rnnoise-nu #11

Consider rnnoise-nu #11

GregorR commented Sep 1, 2018

lucianodato commented Sep 1, 2018

GregorR commented Sep 1, 2018

GregorR commented Sep 1, 2018

GregorR commented Sep 1, 2018

lucianodato commented Sep 1, 2018

GregorR commented Sep 1, 2018

GregorR commented Sep 2, 2018

lucianodato commented Sep 9, 2018

GregorR commented Sep 10, 2018

lucianodato commented Sep 11, 2018

GregorR commented Sep 11, 2018

lucianodato commented Sep 11, 2018

lucianodato commented Sep 11, 2018

GregorR commented Sep 11, 2018

StephanMeijer commented May 21, 2020 •

edited

Consider rnnoise-nu #11

Consider rnnoise-nu #11

Comments

GregorR commented Sep 1, 2018

lucianodato commented Sep 1, 2018

GregorR commented Sep 1, 2018

GregorR commented Sep 1, 2018

GregorR commented Sep 1, 2018

lucianodato commented Sep 1, 2018

GregorR commented Sep 1, 2018

GregorR commented Sep 2, 2018

lucianodato commented Sep 9, 2018

GregorR commented Sep 10, 2018

lucianodato commented Sep 11, 2018

GregorR commented Sep 11, 2018

lucianodato commented Sep 11, 2018

lucianodato commented Sep 11, 2018

GregorR commented Sep 11, 2018

StephanMeijer commented May 21, 2020 • edited

StephanMeijer commented May 21, 2020 •

edited