Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Consider rnnoise-nu #11

Open
GregorR opened this issue Sep 1, 2018 · 15 comments
Open

Consider rnnoise-nu #11

GregorR opened this issue Sep 1, 2018 · 15 comments

Comments

@GregorR
Copy link

GregorR commented Sep 1, 2018

Hi,

I've been fussing with RNNoise and made a slightly-incompatible fork that:

(1) Supports multiple neural network models, several of which I've trained (and I'm still training more),

(2) Supports a simple ASCII file format for future models, and

(3) Parameterizes the maximum attenuation to perform.

I've been scratching my head over whether to learn a whole plugin infrastructure in order to get it working slightly more tastefully, and since you have mixing wet and dry as a todo item (with the goal presumably the same as maximum attenuation, but frankly doing it in the library is a bit cleaner), maybe we can scratch each other's itches.

My rnnoise: https://github.com/GregorR/rnnoise-nu

My rnnoise models (informational, not needed to use the library): https://github.com/GregorR/rnnoise-models

In terms of the library, the changes are small. rnnoise_create takes an RNNModel * as an argument, or NULL to use the default (this was what necessitated breaking compatibility). rnnoise_models returns a NULL-terminated list of models (meaningless string names) and rnnoise_get_model gets a model by name. Alternatively, rnnoise_model_from_file can load a model from a file, to be freed by rnnoise_model_free. Finally, rnnoise_set_param lets you set the (solitary) configurable parameter.

(An aside: My knowledge of LV2 is limited to... well, nothing, but I was surprised to find while poking at your library that there doesn't seem to be any mention of number of channels. Does LV2 just send each channel through a different instance of the plugin, or is this plugin single-channel-specific?)

@lucianodato
Copy link
Owner

That is so awesome! Of course I will prefer to use yours. I guess it's time to code and get the release going. Do you think that is possible to add support for receiving floats instead of shorts? That would be cleaner too.

@GregorR
Copy link
Author

GregorR commented Sep 1, 2018

Since rnnoise does receive floats, I assume you mean floats of the usual range from -1 to 1 instead of floats from -32768 to 32767 :)

The models are trained on floating point values, of course, but for whatever reason the original author opted to train them on values from -32768 to 32767 (i.e., 16-bit signed integer values). It would be possible to retrain new models for standard float range, but I'd shy away from the idea of having a separate set of models for different formats, as that could create some very confusing results. Multiplying input values by 32768 and then dividing the output values again should be a very fast operation, since it's just a change to the exponent, and should be lossless on all unclipped audio data.

@GregorR
Copy link
Author

GregorR commented Sep 1, 2018

Ah yes, just saw

self->rnnoise_input_frame[k] *= SHRT_MAX;
. Use 32768 (SHRT_MAX+1), not SHRT_MAX, and the operation will be much faster. It's still a bit silly, I agree, but oh well... too many bits and bobs in rnnoise outside of the model itself assume a particular range.

@GregorR
Copy link
Author

GregorR commented Sep 1, 2018

(Different sample rates, on the other hand, should be very doable. The FFT code is generic, and virtually everything relies only on the calculated bands or the FFT, so it's all doable. I'm yet to think of a way to do it without slowing down the code tho. Of course, some of the bands are above the Nyquist frequency of 12kHz audio, but then, if you're denoising 12kHz audio, something has gone dreadfully wrong in your life.)

@lucianodato
Copy link
Owner

Right! Training the model for different types sounds like a lot of work. And I agree on your comment about sample rate. I will get back to you with something ASAP.

@GregorR
Copy link
Author

GregorR commented Sep 1, 2018

It's not the amount of work that concerns me. It'd take a week or so, but oh well, c'est la vie. The problem is that because they would be different models, you'd get different results if you fed in identical data with different datatypes, which is just asking for confusion :)

@GregorR
Copy link
Author

GregorR commented Sep 2, 2018

Parameterizable sample rate is done. Doesn't affect the frame size due to some limitations in the FFT used. rnnoise_set_param(st, RNNOISE_PARAM_SAMPLE_RATE, 44100);. In the end, all that needed to be done was adjusting the band bins properly, as I expected.

In doing so I also eliminated some silly global state, so it is now correct and harmless to do multi-channel (or even multi-track) audio through rnnoise-nu by simply creating a DenoiseState per channel per track. I doubt that it would be faster to make rnnoise-nu itself multi-channel, as it would just have to mix them (or probably their FFTs) to do the analysis and then apply the results independently to each track.

@lucianodato
Copy link
Owner

Well it's done. Thank you very much for your suggestion. Let me know if you find something wrong.

@GregorR
Copy link
Author

GregorR commented Sep 10, 2018

A couple minor suggestions:

#define RNNOISE_PARAM_MAX_ATTENUATION 1

It shouldn't be necessary to define these parameters, as they're defined by the header.

rnnoise_set_param(self->st, RNNOISE_PARAM_MAX_ATTENUATION, FROM_DB(*(self->reduction)));

Just a clarification: Is this parameter being set every frame because it can be changed at any time?

rnnoise_init(self->st, NULL);

It's not necessary to both rnnoise_create and rnnoise_init (and in fact leaks memory). rnnoise_init is just so that you can allocate the RNNoise object yourself is you so choose; rnnoise_create inits.

self->st = rnnoise_create(NULL);

It would be nice to have some way of parameterizing the different models, or even loading a model from a file (if LV2 makes that possible/easy). For what it's worth, the different models are well worth using, and work great on suitable input!

@lucianodato
Copy link
Owner

First comment: Ooops!
Second comment: Yes it should be capable of changing parameters for every block.
Third comment: Ok, done.
Fourth comment: Definitively doable. I had to check some plugin that is using files already and mimic its functionality. I can add both options but I will need to explore what you did for that.

@GregorR
Copy link
Author

GregorR commented Sep 11, 2018

FYI, I'm suddenly in communication with the original author of RNNoise, so expect... Idonno, something.

@lucianodato
Copy link
Owner

Nice! Go Gregor!

@lucianodato
Copy link
Owner

are there any chances of a merge? did you guys speak of that possibility?

@GregorR
Copy link
Author

GregorR commented Sep 11, 2018

Yes. Needs some cleanup, but something's moving upstream. What exactly is TBD :)

@dvzrv dvzrv mentioned this issue Oct 22, 2018
@StephanMeijer
Copy link

StephanMeijer commented May 21, 2020

I'd like to help if I can :)

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

3 participants