Number of MLS model parameters and Polish dev loss curve fluctuations #1022

CriDora · 2024-04-01T09:00:33Z

Question

I have read the paper "MLS: A Large-Scale Multilingual Dataset for Speech Research" and have two questions.

First, the paper trains its monolingual models with a 36-layer transformer, and I would like to know the model size. A 1 GB acoustic model is provided in the mls folder, but I want to know how many parameters it has.

Second, when I reproduce the paper's monolingual results for Polish, the dev loss always fluctuates heavily. This did not happen for Portuguese or Italian, and the fluctuation persists after I adjust the learning rate. However, when I pool the train and dev sets, shuffle them, and split them again, the dev loss converges well. How can I track down the problem?
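To make both questions concrete, here is a minimal sketch of the checks I have in mind; it is not taken from the paper or the recipe. It assumes the checkpoint stores float32 weights, so file size divided by 4 gives only an upper bound on the parameter count, since the file may also hold other state. It also assumes the train and dev list files have one utterance per line in the form `<id> <audio_path> <duration_ms> <transcript...>`. All function names are hypothetical.

```python
import statistics
from typing import Dict, Iterable, List, Set, Tuple

Row = Tuple[str, float, List[str]]  # (utterance id, duration in ms, words)


def estimate_param_count(num_bytes: int, bytes_per_param: int = 4) -> int:
    """Upper bound on parameters, ASSUMING the file holds only float32 weights."""
    return num_bytes // bytes_per_param


def parse_list_lines(lines: Iterable[str]) -> List[Row]:
    """Parse lines of the ASSUMED form: <id> <audio_path> <duration_ms> <words...>."""
    rows: List[Row] = []
    for line in lines:
        parts = line.strip().split()
        if len(parts) < 3:
            continue  # skip blank or malformed lines
        rows.append((parts[0], float(parts[2]), parts[3:]))
    return rows


def summarize(rows: List[Row]) -> Dict[str, float]:
    """Basic duration statistics for a non-empty split."""
    durations = [duration for _, duration, _ in rows]
    return {
        "n": len(rows),
        "mean_ms": statistics.mean(durations),
        "median_ms": statistics.median(durations),
        "max_ms": max(durations),
    }


def vocabulary(rows: List[Row]) -> Set[str]:
    """Set of distinct words appearing in the transcripts of a split."""
    return {word for _, _, words in rows for word in words}


def oov_rate(train_vocab: Set[str], dev_rows: List[Row]) -> float:
    """Fraction of dev word tokens that never appear in the train transcripts."""
    dev_words = [word for _, _, words in dev_rows for word in words]
    if not dev_words:
        return 0.0
    unseen = sum(1 for word in dev_words if word not in train_vocab)
    return unseen / len(dev_words)
```

Running `summarize` on the original Polish train and dev lists and again on the re-split lists, together with `oov_rate(vocabulary(train_rows), dev_rows)`, might show whether the original dev set differs from train in utterance length or vocabulary. Under the float32 assumption, `estimate_param_count` puts a 1 GB file at no more than roughly 250 million parameters.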

CriDora added the question label Apr 1, 2024
