
Question about asr2.sh and its options to reproduce the librispeech_100 recipe. #5720

Open
YoshikiMas opened this issue Mar 26, 2024 · 5 comments
Labels: ASR (Automatic speech recognition), Question

Comments

@YoshikiMas
Contributor

Describe the bug
I ran into a few problems while trying to reproduce the asr2 recipe for librispeech_100. I would like to confirm whether my workarounds are appropriate, especially the second one. Once we settle on the best approach, I'll be happy to open a PR.

Stage 5
In these lines, the paths ${_suf}${dset}/${train_set} and ${_suf}${dset}/${train_set}_sp appear. Since ${dset} is left over from a previous for loop, I expect it should be removed.
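The stage 5 symptom comes from a common shell gotcha: a for-loop variable keeps its last value after the loop finishes. A minimal, self-contained illustration (the values of `_suf` and `train_set` here are made up, not the recipe's actual ones):

```shell
#!/bin/sh
# Illustration only: ${dset} retains the last value assigned by the
# earlier loop, so it silently leaks into the later path.
_suf="dump/extracted/"          # hypothetical value
train_set=train_clean_100       # hypothetical value
for dset in dev_clean test_clean; do
    : # per-dset processing happens here in the real script
done
# Buggy pattern from the issue: the stale ${dset} is still in scope.
echo "${_suf}${dset}/${train_set}"  # → dump/extracted/test_clean/train_clean_100
```

Dropping the `${dset}` component, as suggested, removes the dependence on whatever the earlier loop happened to end on.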

Stage 6 and Stage 7
lm_train_text is data/${train_set}/text.${tgt_case}.${tgt_lang} in the default run.sh, which does not work in Stage 6. This is because ${train_set}/text.${tgt_case}.${tgt_lang} now exists only under dump/extracted when speed perturbation is used. I can set lm_train_text to dump/extracted/wavlm_large/layer21/${train_set}/text.${tgt_case}.${tgt_lang}, but do you have any simpler ideas?
A similar problem occurs in Stage 7 with src_bpe_train_text and tgt_bpe_train_text.
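For reference, the workaround described above amounts to a plain variable override. This is only a sketch of the path construction; the wavlm_large and layer21 components come from the issue itself, while the tgt_case/tgt_lang values below are illustrative placeholders:

```shell
#!/bin/sh
# Sketch of the workaround: point lm_train_text at the copy of the
# transcript that actually exists under dump/extracted after speed
# perturbation. tgt_case/tgt_lang values are placeholders.
train_set=train_clean_100
tgt_case=ts
tgt_lang=en
lm_train_text="dump/extracted/wavlm_large/layer21/${train_set}/text.${tgt_case}.${tgt_lang}"
echo "${lm_train_text}"
```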

@YoshikiMas added the Question label on Mar 26, 2024
@sw005320
Contributor

Stage 5 In these lines, there are ${_suf}${dset}/${train_set} and ${_suf}${dset}/${train_set}_sp. Since dset comes from a previous for loop, I expect it should be removed.

This sounds good to me.

Stage 6 and Stage 7 lm_train_text is data/${train_set}/text.${tgt_case}.${tgt_lang} in the default run.sh, which does not work in Stage 6. This is because ${train_set}/text.${tgt_case}.${tgt_lang} is now included in only dump/extracted if you use speed perturbation. I can set lm_train_text to dump/extracted/wavlm_large/layer21/${train_set}/text.${tgt_case}.${tgt_lang}, but do you have any simpler ideas? A similar problem happens in the stage 7 with src_bpe_train_text and tgt_bpe_train_text.

Feeding dump/extracted/wavlm_large/layer21/${train_set}/text.${tgt_case}.${tgt_lang} sounds good to me.
One option would be to get the configurations (e.g., wavlm_large and layer21) from the config file, but it is tricky.
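One way to read that idea: if the recipe already held the upstream model name and layer as variables, the path could be assembled from them rather than hard-coded. A hypothetical sketch (the variable names `feats_upstream` and `feats_layer` are illustrative, not actual asr2.sh options):

```shell
#!/bin/sh
# Hypothetical: derive the extracted-feature directory from variables
# the recipe could already hold, instead of hard-coding the path.
feats_upstream=wavlm_large   # illustrative variable name
feats_layer=21               # illustrative variable name
train_set=train_clean_100
extracted_dir="dump/extracted/${feats_upstream}/layer${feats_layer}/${train_set}"
echo "${extracted_dir}"
```

As noted, the tricky part is reliably extracting those values from the config file rather than duplicating them in run.sh.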

@simpleoier, do you have any idea?

@sw005320 added the ASR (Automatic speech recognition) label on Mar 26, 2024
@simpleoier
Collaborator

Hi @YoshikiMas, thanks for noticing the problem.
Yeah, I agree with the suggested change in stage 5.
For the lm_train_text, we can simply use data/${train_set}/text, the same as what is used in asr1. We do not need to use the pattern of ${src_case} or ${tgt_case}.
I'll prepare a PR for this after I finish the test.
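The fix described here would reduce to something like the following sketch (asr1's actual defaults may differ in detail):

```shell
#!/bin/sh
# Sketch of the simpler default: use the case-independent transcript
# kept under data/, as asr1 does, instead of the dumped
# text.${tgt_case}.${tgt_lang} variant.
train_set=train_clean_100
lm_train_text="data/${train_set}/text"
echo "${lm_train_text}"
```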

@sw005320
Contributor

For the lm_train_text, we can simply use data/${train_set}/text, the same as what is used in asr1. We do not need to use the pattern of ${src_case} or ${tgt_case}. I'll prepare a PR for this after I finish the test.

I see; this is an LM for the natural text, not a speech token LM.

@simpleoier
Collaborator

Yes, exactly!

@YoshikiMas
Contributor Author

We do not need to use the pattern of ${src_case} or ${tgt_case}.

Thank you for pointing that out! I confirmed that data/${train_set}/text is identical to dump/extracted/.../text.${tgt_case}.${tgt_lang}.

Maybe the only remaining issue is src_bpe_train_text. Of course, we can specify it directly, as Shinji suggested for lm_train_text.
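Putting the thread's conclusion together as a hedged sketch: lm_train_text falls back to data/${train_set}/text, while src_bpe_train_text is specified directly against the dumped copy. All concrete values below (src_case, src_lang, and the path components other than wavlm_large/layer21, which appear in the issue) are placeholders:

```shell
#!/bin/sh
# Placeholder values throughout; only the pattern matters.
train_set=train_clean_100
src_case=rm        # placeholder
src_lang=km        # placeholder
lm_train_text="data/${train_set}/text"
src_bpe_train_text="dump/extracted/wavlm_large/layer21/${train_set}/text.${src_case}.${src_lang}"
echo "${lm_train_text}"
echo "${src_bpe_train_text}"
```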


3 participants