Replies: 1 comment 2 replies
-
I solved it (albeit the other way round for now). You can reuse the data probs pickle from eval_beamsearch_ngram.py and skip the transcribe call in align.py entirely. Not much change is needed.
One problem you might face: by default the pickle file covers the whole manifest, while manifest_lines_batch is only a part of it, so the cached probs have to be cut down to the lines of the current batch. If there is interest, I will add a pickle file parameter to align.py and contribute it. Cheers
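A rough sketch of the slicing step, in case it helps. It assumes the pickle holds a plain list with one probs entry per manifest line, in manifest order; that layout is my assumption here, so check it against your own cache file. `load_probs_for_batch` and the `start`/`end` indices are hypothetical names, not part of align.py.

```python
import os
import pickle
import tempfile
from typing import Any, List


def load_probs_for_batch(pickle_path: str, start: int, end: int) -> List[Any]:
    """Return the cached entries for manifest lines [start, end).

    Assumption: the pickle is a list with one entry per manifest line,
    stored in the same order as the manifest.
    """
    with open(pickle_path, "rb") as f:
        all_probs = pickle.load(f)
    if not 0 <= start <= end <= len(all_probs):
        raise IndexError(
            f"batch [{start}, {end}) is outside the {len(all_probs)} cached entries"
        )
    return all_probs[start:end]


# Demo with a stand-in cache of four tiny "utterances" (not real model output).
with tempfile.TemporaryDirectory() as tmp:
    cache_path = os.path.join(tmp, "probs_cache.pkl")
    fake_probs = [[[0.1, 0.9]], [[0.5, 0.5]], [[0.8, 0.2]], [[0.3, 0.7]]]
    with open(cache_path, "wb") as f:
        pickle.dump(fake_probs, f)
    batch = load_probs_for_batch(cache_path, 1, 3)
    print(len(batch))
```

The bounds check is there because a manifest and a cache that drifted apart would otherwise produce a silently shorter batch.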
-
Hello,
the generation of timestamps has been asked about multiple times and solved with external tools, see parlance or MFA.
Now that NeMo contains a Forced Aligner, I would like to combine Seq2Seq Beam Search and the forced aligner. I asked this before, and back then you had to fall back on offline_diar_with_asr_infer.py, avoiding Seq2Seq entirely.
One new solution would be to combine Seq2Seq with the Forced Aligner: run eval_beamsearch_ngram.py first, then use the resulting text as the ground truth for the forced aligner align.py. This computes the probs twice, which seems redundant to me. The creation of probs is by far the most time-intensive part of both tasks.
Looking at the new align.py I would like to do this:
Is this possible? Or are the probs tensors vastly different, e.g. in the size of the token alphabet?
I see how the code would fit together, but I might be missing a theoretical problem.
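To make the alphabet-size question concrete, here is a small sanity check one could run on cached probs before handing them to an aligner. It is only a sketch under assumptions: each cached entry is taken to be a (time steps x classes) matrix, and the aligner is assumed to expect one class per vocabulary token plus one blank. `mismatched_utterances` is a hypothetical helper, not a NeMo function.

```python
from typing import List, Sequence


def mismatched_utterances(
    all_probs: Sequence[Sequence[Sequence[float]]],
    vocab_size: int,
    has_blank: bool = True,
) -> List[int]:
    """Return indices of utterances whose class dimension does not fit the vocabulary.

    Assumption: each entry is shaped (time steps, classes), and the expected
    number of classes is vocab_size plus one blank when has_blank is True.
    Empty utterances are skipped because they have no class dimension to check.
    """
    expected = vocab_size + (1 if has_blank else 0)
    return [
        i
        for i, probs in enumerate(all_probs)
        if len(probs) > 0 and len(probs[0]) != expected
    ]


# Stand-in data: utterance 0 has 5 classes per frame, utterance 1 only 4.
fake_probs = [
    [[0.0] * 5 for _ in range(3)],
    [[0.0] * 4 for _ in range(2)],
]
print(mismatched_utterances(fake_probs, vocab_size=4))
```

The function only relies on `len()` and row indexing, so it should work the same on nested lists and on 2D NumPy arrays. A matching class count does not prove the two scripts used the same model and tokenizer, it only rules out the most obvious incompatibility.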
Cheers