
About data augment #38

Open
Catekato opened this issue Oct 27, 2023 · 2 comments

Comments

@Catekato

Hi!
Thanks so much for this work! When I tried to train the model on AudioCaps (didn't change the training script other than file paths), I got this issue:
File "/tango/train.py", line 553, in
main()
File "/tango/train.py", line 459, in main
mixed_mel, _, _, mixed_captions = torch_tools.augment_wav_to_fbank(audios, text, len(audios),
File "/tango/tools/torch_tools.py", line 118, in augment_wav_to_fbank
waveform, captions = augment(paths, texts)
File "/tango/tools/torch_tools.py", line 108, in augment
waveform = torch.tensor(np.concatenate(mixed_sounds, 0))
File "<array_function internals>", line 180, in concatenate
ValueError: need at least one array to concatenate

It would be highly appreciated if you could kindly help me with this problem, thanks!
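
For context, here is a minimal sketch (not the repository code) of how a pair-wise mixing augmentation of the kind suggested by the traceback can end up with an empty list and raise exactly this error when a batch contains only one sample; `mix_pairs` is a hypothetical stand-in for `augment` in `torch_tools.py`:

```python
import numpy as np

def mix_pairs(waveforms, captions):
    # Hypothetical stand-in for torch_tools.augment: mix consecutive pairs
    # of waveforms and join their captions.
    mixed_sounds, mixed_captions = [], []
    for i in range(0, len(waveforms) - 1, 2):
        mixed_sounds.append(0.5 * (waveforms[i] + waveforms[i + 1]))
        mixed_captions.append(captions[i] + " and " + captions[i + 1])
    # With a single waveform the loop body never runs, mixed_sounds stays
    # empty, and np.concatenate raises "need at least one array to concatenate".
    waveform = np.concatenate(mixed_sounds, 0)
    return waveform, mixed_captions

# A one-sample batch, as can happen in the last (incomplete) training batch:
mix_pairs([np.zeros((1, 16000))], ["a dog barks"])  # raises ValueError
```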

@deepanwayx
Collaborator

deepanwayx commented Oct 28, 2023

It seems like the length of texts is 1. The last batch of the training data may have only one instance, in which case this error would come up. Did this happen at the end of the training loop?

I have made a few small changes in train.py to handle this. Let me know if that fixes the issue.

I am assuming you used a per-device train batch size of at least 2; augmentation won't work with a batch size of 1.
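
Illustratively, the guard described above could look like the sketch below. This is not the exact change made in train.py: the `target_length` argument, the `wav_to_fbank` fallback, and the import path are assumptions based on the traceback.

```python
from tools import torch_tools  # repo-local module, assumed from the traceback paths

def batch_to_fbank(audios, text, target_length, augment=True):
    # Sketch of a guard around augmentation; names other than
    # augment_wav_to_fbank are illustrative assumptions.
    if augment and len(audios) >= 2:
        # Pair-wise mixing needs at least two samples in the batch.
        mel, _, _, captions = torch_tools.augment_wav_to_fbank(
            audios, text, len(audios), target_length
        )
    else:
        # Single-sample (or non-augmented) batch: use the waveforms as-is.
        mel, _, _ = torch_tools.wav_to_fbank(audios, target_length)
        captions = text
    return mel, captions
```

With a guard like this, a one-sample final batch falls back to the unmixed path instead of hitting the concatenate error.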

@Catekato
Author


Thanks for your help! The problem occurred before training started. When I tried to run the new train.py, I got the typical "CUDA out of memory" error. Have you ever tried to train without data augmentation? How were the results?
