
Question about distortion file #97

Open
RE-N-Y opened this issue Apr 14, 2020 · 2 comments


RE-N-Y commented Apr 14, 2020

Hi,

I'm trying to fine-tune the PASE+ model on my own dataset, but I'm getting the following error from the training script. I was able to produce the stats file and the .scp files correctly with the provided Python script.

Here's the output from my train.py run:

[!] Using CPU
Seeds initialized to 2
{'regr': [{'num_outputs': 1, 'dropout': 0, 'dropout_time': 0.0, 'hidden_layers': 1, 'name': 'cchunk', 'type': 'decoder', 'hidden_size': 64, 'fmaps': [512, 256, 128], 'strides': [4, 4, 10], 'kwidths': [30, 30, 30], 'loss': <pase.losses.ContextualizedLoss object at 0x2b3d9d6a49d0>},
 {'num_outputs': 3075, 'dropout': 0, 'hidden_size': 256, 'hidden_layers': 1, 'name': 'lps', 'context': 1, 'r': 7, 'loss': <pase.losses.ContextualizedLoss object at 0x2b3d9d58bc10>, 'skip': False},
 {'num_outputs': 3075, 'dropout': 0, 'hidden_size': 256, 'hidden_layers': 1, 'name': 'lps_long', 'context': 1, 'r': 7, 'transform': {'win': 512}, 'loss': <pase.losses.ContextualizedLoss object at 0x2b3d9d6a4a50>, 'skip': False},
 {'num_outputs': 120, 'dropout': 0, 'hidden_size': 256, 'hidden_layers': 1, 'name': 'fbank', 'context': 1, 'r': 7, 'loss': <pase.losses.ContextualizedLoss object at 0x2b3d9d6a4a90>, 'skip': False},
 {'num_outputs': 120, 'dropout': 0, 'hidden_size': 256, 'hidden_layers': 1, 'name': 'fbank_long', 'context': 1, 'r': 7, 'transform': {'win': 1024, 'n_fft': 1024}, 'loss': <pase.losses.ContextualizedLoss object at 0x2b3d9d6a4ad0>, 'skip': False},
 {'num_outputs': 120, 'dropout': 0, 'hidden_size': 256, 'hidden_layers': 1, 'name': 'gtn', 'context': 1, 'r': 7, 'loss': <pase.losses.ContextualizedLoss object at 0x2b3d9d6a4b10>, 'skip': False},
 {'num_outputs': 120, 'dropout': 0, 'hidden_size': 256, 'hidden_layers': 1, 'name': 'gtn_long', 'context': 1, 'r': 7, 'loss': <pase.losses.ContextualizedLoss object at 0x2b3d9d6a4b50>, 'transform': {'win': 2048}, 'skip': False},
 {'num_outputs': 39, 'dropout': 0, 'hidden_size': 256, 'hidden_layers': 1, 'name': 'mfcc', 'context': 1, 'r': 7, 'loss': <pase.losses.ContextualizedLoss object at 0x2b3d9d6a4b90>, 'skip': False},
 {'num_outputs': 60, 'dropout': 0, 'hidden_size': 256, 'hidden_layers': 1, 'name': 'mfcc_long', 'context': 1, 'r': 7, 'transform': {'win': 2048, 'order': 20}, 'loss': <pase.losses.ContextualizedLoss object at 0x2b3d9d6a4bd0>, 'skip': False},
 {'num_outputs': 12, 'dropout': 0, 'hidden_size': 256, 'hidden_layers': 1, 'name': 'prosody', 'context': 1, 'r': 7, 'loss': <pase.losses.ContextualizedLoss object at 0x2b3d9d6a4c10>, 'skip': False}],
 'cls': [{'num_outputs': 1, 'dropout': 0, 'hidden_size': 256, 'hidden_layers': 1, 'name': 'mi', 'loss': <pase.losses.ContextualizedLoss object at 0x2b3d9d6a4cd0>, 'skip': False, 'keys': ['chunk', 'chunk_ctxt', 'chunk_rand']},
 {'num_outputs': 1, 'dropout': 0, 'hidden_size': 256, 'hidden_layers': 1, 'name': 'cmi', 'augment': True, 'loss': <pase.losses.ContextualizedLoss object at 0x2b3d9d6a4d90>, 'skip': False, 'keys': ['chunk', 'chunk_ctxt', 'chunk_rand']}]}
Compose(
 ToTensor()
 MIChunkWav(32000)
 LPS(n_fft=2048, hop=160, win=400, device=cpu)
 LPS(n_fft=2048, hop=160, win=512, device=cpu)
 FBanks(n_fft=512, n_filters=40, hop=160, win=400
 FBanks(n_fft=1024, n_filters=40, hop=160, win=1024
 Gammatone(f_min=500, n_channels=40, hop=160, win=400)
 Gammatone(f_min=500, n_channels=40, hop=160, win=2048)
 MFCC(order=13, sr=16000)
 MFCC(order=20, sr=16000)
 Prosody(hop=160, win=320, f0_min=60, f0_max=300, sr=16000)
 ZNorm(data/PARK_stats.pkl)
)
Preparing dset for <MY DATASET FOLDER>
Found 0 *.npy ir_files in data/omologo_revs_bin
It seems the problem is that there is no directory called omologo_revs_bin inside data/. If so, is it possible to get it?
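For reference, this is roughly the check that produces the "Found 0" line: the trainer globs the configured IR directory for .npy impulse-response files. A minimal sketch (the path and pattern are taken from the log above; this is not PASE's actual loader code):

```python
import glob
import os

# Directory named in the log above; the distortion loader globs it
# for impulse responses stored as .npy files.
ir_dir = "data/omologo_revs_bin"

ir_files = glob.glob(os.path.join(ir_dir, "*.npy"))
print(f"Found {len(ir_files)} *.npy ir_files in {ir_dir}")
```

If the directory is missing or empty, the glob silently returns an empty list, which matches the log output rather than raising an error.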

Thank you in advance!

@pswietojanski (Collaborator)

Hi, apologies for the delay on this. We did not release these data augmentation RIRs; instead, you can use the 16 kHz RIRs available from the OpenSLR page. The results are comparable. See the top-level README, which shows how to use them.
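Since the loader expects .npy files, the downloaded RIR wavs need converting first. Below is a rough sketch of one way to do that, assuming 16-bit mono PCM wavs and numpy; the function name, output layout, and the exact array format PASE expects are assumptions, not part of the PASE API:

```python
import glob
import os
import wave

import numpy as np


def wav_rirs_to_npy(wav_dir, out_dir):
    """Convert 16-bit mono PCM RIR wavs (e.g. the 16 kHz RIRs from
    OpenSLR) into per-file float32 .npy arrays, so that a glob for
    *.npy in out_dir finds them."""
    os.makedirs(out_dir, exist_ok=True)
    for wav_path in glob.glob(os.path.join(wav_dir, "*.wav")):
        with wave.open(wav_path, "rb") as wf:
            assert wf.getsampwidth() == 2, "expected 16-bit PCM"
            pcm = wf.readframes(wf.getnframes())
        # int16 PCM -> float32 in [-1, 1)
        ir = np.frombuffer(pcm, dtype=np.int16).astype(np.float32) / 32768.0
        name = os.path.splitext(os.path.basename(wav_path))[0] + ".npy"
        np.save(os.path.join(out_dir, name), ir)
```

Pointing the distortion config's IR directory at out_dir should then make the "Found 0 *.npy ir_files" count go non-zero; whether PASE wants any further preprocessing of the IRs is worth checking against the README.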


uuwz commented Sep 7, 2023

Hello! I have been replicating this experiment recently, but while preparing the dataset config files, could you tell me where to obtain these files? (--train_scp data/LibriSpeed/libri_tr.scp --test_scp data/LibriSpeed/libri_te.scp --libri_dict data/LibriSpeed/libri_dict.npy) I look forward to your reply. Thank you.
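The PASE repo ships a data-preparation script that generates these (see its README), but for illustration, a minimal sketch of building such files by hand follows. Everything here is an assumption: the helper name, the .scp format (one wav path per line), the train/test split, and the dict being a speaker-to-wavs mapping saved with np.save.

```python
import glob
import os
import random

import numpy as np


def make_librispeech_cfg(root, out_dir, test_frac=0.05, seed=42):
    """Sketch: write libri_tr.scp / libri_te.scp (one wav path per
    line) and a libri_dict.npy mapping speaker id -> wav paths.
    The speaker id is taken from the parent directory name, as in
    LibriSpeech's layout; PASE's exact expected format may differ."""
    wavs = sorted(glob.glob(os.path.join(root, "**", "*.wav"), recursive=True))
    random.Random(seed).shuffle(wavs)
    n_te = max(1, int(len(wavs) * test_frac))
    os.makedirs(out_dir, exist_ok=True)
    with open(os.path.join(out_dir, "libri_tr.scp"), "w") as f:
        f.write("\n".join(wavs[n_te:]) + "\n")
    with open(os.path.join(out_dir, "libri_te.scp"), "w") as f:
        f.write("\n".join(wavs[:n_te]) + "\n")
    # Hypothetical speaker dict, saved as a pickled object array.
    spk2wav = {}
    for w in wavs:
        spk2wav.setdefault(os.path.basename(os.path.dirname(w)), []).append(w)
    np.save(os.path.join(out_dir, "libri_dict.npy"), spk2wav)
```

Loading the dict back requires np.load(..., allow_pickle=True).item(), since np.save wraps a plain dict in a 0-d object array.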
