Issue with saving multiple checkpoint #466

Open
ken2190 opened this issue Apr 15, 2024 · 0 comments

I followed the instructions in this Colab notebook to enable saving multiple checkpoints, but it doesn't seem to work. Has anyone solved this issue?

https://colab.research.google.com/github/rmcpantoja/piper/blob/master/notebooks/piper_multilingual_training_notebook.ipynb#scrollTo=ickQlOCRjkBL
#Interval to save best k models:
#Set to 0 if you want to disable saving multiple models. If this is the case, check the checkbox below.
#If set to 1, models will be saved with the file name epoch=xx-step=xx.ckpt, so you will need to empty Drive's trash every so often.

python -m piper_train \
--dataset-dir "/home/ubuntu/DATA/piper/experiment/single_spk/" \
--accelerator 'gpu' \
--devices 4 \
--batch-size 26 \
--validation-split 0.05 \
--num-test-examples 10 \
--quality "high" \
--checkpoint-epochs 5 \
--log_every_n_steps 50 \
--max_epochs 5000 \
--resume_from_checkpoint "/home/ubuntu/DATA/piper/piper-checkpoints/en/en_US/lessac/high/epoch=2218-step=838782.ckpt" \
--precision 32 \
--gpus='0,1,2,3' \
--strategy=ddp \
--num_ckpt 1
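
The "save best k models" behaviour the notebook comment describes is, in stock PyTorch Lightning, handled by the `ModelCheckpoint` callback's `save_top_k` argument, which the notebook's `--num_ckpt` flag presumably forwards to. As a rough illustration of what keep-top-k retention does (a pure-Python sketch, not piper's actual code; `TopKCheckpoints` is a hypothetical name):

```python
import heapq
import os
import tempfile

class TopKCheckpoints:
    """Keep only the k best checkpoints by a monitored metric (lower is
    better), mimicking Lightning ModelCheckpoint's save_top_k behaviour."""

    def __init__(self, k, dirpath):
        self.k = k
        self.dirpath = dirpath
        self._heap = []  # (-metric, path): heap top is the worst checkpoint

    def update(self, metric, epoch, step):
        # File name pattern matches the notebook's epoch=xx-step=xx.ckpt.
        path = os.path.join(self.dirpath, f"epoch={epoch}-step={step}.ckpt")
        open(path, "w").close()  # stand-in for torch.save(...)
        heapq.heappush(self._heap, (-metric, path))
        while len(self._heap) > self.k:
            _, worst = heapq.heappop(self._heap)  # largest metric loses
            os.remove(worst)

    def kept(self):
        return sorted(path for _, path in self._heap)

with tempfile.TemporaryDirectory() as d:
    ck = TopKCheckpoints(k=2, dirpath=d)
    for metric, epoch, step in [(0.5, 1, 100), (0.3, 2, 200), (0.9, 3, 300)]:
        ck.update(metric, epoch, step)
    print(ck.kept())  # the epoch=3 checkpoint (metric 0.9) was evicted
```

This also explains the notebook's warning about emptying Drive's trash: evicted checkpoints are deleted files, which Drive keeps in the trash.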

Error:

python -m piper_train \
--dataset-dir "/home/ubuntu/DATA/piper/experiment/single_spk/" \
--accelerator 'gpu' \
--devices 4 \
--batch-size 26 \
--validation-split 0.0 \
--num-test-examples 2 \
--quality "high" \
--checkpoint-epochs 5 \
--log_every_n_steps 50 \
--max_epochs 5000 \
--resume_from_single_speaker_checkpoint "/home/ubuntu/DATA/piper/piper-checkpoints/en/en_US/lessac/high/epoch=2218-step=838782.ckpt" \
--precision 32 \
--gpus='0,1,2,3' \
--strategy=ddp \
--num_ckpt 1
usage: __main__.py [-h] --dataset-dir DATASET_DIR [--checkpoint-epochs CHECKPOINT_EPOCHS]
                   [--quality {x-low,medium,high}]
                   [--resume_from_single_speaker_checkpoint RESUME_FROM_SINGLE_SPEAKER_CHECKPOINT]
                   [--logger [LOGGER]] [--enable_checkpointing [ENABLE_CHECKPOINTING]]
                   [--default_root_dir DEFAULT_ROOT_DIR] [--gradient_clip_val GRADIENT_CLIP_VAL]
                   [--gradient_clip_algorithm GRADIENT_CLIP_ALGORITHM] [--num_nodes NUM_NODES]
                   [--num_processes NUM_PROCESSES] [--devices DEVICES] [--gpus GPUS]
                   [--auto_select_gpus [AUTO_SELECT_GPUS]] [--tpu_cores TPU_CORES] [--ipus IPUS]
                   [--enable_progress_bar [ENABLE_PROGRESS_BAR]] [--overfit_batches OVERFIT_BATCHES]
                   [--track_grad_norm TRACK_GRAD_NORM] [--check_val_every_n_epoch CHECK_VAL_EVERY_N_EPOCH]
                   [--fast_dev_run [FAST_DEV_RUN]] [--accumulate_grad_batches ACCUMULATE_GRAD_BATCHES]
                   [--max_epochs MAX_EPOCHS] [--min_epochs MIN_EPOCHS] [--max_steps MAX_STEPS]
                   [--min_steps MIN_STEPS] [--max_time MAX_TIME]
                   [--limit_train_batches LIMIT_TRAIN_BATCHES] [--limit_val_batches LIMIT_VAL_BATCHES]
                   [--limit_test_batches LIMIT_TEST_BATCHES] [--limit_predict_batches LIMIT_PREDICT_BATCHES]
                   [--val_check_interval VAL_CHECK_INTERVAL] [--log_every_n_steps LOG_EVERY_N_STEPS]
                   [--accelerator ACCELERATOR] [--strategy STRATEGY] [--sync_batchnorm [SYNC_BATCHNORM]]
                   [--precision PRECISION] [--enable_model_summary [ENABLE_MODEL_SUMMARY]]
                   [--weights_save_path WEIGHTS_SAVE_PATH] [--num_sanity_val_steps NUM_SANITY_VAL_STEPS]
                   [--resume_from_checkpoint RESUME_FROM_CHECKPOINT] [--profiler PROFILER]
                   [--benchmark [BENCHMARK]] [--deterministic [DETERMINISTIC]]
                   [--reload_dataloaders_every_n_epochs RELOAD_DATALOADERS_EVERY_N_EPOCHS]
                   [--auto_lr_find [AUTO_LR_FIND]] [--replace_sampler_ddp [REPLACE_SAMPLER_DDP]]
                   [--detect_anomaly [DETECT_ANOMALY]] [--auto_scale_batch_size [AUTO_SCALE_BATCH_SIZE]]
                   [--plugins PLUGINS] [--amp_backend AMP_BACKEND] [--amp_level AMP_LEVEL]
                   [--move_metrics_to_cpu [MOVE_METRICS_TO_CPU]]
                   [--multiple_trainloader_mode MULTIPLE_TRAINLOADER_MODE] --batch-size BATCH_SIZE
                   [--validation-split VALIDATION_SPLIT] [--num-test-examples NUM_TEST_EXAMPLES]
                   [--max-phoneme-ids MAX_PHONEME_IDS] [--hidden-channels HIDDEN_CHANNELS]
                   [--inter-channels INTER_CHANNELS] [--filter-channels FILTER_CHANNELS]
                   [--n-layers N_LAYERS] [--n-heads N_HEADS] [--seed SEED]
__main__.py: error: unrecognized arguments: --num_ckpt 1
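
The usage message above tells the story: `--num_ckpt` is not among the arguments this copy of `piper_train` defines, so argparse aborts before training starts. The flag appears to exist only in the modified notebook fork linked above, not in the trainer being invoked. A minimal reproduction of the failure mode (the parser below is a toy stand-in, not piper's real argument list):

```python
import argparse

# Toy parser: defines --dataset-dir and --checkpoint-epochs (which the usage
# message above does list) but not --num_ckpt, so parsing fails the same way.
parser = argparse.ArgumentParser(prog="__main__.py")
parser.add_argument("--dataset-dir", required=True)
parser.add_argument("--checkpoint-epochs", type=int)

try:
    parser.parse_args(["--dataset-dir", "/tmp/ds", "--num_ckpt", "1"])
except SystemExit:
    # argparse printed "__main__.py: error: unrecognized arguments: --num_ckpt 1"
    # to stderr and exited, exactly as in the log above.
    pass

# parse_known_args shows which tokens the parser could not consume:
args, extras = parser.parse_known_args(
    ["--dataset-dir", "/tmp/ds", "--checkpoint-epochs", "5", "--num_ckpt", "1"]
)
print(extras)  # ['--num_ckpt', '1']
```

So the likely fix is either to run the fork's `piper_train` (the one the notebook installs), or, when using a trainer whose usage listing matches the one above, drop `--num_ckpt` and rely on the flags that listing actually accepts.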