02 Feb 06:25

ericharper

acf6bf4

NVIDIA Neural Modules 1.6.1

Bug Fixes

Fix embedding name for verifying speakers #3578
Add rank check and barrier helpers compilation for megatron dataset #3581
Add apex import guards #3579

29 Jan 04:53

ericharper

v1.6.0

75fd743

NVIDIA Neural Modules 1.6.0

ASR

Add new features to ASR with diarization with modified tutorial and README. by @tango4j :: PR: #3007
Enable stateful decoding of RNNT over multiple transcribe calls by @titu1994 :: PR: #3037
Move vocabs from asr to common by @Oktai15 :: PR: #3084
Adding parallel transcribe for ASR models - supports multi-gpu/multi-node by @VahidooX :: PR: #3017
CTC Conformer fixes for ONNX/TS export by @borisfom :: PR: #3072
Adding pretrained French ASR models to ctc_bpe and rnnt_bpe listings by @tbartley94 :: PR: #3225
adding german conformer ctc and rnnt by @yzhang123 :: PR: #3242
Add aishell and fisher dataset processing scripts for ASR by @jbalam-nv :: PR: #3203
Better default for RNNT greedy decoding by @titu1994 :: PR: #3332
Add uniform ASR evaluation script for all models by @titu1994 :: PR: #3334
CTC Segmentation-Citrinet support by @ekmb :: PR: #3279
Updates on ASR with diarization util files by @tango4j :: PR: #3359
Asr fr by @tbartley94 :: PR: #3404
Refactor ASR Examples Directory by @titu1994 :: PR: #3392
Asr patches by @titu1994 :: PR: #3443
Properly support -1 for labels in ctc char models by @titu1994 :: PR: #3487

TTS

MixerTTS, MixerTTSDataset and small updates in tts tokenizers by @Oktai15 :: PR: #2859
ONNX and TorchScript support for Mixer-TTS by @Oktai15 :: PR: #3082
Update name of files to one style in TTS folder by @Oktai15 :: PR: #3189
Update TTS Dataset, FastPitch with TTS dataset and small improvements in HiFiGAN by @Oktai15 :: PR: #3205
Add Beta-binomial Interpolator to TTSDataset by @Oktai15 :: PR: #3230
Normalizer to TTS models, TTS tokenizer updates, AxisKind updates by @Oktai15 :: PR: #3271
Update Mixer-TTS, FastPitch and TTSDataset by @Oktai15 :: PR: #3366
Minor Updates to TTS Finetuning by @blisc :: PR: #3455

NLP / NMT

NMT timing and tokenizer stats utils by @michalivne :: PR: #3004
Add offsets calculation to MegatronGPTModel.complete method by @dimapihtar :: PR: #3117
NMT checkpoint averaging by @michalivne :: PR: #3096
NMT validation examples with inputs by @michalivne :: PR: #3194
Improve data pipeline for punctuation capitalization model and make other useful changes by @PeganovAnton :: PR: #3159
Reduce test time of punctuation and capitalization model by @PeganovAnton :: PR: #3286
NLP text augmentation by @michalivne :: PR: #3291
Adding Megatron NeMo Bert support by @yidong72 :: PR: #3303
Added Script to convert Megatron LM to .nemo file by @yidong72 :: PR: #3371
Support Changing Number of Tensor Parallel Partitions for Megatron by @aklife97 :: PR: #3365
Megatron AMP fix for scheduler step counter by @titu1994 :: PR: #3293
T5 Pre-training in NeMo using Megatron by @MaximumEntropy :: PR: #3036
NMT MIM mean variance fix by @michalivne :: PR: #3385
NMT Shared Embeddings Weights by @michalivne :: PR: #3340
Make saving .nemo during on_train_end configurable by @ericharper :: PR: #3427
Byte-level Multilingual NMT by @aklife97 :: PR: #3368
BioMegatron token classification tutorial fix to be compatible with current Megatron BERT by @yidong72 :: PR: #3435
NMT documentation for bottleneck architecture by @michalivne :: PR: #3464
(1) O2-style mixed precision recipe, (2) Persistent layer-norm, (3) Grad scale hysteresis, (4) gradient_as_bucket_view by @erhoo82 :: PR: #3259

Text Normalization / Inverse Text Normalization

Tn clean upsample by @yzhang123 :: PR: #3024
Tn add nn wfst and doc by @yzhang123 :: PR: #3135
Update english tn ckpt by @yzhang123 :: PR: #3143
WFST_tutorial for ITN development by @tbartley94 :: PR: #3128
German TN wfst by @yzhang123 :: PR: #3174
Add ITN Vietnamese by @binh234 :: PR: #3217
WFST TN updates by @ekmb :: PR: #3235
Itn german refactor by @yzhang123 :: PR: #3262
Tn german deterministic by @yzhang123 :: PR: #3308
TN updates by @ekmb :: PR: #3285
Added double digits to EN ITN by @yzhang123 :: PR: #3321
TN_non_deterministic optimized by @ekmb :: PR: #3343
Missing init for TN German by @ekmb :: PR: #3355
Ru TN by @ekmb :: PR: #3390
Update ContextNet models trained on more datasets by @titu1994 :: PR: #3440

NeMo Tools

CTC Segmentation-Citrinet support by @ekmb :: PR: #3279
Updated NumPy SDE requirement by @vsl9 :: PR: #3442

Export

ONNX and TorchScript support for Mixer-TTS by @Oktai15 :: PR: #3082
CTC Conformer fixes for ONNX/TS export by @borisfom :: PR: #3072

Documentation

Merge r1.5.0 bugfixes and doc updates to main by @ericharper :: PR: #3133
Tn add nn wfst and doc by @yzhang123 :: PR: #3135
Add apex into by @PeganovAnton :: PR: #3214
Final merge r1.5.0 bugfixes and doc updates to main by @ericharper :: PR: #3232
Nemo container docker building instruction - merge to main by @fayejf :: PR: #3236
Doc link fixes by @nithinraok :: PR: #3264
French ASR Doc updates by @tbartley94 :: PR: #3322
german asr doc page update by @yzhang123 :: PR: #3325
update docs and replace speakernet with titanet in tutorials by @nithinraok :: PR: #3405
Asr fr by @tbartley94 :: PR: #3404
Update copyright to 2022 by @ericharper :: PR: #3426
Update Speech Classification - VAD doc by @fayejf :: PR: #3430
Update speaker diarization docs by @tango4j :: PR: #3419
NMT documentation for bottleneck architecture by @michalivne :: PR: #3464
Add verification helper function and update docs by @nithinraok :: PR: #3514
Prompt tuning documentation by @vadam5 :: PR: #3541

Bugfixes

Fixed wrong tgt_length for timing by @michalivne :: PR: #3050
Update nltk version with a CVE fix by @thomasdhc :: PR: #3054
Fix README by @ericharper :: PR: #3070
Transformer Decoder: Fix swapped input name issue by @aklife97 :: PR: #3066
Fixes bugs in collect_tokenizer_dataset_stats.py by @michalivne :: PR: #3060
Attribute is not working in . by @PeganovAnton :: PR: #3099
Merge r1.5.0 bugfixes and doc updates to main by @ericharper :: PR: #3133
A quick fix for issue #3094 index out-of-bound when truncating long text to max_seq_length by @bugface :: PR: #3131
Fixed two typos by @bene-ges :: PR: #3157
Merge r1.5.0 bugfixes to main by @ericharper :: PR: #3173
LJSpeech alignment scripts fixed for latest MFA by @m-toman :: PR: #3177
Add apex into by @PeganovAnton :: PR: #3214
Patch omegaconf for cfg by @fayejf :: PR: #3224
Final merge r1.5.0 bugfixes and doc updates to main by @ericharper :: PR: #3232
CTC Conformer fixes for ONNX/TS export by @borisfom :: PR: #3072
Fix Masked SE for Citrinets + export Limited Context Citrinet by @titu1994 :: PR: #3216
Fix text length type in TTSDataset for beta_binomial_interpolator by @Oktai15 :: PR: #3233
Fix cast type in _se_pool_step_script related functions by @Oktai15 :: PR: #3239
Doc link fixes by @nithinraok :: PR: #3264
Escape chars fix by @ekmb :: PR: #3253
Fix asr output - eval mode by @nithinraok :: PR: #3274
Remove ArrayLike because it is not supported in numpy 1.18 by @PeganovAnton :: PR: #3282
Fix megatron_gpt_ckpt_to_nemo.py with torch distributed by @yaoyu-33 :: PR: #3278
Reduce test time of punctuation and capitalization model by @PeganovAnton :: PR: #3286
Tn en money fix by @yzhang123 :: PR: #3290
Fixing the bucketing_batch_size bug. by @VahidooX :: PR: #3294
Adaptive fixed positional embeddings by @michalivne :: PR: #3263
Fix specaugment time start for numba kernel by @titu1994 :: PR: #3299
Fix for Stalled ASR training/eval on Pytorch 1.10+ (multigpu/multinode) by @titu1994 :: PR: #3304
Fix bucketing list bug. by @VahidooX :: PR: #3315
Fix MixerTTS types and dimensions by @Oktai15 :: PR: #3330
Fix German and Vietnamese grammar by @yzhang123 :: PR: #3331
Fix readme to show cmd by @yzhang123 :: PR: #3345
Fix speaker label models training convergence by @nithinraok :: PR: #3354
Tqdm get datasets by @bmwshop :: PR: #3358
Fixed future masking in cross attention of Perceiver by @michalivne :: PR: #3314
Fixed the bug of fixed-size bucketing. by @VahidooX :: PR: #3364
Fix minor problems in punctuation and capitalization model by @PeganovAnton :: PR: #3376
Megatron AMP fix for scheduler step counter by @titu1994 :: PR: #3293
Fixed the bug of bucketing when fixed-size batch is used. by @VahidooX :: PR: #3399
TalkNet Fix by @stasbel :: PR: #3092
Fix linear annealing not annealing lr to min_lr by @MaximumEntropy :: PR: #3400
Resume training on SLURM multi-node multi-gpu by @itzsimpl :: PR: #3374
Fix running token classification in multinode setting by @PeganovAnton :: PR: #3413
Fix order of lang checking to ignore input langs by @MaximumEntropy :: PR: #3417
NMT MIM mean variance fix by @michalivne :: PR: #3385
Fix bug for missing variable by @MaximumEntropy :: PR: #3437
Asr patches by @titu1994 :: PR: #3443
Prompt tuning loss mask fix by @vadam5 :: PR: #3438
BioMegatron token classification tutorial fix to be compatible with current Megatron BERT by @yidong72 :: PR: #3435
Fix hysteresis loading by @MaximumEntropy :: PR: #3460
Fix the tutorial notebooks bug by @yidong72 :: PR: #3465
Fix the errors/bugs in ASR with diarization tutorial by @tango4j :: PR: #3461
WFST Punct post fix + punct tutorial fixes by @ekmb :: PR: #3469
Process correctly label ids dataset parameter + standardize type of label ids model attribute + minor changes (error messages, typing) by @PeganovAnton :: PR: #3471
file name fix - Segmentation tutorial by @ekmb :: PR: #3474
Patch fix for the multiple last checkpoints issue by @nithinraok :: PR: #3468
Fix bug with arguments for TalkNet's preprocessor by @Oktai15 :: PR: #3481
Fix description by @PeganovAnton :: PR: #3482
typo fix in diarization notebooks by @nithinraok :: PR: #3480
Fix check...

Contributors

bmwshop, ryanleary, and 34 other contributors

04 Dec 00:00

blisc

v1.5.1

01419c3

NVIDIA Neural Modules 1.5.1

Features

Minor updates to expose speaker id, pitch, and duration on export of FastPitch #3192, #3207

Known Issues

Training of speaker models converges very slowly due to a bug (fixed in main: #3354).
ASR training does not reach adequate WER due to a bug in Numba Spec Augment (fixed in main: #3299). For details, refer to #3288 (comment). As a temporary workaround, disable Numba Spec Augment by setting the option at https://github.com/NVIDIA/NeMo/blob/main/nemo/collections/asr/modules/audio_preprocessing.py#L471 to False in the SpecAugment section of the yaml config. The fix will be part of 1.6.0.

20 Nov 01:55

ericharper

v1.5.0

e2d11bb

NVIDIA Neural Modules 1.5.0

Features

Megatron GPT pre-training with tensor model parallelism #2975
NMT encoder and decoder with different hidden size #2856
Logging timing of train/val/test steps #2936
Logging NMT encoder and decoder timing #2956
Logging timing per sentence length and tokenized text statistics #3004
Upgrade to PyTorch Lightning 1.5.0, bfloat support #2975
French Inverse Text Normalization #2921
Bucketing of tarred datasets for ASR models #2999
ASR with diarization #3007
Adding parallel transcribe for ASR models - supports multi-gpu/multi-node #3017

Documentation Updates

RNNT

Contributors

@ericharper @michalivne @MaximumEntropy @VahidooX @titu1994 @blisc @okuchaiev @tango4j @erastorgueva-nv @fayejf @vadam5 @ekmb @yaoyu-33 @nithinraok @erhoo82 @tbartley94 @PeganovAnton @madhukarkm @yzhang123
(Please let us know if you have contributed to this release and we have missed you here.)

Contributors

titu1994, yzhang123, and 17 other contributors

02 Oct 00:49

ericharper

v1.4.0

0958184

NVIDIA Neural Modules 1.4.0

Features

Improved speaker clustering #2729
Upgrade to NVIDIA PyTorch 21.08 container #2799
RNNT mAES beam search support #2802
Transfer learning for new speakers #2684
Simplify speaker scripts #2777
Perceiver-encoder architecture #2737
Relative paths in tarred datasets #2776
Torch only TTS package #2643
Inverse text normalization for Spanish #2489

Tutorial Notebooks

Duration and pitch control for TTS #2700

Bug fixes

Fixed max delta generation #2727
Waveglow export #2671, #2699

Contributors

@tango4j @titu1994 @paarthneekhara @nithinraok @michalivne @erastorgueva-nv @borisfom @blisc
(some contributors may not be listed explicitly)

Contributors

titu1994, blisc, and 6 other contributors

27 Aug 21:24

ericharper

v1.3.0

6d260c9

NVIDIA Neural Modules 1.3.0

Added

RNNT Exportable to ONNX #2510
Multi-batch inference support for speaker diarization #2522
DALI Integration for char/subword ASR #2567
VAD Postprocessing #2636
Perceiver encoder for NMT #2621
gRPC NMT server #2656
German ITN #2486
Russian TN and ITN #2519
Save/restore connector #2592
PTL 1.4+ #2600

Tutorial Notebooks

Non-English downstream NLP task #2532
RNNT Basics #2651

Bug Fixes

NMESE clustering for very small audio files #2566

Contributors

@pasandi20 @ekmb @nithinraok @titu1994 @ryanleary @yzhang123 @ericharper @michalivne @MaximumEntropy @fayejf
(some contributors may not be listed explicitly)

Contributors

ryanleary, titu1994, and 8 other contributors

30 Jul 20:05

ericharper

v1.2.0

9b36aae

NVIDIA Neural Modules 1.2.0

Added

Improve performance of speaker clustering (#2445)
Update Conformer for ONNX conversion (#2439)
Mean and length normalization for better embeddings in speaker verification and diarization (#2397)
FastEmit RNNT Loss Numba for reducing latency (#2374)
Multiple datasets, right to left models, noisy channel re-ranking, ensembling for NMT (#2379)
Byte level tokenization (#2365)
Bottleneck with attention bridge for more efficient NMT training (#2390)
Tutorial notebook for NMT data cleaning and preprocessing (#2467)
Streaming Conformer inference script for long audio files (#2373)
Res2Net Ecapa equivalent implementation for speaker verification and diarization (#2468)
Update end-to-end tutorial notebook to use CitriNet (#2457)

Contributors

@nithinraok @tango4j @jbalam-nv @titu1994 @MaximumEntropy @mchrzanowski @michalivne @fayejf @okuchaiev

(some contributors may not be listed explicitly)

Known Issues

Running "import nemo.collections.nlp as nemo_nlp" results in an error. This will be patched in an upcoming version. As a workaround, import the individual files instead.
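
The workaround might be sketched as follows. This is a minimal illustration, not the project's recommended fix: the helper import_first_available and the submodule paths in NLP_CANDIDATES are assumptions introduced here, and the paths that actually exist depend on your installed NeMo tree.

```python
import importlib


def import_first_available(module_names):
    """Return the first module that imports cleanly, or None if none do."""
    for name in module_names:
        try:
            return importlib.import_module(name)
        except Exception:
            # A broken package can raise more than ImportError, so skip on any error.
            continue
    return None


# Hypothetical submodule paths, listed only for illustration; check the
# installed NeMo 1.2.0 tree for the files you actually need.
NLP_CANDIDATES = [
    "nemo.collections.nlp.models",
    "nemo.collections.nlp.modules",
]

if __name__ == "__main__":
    module = import_first_available(NLP_CANDIDATES)
    print("imported:", module.__name__ if module else None)
```

Importing a narrower submodule avoids triggering the failing top-level package import path where possible; if every candidate fails, the helper returns None rather than raising.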

Contributors

mchrzanowski, titu1994, and 7 other contributors

02 Jul 21:51

ericharper

v1.1.0

e722f33

NVIDIA Neural Modules 1.1.0

NeMo 1.1.0 is the first release in our new monthly release cadence. Monthly releases will focus on adding new features that enable new NeMo Models or improve existing ones.

Added

Pretrained Megatron-LM encoders (including model parallel) for NMT (#2238)
RNNT Numba loss (#1995)
Enable multiple models to be restored (#2245)
Audio based text normalization (#2285)
Multilingual NMT (#2160)
FastPitch export (#2355)
ASR fine-tuning tutorial for other languages (#2346)

Bugfixes

HiFiGan Export (#2279)
OmegaConf forward compatibility (#2319)

Documentation

ONNX export documentation (#2330)

Contributors

@borisfom @MaximumEntropy @ericharper @aklife97 @titu1994 @ekmb @yzhang123 @blisc

(some contributors may not be listed explicitly)

11 Jun 01:45

okuchaiev

v1.0.2

e85b42e

NVIDIA Neural Modules 1.0.2

Release 1.0.2

NeMo 1.0.2 is a minor change over 1.0.0 that adds version checks for the Hydra dependency.

09 Jun 05:40

okuchaiev

v1.0.1

2763c67

NVIDIA Neural Modules 1.0.1

Release 1.0.1

NeMo 1.0.1 is a minor change over 1.0.0 that adds proper version bounds for some external dependencies.

Releases: NVIDIA/NeMo
