Releases: lhotse-speech/lhotse

v1.27.0 - Crispy Momo

22 Aug 15:25
170046f

Other enhancements

  • Cap the 'trng' random seeds to 2**31 avoiding numpy error by @pzelasko in #1379
  • CutSet.prefetch() for background cuts loading during iteration by @pzelasko in #1380
  • Include a copyright NOTICE listing major copyright holders by @pzelasko in #1381
  • Added has_custom to MixedCut by @anteju in #1383
  • Fix to fixed batch size bucketing and audio loading network connectio… by @pzelasko in #1387
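The idea behind background prefetching during iteration can be sketched in plain Python. This is a minimal illustration that assumes a thread-plus-bounded-queue design; the `prefetch` function and its `buffer_size` parameter are hypothetical names and do not reflect the actual signature or internals of `CutSet.prefetch()`.

```python
import queue
import threading
from typing import Iterable, Iterator, TypeVar

T = TypeVar("T")
_DONE = object()  # sentinel marking the end of the source iterable


def prefetch(items: Iterable[T], buffer_size: int = 4) -> Iterator[T]:
    """Yield items in order while a background thread reads ahead.

    Hypothetical helper for illustration only (not lhotse's implementation).
    """
    buf: queue.Queue = queue.Queue(maxsize=buffer_size)

    def producer() -> None:
        try:
            for item in items:
                buf.put(item)  # blocks once the buffer is full
        finally:
            buf.put(_DONE)  # always signal the end so the consumer cannot hang

    threading.Thread(target=producer, daemon=True).start()
    while True:
        item = buf.get()
        if item is _DONE:
            return
        yield item


if __name__ == "__main__":
    for x in prefetch(range(5), buffer_size=2):
        print(x)
```

The bounded queue keeps memory use predictable: the producer reads at most `buffer_size` items ahead of the consumer.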

Full Changelog: v1.26.0...v1.27.0

v1.26.0 - Uranium Fever

26 Jul 15:58
21b102c

Full Changelog: v1.25.0...v1.26.0

v1.25.0 - Himalayan Cat

18 Jul 23:45
18436e9

What's Changed

  • [feature] Add .narrowband() effect (mulaw, lpc10 codecs) by @rouseabout in #1348
  • [feature/optimization] Support for pre-determined batch sizes in DynamicBucketingSampler by @pzelasko in #1372
  • [bug] Fix MixedCut transforms serialization by @pzelasko in #1370

Full Changelog: v1.24.2...v1.25.0

v1.24.2

25 Jun 15:59

New features

Several new APIs for manifest classes added in #1361:

  • cut.iter_data() which iterates over (key, manifest) pairs of all data items attached to a given cut (e.g., ("recording", Recording(...)), ("custom_features", TemporalArray(...)))
  • is_in_memory property for all manifest types to indicate if it contains data that is held in memory
  • is_placeholder for non-cut manifests to indicate if a manifest is just a placeholder (has some metadata, but can't be used to load data)
  • cut.drop_in_memory_data() which converts manifests with in-memory data to placeholders (this is useful for manifests that live longer than just dataloading to avoid blowing up CPU memory and/or slowing down the program)
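The relationship between these four APIs can be illustrated with a toy model. The classes below (`ToyArray`, `ToyCut`) are hypothetical stand-ins written for this sketch, not lhotse classes, and they simplify things by treating any item without in-memory data as a placeholder; only the method and property names mirror the ones listed above.

```python
from dataclasses import dataclass, replace
from typing import Dict, Iterator, Optional, Tuple


@dataclass(frozen=True)
class ToyArray:
    """Hypothetical data item: metadata plus optional in-memory bytes."""

    shape: Tuple[int, ...]
    data: Optional[bytes] = None  # None means "metadata only"

    @property
    def is_in_memory(self) -> bool:
        return self.data is not None

    @property
    def is_placeholder(self) -> bool:
        # Simplification: in this toy model, no data means placeholder.
        return self.data is None


@dataclass(frozen=True)
class ToyCut:
    """Hypothetical cut holding named data items."""

    id: str
    items: Dict[str, ToyArray]

    def iter_data(self) -> Iterator[Tuple[str, ToyArray]]:
        # Iterate over (key, manifest) pairs of all attached data items.
        yield from self.items.items()

    @property
    def is_in_memory(self) -> bool:
        return any(item.is_in_memory for _, item in self.iter_data())

    def drop_in_memory_data(self) -> "ToyCut":
        # Return a copy whose items keep metadata but release the payload.
        light = {key: replace(item, data=None) for key, item in self.iter_data()}
        return replace(self, items=light)


if __name__ == "__main__":
    cut = ToyCut("cut-1", {"custom_features": ToyArray((2, 3), b"\x00" * 6)})
    light = cut.drop_in_memory_data()
    print(cut.is_in_memory, light.is_in_memory)
```

Returning a new object instead of mutating in place keeps the original cut usable by code that still needs the data.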

Bug fixes

  • Restoring smart open for local files if available by @pzelasko in #1360
  • Fix Recording.to_dict() when transforms are dicts and transform pickling issues by @pzelasko in #1355
  • Utils for discovering attached data and dropping in-memory data by @pzelasko in #1361
  • Numpy 2.0 compatibility by @pzelasko in #1362

Full Changelog: v1.24.1...v1.24.2

v1.24.1

10 Jun 20:35
866e4a8

What's Changed

  • Support for reading data from AIStore using Python SDK by @pzelasko in #1354

Full Changelog: v1.24...v1.24.1

v1.24 - The World's Highest Wingsuit Jump

05 Jun 19:59
4d57d53

What's Changed

New features

Notably, there's a new optimization for the dynamic bucketing sampler in multi-GPU training: it chooses the same (or the closest possible) bucket on each DDP rank so that per-rank training step times stay closer together. The expected speedup depends on the model and the number of GPUs. We observed 8% and 13% speedups in two experiments compared to non-synchronized bucket selection. The new option is called sync_buckets and is enabled by default.
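One way to picture synchronized bucket selection is a random draw whose seed does not depend on the rank. The sketch below rests on that assumption; the notes do not describe the actual mechanism, and `pick_bucket` and `closest_available` are hypothetical helpers rather than lhotse functions.

```python
import random
from typing import Sequence


def pick_bucket(num_buckets: int, step: int, rank: int, seed: int = 0,
                sync: bool = True) -> int:
    """Draw a bucket index for one rank at one training step."""
    # With sync=True the seed ignores the rank, so every rank draws the
    # same index at a given step. String seeds are hashed deterministically.
    key = f"{seed}:{step}" if sync else f"{seed}:{step}:{rank}"
    return random.Random(key).randrange(num_buckets)


def closest_available(target: int, available: Sequence[int]) -> int:
    """Fall back to the nearest non-empty bucket (ties go to the lower index)."""
    return min(available, key=lambda b: (abs(b - target), b))


if __name__ == "__main__":
    picks = [pick_bucket(10, step=3, rank=r) for r in range(4)]
    print(picks)  # all four entries are identical when sync=True
    print(closest_available(5, [0, 4, 9]))
```

Buckets group examples of similar duration, so ranks that draw the same bucket process batches of similar cost and spend less time waiting for each other.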

Full Changelog: v1.23...v1.24

v1.23 - Snowdrop

30 Apr 18:43
b2dce78

What's Changed

Fixes to a regression in noise mixing augmentations

  • Enhance CutSet.mix() randomness and data utilization by @pzelasko in #1315
  • Fix randomness in CutMix transform by @pzelasko in #1316
  • select a random sub-region of the noise based on the delta duration by @osadj in #1317
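Selecting a random sub-region of a noise recording based on the duration difference can be sketched as follows. This is a minimal illustration assuming durations in seconds and a uniform offset; `random_noise_region` is a hypothetical helper and is not taken from the linked pull requests.

```python
import random
from typing import Optional, Tuple


def random_noise_region(noise_duration: float, target_duration: float,
                        rng: Optional[random.Random] = None) -> Tuple[float, float]:
    """Return (offset, duration) of a random noise sub-region.

    The offset is drawn uniformly from [0, delta], where delta is the amount
    by which the noise is longer than the target.
    """
    rng = rng or random.Random()
    delta = noise_duration - target_duration
    if delta <= 0:
        # Noise is not longer than the target: use all of it from the start.
        return 0.0, noise_duration
    return rng.uniform(0.0, delta), target_duration


if __name__ == "__main__":
    offset, duration = random_noise_region(10.0, 4.0, random.Random(0))
    print(f"take {duration:.1f}s of noise starting at {offset:.2f}s")
```

Drawing the offset from the whole range `[0, delta]` lets every part of a long noise recording be used over time, rather than always its beginning.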

Full Changelog: v1.22...v1.23

v1.22 - Sherpa's Paradise

07 Mar 19:38
d26d476

What's Changed

New features

  • Extending Lhotse dataloading to text/multimodal data by @pzelasko in #1295

As an experimental feature, we are extending the API of Lhotse samplers to enable key sampling features for non-audio data such as text. This means that text (and other) data can be dynamically multiplexed and bucketed in the same way as audio data, using some lightweight wrappers. Please refer to the new documentation here: https://lhotse.readthedocs.io/en/latest/datasets.html#customizing-sampling-constraints

  • Multi-channel support improvements
    • Fix loading multi-channel custom recording fields in multi cuts by @pzelasko in #1298
    • Channel selection for multi-channel custom recording fields by @pzelasko in #1299

Lhotse MultiCuts:

  • are now exportable into Lhotse Shar format
  • gained a new method cut = cut.with_channels([0, 1, ...]) to modify the channels they refer to
  • can have multi-channel custom Recordings with channels selectable via a special custom key (e.g., if defining cut.target_recording, audio can be read via cut.load_target_recording() and channels will be auto-selected by looking up cut.target_recording_channel_selector).

Full Changelog: v1.21...v1.22

v1.21 - Glaciology

13 Feb 19:57
769c273

What's Changed

This release patches lhotse to handle cases when libsox is not available for torchaudio. The audio backend code went through an additional round of refactoring, and libsndfile is now the preferred default backend since it showed faster audio decoding performance in our testing. Going forward, when LHOTSE_AUDIO_BACKEND is set, we will use the same backend for audio loading, audio saving, and reading audio metadata (if possible). This release also adds support for Python 3.12 and PyTorch 2.2.

  • Add VAD to Supervisions in LibriLight Recipe by @yfyeung in #1280
  • Fixes for manifest validation and fixing by @pzelasko in #1284
  • Handle error with cachedir creation gracefully by @pzelasko in #1287
  • AudioBackend specific save_audio and info, managing missing SoX in torchaudio, Python 3.12 / PyTorch 2.2 support, using libsndfile as preferred audio backend by @pzelasko in #1288

Full Changelog: v1.20...v1.21

v1.20 - Pining for the Fjords

31 Jan 20:51
455b20e

What's Changed

New features

  • Extended the subset of lhotse that works without installing torchaudio by @pzelasko in #1253 #1255
  • Ensure drop_last=False always returns an equal number of mini-batches by re-distributing and/or duplicating some data by @pzelasko in #1277
  • Improved CPU memory usage and shuffling + bucketing in DynamicBucketingSampler by @pzelasko in #1276
  • Enable seed randomization in dynamic samplers by @pzelasko in #1278
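Equal mini-batch counts matter in distributed training because a rank that runs out of batches early leaves the other ranks waiting at the next gradient synchronization. The sketch below shows only the duplication half of the idea, under the assumption that each rank repeats its own batches; `equalize_batches` is a hypothetical helper, and the re-distribution logic from the pull request is not reproduced here.

```python
from typing import List, Sequence, TypeVar

T = TypeVar("T")


def equalize_batches(per_rank: Sequence[List[T]]) -> List[List[T]]:
    """Pad every rank's batch list to the longest length.

    Shorter ranks repeat their own batches in order. A rank with no batches
    at all stays empty, since it has nothing to duplicate.
    """
    target = max(len(batches) for batches in per_rank)
    result = []
    for batches in per_rank:
        padded = list(batches)
        i = 0
        while batches and len(padded) < target:
            padded.append(batches[i % len(batches)])
            i += 1
        result.append(padded)
    return result


if __name__ == "__main__":
    ranks = [[["a", "b"], ["c"]], [["d"]]]
    print(equalize_batches(ranks))
```

Duplicating a few examples slightly biases the final steps of an epoch, which is the trade-off for not dropping the remaining data.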

Other improvements

  • Update docs with env vars used by Lhotse by @pzelasko in #1252
  • support whisper large v3; deepspeed launcher rank world_size setting by @yuekaizhang in #1260
  • Fix non-deterministic tests by @pzelasko in #1261
  • Fix duplication issues in CutSet.mix() by @pzelasko in #1268
  • Support controllable CutSet.mux weights in multiprocess dataloading by @pzelasko in #1266
  • Fix distributed sampler initialization and exceeded sampler warning false positives by @pzelasko in #1270
  • Install kaldi-native-io explicitly in the kaldi doc example. by @csukuangfj in #1275
  • Allow duplicate cut IDs in a CutSet (CutSet is list-like instead of dict-like) by @pzelasko in #1279

Full Changelog: v1.19...v1.20