Commit 5f035d0: Release v1.8.0 (parent: da4feb0)

File tree: 131 files changed, +6577 / -1938 lines


.gitignore

Lines changed: 1 addition & 0 deletions
@@ -20,6 +20,7 @@ compile_commands.json
 # testing
 .hypothesis
 .pytest_cache
+*.coverage
 
 # logs
 *.log

.pre-commit-config.yaml

Lines changed: 6 additions & 1 deletion
@@ -13,7 +13,7 @@ repos:
     rev: 5.12.0
     hooks:
       - id: isort
-        args: [--profile=black, --project=rnnt_train]
+        args: [--profile=black, --project=rnnt_train, --project=internal_rnnt_train]
   - repo: https://github.com/psf/black
     rev: 23.3.0
     hooks:
@@ -39,3 +39,8 @@ repos:
         ]
     # Later versions of node are incompatible with Ubuntu 18
     language_version: "17.9.1"
+  - repo: https://github.com/PyCQA/flake8
+    rev: 7.0.0
+    hooks:
+      - id: flake8
+        args: [--max-line-length=92, "--extend-ignore=E203,F401,F722"]
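
Once installed, the new flake8 hook runs on staged files at commit time; it can also be run over the whole tree with the standard pre-commit CLI (generic pre-commit commands, not specific to this repo):

    pre-commit install                   # set up the git hook once
    pre-commit run flake8 --all-files    # lint the whole tree with the new hook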

LICENSE

Lines changed: 1 addition & 1 deletion
@@ -3,7 +3,7 @@ Benchmark, previously released under the Apache license. This Derivative Work,
 derived by, and including new code written by, Myrtle.ai, is released under
 the MIT license:
 
-Copyright (c) 2022 Myrtle.ai
+Copyright (c) 2023 Myrtle.ai
 
 Permission is hereby granted, free of charge, to any person obtaining a copy
 of this software and associated documentation files (the "Software"), to deal

README.md

Lines changed: 24 additions & 20 deletions
@@ -20,22 +20,22 @@ The solution supports two model configurations:
 where:
 
 * **Realtime streams (RTS)** is the number of concurrent streams that can be serviced by a single accelerator
-* **p99 latency** is the 99th-percentile latency to process a single 60 ms audio frame and return any predictions. Note that latency increases with more concurrent streams.
+* **p99 latency** is the 99th-percentile latency to process a single 60 ms audio frame and return any predictions. Note that latency increases with the number of concurrent streams.
 
-<sup>§</sup>The `large` model inference performance figures are provisional.
-
-The **solution scales linearly with number of accelerators in the server** (tested up to 8000 RTS per server).
+The **solution scales linearly up to 8 accelerators and we have measured a single server supporting 16000 RTS** with the `base` model.
 
 The `base` and `large` configurations are optimised for inference on FPGA with Myrtle's IP to achieve high-utilisation of the available resources. They were chosen after hyperparameter searches on 10k-50k hrs of training data.
 
+<sup>§</sup>The `large` model inference performance figures are provisional.
+
 ### Word Error Rates (WERs)
 
 When training on the 50k hrs of open-source data described below, the solution has the following WERs:
 
-| Model   | MLS   | LibriSpeech-dev-clean | LibriSpeech-dev-other | Earnings21<sup>*</sup> |
-|---------|-------|-----------------------|-----------------------|------------------------|
-| `base`  | 9.37% | 3.01%                 | 8.14%                 | 26.98%                 |
-| `large` | 7.93% | 2.69%                 | 7.14%                 | 23.33%                 |
+| Model              | MLS   | LibriSpeech-dev-clean | LibriSpeech-dev-other | Earnings21<sup>*</sup> |
+|--------------------|-------|-----------------------|-----------------------|------------------------|
+| `base`<sup>†</sup> | 9.37% | 3.01%                 | 8.14%                 | 26.98%                 |
+| `large`            | 7.70% | 2.53%                 | 6.90%                 | 21.85%                 |
 
 These WERs are for streaming scenarios without additional forward context. Both configurations have a frame size of 60ms, so, for a given segment of audio, the model sees between 0 and 60ms of future context before making predictions.
 
@@ -48,28 +48,32 @@ The 50k hrs of training data is a mixture of the following open-source datasets:
 
 This data has a `maximum_duration` of 20s and a mean length of 12.75s.
 
-**<sup>*</sup>** None of these training data subsets include near-field unscripted utterances nor financial terminology. As such the Earnings21 benchmark is out-of-domain for these systems.
+<sup>*</sup>None of these training data subsets include near-field unscripted utterances nor financial terminology. As such the Earnings21 benchmark is out-of-domain for these systems.
+<sup>†</sup>`base` model WERs were not updated for the latest release. The provided values are from version [v1.6.1](https://github.com/MyrtleSoftware/myrtle-rnnt/releases/tag/v1.6.0).
 
 ### Training times <a name="train-timings"></a>
 
 Training throughputs on an `8 x A100 (80GB)` system are as follows:
 
-| Model   | Training time | Throughput  | No. of updates | per-gpu `batch_size` | `GRAD_ACCUMULATION_BATCHES` |
-|---------|---------------|-------------|----------------|----------------------|-----------------------------|
-| `base`  | 1.8 days      | 671 utt/sec | 100k           | 32                   | 4                           |
-| `large` | 3.1 days      | 380 utt/sec | 100k           | 16                   | 8                           |
+| Model   | Training time | Throughput  | No. of updates | `grad_accumulation_batches` | `batch_split_factor` |
+|---------|---------------|-------------|----------------|-----------------------------|----------------------|
+| `base`  | 1.6 days      | 729 utt/sec | 100k           | 1                           | 8                    |
+| `large` | 2.2 days      | 550 utt/sec | 100k           | 1                           | 16                   |
 
 Training times on an `8 x A5000 (24GB)` system are as follows:
 
-| Model   | Training time | Throughput  | No. of updates | per-gpu `batch_size` | `GRAD_ACCUMULATION_BATCHES` |
-|---------|---------------|-------------|----------------|----------------------|-----------------------------|
-| `base`  | 4.4 days      | 268 utt/sec | 100k           | 8                    | 16                          |
-| `large` | 12.9 days     | 92 utt/sec  | 100k           | 4                    | 32                          |
+| Model   | Training time | Throughput  | No. of updates | `grad_accumulation_batches` | `batch_split_factor` |
+|---------|---------------|-------------|----------------|-----------------------------|----------------------|
+| `base`  | 3.1 days      | 379 utt/sec | 100k           | 1                           | 16                   |
+| `large` | 8.5 days      | 140 utt/sec | 100k           | 8                           | 4                    |
 
 where:
 
 * **Throughput** is the number of utterances seen per second during training (higher is better)
-* **No. of updates** is the number of optimiser steps at `GLOBAL_BATCH_SIZE=1024` that are required to train the models on the 50k hrs training dataset. You may need fewer steps when training with less data
-* **`GRAD_ACCUMULATION_BATCHES`** is the number of gradient accumulation steps per gpu required to achieve the `GLOBAL_BATCH_SIZE` of 1024. For all configurations the **per-gpu `batch_size`** is as large as possible meaning that `GRAD_ACCUMULATION_BATCHES` is set as small as possible.
+* **No. of updates** is the number of optimiser steps at `--global_batch_size=1024` that are required to train the models on the 50k hrs training dataset. You may need fewer steps when training with less data
+* **`grad_accumulation_batches`** is the number of gradient accumulation steps performed on each GPU before taking an optimizer step
+* **`batch_split_factor`** is the number of sub-batches that the `PER_GPU_BATCH_SIZE` is split into before these sub-batches are passed through the joint network and loss.
+
+For more details on these hyper-parameters, including how to set them, please refer to the [batch size arguments](training/docs/batch_size_hyperparameters.md) documentation.
 
-For more details on the batch size hyperparameters refer to the [Training Commands subsection of training/README.md](training/README.md#training). To get started with training see the [training/README.md](training/README.md).
+To get started with training see the [training/README.md](training/README.md).
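
As a rough worked example of how these batch-size hyper-parameters fit together (a sketch only; it assumes the usual convention that the global batch equals n_gpus * per-GPU batch * grad_accumulation_batches, and the derived per-GPU value is illustrative, not taken from the repo):

    # 8 x A100 `base` row from the table above
    n_gpus=8
    global_batch_size=1024
    grad_accumulation_batches=1
    batch_split_factor=8

    # per-GPU batch processed between optimiser steps: 1024 / (8 * 1) = 128
    per_gpu_batch_size=$((global_batch_size / (n_gpus * grad_accumulation_batches)))
    # utterances per pass through the joint network and loss: 128 / 8 = 16
    sub_batch_size=$((per_gpu_batch_size / batch_split_factor))
    echo "per-GPU batch: ${per_gpu_batch_size}, joint/loss sub-batch: ${sub_batch_size}"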

scripts/get-version.sh

Lines changed: 27 additions & 0 deletions
@@ -0,0 +1,27 @@
+#!/usr/bin/env bash
+BRANCH_NAME=$(git rev-parse --abbrev-ref HEAD)
+
+LATEST_VERSION_TAG=$(git describe --tags --match="v[0-9]*" external --abbrev=0)
+
+LAST_MAJOR_MINOR=$(echo $LATEST_VERSION_TAG | sed -r 's/v([0-9]+)\.([0-9]+)\.[0-9]+/\1.\2/g')
+NEXT_MAJOR_MINOR=$(scripts/next-version.sh $LAST_MAJOR_MINOR)
+
+if [[ $BRANCH_NAME == "main" ]]; then
+    VERSION=main-$NEXT_MAJOR_MINOR-$(git rev-parse --short HEAD)
+elif [[ $BRANCH_NAME == "external" ]]; then
+    # valid tag in this case is the version tag
+    # ... but this should be a manual tagging step (for now at least)
+    echo "Versioning must be manual on external branch"
+    exit 1
+elif [[ $BRANCH_NAME =~ release/v[0-9]* ]]; then
+    # ignore NEXT_MAJOR_MINOR and use the branch name version
+    RELEASE_BRANCH_VERSION=$(echo $BRANCH_NAME | sed -r 's/release\///g')
+    VERSION=rc-$RELEASE_BRANCH_VERSION-$(git rev-parse --short HEAD)
+else
+    # replace "_" with "-"
+    BRANCH_NAME=$(echo "$BRANCH_NAME" | sed -r 's/[_:]/-/g')
+    # only allow a single tag for each feature branch to save docker image space
+    VERSION=f-$BRANCH_NAME-$NEXT_MAJOR_MINOR
+fi
+
+echo $VERSION
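
For illustration (hypothetical branch name; run from the repo root): on a feature branch `my_feature` with the latest tag on `external` at v1.8.0, the script above would print

    $ scripts/get-version.sh
    f-my-feature-v1.9

while on `main` it would print `main-v1.9-<short-sha>` and on a `release/v1.9` branch `rc-v1.9-<short-sha>`.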

scripts/next-version.sh

Lines changed: 19 additions & 0 deletions
@@ -0,0 +1,19 @@
+#!/usr/bin/env bash
+
+# script returns next semantic version.
+version=$1
+# Input version must be either major.minor.patch or major.minor.
+# Script will return the next patch or minor version respectively
+# so:
+# next-version.sh 1.2.3 will return 1.2.4
+# next-version.sh 1.2 will return 1.3
+
+if [[ $version =~ ^v?[0-9]+\.[0-9]+ ]]; then
+    # edited from https://unix.stackexchange.com/questions/23174/increment-number-in-bash-variable-string:
+    [[ "$version" =~ (.*[^0-9])([0-9]+)$ ]] && version="${BASH_REMATCH[1]}$((${BASH_REMATCH[2]} + 1))";
+    echo "v$version";
+else
+    echo "Invalid version tag: '$version'"
+    exit 1
+
+fi
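
A quick check of the behaviour described in the script's own comments (note the output carries a `v` prefix):

    $ scripts/next-version.sh 1.2.3
    v1.2.4
    $ scripts/next-version.sh 1.2
    v1.3
    $ scripts/next-version.sh foo
    Invalid version tag: 'foo'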

training/.coveragerc

Lines changed: 9 additions & 0 deletions
@@ -0,0 +1,9 @@
+[run]
+omit =
+    */tests/*
+
+[report]
+exclude_lines =
+    raise AssertionError
+    raise NotImplementedError
+    if __name__ == .__main__.:
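
This file is picked up automatically by coverage.py when tests are run from `training/`; a minimal sketch of the workflow it configures (assuming coverage.py and pytest are available in the container):

    cd training
    coverage run -m pytest    # [run] omit drops */tests/* from measurement
    coverage report           # [report] exclude_lines skips the listed patterns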

training/Dockerfile

Lines changed: 6 additions & 6 deletions
@@ -12,16 +12,16 @@
 # See the License for the specific language governing permissions and
 # limitations under the License.
 
-ARG FROM_IMAGE_NAME=nvcr.io/nvidia/pytorch:23.03-py3
+ARG FROM_IMAGE_NAME=nvcr.io/nvidia/pytorch:23.10-py3
 FROM ${FROM_IMAGE_NAME}
 
 # pytorch version taken from here:
-# https://docs.nvidia.com/deeplearning/frameworks/pytorch-release-notes/rel-23-03.html#rel-23-03
-ENV PYTORCH_VERSION=2.0.0a0+1767026
+# https://docs.nvidia.com/deeplearning/frameworks/pytorch-release-notes/rel-23-10.html#rel-23-10
+ENV PYTORCH_VERSION=2.1.0a0+32f93b1
 
-# Added by rob@myrtle May 2022 to fix NVIDIA key rotation problem.
+# fix NVIDIA key rotation problem.
 # See https://forums.developer.nvidia.com/t/notice-cuda-linux-repository-key-rotation/212771 for details.
-RUN apt-key adv --fetch-keys https://developer.download.nvidia.com/compute/cuda/repos/ubuntu2004/x86_64/3bf863cc.pub
+RUN apt-key adv --fetch-keys https://developer.download.nvidia.com/compute/cuda/repos/ubuntu2204/x86_64/3bf863cc.pub
 
 # need to set the tzdata time noninteractively
 RUN apt-get update && \
@@ -49,7 +49,7 @@ WORKDIR /workspace/training/lib
 # # Separating the build/install steps gives better stdout/stderr diagnostics
 RUN python setup.py build
 # This is a non-editable install (we need it to put the cuda extensions in the module path which -e does not do)
-RUN python -m pip install --use-feature=in-tree-build .
+RUN python -m pip install .
 # Reset the workspace, needed by following scripts
 WORKDIR /workspace/training
 
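
Picking up the new base image requires rebuilding the training image; a minimal sketch (the image tag is hypothetical and the repo may provide its own build script):

    docker build -t rnnt-train:v1.8.0 training/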
