fine-tune whisper vi

jupyter notebooks to fine-tune whisper models on vietnamese using kaggle (should also work on colab but not thoroughly tested)

using my collection of vietnamese speech datasets: https://huggingface.co/collections/doof-ferb/vietnamese-speech-dataset-65c6af8c15c9950537862fa6

N.B.1 importing any trainer or pipeline class from transformers crashes the kaggle TPU session (see huggingface/transformers#28609), so better use GPU

N.B.2 ~~trainer class from transformers can auto use multi-GPU like kaggle free T4×2 without code change~~ by default the trainer uses naive model parallelism, which cannot keep all GPUs busy at the same time, so better use distributed data parallelism
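to illustrate N.B.2: with distributed data parallelism every GPU holds a full copy of the model and works on its own slice of the data, so all GPUs are busy at once. a minimal pure-python sketch of how sample indices could be split across processes, similar in spirit to torch's DistributedSampler without shuffling. the function name `shard_indices` is hypothetical and not taken from the notebooks:

```python
from typing import List


def shard_indices(num_samples: int, world_size: int, rank: int) -> List[int]:
    """Return the sample indices that one process (rank) would handle."""
    if not 0 <= rank < world_size:
        raise ValueError("rank must be in [0, world_size)")
    indices = list(range(num_samples))
    # pad by cycling over the existing samples so every rank gets the same count
    pad = (-num_samples) % world_size
    indices += [indices[i % num_samples] for i in range(pad)]
    # interleaved split: rank 0 takes 0, N, 2N, ...; rank 1 takes 1, N+1, ...
    return indices[rank::world_size]


if __name__ == "__main__":
    # 2 processes, e.g. one per GPU on kaggle T4x2
    for rank in range(2):
        print(rank, shard_indices(7, world_size=2, rank=rank))
```

each process then runs forward/backward on its own shard and gradients are averaged across processes, which is why the effective batch size grows with the number of GPUs.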

N.B.3 use the default greedy search, because beam search triggers a spike in VRAM usage which may cause out-of-memory (original whisper uses num beams = 5, something like do_sample=True, num_beams=5)
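a small sketch of the two decoding setups from N.B.3, written as keyword arguments for the transformers `generate` method. the helper name `build_generate_kwargs` is hypothetical; `num_beams=5` mirrors the value quoted above, and leaving `do_sample=False` in both cases is an assumption:

```python
def build_generate_kwargs(use_beam_search: bool = False) -> dict:
    """Keyword arguments meant to be passed as model.generate(input_features, **kwargs)."""
    if use_beam_search:
        # 5 hypotheses per sample are kept alive during decoding, so more VRAM
        return {"num_beams": 5, "do_sample": False}
    # greedy search: a single hypothesis per sample (the transformers default)
    return {"num_beams": 1, "do_sample": False}


if __name__ == "__main__":
    print(build_generate_kwargs())
    print(build_generate_kwargs(use_beam_search=True))
```

if beam search is still wanted, lowering the evaluation batch size is one way to trade speed for memory.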

N.B.4 if using kaggle + resuming training, remember to enable file persistence before launching
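for resuming as in N.B.4, the transformers trainer saves checkpoints in folders named `checkpoint-<step>` inside its output directory. a minimal sketch for picking the most recent one after a session restart, assuming those folders survived thanks to file persistence. the helper name and the output path in the usage comment are hypothetical (transformers also ships a similar helper, `get_last_checkpoint`, in `transformers.trainer_utils`):

```python
import os
import re
from typing import Optional

_CHECKPOINT_RE = re.compile(r"^checkpoint-(\d+)$")


def find_last_checkpoint(output_dir: str) -> Optional[str]:
    """Return the checkpoint-<step> folder with the highest step, or None."""
    if not os.path.isdir(output_dir):
        return None
    best_step, best_path = -1, None
    for name in os.listdir(output_dir):
        match = _CHECKPOINT_RE.match(name)
        path = os.path.join(output_dir, name)
        if match and os.path.isdir(path) and int(match.group(1)) > best_step:
            best_step, best_path = int(match.group(1)), path
    return best_path


# hypothetical usage with an already constructed trainer:
# trainer.train(resume_from_checkpoint=find_last_checkpoint("/kaggle/working/my-output"))
```

when no checkpoint is found the function returns None, and passing None to `resume_from_checkpoint` simply starts training from the beginning.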

scripts

evaluate accuracy (WER) with batched inference:

on whisper models: evaluate-whisper.ipynb
on whisper with PEFT LoRA: evaluate-whisper-lora.ipynb
on wav2vec BERT v2 models: evaluate-w2vBERT.ipynb
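for reference, WER is the number of word-level edits (substitutions, deletions, insertions) needed to turn the prediction into the reference, divided by the number of reference words. a minimal pure-python sketch, computed over the whole corpus; the notebooks may well rely on a library metric instead, and text normalization before scoring is left out here:

```python
from typing import List


def word_error_rate(references: List[str], predictions: List[str]) -> float:
    """Corpus-level WER: total word edits divided by total reference words."""
    total_edits, total_words = 0, 0
    for ref, hyp in zip(references, predictions):
        ref_words, hyp_words = ref.split(), hyp.split()
        # classic Levenshtein distance on words, one row at a time
        prev = list(range(len(hyp_words) + 1))
        for i, r in enumerate(ref_words, start=1):
            curr = [i]
            for j, h in enumerate(hyp_words, start=1):
                curr.append(min(
                    prev[j] + 1,             # deletion of a reference word
                    curr[j - 1] + 1,         # insertion of an extra predicted word
                    prev[j - 1] + (r != h),  # substitution, or free match
                ))
            prev = curr
        total_edits += prev[-1]
        total_words += len(ref_words)
    if total_words == 0:
        raise ValueError("references contain no words")
    return total_edits / total_words


if __name__ == "__main__":
    print(word_error_rate(["xin chào việt nam"], ["xin chao việt"]))  # 0.5
```

in the usage line, one substitution ("chào" vs "chao") plus one deletion ("nam") over 4 reference words gives 0.5.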

fine-tune whisper tiny with traditional approach:

script: whisper-tiny-traditional.ipynb
model with evaluated WER: https://huggingface.co/doof-ferb/whisper-tiny-vi

fine-tune whisper large with PEFT-LoRA + int8:

script for 1 GPU: whisper-large-lora.ipynb
script for multi-GPU using distributed data parallelism: whisper-large-lora-DDP.ipynb
model with evaluated WER: https://huggingface.co/doof-ferb/whisper-large-peft-lora-vi
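a rough arithmetic sketch of why LoRA is cheap to train on whisper large: each adapted weight matrix only gains two small low-rank matrices. the settings below (rank 32, adapters on `q_proj` and `v_proj` only) are assumptions for illustration and may differ from the notebooks; the architecture figures (hidden size 1280, 32 encoder + 32 decoder layers) are those of whisper large:

```python
def lora_trainable_params(d_model: int, rank: int, num_adapted_matrices: int) -> int:
    """Each adapted d_model x d_model weight gains A (rank x d_model) and B (d_model x rank)."""
    return num_adapted_matrices * rank * (d_model + d_model)


# assumed setup: adapters on q_proj and v_proj only, rank 32
encoder_matrices = 32 * 2      # 32 layers, self-attention q_proj + v_proj
decoder_matrices = 32 * 2 * 2  # 32 layers, self-attention and cross-attention
total = lora_trainable_params(1280, 32, encoder_matrices + decoder_matrices)
print(total)  # 15728640
```

that is about 15.7 million trainable parameters, roughly 1% of the ~1.5 billion parameters of the full model.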

(testing - not always working) fine-tune wav2vec BERT v2: w2v-bert-v2.ipynb

docker image to run on AWS EC2: Dockerfile, comes with standalone scripts

convert to openai-whisper, whisper.cpp, faster-whisper, ONNX, TensorRT: not yet

miscellaneous: convert to huggingface audio datasets format
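one common way to get a folder of audio files into huggingface audio datasets format is the `audiofolder` loader, which expects a `metadata.csv` with a `file_name` column next to the audio files. a minimal sketch of writing that file; the helper name and the `transcription` column name are assumptions, and the actual scripts in this repo may work differently:

```python
import csv
import os
from typing import Dict


def write_audiofolder_metadata(folder: str, transcripts: Dict[str, str]) -> str:
    """Write metadata.csv mapping each audio file name to its transcription."""
    path = os.path.join(folder, "metadata.csv")
    with open(path, "w", newline="", encoding="utf-8") as f:
        writer = csv.writer(f)
        writer.writerow(["file_name", "transcription"])
        for file_name, text in sorted(transcripts.items()):
            writer.writerow([file_name, text])
    return path


# then, with the datasets library installed (not run here):
# from datasets import load_dataset
# ds = load_dataset("audiofolder", data_dir=folder)
```

the file names in `metadata.csv` are relative to the folder that contains it.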

