Inconsistent librosa versions PyTorch/SpeechSynthesis/All and CUDA-Optimized/FastSpeech #1369
Labels: bug
xvdp added a commit to xvdp/DeepLearningExamples that referenced this issue on Jan 16, 2024:
… latest librosa.
modified: CUDA-Optimized/FastSpeech/fastspeech/dataset/ljspeech_dataset.py
modified: CUDA-Optimized/FastSpeech/generate.py
modified: CUDA-Optimized/FastSpeech/tacotron2/audio_processing.py
modified: CUDA-Optimized/FastSpeech/tacotron2/layers.py
modified: Kaldi/SpeechRecognition/notebooks/Kaldi_TRTIS_inference_offline_demo.ipynb
modified: Kaldi/SpeechRecognition/notebooks/Kaldi_TRTIS_inference_online_demo.ipynb
modified: PyTorch/SpeechRecognition/Jasper/requirements.txt
modified: PyTorch/SpeechRecognition/QuartzNet/requirements.txt
modified: PyTorch/SpeechRecognition/wav2vec2/requirements.txt
modified: PyTorch/SpeechSynthesis/FastPitch/hifigan/data_function.py
modified: PyTorch/SpeechSynthesis/FastPitch/requirements.txt
modified: PyTorch/SpeechSynthesis/HiFiGAN/requirements.txt
modified: PyTorch/SpeechSynthesis/Tacotron2/notebooks/conversationalai/client/speech_ai_demo/utils/jasper/speech_utils.py
modified: PyTorch/SpeechSynthesis/Tacotron2/trtis_cpp/src/trt/requirements.txt
PyTorch/SpeechSynthesis/All and CUDA-Optimized/FastSpeech
librosa is used throughout the audio projects, although only a few of its functions are called. The requirements files pin different versions, and not all of the calling code is consistent with the versions those files require.
The main change in librosa releases after 0.7 is that many functions now require keyword arguments; typically the only positional argument still allowed is the data.
e.g.
librosa.core.resample(y: 'np.ndarray', *, orig_sr: 'float', target_sr: 'float', .. etc
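To illustrate what the bare `*` in that signature means for callers, here is a minimal sketch using a stand-in function. `resample_like` is hypothetical and is not librosa code; it only copies the keyword-only shape of the signature quoted above.

```python
def resample_like(y, *, orig_sr, target_sr):
    """Stand-in with the same keyword-only shape as the quoted signature."""
    # Not a real resampler: only report the length a resampled signal would have.
    return round(len(y) * target_sr / orig_sr)


samples = [0.0] * 44100  # one second of silence at 44.1 kHz

# Keyword form: accepted.
new_len = resample_like(samples, orig_sr=44100, target_sr=22050)

# Positional form, as written at the older call sites: rejected with TypeError.
try:
    resample_like(samples, 44100, 22050)
    positional_ok = True
except TypeError:
    positional_ok = False
```

Everything after the `*` has to be passed by name, so a call written for the older positional style stops at the call site instead of running with misassigned arguments.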
PyTorch/SpeechSynthesis/Tacotron2/requirements.txt
requires librosa
PyTorch/SpeechSynthesis/Tacotron2/trtis_cpp/src/trt/requirements.txt
librosa==0.7.0
PyTorch/SpeechSynthesis/HiFiGAN/requirements.txt
librosa==0.9.0
PyTorch/SpeechSynthesis/FastPitch/requirements.txt
librosa==0.9.0
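A rough way to spot this kind of drift is to collect the librosa line from each requirements file and group the paths by pinned version. The sketch below is an illustration only: the file contents are reduced to the librosa lines quoted in this issue (the real files list other packages too), the helper name `librosa_pins` is made up, and only bare `librosa` or `librosa==X` lines are recognised.

```python
import re
from collections import defaultdict

# librosa lines as quoted in this issue; in practice, read each file from disk.
REQUIREMENTS = {
    "PyTorch/SpeechSynthesis/Tacotron2/requirements.txt": "librosa\n",
    "PyTorch/SpeechSynthesis/Tacotron2/trtis_cpp/src/trt/requirements.txt": "librosa==0.7.0\n",
    "PyTorch/SpeechSynthesis/HiFiGAN/requirements.txt": "librosa==0.9.0\n",
    "PyTorch/SpeechSynthesis/FastPitch/requirements.txt": "librosa==0.9.0\n",
}

# Matches "librosa" or "librosa==<version>", optionally followed by a comment.
PIN = re.compile(r"^\s*librosa\s*(?:==\s*([\w.]+))?\s*(?:#.*)?$", re.IGNORECASE)


def librosa_pins(requirements):
    """Group requirements file paths by the librosa version they pin."""
    pins = defaultdict(list)
    for path, text in requirements.items():
        for line in text.splitlines():
            match = PIN.match(line)
            if match:
                pins[match.group(1) or "unpinned"].append(path)
    return dict(pins)


pins = librosa_pins(REQUIREMENTS)
consistent = len(pins) == 1  # True only if every file agrees

for version, paths in sorted(pins.items()):
    print(version, "->", len(paths), "file(s)")
```

With the four files above this reports three different states (0.7.0, 0.9.0 and unpinned), which is the inconsistency described here.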
For consistency, they should all require the same version. All but one function (listed below) can run on librosa 0.10.
librosa_mel_fn(sampling_rate, n_fft, num_mels, fmin, fmax)
samples = librosa.core.resample(samples, sample_rate, target_sr)
librosa.effects.trim(samples, trim_db)
* CUDA-Optimized/FastSpeech/generate.py uses deprecated
librosa.output.write_wav(path, wav, hp.sr)
see librosa/librosa#1062
win_sq = librosa_util.pad_center(win_sq, n_fft)
Several of those calls will fail on newer librosa releases. It is simple enough to clean up the code.
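One possible cleanup, sketched below, is to pass everything except the data by keyword, which should work on both the older and the newer releases. Several things here are assumptions rather than facts from the repository: the wrapper names are made up, `librosa_mel_fn` is taken to be an alias of `librosa.filters.mel` and `librosa_util` an alias of `librosa.util`, and `soundfile.write` is used as the stand-in for the removed `librosa.output.write_wav`. Both librosa and soundfile need to be installed to actually call these.

```python
def mel_filterbank(sampling_rate, n_fft, num_mels, fmin, fmax):
    # was: librosa_mel_fn(sampling_rate, n_fft, num_mels, fmin, fmax)
    import librosa
    return librosa.filters.mel(
        sr=sampling_rate, n_fft=n_fft, n_mels=num_mels, fmin=fmin, fmax=fmax
    )


def resample_audio(samples, sample_rate, target_sr):
    # was: librosa.core.resample(samples, sample_rate, target_sr)
    import librosa
    return librosa.resample(samples, orig_sr=sample_rate, target_sr=target_sr)


def trim_silence(samples, trim_db):
    # was: librosa.effects.trim(samples, trim_db)
    import librosa
    return librosa.effects.trim(samples, top_db=trim_db)


def pad_window(win_sq, n_fft):
    # was: librosa_util.pad_center(win_sq, n_fft)
    import librosa
    return librosa.util.pad_center(win_sq, size=n_fft)


def save_wav(path, wav, sr):
    # was: librosa.output.write_wav(path, wav, hp.sr)
    import soundfile as sf
    sf.write(path, wav, sr)
```

In the actual code the keyword arguments would more likely be added directly at each call site; the wrappers are only there to show the old and new forms side by side.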
Environment
* Driver Version: 535.129.03
* NVIDIA GeForce RTX 3080