
Why is pyannote not using my GPU or CPU? So slow too. #1702

Open
CrackerHax opened this issue May 2, 2024 · 4 comments
CrackerHax commented May 2, 2024

Tested versions

pyannote.audio 3.1 (latest)

System information

Windows 11, AMD 5950X CPU, Ubuntu 20.04, Python 3.9, latest pyannote.audio 3.1

Issue description

CPU at 10%, both Nvidia RTX 3080s at 0%. I only see the model taking up GPU memory, and it is very small. I am using the sample code provided in the README.

Minimal reproduction example (MRE)

Use the sample speaker diarization code from the README:

import torch
import pyannote.audio

# Load the pretrained pipeline and move it to the GPU
diarization = pyannote.audio.Pipeline.from_pretrained(
    "pyannote/speaker-diarization-3.1",
    use_auth_token="<my token>").to(torch.device("cuda"))

diarization_output = diarization(filename)

[Screenshots: GPU memory allocated, but CPU and GPU utilization near zero]
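The 3.1 README also documents a progress hook; a sketch along those lines (assuming pyannote.audio 3.1 and the same <my token> and filename placeholders as above) at least makes it visible how far along the pipeline is while it runs:

import torch
from pyannote.audio import Pipeline
from pyannote.audio.pipelines.utils.hook import ProgressHook

pipeline = Pipeline.from_pretrained(
    "pyannote/speaker-diarization-3.1",
    use_auth_token="<my token>").to(torch.device("cuda"))

# ProgressHook prints per-step progress while the pipeline runs
with ProgressHook() as hook:
    diarization_output = pipeline(filename, hook=hook)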

CrackerHax reopened this May 3, 2024
@freshpearYoon

Hi, I am facing the same issue. Did you solve the problem?


CrackerHax commented May 6, 2024

> Hi, I am facing the same issue. Did you solve the problem?

No, I didn't. I just ended up letting it take forever. It eventually does its job without fully utilizing my GPU(s).


amas0 commented May 8, 2024

Just wanted to chime in and say that I'm seeing similar-ish issues. I do see utilization on my GPU, so it might not be the same thing, but I have general performance problems. I may have localized the issue to passing an audio file path to the pipeline directly.

TL;DR -- try preprocessing your audio into a waveform and running it that way. I dropped my processing from 50 seconds down to 12 seconds by doing so.

I had a 3-minute clip that I used as a test here. Passing it into the pipeline as a path, e.g.

pipeline = Pipeline.from_pretrained(...)
pipeline('audio.mp3')

took about 50 seconds overall. Doing some profiling, I found that the code was spending a ton of time in the function pyannote.audio.core.io.Audio.crop.
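A minimal sketch of that kind of profiling run (assuming cProfile and the same 'audio.mp3' test clip; the exact setup isn't shown in the comment):

import cProfile
import pstats

from pyannote.audio import Pipeline

pipeline = Pipeline.from_pretrained("pyannote/speaker-diarization-3.1",
                                    use_auth_token="<my token>")

# Profile one diarization run and print the 20 most expensive calls
# sorted by cumulative time
profiler = cProfile.Profile()
profiler.enable()
pipeline("audio.mp3")
profiler.disable()
pstats.Stats(profiler).sort_stats("cumulative").print_stats(20)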

Specifically, this snippet in Audio.crop:

if "waveform" in file:
waveform = file["waveform"]
frames = waveform.shape[1]
sample_rate = file["sample_rate"]
elif "torchaudio.info" in file:
info = file["torchaudio.info"]
frames = info.num_frames
sample_rate = info.sample_rate
else:
info = get_torchaudio_info(file)
frames = info.num_frames
sample_rate = info.sample_rate

where passing an audio file directly follows the else: block and spends a lot of time, seemingly doing file I/O, loading the file for get_torchaudio_info. The docstring of that function claims that it should cache the output of torchaudio.info, but I couldn't quite grok where it was caching it. Manually implementing a cache there lowered my runtime to 33 seconds or so. So a big chunk, but not everything. I poked a bit more to see what was going on, and it still seemed to have all the performance issues in that one block for the audio file processing.
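Based on the elif "torchaudio.info" branch above, one untested workaround sketch is to precompute torchaudio.info once and pass it alongside the path, so the else: branch never runs (the dict form with an "audio" key is an assumption here, not something verified against pyannote's docs):

import torchaudio
from pyannote.audio import Pipeline

pipeline = Pipeline.from_pretrained("pyannote/speaker-diarization-3.1",
                                    use_auth_token="<my token>")

# Read the file's metadata once up front; Audio.crop should then take the
# elif branch instead of re-reading the header for every chunk
file = {"audio": "audio.mp3",
        "torchaudio.info": torchaudio.info("audio.mp3")}
diarization = pipeline(file)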

Instead of running it completely to ground, since it seemed to mostly be a problem with audio file processing, I tried preprocessing via a torchaudio load, as recommended on the HF page:

import torchaudio

# Load the audio into memory once and pass the waveform to the pipeline
waveform, sample_rate = torchaudio.load("audio.wav")
diarization = pipeline({"waveform": waveform, "sample_rate": sample_rate})

This resulted in my 6 min clip being diarized in 12 seconds, and I saw solid GPU utilization along the way.


melMass commented May 28, 2024

Probably not directly related, but the way pyannote is currently packaged overwrites your environment's torch and installs the CPU version instead... I spent way too long tracking this down to this project, so that might be it...

You can quickly check with: python -c "import torch;print(torch.cuda.is_available())"
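To tell a CPU-only wheel apart from a driver or GPU-visibility problem, a slightly longer check (a sketch; the version strings are just examples):

import torch

print(torch.__version__)          # a "+cpu" suffix suggests a CPU-only wheel
print(torch.version.cuda)         # None when torch was built without CUDA
print(torch.cuda.is_available())  # False if torch cannot see the GPUs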
