Description
Problem Description
When running speech recognition with OpenVINO WhisperPipeline, setting num_beams=2 or higher causes a batch size mismatch error.
Error Message
Traceback (most recent call last):
File "", line 1, in
RuntimeError: Exception from src/core/src/runtime/tensor.cpp:121:
Exception from src/inference/dev_api/openvino/runtime/iremote_tensor.hpp:32:
Not Implemented
Code
from librosa import load
from openvino_genai import WhisperPipeline
audio, _ = load("./recording.wav", sr=16000)
pipe = WhisperPipeline("./whisper_model/whisper-medium/", device="GPU")
results = pipe.generate(
    audio,
    language="<|en|>",
    num_beams=2,  # Setting this value to 2 or higher causes the error
)
Expected Behavior
Speech recognition should work normally on the iGPU when Beam Search is enabled (num_beams = 2 or higher).
Additional Information
- Using pipe = WhisperPipeline("./whisper_model/whisper-medium/", device="CPU") works.
- Using GPU without beam search works (num_beams = 1).
- See also: [WhisperPipeline] Batch Size Mismatch Error When Using Beam Search #2069
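The device and beam-count combinations listed above could be checked in one pass with something like the sketch below. The names probe_beam_search and run_on_real_hardware are hypothetical helpers written for this report, not part of openvino_genai; the only library calls used are the WhisperPipeline constructor and generate() call already shown in the code section, and the model and audio paths are the same ones assumed there.

```python
def probe_beam_search(make_pipeline, audio, devices=("CPU", "GPU"), beam_counts=(1, 2)):
    """Run generate() for every device / num_beams pair and record the outcome.

    make_pipeline is any callable that takes a device name and returns an
    object with a generate(audio, language=..., num_beams=...) method.
    """
    outcomes = {}
    for device in devices:
        pipe = make_pipeline(device)
        for num_beams in beam_counts:
            try:
                pipe.generate(audio, language="<|en|>", num_beams=num_beams)
                outcomes[(device, num_beams)] = "ok"
            except RuntimeError as exc:
                # Keep only the first line of the message so the table stays short.
                first_line = (str(exc).splitlines() or [""])[0]
                outcomes[(device, num_beams)] = "RuntimeError: " + first_line
    return outcomes


def run_on_real_hardware():
    """Hypothetical entry point; requires librosa, openvino_genai and the model files."""
    from librosa import load
    from openvino_genai import WhisperPipeline

    audio, _ = load("./recording.wav", sr=16000)

    def make_pipeline(device):
        return WhisperPipeline("./whisper_model/whisper-medium/", device=device)

    for (device, num_beams), outcome in probe_beam_search(make_pipeline, audio).items():
        print(f"{device} num_beams={num_beams}: {outcome}")
```

Calling run_on_real_hardware() on the affected machine prints one line per combination, which makes it easy to confirm that only the GPU runs with num_beams of 2 or higher fail.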
CPU Information
Intel Core Ultra 5 125H
OS Information
PRETTY_NAME="Ubuntu 24.04.1 LTS"
Python Version
Python 3.12.3