Model @ 24kHz : Runtime Error: shape is invalid #80

Open
@Guppy16

Description

When win_duration is greater than or equal to the length of the audio (audio_signal.signal_duration), the model fails to decompress the audio: model.decompress() raises a shape error. The error appears for only some samples, and I am not sure why. The code below reproduces it:

from audiotools import AudioSignal
import dac
from datasets import load_dataset

model_type = "24khz"
model_path = dac.utils.download(model_type=model_type)
model = dac.DAC.load(model_path)

ds = load_dataset("DynamicSuperb/SourceSeparation_libri2Mix_test", split="test")
ds = ds.with_format("torch")

# find the index of the failing case
fname = "data/test/1089-134686-0008_8230-279154-0012.wav"
for i, d in enumerate(ds):
    if d["file"] == fname:
        break

# Load audio signal
a = AudioSignal(ds[i]["audio"]["array"], sample_rate=ds[i]["audio"]["sampling_rate"].item())

# Run model
z = model.compress(a, win_duration=5.0)
synth = model.decompress(z)  # <--- Error

This gives the following error and traceback:

Traceback (most recent call last):
  File "/home/.idea/dac_codec_fail.py", line 24, in <module>
    synth = model.decompress(z)  # <--- Error
  File "/usr/local/lib/python3.10/dist-packages/torch/utils/_contextlib.py", line 115, in decorate_context
    return func(*args, **kwargs)
  File "/usr/local/lib/python3.10/dist-packages/dac/model/base.py", line 289, in decompress
    recons.audio_data = recons.audio_data.reshape(
RuntimeError: shape '[-1, 1, 58240]' is invalid for input of size 58234
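
To make the error message concrete, here is a small arithmetic sketch using only the two numbers from the traceback. It assumes nothing about the model internals. It shows that the tensor handed to reshape holds slightly fewer elements than one full frame of the requested length, so the -1 dimension cannot be inferred.

```python
# Numbers taken from the RuntimeError in the traceback above.
expected_len = 58240  # last dimension requested by reshape
actual_size = 58234   # total number of elements in the tensor

# reshape(-1, 1, expected_len) only works when the element count is a
# multiple of 1 * expected_len, so that -1 resolves to a whole number.
frames, remainder = divmod(actual_size, expected_len)
shortfall = expected_len - actual_size

print(f"whole frames: {frames}, leftover elements: {remainder}")
print(f"samples missing for one full frame: {shortfall}")
```

So the reconstructed audio appears to be a handful of samples shorter than the length the reshape expects. Where those samples are lost is not established here.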

Please note: setting win_duration=1.0 avoids this error, but the model is then substantially slower.
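
A possible stopgap, based only on the two observations in this report (the failure occurs when win_duration >= signal duration, and win_duration=1.0 works for this sample), is to use the larger window only when the clip is strictly longer than it. The helper below is hypothetical and not part of dac. It is an untested sketch, and it would not help for clips shorter than the fallback window.

```python
def pick_win_duration(signal_duration: float,
                      preferred: float = 5.0,
                      fallback: float = 1.0) -> float:
    """Hypothetical helper (not part of dac): choose a window length in seconds.

    Uses the preferred (faster) window only when the clip is strictly longer
    than it; otherwise falls back to the smaller window reported to work here.
    """
    if signal_duration > preferred:
        return preferred
    return fallback


# Intended use with the reproduction script above (assumption, not verified):
#   win = pick_win_duration(a.signal_duration)
#   z = model.compress(a, win_duration=win)
#   synth = model.decompress(z)
print(pick_win_duration(12.0))  # long clip: keeps the 5.0 s window
print(pick_win_duration(3.64))  # short clip: falls back to 1.0 s
```

This keeps the faster 5-second window for long files and pays the slowdown only on short ones. It works around the symptom and leaves the underlying length mismatch in decompress unfixed.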
