Wrong usage of meta-protocols subsets in segmentation tasks #1709
Tested versions
Reproducible in 3.2.0, tested with a73ded2
System information
Linux / pyannote.audio 3.12 / pyannote.database 5.1.0 / Python 3.12
Issue description
In the mixins of the segmentation task, filtering is done using
self.prepared_data["audio-metadata"]["subset"] == Subsets.index("train")
This works perfectly with normal protocols, but with meta-protocols it seems to rely on the "original" subset of each file, not the subset defined by the meta-protocol.
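To make the suspected mechanism concrete, here is a toy sketch that does not use pyannote itself. It assumes that each file served by a meta-protocol still carries the subset name of the protocol it came from, and that `Subsets` is ordered as `["train", "development", "test"]`. The file entries and the `build_metadata` helper are hypothetical and only illustrate the comparison quoted above.

```python
import numpy as np

# Assumption: same order as the `Subsets` list used by the segmentation mixins.
Subsets = ["train", "development", "test"]

# Hypothetical files served by the 'train' subset of a meta-protocol.
# Each one still carries the subset name of the protocol it originally came from.
meta_train_files = [
    {"uri": "file1", "subset": "development"},
    {"uri": "file2", "subset": "test"},
]


def build_metadata(files, subset_of):
    """Build a small structured array with one (uri, subset index) row per file."""
    dtype = [("uri", "U16"), ("subset", "i1")]
    rows = [(f["uri"], Subsets.index(subset_of(f))) for f in files]
    return np.array(rows, dtype=dtype)


# Behaviour described in this issue: the file's original subset is stored.
original = build_metadata(meta_train_files, lambda f: f["subset"])

# Possible alternative: store the meta-protocol subset being iterated.
meta = build_metadata(meta_train_files, lambda f: "train")

train_from_original = original[original["subset"] == Subsets.index("train")]
train_from_meta = meta[meta["subset"] == Subsets.index("train")]

print(len(train_from_original), len(train_from_meta))
```

Under these assumptions, the first filter selects nothing even though both files belong to the meta-protocol's train subset, which would match the empty 'train' subset reported here.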
For example in meta protocol:
the 'train' subset will be considered empty (and pyannote will throw errors).
I haven't tested it, but I suppose it "fails silently" (i.e. ignores the set) in other cases where there is data to train on:
Minimal reproduction example (MRE)
https://colab.research.google.com/drive/1kCy30rYG8fWltJfc_xPuX8AdL28y1gMc?usp=sharing