Add a split generator that looks for config- and split-specific index files (train_index or train/index).
Index files let us subset both the parquet files and the examples within them.
We then apply ds.filter before returning the dataset.
There might be a more efficient, Arrow-native way to implement the filter.
(This could also go directly into the YAML, but the index-file solution is more modular.)
This assumes the dataset has an id field and an index.
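
A minimal sketch of the filtering step, to make the idea concrete. It is not the proposed implementation: the index file format (one id per line), the helper names load_index and make_keep_fn, and the field name "id" are all assumptions for illustration. The demo runs on a plain list of dicts so that it has no dependencies.

```python
import tempfile
from pathlib import Path


def load_index(path):
    """Read an index file into a set of ids.

    Assumed format (not specified in this issue): one example id per line.
    """
    with open(path, encoding="utf-8") as f:
        return {line.strip() for line in f if line.strip()}


def make_keep_fn(index_ids, id_field="id"):
    """Build a predicate that keeps only examples whose id is in the index."""
    def keep(example):
        return str(example[id_field]) in index_ids
    return keep


# Stand-in for the loaded examples; the real dataset would come from parquet.
examples = [
    {"id": "a", "text": "first"},
    {"id": "b", "text": "second"},
    {"id": "c", "text": "third"},
]

with tempfile.TemporaryDirectory() as tmp:
    # Hypothetical split-specific index file named after the split.
    index_path = Path(tmp) / "train_index"
    index_path.write_text("a\nc\n", encoding="utf-8")
    keep = make_keep_fn(load_index(index_path))

subset = [ex for ex in examples if keep(ex)]
```

The same predicate could be passed to ds.filter(keep) on a Hugging Face dataset, since filter accepts a function that takes an example and returns a bool. Holding the ids in a set keeps each lookup constant-time, though a per-example Python callback is still the slow path that an Arrow-level filter would avoid.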