Custom disaster-based train/test splits for xView2 dataset #2416
Conversation
We decided on EuroSATSpatial before; why switch to XView2DistShift now? Will there be any corresponding citations for these new splits? It would be nice to move more of the shared code into the XView2 base class so that the only thing that needs to change in this subclass is the URLs. How different are these datasets?
Spatial refers to the type of distribution shift revealed by the splits when they are rearranged. XView2 consists of multiple disasters, and the distribution shift is determined by the user's choice: any disaster can serve as the training set and another as the test set, which introduces varying types of distribution shift, ranging from near-distribution to far-distribution shifts depending on how different the chosen disasters are. And here, the difference is not limited to spatial factors but also includes temporal and contextual differences. That is why Spatial would be a misleading name for XView2. One alternative could be standardizing the naming for these subset datasets with a suffix like OOD or DistShift. What do you think?
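The disaster-based split idea above can be sketched in isolation. This is a minimal, self-contained illustration of partitioning a disaster-labeled file list into in-distribution (train) and out-of-distribution (test) subsets; the manifest entries and disaster names are illustrative stand-ins, not the real xView2 file layout or the PR's actual API.

```python
def split_by_disaster(files, train_disaster, test_disaster):
    """Return (train, test) file lists selected by disaster name.

    Any pair of disasters can be chosen, yielding anywhere from a
    near-distribution shift (similar disasters) to a far-distribution
    shift (very different disasters).
    """
    train = [f for f in files if f["disaster"] == train_disaster]
    test = [f for f in files if f["disaster"] == test_disaster]
    return train, test


# Illustrative manifest; the real dataset would be scanned from disk.
files = [
    {"image": "hurricane-matthew_00000_pre.png", "disaster": "hurricane-matthew"},
    {"image": "hurricane-matthew_00001_pre.png", "disaster": "hurricane-matthew"},
    {"image": "mexico-earthquake_00000_pre.png", "disaster": "mexico-earthquake"},
]

train, test = split_by_disaster(files, "hurricane-matthew", "mexico-earthquake")
```

Because the split is just a filter over the disaster label, any two disasters define a new benchmark without touching the underlying data.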
They are basically the same dataset but with different splits: XView2DistShift allows users to select specific disasters for the training and testing sets. Are you suggesting we curate the filenames for all disasters as HF links and dynamically load them as training or testing sets based on input? This approach would save us from …
Great dataset. Highly recommend.
This is how it works:

All the existing methods are revised to make XView2DistShift work as it should. I cannot seem to find a way to prune further (unless I upstream a method or two). If it looks good, I can go ahead with unit tests.

@adamjstewart, just to loop you in: as you may have noticed, we (cc: @calebrob6) are upstreaming some modifications to existing datasets to make them suitable for assessing models under controlled domain shifts. This unlocks a whole new research dimension in TG, enabling users to explore robustness, generalization ability, anomaly detection, novelty detection, OOD detection, and more. If it reaches a certain level of maturity, I could even consider spinning it off as a standalone toolkit that also includes methods like our recent OOD detector!
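The subclassing approach described above can be sketched with stand-in classes. This is a hedged illustration of the pattern, not the real torchgeo implementation: the base-class file loading, the constructor parameters (`train_disaster`, `test_disaster`), and the manifest entries are all hypothetical.

```python
class XView2:
    """Stand-in for the real XView2 base class: loads files for a split."""

    def __init__(self, root="data", split="train"):
        self.root = root
        self.split = split
        self.files = self._load_files()

    def _load_files(self):
        # The real dataset would scan `root` on disk; this list is a mock.
        return [
            {"image": "hurricane-matthew_00000.png", "disaster": "hurricane-matthew"},
            {"image": "mexico-earthquake_00000.png", "disaster": "mexico-earthquake"},
        ]


class XView2DistShift(XView2):
    """Reassigns train/test membership by disaster instead of the official split."""

    def __init__(self, root="data", split="train",
                 train_disaster="hurricane-matthew",
                 test_disaster="mexico-earthquake"):
        super().__init__(root=root, split=split)
        # Keep only the files belonging to the disaster chosen for this split.
        wanted = train_disaster if split == "train" else test_disaster
        self.files = [f for f in self.files if f["disaster"] == wanted]


ds = XView2DistShift(split="test")
```

The point of the pattern is that the subclass reuses all of the parent's loading logic and only re-filters the file list, which is why most shared code can live in the base class.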
cc: @calebrob6
XView2DistShift is a subclass of XView2 designed to modify the original train/test splits. Similar to EuroSATSpatial (#2074), this class enables domain adaptation and out-of-distribution (OOD) detection experiments. From the docstring:
TODO: test coverage