Report corrupt TIFF files, filter load_data where images are actually missing #76
Comments
Thanks @hanslovsky for flagging these. cc'ing @shntnu to bring this to his attention.
I downloaded all sources except source 11 (still working on that) and found only one additional corrupt file in source 3. All other sources (except 11) did not have corrupt files.
Thank you so much for reporting this, @hanslovsky.
I did run
@Arkkienkeli your findings are consistent with mine (I did not report any corrupted images that are not in the metadata), with the exception of the one image in source 11. I did not report anything for source 11 in this issue because I was still working on it at that time. I will double-check my records to see if I have any notes on corrupted files for source 11. I know that I reported missing images for source 11 in #78, but I don't know if that includes any corrupted images. cc @shntnu
@Arkkienkeli I just double-checked the images I reported missing in source 11 (source_11-404.txt), and the image you reported as corrupted is in that list as well. I can now say conclusively that our reports are consistent. Please note that I also found some images in source 11 that were simply not present, in plates EC000038 and EC000066.
I will drop in some notes for now
Internal notes
Alright, overall
@hanslovsky @Arkkienkeli -- thank you so much for reporting this! You can proceed by simply ignoring these images. Our task is to update the load_data files to remove the discrepancy.
Regarding the corrupted files, we should likely use the same strategy: drop them from load_data. @Arkkienkeli -- you can proceed by ignoring these images, because we no longer have access to the originals (thankfully that's only 34 images out of the gazillion).
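For anyone who wants to apply the "drop them from load_data" strategy locally in the meantime, here is a minimal sketch. It assumes the load_data CSVs follow the usual CellProfiler layout of paired `PathName_<channel>` / `FileName_<channel>` columns, and that the bad images are available as a set of path strings built the same way (directory plus file name). The function names and the matching rule are hypothetical, not the maintainers' actual fix. A row is dropped if any one of its channel images is in the bad set.

```python
import csv
import posixpath


def filter_load_data(rows, bad_images):
    """Keep only rows in which no channel image is listed in bad_images.

    rows: iterable of dicts, e.g. from csv.DictReader.
    bad_images: set of "<PathName>/<FileName>" strings (assumed convention).
    """
    kept = []
    for row in rows:
        images = set()
        for column, file_name in row.items():
            if column.startswith("FileName_"):
                channel = column[len("FileName_"):]
                # Pair each FileName_* column with its PathName_* column.
                directory = row.get("PathName_" + channel, "")
                images.add(posixpath.join(directory, file_name))
        if images.isdisjoint(bad_images):
            kept.append(row)
    return kept


def rewrite_load_data(src_csv, dst_csv, bad_images):
    """Write a filtered copy of src_csv to dst_csv; return rows dropped."""
    with open(src_csv, newline="") as src:
        reader = csv.DictReader(src)
        fieldnames = reader.fieldnames
        rows = list(reader)
    kept = filter_load_data(rows, bad_images)
    with open(dst_csv, "w", newline="") as dst:
        writer = csv.DictWriter(dst, fieldnames=fieldnames)
        writer.writeheader()
        writer.writerows(kept)
    return len(rows) - len(kept)
```

Dropping the whole row rather than blanking one channel keeps every remaining site complete across channels, at the cost of losing the intact channels of an affected site.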
I found a few corrupt TIFF files in the JUMP production dataset. So far, I have only seen corrupt TIFF files in sources 1 and 7 (4 files each). I will report back any additional corrupt TIFF files that I may find during my download/conversion.
Here is what I have so far:
How to confirm that these files are corrupt:
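One possible way to run such a check without extra dependencies is sketched below. This is an illustration only, not necessarily the check the reporter used. It assumes classic (non-BigTIFF), strip-based TIFF files and inspects only the first image directory. It verifies the header, then confirms that every strip's pixel data lies inside the file, which catches truncated downloads. It does not decode pixels, so it will miss corruption inside otherwise well-formed strips, and it reports tiled TIFFs as unsupported. The function name `tiff_problem` is hypothetical.

```python
import os
import struct

TYPE_SIZES = {3: 2, 4: 4}      # TIFF field types SHORT and LONG
TYPE_CODES = {3: "H", 4: "I"}
STRIP_OFFSETS, STRIP_BYTE_COUNTS = 273, 279


def _read_values(f, endian, ftype, count, value_field):
    """Return the values of one IFD entry (stored inline or at an offset)."""
    size = TYPE_SIZES[ftype] * count
    if size <= 4:
        data = value_field[:size]          # values fit in the entry itself
    else:
        (offset,) = struct.unpack(endian + "I", value_field)
        f.seek(offset)
        data = f.read(size)
        if len(data) != size:
            raise ValueError("tag values extend past end of file")
    return struct.unpack(endian + TYPE_CODES[ftype] * count, data)


def tiff_problem(path):
    """Return a description of the first structural problem, or None."""
    file_size = os.path.getsize(path)
    with open(path, "rb") as f:
        header = f.read(8)
        if len(header) < 8:
            return "file shorter than TIFF header"
        if header[:2] == b"II":
            endian = "<"
        elif header[:2] == b"MM":
            endian = ">"
        else:
            return "bad byte-order mark"
        magic, ifd_offset = struct.unpack(endian + "HI", header[2:8])
        if magic != 42:
            return "bad magic number"      # also rejects BigTIFF (43)
        f.seek(ifd_offset)
        raw = f.read(2)
        if len(raw) < 2:
            return "IFD offset past end of file"
        (n_entries,) = struct.unpack(endian + "H", raw)
        entries = f.read(12 * n_entries)
        if len(entries) < 12 * n_entries:
            return "truncated IFD"
        tags = {}
        for i in range(n_entries):
            entry = entries[12 * i:12 * i + 12]
            tag, ftype, count = struct.unpack(endian + "HHI", entry[:8])
            if tag in (STRIP_OFFSETS, STRIP_BYTE_COUNTS) and ftype in TYPE_SIZES:
                try:
                    tags[tag] = _read_values(f, endian, ftype, count, entry[8:])
                except ValueError as exc:
                    return str(exc)
    offsets = tags.get(STRIP_OFFSETS)
    counts = tags.get(STRIP_BYTE_COUNTS)
    if offsets is None or counts is None or len(offsets) != len(counts):
        return "no usable strip tags (tiled or unsupported layout)"
    for offset, n_bytes in zip(offsets, counts):
        if offset + n_bytes > file_size:
            return "strip data extends past end of file"
    return None
```

Calling `tiff_problem(path)` on each downloaded file and collecting the paths for which it returns a message gives a candidate list to cross-check with a full decoder before reporting.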
Notes: