idr0015-colin-taraoceans S-BIAD861 #645

Open
will-moore opened this issue Feb 22, 2023 · 82 comments

Comments

@will-moore
Member

idr0015-colin-taraoceans

@will-moore will-moore moved this to test convert in NGFF conversion Feb 22, 2023
@dominikl dominikl moved this from test convert to re-import test image in NGFF conversion Feb 27, 2023
@pwalczysko

Reimport still in progress - cancelled once because of a long wait on FILESET_UPLOAD_PREP.
The new import has been in progress since 8 March, also at FILESET_UPLOAD_PREP (with parallel-upload=10).

@jburel
Member

jburel commented Mar 10, 2023

Since we will be working on that study, we should take the opportunity to also fix the location metadata.

@pwalczysko

pwalczysko commented Apr 5, 2023

Imported without chunks and exchanged the symlink in ManagedRepo similarly to the idr0013 case. The new Plate on pilot-idrtesting is http://localhost:1080/webclient/?show=plate-254 and the name is idr0015-nochunks. All looks good, the thumbs and full viewer work fine.

@pwalczysko pwalczysko moved this from re-import test image to convert all data to NGFF in NGFF conversion Apr 5, 2023
@will-moore
Member Author

Estimate data volume...

uint8, 4 channels, Z: 20, 2048 x 2048, 22 x 18 wells, 84 plates.
11 TB
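
The 11 TB figure can be reproduced with shell arithmetic. This is a rough sketch of the uncompressed size only, assuming one 2048 x 2048 field per well and 1 byte per pixel for uint8; `estimate_bytes` is a hypothetical helper name.

```shell
# Uncompressed size estimate. Assumptions: one field per well, uint8 = 1 byte/pixel.
estimate_bytes() {
  # args: size_x size_y channels size_z wells_a wells_b plates bytes_per_pixel
  echo $(( $1 * $2 * $3 * $4 * $5 * $6 * $7 * $8 ))
}

total=$(estimate_bytes 2048 2048 4 20 22 18 84 1)
echo "bytes: $total"
echo "TB (10^12 bytes, rounded down): $(( total / 1000000000000 ))"
echo "GB per plate (10^9 bytes, rounded down): $(( total / 84 / 1000000000 ))"
```

The total comes to 11161546260480 bytes, i.e. about 11 TB. The converted plates reported later in this thread are smaller than the per-plate figure, presumably because the zarr chunks are compressed.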

@will-moore
Member Author

will-moore commented Jul 2, 2023

Starting to free-up some space...

(base) [wmoore@pilot-zarr1-dev data]$ df -h /data
Filesystem      Size  Used Avail Use% Mounted on
/dev/vdb        4.9T  4.4T  587G  89% /data

$ cd /data
$ sudo rm -rf idr0011/

$ df -h ./
Filesystem      Size  Used Avail Use% Mounted on
/dev/vdb        4.9T  3.6T  1.3T  74% /data

@will-moore
Member Author

will-moore commented Jul 2, 2023

Convert 1 screen...

screen -S idr0015_ngff
/home/wmoore/bioformats2raw-0.6.0-24/bin/bioformats2raw /uod/idr/metadata/idr0015-colin-taraoceans/screenA/patterns/TARA_HCS1_H5_G100001472_G100001473--2013_09_28_19_45_25_chamber--U00--V01.screen TARA_HCS1_H5_G100001472_G100001473--2013_09_28_19_45_25_chamber--U00--V01.ome.zarr

EDIT: permission denied - chown -R wmoore idr0015 then re-ran at 13:16...

@will-moore
Member Author

Make bucket...

$ aws --endpoint-url https://uk1s3.embassy.ebi.ac.uk s3 mb s3://idr0015
make_bucket: idr0015
$ aws --endpoint-url https://uk1s3.embassy.ebi.ac.uk s3api put-bucket-policy --bucket idr0015 --policy file://policy.json
$ aws --endpoint-url https://uk1s3.embassy.ebi.ac.uk s3api put-bucket-cors --bucket idr0015 --cors-configuration file://cors.json

@will-moore
Member Author

will-moore commented Jul 2, 2023

Upload a previously-created plate from pilot-zarr1-dev

cd /data/idr0015
$ /home/wmoore/mc cp -r TARA_HCS1_H5_G100002411_G100002481--2013_08_28_14_46_59_chamber--U01--V01.ome.zarr uk1s3/idr0015/zarr
.../V/9/0/3/0/4/9/0/0: 53.04 GiB / 53.04 GiB ━━━━━━━━━━━━━━━ 25.02 MiB/s 36m10s

Looks good in validator

https://ome.github.io/ome-ngff-validator/?source=https://uk1s3.embassy.ebi.ac.uk/idr0015/zarr/TARA_HCS1_H5_G100002411_G100002481--2013_08_28_14_46_59_chamber--U01--V01.ome.zarr

And vizarr (although the omero rendering settings in the .zarr are different from what's in IDR).
E.g. only 3 channels active, although 5 are exported OK:

Screenshot 2023-07-02 at 07 35 34

@will-moore
Member Author

will-moore commented Jul 2, 2023

Uploaded a 2nd plate, recently generated above:

(base) [wmoore@pilot-zarr1-dev idr0015]$ /home/wmoore/mc cp -r TARA_HCS1_H5_G100001472_G100001473--2013_09_28_19_45_25_chamber--U00--V01.ome.zarr uk1s3/idr0015/zarr
...me.zarr/V/9/0/3/0/4/9/0/0: 79.03 GiB / 79.03 GiB ━━━━━━━━━━━━━━━━━━━━ 37.09 MiB/s 36m21

@will-moore
Member Author

Converting 4 more plates...

for i in TARA_HCS1_H5_G100001472_G100001473--2013_09_28_19_45_25_chamber--U01--V01 TARA_HCS1_H5_G100001988_G100001989--2013_09_23_19_42_50_chamber--U00--V01 TARA_HCS1_H5_G100001988_G100001989--2013_09_23_19_42_50_chamber--U01--V01 TARA_HCS1_H5_G100002411_G100002481--2013_08_28_14_46_59_chamber--U00--V0; do echo $i; ~/bioformats2raw-0.6.0-24/bin/bioformats2raw --memo-directory ../memo /uod/idr/metadata/idr0015-colin-taraoceans/screenA/patterns/$i.screen ${i%.*}.ome.zarr; done

@will-moore
Member Author

Seeing errors writing memo files...

2023-07-02 20:38:56,486 [main] WARN  loci.formats.Memoizer - failed to save memo file: ../memo/uod/idr/metadata/idr0015-colin-taraoceans/screenA/patterns/.TARA_HCS1_H5_G100001988_G100001989--2013_09_23_19_42_50_chamber--U01--V01.screen.bfmemo
java.io.IOException: No such file or directory

@will-moore
Member Author

And then... (caused by a typo in my command above: --V0 should be --V01):

Exception in thread "main" picocli.CommandLine$ExecutionException: Error while calling command (com.glencoesoftware.bioformats2raw.Converter@63a65a25): java.io.FileNotFoundException: /uod/idr/metadata/idr0015-colin-taraoceans/screenA/patterns/TARA_HCS1_H5_G100002411_G100002481--2013_08_28_14_46_59_chamber--U00--V0.screen (No such file or directory)
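
A pre-flight check along these lines would catch a mistyped plate name before a long conversion loop starts. This is only a minimal sketch; `missing_patterns` is a hypothetical helper, not part of bioformats2raw.

```shell
# Hypothetical helper: report every plate name that has no matching .screen file.
# Returns non-zero if anything is missing, so it can gate the conversion loop.
missing_patterns() {
  local dir name status
  dir=$1
  shift
  status=0
  for name in "$@"; do
    if [ ! -f "$dir/$name.screen" ]; then
      echo "missing: $dir/$name.screen" >&2
      status=1
    fi
  done
  return "$status"
}

# Usage (pattern directory as in the commands above, plate names as arguments):
#   missing_patterns /uod/idr/metadata/idr0015-colin-taraoceans/screenA/patterns NAME1 NAME2 || exit 1
```

Running it with the same list of names that is passed to the `for` loop means a name ending in `--V0` is reported immediately rather than after the earlier plates have converted.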

@will-moore
Member Author

Try to avoid memo issues by not using memo directory, but allow writing to source dir...

$ cd /uod/idr/metadata/idr0015-colin-taraoceans/screenA/patterns/
$ sudo chown wmoore patterns/

$ df -h /data/
Filesystem      Size  Used Avail Use% Mounted on
/dev/vdb        4.9T  3.6T  1.4T  74% /data

Running 12 more (including repeat of last typo fix above):

for i in TARA_HCS1_H5_G100002411_G100002481--2013_08_28_14_46_59_chamber--U00--V01 TARA_HCS1_H5_G100002479_G100002163--2013_08_26_14_59_40_chamber--U00--V01 TARA_HCS1_H5_G100002479_G100002163--2013_08_26_14_59_40_chamber--U01--V01 TARA_HCS1_H5_G100002567_G100002568--2013_09_25_12_55_58_chamber--U00--V01 TARA_HCS1_H5_G100002567_G100002568--2013_09_25_12_55_58_chamber--U01--V01 TARA_HCS1_H5_G100002655_G100002656--2013_09_24_15_21_06_chamber--U00--V01 TARA_HCS1_H5_G100002655_G100002656--2013_09_24_15_21_06_chamber--U01--V01 TARA_HCS1_H5_G100002978_G100002980--2013_08_29_12_36_46_chamber--U00--V01 TARA_HCS1_H5_G100002978_G100002980--2013_08_29_12_36_46_chamber--U01--V01 TARA_HCS1_H5_G100003406_G100004906--2013_08_24_19_23_14_chamber--U00--V01 TARA_HCS1_H5_G100003406_G100004906--2013_08_24_19_23_14_chamber--U01--V01 TARA_HCS1_H5_G100003584_G100003586--2013_09_26_15_23_10_chamber--U00--V01; do echo $i; ~/bioformats2raw-0.6.0-24/bin/bioformats2raw /uod/idr/metadata/idr0015-colin-taraoceans/screenA/patterns/$i.screen ${i%.*}.ome.zarr; done

@will-moore will-moore self-assigned this Jul 3, 2023
@will-moore
Member Author

will-moore commented Jul 3, 2023

Started to zip some...
First 4 zarrs...

screen -S idr0015_zip
for i in TARA_HCS1_H5_G100001472_G100001473--2013_09_28_19_45_25_chamber--U01--V01 TARA_HCS1_H5_G100001988_G100001989--2013_09_23_19_42_50_chamber--U00--V01 TARA_HCS1_H5_G100001988_G100001989--2013_09_23_19_42_50_chamber--U01--V01 TARA_HCS1_H5_G100002411_G100002481--2013_08_28_14_46_59_chamber--U00--V0; do zip -r "${i%/}.ome.zarr.zip" "$i.ome.zarr"; done

@will-moore
Member Author

Still seeing memo issues...

2023-07-03 18:41:39,091 [pool-1-thread-4] WARN  loci.formats.Memoizer - skipping memo: directory not writeable - /uod/idr/filesets/idr0015-UNKNOWN-taraoceans/20150918-tara/RAW_DATA/TARA_HCS1_H5_G100003406_G100004906--2013_08_24_19_23_14/slide--S00/chamber--U00--V01/field--X17--Y21

@will-moore
Member Author

Zip command above created 3 zips but failed with same typo as earlier! (oops again):

zip warning: name not matched: TARA_HCS1_H5_G100002411_G100002481--2013_08_28_14_46_59_chamber--U00--V0.ome.zarr

Conversion above was generating memo errors as before. Last was:

2023-07-03 21:47:48,790 [pool-1-thread-4] WARN  loci.formats.Memoizer - skipping memo: directory not writeable - /uod/idr/filesets/idr0015-UNKNOWN-taraoceans/20150918-tara/RAW_DATA/TARA_HCS1_H5_G100003584_G100003586--2013_09_26_15_23_10/slide--S00/chamber--U00--V01/field--X17--Y21

but ran to completion OK...

Current status...

$ ls -alh /data/idr0015
total 161G
drwxrwxr-x. 19 wmoore dlindner 4.0K Jul  4 05:08 .
drwxrwxr-x. 14 root   idr-data  270 Jul  2 22:53 ..
drwxrwxr-x. 25 wmoore wmoore   4.0K Jul  2 13:17 TARA_HCS1_H5_G100001472_G100001473--2013_09_28_19_45_25_chamber--U00--V01.ome.zarr
drwxrwxr-x. 25 wmoore wmoore   4.0K Jul  2 18:23 TARA_HCS1_H5_G100001472_G100001473--2013_09_28_19_45_25_chamber--U01--V01.ome.zarr
-rw-rw-r--.  1 wmoore wmoore    48G Jul  3 15:20 TARA_HCS1_H5_G100001472_G100001473--2013_09_28_19_45_25_chamber--U01--V01.ome.zarr.zip
drwxrwxr-x. 25 wmoore wmoore   4.0K Jul  2 20:03 TARA_HCS1_H5_G100001988_G100001989--2013_09_23_19_42_50_chamber--U00--V01.ome.zarr
-rw-rw-r--.  1 wmoore wmoore    63G Jul  3 22:49 TARA_HCS1_H5_G100001988_G100001989--2013_09_23_19_42_50_chamber--U00--V01.ome.zarr.zip
drwxrwxr-x. 25 wmoore wmoore   4.0K Jul  2 21:33 TARA_HCS1_H5_G100001988_G100001989--2013_09_23_19_42_50_chamber--U01--V01.ome.zarr
-rw-rw-r--.  1 wmoore wmoore    51G Jul  4 05:08 TARA_HCS1_H5_G100001988_G100001989--2013_09_23_19_42_50_chamber--U01--V01.ome.zarr.zip
drwxrwxr-x. 25 wmoore wmoore   4.0K Jul  3 06:27 TARA_HCS1_H5_G100002411_G100002481--2013_08_28_14_46_59_chamber--U00--V01.ome.zarr
drwxrwxr-x. 25 wmoore dlindner 4.0K Mar  9 11:30 TARA_HCS1_H5_G100002411_G100002481--2013_08_28_14_46_59_chamber--U01--V01.ome.zarr
-rw-rw-r--.  1 wmoore dlindner 115K Mar  9 10:13 TARA_HCS1_H5_G100002411_G100002481--2013_08_28_14_46_59_chamber--U01--V01.screen
drwxrwxr-x. 25 wmoore wmoore   4.0K Jul  3 07:54 TARA_HCS1_H5_G100002479_G100002163--2013_08_26_14_59_40_chamber--U00--V01.ome.zarr
drwxrwxr-x. 25 wmoore wmoore   4.0K Jul  3 09:33 TARA_HCS1_H5_G100002479_G100002163--2013_08_26_14_59_40_chamber--U01--V01.ome.zarr
drwxrwxr-x. 25 wmoore wmoore   4.0K Jul  3 10:44 TARA_HCS1_H5_G100002567_G100002568--2013_09_25_12_55_58_chamber--U00--V01.ome.zarr
drwxrwxr-x. 25 wmoore wmoore   4.0K Jul  3 11:55 TARA_HCS1_H5_G100002567_G100002568--2013_09_25_12_55_58_chamber--U01--V01.ome.zarr
drwxrwxr-x. 25 wmoore wmoore   4.0K Jul  3 13:03 TARA_HCS1_H5_G100002655_G100002656--2013_09_24_15_21_06_chamber--U00--V01.ome.zarr
drwxrwxr-x. 25 wmoore wmoore   4.0K Jul  3 14:14 TARA_HCS1_H5_G100002655_G100002656--2013_09_24_15_21_06_chamber--U01--V01.ome.zarr
drwxrwxr-x. 25 wmoore wmoore   4.0K Jul  3 15:24 TARA_HCS1_H5_G100002978_G100002980--2013_08_29_12_36_46_chamber--U00--V01.ome.zarr
drwxrwxr-x. 25 wmoore wmoore   4.0K Jul  3 16:29 TARA_HCS1_H5_G100002978_G100002980--2013_08_29_12_36_46_chamber--U01--V01.ome.zarr
drwxrwxr-x. 25 wmoore wmoore   4.0K Jul  3 18:42 TARA_HCS1_H5_G100003406_G100004906--2013_08_24_19_23_14_chamber--U00--V01.ome.zarr
drwxrwxr-x. 16 wmoore wmoore    232 Jul  3 20:34 TARA_HCS1_H5_G100003406_G100004906--2013_08_24_19_23_14_chamber--U01--V01.ome.zarr
drwxrwxr-x. 25 wmoore wmoore   4.0K Jul  3 21:47 TARA_HCS1_H5_G100003584_G100003586--2013_09_26_15_23_10_chamber--U00--V01.ome.zarr

Available space getting low...

$ df -h /data
Filesystem      Size  Used Avail Use% Mounted on
/dev/vdb        4.9T  4.7T  234G  96% /data

Need to delete data... $ sudo rm -rf idr0036
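
The manual `df -h` checks could be scripted so that a conversion or zip run refuses to start when space is short. A minimal sketch, assuming POSIX `df -P` output; `avail_kib` and `has_free_gib` are hypothetical helpers and the threshold in the usage line is an arbitrary example.

```shell
# Hypothetical helper: available space on the filesystem holding a path, in KiB.
avail_kib() {
  # -P forces the POSIX one-line-per-filesystem format; column 4 is "Available"
  df -Pk "$1" | awk 'NR == 2 { print $4 }'
}

# Succeeds only if at least $2 GiB are free under path $1.
has_free_gib() {
  local avail
  avail=$(avail_kib "$1") || return 1
  [ "$avail" -ge $(( $2 * 1024 * 1024 )) ]
}

# Usage:
#   has_free_gib /data 150 || { echo "not enough space on /data" >&2; exit 1; }
```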

Upload first 3 zips...

$ ./ascp -P33001 -i ../etc/asperaweb_id_dsa.openssh -d /data/idr0015/idr0015 [email protected]:5f/13xxxxxxxx

@will-moore
Member Author

Deleted 3 zips uploaded above and their .zarr dirs.

Move all remaining .zarr to batch1 for zipping and upload...

mkdir batch1
mv *.zarr batch1
cd batch1
for i in */; do zip -r "${i%/}.zip" "$i"; done

@will-moore
Member Author

Converting 10 Filesets into "batch2"...

screen -r idr0015_ngff
mkdir batch2
for i in TARA_HCS1_H5_G100003584_G100003586--2013_09_26_15_23_10_chamber--U01--V01 TARA_HCS1_H5_G100003584_G100003586--2014_06_26_15_58_43_chamber--U00--V01 TARA_HCS1_H5_G100003584_G100003586--2014_06_26_15_58_43_chamber--U01--V01 TARA_HCS1_H5_G100003741_G100003739--2013_09_30_14_59_10_chamber--U00--V01 TARA_HCS1_H5_G100003741_G100003739--2013_09_30_14_59_10_chamber--U01--V01 TARA_HCS1_H5_G100004339_G100004341--2013_09_27_15_24_28_chamber--U00--V01 TARA_HCS1_H5_G100004339_G100004341--2013_09_27_15_24_28_chamber--U01--V01 TARA_HCS1_H5_G100004727_G100004940--2013_12_08_21_26_28_chamber--U00--V01 TARA_HCS1_H5_G100004727_G100004940--2013_12_08_21_26_28_chamber--U01--V01 TARA_HCS1_H5_G100004906_G100002201--2013_08_25_19_31_15_chamber--U00--V01; do echo $i; ~/bioformats2raw-0.6.0-24/bin/bioformats2raw /uod/idr/metadata/idr0015-colin-taraoceans/screenA/patterns/$i.screen batch2/${i%.*}.ome.zarr; done

@will-moore
Member Author

will-moore commented Jul 6, 2023

Deleting individual ome.zarr filesets from batch1 once their ome.zarr.zip has been created.

Also upload a random (last) ome.zarr to s3 from batch1 to validate we're still good...

$ /home/wmoore/mc cp -r TARA_HCS1_H5_G100003584_G100003586--2013_09_26_15_23_10_chamber--U00--V01.ome.zarr uk1s3/idr0015/zarr
...me.zarr/A/1/0/0/0/0/5/0/0: 7.45 MiB
...me.zarr/V/9/0/3/0/4/9/0/0: 60.48 GiB / 60.48 GiB ━━━━━━━━━━━━━━━━━━━━ 28.41 MiB/s 36m19s

@will-moore
Member Author

Current state of batch1:
In a day of generating zips, we only have 4 zips created (5th due soon):
Zips take 4-5 hours to generate, compared to only about 1.5 hours to convert the ome.zarr.

(base) [wmoore@pilot-zarr1-dev ~]$ ls -alh /data/idr0015/batch1
total 196G
drwxrwxr-x. 14 wmoore wmoore   4.0K Jul  6 20:21 .
drwxrwxr-x.  4 wmoore dlindner  138 Jul  5 22:55 ..
-rw-rw-r--.  1 wmoore wmoore    49G Jul  6 04:39 TARA_HCS1_H5_G100001472_G100001473--2013_09_28_19_45_25_chamber--U00--V01.ome.zarr.zip
-rw-rw-r--.  1 wmoore wmoore    32G Jul  6 08:48 TARA_HCS1_H5_G100002411_G100002481--2013_08_28_14_46_59_chamber--U00--V01.ome.zarr.zip
drwxrwxr-x.  8 wmoore dlindner  107 Jul  6 23:09 TARA_HCS1_H5_G100002411_G100002481--2013_08_28_14_46_59_chamber--U01--V01.ome.zarr
-rw-rw-r--.  1 wmoore wmoore    31G Jul  6 13:12 TARA_HCS1_H5_G100002411_G100002481--2013_08_28_14_46_59_chamber--U01--V01.ome.zarr.zip
drwxrwxr-x. 25 wmoore wmoore   4.0K Jul  3 07:54 TARA_HCS1_H5_G100002479_G100002163--2013_08_26_14_59_40_chamber--U00--V01.ome.zarr
-rw-rw-r--.  1 wmoore wmoore    61G Jul  6 20:14 TARA_HCS1_H5_G100002479_G100002163--2013_08_26_14_59_40_chamber--U00--V01.ome.zarr.zip
drwxrwxr-x. 25 wmoore wmoore   4.0K Jul  3 09:33 TARA_HCS1_H5_G100002479_G100002163--2013_08_26_14_59_40_chamber--U01--V01.ome.zarr
drwxrwxr-x. 25 wmoore wmoore   4.0K Jul  3 10:44 TARA_HCS1_H5_G100002567_G100002568--2013_09_25_12_55_58_chamber--U00--V01.ome.zarr
drwxrwxr-x. 25 wmoore wmoore   4.0K Jul  3 11:55 TARA_HCS1_H5_G100002567_G100002568--2013_09_25_12_55_58_chamber--U01--V01.ome.zarr
drwxrwxr-x. 25 wmoore wmoore   4.0K Jul  3 13:03 TARA_HCS1_H5_G100002655_G100002656--2013_09_24_15_21_06_chamber--U00--V01.ome.zarr
drwxrwxr-x. 25 wmoore wmoore   4.0K Jul  3 14:14 TARA_HCS1_H5_G100002655_G100002656--2013_09_24_15_21_06_chamber--U01--V01.ome.zarr
drwxrwxr-x. 25 wmoore wmoore   4.0K Jul  3 15:24 TARA_HCS1_H5_G100002978_G100002980--2013_08_29_12_36_46_chamber--U00--V01.ome.zarr
drwxrwxr-x. 25 wmoore wmoore   4.0K Jul  3 16:29 TARA_HCS1_H5_G100002978_G100002980--2013_08_29_12_36_46_chamber--U01--V01.ome.zarr
drwxrwxr-x. 25 wmoore wmoore   4.0K Jul  3 18:42 TARA_HCS1_H5_G100003406_G100004906--2013_08_24_19_23_14_chamber--U00--V01.ome.zarr
drwxrwxr-x. 16 wmoore wmoore    232 Jul  3 20:34 TARA_HCS1_H5_G100003406_G100004906--2013_08_24_19_23_14_chamber--U01--V01.ome.zarr
drwxrwxr-x. 25 wmoore wmoore   4.0K Jul  3 21:47 TARA_HCS1_H5_G100003584_G100003586--2013_09_26_15_23_10_chamber--U00--V01.ome.zarr
-rw-------.  1 wmoore wmoore    25G Jul  6 23:10 zitPNp7I

Current state of batch2 (started generating zips too):

(base) [wmoore@pilot-zarr1-dev ~]$ ls -alh /data/idr0015/batch2
total 88G
drwxrwxr-x. 12 wmoore wmoore   4.0K Jul  6 22:37 .
drwxrwxr-x.  4 wmoore dlindner  138 Jul  5 22:55 ..
drwxrwxr-x. 25 wmoore wmoore   4.0K Jul  6 00:29 TARA_HCS1_H5_G100003584_G100003586--2013_09_26_15_23_10_chamber--U01--V01.ome.zarr
-rw-rw-r--.  1 wmoore wmoore    38G Jul  6 17:24 TARA_HCS1_H5_G100003584_G100003586--2013_09_26_15_23_10_chamber--U01--V01.ome.zarr.zip
drwxrwxr-x. 25 wmoore wmoore   4.0K Jul  6 01:41 TARA_HCS1_H5_G100003584_G100003586--2014_06_26_15_58_43_chamber--U00--V01.ome.zarr
-rw-rw-r--.  1 wmoore wmoore    45G Jul  6 22:29 TARA_HCS1_H5_G100003584_G100003586--2014_06_26_15_58_43_chamber--U00--V01.ome.zarr.zip
drwxrwxr-x. 25 wmoore wmoore   4.0K Jul  6 02:53 TARA_HCS1_H5_G100003584_G100003586--2014_06_26_15_58_43_chamber--U01--V01.ome.zarr
drwxrwxr-x. 25 wmoore wmoore   4.0K Jul  6 04:06 TARA_HCS1_H5_G100003741_G100003739--2013_09_30_14_59_10_chamber--U00--V01.ome.zarr
drwxrwxr-x. 25 wmoore wmoore   4.0K Jul  6 05:11 TARA_HCS1_H5_G100003741_G100003739--2013_09_30_14_59_10_chamber--U01--V01.ome.zarr
drwxrwxr-x. 25 wmoore wmoore   4.0K Jul  6 06:27 TARA_HCS1_H5_G100004339_G100004341--2013_09_27_15_24_28_chamber--U00--V01.ome.zarr
drwxrwxr-x. 25 wmoore wmoore   4.0K Jul  6 07:45 TARA_HCS1_H5_G100004339_G100004341--2013_09_27_15_24_28_chamber--U01--V01.ome.zarr
drwxrwxr-x. 25 wmoore wmoore   4.0K Jul  6 08:59 TARA_HCS1_H5_G100004727_G100004940--2013_12_08_21_26_28_chamber--U00--V01.ome.zarr
drwxrwxr-x. 25 wmoore wmoore   4.0K Jul  6 10:16 TARA_HCS1_H5_G100004727_G100004940--2013_12_08_21_26_28_chamber--U01--V01.ome.zarr
drwxrwxr-x. 25 wmoore wmoore   4.0K Jul  6 11:37 TARA_HCS1_H5_G100004906_G100002201--2013_08_25_19_31_15_chamber--U00--V01.ome.zarr
-rw-------.  1 wmoore wmoore   5.1G Jul  6 23:14 ziBWfTRP

@dgault

dgault commented Sep 26, 2023

Very odd if the dataset hasn't been modified at all and was successful last time. I will have to debug to see exactly what the failure is.

@will-moore
Member Author

@dgault Yes - I wonder if I was in a different python environment at the time - I'll try and check - dependencies might be different...

@will-moore
Member Author

No - same error with the conda activate bioformats2raw env I have on pilot-zarr1-dev. Strange that it worked for one plate and not the other.

@will-moore
Member Author

Plate https://idr.openmicroscopy.org/webclient/?show=plate-4653 on IDR is not visible, but NGFF version looks good: https://hms-dbmi.github.io/vizarr/?source=https://uk1s3.embassy.ebi.ac.uk/bia-integrator-data/S-BIAD861/73d2e66a-a737-46b8-b174-6d60e9145b45/73d2e66a-a737-46b8-b174-6d60e9145b45.zarr
Hopefully this will be fixed with the upgrade - NB: thumbnails will need generating!

@will-moore
Member Author

will-moore commented Oct 8, 2023

One plate failed to extract for EBI - re-converting...

ssh pilot-zarr1-dev
cd /data/idr0015
screen -r idr0015_ngff

~/bioformats2raw-0.6.0-24/bin/bioformats2raw /uod/idr/metadata/idr0015-colin-taraoceans/screenA/patterns/TARA_HCS1_H5_G100008060_G100008062--2013_11_03_22_51_32_chamber--U00--V01.screen TARA_HCS1_H5_G100008060_G100008062--2013_11_03_22_51_32_chamber--U00--V01.ome.zarr

...
looks good:

(base) [wmoore@pilot-zarr1-dev idr0015]$ find ./TARA_HCS1_H5_G100008060_G100008062--2013_11_03_22_51_32_chamber--U00--V01.ome.zarr/ -name ".zgroup" | wc
    816     816   80292
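
The same check can be wrapped so that it fails loudly when the count is off. A minimal sketch; `check_zgroups` is a hypothetical helper, and 816 is simply the count observed above for this plate, not a documented constant.

```shell
# Hypothetical helper: compare the number of .zgroup files with an expected count.
check_zgroups() {
  local zarr expected found
  zarr=$1
  expected=$2
  found=$(find "$zarr" -name ".zgroup" | wc -l | tr -d '[:space:]')
  if [ "$found" -ne "$expected" ]; then
    echo "$zarr: found $found .zgroup files, expected $expected" >&2
    return 1
  fi
}

# Usage:
#   check_zgroups TARA_HCS1_H5_G100008060_G100008062--2013_11_03_22_51_32_chamber--U00--V01.ome.zarr 816
```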

@will-moore
Member Author

will-moore commented Oct 9, 2023

Still failing to convert last plate on pilot-zarr1-dev - same error as above...

(omero_zarr_export) [wmoore@pilot-zarr1-dev idr0015]$ ~/bioformats2raw-0.6.0-24/bin/bioformats2raw /uod/idr/metadata/idr0015-colin-taraoceans/screenA/patterns/TARA_HCS1_H5_G100007665_G100007576--2013_10_28_21_05_26_chamber--U00--V01.screen TARA_HCS1_H5_G100007665_G100007576--2013_10_28_21_05_26_chamber--U00--V01.ome.zarr

Tried with latest release of bioformats2raw 0.7.0 from https://github.com/glencoesoftware/bioformats2raw/releases/tag/v0.7.0...

(base) [wmoore@pilot-zarr1-dev idr0015]$ ~/bioformats2raw-0.7.0/bin/bioformats2raw /uod/idr/metadata/idr0015-colin-taraoceans/screenA/patterns/TARA_HCS1_H5_G100007665_G100007576--2013_10_28_21_05_26_chamber--U00--V01.screen TARA_HCS1_H5_G100007665_G100007576--2013_10_28_21_05_26_chamber--U00--V01.ome.zarr
...
        at com.glencoesoftware.bioformats2raw.Converter.main(Converter.java:2826)
Caused by: loci.formats.UnknownFormatException: Unknown file format: /uod/idr/metadata/idr0015-colin-taraoceans/screenA/patterns/TARA_HCS1_H5_G100007665_G100007576--2013_10_28_21_05_26_chamber--U00--V01.screen

But I remember we need an IDR-specific version that recognises .screen files.
There is only the 1 release that we are already using: https://github.com/IDR/bioformats2raw/releases

@will-moore
Member Author

will-moore commented Oct 9, 2023

On https://www.ebi.ac.uk/biostudies/submissions/files?path=%2Fuser%2Fidr0015
Deleted the 2 filesets that we have managed to re-convert...

TARA_HCS1_H5_G100008060_G100008062--2013_11_03_22_51_32_chamber--U00--V01.ome.zarr (failed to extract)
TARA_HCS1_H5_G100010241_G100010731--2013_09_29_19_14_59_chamber--U00--V01.ome.zarr (missing .zgroup)

Zip these on pilot-zarr1-dev...

screen -S idr0015_zip
cd /data/idr0015
for i in */; do zip -mr "${i%/}.zip" "$i"; done

@dgault

dgault commented Oct 10, 2023

@will-moore, I was trying to reproduce the same issue you had seen in #645 (comment).
When running the same command as you using the IDR release https://github.com/IDR/bioformats2raw/releases/tag/v0.6.0-24, it looks as though it ran successfully without the same exception. Would you be able to try again using the linked release? (The releases on the Glencoe repos won't have the IDR-specific changes included.)

@will-moore
Member Author

Just downloaded from that link (which is the same version I was using before), but still seeing the same issue...

(bioformats2raw) [wmoore@pilot-zarr1-dev idr0015]$ ./bioformats2raw-0.6.0-24/bin/bioformats2raw /uod/idr/metadata/idr0015-colin-taraoceans/screenA/patterns/TARA_HCS1_H5_G100007665_G100007576--2013_10_28_21_05_26_chamber--U00--V01.screen TARA_HCS1_H5_G100007665_G100007576--2013_10_28_21_05_26_chamber--U00--V01.ome.zarr
OpenJDK 64-Bit Server VM warning: You have loaded library /tmp/opencv_openpnp9903770868831694569/nu/pattern/opencv/linux/x86_64/libopencv_java342.so which might have disabled stack guard. The VM will try to fix the stack guard now.
It's highly recommended that you fix the library with 'execstack -c <libfile>', or link it with '-z noexecstack'.
WARNING: An illegal reflective access operation has occurred
WARNING: Illegal reflective access by com.esotericsoftware.kryo.util.UnsafeUtil (file:/data/idr0015/bioformats2raw-0.6.0-24/lib/kryo-2.24.0.jar) to constructor java.nio.DirectByteBuffer(long,int,java.lang.Object)
WARNING: Please consider reporting this to the maintainers of com.esotericsoftware.kryo.util.UnsafeUtil
WARNING: Use --illegal-access=warn to enable warnings of further illegal reflective access operations
WARNING: All illegal access operations will be denied in a future release
Exception in thread "main" picocli.CommandLine$ExecutionException: Error while calling command (com.glencoesoftware.bioformats2raw.Converter@16150369): java.lang.NullPointerException
        at picocli.CommandLine.executeUserObject(CommandLine.java:1962)
        at picocli.CommandLine.access$1300(CommandLine.java:145)
        at picocli.CommandLine$RunLast.executeUserObjectOfLastSubcommandWithSameParent(CommandLine.java:2352)
        at picocli.CommandLine$RunLast.handle(CommandLine.java:2346)
        at picocli.CommandLine$RunLast.handle(CommandLine.java:2311)
        at picocli.CommandLine$AbstractParseResultHandler.handleParseResult(CommandLine.java:2172)
        at picocli.CommandLine.parseWithHandlers(CommandLine.java:2550)
        at picocli.CommandLine.parseWithHandler(CommandLine.java:2485)
        at picocli.CommandLine.call(CommandLine.java:2761)
        at com.glencoesoftware.bioformats2raw.Converter.main(Converter.java:2192)
Caused by: java.lang.NullPointerException
        at ome.xml.meta.OMEXMLMetadataImpl.getWellSampleImageRef(OMEXMLMetadataImpl.java:5205)
        at com.glencoesoftware.bioformats2raw.Converter.hasValidPlate(Converter.java:2055)
        at com.glencoesoftware.bioformats2raw.Converter.convert(Converter.java:604)
        at com.glencoesoftware.bioformats2raw.Converter.call(Converter.java:516)
        at com.glencoesoftware.bioformats2raw.Converter.call(Converter.java:107)
        at picocli.CommandLine.executeUserObject(CommandLine.java:1953)
        ... 9 more

@will-moore
Member Author

On idr0-slot3.openmicroscopy I created a new env etc...

conda create -n bioformats2raw python=3.9
conda activate bioformats2raw
conda install -c ome bioformats2raw

Downloaded the IDR bioformats2raw-0.6.0-24 and ran that on this last plate...
Conversion started running OK but failed due to lack of disk space!

Thought that it's failing on pilot-zarr1-dev due to some dependency issue, so created a new env there too... But the problem persists, suggesting that it is failing on pilot-zarr1-dev due to an issue with the data.

@will-moore
Member Author

will-moore commented Oct 16, 2023

Was going to try conversion on pilot-zarr2-dev but on that machine I can't see any data:

Installed conda in my home dir...

conda create -n bioformats2raw python=3.9
conda activate bioformats2raw
conda install -c ome bioformats2raw
cd
wget https://github.com/IDR/bioformats2raw/releases/download/v0.6.0-24/bioformats2raw-0.6.0-24.zip
unzip bioformats2raw-0.6.0-24.zip

But...

(bioformats2raw) [wmoore@pilot-zarr2-dev idr0015]$ ls /uod/idr/metadata/idr0015-colin-taraoceans/
(bioformats2raw) [wmoore@pilot-zarr2-dev idr0015]$ 
(bioformats2raw) [wmoore@pilot-zarr2-dev idr0015]$ ls /uod/idr/metadata/idr0015-UNKNOWN-taraoceans/
(bioformats2raw) [wmoore@pilot-zarr2-dev idr0015]$ 

@will-moore
Member Author

will-moore commented Oct 16, 2023

$ ssh pilot-zarr2-dev
Last login: Mon Oct 16 11:41:47 2023 from pilot-proxy.sample.openstack.org
(base) [wmoore@pilot-zarr2-dev ~]$ df -h /data/idr0015
Filesystem      Size  Used Avail Use% Mounted on
/dev/vdb        750G  229G  522G  31% /data

Cloned a copy here - (didn't know how to get the submodule to fetch)...

(bioformats2raw) [wmoore@pilot-zarr2-dev idr0015-colin-taraoceans]$ cd /uod/idr/metadata/idr0015-colin-taraoceans/
(bioformats2raw) [wmoore@pilot-zarr2-dev idr0015-colin-taraoceans]$ sudo -Es git clone git@github.com:IDR/idr0015-colin-taraoceans.git
(bioformats2raw) [wmoore@pilot-zarr2-dev ~]$ cd /data/idr0015/idr0015
(bioformats2raw) [wmoore@pilot-zarr2-dev ~]$ ~/bioformats2raw-0.6.0-24/bin/bioformats2raw /uod/idr/metadata/idr0015-colin-taraoceans/idr0015-colin-taraoceans/screenA/patterns/TARA_HCS1_H5_G100007665_G100007576--2013_10_28_21_05_26_chamber--U00--V01.screen TARA_HCS1_H5_G100007665_G100007576--2013_10_28_21_05_26_chamber--U00--V01.ome.zarr
OpenJDK 64-Bit Server VM warning: You have loaded library /tmp/opencv_openpnp10574995368140289468/nu/pattern/opencv/linux/x86_64/libopencv_java342.so which might have disabled stack guard. The VM will try to fix the stack guard now.
It's highly recommended that you fix the library with 'execstack -c <libfile>', or link it with '-z noexecstack'.

When done, checked for .zgroups... - looks good...

(base) [wmoore@pilot-zarr2-dev idr0015]$ find ./TARA_HCS1_H5_G100007665_G100007576--2013_10_28_21_05_26_chamber--U00--V01.ome.zarr/ -name ".zgroup" | wc
    816     816   80292

@will-moore
Member Author

will-moore commented Oct 17, 2023

Deleted TARA_HCS1_H5_G100007665_G100007576--2013_10_28_21_05_26_chamber--U00--V01.ome.zarr.zip from https://www.ebi.ac.uk/biostudies/submissions/files?path=%2Fuser%2Fidr0015 and uploaded replacement...

@will-moore
Member Author

will-moore commented Nov 2, 2023

Since https://ome.github.io/ome-ngff-validator/?source=https://uk1s3.embassy.ebi.ac.uk/bia-integrator-data/S-BIAD861/90ef79de-8222-4b6d-aa4e-d4f1bd2f1350/90ef79de-8222-4b6d-aa4e-d4f1bd2f1350.zarr
is not valid because the .zattrs doesn't contain plate info..

$ curl https://uk1s3.embassy.ebi.ac.uk/bia-integrator-data/S-BIAD861/90ef79de-8222-4b6d-aa4e-d4f1bd2f1350/90ef79de-8222-4b6d-aa4e-d4f1bd2f1350.zarr/.zattrs
{
  "bioformats2raw.layout" : 3
}

Need to check if any others have this issue...

mkdir idr0015_zattrs
for r in $(cat idr0015.csv); do
  biapath=$(echo $r | cut -d',' -f2)
  uuid=$(echo $biapath | cut -d'/' -f2)
  fsid=$(echo $r | cut -d',' -f3 | tr -d '[:space:]')
  curl https://uk1s3.embassy.ebi.ac.uk/bia-integrator-data/$biapath/$uuid.zarr/.zattrs > idr0015_zattrs/$uuid.zarr.zattrs
done

cd idr0015_zattrs
for i in $(ls); do echo $i; grep "plate" $i | wc; done

Found one other plate has the same issue:

https://ome.github.io/ome-ngff-validator/?source=https://uk1s3.embassy.ebi.ac.uk/bia-integrator-data/S-BIAD861/cfc34119-cbc6-4734-aa75-6923d3bd8c06/cfc34119-cbc6-4734-aa75-6923d3bd8c06.zarr

Those 2 plates are:

idr0015/TARA_HCS1_H5_G100010241_G100010731--2013_09_29_19_14_59_chamber--U00--V01.ome.zarr, S-BIAD861/90ef79de-8222-4b6d-aa4e-d4f1bd2f1350
idr0015/TARA_HCS1_H5_G100007665_G100007576--2013_10_28_21_05_26_chamber--U00--V01.ome.zarr, S-BIAD861/cfc34119-cbc6-4734-aa75-6923d3bd8c06

Both of these are plates that we have already re-converted and submitted to replace the originals.
So hopefully we don't need to do anything else here - Just waiting on EBI to update them.
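
The grep loop above can be tightened into a helper that lists only the downloaded files lacking plate metadata. A minimal sketch; `has_plate_key` and `list_missing_plate` are hypothetical helpers, and the test is a plain text match on `"plate"` rather than a real JSON parse.

```shell
# Hypothetical helper: succeed only if a downloaded .zattrs file mentions a "plate" key.
has_plate_key() {
  grep -q '"plate"' "$1"
}

# Print the *.zattrs files in a directory that lack plate metadata.
list_missing_plate() {
  local f
  for f in "$1"/*.zattrs; do
    [ -e "$f" ] || continue
    has_plate_key "$f" || echo "$f"
  done
}

# Usage:
#   list_missing_plate idr0015_zattrs
```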

@will-moore
Member Author

will-moore commented Nov 2, 2023

On idr-testing, Plate TARA_HCS1_H5_G100003406_G100004906--2013_08_24_19_23_14_chamber--U01--V01 http://localhost:1080/webclient/?show=plate-4653 still has the broken thumbnails etc. seen at https://idr.openmicroscopy.org/webclient/?show=plate-4653. Now we get ResourceError trying to view images.

Since NGFF looks good https://hms-dbmi.github.io/vizarr/?source=https://uk1s3.embassy.ebi.ac.uk/bia-integrator-data/S-BIAD861/73d2e66a-a737-46b8-b174-6d60e9145b45/73d2e66a-a737-46b8-b174-6d60e9145b45.zarr, let's try to import that to replace the plate (plate has no KV-pairs etc, so don't need to update annotations).

Let's import the METADATA.ome.xml then run mkngff to create the full plate...

as omero-server..

wget https://uk1s3.embassy.ebi.ac.uk/bia-integrator-data/S-BIAD861/73d2e66a-a737-46b8-b174-6d60e9145b45/73d2e66a-a737-46b8-b174-6d60e9145b45.zarr/OME/METADATA.ome.xml

omero import --transfer=ln_s --skip=all METADATA.ome.xml --file /tmp/idr0015_20231102.log  --errs /tmp/idr0015_20231102.err

2023-11-02 15:49:27,716 303        [      main] INFO          ome.formats.importer.ImportConfig - OMERO.blitz Version: 5.7.1
2023-11-02 15:49:27,739 326        [      main] INFO          ome.formats.importer.ImportConfig - Bioformats version: 7.0.0 revision: 3f8b3326cb578d59bd948fb84c838ff77e9f1b08 date: 1 August 2023
2023-11-02 15:49:27,806 393        [      main] INFO   formats.importer.cli.CommandLineImporter - Setting checksum algorithm to File-Size-64
2023-11-02 15:49:27,808 395        [      main] INFO   formats.importer.cli.CommandLineImporter - Skipping thumbnails creation
2023-11-02 15:49:27,808 395        [      main] INFO   formats.importer.cli.CommandLineImporter - Skipping minimum/maximum computation
2023-11-02 15:49:27,808 395        [      main] INFO   formats.importer.cli.CommandLineImporter - Disabling upgrade check
2023-11-02 15:49:27,808 395        [      main] INFO   formats.importer.cli.CommandLineImporter - Setting transfer to ln_s
2023-11-02 15:49:27,811 398        [      main] INFO   formats.importer.cli.CommandLineImporter - Log levels -- Bio-Formats: ERROR OMERO.importer: INFO
2023-11-02 15:49:28,202 789        [      main] INFO      ome.formats.importer.ImportCandidates - Depth: 4 Metadata Level: MINIMUM
2023-11-02 15:49:28,357 944        [      main] ERROR     ome.formats.importer.cli.ErrorHandler - FILE_EXCEPTION: /opt/omero/server/idr0015/METADATA.ome.xml
loci.formats.FormatException: loci.common.services.ServiceException: Could not get OME-XML version
        at loci.formats.in.OMEXMLReader.initFile(OMEXMLReader.java:267)
        at loci.formats.FormatReader.setId(FormatReader.java:1466)
        at loci.formats.ImageReader.setId(ImageReader.java:863)
        at ome.formats.importer.OMEROWrapper$4.setId(OMEROWrapper.java:167)
        at loci.formats.ReaderWrapper.setId(ReaderWrapper.java:660)
        at loci.formats.ChannelFiller.setId(ChannelFiller.java:234)
        at loci.formats.ReaderWrapper.setId(ReaderWrapper.java:660)
        at loci.formats.ChannelSeparator.setId(ChannelSeparator.java:293)
        at loci.formats.ReaderWrapper.setId(ReaderWrapper.java:660)
        at loci.formats.Memoizer.setId(Memoizer.java:698)
        at loci.formats.ReaderWrapper.setId(ReaderWrapper.java:660)
        at ome.formats.importer.ImportCandidates.singleFile(ImportCandidates.java:427)
        at ome.formats.importer.ImportCandidates.handleFile(ImportCandidates.java:576)
        at ome.formats.importer.ImportCandidates.execute(ImportCandidates.java:384)
        at ome.formats.importer.ImportCandidates.<init>(ImportCandidates.java:222)
        at ome.formats.importer.ImportCandidates.<init>(ImportCandidates.java:174)
        at ome.formats.importer.cli.CommandLineImporter.<init>(CommandLineImporter.java:148)
        at ome.formats.importer.cli.CommandLineImporter.main(CommandLineImporter.java:1021)
Caused by: loci.common.services.ServiceException: Could not get OME-XML version
        at loci.formats.services.OMEXMLServiceImpl.transformToLatestVersion(OMEXMLServiceImpl.java:204)
        at loci.formats.services.OMEXMLServiceImpl.createOMEXMLMetadata(OMEXMLServiceImpl.java:376)
        at loci.formats.services.OMEXMLServiceImpl.createOMEXMLMetadata(OMEXMLServiceImpl.java:363)
        at loci.formats.in.OMEXMLReader.initFile(OMEXMLReader.java:261)
        ... 17 common frames omitted
2023-11-02 15:49:28,358 945        [      main] INFO      ome.formats.importer.ImportCandidates - 1 file(s) parsed into 0 group(s) with 1 call(s) to setId in 149ms. (156ms total) [0 unknowns]
2023-11-02 15:49:28,407 994        [      main] INFO       ome.formats.OMEROMetadataStoreClient - Attempting initial SSL connection to localhost:4064
2023-11-02 15:49:28,972 1559       [      main] INFO       ome.formats.OMEROMetadataStoreClient - Insecure connection requested, falling back
2023-11-02 15:49:29,348 1935       [      main] INFO       ome.formats.OMEROMetadataStoreClient - Pinging session every 300s.
2023-11-02 15:49:29,361 1948       [      main] INFO       ome.formats.OMEROMetadataStoreClient - Server: 5.6.9
2023-11-02 15:49:29,361 1948       [      main] INFO       ome.formats.OMEROMetadataStoreClient - Client: 5.7.1
2023-11-02 15:49:29,362 1949       [      main] INFO       ome.formats.OMEROMetadataStoreClient - Java Version: 11.0.20
2023-11-02 15:49:29,362 1949       [      main] INFO       ome.formats.OMEROMetadataStoreClient - OS Name: Linux
2023-11-02 15:49:29,362 1949       [      main] INFO       ome.formats.OMEROMetadataStoreClient - OS Arch: amd64
2023-11-02 15:49:29,362 1949       [      main] INFO       ome.formats.OMEROMetadataStoreClient - OS Version: 3.10.0-1160.99.1.el7.x86_64
No imports due to errors!

Didn't have this problem at IDR/idr0125-way-cellpainting#4 (comment) - I wonder what the difference is???
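
One thing that might be worth comparing between the two files is the schema namespace declared on the root OME element, since the error is about reading the OME-XML version. A minimal sketch, assuming the namespace has the usual http://www.openmicroscopy.org/Schemas/OME/YYYY-MM form and appears within the first 4 KB of the file; ome_schema_version is a hypothetical helper name, not part of omero or Bio-Formats:

```shell
# Hypothetical helper: print the OME schema version (e.g. 2016-06) declared
# near the top of an OME-XML file, or print nothing if none is found.
ome_schema_version() {
  head -c 4096 "$1" \
    | grep -o 'http://www.openmicroscopy.org/Schemas/OME/[0-9]\{4\}-[0-9]\{2\}' \
    | head -n 1 \
    | sed 's|.*/||'
}

# Example (not run here): ome_schema_version METADATA.ome.xml
```

An empty result would suggest the root element carries no recognisable schema namespace, which would be one candidate explanation to rule in or out.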

@will-moore will-moore moved this from Data on Embassy s3 to create new Filesets in idr-next in NGFF conversion Nov 13, 2023
@will-moore
Member Author

will-moore commented Nov 13, 2023

Fixing broken mkngff Filesets....

Ran mkngff sql on 3 newly generated EBI Plates...
Using Fileset IDs 21112.sql 21118.sql 21160.sql.
These are still valid on idr-testing, even though they have been swapped (not linked to any images now), since they still have the correct prefix for mkngff to work.

Generated sql... See IDR/idr-utils@08437f4

21160.sql - The new (broken) mkngff Fileset for this data is Fileset:ID 6313409.
Updated 21160.sql with this ID and ran...

12:30

$ psql -U omero -d idr -h $DBHOST -f 21160.sql
UPDATE 396
BEGIN
 mkngff_fileset 
----------------
        6314843
(1 row)

Same with 21112.sql -> Fileset ID: 6313428...

$ psql -U omero -d idr -h $DBHOST -f 21112.sql
UPDATE 396
BEGIN
 mkngff_fileset 
----------------
        6314844
(1 row)

COMMIT

For 21118.sql this didn't need any update (original Fileset ID still valid).
12:35

$ psql -U omero -d idr -h $DBHOST -f 21118.sql
UPDATE 396
BEGIN
 mkngff_fileset 
----------------
        6314845
(1 row)

COMMIT

@will-moore
Member Author

Forgot to run symlink creation for those 3 plates...
Using idr0015.csv to manually run these... NB: these use the original Filesets (which haven't been deleted) to get the fileset prefix...

$ omero mkngff symlink /data/OMERO/ManagedRepository 21112 "/bia-integrator-data/S-BIAD861/1a29207c-d50b-48b7-a7c0-54c6252bfd9c/1a29207c-d50b-48b7-a7c0-54c6252bfd9c.zarr"

Checking for prefix_dir /data/OMERO/ManagedRepository/demo_2/2016-06/06/00-58-20.828
Creating dir at /data/OMERO/ManagedRepository/demo_2/2016-06/06/00-58-20.828_mkngff
Creating symlink /data/OMERO/ManagedRepository/demo_2/2016-06/06/00-58-20.828_mkngff/1a29207c-d50b-48b7-a7c0-54c6252bfd9c.zarr -> /bia-integrator-data/S-BIAD861/1a29207c-d50b-48b7-a7c0-54c6252bfd9c/1a29207c-d50b-48b7-a7c0-54c6252bfd9c.zarr


$ omero mkngff symlink /data/OMERO/ManagedRepository 21118 "/bia-integrator-data/S-BIAD861/d69df538-4684-4b32-8ded-d2f2af43af9f/d69df538-4684-4b32-8ded-d2f2af43af9f.zarr"
...
Creating symlink /data/OMERO/ManagedRepository/demo_2/2016-06/06/21-26-25.533_mkngff/d69df538-4684-4b32-8ded-d2f2af43af9f.zarr -> /bia-integrator-data/S-BIAD861/d69df538-4684-4b32-8ded-d2f2af43af9f/d69df538-4684-4b32-8ded-d2f2af43af9f.zarr

$ omero mkngff symlink /data/OMERO/ManagedRepository 21160 "/bia-integrator-data/S-BIAD861/0cc5dbe3-444a-4ea2-a335-b51cf89c1c53/0cc5dbe3-444a-4ea2-a335-b51cf89c1c53.zarr"
...
Creating symlink /data/OMERO/ManagedRepository/demo_2/2016-06/10/13-37-45.953_mkngff/0cc5dbe3-444a-4ea2-a335-b51cf89c1c53.zarr -> /bia-integrator-data/S-BIAD861/0cc5dbe3-444a-4ea2-a335-b51cf89c1c53/0cc5dbe3-444a-4ea2-a335-b51cf89c1c53.zarr

Checked all three links with e.g.

$ ls -alh /data/OMERO/ManagedRepository/demo_2/2016-06/10/13-37-45.953_mkngff/0cc5dbe3-444a-4ea2-a335-b51cf89c1c53.zarr/
total 133K
drwxrwxrwx. 2 root root 4.0K Nov  6 19:52 .
drwxrwxrwx. 2 root root 4.0K Oct 10 10:27 ..
drwxrwxrwx. 2 root root 4.0K Oct 10 10:27 A
drwxrwxrwx. 2 root root 4.0K Oct 10 10:27 B
drwxrwxrwx. 2 root root 4.0K Oct 10 10:27 C
drwxrwxrwx. 2 root root 4.0K Oct 10 10:27 D
drwxrwxrwx. 2 root root 4.0K Oct 10 10:27 E
drwxrwxrwx. 2 root root 4.0K Oct 10 10:27 F
drwxrwxrwx. 2 root root 4.0K Oct 10 10:27 G
drwxrwxrwx. 2 root root 4.0K Oct 10 10:27 H
drwxrwxrwx. 2 root root 4.0K Oct 10 10:27 I
drwxrwxrwx. 2 root root 4.0K Oct 10 10:27 J
drwxrwxrwx. 2 root root 4.0K Oct 10 10:27 K
drwxrwxrwx. 2 root root 4.0K Oct 10 10:27 L
drwxrwxrwx. 2 root root 4.0K Oct 10 10:27 M
drwxrwxrwx. 2 root root 4.0K Oct 10 10:27 N
drwxrwxrwx. 2 root root 4.0K Oct 10 10:27 O
drwxrwxrwx. 2 root root 4.0K Oct 10 10:27 OME
drwxrwxrwx. 2 root root 4.0K Oct 10 10:27 P
drwxrwxrwx. 2 root root 4.0K Oct 10 10:27 Q
drwxrwxrwx. 2 root root 4.0K Oct 10 10:27 R
drwxrwxrwx. 2 root root 4.0K Oct 10 10:27 S
drwxrwxrwx. 2 root root 4.0K Oct 10 10:27 T
drwxrwxrwx. 2 root root 4.0K Oct 10 10:27 U
drwxrwxrwx. 2 root root 4.0K Oct 10 10:27 V
-rw-rw-rw-. 1 root root  32K Nov  6 19:52 .zattrs
-rw-rw-rw-. 1 root root   23 Nov  6 19:52 .zgroup
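
The same check can be scripted instead of read off the listing. A minimal sketch, assuming that a usable link is a symlink resolving to a directory that holds the .zattrs and .zgroup files shown above; check_zarr_link is a hypothetical helper name:

```shell
# Hypothetical helper: succeed only if the path is a symlink that resolves
# to a directory containing the .zattrs and .zgroup files of a zarr group.
check_zarr_link() {
  [ -L "$1" ] && [ -d "$1" ] && [ -f "$1/.zattrs" ] && [ -f "$1/.zgroup" ]
}

# Example (not run here):
# check_zarr_link /data/OMERO/ManagedRepository/demo_2/2016-06/10/13-37-45.953_mkngff/0cc5dbe3-444a-4ea2-a335-b51cf89c1c53.zarr && echo OK
```

This only confirms that the link resolves and the top-level group metadata is present; it says nothing about the wells or pixel data underneath.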

@will-moore
Member Author

will-moore commented Dec 1, 2023

Since we have a bunch of Plates on idr-testing that have failed memo file generation (via Seb's parallel script and via viewing in web), let's try on a clean idr0125-pilot...

$ cd /uod/idr/metadata/idr-utils/scripts/ngff_filesets/idr0015

(venv3) (base) bash-4.2$ for r in $(cat $IDRID.csv); do
>   biapath=$(echo $r | cut -d',' -f2)
>   uuid=$(echo $biapath | cut -d'/' -f2)
>   fsid=$(echo $r | cut -d',' -f3 | tr -d '[:space:]')
>   psql -U omero -d idr -h $DBHOST -f "$IDRID/$fsid.sql"
>   omero mkngff symlink /data/OMERO/ManagedRepository $fsid "/bia-integrator-data/$biapath/$uuid.zarr" --bfoptions
> done
UPDATE 396
BEGIN
 mkngff_fileset 
----------------
        5288671
(1 row)

COMMIT
Using session for demo@localhost:4064. Idle timeout: 10 min. Current group: Public
Checking for prefix_dir /data/OMERO/ManagedRepository/demo_2/2016-06/09/07-02-56.345
Creating dir at /data/OMERO/ManagedRepository/demo_2/2016-06/09/07-02-56.345_mkngff
Creating symlink /data/OMERO/ManagedRepository/demo_2/2016-06/09/07-02-56.345_mkngff/0063750d-cd10-4759-8eca-c706a07b6693.zarr -> /bia-integrator-data/S-BIAD861/0063750d-cd10-4759-8eca-c706a07b6693/0063750d-cd10-4759-8eca-c706a07b6693.zarr
Checking for prefix_dir /data/OMERO/ManagedRepository/demo_2/2016-06/09/07-02-56.345
write bfoptions to: /data/OMERO/ManagedRepository/demo_2/2016-06/09/07-02-56.345_mkngff/0063750d-cd10-4759-8eca-c706a07b6693.zarr.bfoptions
UPDATE 396
BEGIN
 mkngff_fileset 
----------------
        5288672
(1 row)

Try viewing image http://localhost:1040/webclient/userdata/?show=image-1962455 (12:25)...

Image is viewable: Memo file took 69 mins (4178531 ms)

(base) bash-4.2$ grep 54c6252bfd9c -A 2 -B 2 /opt/omero/server/OMERO.server/var/log/Blitz-0.log | grep -A 2 "saved memo"
2023-12-01 13:35:04,681 DEBUG [                   loci.formats.Memoizer] (l.Server-9) saved memo file: /data/OMERO/BioFormatsCache/data/OMERO/ManagedRepository/demo_2/2016-06/06/00-58-20.828_mkngff/1a29207c-d50b-48b7-a7c0-54c6252bfd9c.zarr/OME/.METADATA.ome.xml.bfmemo (735340 bytes)
2023-12-01 13:35:04,681 DEBUG [                   loci.formats.Memoizer] (l.Server-9) start[1701433526150] time[4178531] tag[loci.formats.Memoizer.setId]
2023-12-01 13:35:04,682 INFO  [                ome.io.nio.PixelsService] (l.Server-9) Creating BfPixelBuffer: /data/OMERO/ManagedRepository/demo_2/2016-06/06/00-58-20.828_mkngff/1a29207c-d50b-48b7-a7c0-54c6252bfd9c.zarr/OME/METADATA.ome.xml Series: 0
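
The 69 mins figure comes from the time[...] value (in ms) on the Memoizer.setId line. A small sketch for pulling that out of the log and converting it, assuming the log line format shown above; memo_durations is a hypothetical helper name:

```shell
# Hypothetical helper: read Blitz log lines on stdin and print the duration
# of each Memoizer.setId call in whole minutes and seconds.
memo_durations() {
  grep 'tag\[loci.formats.Memoizer.setId]' \
    | sed -n 's/.*time\[\([0-9][0-9]*\)].*/\1/p' \
    | awk '{ s = int($1 / 1000); printf "%d ms = %d min %d s\n", $1, int(s / 60), s % 60 }'
}

# Example (not run here):
# grep 54c6252bfd9c -A 2 -B 2 /opt/omero/server/OMERO.server/var/log/Blitz-0.log | memo_durations
```

For the line above this prints `4178531 ms = 69 min 38 s`.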

@will-moore
Member Author

will-moore commented Dec 1, 2023

Plates where memo file generation failed...

Image:1962455 Fileset:21112 TARA_HCS1_H5_G100007665_G100007576--2013_10_28_21_05_26_chamber--U00--V01
Image:1964831 Fileset:21118 TARA_HCS1_H5_G100008060_G100008062--2013_11_03_22_51_32_chamber--U00--V01
Image:1972041 Fileset:21160 TARA_HCS1_H5_G100010241_G100010731--2013_09_29_19_14_59_chamber--U00--V01

When regenerating these, I didn't include the clientpath option...

(base) bash-4.2$ grep -c https *.sql
19237.sql:3195
19238.sql:3195
19269.sql:3195
19270.sql:3195
19301.sql:3195
19302.sql:3195
19308.sql:3195
19309.sql:3195
19471.sql:3195
20252.sql:3195
20302.sql:3195
20303.sql:3195
20305.sql:3195
20306.sql:3195
20901.sql:3195
20902.sql:3195
20903.sql:1890
21002.sql:3195
21051.sql:3195
21102.sql:3195
21103.sql:3195
21104.sql:3195
21105.sql:3195
21106.sql:3195
21107.sql:3195
21108.sql:3187
21109.sql:3187
21110.sql:3195
21111.sql:3195
21112.sql:0
21113.sql:3195
21114.sql:3195
21115.sql:3195
21116.sql:3195
21117.sql:3195
21118.sql:0
21119.sql:3195
21120.sql:3195
21121.sql:3195
21122.sql:3195
21123.sql:3195
21124.sql:3195
21125.sql:3195
21126.sql:3195
21151.sql:3195
21152.sql:3195
21153.sql:3195
21154.sql:3195
21155.sql:3195
21156.sql:3195
21157.sql:585
21158.sql:3195
21159.sql:3195
21160.sql:0
21161.sql:3195
21162.sql:3195
21163.sql:3195
21164.sql:3195
21165.sql:3195
21166.sql:3195
21167.sql:3195
21168.sql:3195
21169.sql:3195
21170.sql:3195
21171.sql:3195
21172.sql:3195
21173.sql:3195
21174.sql:3195
21175.sql:3195
21176.sql:3195
21177.sql:3195
21178.sql:3195
21179.sql:3195
21180.sql:3195
21182.sql:3195
21185.sql:3195
21188.sql:3195
21191.sql:3195
21194.sql:3195
21197.sql:3195
21200.sql:3195
21203.sql:3195
21218.sql:3195

Fixed in IDR/idr-utils@7bb3acf
using add_clientpath.py script in IDR/idr-utils@87e17ab
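
To pick out the files with a zero count without scanning the listing by eye, grep -L (list files with no match) can be used. A minimal sketch; sql_without_https is a hypothetical helper name and the directory argument is an assumption about where the .sql files live:

```shell
# Hypothetical helper: list the .sql files in a directory that contain no
# "https" string, i.e. those that report a count of 0 in the listing above.
sql_without_https() {
  grep -L https "$1"/*.sql | sed 's|.*/||'
}

# Example (not run here): sql_without_https .
```

Run before the fix, this would be expected to list 21112.sql, 21118.sql and 21160.sql, matching the zero counts above.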

@will-moore
Member Author

will-moore commented Dec 3, 2023

Tried viewing the other 2 Plates too:
http://localhost:1040/webclient/userdata/?show=image-1964831
http://localhost:1040/webclient/userdata/?show=image-1972041

These are viewable on idr0125-pilot. So it looks like all 3 of the Filesets that aren't viewable on idr-testing are OK, but something has prevented them from being viewable there, possibly a failed attempt to create a memo file before the symlink was created.

@will-moore will-moore moved this from check_pixels to check_pixels in progress in NGFF conversion Dec 4, 2023
@will-moore
Member Author

On idr-next, I omitted running the 3 sql scripts for regenerated Filesets above: IDR/idr-utils@125c4e5

However, these are actually ready to go, since they are working on idr0125-pilot above (and now have clientpath links too).

Run on idr-next...

create idr0015a.csv:

idr0015/TARA_HCS1_H5_G100008060_G100008062--2013_11_03_22_51_32_chamber--U00--V01.ome.zarr,S-BIAD861/d69df538-4684-4b32-8ded-d2f2af43af9f,21118
idr0015/TARA_HCS1_H5_G100010241_G100010731--2013_09_29_19_14_59_chamber--U00--V01.ome.zarr,S-BIAD861/0cc5dbe3-444a-4ea2-a335-b51cf89c1c53,21160
idr0015/TARA_HCS1_H5_G100007665_G100007576--2013_10_28_21_05_26_chamber--U00--V01.ome.zarr,S-BIAD861/1a29207c-d50b-48b7-a7c0-54c6252bfd9c,21112

As wmoore, updated SECRET

for i in $(ls); do sudo sed -i 's/oldsecret/x1234xxxx-7b69-473b-98b2-xxxxxxxxx/g' $i; done

as omero-server

omero login
cd ngff_filesets
export IDRID=idr0015
for r in $(cat idr0015a.csv); do
  biapath=$(echo $r | cut -d',' -f2)
  uuid=$(echo $biapath | cut -d'/' -f2)
  fsid=$(echo $r | cut -d',' -f3 | tr -d '[:space:]')
  psql -U omero -d idr -h $DBHOST -f "$IDRID/$fsid.sql"
  omero mkngff symlink /data/OMERO/ManagedRepository $fsid "/bia-integrator-data/$biapath/$uuid.zarr" --bfoptions
done

UPDATE 396
BEGIN
 mkngff_fileset
----------------
        6314434
(1 row)

COMMIT
Using session for demo@localhost:4064. Idle timeout: 10 min. Current group: Public
Checking for prefix_dir /data/OMERO/ManagedRepository/demo_2/2016-06/06/21-26-25.533
Creating dir at /data/OMERO/ManagedRepository/demo_2/2016-06/06/21-26-25.533_mkngff
Creating symlink /data/OMERO/ManagedRepository/demo_2/2016-06/06/21-26-25.533_mkngff/d69df538-4684-4b32-8ded-d2f2af43af9f.zarr -> /bia-integrator-data/S-BIAD861/d69df538-4684-4b32-8ded-d2f2af43af9f/d69df538-4684-4b32-8ded-d2f2af43af9f.zarr
Checking for prefix_dir /data/OMERO/ManagedRepository/demo_2/2016-06/06/21-26-25.533
write bfoptions to: /data/OMERO/ManagedRepository/demo_2/2016-06/06/21-26-25.533_mkngff/d69df538-4684-4b32-8ded-d2f2af43af9f.zarr.bfoptions
UPDATE 396
BEGIN
 mkngff_fileset 
----------------
        6314435
(1 row)

COMMIT
Using session for demo@localhost:4064. Idle timeout: 10 min. Current group: Public
Checking for prefix_dir /data/OMERO/ManagedRepository/demo_2/2016-06/10/13-37-45.953
Creating dir at /data/OMERO/ManagedRepository/demo_2/2016-06/10/13-37-45.953_mkngff
Creating symlink /data/OMERO/ManagedRepository/demo_2/2016-06/10/13-37-45.953_mkngff/0cc5dbe3-444a-4ea2-a335-b51cf89c1c53.zarr -> /bia-integrator-data/S-BIAD861/0cc5dbe3-444a-4ea2-a335-b51cf89c1c53/0cc5dbe3-444a-4ea2-a335-b51cf89c1c53.zarr
Checking for prefix_dir /data/OMERO/ManagedRepository/demo_2/2016-06/10/13-37-45.953
write bfoptions to: /data/OMERO/ManagedRepository/demo_2/2016-06/10/13-37-45.953_mkngff/0cc5dbe3-444a-4ea2-a335-b51cf89c1c53.zarr.bfoptions
UPDATE 396
BEGIN
 mkngff_fileset 
----------------
        6314436
(1 row)

COMMIT
Using session for demo@localhost:4064. Idle timeout: 10 min. Current group: Public
Checking for prefix_dir /data/OMERO/ManagedRepository/demo_2/2016-06/06/00-58-20.828
Creating dir at /data/OMERO/ManagedRepository/demo_2/2016-06/06/00-58-20.828_mkngff
Creating symlink /data/OMERO/ManagedRepository/demo_2/2016-06/06/00-58-20.828_mkngff/1a29207c-d50b-48b7-a7c0-54c6252bfd9c.zarr -> /bia-integrator-data/S-BIAD861/1a29207c-d50b-48b7-a7c0-54c6252bfd9c/1a29207c-d50b-48b7-a7c0-54c6252bfd9c.zarr
Checking for prefix_dir /data/OMERO/ManagedRepository/demo_2/2016-06/06/00-58-20.828
write bfoptions to: /data/OMERO/ManagedRepository/demo_2/2016-06/06/00-58-20.828_mkngff/1a29207c-d50b-48b7-a7c0-54c6252bfd9c.zarr.bfoptions
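
As a dry-run companion to the loop above, the rows can be split on commas by `read` rather than by word-splitting `$(cat ...)`. A sketch only: plan_rows is a hypothetical helper that prints the Fileset ID and zarr path that would be passed on, without running psql or omero, and it assumes the three-column layout of idr0015a.csv. It takes the uuid as the text after the last '/' in the second column, which matches `cut -d'/' -f2` for these rows.

```shell
# Hypothetical helper: read "name,biapath,fsid" rows and print, per row,
# the Fileset ID and the zarr path that the loop above passes to mkngff.
plan_rows() {
  while IFS=, read -r _name biapath fsid || [ -n "$_name" ]; do
    fsid=$(printf '%s' "$fsid" | tr -d '[:space:]')
    [ -n "$fsid" ] || continue          # skip blank or malformed rows
    uuid=${biapath##*/}                 # text after the last '/'
    printf '%s /bia-integrator-data/%s/%s.zarr\n' "$fsid" "$biapath" "$uuid"
  done < "$1"
}

# Example (not run here): plan_rows idr0015a.csv
```

Checking this output against the "Creating symlink" lines is a cheap way to confirm the csv before touching the database.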

@will-moore
Member Author

Viewing first image to trigger memo file...
http://localhost:1080/webclient/?show=image-1962455

@will-moore will-moore moved this from check_pixels in progress to Round 2 - psql fileset IDs checked in NGFF conversion Mar 19, 2024
@will-moore will-moore moved this from Other issues (not studies) to NGFF studies in NGFF conversion May 21, 2024
@will-moore will-moore mentioned this issue May 21, 2024
15 tasks
@dominikl
Member

Size of the dataset on S3:

Total Objects: 23498354
Total Size: 6.0 TiB

(aws --endpoint-url https://uk1s3.embassy.ebi.ac.uk s3 ls --summarize --human-readable --recursive --no-sign-request bia-integrator-data/S-BIAD861)

@sbesson
Member

sbesson commented Jul 10, 2024

Interestingly, that's an increase of more than a factor of 2 compared to the original format (2.5TB in studies.tsv).
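
For reference, the factor can be worked out as follows. A sketch that assumes the S3 figure is binary TiB (as printed by aws --human-readable) and the studies.tsv figure is decimal TB; tib_to_tb_ratio is a hypothetical helper name:

```shell
# Hypothetical helper: ratio of a size given in TiB to a size given in
# decimal TB, printed to two decimal places.
tib_to_tb_ratio() {
  awk -v tib="$1" -v tb="$2" 'BEGIN { printf "%.2f\n", tib * 1024 ^ 4 / 1000 ^ 4 / tb }'
}

# Example: tib_to_tb_ratio 6.0 2.5
```

With 6.0 and 2.5 this prints 2.64; if both figures were in the same unit the ratio would be 2.4, so the conclusion holds either way.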
