idr0026-weigelin-immunotherapy S-BIAD860 #648

Open
will-moore opened this issue Feb 22, 2023 · 37 comments

@will-moore
Member

idr0026-weigelin-immunotherapy

@will-moore will-moore moved this to test convert in NGFF conversion Feb 22, 2023
@dominikl dominikl moved this from test convert to re-import test image in NGFF conversion Feb 27, 2023
@dominikl
Member

dominikl commented Mar 6, 2023

Conversion time: 9min
Import time: 72h

@dominikl dominikl moved this from re-import test image to convert all data to NGFF in NGFF conversion Mar 6, 2023
@will-moore
Member Author

Trying to estimate how much space is needed for this conversion.

First image is uint16 (2 bytes), 507 x 507 x 21 x 71 x 4 = approx 3 GB.

Images vary in size across the study, but about 111 .pattern images (see IDR/idr-utils#56) need converting.

Maybe 300 GB or more needed (maybe up to 500 GB)?
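The arithmetic behind this estimate can be written out as a small sketch. It assumes, purely for illustration, that all 111 images have the same dimensions as the first one, which the comment above says is not the case, so the total is only a rough guide to uncompressed size.

```python
# Rough storage estimate for uncompressed pixel data.
BYTES_PER_PIXEL = 2  # uint16


def image_bytes(size_x, size_y, size_z, size_t, size_c,
                bytes_per_pixel=BYTES_PER_PIXEL):
    """Uncompressed size of one 5D image in bytes."""
    return size_x * size_y * size_z * size_t * size_c * bytes_per_pixel


# Dimensions of the first image, as quoted above.
first_image = image_bytes(507, 507, 21, 71, 4)

# Number of .pattern files to convert.
n_images = 111

# Assumption: every image is as large as the first one (not true in practice).
total_estimate = first_image * n_images

print(f"first image: {first_image / 1e9:.2f} GB")
print(f"{n_images} images: {total_estimate / 1e9:.0f} GB")
```

Under that assumption the first image comes to about 3.07 GB and the whole study to about 340 GB, which sits inside the 300 to 500 GB range guessed above. Compression in the zarr output would reduce the real figure.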

@will-moore will-moore self-assigned this Jun 20, 2023
@will-moore
Member Author

will-moore commented Jun 20, 2023

Looks like all the pattern files we need to convert are under:

$ ls /uod/idr/metadata/idr0026-weigelin-immunotherapy/patterns/ | wc
    111     111    3970

Corresponds to image count from IDR/idr-utils#56

$ screen -S idr0026_bf2raw

$ conda activate bioformats2raw

$ cd /data
$ sudo chown wmoore ./idr0026
$ cd idr0026

$ for i in `ls /uod/idr/metadata/idr0026-weigelin-immunotherapy/patterns/`; do echo $i; /home/wmoore/bioformats2raw-0.6.0-24/bin/bioformats2raw --memo-directory ../memo /uod/idr/metadata/idr0026-weigelin-immunotherapy/patterns/$i ${i%.*}.ome.zarr; done
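For readers who prefer to see how the loop derives each output name, here is a minimal Python sketch of the same command construction. The paths and the `--memo-directory` flag are copied from the loop above; the function names are hypothetical, and the `*.pattern` glob is an assumption (the shell loop iterates over every entry in the directory).

```python
from pathlib import Path

# Paths copied from the shell loop above.
BF2RAW = "/home/wmoore/bioformats2raw-0.6.0-24/bin/bioformats2raw"
PATTERNS_DIR = Path("/uod/idr/metadata/idr0026-weigelin-immunotherapy/patterns")


def conversion_command(pattern_file, memo_dir="../memo"):
    """Build the argv list for converting one .pattern file.

    Mirrors ${i%.*}.ome.zarr: only the final extension is stripped,
    so dots earlier in the name are preserved.
    """
    pattern_file = Path(pattern_file)
    output = pattern_file.stem + ".ome.zarr"
    return [BF2RAW, "--memo-directory", memo_dir, str(pattern_file), output]


def all_commands(patterns_dir=PATTERNS_DIR):
    """One command per .pattern file (hypothetical helper, sorted for repeatability)."""
    return [conversion_command(p) for p in sorted(Path(patterns_dir).glob("*.pattern"))]
```

Each returned list could be passed to `subprocess.run(cmd, check=True)` so that a failed conversion stops the batch instead of being skipped silently.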

@will-moore will-moore moved this from convert all data to NGFF to upload data to s3 in NGFF conversion Jun 21, 2023
@will-moore
Member Author

$ aws --endpoint-url https://uk1s3.embassy.ebi.ac.uk s3 mb s3://idr0026
make_bucket: idr0026
$ aws --endpoint-url https://uk1s3.embassy.ebi.ac.uk s3api put-bucket-policy --bucket idr0026 --policy file://policy.json
$ aws --endpoint-url https://uk1s3.embassy.ebi.ac.uk s3api put-bucket-cors --bucket idr0026  --cors-configuration file://cors.json
$ /home/wmoore/mc cp -r idr0026/ uk1s3/idr0026/zarr
...3.140926_14-52-18.03.ome.zarr/OME/METADATA.ome.xml: 282.79 GiB / 282.79 GiB ━━━━━━━━━━━━━

@will-moore
Member Author

will-moore commented Jun 22, 2023

Checking on s3...

E.g. https://ome.github.io/ome-ngff-validator/?source=https://uk1s3.embassy.ebi.ac.uk/idr0026/zarr/3.50.6-3.140922_11-36-07.00.ome.zarr/0/

This image has only a single entry in omero:channels, so it appears as a single-channel image in vizarr, even though the zarr array data has 4 channels and looks OK in the validator.

Image

cc @sbesson

@will-moore will-moore added the bug label Jun 22, 2023
@will-moore will-moore removed their assignment Jun 22, 2023
@will-moore
Member Author

However, the OME.xml looks OK, e.g. https://uk1s3.embassy.ebi.ac.uk/idr0026/zarr/3.50.6-3.140922_11-36-07.00.ome.zarr/OME/METADATA.ome.xml
This has 4 channels, so maybe the .zattrs omero.channels info is not so critical when we import to OMERO. But it's still wrong!

<Pixels BigEndian="true" DimensionOrder="XYZCT" ID="Pixels:0" Interleaved="false" SignificantBits="16" SizeC="4" SizeT="71" SizeX="507" SizeY="507" SizeZ="21" Type="uint16">
<Channel ID="Channel:0:0" Name="FD6_GREEN" SamplesPerPixel="1">
<LightPath/>
</Channel>
<Channel ID="Channel:0:1" Name="FD5_BLUE" SamplesPerPixel="1">
<LightPath/>
</Channel>
<Channel ID="Channel:0:2" Name="BD8_RED" SamplesPerPixel="1">
<LightPath/>
</Channel>
<Channel ID="Channel:0:3" Name="BD7_RED" SamplesPerPixel="1">
<LightPath/>
</Channel>
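The mismatch described here (4 channels in the array and in the OME-XML, but a single entry under `omero.channels`) could be detected automatically. The sketch below is one possible check, not part of any existing tool: the function name is hypothetical, and it assumes the `.zattrs` content has already been loaded as a dict and that the array shape comes from the matching `.zarray` file.

```python
def omero_channel_mismatch(zattrs, array_shape):
    """Compare the channel axis length with the number of omero channels.

    Returns (n_array_channels, n_omero_channels) when they differ,
    or None when they agree.
    """
    axes = zattrs["multiscales"][0]["axes"]
    names = [axis["name"] for axis in axes]
    # An image without a "c" axis is treated as single-channel.
    n_array = array_shape[names.index("c")] if "c" in names else 1
    n_omero = len(zattrs.get("omero", {}).get("channels", []))
    return None if n_array == n_omero else (n_array, n_omero)


# Example with the axis order and dimensions reported in this thread.
example_zattrs = {
    "multiscales": [{"axes": [{"name": n} for n in ("t", "c", "z", "y", "x")]}],
    "omero": {"channels": [{"label": "Channel 0"}]},
}
print(omero_channel_mismatch(example_zattrs, (71, 4, 21, 507, 507)))
```

Running a check like this over every converted image before upload would have flagged the problem without needing to open each image in a viewer.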

@will-moore will-moore moved this from upload data to s3 to create new Fileset to replace original Fileset in NGFF conversion Jun 23, 2023
@will-moore will-moore moved this from create new Fileset to replace original Fileset to upload some data to s3 and test in NGFF conversion Jun 26, 2023
@will-moore
Member Author

On pilot-idr0125...

sudo mkdir /idr0026 && sudo /opt/goofys --endpoint https://uk1s3.embassy.ebi.ac.uk/ -o allow_other idr0026 /idr0026


# copy metadata-only images....
screen -S idr0010_aws_sync
aws s3 sync --no-sign-request --exclude '*' --include "*/.z*" --include "*.xml" --endpoint-url https://uk1s3.embassy.ebi.ac.uk s3://idr0026/zarr .
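The include filters in the sync above select only zarr metadata files and XML, skipping the chunk data. As an illustration of which keys that keeps, here is a small sketch using Python's `fnmatch`. This only approximates the AWS CLI's own pattern matching, and the function name is hypothetical.

```python
from fnmatch import fnmatchcase

# The two --include patterns from the sync command above.
# --exclude '*' drops everything else.
INCLUDE_PATTERNS = ("*/.z*", "*.xml")


def is_metadata_key(key):
    """True if an object key would be kept by the include patterns."""
    return any(fnmatchcase(key, pattern) for pattern in INCLUDE_PATTERNS)


for key in (
    "a.ome.zarr/0/.zattrs",             # group metadata: kept
    "a.ome.zarr/0/0/.zarray",           # array metadata: kept
    "a.ome.zarr/OME/METADATA.ome.xml",  # OME-XML: kept
    "a.ome.zarr/0/0/0/0/0/0/0",         # chunk data: skipped
):
    print(key, is_metadata_key(key))
```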

# import all images into Dataset

for dir in *; do
  omero import -d 15352 --transfer=ln_s --depth=100 --name=${dir/.ome.zarr/} --skip=all $dir --file /tmp/$dir.log  --errs /tmp/$dir.err;
done


$ python idr-utils/scripts/managed_repo_symlinks.py Dataset:15352 /idr0026/zarr

These look good in OMERO, compared to existing IDR

Image

@sbesson sbesson self-assigned this Jun 28, 2023
@sbesson
Member

sbesson commented Jun 28, 2023

@will-moore not 100% sure of what went wrong on your conversion but using the converter library shipping with the current IDR version of Bio-Formats, I get

$ /opt/bioformats2raw/bioformats2raw-0.6.0-24/bin/bioformats2raw -p 3.49.6-3.140922_11-33-57.00.pattern 3.49.6-3.140922_11-33-57.00.zarr
OpenJDK 64-Bit Server VM warning: You have loaded library /tmp/opencv_openpnp6732331320596727672/nu/pattern/opencv/linux/x86_64/libopencv_java342.so which might have disabled stack guard. The VM will try to fix the stack guard now.
It's highly recommended that you fix the library with 'execstack -c <libfile>', or link it with '-z noexecstack'.
..>
[0/0]  99% │███████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████▉│ 3239/3240 (0:02:38 / 0:00:00) 
[0/0] 100% │████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████│ 3240/3240 (0:02:38 / 0:00:00) 
[0/1] 100% │████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████│ 3240/3240 (0:00:04 / 0:00:00) 

and the omero key contains metadata about all four channels as expected

$ cat 3.49.6-3.140922_11-33-57.00.zarr/0/.zattrs 
{
  "multiscales" : [ {
    "metadata" : {
      "method" : "loci.common.image.SimpleImageScaler",
      "version" : "Bio-Formats 0.6.10"
    },
    "axes" : [ {
      "name" : "t",
      "type" : "time"
    }, {
      "name" : "c",
      "type" : "channel"
    }, {
      "name" : "z",
      "type" : "space"
    }, {
      "name" : "y",
      "type" : "space"
    }, {
      "name" : "x",
      "type" : "space"
    } ],
    "name" : "11-33-57_PMT - PMT [BD2_GREEN] [00]_Time Time0000.tif",
    "datasets" : [ {
      "path" : "0",
      "coordinateTransformations" : [ {
        "scale" : [ 1.0, 1.0, 1.0, 1.0, 1.0 ],
        "type" : "scale"
      } ]
    }, {
      "path" : "1",
      "coordinateTransformations" : [ {
        "scale" : [ 1.0, 1.0, 1.0, 2.0, 2.0 ],
        "type" : "scale"
      } ]
    } ],
    "version" : "0.4"
  } ],
  "omero" : {
    "channels" : [ {
      "color" : "FF0000",
      "coefficient" : 1,
      "active" : true,
      "label" : "BD2_GREEN",
      "window" : {
        "min" : 372.0,
        "max" : 15788.0,
        "start" : 372.0,
        "end" : 15788.0
      },
      "family" : "linear",
      "inverted" : false
    }, {
      "color" : "00FF00",
      "coefficient" : 1,
      "active" : true,
      "label" : "BD8_DEEPR",
      "window" : {
        "min" : 373.0,
        "max" : 16188.0,
        "start" : 373.0,
        "end" : 16188.0
      },
      "family" : "linear",
      "inverted" : false
    }, {
      "color" : "0000FF",
      "coefficient" : 1,
      "active" : true,
      "label" : "BD7_RED",
      "window" : {
        "min" : 759.0,
        "max" : 8978.0,
        "start" : 759.0,
        "end" : 8978.0
      },
      "family" : "linear",
      "inverted" : false
    }, {
      "color" : "FF0000",
      "coefficient" : 1,
      "active" : false,
      "label" : "FD6_FDRED",
      "window" : {
        "min" : 237.0,
        "max" : 12339.0,
        "start" : 237.0,
        "end" : 12339.0
      },
      "family" : "linear",
      "inverted" : false
    } ],
    "rdefs" : {
      "defaultT" : 0,
      "model" : "color",
      "defaultZ" : 13
    }
  }
}

Sounds like the best way forward would be to redo the whole conversion?

@will-moore
Member Author

@sbesson Looking at #648 (comment), it looks like I used the same version: bioformats2raw-0.6.0-24.
But I'll delete and try again...

@will-moore
Member Author

Testing... (and failing!)...

(bioformats2raw) [wmoore@pilot-zarr1-dev test]$ pwd
/data/idr0026/test
$ ~/bioformats2raw-0.6.0-24/bin/bioformats2raw /uod/idr/metadata/idr0026-weigelin-immunotherapy/patterns/3.65.9-6.141023_15-45-09.03.pattern 3.65.9-6.141023_15-45-09.03.ome.zarr

cat 3.65.9-6.141023_15-45-09.03.ome.zarr/0/.zattrs

{
  "multiscales" : [ {
    "metadata" : {
      "method" : "loci.common.image.SimpleImageScaler",
      "version" : "Bio-Formats 0.6.10"
    },
    "axes" : [ {
      "name" : "t",
      "type" : "time"
    }, {
      "name" : "c",
      "type" : "channel"
    }, {
      "name" : "z",
      "type" : "space"
    }, {
      "name" : "y",
      "type" : "space"
    }, {
      "name" : "x",
      "type" : "space"
    } ],
    "name" : "15-45-09_PMT - PMT [BD7_RED] [03]_Time Time0074.tif",
    "datasets" : [ {
      "path" : "0",
      "coordinateTransformations" : [ {
        "scale" : [ 1.0, 1.0, 1.0, 1.0, 1.0 ],
        "type" : "scale"
      } ]
    }, {
      "path" : "1",
      "coordinateTransformations" : [ {
        "scale" : [ 1.0, 1.0, 1.0, 2.0, 2.0 ],
        "type" : "scale"
      } ]
    } ],
    "version" : "0.4"
  } ],
  "omero" : {
    "channels" : [ {
      "color" : "808080",
      "coefficient" : 1,
      "active" : true,
      "label" : "Channel 0",
      "window" : {
        "min" : 332.0,
        "max" : 15788.0,
        "start" : 332.0,
        "end" : 15788.0
      },
      "family" : "linear",
      "inverted" : false
    } ],
    "rdefs" : {
      "defaultT" : 0,
      "model" : "greyscale",
      "defaultZ" : 0
    }
  }
}

@will-moore
Member Author

Even using the same lib as @sbesson gives me same result?!

(bioformats2raw) [wmoore@pilot-zarr1-dev test]$ /opt/bioformats2raw/bioformats2raw-0.6.0-24/bin/bioformats2raw /uod/idr/metadata/idr0026-weigelin-immunotherapy/patterns/3.49.6-3.140922_11-33-57.01.pattern 3.49.6-3.140922_11-33-57.01.ome.zarr

$ cat 3.49.6-3.140922_11-33-57.01.ome.zarr/0/.zattrs
{
  "multiscales" : [ {
    "metadata" : {
      "method" : "loci.common.image.SimpleImageScaler",
      "version" : "Bio-Formats 0.6.10"
    },
    "axes" : [ {
      "name" : "t",
      "type" : "time"
    }, {
      "name" : "c",
      "type" : "channel"
    }, {
      "name" : "z",
      "type" : "space"
    }, {
      "name" : "y",
      "type" : "space"
    }, {
      "name" : "x",
      "type" : "space"
    } ],
    "name" : "11-33-57_PMT - PMT [FD6_FDRED] [01]_Time Time0028.tif",
    "datasets" : [ {
      "path" : "0",
      "coordinateTransformations" : [ {
        "scale" : [ 1.0, 1.0, 1.0, 1.0, 1.0 ],
        "type" : "scale"
      } ]
    }, {
      "path" : "1",
      "coordinateTransformations" : [ {
        "scale" : [ 1.0, 1.0, 1.0, 2.0, 2.0 ],
        "type" : "scale"
      } ]
    } ],
    "version" : "0.4"
  } ],
  "omero" : {
    "channels" : [ {
      "color" : "808080",
      "coefficient" : 1,
      "active" : true,
      "label" : "Channel 0",
      "window" : {
        "min" : 376.0,
        "max" : 15788.0,
        "start" : 376.0,
        "end" : 15788.0
      },
      "family" : "linear",
      "inverted" : false
    } ],
    "rdefs" : {
      "defaultT" : 0,
      "model" : "greyscale",
      "defaultZ" : 0
    }
  }
}

@sbesson
Member

sbesson commented Jun 29, 2023

I suspect there's something wrong with your environment and particularly the bioformats2raw Conda environment that you're using. Can you try deactivating Conda completely and simply running /opt/bioformats2raw/bioformats2raw-0.6.0-24/bin/bioformats2raw /uod/idr/metadata/idr0026-weigelin-immunotherapy/patterns/3.49.6-3.140922_11-33-57.01.pattern 3.49.6-3.140922_11-33-57.01.ome.zarr ?

@will-moore
Member Author

That didn't work either!

I wanted to try on a different machine completely...

$ ssh pilot-zarr2-dev
$ cd /data
$ sudo mkdir idr0026
$ sudo chown wmoore idr0026
$ /opt/bioformats2raw/bioformats2raw-0.6.0-24/bin/bioformats2raw -p /uod/idr/metadata/idr0026-weigelin-immunotherapy/patterns/3.49.6-3.140922_11-33-57.01.pattern 3.49.6-3.140922_11-33-57.01.ome.zarr
OpenJDK 64-Bit Server VM warning: You have loaded library /tmp/opencv_openpnp632426155291925201/nu/pattern/opencv/linux/x86_64/libopencv_java342.so which might have disabled stack guard. The VM will try to fix the stack guard now.
It's highly recommended that you fix the library with 'execstack -c <libfile>', or link it with '-z noexecstack'.
Exception in thread "main" picocli.CommandLine$ExecutionException: Error while calling command (com.glencoesoftware.bioformats2raw.Converter@6ce139a4): java.io.FileNotFoundException: /uod/idr/metadata/idr0026-weigelin-immunotherapy/patterns/3.49.6-3.140922_11-33-57.01.pattern (No such file or directory)

$ cd /uod/idr/metadata/
$ ls
idr0010-doil-dnadamage  idr0054-segura-tonsilhyperion

Do I need to clone all of https://github.com/IDR/idr-metadata here?

@will-moore
Member Author

Just confirming...

(base) [wmoore@pilot-zarr1-dev idr0026]$ mkdir test3
(base) [wmoore@pilot-zarr1-dev idr0026]$ cd test3 
(base) [wmoore@pilot-zarr1-dev test3]$ /opt/bioformats2raw/bioformats2raw-0.6.0-24/bin/bioformats2raw /uod/idr/metadata/idr0026-weigelin-immunotherapy/patterns/3.49.6-3.140922_11-33-57.01.pattern 3.49.6-3.140922_11-33-57.01.ome.zarr
OpenJDK 64-Bit Server VM warning: You have loaded library /tmp/opencv_openpnp7313185331649622917/nu/pattern/opencv/linux/x86_64/libopencv_java342.so which might have disabled stack guard. The VM will try to fix the stack guard now.
It's highly recommended that you fix the library with 'execstack -c <libfile>', or link it with '-z noexecstack'.
2023-06-29 10:14:29,998 [main] WARN  loci.formats.in.BaseTiffReader - unknown creation date format: 2014-09-22 11:34:50

$ cat 3.49.6-3.140922_11-33-57.01.ome.zarr/0/.zattrs 
{
  "multiscales" : [ {
    "metadata" : {
      "method" : "loci.common.image.SimpleImageScaler",
      "version" : "Bio-Formats 0.6.10"
    },
    "axes" : [ {
      "name" : "t",
      "type" : "time"
    }, {
      "name" : "c",
      "type" : "channel"
    }, {
      "name" : "z",
      "type" : "space"
    }, {
      "name" : "y",
      "type" : "space"
    }, {
      "name" : "x",
      "type" : "space"
    } ],
    "name" : "11-33-57_PMT - PMT [FD6_FDRED] [01]_Time Time0028.tif",
    "datasets" : [ {
      "path" : "0",
      "coordinateTransformations" : [ {
        "scale" : [ 1.0, 1.0, 1.0, 1.0, 1.0 ],
        "type" : "scale"
      } ]
    }, {
      "path" : "1",
      "coordinateTransformations" : [ {
        "scale" : [ 1.0, 1.0, 1.0, 2.0, 2.0 ],
        "type" : "scale"
      } ]
    } ],
    "version" : "0.4"
  } ],
  "omero" : {
    "channels" : [ {
      "color" : "808080",
      "coefficient" : 1,
      "active" : true,
      "label" : "Channel 0",
      "window" : {
        "min" : 376.0,
        "max" : 15788.0,
        "start" : 376.0,
        "end" : 15788.0
      },
      "family" : "linear",
      "inverted" : false
    } ],
    "rdefs" : {
      "defaultT" : 0,
      "model" : "greyscale",
      "defaultZ" : 0
    }
  }
}

@sbesson
Member

sbesson commented Jun 29, 2023

Do I need to clone all of https://github.com/IDR/idr-metadata here?

For the sake of testing, you might just want to copy the single .pattern file you want to test directly. Otherwise, yes, you need to clone the whole repository unless @francesw wants to look into extracting idr0026 into a standalone Git repository.

Just confirming...

The inconsistency in the output is very concerning. Have you tried after fully deactivating Conda, not just your environment?

@will-moore
Member Author

Have you tried after fully deactivating Conda, not just your environment?

No. How do you do that?

@sbesson
Member

sbesson commented Jun 29, 2023

conda deactivate

@will-moore
Member Author

I already did that. How's that different from deactivating your environment?

I tried on a different machine...
Cloned idr-metadata and moved it to /uod/idr/metadata

cd /uod/idr/metadata/
sudo -Es git clone [email protected]:IDR/idr-metadata.git
cd ../
sudo mv metadata/idr-metadata ./
sudo rm metadata   # symlink to /data/idr-metadata
sudo mv idr-metadata metadata

Then tried...

cd /data/idr0026/
[wmoore@pilot-zarr2-dev idr0026]$ /opt/bioformats2raw/bioformats2raw-0.6.0-24/bin/bioformats2raw -p /uod/idr/metadata/idr0026-weigelin-immunotherapy/patterns/3.49.6-3.140922_11-33-57.01.pattern 3.49.6-3.140922_11-33-57.01.ome.zarr

$ cat 3.49.6-3.140922_11-33-57.01.ome.zarr/0/.zattrs
{
  "multiscales" : [ {
    "metadata" : {
      "method" : "loci.common.image.SimpleImageScaler",
      "version" : "Bio-Formats 0.6.10"
    },
    "axes" : [ {
      "name" : "t",
      "type" : "time"
    }, {
      "name" : "c",
      "type" : "channel"
    }, {
      "name" : "z",
      "type" : "space"
    }, {
      "name" : "y",
      "type" : "space"
    }, {
      "name" : "x",
      "type" : "space"
    } ],
    "name" : "11-33-57_PMT - PMT [FD6_FDRED] [01]_Time Time0028.tif",
    "datasets" : [ {
      "path" : "0",
      "coordinateTransformations" : [ {
        "scale" : [ 1.0, 1.0, 1.0, 1.0, 1.0 ],
        "type" : "scale"
      } ]
    }, {
      "path" : "1",
      "coordinateTransformations" : [ {
        "scale" : [ 1.0, 1.0, 1.0, 2.0, 2.0 ],
        "type" : "scale"
      } ]
    } ],
    "version" : "0.4"
  } ],
  "omero" : {
    "channels" : [ {
      "color" : "808080",
      "coefficient" : 1,
      "active" : true,
      "label" : "Channel 0",
      "window" : {
        "min" : 376.0,
        "max" : 15788.0,
        "start" : 376.0,
        "end" : 15788.0
      },
      "family" : "linear",
      "inverted" : false
    } ],
    "rdefs" : {
      "defaultT" : 0,
      "model" : "greyscale",
      "defaultZ" : 0
    }
  }
}

WAT!?

@sbesson
Member

sbesson commented Jun 29, 2023

@will-moore I think I found the source of the issue. Can you try one more test with your last configuration, running sudo /opt/bioformats2raw/bioformats2raw-0.6.0-24/bin/bioformats2raw -p /uod/idr/metadata/idr0026-weigelin-immunotherapy/patterns/3.49.6-3.140922_11-33-57.01.pattern 3.49.6-3.140922_11-33-57.01.ome.zarr (note the sudo at the beginning of the command)?

@will-moore
Member Author

will-moore commented Jun 30, 2023

The above removal of contents of the bucket ran very slowly and has only resulted in the removal of a handful of zarr filesets out of the 111 originally there.

Since we want to delete ALL the filesets uploaded, probably quicker to delete the bucket and recreate..

ran

./mc rb --force uk1s3/idr0026

This seemed to hang/time out and doesn't seem to have had any effect:

$ ./mc ls uk1s3/idr0026/zarr | wc
     95     475    6719

Reverted to running the rm again in a screen

./mc rm --force --recursive uk1s3/idr0026/zarr

@will-moore
Member Author

@sbesson - Seems that the memo issue is something it would be good to fix (or at least warn about) to prevent others suffering the pain above! I can create an issue somewhere, but where?

@sbesson
Member

sbesson commented Jun 30, 2023

From my side, the immediate candidates are:

Possibly the outstanding action would be to retest a similar scenario using bioformats2raw 0.7.0, a multi-channel pattern dataset and identify whether it's IDR specific. /cc @melissalinkert

@will-moore
Member Author

Ah - apologies @sbesson: I just realised you meant that there is probably just 1 issue (not 4) but it needs testing to determine where the issue lies!

@sbesson
Member

sbesson commented Jun 30, 2023

Retested with a simpler version of the pattern file with 2 timepoints compatible with upstream Bio-Formats

cat 3.49.6-3.140922_11-33-57.00.pattern 
/uod/idr/filesets/idr0026-weigelin-immunotherapy/20170222-symlinks/PNAS_2015/treatment start day 3/mouse 49/day 6-3/time lapse/140922_11-33-57/11-33-57_PMT - PMT [<BD2_GREEN,BD8_DEEPR,BD7_RED,FD6_FDRED>] [00]_Time Time<0000-0001>.tif

Placed a copy of this pattern file under patterns owned by a different user and executed the two following commands:

/opt/bioformats2raw/bioformats2raw-0.6.1/bin/bioformats2raw 3.49.6-3.140922_11-33-57.00.pattern 3.49.6-3.140922_11-33-57.00.zarr
/opt/bioformats2raw/bioformats2raw-0.6.1/bin/bioformats2raw patterns/3.49.6-3.140922_11-33-57.00.pattern 3.49.6-3.140922_11-33-57.00_2.zarr

The .zattrs are identical between both conversions and contain omero metadata for the four channels specified in the pattern file:

(base) [sbesson@pilot-zarr2-dev tmp]$ diff 3.49.6-3.140922_11-33-57.00.zarr/0/.zattrs 3.49.6-3.140922_11-33-57.00_2.zarr/0/.zattrs 
(base) [sbesson@pilot-zarr2-dev tmp]$ tail -n 50 3.49.6-3.140922_11-33-57.00.zarr/0/.zattrs
      },
      "family" : "linear",
      "inverted" : false
    }, {
      "color" : "00FF00",
      "coefficient" : 1,
      "active" : true,
      "label" : "BD8_DEEPR",
      "window" : {
        "min" : 401.0,
        "max" : 16188.0,
        "start" : 401.0,
        "end" : 16188.0
      },
      "family" : "linear",
      "inverted" : false
    }, {
      "color" : "0000FF",
      "coefficient" : 1,
      "active" : true,
      "label" : "BD7_RED",
      "window" : {
        "min" : 801.0,
        "max" : 8055.0,
        "start" : 801.0,
        "end" : 8055.0
      },
      "family" : "linear",
      "inverted" : false
    }, {
      "color" : "FF0000",
      "coefficient" : 1,
      "active" : false,
      "label" : "FD6_FDRED",
      "window" : {
        "min" : 250.0,
        "max" : 10867.0,
        "start" : 250.0,
        "end" : 10867.0
      },
      "family" : "linear",
      "inverted" : false
    } ],
    "rdefs" : {
      "defaultT" : 0,
      "model" : "color",
      "defaultZ" : 13
    }
  }
}

Based on the above, I am leaning towards options 1 and 2, i.e. it's an IDR/bioformats-specific issue, which will probably be classified as wontfix, as one of the aims of the ongoing conversion work is to get rid of this fork entirely.
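The expectation tested above is that the `omero` channel labels in `.zattrs` match the channel block in the pattern file. A minimal sketch of that comparison follows. The helper names are hypothetical, and the parsing handles only the simple comma-separated `<...>` block seen in this pattern file, not the full Bio-Formats file pattern syntax.

```python
import re


def pattern_channel_labels(pattern):
    """Return the entries of the first comma-separated <...> block.

    Range blocks such as <0000-0001> contain no comma and are skipped.
    """
    for block in re.findall(r"<([^<>]+)>", pattern):
        if "," in block:
            return block.split(",")
    return []


def omero_labels(zattrs):
    """Channel labels from the omero block of a loaded .zattrs dict."""
    return [channel["label"] for channel in zattrs["omero"]["channels"]]


def labels_match(pattern, zattrs):
    """True when the pattern's channel list equals the omero labels."""
    return pattern_channel_labels(pattern) == omero_labels(zattrs)


# File-name part of the pattern shown above.
pattern = ("11-33-57_PMT - PMT [<BD2_GREEN,BD8_DEEPR,BD7_RED,FD6_FDRED>] "
           "[00]_Time Time<0000-0001>.tif")
print(pattern_channel_labels(pattern))
```

A conversion that produced a single "Channel 0" entry, as in the failing runs earlier in this thread, would make `labels_match` return False.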

@will-moore will-moore removed the bug label Jun 30, 2023
@will-moore
Member Author

Started creating zips in a Screen

cd /data/idr0026
for i in */; do zip -r "${i%/}.zip" "$i"; done
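The same archiving step could be sketched in Python with the standard library's `shutil.make_archive`. This is an alternative to the `zip -r` loop above rather than what was run; the function name is hypothetical. Each directory is archived with its own name as the top-level entry, which matches what `zip -r name.zip name/` produces.

```python
import shutil
from pathlib import Path


def zip_zarr_dirs(parent):
    """Create <name>.zip next to each sub-directory of parent.

    Returns the list of archive paths that were written.
    """
    parent = Path(parent)
    archives = []
    # Collect the directories first so new .zip files never affect the loop.
    for zarr_dir in sorted(p for p in parent.iterdir() if p.is_dir()):
        archive = shutil.make_archive(
            base_name=str(parent / zarr_dir.name),  # ".zip" is appended
            format="zip",
            root_dir=str(parent),
            base_dir=zarr_dir.name,  # keep the directory name inside the zip
        )
        archives.append(Path(archive))
    return archives
```

For example, `zip_zarr_dirs("/data/idr0026")` would write one `.ome.zarr.zip` file per converted image into that directory.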

@will-moore
Member Author

will-moore commented Jul 1, 2023

With all the previous Filesets deleted from s3, uploaded just a couple of different new ones to test...

(base) [wmoore@pilot-zarr1-dev ~]$ ./mc cp -r /data/idr0026/3.49.6-3.140922_11-33-57.00.ome.zarr uk1s3/idr0026/zarr/3.49.6-3.140922_11-33-57.00.ome.zarr
...zarr/OME/METADATA.ome.xml: 2.78 GiB / 2.78 GiB ━━━━━━━━━━━━━━━ 61.69 MiB/s 46s
(base) [wmoore@pilot-zarr1-dev ~]$ ./mc cp -r /data/idr0026/7.56.10-3.140926_14-52-18.03.ome.zarr uk1s3/idr0026/zarr/7.56.10-3.140926_14-52-18.03.ome.zarr
...zarr/OME/METADATA.ome.xml: 7.69 GiB / 7.69 GiB ━━━━━━━━━━━━━━━ 73.85 MiB/s 1m46s

Ooops - got an extra directory in there, but the images look good:

https://ome.github.io/ome-ngff-validator/?source=https://uk1s3.embassy.ebi.ac.uk/idr0026/zarr/7.56.10-3.140926_14-52-18.03.ome.zarr/7.56.10-3.140926_14-52-18.03.ome.zarr/0/

@will-moore
Member Author

Uploading zips to BioStudies...

(base) [wmoore@pilot-zarr1-dev bin]$ ./ascp -P33001 -i ../etc/asperaweb_id_dsa.openssh -d /data/idr0026/idr0026 [email protected]:5f/136e8d-xxxxxxxxx

@will-moore will-moore moved this from upload some data to s3 and test to BioStudies Submission in NGFF conversion Jul 1, 2023
@will-moore will-moore assigned will-moore and unassigned sbesson Jul 1, 2023
@will-moore
Member Author

(base) [wmoore@pilot-zarr1-dev data]$ sudo rm -rf idr0026/

@will-moore will-moore assigned francesw and unassigned will-moore Jul 12, 2023
@francesw francesw changed the title idr0026-weigelin-immunotherapy to NGFF idr0026-weigelin-immunotherapy S-BIAD860 Aug 24, 2023
@francesw francesw removed their assignment Aug 24, 2023
@francesw francesw moved this from BioStudies Submission to Data on Embassy s3 in NGFF conversion Aug 24, 2023
@will-moore
Member Author

Currently we have 20 out of 111 Filesets "viewable" at https://uk1s3.embassy.ebi.ac.uk/bia-integrator-data/pages/S-BIAD860.html...

idr0026/3.66.6-3.141020_15-41-29.02.ome.zarr,S-BIAD860/04219d38-3c9a-4ed7-97ba-65e8538b1e73,23273
idr0026/3.67.9-6.141023_12-39-26.04.ome.zarr,S-BIAD860/1506a279-9c9d-4fcc-b5ff-a89bacb80c11,23335
idr0026/3.66.6-3.141020_17-15-27.04.ome.zarr,S-BIAD860/1d535c04-916e-47a7-857f-f731aa1f1951,23280
idr0026/3.65.6-3.141020_15-39-00.02.ome.zarr,S-BIAD860/1e0d94df-af47-432e-917f-48687290f336,23377
idr0026/3.65.6-3.141020_17-15-07.04.ome.zarr,S-BIAD860/2e2d2806-53df-4c35-a9be-25c7ca53699d,23384
idr0026/3.66.9-6.141020_15-41-29.01.ome.zarr,S-BIAD860/2f3e36a6-05d8-4a60-9f4e-d8b87e5d8fdf,23302
idr0026/3.66.9-6.141020_15-41-29.00.ome.zarr,S-BIAD860/3b8e0297-c95c-4460-adf8-75a29bfc132b,23301
idr0026/3.66.9-6.141020_15-41-29.03.ome.zarr,S-BIAD860/487f0bdd-a020-4cff-bfcb-887edd21c9ca,23304
idr0026/3.66.6-3.141020_15-41-29.04.ome.zarr,S-BIAD860/4dab8ca2-3511-43c0-a0e9-9ec1a87aabb6,23275
idr0026/3.66.9-6.141023_15-49-01.00.ome.zarr,S-BIAD860/519ad2f4-0f5a-4ad4-ac6f-5535573f11bf,23311
idr0026/3.66.6-3.141020_15-41-29.03.ome.zarr,S-BIAD860/5a578c22-3dac-456a-ac08-b240c85c7b8a,23274
idr0026/7.51.10-3.140926_10-43-58.00.ome.zarr,S-BIAD860/7b7cc2ee-5dfd-445d-a0b4-4f58448486d0,23415
idr0026/3.65.6-3.141020_15-39-00.04.ome.zarr,S-BIAD860/7ee9776a-95ec-4861-950e-c6f0884ef27b,23379
idr0026/7.48.10-3.140926_12-18-43.00.ome.zarr,S-BIAD860/9640c08d-8cba-4e32-a32d-f593b230fadf,23445
idr0026/7.48.10-3.140926_12-18-43.02.ome.zarr,S-BIAD860/a1b618b9-4e99-4c91-95d5-fbcf45f44109,23447
idr0026/3.65.9-6.141023_15-45-09.03.ome.zarr,S-BIAD860/aa0dece8-179b-4f72-9468-df0ad91a1c20,23408
idr0026/3.66.6-3.141020_15-41-29.01.ome.zarr,S-BIAD860/aef6ffa0-5360-49f2-aa89-1f52b924cc3a,23272
idr0026/3.64.9-6.141023_12-21-30.02.ome.zarr,S-BIAD860/cc6b7eac-c829-463f-aa52-14007014da5b,23397
idr0026/7.51.10-3.140926_10-43-58.03.ome.zarr,S-BIAD860/d6a19971-d7f4-47e0-beb1-77788de12d93,23418
idr0026/7.51.10-3.140926_10-43-58.02.ome.zarr,S-BIAD860/dd0be90a-ff66-410c-86b8-63d3fb6faedb,23417

for r in $(cat idr0026.csv); do
  biapath=$(echo $r | cut -d',' -f2)
  uuid=$(echo $biapath | cut -d'/' -f2)
  fsid=$(echo $r | cut -d',' -f3)
  omero mkngff sql --symlink_repo /data/OMERO/ManagedRepository --secret=$SECRET $fsid "/bia-integrator-data/$biapath/$uuid.zarr" > "$fsid.sql"
done
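To make the `cut` calls in the loop easier to follow, here is a hedged Python sketch of how each CSV row maps to the mounted zarr path and the SQL file name. The column meaning is inferred from the loop above, and the class and function names are hypothetical.

```python
from typing import NamedTuple


class MkngffRow(NamedTuple):
    fileset_id: int
    zarr_path: str
    sql_file: str


def parse_row(row, mount_prefix="/bia-integrator-data"):
    """Split one idr0026.csv row as the shell loop does.

    Columns: IDR zarr name, BioImage Archive path (study/uuid), fileset ID.
    """
    _idr_name, bia_path, fileset_id = row.strip().split(",")
    uuid = bia_path.split("/")[1]
    return MkngffRow(
        fileset_id=int(fileset_id),
        zarr_path=f"{mount_prefix}/{bia_path}/{uuid}.zarr",
        sql_file=f"{fileset_id}.sql",
    )


row = ("idr0026/7.51.10-3.140926_10-43-58.03.ome.zarr,"
       "S-BIAD860/d6a19971-d7f4-47e0-beb1-77788de12d93,23418")
print(parse_row(row))
```

For this row the resulting `zarr_path` is the symlink target that appears in the mkngff output for fileset 23418.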

...Found prefix demo_2/2017-04/13 // 07-17-06.573 for fileset 23418
Checking for prefix_dir /data/OMERO/ManagedRepository/demo_2/2017-04/13/07-17-06.573
Creating dir at /data/OMERO/ManagedRepository/demo_2/2017-04/13/07-17-06.573_mkngff
Creating symlink /data/OMERO/ManagedRepository/demo_2/2017-04/13/07-17-06.573_mkngff/d6a19971-d7f4-47e0-beb1-77788de12d93.zarr -> /bia-integrator-data/S-BIAD860/d6a19971-d7f4-47e0-beb1-77788de12d93/d6a19971-d7f4-47e0-beb1-77788de12d93.zarr
Using session for demo@localhost:4064. Idle timeout: 10 min. Current group: Public
Found prefix demo_2/2017-04/13 // 07-06-10.670 for fileset 23417
Checking for prefix_dir /data/OMERO/ManagedRepository/demo_2/2017-04/13/07-06-10.670
Creating dir at /data/OMERO/ManagedRepository/demo_2/2017-04/13/07-06-10.670_mkngff
Creating symlink /data/OMERO/ManagedRepository/demo_2/2017-04/13/07-06-10.670_mkngff/dd0be90a-ff66-410c-86b8-63d3fb6faedb.zarr -> /bia-integrator-data/S-BIAD860/dd0be90a-ff66-410c-86b8-63d3fb6faedb/dd0be90a-ff66-410c-86b8-63d3fb6faedb.zarr

for r in $(cat idr0026.csv); do
  fsid=$(echo $r | cut -d',' -f3)
  psql -U omero -d idr -h $DBHOST -f "$fsid.sql"
done

...
BEGIN
 mkngff_fileset 
----------------
        5287479
(1 row)
COMMIT
BEGIN
 mkngff_fileset 
----------------
        5287480
(1 row)
COMMIT

@will-moore
Member Author

All good (missing thumbnails in the screenshot are for images not included in the 20 updated by mkngff above):

Screenshot 2023-08-29 at 19 23 50

@will-moore will-moore moved this from Data on Embassy s3 to create new Filesets in idr-next in NGFF conversion Sep 4, 2023
@will-moore
Member Author

will-moore commented Sep 12, 2023

Testing on idr-testing:omeroreadwrite...

Updated to today's OMEZarrReader.jar (only on omeroreadwrite server - not proxies).

Use all 111 Images in idr0026.csv - see IDR/idr-utils@003b3a3

Started mkngff at 10:37...

@will-moore
Member Author

mkngff just done (nearly 12:00).
apply sql and view image on just readwrite server with ssh -A idr-testing.openmicroscopy.org -L 1080:omeroreadwrite:80
E.g. http://localhost:1080/webclient/?show=image-3261651

$ grep -A 2 "saved memo" /opt/omero/server/OMERO.server/var/log/Blitz-0.log | grep -A 2 "13-14-13.681_mkngff"
2023-09-12 10:59:40,189 DEBUG [                   loci.formats.Memoizer] (l.Server-2) saved memo file: /data/OMERO/BioFormatsCache/data/OMERO/ManagedRepository/demo_2/2017-04/12/13-14-13.681_mkngff/3e8c077e-5612-4ae1-a385-cfb5fb507822.zarr/OME/.METADATA.ome.xml.bfmemo (39334 bytes)
2023-09-12 10:59:40,189 DEBUG [                   loci.formats.Memoizer] (l.Server-2) start[1694516354319] time[25869] tag[loci.formats.Memoizer.setId]
2023-09-12 10:59:40,189 INFO  [                ome.io.nio.PixelsService] (l.Server-2) Creating BfPixelBuffer: /data/OMERO/ManagedRepository/demo_2/2017-04/12/13-14-13.681_mkngff/3e8c077e-5612-4ae1-a385-cfb5fb507822.zarr/OME/METADATA.ome.xml Series: 0

25869ms is 26 secs for setId
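Reading the timing off the log by hand works for one image. For many filesets, a small parser along these lines might help. It is a sketch that assumes the `time[...] tag[...]` layout seen in the log line above; the function name is hypothetical.

```python
import re

# Matches the "time[<ms>] tag[<name>]" part of the DEBUG timing line above.
TIMING = re.compile(r"time\[(\d+)\] tag\[([^\]]+)\]")


def parse_timing(line):
    """Return (tag, seconds) from a timing log line, or None if absent."""
    match = TIMING.search(line)
    if match is None:
        return None
    return match.group(2), int(match.group(1)) / 1000.0


line = ("2023-09-12 10:59:40,189 DEBUG [                   loci.formats.Memoizer] "
        "(l.Server-2) start[1694516354319] time[25869] tag[loci.formats.Memoizer.setId]")
print(parse_timing(line))
```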

@will-moore will-moore moved this from check_pixels to pixels validated in NGFF conversion Nov 28, 2023
@imagesc-bot

This issue has been mentioned on Image.sc Forum. There might be relevant details there:

https://forum.image.sc/t/file-format-to-store-images-using-ngff-coverter/98320/10
