OME-Zarr parsing for sparsely recorded fields #8


Closed
gusqgm opened this issue Apr 22, 2022 · 20 comments
Assignees
Labels
enhancement New feature or request

Comments


gusqgm commented Apr 22, 2022

Can .zarr containers store only the data for the fields that actually contain data within a well? This would require the Yokogawa experiment parser to extract (z, y, x) stage positions and assign the corresponding values to the corresponding fields.

As a first step, we need to parse the x, y, width, and height from the metadata file of Yokogawa experiments; one example is found in the yokogawa_image_collection_task from the Liberali Workflows repository. All metadata is saved in the so-called mlf_frame file.

In the simplest case, the sparse fields still fall onto a grid pattern, so the parsing should be able to extrapolate the size of the entire grid and place the recorded fields in the correct grid positions.

Currently, Drogon creates an overview with 0's where fields were not recorded, filling up the gaps. The saved overviews also include a mask overview, which is used to retain only the imaged fields for processing and to avoid overfilling memory.

One interesting question here is whether OME-Zarr can actually deal with sparse field grids, meaning that the empty fields would remain empty rather than being assigned same-sized stacks filled with 0's.

@gusqgm gusqgm added enhancement New feature or request question labels Apr 22, 2022
@gusqgm gusqgm changed the title OME parsing for sparsely recorded field grids OME-ZARR parsing for sparsely recorded field grids Apr 22, 2022
@gusqgm gusqgm changed the title OME-ZARR parsing for sparsely recorded field grids OME-Zarr parsing for sparsely recorded field grids Apr 22, 2022
@gusqgm gusqgm removed the question label Apr 22, 2022

gusqgm commented Apr 22, 2022

In the case of field grids, if fields are sparse, the area in between should consist of regions with no information, so that we can still provide a correct overview estimation.
This means that we need to:

  1. Parse the (z, y, x) field position from the stage position in the microscope metadata and use it to create the sparse grid.
  2. Create empty regions as spacers for the data (does .zarr allow for sparse image writing/reading?).
  3. Generate the grid, naming the fields as a matrix and storing the field grid position as metadata in the .zarr -> only recorded images are saved on disk, and no extra data is created for the spacers.
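The grid-placement step (1) above could be sketched roughly as follows. `stage_positions_to_grid` is a hypothetical helper, and the regular field pitch (spacing) is assumed to be known from the metadata:

```python
import numpy as np

def stage_positions_to_grid(xs, ys, pitch_x, pitch_y):
    """Map stage positions of the recorded fields onto (row, col) indices
    of a regular grid. Cells with no recorded field simply get no entry."""
    xs = np.asarray(xs, dtype=float)
    ys = np.asarray(ys, dtype=float)
    # Round to the nearest grid cell, anchored at the minimum position
    cols = np.rint((xs - xs.min()) / pitch_x).astype(int)
    rows = np.rint((ys - ys.min()) / pitch_y).astype(int)
    # Extrapolate the full grid extent, including never-imaged cells
    shape = (int(rows.max()) + 1, int(cols.max()) + 1)
    return [(int(r), int(c)) for r, c in zip(rows, cols)], shape
```

For a 2x3 well where one field was skipped, the returned shape still covers the full grid while only the recorded fields get a grid index.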

@gusqgm gusqgm changed the title OME-Zarr parsing for sparsely recorded field grids OME-Zarr parsing for sparsely recorded fields Apr 25, 2022

gusqgm commented Apr 25, 2022

For gridless field acquisitions, the microscope tries to center the fields around the samples, and we currently also keep a mask so that we can separate the imaged fields from non-imaged areas, as seen in this example:

220314EMS005_220317_161558_E02_T0001F001L01A02Z01C02_COMBINED

We can possibly just adapt the current Liberali Workflow code that parses the stage positions for the creation of overviews.


jluethi commented May 9, 2022

High level description from @gusqgm (moved into the issue for clarity):

Most experiments from the Liberali Lab deal with sparse data, i.e. samples are scattered throughout space.

If this is on a grid, during image acquisition in the microscope, fields that do not contain samples matching certain criteria are not imaged, so that in the end we have a sparsely acquired field grid.
For grid-less sparse acquisitions, fields are acquired centered on particular samples without following a grid pattern.
We need to be able to parse this data properly so all imaged fields are still assigned to their correct grid position during the parsing and the saving into the zarr container.


jluethi commented May 18, 2022

@gusqgm Do we have a test set of this kind of data? Is this what's on the sftp share at the moment?
If so, @mfranzon is it already on the Fractal share?
Let's document here where the test data is, as soon as it's on the share :)


gusqgm commented May 19, 2022

@jluethi, the dataset called 20220316_sec_FOCM_test-R1_E2 within gridless_Yokogawa_recording_FMI has been uploaded, and @mfranzon has been copying it to the UZH server.
It contains well E02, which is represented in the image above. The raw data is in the TIF folder, already in compressed form.


gusqgm commented May 20, 2022

Regarding the microscope acquisition:

the image above CAN have overlapping FOVs...
Which brings me to the following thought: we could have an identifier for each FOV, record the positional information of each one in the metadata, and on top add a flag saying which other FOV ids might overlap with this one. Would this be a viable option?
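A minimal sketch of this overlap-flag idea, using axis-aligned bounding boxes; `overlapping_fov_ids` and the (x, y, width, height) layout are assumptions for illustration, not an existing API:

```python
def overlapping_fov_ids(fovs):
    """fovs maps a FOV id to its (x, y, width, height) in stage units.
    Returns, per id, the ids of all FOVs whose bounding boxes overlap it;
    this list could be stored as a per-FOV flag in the metadata."""
    overlaps = {fid: [] for fid in fovs}
    ids = list(fovs)
    for i, a in enumerate(ids):
        ax, ay, aw, ah = fovs[a]
        for b in ids[i + 1:]:
            bx, by, bw, bh = fovs[b]
            # Two boxes overlap iff their intervals overlap on both axes
            if ax < bx + bw and bx < ax + aw and ay < by + bh and by < ay + ah:
                overlaps[a].append(b)
                overlaps[b].append(a)
    return overlaps
```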


gusqgm commented Jun 13, 2022

Hi all, here is an example of a Search First experiment where the microscope acquires a sparse regular grid:

A06-overview_MIP_labeled

As you can see, there are 15 FOVs where 4 have not been recorded.

Important: To make this image I have simply opened plane number 10 of each of the recorded fields and assembled them according to the Overview created by our luigi overview workflow. Now here comes the catch: I have added on top of the image the corresponding part of the file name associated with the FOV number, and, as you can see, we have an issue with the "...F006...", "...F007...", "...F008..." images, as they should have been shifted by 1 in order to fit the expected zig-zag naming pattern.

This is an important finding, as it means that we cannot simply rely on the file naming when parsing the microscope data, at least for sparse grid imaging experiments. Rather, we need to use the microscope metadata (so this directly connects with #25).

I am currently checking the code of the image collection task and the overview creation task, and will write more details below.

Anyway, for now the data has been added to the sftp server; details are being sent via email to you, @mfranzon.


gusqgm commented Jun 13, 2022

Here some brief information about the test data shown above:

There are 3 subfolders:

220304_172545_220304_172605
220304_172545_220304_172605_segmentation
220304_172545_220304_175557

The first two correspond to the usual output coming from the search-first part of the acquisition, and should not be considered initially. Distinguishing them should be easy, as they share the same time fingerprint.
The last folder contains all of the imaging data and the associated metadata in one place.

Please note that all folders reflect a dataset where the barcoding reader failed and a date-name was added in it, following issue fractal-analytics-platform/fractal-client#48 .

I have adapted the .mlf metadata to represent the single-well dataset; this runs within Drogon, so it should currently be parsed correctly.

Please note that all of the other metadata are kept unchanged for the time being, and still point to the full plate information.


tcompa commented Jun 14, 2022

Thanks @gusqgm for these details. Quick question:

I am currently checking the code of the image collection task and the overview creation task, and will write more details below.

Are we supposed to have access to those repositories?


jluethi commented Jun 14, 2022

If you don't have access yet, I can go through this code today or tomorrow and put together a minimal parsing example, such that the above dataset is parsed correctly. I'd then hand over the example so we can wrap it into a Fractal function :)

But also, let's make sure you are added to the repositories so you can have access if needed in the future :)


tcompa commented Jun 14, 2022

If you don't have access yet, I can go through this code today or tomorrow and put together a minimal parsing example, such that the above dataset is parsed correctly. I'd then hand over the example so we can wrap it into a Fractal function :)

My to-do list is not so thin at the moment, so I'd say take your time.

But also, let's make sure you are added to the repositories so you can have access if needed in the future :)

Sure, thanks.


gusqgm commented Jun 14, 2022

Update from my side:
@tcompa you have been invited to the repository, my bad.

Regarding the implementation of correct metadata parsing and field assignments: @jluethi and I agreed to go through the current code in the aforementioned repository so that we can filter out the most important parts and perform minimal tests as a consistency check. Once this is done, we will pass them on to you, as @jluethi mentioned.


jluethi commented Jun 17, 2022

Update on the parsing of metadata (from the sparse example above as well as other Yokogawa metadata) in this issue: https://github.com/fractal-analytics-platform/mwe_fractal/issues/46


jluethi commented Aug 29, 2022

Quick note: when we save sparse arrays, let's make sure we use the write_empty_chunks=False option and a fill value of 0 (see here: https://zarr.readthedocs.io/en/stable/tutorial.html#empty-chunks)
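For illustration: with write_empty_chunks=False, chunks that equal the fill value everywhere are never materialized on disk. A numpy-only sketch of that chunk-skipping decision (this mimics the idea, it is not zarr's actual implementation):

```python
import numpy as np

def chunks_to_write(canvas, chunk_shape, fill_value=0):
    """Return the (cy, cx) indices of chunks that differ from fill_value
    somewhere and therefore need to be stored; all-fill spacer chunks are
    skipped, which is what write_empty_chunks=False achieves in zarr."""
    n_cy = -(-canvas.shape[0] // chunk_shape[0])  # ceiling division
    n_cx = -(-canvas.shape[1] // chunk_shape[1])
    written = []
    for cy in range(n_cy):
        for cx in range(n_cx):
            block = canvas[cy * chunk_shape[0]:(cy + 1) * chunk_shape[0],
                           cx * chunk_shape[1]:(cx + 1) * chunk_shape[1]]
            if np.any(block != fill_value):
                written.append((cy, cx))
    return written
```

For a sparsely recorded well, only chunks touched by imaged FOVs would end up in this list.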

@jluethi jluethi transferred this issue from fractal-analytics-platform/fractal-client Sep 2, 2022

jluethi commented Sep 13, 2022

To facilitate tackling this issue:

  1. We need to fix "Remove rows & cols parameter from yokogawa to Zarr" (#13) to use the metadata instead of row & col parameters (let's not throw away the row & col logic: it's useful to have, but it should not be the default and should not be used when metadata is available).
  2. Let's extend this to make sure it can also handle a search-first (grid-based) dataset: see "Search-first 1)" in "Overview Test Datasets" (fractal-client#213).
  3. Let's test it on a larger search-first dataset: see "Search-first 2)".

Let's discuss 1) in #13

For 2):
Careful, this dataset has different channels. Until we have fixed #5, let's make sure we specify them manually.
When the processing works correctly, it should look like this (see also above from @gusqgm):
A06-overview_MIP_labeled

Still like a grid, but not all parts have content. We will need to ensure that we initially build an empty array of the correct size, to be able to fill in all the positions. And we can't make assumptions based on the count of FOVs or just look at the first and last (otherwise, we're not tackling part 3)). => We will need to look at the metadata table and find the top-left and bottom-right corners based on all the x & y positions and the image sizes.
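That corner-finding step could be sketched as below, assuming per-FOV x/y stage positions and image sizes in the same physical units (`canvas_extent` is a hypothetical helper name):

```python
import numpy as np

def canvas_extent(x_pos, y_pos, widths, heights):
    """Top-left and bottom-right corners of the full well canvas, computed
    from all FOV positions and sizes rather than from the first/last FOV.
    Negative stage coordinates (e.g. X=-1448.3) are handled naturally."""
    x_pos = np.asarray(x_pos, dtype=float)
    y_pos = np.asarray(y_pos, dtype=float)
    widths = np.asarray(widths, dtype=float)
    heights = np.asarray(heights, dtype=float)
    top_left = (float(x_pos.min()), float(y_pos.min()))
    bottom_right = (float((x_pos + widths).max()),
                    float((y_pos + heights).max()))
    return top_left, bottom_right
```

The canvas size then follows as bottom_right minus top_left, independent of how many FOVs were skipped in between.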


tcompa commented Sep 15, 2022

We will need to look at the metadata table and find the top-left and bottom-right corners based on all the x & y positions and the image sizes.

At the moment, this happens based on the FOV-ROI indices rather than on the x/y physical positions. In the parsing task there is a block like

    adata = read_zarr(f"{zarrurl}/tables/FOV_ROI_table")
    fov_rois = convert_ROI_table_to_indices(adata, full_res_pxl_sizes_zyx=pxl_size)

    max_x = max(roi[5] for roi in fov_rois)
    max_y = max(roi[3] for roi in fov_rois)
    max_z = max(roi[1] for roi in fov_rois)
    [...]
    canvas = da.zeros(
        (max_z, max_y, max_x),
        dtype=sample.dtype,
        chunks=(1, chunk_size_y, chunk_size_x),
    )

where 5, 3, 1 are the indices corresponding to end_x, end_y, end_z, as in roi = [start_z, end_z, start_y, end_y, start_x, end_x]. Note that the max_z part is a bit redundant, as all FOVs should have the same number of Z planes.

@jluethi, do you notice anything unexpected in the way we are doing it?


jluethi commented Sep 15, 2022

I'd be a bit cautious with using an index-based selection of the columns here. It assumes that the ordering of the AnnData ROI tables will always be the same. While our system should be consistent here, we may want to support other OME-Zarr files that contain tables with the same information, and the order of columns shouldn't be part of that spec.
=> Can we switch to selecting columns based on their names in var? Also, can we implement it such that we have default column names that can be overridden via an optional input? (Because the spec for how those columns are named may change in the future.)
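A rough sketch of name-based column selection with overridable defaults. The plain array/`var_names` stand-in and the default column names below are assumptions for illustration, not the actual Fractal table spec:

```python
import numpy as np

# Hypothetical default names for the ROI columns; the real table spec may
# use different names, which is exactly why they can be overridden.
DEFAULT_ROI_COLUMNS = ("x_micrometer", "y_micrometer", "z_micrometer")

def select_roi_columns(table, var_names, columns=DEFAULT_ROI_COLUMNS):
    """Select ROI columns by name instead of by position, so the result
    does not depend on the column order of the underlying table."""
    name_to_idx = {name: i for i, name in enumerate(var_names)}
    idx = [name_to_idx[c] for c in columns]
    return np.asarray(table)[:, idx]
```

With an AnnData table, `table` and `var_names` would come from `adata.X` and `adata.var_names`, and a reordered table yields the same result.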

It's good to have max_z, as it may vary between wells; see e.g. the test case /data/active/fractal/3D/PelkmansLab/CardiacMultiplexing/Cycle1_5x5_10wells (fractal-analytics-platform/fractal-client#213). Mostly I expect FOVs within a well to be consistent, but I'm not even sure that will always be the case, and it is not required by the specification.

Given that it handles the 2x2 case, it seems able to handle negative coordinates; the 2x2 case has X positions like X="-1448.3", so that's good :)
The general canvas definition and max-position finding look good to me.

tcompa added a commit that referenced this issue Sep 19, 2022
…g get_ROIs_bounding_box (ref #8) to lib_regions_of_interest

tcompa commented Sep 19, 2022

As of 5c1410e, the new function get_ROIs_bounding_box by @mfranzon and me is meant to take care of this last comment.

Missing features:

  1. The column names can be overwritten, but at the moment this is not exposed to the user.
  2. For now, we are not shifting ROI positions (to make them start from zero), because our AnnData tables are already shifted (that happens in prepare_FOV_ROI_table). We probably need to be more general if we want to read zarr files produced outside Fractal, but I'd say we can postpone this feature.
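The origin shift that point 2 postpones could look roughly like this; `shift_rois_to_origin` is a hypothetical helper, assuming an (n_rois, 3) array of (z, y, x) start positions:

```python
import numpy as np

def shift_rois_to_origin(starts):
    """Shift ROI start positions so each axis starts at 0, for tables
    that were produced outside Fractal and are not pre-shifted."""
    starts = np.asarray(starts, dtype=float)
    return starts - starts.min(axis=0)
```

Tables that are already shifted (minimum 0 on each axis) pass through unchanged.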


jluethi commented Sep 19, 2022

Sounds good. I made an issue describing when we may need to work on the 0, 0, 0 assumption :)
#82


jluethi commented Sep 27, 2022

The /data/active/fractal/Liberali/1_well_15_fields_20_planes_SF_w_errors/D10_R1/220304_172545_220304_175557 search first dataset is processed successfully with the current Fractal version:
Bildschirmfoto 2022-09-27 um 21 40 57

Bildschirmfoto 2022-09-27 um 21 40 40

The other 2 search first test cases appear to have some issues in the metadata parsing (#109 & #110), but that's most likely unrelated to the search first part of it.

I'll need to briefly check whether the segmentation and measurements have worked correctly on this dataset. After that, we can close this issue.
