Fix zarr parsing for multi-well setup: correct plate .zattrs #34

Closed
jluethi opened this issue May 9, 2022 · 6 comments
Labels: bug (Something isn't working)

Comments

@jluethi
Collaborator

jluethi commented May 9, 2022

Fixes needed for the .zattrs on the plate level:
First, wells are currently saved like this:

"wells": [
            {
                "path": "B/03"
            },
            {
                "path": "B/05"
            },
            {
                "path": "B/09"
            },

According to the spec, they should be saved like this:

"wells": [
        {
            "path": "A/1",
            "rowIndex": 0,
            "columnIndex": 0
        },
        {
            "path": "A/2",
            "rowIndex": 0,
            "columnIndex": 1
        },

Looking at this, it seems they use 0-based indexing for both rows and columns. Thus, we will need to parse A => 0, B => 1 and so on for the rows, and 01 => 0, 03 => 2 and so on for the columns.
It should look like this in the end:

"wells": [
            {
                "path": "B/03",
                "rowIndex": 1,
                "columnIndex": 2
            },
            {
                "path": "B/05",
                "rowIndex": 1,
                "columnIndex": 4
            },
            {
                "path": "B/09",
                "rowIndex": 1,
                "columnIndex": 8
            },
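
The mapping described above can be sketched in Python as follows. This is only an illustration: `well_entry` is a hypothetical helper name, and it assumes single-letter row names and numeric column names. The resulting indices only point at the right entries if the rows list starts at "A" and the columns list starts at "01" without gaps.

```python
import string


def well_entry(path: str) -> dict:
    """Build one well entry from a path such as "B/03" (hypothetical helper)."""
    row_name, column_name = path.split("/")
    return {
        "path": path,
        # A => 0, B => 1, ... (assumes a single uppercase letter)
        "rowIndex": string.ascii_uppercase.index(row_name),
        # "01" => 0, "03" => 2, ... (assumes a numeric column name)
        "columnIndex": int(column_name) - 1,
    }


wells = [well_entry(path) for path in ["B/03", "B/05", "B/09"]]
```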

Second, similar to this bugfix for the rows (96b50aa), we'll need the equivalent for columns. When I parse a 23-well experiment, I get the following list of columns in the plate .zattrs file, in which column names show up multiple times:

"columns": [
            {
                "name": "03"
            },
            {
                "name": "05"
            },
            {
                "name": "09"
            },
            {
                "name": "11"
            },
            {
                "name": "04"
            },
            {
                "name": "06"
            },
            {
                "name": "08"
            },
            {
                "name": "10"
            },
            {
                "name": "05"
            },
            {
                "name": "09"
            },
            {
                "name": "11"
            },
            {
                "name": "04"
            },
            {
                "name": "06"
            },
            {
                "name": "08"
            },
            {
                "name": "10"
            },
            {
                "name": "03"
            },
            {
                "name": "05"
            },
            {
                "name": "09"
            },
            {
                "name": "11"
            },
            {
                "name": "06"
            },
            {
                "name": "08"
            },
            {
                "name": "10"
            }
        ],
        "rows": [
            {
                "name": "B"
            },
            {
                "name": "E"
            },
            {
                "name": "F"
            },
            {
                "name": "C"
            },
            {
                "name": "G"
            },
            {
                "name": "D"
            }
        ],

That should get us pretty close to the standard. We can then implement things like version, field_count or times later, if they become relevant.
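
One way to avoid the repeated column names is to deduplicate while collecting them. A minimal sketch, assuming the well paths are available as a list of "row/column" strings (`unique_names` and `well_paths` are illustrative names, not taken from the actual parsing code):

```python
def unique_names(names):
    """Drop repeated names while keeping the first-seen order."""
    return list(dict.fromkeys(names))


# Illustrative input: the same column appears in several rows.
well_paths = ["B/03", "B/05", "E/03", "E/05", "C/04"]

rows = [{"name": n} for n in unique_names(p.split("/")[0] for p in well_paths)]
columns = [{"name": n} for n in unique_names(p.split("/")[1] for p in well_paths)]
```

Note that this keeps the order in which names are first seen, so it does not sort them.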

@jluethi
Collaborator Author

jluethi commented May 9, 2022

We may need to define all the columns up to the max column that exists. The spec says we should, and once we actually use the indices (columnIndex, rowIndex), that will start to matter.
"Each column in the physical plate MUST be defined, even if no wells in the column are defined"
Same for the rows as well: "Each row in the physical plate MUST be defined, even if no wells in the row are defined"

Thus, our columns & rows should probably not be parsed based on what is available, but set either by plate style (e.g. as user input) or derived from the max row/column. Columns for the above should look something like this:

"columns": [
            {
                "name": "01"
            },
            {
                "name": "02"
            },
            {
                "name": "03"
            },
            {
                "name": "04"
            },
            {
                "name": "05"
            },
            {
                "name": "06"
            },
            {
                "name": "07"
            },
            {
                "name": "08"
            },
            {
                "name": "09"
            },
            {
                "name": "10"
            },
            {
                "name": "11"
            }
]

(optionally also with column '12', because it physically exists on the plate, but that shouldn't matter for our use case)
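
Deriving the full lists from the max row/column could look roughly like this. It is a sketch under assumptions: `full_plate_axes` is a hypothetical name, rows are single letters starting at "A", and columns are zero-padded two-digit numbers starting at "01".

```python
import string


def full_plate_axes(well_paths):
    """Return (rows, columns) covering everything up to the max row/column."""
    row_names = [path.split("/")[0] for path in well_paths]
    column_numbers = [int(path.split("/")[1]) for path in well_paths]

    # Last used row letter, e.g. "G" => index 6, so rows A..G are defined.
    last_row = max(string.ascii_uppercase.index(name) for name in row_names)
    rows = [{"name": letter} for letter in string.ascii_uppercase[: last_row + 1]]

    # Columns 01..max, zero-padded to two digits as in the existing paths.
    columns = [{"name": f"{n:02d}"} for n in range(1, max(column_numbers) + 1)]
    return rows, columns


rows, columns = full_plate_axes(["B/03", "G/11", "D/08"])
```

Taking the plate layout as user input instead would replace the two `max(...)` computations with fixed sizes (for example 8 rows and 12 columns).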

@jluethi
Collaborator Author

jluethi commented May 10, 2022

And regarding the wells dictionary, here is the full definition:

wells
A list of JSON objects defining the wells of the plate. Each well object MUST contain a path key identifying the path to the well subgroup. The path MUST consist of a name in the rows list, a file separator (/), and a name from the columns list, in that order. The path MUST NOT contain additional leading or trailing directories. Each well object MUST contain both a rowIndex key identifying the index into the rows list and a columnIndex key identifying the index into the columns list. rowIndex and columnIndex MUST be 0-based. The rowIndex, columnIndex, and path MUST all refer to the same row/column pair.
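
These rules can be checked mechanically on a parsed plate dictionary. The following is a rough sketch, not an official validator: `check_wells` is a hypothetical function, and the example plate is made up for illustration.

```python
def check_wells(plate: dict) -> list:
    """Return a list of problems found in plate["wells"]; empty if none."""
    row_names = [row["name"] for row in plate["rows"]]
    column_names = [column["name"] for column in plate["columns"]]
    problems = []
    for well in plate["wells"]:
        path = well.get("path", "<no path>")
        # Each well MUST contain path, rowIndex and columnIndex.
        missing = [k for k in ("path", "rowIndex", "columnIndex") if k not in well]
        if missing:
            problems.append(f"{path}: missing {', '.join(missing)}")
            continue
        row_index, column_index = well["rowIndex"], well["columnIndex"]
        # Indices are 0-based positions in the rows/columns lists.
        if not (0 <= row_index < len(row_names) and 0 <= column_index < len(column_names)):
            problems.append(f"{path}: index out of range")
            continue
        # path, rowIndex and columnIndex MUST refer to the same row/column pair.
        expected = f"{row_names[row_index]}/{column_names[column_index]}"
        if path != expected:
            problems.append(f"{path}: indices point to {expected}")
    return problems


plate = {
    "rows": [{"name": "A"}, {"name": "B"}],
    "columns": [{"name": "01"}, {"name": "02"}, {"name": "03"}],
    "wells": [
        {"path": "B/03", "rowIndex": 1, "columnIndex": 2},
        {"path": "B/02", "rowIndex": 1, "columnIndex": 2},
        {"path": "A/01"},
    ],
}
problems = check_wells(plate)
```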

@mfranzon
Collaborator

@jluethi two things here. The current implementation is based on v0.3.0: https://ngff.openmicroscopy.org/0.3/
What you suggest is from the latest version, which should be 0.4/0.5-dev if I am not wrong. So we can keep the metadata update in mind for later, whereas the duplication of the columns is a bug, which I will solve very very soon! :)

jluethi added the enhancement and bug labels and removed the bug label on May 10, 2022
@jluethi
Collaborator Author

jluethi commented May 10, 2022

Sounds good. Let's fix the columns part first, then go into metadata afterwards :)

@jluethi
Collaborator Author

jluethi commented May 17, 2022

So column uniqueness should be fixed with the latest commit (I haven't tested this yet). But this metadata needs to meet another criterion: it needs to be in order. In the current parsing, rows & columns are ordered randomly, so wells are displayed at random positions:
[Image: Screenshot 2022-05-17 at 10 22 24]
[Image: Screenshot 2022-05-17 at 10 22 44]

If I manually order the columns correctly, the wells are shown in the correct layout:
[Image: Screenshot 2022-05-17 at 11 08 22]
This is the manually ordered metadata:

        "columns": [
            {
                "name": "03"
            },
            {
                "name": "04"
            },
            {
                "name": "05"
            },
            {
                "name": "06"
            },
            {
                "name": "08"
            },
            {
                "name": "09"
            },
            {
                "name": "10"
            },
            {
                "name": "11"
            }
        ],
        "rows": [
            {
                "name": "B"
            },
            {
                "name": "C"
            },
            {
                "name": "D"
            },
            {
                "name": "E"
            },
            {
                "name": "F"
            },
            {
                "name": "G"
            }
        ],

=> @mfranzon Can we introduce sorting to the rows & columns?
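
A possible shape for the sorting, as a sketch only: `plate_metadata` is a hypothetical function, it assumes zero-padded numeric column names, and it computes each index as the position in the sorted list of rows/columns that actually occur in the data.

```python
def plate_metadata(well_paths):
    """Build sorted rows/columns and wells whose indices point into them."""
    row_names = sorted({path.split("/")[0] for path in well_paths})
    # Sort columns numerically so that e.g. "9" would come before "10".
    column_names = sorted({path.split("/")[1] for path in well_paths}, key=int)

    wells = []
    for path in sorted(well_paths):
        row_name, column_name = path.split("/")
        wells.append(
            {
                "path": path,
                "rowIndex": row_names.index(row_name),
                "columnIndex": column_names.index(column_name),
            }
        )
    return {
        "rows": [{"name": name} for name in row_names],
        "columns": [{"name": name} for name in column_names],
        "wells": wells,
    }


plate = plate_metadata(["E/05", "B/03", "B/11", "C/04"])
```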

Also, maybe some of the loading problems are due to the version of our OME metadata? We get this warning at the start:
WARNING version mismatch: detected:FormatV03, requested:FormatV04
=> Can we upgrade to V04 of OME-NGFF?

@jluethi
Collaborator Author

jluethi commented May 18, 2022

I just tested the parsing on my 23 well example again and it runs through successfully, wells are placed correctly and the row & column metadata looks good! Thank you @mfranzon!
OME-NGFF standard version discussion can continue here: #44

@jluethi jluethi closed this as completed May 18, 2022
Repository owner moved this from In Progress (current sprint) to Done in Fractal Project Management May 18, 2022
jacopo-exact pushed a commit that referenced this issue Nov 16, 2022
First attempt towards pelkmanslab parsl/slurm config - ref #34