
Handle image overlaps, to fix metadata round-off errors #107


Closed
Tracked by #106
tcompa opened this issue Sep 27, 2022 · 15 comments · Fixed by #106
Labels
High Priority Current Priorities & Blocking Issues

Comments

@tcompa
Collaborator

tcompa commented Sep 27, 2022

This is a fresh issue to keep track of work towards fixing #91 (the off-by-one-pixel shape errors due to 0.1 um rounding in metadata files). It should also prepare the ground for addressing #10 and #15 (the more general handling of overlapping images).

For testing purposes, I'm using the metadata file of a single 9x8 well, which has some overlapping FOVs - see #91. The file is in /data/active/fractal/3D/PelkmansLab/CardiacMultiplexing/Cycle1_9x8_singleWell/.

@tcompa
Collaborator Author

tcompa commented Sep 27, 2022

To set the stage, I'm parsing the metadata files and building a simplified dataframe:

import sys
from fractal_tasks_core.metadata_parsing import parse_yokogawa_metadata
import pandas as pd


# Folder containing the Yokogawa metadata files (passed as first argument)
folder = sys.argv[1]
if not folder.endswith("/"):
    folder += "/"
mlf_path = folder + "MeasurementData.mlf"
mrf_path = folder + "MeasurementDetail.mrf"

# Parse the metadata into a per-FOV dataframe
df, total_files = parse_yokogawa_metadata(mrf_path, mlf_path)

# Simplified dataframe with one bounding box per FOV, in micrometers
new_df = pd.DataFrame()
new_df["xmin"] = df["x_micrometer"]
new_df["ymin"] = df["y_micrometer"]
new_df["xmax"] = df["x_micrometer"] + df["pixel_size_x"] * df["x_pixel"]
new_df["ymax"] = df["y_micrometer"] + df["pixel_size_y"] * df["y_pixel"]

new_df.to_csv(folder + "metadata.csv")

(This is not meant to be part of Fractal workflows; it's just to simplify testing.)

@tcompa
Collaborator Author

tcompa commented Sep 27, 2022

Using this script:
handle_overlaps.txt

I identified the FOVs that have some overlap (the ones with thick boundaries):

[figure fig_fovs: grid of FOVs in the well, with the overlapping FOVs drawn with thick boundaries]

The original CSV (see previous comment) is
metadata.csv
and its head reads:

            xmin    ymin    xmax    ymax
field_id                                
1        -1448.3 -1517.7 -1032.3 -1166.7
2        -1032.3 -1517.7  -616.3 -1166.7
3         -616.3 -1517.7  -200.3 -1166.7
4         -200.3 -1517.7   215.7 -1166.7
5          215.6 -1517.7   631.6 -1166.7
6          631.6 -1517.7  1047.6 -1166.7
7         1047.6 -1517.7  1463.6 -1166.7
8         1463.6 -1517.7  1879.6 -1166.7
9        -1448.3 -1166.7 -1032.3  -815.7
10       -1032.3 -1166.7  -616.3  -815.7
11        -616.3 -1166.7  -200.3  -815.7
12        -200.3 -1166.7   215.7  -815.7
13         215.6 -1166.7   631.6  -815.7
14         631.6 -1166.7  1047.6  -815.7
15        1047.6 -1166.7  1463.6  -815.7
16        1463.6 -1166.7  1879.6  -815.7
17       -1448.3  -815.7 -1032.3  -464.7
18       -1032.3  -815.7  -616.3  -464.7
19        -616.3  -815.7  -200.3  -464.7
20        -200.3  -815.7   215.7  -464.7
21         215.6  -815.7   631.6  -464.7
22         631.6  -815.7  1047.6  -464.7
23        1047.6  -815.7  1463.6  -464.7
24        1463.6  -815.7  1879.6  -464.7
25       -1448.3  -464.7 -1032.3  -113.7
26       -1032.3  -464.7  -616.3  -113.7
27        -616.3  -464.7  -200.3  -113.7
28        -200.3  -464.7   215.7  -113.7
29         215.6  -464.7   631.6  -113.7
30         631.6  -464.7  1047.6  -113.7
31        1047.6  -464.7  1463.6  -113.7
32        1463.6  -464.7  1879.6  -113.7
33       -1448.3  -113.7 -1032.3   237.3
34       -1032.3  -113.7  -616.3   237.3
35        -616.3  -113.7  -200.3   237.3
36        -200.3  -113.7   215.7   237.3
37         215.6  -113.7   631.6   237.3
38         631.6  -113.7  1047.6   237.3
39        1047.6  -113.7  1463.6   237.3
40        1463.6  -113.7  1879.6   237.3
41       -1448.3   237.2 -1032.3   588.2
42       -1032.3   237.2  -616.3   588.2
43        -616.3   237.2  -200.3   588.2
44        -200.3   237.2   215.7   588.2
45         215.6   237.2   631.6   588.2
46         631.6   237.2  1047.6   588.2
47        1047.6   237.2  1463.6   588.2
48        1463.6   237.2  1879.6   588.2
49       -1448.3   588.2 -1032.3   939.2
50       -1032.3   588.2  -616.3   939.2

@tcompa
Collaborator Author

tcompa commented Sep 27, 2022

For the record, in this example the "error" always seems to take place when the X or Y coordinate passes from negative to positive.
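
This pattern is visible directly in the xmin values above: the step between consecutive FOVs in a row is 416.0 um everywhere except across the sign crossing, where it is 0.1 um short. A quick check:

```python
# xmin values of the first row of FOVs (from the metadata.csv head above)
xmins = [-1448.3, -1032.3, -616.3, -200.3, 215.6, 631.6, 1047.6, 1463.6]

# Step between consecutive FOVs, rounded to the 0.1 um metadata precision
steps = [round(b - a, 1) for a, b in zip(xmins, xmins[1:])]
print(steps)  # [416.0, 416.0, 416.0, 415.9, 416.0, 416.0, 416.0]
```

The single 415.9 step is exactly the pair straddling x=0 (FOVs 4 and 5).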

@tcompa
Collaborator Author

tcompa commented Sep 27, 2022

It's also useful to look at the (x position of the) first few FOVs:

            xmin    xmax
field_id                
1        -1448.3 -1032.3
2        -1032.3  -616.3
3         -616.3  -200.3
4         -200.3   215.7
5          215.6   631.6
6          631.6  1047.6
7         1047.6  1463.6
8         1463.6  1879.6

Contrary to my expectations, it is not enough to shift a single FOV (e.g. FOV 4 in this list) to get the "correct" positions: shifting one FOV triggers a series of further shifts (as @jluethi had anticipated at some point), to make space for the new FOV 4.

@tcompa
Collaborator Author

tcompa commented Sep 27, 2022

Given the last comment, a minimal working example for removing overlaps is in this Python script:
handle_overlaps.txt

It's still quite rough (and it needs to be translated back into the original dataframe form), but the logic is the one we would expect:

  1. get_overlapping_pair scans all pairs of FOVs and returns the first overlapping pair.
  2. This overlap is fixed by shifting some FOVs to the right (or up, depending on whether the overlap is horizontal or vertical) by the right amount. The FOVs that are shifted are all those satisfying a given condition (e.g. their left boundary is >= a reference value).
  3. After updating all FOVs, we call get_overlapping_pair again and repeat the overlap removal for the next pair of overlapping FOVs.
  4. We stop when all overlaps have been removed. For the case in Handle metadata imprecisions: Single pixel overlaps #91, this takes just two iterations (one along X, one along Y). For more complex cases it could take a few more, but it shouldn't loop forever (because it keeps moving FOVs to the right/top). The total number of iterations is capped, just to be on the safe side.
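
A minimal sketch of those four steps (illustrative names, not the actual handle_overlaps.txt code; FOVs are dicts with xmin/ymin/xmax/ymax, and it assumes, as in the example above, that the second FOV of an overlapping pair lies to the right/top of the first):

```python
TOL = 1e-8  # finite tolerance for float comparisons, in micrometers


def is_overlapping(a, b, tol=TOL):
    # Two boxes overlap iff they overlap along both axes
    return (
        a["xmin"] < b["xmax"] - tol and a["xmax"] > b["xmin"] + tol
        and a["ymin"] < b["ymax"] - tol and a["ymax"] > b["ymin"] + tol
    )


def get_overlapping_pair(fovs):
    # Scan all pairs and return the first overlapping one (or None)
    for i in range(len(fovs)):
        for j in range(i + 1, len(fovs)):
            if is_overlapping(fovs[i], fovs[j]):
                return i, j
    return None


def remove_overlaps(fovs, max_iterations=100):
    # Repeatedly fix the first overlapping pair until none are left
    for _ in range(max_iterations):
        pair = get_overlapping_pair(fovs)
        if pair is None:
            return fovs
        a, b = fovs[pair[0]], fovs[pair[1]]
        dx = a["xmax"] - b["xmin"]  # horizontal overlap
        dy = a["ymax"] - b["ymin"]  # vertical overlap
        if dx < dy:
            # Shift right all FOVs whose left boundary is >= that of b
            ref = b["xmin"]
            for fov in fovs:
                if fov["xmin"] >= ref - TOL:
                    fov["xmin"] += dx
                    fov["xmax"] += dx
        else:
            # Shift up all FOVs whose bottom boundary is >= that of b
            ref = b["ymin"]
            for fov in fovs:
                if fov["ymin"] >= ref - TOL:
                    fov["ymin"] += dy
                    fov["ymax"] += dy
    raise RuntimeError("Iteration cap reached while removing overlaps")
```

For the 0.1 um overlap of #91, dx is 0.1 and only the FOVs at or beyond the sign crossing get shifted, so the procedure converges after a couple of iterations.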

In principle this function scales quadratically with the number of FOVs, since get_overlapping_pair scans all pairs. But the full quadratic cost is only paid when there are few or no overlaps, and then the prefactor is small (close to 1), so we can live with it. In practice, this version of the function removes the overlaps for the 23 wells in less than a second, so I'm not too worried.

@jluethi
Collaborator

jluethi commented Sep 27, 2022

Hmm, kinda looks like there is some weird rounding happening depending on whether a value is positive or negative 🙈

In any case, for the more general implementation for #10 and #15, I'd anyway expect that moving one FOV always forces us to also move the others.

Maybe a good general heuristic would be to loop through them in order of FOV IDs (1, 2, 3, 4, etc.) [as integers, not as strings like FOV001, FOV010, and not in random order]. That way, for most grids, we shouldn't have to move FOVs more than once.
I'm not sure what the best heuristic is for deciding which direction something should be moved in general. In e.g. the grid case or this off-by-1 case, moving everything towards higher values in the coordinate system works if we process FOVs in ascending order. But if FOVs were acquired in a different order, we wouldn't want to e.g. flip 44 & 45 so that 45 ends up before 44.

@tcompa
Collaborator Author

tcompa commented Sep 27, 2022

Addendum to my last comment:
It is useful (and maybe necessary) to work with a finite tolerance when comparing finite-precision numbers. This means setting a small value (e.g. I used tol=1e-8) and transforming comparisons like a<b into a<b-tol. The obvious requirement is that tol is much smaller than both the pixel size and the microscope rounding, which is clearly the case for 1e-8 micrometers.
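
A tiny illustration of the tolerance-gated comparison (using tol=1e-8 as above):

```python
TOL = 1e-8  # micrometers; much smaller than the pixel size and the 0.1 um rounding

a = 215.7
b = 215.7000000001  # equal to a up to float noise

print(a < b)        # True: the naive comparison sees a spurious difference
print(a < b - TOL)  # False: with the tolerance, the two are treated as equal
```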

@jluethi
Collaborator

jluethi commented Sep 27, 2022

haha, overlap here in comment time.

(because it keeps moving FOVs to the right/top)

Sounds reasonable. Which one do you move? And looking at this grid, it starts bottom left again, while I think our FOV ordering starts top left (FOV 1 is where FOV 65 would be). Let's make sure we don't flip the Y axis again :)

@jluethi
Collaborator

jluethi commented Sep 27, 2022

Regarding tolerance: as long as round(position) gets the correct value (i.e. the starting integer should be 2160, not 2159, for the second image), that should be enough precision, right? We can of course handle finer differences, but given that we round to integers in the end anyway, that is what will matter most.
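
A quick check of that reasoning (a hypothetical illustration; 0.1625 um is an assumed pixel size, consistent with a 2160-pixel FOV spanning the 351.0 um height seen in the dataframe above):

```python
pixel_size = 0.1625  # um/px (assumption: 2160 px * 0.1625 um = 351.0 um)

good = 351.0  # corrected y offset of the second image, in um
bad = 350.9   # metadata value, 0.1 um too close to zero

print(round(good / pixel_size))  # 2160: the correct starting pixel index
print(round(bad / pixel_size))   # 2159: off by one pixel
```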

@tcompa
Collaborator Author

tcompa commented Sep 27, 2022

And looking at this grid, it starts bottom left again, while I think our FOV ordering starts top left (FOV 1 is where FOV 65 would be). Let's make sure we don't flip the Y axis again :)

I noticed it, but I'm not sure it's relevant at the moment. The dataframe is shown in #107 (comment), and it has the correct signs (e.g. -1517 for the first FOV); only that information should matter. After I clean this up and make it fit into a task, we can check that the napari visualization is still correct.

Sounds reasonable. Which one do you move?

Given an overlap between FOVs 4 and 5 (I'm now using the labels shown in the figure above), I shift FOVs 5, 6, 7, 8, 13, 14, 15, 16, 21, 22, 23, 24, ... to the right (namely, all those whose x position is >= that of FOV 5). This is the first iteration. The second iteration does the same, but vertically, e.g. to remove the overlap between FOVs 33 and 41.

For the current example, removing two overlaps is sufficient to remove all of them. In a more complex search-first dataset, we may have to go through a larger number of overlaps.
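
That selection can be sketched on the simplified dataframe (a hypothetical illustration using the first row of FOVs from the metadata.csv head above):

```python
import pandas as pd

# First row of FOVs, with the 0.1 um overlap between FOVs 4 and 5
df = pd.DataFrame(
    {
        "xmin": [-1448.3, -1032.3, -616.3, -200.3, 215.6],
        "xmax": [-1032.3, -616.3, -200.3, 215.7, 631.6],
    },
    index=[1, 2, 3, 4, 5],
)

# Overlap between FOVs 4 and 5: xmax of 4 exceeds xmin of 5 by 0.1 um
shift = df.loc[4, "xmax"] - df.loc[5, "xmin"]

# Shift right all FOVs whose x position is >= that of FOV 5
mask = df["xmin"] >= df.loc[5, "xmin"]
df.loc[mask, ["xmin", "xmax"]] += shift
```

In this single-row example only FOV 5 moves; on the full 9x8 grid, the same mask also catches FOVs 13, 21, ... in the columns to the right.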

@jluethi
Collaborator

jluethi commented Sep 27, 2022

Sounds good. I think the assumption that we should move the later FOV to larger x & y values will mostly hold, at least for datasets I'm aware of. But it's not a general rule we'll be able to rely on. If we implement it like this for the moment, let's make a backlog issue to be aware of this limitation.

It would come up when someone uses a different imaging pattern, e.g. FOVs imaged in vertical columns or in a horizontal zig-zag, instead of the straight horizontal lines we usually get.

@jluethi jluethi added the High Priority Current Priorities & Blocking Issues label Sep 28, 2022
@tcompa tcompa mentioned this issue Sep 28, 2022
@jluethi
Collaborator

jluethi commented Sep 28, 2022

Let's check the metadata file here /data/active/fractal/Liberali/1_well_15_fields_20_planes_SF_w_errors/D10_R1/220304_172545_220304_175557

@tcompa
Collaborator Author

tcompa commented Sep 28, 2022

Let's check the metadata file here /data/active/fractal/Liberali/1_well_15_fields_20_planes_SF_w_errors/D10_R1/220304_172545_220304_175557

[figure fig_fovs: FOV layout for this well, with the overlapping FOVs drawn with thick boundaries]

@tcompa
Collaborator Author

tcompa commented Sep 28, 2022

And here are the FOV left boundaries:

well_id  field_id
A06      1          -781.6
         2            50.3
         4          -781.6
         5            50.3
         6          -781.6
         7            50.3
         8           882.3
         11           50.3
         12          882.3
         14           50.3
         15          882.3
Name: x_micrometer, dtype: float64

with the usual issue when crossing the x=0 axis: the "wrong" values are those just past the sign change (50.3, where the 832.0 um column spacing would give 50.4).
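
The same check as for the first dataset confirms the pattern: the column spacing is 832.0 um, except across x=0 where it is 0.1 um short:

```python
# Distinct FOV left boundaries in well A06 (from the listing above)
xmins = [-781.6, 50.3, 882.3]

steps = [round(b - a, 1) for a, b in zip(xmins, xmins[1:])]
print(steps)  # [831.9, 832.0]: the step crossing x=0 is 0.1 um short
```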

@tcompa
Collaborator Author

tcompa commented Sep 28, 2022

tl;dr
Both cases point towards a reproducible behavior: upon crossing x=0, the microscope-metadata positions are 0.1 um too close to 0. This behavior is correctly handled by the current approach, so let's move on with it (via #106).
