Skip to content

Update ROI_table_names argument of copy_ome_zarr #448

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Open
Tracked by #850
tcompa opened this issue Jun 26, 2023 · 7 comments · Fixed by #449
Open
Tracked by #850

Update ROI_table_names argument of copy_ome_zarr #448

tcompa opened this issue Jun 26, 2023 · 7 comments · Fixed by #449

Comments

@tcompa
Copy link
Collaborator

tcompa commented Jun 26, 2023

Ref:

Python

In principle we'd like to support multiple use cases:

  1. ROI_table_names = ["something", "else"] -> only these ROIs are copied
  2. ROI_table_names = None -> some default ROIs are copied (i.e. "FOV_ROI_table", "well_ROI_table"])
  3. ROI_table_names = [] -> no ROIs are copied

Support for all three cases is already in-place in the Python function, but it's not fractal-web-friendly.

JSON Schema and fractal-web

The relevant bit of the JSON Schema is

{
        "title": "CopyOmeZarr",
        "type": "object",
        "properties": {
          ...
          "ROI_table_names": {
            "title": "Roi Table Names",
            "type": "array",
            "items": {
              "type": "string"
            },
            "description": ...
          }
        },
        "required": [ "input_paths", "output_path", "metadata" ],
        "additionalProperties": false
      }

This means that the following args are identified as valid by a JSON Schema validator:

{ ..., "ROI_table_names": []}
{ ..., "ROI_table_names": ["something", "else"]}
{ ... }

However, we are currently stripping empty objects/arrays from server API calls in fractal-web (ref fractal-analytics-platform/fractal-web#209). This is in principle an arbitrary choice, but we don't want to end up with a a case-by-case heuristics for what should be done - since the current JSON Schema does not provide any guidance here (all three cases above are valid).

Back to Python

The obvious solution would be to set

def copy_ome_zarr(
    ...,
    ROI_table_names : list[str] = ["FOV_ROI_table", "well_ROI_table"]
    ):

but having mutable default values is a bad practice which we should avoid.
(For the record: it wouldn't matter in our Fractal-embedded use of tasks, but it would be risky when someone uses tasks as Python functions, therefore we should simply avoid it).

This issue is to find an alternative solution, if any.

@jluethi
Copy link
Collaborator

jluethi commented Jun 26, 2023

So how would we provide default "lists" of arguments best? Normally, I'd say we provide tuples in such a case, though I'm not sure what the schema / fractal-web implication of introducing tuples would be.

@tcompa
Copy link
Collaborator Author

tcompa commented Jun 26, 2023

So how would we provide default "lists" of arguments best? Normally, I'd say we provide tuples in such a case, though I'm not sure what the schema / fractal-web implication of introducing tuples would be.

Using tuples looks like a good idea, but we need the tuple[str, ...] construct, which is not available for old python versions. I can check the actual minimal version required, but that's not really a problem since fractal-tasks-core requires python>=3.9.

For the record, PEP 484 (which appeared for Python 3.5) states:

Tuple, used by listing the element types, for example Tuple[int, int, str]. The empty tuple can be typed as Tuple[()]. Arbitrary-length homogeneous tuples can be expressed using one type and ellipsis, for example Tuple[int, ...]. (The ... here are part of the syntax, a literal ellipsis.)

@tcompa
Copy link
Collaborator Author

tcompa commented Jun 26, 2023

Let's see if we hit any unexpected edge case, but the JSON-Schema side of this solution is quite transparent (see the manifest diff in 55ae926):

          "ROI_table_names": {
            "title": "Roi Table Names",
+            "default": [
+              "FOV_ROI_table",
+              "well_ROI_table"
+            ],
            "type": "array",
            "items": {

Therefore this could provide a nice little workaround, which is simple to implement in Python and simple to document. Let's continue in fractal-analytics-platform/fractal-task-tools#6.

@tcompa
Copy link
Collaborator Author

tcompa commented Jun 26, 2023

This was closed, but it needs to be re-opened.
It is still impossible (in the current fractal-web) to set ROI_table_names=[], because of

If we remove one item from the two-items list, we get only one. If we remove both, we go back to the default value (i.e. both items).
Maybe we will have to expose an option on fractal-web, to really set an empty list. Once again (as it is now evident): this is not trivial, due to the interaction of tasks/server/web. To be discussed further.

@tcompa tcompa reopened this Jun 26, 2023
@jluethi
Copy link
Collaborator

jluethi commented Jun 26, 2023

Ok, but that's now a different issue to me. The user can see and change what ROI tables are copied. Due to the issue with [], it can't be validated as a correct input though if an empty list is provided.

This doesn't need to be fixed for 0.10.0, let's think about how we want to eventually handle this though.

@tcompa
Copy link
Collaborator Author

tcompa commented Jun 26, 2023

Ok, but that's now a different issue to me. The user can see and change what ROI tables are copied. Due to the issue with [], it can't be validated as a correct input though if an empty list is provided.

I agree: viewing the ROI-table names is better than relying on the side effects of an empty list (which is then transformed to null, removed, and finally set to a default value from within the task). It's clearly an improvement!

Still: Fractal as a whole (tasks+server+web) doesn't support passing an empty-list argument, but for this task it could be useful (if I want to play with a minimal OME-Zarr, which has no ROI tables). We can improve the task, and maybe only raise warnings if the required ROI tables do not exist, but the main issue remains.

This doesn't need to be fixed for 0.10.0

That would be too late anyway ;)

@tcompa
Copy link
Collaborator Author

tcompa commented Jun 26, 2023

[], it can't be validated as a correct input though if an empty list is provided.

Note that [] is in fact a valid input, and ajv would not complain about it. We only strip it away because we want to enforce some global behavior, but in this case there is no real reason to do it.

@tcompa tcompa removed the High Priority Current Priorities & Blocking Issues label Oct 31, 2023
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
2 participants