Extend/improve copy-zarr task #279
Thank you @tcompa for adding this! In my opinion, the main use case is a user who is creating a new workflow from scratch and needs to test several parameters of one or more tasks. Instead of running these tests over the entire .zarr data, the user can, at the desired point in the workflow, generate a .zarr file containing a subset of the data and use it for all required tests. This .zarr structure can be short-lived, i.e. created and used only for testing, then discarded once the workflow is finalized and ready to run on full datasets. It could also be used to share data among collaborators, along with workflows, for example. However, we would need to enforce that some information is added to this zarr file so that it is not confused with its parent dataset, maybe by adding a '_subset' suffix to the name? Also, the largest drawback is that it requires the user to be vigilant and avoid keeping multiple unnecessary partial copies of the same data. I will think of more points as well.
To quickly summarize my comments on this from the call: Regarding cleanup, number of copies, etc.: I suggest we make a tmp folder in the output folder for such intermediary OME-Zarr files. Let's not get fancy about sharing or cleanup to start with, but just keep them in their own space. The major goal: allow users to test some parameters, check them on a small subset of the output, and then adapt their workflows accordingly. Another question: should this be a task? Technically, maybe. But it is something very different from a typical user story and a typical flow. A user may have a workflow of existing tasks but, for example, want to try a few different parameters for the cellpose task on a single FOV.
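The two suggestions above (a '_subset' suffix and a tmp folder inside the output folder) could be combined roughly as in the sketch below. It is only an illustration under stated assumptions, not the actual copy-zarr task: the function names `subset_zarr_path` and `copy_zarr_subset`, the `keep` argument, and the assumption that the zarr is a plain directory tree on disk are all hypothetical.

```python
import shutil
from pathlib import Path


def subset_zarr_path(output_dir, zarr_path, suffix="_subset", tmp_name="tmp"):
    """Build <output_dir>/<tmp_name>/<name><suffix>.zarr for a parent zarr.

    Hypothetical naming helper: the suffix and tmp folder follow the
    suggestions in this thread, nothing more.
    """
    parent = Path(zarr_path)
    return Path(output_dir) / tmp_name / f"{parent.stem}{suffix}{parent.suffix}"


def copy_zarr_subset(zarr_path, output_dir, keep):
    """Copy only the sub-directories in `keep` (paths relative to the zarr root).

    Assumes a directory-based zarr. Top-level metadata files are copied
    unchanged, so they may still describe parts of the parent dataset that
    are absent from the copy; metadata files of intermediate groups are not
    copied. A real task would need to handle both.
    """
    src = Path(zarr_path)
    dst = subset_zarr_path(output_dir, src)
    # Fail loudly rather than overwrite an existing partial copy.
    dst.mkdir(parents=True, exist_ok=False)
    for item in src.iterdir():
        if item.is_file():
            shutil.copy2(item, dst / item.name)
    for rel in keep:
        # copytree creates any missing intermediate directories.
        shutil.copytree(src / rel, dst / rel)
    return dst
```

For example, `copy_zarr_subset("plate.zarr", "output", ["B/03"])` would create `output/tmp/plate_subset.zarr` containing only the `B/03` sub-tree (here "B/03" is just an illustrative relative path). Refusing to overwrite an existing copy is one simple way to make accidental duplicate subsets visible to the user.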
Just to note this down before I forget: (we can decide to only cover this later, having this flexibility for all later steps is great. Let's just be aware that we probably also want to support this user flow above)
This has been already covered by the image-glob-pattern argument of zarr-creation tasks, so that the current issue only concerns the situation where we already have an OME-Zarr.
Based on last week's meetings, it seems that an improved version of the copy-ome-zarr task could be a nice starting point for the "let me work on an experimental branch of my workflow" use case, even if this use case is not yet fully defined on the server/web side. Some of the proposed new features:
Concerning the subset-filter, here are some possibilities (sorted by increasing complexity):
EDIT: I'm reviving this somewhat old discussion, based on last week's meetings. The new comments start from #279 (comment).
As per discussion with @gusqgm this morning.
It's a task that copies a subset of a zarr, something like: