@@ -2,242 +2,8 @@
 
 The `fractal-tasks-core` repository is the reference implementation for Fractal
 tasks and for Fractal task packages, but the Fractal platform can also be
-used to execute custom tasks. This page lists the Fractal-compatibility
-requirements, for a [single custom task](#single-custom-task) or for a [task
-package](#task-package).
+used to execute custom tasks.
 
-Note that these specifications evolve frequently, see e.g. discussion at
-https://github.com/fractal-analytics-platform/fractal-tasks-core/issues/151.
+For the most recent versions of Fractal (namely `fractal-server` v2 and `fractal-tasks-core` v1), the instructions for building your own tasks are available at https://fractal-analytics-platform.github.io/build_your_own_fractal_task.
 
-> **Note**: While the contents of this page remain valid, the recommended
-> procedure to get up to speed and build a Python package of Fractal-compatible
-> tasks is to use the template available at
-> https://github.com/fractal-analytics-platform/fractal-tasks-template.
-
-A Fractal task is mainly formed by two components:
-
-1. A set of metadata, which are stored in the `task` table of the database of a
-   `fractal-server` instance, see [Task metadata](#task-metadata).
-2. An executable command, which can take some specific command-line arguments
-   (see [Command-line interface](#command-line-interface)); the standard
-   example is a Python script.
-
-
-In the following we explain the Fractal-compatibility requirements for
-a single task, and then for a task package.
-
-## Single custom task
-
-We describe how to define the multiple aspects of a task, and provide a [Full task example](#full-task-example).
-
-### Task metadata
-
-Each task must be associated with some metadata, so that it can be used in
-Fractal. The full specification is
-[here](https://fractal-analytics-platform.github.io/fractal-server/reference/fractal_server/app/models/task/#fractal_server.app.models.task.Task),
-and the required attributes are:
-
-* `name`: the task name, e.g. `"Create OME-Zarr structure"`;
-* `command`: a command that can be executed from the command line;
-* `input_type`: this can be any string (typical examples: `"image"` or `"zarr"`);
-  the special value `"Any"` means that Fractal won't perform any check of the
-  `input_type` when applying the task to a dataset.
-* `output_type`: same logic as `input_type`.
-* `source`: this is meant to be as close as possible to a unique task identifier;
-  for custom tasks, it can be anything (e.g. `"my_task"`), but for tasks that
-  are collected automatically from a package (see [Task package](#task-package)) this
-  attribute will have a very specific form (e.g.
-  `"pip_remote:fractal_tasks_core:0.10.0:fractal-tasks::convert_yokogawa_to_ome-zarr"`).
-* `meta`: a JSON object (similar to a Python dictionary) with some additional
-  information, see [Task meta-parameters](#task-meta-parameters).
-
-There are multiple ways to get the appropriate metadata into the database,
-including a POST request to the `fractal-server` API (see `Tasks` section in
-the [`fractal-server` API
-documentation](https://fractal-analytics-platform.github.io/fractal-server/openapi))
-or the automated addition of a whole set of tasks through specific API
-endpoints (see [Task package](#task-package)).
-
-
-### Command-line interface
-
-Some examples of task commands may look like
-
-* `python3 /some/path/my_task.py`,
-* `/some/absolute/path/python3.10 /some/other/absolute/path/my_task.py`,
-* `/some/path/my_executable_task.py`,
-* any other executable command (not necessarily based on Python).
-
-Given a task command, Fractal will add two additional command-line arguments to it:
-
-* `-j /some/path/input-arguments.json`
-* `--metadata-out /some/path/output-metadata-update.json`
-
-Therefore the task command must accept these additional command-line arguments.
-If the task is a Python script, this can be achieved easily by using the
-`run_fractal_task` function - which is available as part of
-[`fractal_tasks_core.tasks._utils`](https://github.com/fractal-analytics-platform/fractal-tasks-core/blob/main/fractal_tasks_core/tasks/_utils.py).
-
-### Task meta-parameters
-
-The `meta` attribute of tasks (see the corresponding item in [Task
-metadata](#task-metadata)) is where we specify some requirements on how the
-task should be run. This notably includes:
-
-* If the task has to be run in parallel (e.g. over multiple wells of an
-  OME-Zarr dataset), then `meta` should include a key-value pair like
-  `{"parallelization_level": "well"}`. If the `parallelization_level` key is
-  missing, the task is considered as non-parallel.
-* If Fractal is configured to run on a SLURM cluster, `meta` may include
-  additional information on the SLURM requirements (more info on the Fractal
-  SLURM backend
-  [here](https://fractal-analytics-platform.github.io/fractal-server/internals/runners/slurm/)).
-
-### Task input parameters
-
-When a task is run via Fractal, its input parameters (i.e. the ones in the file
-specified via the `-j` command-line option) will always include a set of keyword
-arguments with specific names:
-
-* `input_paths`
-* `output_path`
-* `metadata`
-* `component` (only for parallel tasks)
-
-### Task output
-
-The only task output which will be visible to Fractal is what goes in the
-output metadata-update file (i.e. the one specified through the
-`--metadata-out` command-line option). Note that this only holds for
-non-parallel tasks, while (for the moment) Fractal fully ignores the output of
-parallel tasks.
-
-> **IMPORTANT**: This means that each task must always write any output to
-> disk, before ending.
-
-
-### Advanced features
-
-The description of other advanced features is not yet available on this page.
-
-1. Other attributes of the [Task metadata](#task-metadata) also exist, and they
-   would be recognized by other Fractal components (e.g. `fractal-server` or
-   `fractal-web`). These include JSON Schemas for input parameters and additional
-   documentation-related attributes.
-2. In `fractal-tasks-core`, we use [`pydantic
-   v1`](https://docs.pydantic.dev/1.10) to fully coerce and validate the input
-   parameters into a set of given types.
-
-### Full task example
-
-Here we describe a simplified example of a Fractal-compatible Python task (for
-more realistic examples see the `fractal-tasks-core` [tasks
-folder](https://github.com/fractal-analytics-platform/fractal-tasks-core/tree/main/fractal_tasks_core/tasks)).
-
-The script `/some/path/my_task.py` may look like
-```python
-# Import a helper function from fractal_tasks_core
-from fractal_tasks_core.tasks._utils import run_fractal_task
-
-def my_task_function(
-    # Reserved Fractal arguments
-    input_paths,
-    output_path,
-    metadata,
-    # Task-specific arguments
-    argument_A,
-    argument_B="default_B_value",
-):
-    # Do something, based on the task parameters
-    print("Here we go, we are in `my_task_function`")
-    with open(f"{output_path}/output.txt", "w") as f:
-        f.write(f"argument_A={argument_A}\n")
-        f.write(f"argument_B={argument_B}\n")
-    # Compile the output metadata update and return
-    output_metadata_update = {"nothing": "to add"}
-    return output_metadata_update
-
-# This block is executed when running the Python script directly
-if __name__ == "__main__":
-    run_fractal_task(task_function=my_task_function)
-```
-where we use `run_fractal_task` so that we don't have to take care of the [command-line arguments](#command-line-interface).
-
-Some valid [metadata attributes](#task-metadata) for this task would be:
-```python
-name="My Task"
-command="python3 /some/path/my_task.py"
-input_type="Any"
-output_type="Any"
-source="my_custom_task"
-meta={}
-```
-
-> Note that this was an example of a non-parallel task; to have a parallel
-> one, we would also need to:
->
-> 1. Set `meta={"parallelization_level": "something"}`;
-> 2. Include `component` in the input arguments of `my_task_function`.
-
-
-## Task package
-
-Given a set of Python scripts corresponding to Fractal tasks, it is useful to
-combine them into a single Python package, using the [standard
-tools](https://packaging.python.org/en/latest/tutorials/packaging-projects) or
-other options (e.g. for `fractal-tasks-core` we use
-[poetry](https://python-poetry.org/)).
-
-
-### Reasons
-
-Creating a package is often a good practice, for reasons unrelated to Fractal:
-
-1. It makes it simple to assign a global version to the package, and to host it
-   on a public index like PyPI;
-2. It may reduce code duplication:
-    * The scripts may have a shared set of external dependencies, which are
-      defined in a single place for a package.
-    * The scripts may import functions from a shared set of auxiliary Python
-      modules, which can be included in the package.
-
-Moreover, having a single package also streamlines some Fractal-related
-operations. Given the package `MyTasks` (available on PyPI, or locally), the
-Fractal platform offers a feature that automatically:
-
-3. Downloads the wheel file of package `MyTasks` (if it's on a public index,
-   rather than a local file);
-4. Creates a Python virtual environment (venv) which is specific for a given
-   version of the `MyTasks` package, and installs the `MyTasks` package in that
-   venv;
-5. Populates all the corresponding entries in the `task` database table with
-   the appropriate [Task metadata](#task-metadata), which are extracted from
-   the package manifest.
-
-This feature is currently exposed in the `/api/v1/task/collect/pip/` endpoint of `fractal-server` (see [API documentation](https://fractal-analytics-platform.github.io/fractal-server/openapi)).
-
-### Requirements
-
-To be compatible with Fractal, a task package must satisfy some additional requirements:
-
-* The package is built as a wheel file, and can be installed via `pip`.
-* The `__FRACTAL_MANIFEST__.json` file is bundled in the package, in its root
-  folder. If you are using `poetry`, no special operation is needed. If you
-  are using a `setup.cfg` file, see
-  [this
-  comment](https://github.com/fractal-analytics-platform/fractal-tasks-core/issues/151#issuecomment-1524929477).
-* Include JSON Schemas. The tools in `fractal_tasks_core.dev` are used to
-  generate JSON Schemas for the input parameters of each task in
-  `fractal-tasks-core`. They are meant to be flexible and re-usable to perform
-  the same operation on an independent package, but they are not thoroughly
-  documented/tested for more general use; feel free to open an issue if something
-  is not clear.
-* Include additional task metadata like `docs_info` or `docs_link`, which will
-  be displayed in the Fractal web-client. Note: this feature is not yet
-  implemented.
-
-
-> The ones in the list are the main requirements; if you hit unexpected
-> behaviors, also have a look at
-> https://github.com/fractal-analytics-platform/fractal-tasks-core/issues/151
-> or open a new issue.
+As a reference, here is [a copy of the legacy instructions for older Fractal versions](./custom_tasks_old.md), which are currently obsolete.
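
For readers comparing against the legacy text removed in this change, the command-line contract it describes (Fractal appends `-j <input-arguments file>` and `--metadata-out <metadata-update file>` to the task command) can be sketched with the standard library alone. This is only an illustration under stated assumptions, not the actual `run_fractal_task` implementation: the names `run_task_from_cli` and `example_task` are hypothetical, and the sketch assumes the task function returns a JSON-serializable dictionary.

```python
import argparse
import json


def example_task(input_paths, output_path, metadata, argument_A="A"):
    # Hypothetical task body: a real task would write its results to disk here.
    # The returned dictionary plays the role of the metadata update.
    return {"argument_A_seen": argument_A}


def run_task_from_cli(task_function, argv=None):
    # Hypothetical wrapper (not part of fractal-tasks-core): parse the two
    # arguments that the legacy instructions say Fractal adds to the command.
    parser = argparse.ArgumentParser()
    parser.add_argument("-j", dest="json_path", required=True)
    parser.add_argument("--metadata-out", dest="metadata_out", required=True)
    args = parser.parse_args(argv)

    # Load the keyword arguments (input_paths, output_path, metadata, ...)
    with open(args.json_path) as f:
        task_kwargs = json.load(f)

    # Run the task and write its metadata update to the requested file
    metadata_update = task_function(**task_kwargs)
    with open(args.metadata_out, "w") as f:
        json.dump(metadata_update, f)
    return metadata_update
```

In a standalone script, the wrapper would typically be called as `run_task_from_cli(example_task)` under an `if __name__ == "__main__":` guard, so that the arguments are taken from `sys.argv`.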