Skip to content

Tracking changes in v0.4.0. #400

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged
merged 12 commits into from
Aug 27, 2023
Merged

Tracking changes in v0.4.0. #400

merged 12 commits into from
Aug 27, 2023

Conversation

tobiasraabe
Copy link
Member

No description provided.

@codecov
Copy link

codecov bot commented Jul 23, 2023

Codecov Report

Merging #400 (b299cc9) into main (39bfcdf) will decrease coverage by 0.54%.
The diff coverage is 92.05%.

❗ Current head b299cc9 differs from pull request most recent head 0c4b0c5. Consider uploading reports for the commit 0c4b0c5 to get more accurate results

@@            Coverage Diff             @@
##             main     #400      +/-   ##
==========================================
- Coverage   96.16%   95.62%   -0.54%     
==========================================
  Files          93       94       +1     
  Lines        7797     8254     +457     
==========================================
+ Hits         7498     7893     +395     
- Misses        299      361      +62     
Flag Coverage Δ
end_to_end 82.92% <88.29%> (+0.26%) ⬆️
integration 39.62% <36.36%> (-1.33%) ⬇️
unit 66.06% <55.38%> (-0.54%) ⬇️

Flags with carried forward coverage won't be shown. Click here to find out more.

Files Changed Coverage Δ
src/_pytask/cli.py 98.66% <ø> (-0.04%) ⬇️
src/_pytask/mark/__init__.py 96.63% <ø> (ø)
src/_pytask/report.py 100.00% <ø> (ø)
tests/test_console.py 100.00% <ø> (ø)
tests/test_debugging.py 89.89% <ø> (-0.04%) ⬇️
tests/test_dry_run.py 100.00% <ø> (ø)
tests/test_graph.py 100.00% <ø> (ø)
tests/test_skipping.py 100.00% <ø> (ø)
src/_pytask/_inspect.py 16.12% <16.12%> (ø)
src/_pytask/click.py 92.30% <50.00%> (+3.49%) ⬆️
... and 42 more

... and 5 files with indirect coverage changes

@tobiasraabe tobiasraabe mentioned this pull request Jul 28, 2023
21 tasks
@NickCrews
Copy link
Contributor

NickCrews commented Jul 28, 2023

Hey @tobiasraabe, I just came here after reading the docs on the API changes at https://pytask-dev--392.org.readthedocs.build/en/392/tutorials/write_a_task.html

I'm sorry to say this, I'm sure you don't want to hear this, but I don't think going in the Annotated direction is a good idea. It ties the implementation logic in too closely with the orchestration logic. If there is some 3rd party function that I want to just wrap as a step, or use a simple lambda, I can't do that, because I need to add type hints. I want to be able to do:

task_drop_column = pytask.Task(lambda df: df.drop(columns="foo"), ins=DfLoader("in.parquet"), outs=DfSaver("out.parquet"))

from fuzzywuzzy import fuzz 

task_calc_ratio = pytask.Task(fuzz.ratio, ins=[..., ...], outs=...)

I think moving from separate produces and depends_on decorators into a single decorator/wrapper is a good idea, but I think the "wrapping" idea is essential. What was the pain point of decorators that you were trying to get away from?

@tobiasraabe
Copy link
Member Author

No worries. I appreciate your honest feedback. I have way too much fun redesigning pytask than being disappointed if the first draft does not work out.

You are correct that I did not think about lambdas and third-party functions. TBH, I have never heard from someone before that uses pytask like you do and this is why the current v0.4 focuses so far on making the interface better for the (probably silent) majority of users that follows mostly the design in the docs.

That being said, I believe your use case is something that pytask should support and strive to improve. To continue supporting your use case, pytask could treat produces as a magic argument name like it does now + inject values via @pytask.mark.task. But, I will think about this more in the next weeks and hopefully develop a better proposal.

The main pain points were that

  • the decorators are unnecessary if you use @pytask.mark.task(kwargs=...) or defaults of arguments to inject values into the function.
  • Also, you can choose better names for dependencies and products than depends_on and produces if we allow to use the whole signature of a function.
  • Lastly, type hints, especially for dependencies, become more precise.

Let us continue discussions like that in #361. Otherwise, our messages are everywhere, and nobody can follow them. Please feel free to post a summary/ies there if you'd like. If not, I will do it hopefully soon with a proposal.

@NickCrews
Copy link
Contributor

Ahh, #361 looks like a good central place for this discussion, I'll go over there :)

@tobiasraabe tobiasraabe added this to the v0.4.0 milestone Jul 30, 2023
@tobiasraabe tobiasraabe marked this pull request as ready for review August 27, 2023 16:10
@tobiasraabe tobiasraabe merged commit 0c4b0c5 into main Aug 27, 2023
@tobiasraabe tobiasraabe deleted the v4 branch August 27, 2023 16:12
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants