Separate input and output directory settings #395
Follow-up of #340
I'm not sure about this PR. What's the goal? There are scenarios where syncing back to the current workdir is fine (CI, for example).
If I recall correctly, 🔔 @casperdcl also had a preference for the old 🔔 @iterative/cml, any opinions/alternatives?
I see, but it's an odd mechanism for users to have to guess. We should at least document it thoroughly.
I consider this superior because changing it would allow you to change your mind if we make this property non-destructive.
I believe that the behavior introduced with this pull request is cleaner and more explicit:
What you propose is a bit more complex to understand, at least for me: 🤔
I wholeheartedly agree, no matter which option we choose.
Can't we just configure
To my mind, it's quite the opposite, but I guess that it's a subjective perception in both cases.
I have never contemplated the idea of not pushing input data. I do not even envision too many scenarios like that.
To my mind, this is the most common and expected case, and it already happens in CI. If I generate artefacts during the task, I want them to be in the same place as they would have been if run locally.
Some users might not want to synchronize a local working directory but still want to retrieve the results after the execution, e.g. when cloning the repository as part of the script to reduce wait times, or for DVC-driven tasks.
You seem particularly focused on CI/CD scenarios. I wonder why. 😉
Sure? Such a behavior would:
To my mind, that is not the case because, as I stated, I think the most expected way is:
I do not envision cloning the repo and working locally, because that kills experimentation. Should they make a change and immediately commit?
Don't DVC-driven tasks usually need at least a DVC pipeline that has to be pushed? I might be wrong, but we should probably discuss this urgently instead of relying on personal assumptions.
In CI/CD systems this may hold true, but overwriting files as warned on #395 (comment) doesn't look like the most user-friendly experience for local users.
🤔 Should we even assume that users have some version control system in place?
CI/CD scenarios are pretty clear and work as expected. 👍🏼
But probably with DVC itself instead of
Shall we return to this page and refine the existing proposals or suggest new ones? 🤔
No. But I'm not expecting them to clone
Overwriting is the most common behaviour in the majority of commands involving files. The issue here is wanting to do multiple runs in the same folder, but that is a scenario a regular user manages on a daily basis. I would say
We already had a behaviour in place. The problem is that introducing skip completely alters the known, if not yet validated, behaviour.
It's not because of CI. It's because, as stated, I think it is the most common way. Let's imagine that the task is not running in the cloud; it's just a magical power that enhances our laptop, making it capable of training with a GPU. As I stated, the most common way to understand the output is that the artefacts are going to be in the place where the script generates them. Also, we should always reduce the complexity of the story. We cannot say "for local do a, for CI do b". That's more confusing than saying "for local or CI do a".
Sounds like a valid use case? 🤔
Overwriting is a common behavior in operations that explicitly take a list of files. It doesn't seem to be the expected behavior for such a complex process, especially when the working directory contains both the user's project and the Terraform code and state files.
What do you mean by “managed by a regular user on a daily basis”? Is this process slow enough to not require any automation? 🤔
Do you mean the original behavior, prior to #340?
Agreed! That's the behavior we had in place before #340, but that behavior caused #306 and overwrote local files with outdated copies.
Agreed!
Sometimes, both
Co-authored-by: Casper da Costa-Luis <[email protected]>
Follow-up of #340, task: optionally skip downloading artefacts on destroy #303

Old behavior
When `output` was unset, `""` or `null`, it implicitly mirrored the value of `input`.

New behavior
When `output` is unset, `""` or `null`, it will be disabled. It doesn't implicitly take the value of `input` anymore.
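
To make the difference between the two behaviors concrete, here is a minimal Python sketch of how the effective output directory could be resolved under each rule. It is only an illustration of the semantics described above, not the provider's actual code; the function names (`is_unset`, `resolve_output_old`, `resolve_output_new`) are hypothetical, and `None` stands in for both "unset" and `null`.

```python
from typing import Optional


def is_unset(value: Optional[str]) -> bool:
    """Treat None (unset or null) and the empty string as 'no value'."""
    return value is None or value == ""


def resolve_output_old(input_dir: Optional[str], output_dir: Optional[str]) -> Optional[str]:
    """Old behavior: an unset output implicitly mirrors the value of input."""
    if is_unset(output_dir):
        # Fall back to input; if input is also unset, nothing is downloaded.
        return None if is_unset(input_dir) else input_dir
    return output_dir


def resolve_output_new(input_dir: Optional[str], output_dir: Optional[str]) -> Optional[str]:
    """New behavior: an unset output disables downloading altogether.

    input_dir is accepted only to keep the signature symmetric with the
    old rule; it is deliberately ignored.
    """
    return None if is_unset(output_dir) else output_dir


if __name__ == "__main__":
    for output_dir in (None, "", "results"):
        print(
            repr(output_dir),
            "old:", resolve_output_old("data", output_dir),
            "new:", resolve_output_new("data", output_dir),
        )
```

Under this sketch, a user who wants the previous mirroring has to set `output` to the same value as `input` explicitly, which is the trade-off debated in the comments above.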