workflow restart logic not needed for non-spot #1135

Open
casperdcl opened this issue Aug 23, 2022 · 6 comments
Labels: bug (Something isn't working), cml-runner (Subcommand)

Comments

@casperdcl
Contributor

No description provided.

@0x2b3bfa0
Member

...unless your workflow runs take more than 35 days [1] to finish?

Footnotes

  1. On GitHub, as per https://github.com/iterative/cml/pull/1067.

@dacbd
Contributor

dacbd commented Sep 9, 2022

I think we should strongly consider dropping that.

Reasons I would use cml runner

  • short-lived instance life-cycle management because I don't want to pay for that cloud GPU any longer than is required.
  • super simple setup of the CI agent on a system

I think that something that runs longer than 35 days doesn't really fall into one of the above.

/opinions?

@0x2b3bfa0
Member

0x2b3bfa0 commented Sep 9, 2022

I wholeheartedly agree: the 35-day limit is enough for all the considered use cases unless proven otherwise, and the maintenance overhead of this feature probably outweighs the dubious edge cases where it becomes useful.
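
For readers skimming the thread, the condition being discussed can be written down as a small sketch. This is not cml's implementation: `RunnerPlan`, `needs_restart_logic`, and `GITHUB_RUN_LIMIT_DAYS` are hypothetical names introduced only for illustration, and the 35-day figure is taken from the comment above.

```python
from dataclasses import dataclass

# Assumption: the 35-day run limit on GitHub mentioned in this thread.
GITHUB_RUN_LIMIT_DAYS = 35


@dataclass
class RunnerPlan:
    """Hypothetical description of a planned runner, for illustration only."""
    spot: bool                    # True when the instance can be preempted
    expected_runtime_days: float  # rough estimate of how long the job runs


def needs_restart_logic(plan: RunnerPlan,
                        limit_days: float = GITHUB_RUN_LIMIT_DAYS) -> bool:
    """Return True when a workflow restart could be required."""
    # Spot instances can be reclaimed at any time, so restarts stay relevant.
    if plan.spot:
        return True
    # Non-spot runs are only affected when they outlive the run limit.
    return plan.expected_runtime_days > limit_days
```

Under these assumptions, a non-spot job expected to finish within the limit would not need the restart logic, which is the case the issue title describes.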

@casperdcl
Contributor Author

casperdcl commented Sep 9, 2022

>35 days: use LEO

@DavidGOrtega
Contributor

> short-lived instance life-cycle management because I don't want to pay for that cloud GPU any longer than is required.

Fine-tuning Stable Diffusion, GPT-2, or GPT-3 alternatives might take more than 35 days. Just mentioning 😁

@DavidGOrtega
Contributor

>35 days: use LEO

Or for anything: use LEO
