example-dvc-experiments: Include CML configuration #83
Comments
I'd strongly suggest the CML case is pushed to a different repo ( |
How is it different in terms of the project? Let's try to scope it here as well: the "happy path", the get-started experience, etc. What is the purpose of the repo for CML: tutorial, use case, get started? What are the things that we'd like to show?
this should not be a problem to my mind, additional GH action config is totally fine to have (people won't see it unless you point to it)
agreed, if we talk about CML in general (when DVC is not being used at all). If we talk about DVC+CML - I'm not sure why that would be confusing? And to clarify, name should be generic here in that case -
example per branch is bad for a lot of reasons - branches are first class citizens in the DVC workflow and mixing them this way is bad to my mind (think about connecting such a repo to DVC Studio), or running a command like
yep, agreed - I would not do branches. See above. It should be a simple repo like the existing get-started one that covers the happy path across DVC, CML, DVCLive ... to clarify, I also don't think that it will cover everything ... but we should all be optimizing for simplicity and trying hard to find common ground where all tools integrate nicely
this scope sounds good to me, that's what we do for the get-started-example, and there is no contradiction so far. I see only benefits in this.
yes, but this discussion started when we were trying to use the mnist repo (and codify it) for CML, as far as I understand? Ideally I would then plan a bit: what kind of repositories will you need for CML, which of them will you need to codify, etc.? No doubt there will be a lot of smaller repos (considering that we have GitLab/GitHub/Bitbucket + different clouds + different scenarios like TensorBoard). It's a separate question how you want to build them, which of them to codify, etc. Same with Here we are talking more about the get-started experience, I think. Back to my initial question - would it be useful/possible to create
I'd propose to determine the most common cases (i.e. the happy path?) for the related technologies and bundle them in a common repository, and additionally have smaller repositories that may be used as showcases. In the CML case, that seems to be the GitHub configuration with AWS. This can be default in Codification for the configuration is straightforward. We just need to determine at which stage it's most relevant to configure.
After reviewing this again, I think providing a repository generator (a la
Otherwise, it will be difficult to keep up with creating a separate repository for every possible setup. Also, I'm not sure we know the happy path for all kinds of users: some may want a simple repository, others may want bells and whistles.
Sounds a lot like creating our own cookiecutter. Having a fork of https://github.com/drivendata/cookiecutter-data-science could be a way to get users started quickly. |
I have a repo that is a full example (and also an integration tester) of DVC-CML for GL, BB and GH.
That's a better idea. @dberenbaum |
srry haven't followed this since Sept 2021 🙈 😅 See the list at the top of #100 for the current CML example repo layout:
So it's a lot of potential complexity. In terms of "single example happy path showcase of all products" I'd suggest 2 options:
I don't know whether this is within the scope of |
@casperdcl what do we need from to make this repo useful for CML happy-path?
@iesahin @casperdcl as we discussed, can we make it example-experiments, a repo that would cover basic scenarios with a predefined language (Python) and a predefined framework (let's say TensorFlow for now)? It would have a CML action from the first (?) commit that could be run if needed (and maybe even runs automatically). Then it'll be a good repo that we can even meaningfully present in Studio?
What do we need to make it substantially useful for CML?
Originally posted by @shcheklein in #79 (comment)