
SIG Autoscaling Charter and sigs.yaml updates #2425

Merged: 2 commits merged into kubernetes:master on Dec 7, 2018

Conversation

DirectXMan12
Contributor

This adds a SIG Autoscaling charter and updates sigs.yaml to make sure it has the right subprojects assigned to SIG Autoscaling vs. SIG Instrumentation.

@k8s-ci-robot k8s-ci-robot added size/M Denotes a PR that changes 30-99 lines, ignoring generated files. cncf-cla: yes Indicates the PR's author has signed the CNCF CLA. sig/autoscaling Categorizes an issue or PR as relevant to SIG Autoscaling. labels Jul 26, 2018
@DirectXMan12
Contributor Author

@kubernetes/sig-api-machinery-misc I think SIG Autoscaling should probably own the polymorphic scale client (since the primary motivating use case is the HPA controller), but I can be convinced otherwise. WDYT?

cc @kubernetes/sig-autoscaling-misc
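
For context, the polymorphic scale client lives in `k8s.io/client-go/scale` and lets a controller resize any resource that serves a `/scale` subresource without knowing its concrete type, which is why the HPA controller is its primary consumer. A minimal sketch of how it is wired up and used, assuming recent client-go signatures (older releases omit the context arguments) and a hypothetical Deployment named `web`:

```go
package main

import (
	"context"
	"fmt"

	metav1 "k8s.io/apimachinery/pkg/apis/meta/v1"
	"k8s.io/apimachinery/pkg/runtime/schema"
	"k8s.io/client-go/discovery"
	"k8s.io/client-go/dynamic"
	"k8s.io/client-go/restmapper"
	"k8s.io/client-go/scale"
	"k8s.io/client-go/tools/clientcmd"
)

func main() {
	cfg, err := clientcmd.BuildConfigFromFlags("", clientcmd.RecommendedHomeFile)
	if err != nil {
		panic(err)
	}
	disco, err := discovery.NewDiscoveryClientForConfig(cfg)
	if err != nil {
		panic(err)
	}

	// Discovery tells the client, per resource, which group/version serves
	// the /scale subresource, so callers never hard-code workload types.
	groups, err := restmapper.GetAPIGroupResources(disco)
	if err != nil {
		panic(err)
	}
	mapper := restmapper.NewDiscoveryRESTMapper(groups)
	resolver := scale.NewDiscoveryScaleKindResolver(disco)

	scales, err := scale.NewForConfig(cfg, mapper, dynamic.LegacyAPIPathResolverFunc, resolver)
	if err != nil {
		panic(err)
	}

	// The same calls work for Deployments, StatefulSets, or CRDs that enable
	// the scale subresource; only the GroupResource changes.
	gr := schema.GroupResource{Group: "apps", Resource: "deployments"}
	sc, err := scales.Scales("default").Get(context.TODO(), gr, "web", metav1.GetOptions{})
	if err != nil {
		panic(err)
	}
	fmt.Printf("replicas=%d selector=%q\n", sc.Status.Replicas, sc.Status.Selector)

	sc.Spec.Replicas = 3
	if _, err := scales.Scales("default").Update(context.TODO(), gr, sc, metav1.UpdateOptions{}); err != nil {
		panic(err)
	}
}
```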

@k8s-ci-robot k8s-ci-robot added the sig/api-machinery Categorizes an issue or PR as relevant to SIG API Machinery. label Jul 26, 2018
Covers development and maintenance of componets for automated scaling in
Kubernetes. This includes automated vertical and horizontal pod
autoscaling, initial resource estimation, cluster-proportional system
component autoscaling, and autoscaling of Kubernetes clusters themselves.
Contributor

Currently cluster proportional autoscaling is not owned by sig-autoscaling:

  • addon-resizer is maintained by sig-instrumentation (AFAIK it's primarily used for heapster and metrics-server)
  • kube-dns-autoscaler is maintained by sig-networking

Ideally sig-autoscaling should take ownership of this use-case, but we're not there yet.

Contributor Author

oh, do we not own the addon-resizer? I thought it was under our purview from the pre-sig-instrumentation Heapster days, and I didn't think it got transferred over, since it's an autoscaler.

We probably should own those, at any rate. @kubernetes/sig-instrumentation-misc

Member

addon-resizer is maintained by sig-instrumentation (AFAIK it's primarily used for heapster and metrics-server)

I was not aware of this, and it's also not in our SIG components. I was under the impression that sig-autoscaling owns it.

Contributor

I mean no-one is really doing much work on addon-resizer, but I believe @kawych and @x13n are the ones maintaining it.

Contributor Author

Every kubernetes piece of code needs to be owned by a SIG. I think it makes sense for us to own it, especially since it doesn't fall under the responsibilities of SIG instrumentation normally. @kawych and @x13n can continue to be the ones to work on it, and we can list them as the owners of the subproject, but it'll fall under the purview of sig autoscaling.

Member

@DirectXMan12’s suggestion sounds good to me.

Contributor Author

@kubernetes/sig-network-misc the cluster-proportional-autoscaler doesn't seem to be listed under sigs.yaml (unless my searching is failing me). I think it makes some amount of sense for it to be a subproject of sig-autoscaling (no change in ownership necessary, but it should fall under a sig umbrella), so I'd appreciate your opinions.

Member

@DirectXMan12 what you suggested sounds great. Also wanted to point out yet another component that might fall under this same umbrella: cluster-proportional-vertical-autoscaler.

Contributor Author

cool, thanks for the heads up. I'll update sigs.yaml with all of those

Contributor

I don't think this SIG owns an incubator project, unless it was created by SIG members who want it to be within the scope of the SIG...


- Ensuring API interfaces (the scale subresource) are available and usable
to enable other SIGs to write autoscalable objects, and enable people to
interact with those interfaces.
Contributor

I'd argue this requires a lot of apimachinery know-how, which is not too closely related to anything else we do in the SIG. I suspect you're the only one who actually knows how the polymorphic scale client works.

Contributor Author

Sure, but we're responsible for maintaining the HPA, and the HPA can't work w/o the scale subresource, and the scale subresource's object is part of the autoscaling API group.

@kubernetes/sig-api-machinery-misc any opinions?

Member

In general, api-machinery owns mechanisms, specific sigs own specific APIs.

In this case, I'd expect api-machinery to own the apiserver code that enables subresources, and autoscaling to own the in-tree scale subresource implementations, the autoscaling/v1 Scale API types, and custom code in the scale client.

Contributor Author

ack, that's more-or-less what I expected
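
To make that split concrete: every scalable built-in (and any CRD that enables the scale subresource) serves the shared autoscaling/v1 `Scale` object through its `/scale` endpoint, and that object is what both the polymorphic scale client and the typed clients read and write. A minimal sketch using the typed apps/v1 client, again assuming recent client-go signatures and a hypothetical Deployment named `web`:

```go
package main

import (
	"context"
	"fmt"

	metav1 "k8s.io/apimachinery/pkg/apis/meta/v1"
	"k8s.io/client-go/kubernetes"
	"k8s.io/client-go/tools/clientcmd"
)

func main() {
	cfg, err := clientcmd.BuildConfigFromFlags("", clientcmd.RecommendedHomeFile)
	if err != nil {
		panic(err)
	}
	cs, err := kubernetes.NewForConfig(cfg)
	if err != nil {
		panic(err)
	}

	// GET .../apis/apps/v1/namespaces/default/deployments/web/scale returns
	// an autoscaling/v1 Scale, not an apps/v1 Deployment.
	sc, err := cs.AppsV1().Deployments("default").GetScale(context.TODO(), "web", metav1.GetOptions{})
	if err != nil {
		panic(err)
	}
	fmt.Printf("spec.replicas=%d status.selector=%q\n", sc.Spec.Replicas, sc.Status.Selector)

	// Writing spec.replicas back through the same subresource is how scaling
	// tools (kubectl scale, the HPA via the scale client) resize workloads.
	sc.Spec.Replicas = 5
	if _, err := cs.AppsV1().Deployments("default").UpdateScale(context.TODO(), "web", sc, metav1.UpdateOptions{}); err != nil {
		panic(err)
	}
}
```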

autoscaling behaves properly.

- Coordinating with SIG Instrumentation to ensure that metrics APIs are
suitable for autoscaling on.
Contributor

Can we add coordinating with sig-scheduling to ensure new scheduling features are compatible with cluster autoscaling?

Contributor Author

ack

### Out of scope

- Testing performance at scale (this falls under the purview of [SIG
Scalability]).
Contributor

We certainly do it for cluster autoscaler. And I don't think sig-scalability is doing any tests for HPA. Maybe we should reconsider? I would argue some level of scalability should be a prerequisite for HPAv2 GA, which would make it very much our problem.

Contributor Author

SIG Scalability and SIG Autoscaling are often confused in issue requests, and I kind of wanted to make a clear delineation. It doesn't mean we can't test our own components, but we're not responsible for general cluster scale testing (e.g. if you can't launch 100000 pods at once, it's not our job to look into why). I'll try to clarify.

Contributor

Ah, gotcha, I completely misunderstood what you meant.

@k8s-ci-robot k8s-ci-robot added sig/instrumentation Categorizes an issue or PR as relevant to SIG Instrumentation. sig/network Categorizes an issue or PR as relevant to SIG Network. labels Jul 30, 2018
@DirectXMan12 DirectXMan12 force-pushed the autoscaling/charter branch from 20e257c to ef9235c on July 30, 2018 19:40

## Scope

Covers development and maintenance of componets for automated scaling in
Member

nit: componets -> components

@k8s-ci-robot k8s-ci-robot added size/L Denotes a PR that changes 100-499 lines, ignoring generated files. and removed size/M Denotes a PR that changes 30-99 lines, ignoring generated files. labels Aug 14, 2018
@spiffxp
Member

spiffxp commented Aug 14, 2018

@DirectXMan12 can you split the sigs.yaml change apart from the charter change? I'd like the sigs.yaml changes in sooner, charter review will take longer

@DirectXMan12
Contributor Author

See #2534

sigs.yaml Outdated
- name: autoscaler
- name: scale-client
owners:
- https://raw.githubusercontent.com/kubernetes/client-go/master/scale/OWNERS # doesn't exist yet
Member

api machinery owns the level above this. Is it OK for OWNERS files in this list to overlap? I assume it must be. I don't have a strong opinion about who should own this code at the moment. If we had a client-focussed SIG, they would own it, I think, but we don't.

sigs.yaml Outdated
owners:
- https://raw.githubusercontent.com/kubernetes/metrics/master/OWNERS
- https://raw.githubusercontent.com/kubernetes/autoscaler/master/OWNERS
- name: horizntal-pod-autoscaler
Member

spelling

@k8s-ci-robot k8s-ci-robot added the needs-rebase Indicates a PR cannot be merged because it has merge conflicts with HEAD. label Aug 18, 2018
@spiffxp
Member

spiffxp commented Aug 31, 2018

/committee steering
/assign @brendanburns @michelleN
per kubernetes/steering#31

@k8s-ci-robot
Contributor

@spiffxp: GitHub didn't allow me to assign the following users: brendanburns.

Note that only kubernetes members and repo collaborators can be assigned.
For more information please see the contributor guide

In response to this:

/committee steering
/assign @brendanburns @michelleN
per kubernetes/steering#31

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository.

@k8s-ci-robot k8s-ci-robot added the committee/steering Denotes an issue or PR intended to be handled by the steering committee. label Aug 31, 2018
@brendanburns
Contributor

brendanburns commented Aug 31, 2018 via email

@k8s-ci-robot k8s-ci-robot added size/M Denotes a PR that changes 30-99 lines, ignoring generated files. and removed size/L Denotes a PR that changes 100-499 lines, ignoring generated files. needs-rebase Indicates a PR cannot be merged because it has merge conflicts with HEAD. labels Nov 12, 2018
@DirectXMan12
Contributor Author

@spiffxp @brendanburns rebased. What else has to be done?

@DirectXMan12
Contributor Author

This should fix the verify issues.

Member

@spiffxp spiffxp left a comment

I spotted two silly typos. From a content perspective this LGTM. I think it would help to see an explicit LGTM or confirmation that you've addressed @mwielgus' concerns


## Scope

Covers development and maintenance of components for automated scaling in
Contributor

I think that there should be something about "integrated into kubernetes by default" or something like that.

Basically, I want it to be clear that if someone develops a random autoscaler in their own repo, it's not covered by this SIG. I think that's pretty clearly understood by everyone involved, but this text seems to lay claim to all Kubernetes autoscaling

(and the thread below that points to an incubator project as part of the scope makes this distinction even more important to make)

Contributor Author

We also own officially "sponsored" projects. The term "integrated into Kubernetes by default" is pretty ambiguous, especially in the era of "do it out-of-tree using a CRD" (i.e. the vertical pod autoscaler and cluster autoscaler). If someone wants to get their project sponsored by a SIG, SIG Autoscaling is the right place, yet it may not be "integrated into Kubernetes by default".


#### Code, Binaries and Services

- Components and utilities that take automated action to scale a component
Contributor

These are all kind of vague, pointers to actual repos/folders?

Contributor Author

I was under the impression that concrete folders are what sigs.yaml was supposed to be for, and that this was more of "conceptually, what do you cover". That being said, I can certainly provide some examples here, or actual links if my impression was incorrect.

Contributor Author

fixed to mention the actual subproject names

- Components and utilities that take automated action to scale the cluster
itself

- Special parts of client-go for interacting with the scaling
Contributor

Definitely need a pointer into the project here.

Contributor Author

done -- mentioned the specific subproject. Actual folder definitions are in sigs.yaml

This updates the mission statement for SIG Autoscaling to correspond to
the scope of SIG Autoscaling in the charter.
@DirectXMan12
Contributor Author

@spiffxp typos fixed
@brendanburns fixed the comments re: specific links by mentioning specific subprojects that can be cross-referenced with sigs.yaml to find folders. Adding specific folders in the charter seems like a recipe for getting out of date when we already have sigs.yaml.

@spiffxp I don't see any comments from @mwielgus ... what do you mean?

@spiffxp
Member

spiffxp commented Dec 4, 2018

@DirectXMan12 wow yeah where did I get that name from, I meant @MaciekPytel, I think their comments are satisfied? just looking for a confirm

@DirectXMan12
Contributor Author

Looks like @MaciekPytel has responded to most of the threads they raised confirming that, but just in case, pinging @MaciekPytel to confirm.

@MaciekPytel
Contributor

I'm happy with how my comments were addressed, lgtm.

@DirectXMan12
Contributor Author

DirectXMan12 commented Dec 5, 2018

@spiffxp @brendanburns good to go then? (want to get this merged by KubeCon ;-) )

@brendanburns
Contributor

brendanburns commented Dec 6, 2018 via email

@spiffxp
Member

spiffxp commented Dec 6, 2018

As do we. I poked @brendandburns and @michelleN both during the steering meeting today; if you're still hanging around tomorrow I can serve as another SC reviewer. lgtm

@k8s-ci-robot
Contributor

@brendanburns: changing LGTM is restricted to assignees, and only kubernetes/community repo collaborators may be assigned issues.

In response to this:

/lgtm

On Wed, Dec 5, 2018, 1:21 PM Solly Ross <[email protected]> wrote:

@spiffxp https://github.com/spiffxp @brendanburns
https://github.com/brendanburns good to go then?



Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository.

@spiffxp
Member

spiffxp commented Dec 7, 2018

/lgtm
Here's your proxy LGTM from @brendandburns
/approve
Here's a second SC reviewer saying LGTM

Thanks for working through this!

@k8s-ci-robot k8s-ci-robot added the lgtm "Looks good to me", indicates that a PR is ready to be merged. label Dec 7, 2018
@k8s-ci-robot
Contributor

[APPROVALNOTIFIER] This PR is APPROVED

This pull-request has been approved by: spiffxp

The full list of commands accepted by this bot can be found here.

The pull request process is described here

Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

@k8s-ci-robot k8s-ci-robot added the approved Indicates a PR has been approved by an approver from all required OWNERS files. label Dec 7, 2018
@k8s-ci-robot k8s-ci-robot merged commit 6ac21ab into kubernetes:master Dec 7, 2018
@DirectXMan12 DirectXMan12 deleted the autoscaling/charter branch December 7, 2018 02:22