WIP: Introduce Node Lifecycle WG #8396

Open · wants to merge 1 commit into base: master
5 changes: 5 additions & 0 deletions OWNERS_ALIASES
@@ -142,6 +142,11 @@ aliases:
- jeremyrickard
- liggitt
- micahhausler
wg-node-lifecycle-leads:
- atiratree
- fabriziopandini
- humblec
- rthallisey
wg-policy-leads:
- JimBugwadia
- poonam-lamba
1 change: 1 addition & 0 deletions communication/slack-config/channels.yaml
@@ -584,6 +584,7 @@ channels:
- name: wg-multitenancy
- name: wg-naming
archived: true
- name: wg-node-lifecycle
- name: wg-onprem
archived: true
- name: wg-policy
1 change: 1 addition & 0 deletions liaisons.md
@@ -59,6 +59,7 @@ members will assume one of the departing members groups.
| [WG Device Management](wg-device-management/README.md) | Patrick Ohly (**[@pohly](https://github.com/pohly)**) |
| [WG etcd Operator](wg-etcd-operator/README.md) | Maciej Szulik (**[@soltysh](https://github.com/soltysh)**) |
| [WG LTS](wg-lts/README.md) | Sascha Grunert (**[@saschagrunert](https://github.com/saschagrunert)**) |
| [WG Node Lifecycle](wg-node-lifecycle/README.md) | TBD (**[@TBD](https://github.com/TBD)**) |
| [WG Policy](wg-policy/README.md) | Patrick Ohly (**[@pohly](https://github.com/pohly)**) |
| [WG Serving](wg-serving/README.md) | Maciej Szulik (**[@soltysh](https://github.com/soltysh)**) |
| [WG Structured Logging](wg-structured-logging/README.md) | Sascha Grunert (**[@saschagrunert](https://github.com/saschagrunert)**) |
1 change: 1 addition & 0 deletions sig-apps/README.md
@@ -59,6 +59,7 @@ subprojects, and resolve cross-subproject technical issues and decisions.
The following [working groups][working-group-definition] are sponsored by sig-apps:
* [WG Batch](/wg-batch)
* [WG Data Protection](/wg-data-protection)
* [WG Node Lifecycle](/wg-node-lifecycle)
* [WG Serving](/wg-serving)


1 change: 1 addition & 0 deletions sig-architecture/README.md
@@ -58,6 +58,7 @@ The Chairs of the SIG run operations and processes governing the SIG.
The following [working groups][working-group-definition] are sponsored by sig-architecture:
* [WG Device Management](/wg-device-management)
* [WG LTS](/wg-lts)
* [WG Node Lifecycle](/wg-node-lifecycle)
* [WG Policy](/wg-policy)
* [WG Serving](/wg-serving)
* [WG Structured Logging](/wg-structured-logging)
1 change: 1 addition & 0 deletions sig-autoscaling/README.md
@@ -48,6 +48,7 @@ The Chairs of the SIG run operations and processes governing the SIG.
The following [working groups][working-group-definition] are sponsored by sig-autoscaling:
* [WG Batch](/wg-batch)
* [WG Device Management](/wg-device-management)
* [WG Node Lifecycle](/wg-node-lifecycle)
* [WG Serving](/wg-serving)


6 changes: 6 additions & 0 deletions sig-cli/README.md
@@ -60,6 +60,12 @@ subprojects, and resolve cross-subproject technical issues and decisions.
- [@kubernetes/sig-cli-test-failures](https://github.com/orgs/kubernetes/teams/sig-cli-test-failures) - Test Failures and Triage
- Steering Committee Liaison: Paco Xu 徐俊杰 (**[@pacoxu](https://github.com/pacoxu)**)

## Working Groups

The following [working groups][working-group-definition] are sponsored by sig-cli:
* [WG Node Lifecycle](/wg-node-lifecycle)


## Subprojects

The following [subprojects][subproject-definition] are owned by sig-cli:
1 change: 1 addition & 0 deletions sig-cloud-provider/README.md
@@ -58,6 +58,7 @@ subprojects, and resolve cross-subproject technical issues and decisions.
## Working Groups

The following [working groups][working-group-definition] are sponsored by sig-cloud-provider:
* [WG Node Lifecycle](/wg-node-lifecycle)
* [WG Structured Logging](/wg-structured-logging)


1 change: 1 addition & 0 deletions sig-cluster-lifecycle/README.md
@@ -52,6 +52,7 @@ subprojects, and resolve cross-subproject technical issues and decisions.

The following [working groups][working-group-definition] are sponsored by sig-cluster-lifecycle:
* [WG LTS](/wg-lts)
* [WG Node Lifecycle](/wg-node-lifecycle)
* [WG etcd Operator](/wg-etcd-operator)


1 change: 1 addition & 0 deletions sig-list.md
@@ -66,6 +66,7 @@ When the need arises, a [new SIG can be created](sig-wg-lifecycle.md)
|[Device Management](wg-device-management/README.md)|[device-management](https://github.com/kubernetes/kubernetes/labels/wg%2Fdevice-management)|* Architecture<br>* Autoscaling<br>* Network<br>* Node<br>* Scheduling<br>|* [John Belamaric](https://github.com/johnbelamaric), Google<br>* [Kevin Klues](https://github.com/klueska), NVIDIA<br>* [Patrick Ohly](https://github.com/pohly), Intel<br>|* [Slack](https://kubernetes.slack.com/messages/wg-device-management)<br>* [Mailing List](https://groups.google.com/a/kubernetes.io/g/wg-device-management)|* Regular WG Meeting: [Tuesdays at 8:30 PT (Pacific Time) (biweekly)](TBD)<br>
|[etcd Operator](wg-etcd-operator/README.md)|[etcd-operator](https://github.com/kubernetes/kubernetes/labels/wg%2Fetcd-operator)|* Cluster Lifecycle<br>* etcd<br>|* [Benjamin Wang](https://github.com/ahrtr), VMware<br>* [Ciprian Hacman](https://github.com/hakman), Microsoft<br>* [Josh Berkus](https://github.com/jberkus), Red Hat<br>* [James Blair](https://github.com/jmhbnz), Red Hat<br>* [Justin Santa Barbara](https://github.com/justinsb), Google<br>|* [Slack](https://kubernetes.slack.com/messages/wg-etcd-operator)<br>* [Mailing List](https://groups.google.com/a/kubernetes.io/g/wg-etcd-operator)|* Regular WG Meeting: [Tuesdays at 11:00 PT (Pacific Time) (bi-weekly)](https://zoom.us/my/cncfetcdproject)<br>
|[LTS](wg-lts/README.md)|[lts](https://github.com/kubernetes/kubernetes/labels/wg%2Flts)|* Architecture<br>* Cluster Lifecycle<br>* K8s Infra<br>* Release<br>* Security<br>* Testing<br>|* [Jeremy Rickard](https://github.com/jeremyrickard), Microsoft<br>* [Jordan Liggitt](https://github.com/liggitt), Google<br>* [Micah Hausler](https://github.com/micahhausler), Amazon<br>|* [Slack](https://kubernetes.slack.com/messages/wg-lts)<br>* [Mailing List](https://groups.google.com/a/kubernetes.io/g/wg-lts)|* Regular WG Meeting: [Tuesdays at 07:00 PT (Pacific Time) (biweekly)](https://zoom.us/j/92480197536?pwd=dmtSMGJRQmNYYTIyZkFlQ25JRngrdz09)<br>
|[Node Lifecycle](wg-node-lifecycle/README.md)|[node-lifecycle](https://github.com/kubernetes/kubernetes/labels/wg%2Fnode-lifecycle)|* Apps<br>* Architecture<br>* Autoscaling<br>* CLI<br>* Cloud Provider<br>* Cluster Lifecycle<br>* Network<br>* Node<br>* Scheduling<br>* Storage<br>|* [Filip Křepinský](https://github.com/atiratree), Red Hat<br>* [Fabrizio Pandini](https://github.com/fabriziopandini), VMware<br>* [Humble Chirammal](https://github.com/humblec), VMware<br>* [Ryan Hallisey](https://github.com/rthallisey), NVIDIA<br>|* [Slack](https://kubernetes.slack.com/messages/wg-node-lifecycle)<br>* [Mailing List](https://groups.google.com/a/kubernetes.io/g/wg-node-lifecycle)|* WG Node Lifecycle Weekly Meeting: [TBDs at TBD TBD (weekly)]()<br>
|[Policy](wg-policy/README.md)|[policy](https://github.com/kubernetes/kubernetes/labels/wg%2Fpolicy)|* Architecture<br>* Auth<br>* Multicluster<br>* Network<br>* Node<br>* Scheduling<br>* Storage<br>|* [Jim Bugwadia](https://github.com/JimBugwadia), Kyverno/Nirmata<br>* [Poonam Lamba](https://github.com/poonam-lamba), Google<br>* [Andy Suderman](https://github.com/sudermanjr), Fairwinds<br>|* [Slack](https://kubernetes.slack.com/messages/wg-policy)<br>* [Mailing List](https://groups.google.com/forum/#!forum/kubernetes-wg-policy)|* Regular WG Meeting: [Wednesdays at 8:00 PT (Pacific Time) (semimonthly)](https://zoom.us/j/7375677271)<br>
|[Serving](wg-serving/README.md)|[serving](https://github.com/kubernetes/kubernetes/labels/wg%2Fserving)|* Apps<br>* Architecture<br>* Autoscaling<br>* Instrumentation<br>* Network<br>* Node<br>* Scheduling<br>* Storage<br>|* [Eduardo Arango](https://github.com/ArangoGutierrez), NVIDIA<br>* [Jiaxin Shan](https://github.com/Jeffwan), Bytedance<br>* [Sergey Kanzhelev](https://github.com/SergeyKanzhelev), Google<br>* [Yuan Tang](https://github.com/terrytangyuan), Red Hat<br>|* [Slack](https://kubernetes.slack.com/messages/wg-serving)<br>* [Mailing List](https://groups.google.com/a/kubernetes.io/g/wg-serving)|* WG Serving Weekly Meeting ([calendar](https://calendar.google.com/calendar/embed?src=e896b769743f3877edfab2d4c6a14132b2aa53287021e9bbf113cab676da54ba%40group.calendar.google.com)): [Wednesdays at 9:00 PT (Pacific Time) (weekly)](https://zoom.us/j/92615874244?pwd=VGhxZlJjRTNRWTZIS0dQV2MrZUJ5dz09)<br>
|[Structured Logging](wg-structured-logging/README.md)|[structured-logging](https://github.com/kubernetes/kubernetes/labels/wg%2Fstructured-logging)|* API Machinery<br>* Architecture<br>* Cloud Provider<br>* Instrumentation<br>* Network<br>* Node<br>* Scheduling<br>* Storage<br>|* [Mengjiao Liu](https://github.com/mengjiao-liu), Independent<br>* [Patrick Ohly](https://github.com/pohly), Intel<br>|* [Slack](https://kubernetes.slack.com/messages/wg-structured-logging)<br>* [Mailing List](https://groups.google.com/forum/#!forum/kubernetes-wg-structured-logging)|
1 change: 1 addition & 0 deletions sig-network/README.md
@@ -70,6 +70,7 @@ subprojects, and resolve cross-subproject technical issues and decisions.

The following [working groups][working-group-definition] are sponsored by sig-network:
* [WG Device Management](/wg-device-management)
* [WG Node Lifecycle](/wg-node-lifecycle)
* [WG Policy](/wg-policy)
* [WG Serving](/wg-serving)
* [WG Structured Logging](/wg-structured-logging)
1 change: 1 addition & 0 deletions sig-node/README.md
@@ -55,6 +55,7 @@ subprojects, and resolve cross-subproject technical issues and decisions.
The following [working groups][working-group-definition] are sponsored by sig-node:
* [WG Batch](/wg-batch)
* [WG Device Management](/wg-device-management)
* [WG Node Lifecycle](/wg-node-lifecycle)
* [WG Policy](/wg-policy)
* [WG Serving](/wg-serving)
* [WG Structured Logging](/wg-structured-logging)
1 change: 1 addition & 0 deletions sig-scheduling/README.md
@@ -67,6 +67,7 @@ subprojects, and resolve cross-subproject technical issues and decisions.
The following [working groups][working-group-definition] are sponsored by sig-scheduling:
* [WG Batch](/wg-batch)
* [WG Device Management](/wg-device-management)
* [WG Node Lifecycle](/wg-node-lifecycle)
* [WG Policy](/wg-policy)
* [WG Serving](/wg-serving)
* [WG Structured Logging](/wg-structured-logging)
1 change: 1 addition & 0 deletions sig-storage/README.md
@@ -59,6 +59,7 @@ subprojects, and resolve cross-subproject technical issues and decisions.

The following [working groups][working-group-definition] are sponsored by sig-storage:
* [WG Data Protection](/wg-data-protection)
* [WG Node Lifecycle](/wg-node-lifecycle)
* [WG Policy](/wg-policy)
* [WG Serving](/wg-serving)
* [WG Structured Logging](/wg-structured-logging)
52 changes: 52 additions & 0 deletions sigs.yaml
@@ -3697,6 +3697,58 @@ workinggroups:
liaison:
github: saschagrunert
name: Sascha Grunert
- dir: wg-node-lifecycle
name: Node Lifecycle
mission_statement: >
Explore and improve node and pod lifecycle in Kubernetes. This should result in
better node drain/maintenance support and better pod disruption/termination. It
should also improve node and pod autoscaling, application migration and availability,
load balancing, de/scheduling, node shutdown, and cloud provider integrations, and
support other new scenarios and integrations.
charter_link: charter.md
stakeholder_sigs:
Member: You are missing network, which owns load balancing, endpoints, ... and already has serious problems because of known issues like https://docs.google.com/document/d/1t25jgO_-LRHhjRXf4KJ5xY_t8BZYdapv7MDAxVGY6R8/edit?tab=t.0#heading=h.i4lwa7rdng7y

Member Author: +1, added the SIG and the doc link; we need to address the load balancing and endpoints as well.

- Apps
- Architecture
- Autoscaling
- CLI
- Cloud Provider
- Cluster Lifecycle
- Network
- Node
- Scheduling
Contributor: Don't we need to add other infra SIGs like Storage as stakeholders too?

Member Author: +1, added. This list is not final, and we will have conversations with each SIG after KubeCon about whether they want to be part of this WG.

Member Author: I know SIG Storage is impacted as well, I am just not sure how much cooperation is needed.

Contributor: Thanks @atiratree 👍. I would like to help on this effort with anything related to storage. Tagging leads from storage here: @xing-yang @msau42 @jsafrane @saad-ali

Member Author: Nice, thanks for the help @humblec! Added.

- Storage
label: node-lifecycle
leadership:
chairs:
- github: atiratree
name: Filip Křepinský
company: Red Hat
email: [email protected]
- github: fabriziopandini
name: Fabrizio Pandini
company: VMware
email: [email protected]
- github: humblec
name: Humble Chirammal
company: VMware
email: [email protected]
- github: rthallisey
name: Ryan Hallisey
company: NVIDIA
email: [email protected]
meetings:
- description: WG Node Lifecycle Weekly Meeting
day: TBD
time: TBD
tz: TBD
frequency: weekly
contact:
slack: wg-node-lifecycle
mailing_list: https://groups.google.com/a/kubernetes.io/g/wg-node-lifecycle
liaison:
github: TBD
name: TBD
- dir: wg-policy
name: Policy
mission_statement: >
8 changes: 8 additions & 0 deletions wg-node-lifecycle/OWNERS
@@ -0,0 +1,8 @@
# See the OWNERS docs at https://go.k8s.io/owners

reviewers:
- wg-node-lifecycle-leads
approvers:
- wg-node-lifecycle-leads
labels:
- wg/node-lifecycle
45 changes: 45 additions & 0 deletions wg-node-lifecycle/README.md
@@ -0,0 +1,45 @@
<!---
This is an autogenerated file!
Please do not edit this file directly, but instead make changes to the
sigs.yaml file in the project root.
To understand how this file is generated, see https://git.k8s.io/community/generator/README.md
--->
# Node Lifecycle Working Group

Explore and improve node and pod lifecycle in Kubernetes. This should result in better node drain/maintenance support and better pod disruption/termination. It should also improve node and pod autoscaling, application migration and availability, load balancing, de/scheduling, node shutdown, and cloud provider integrations, and support other new scenarios and integrations.

The [charter](charter.md) defines the scope and governance of the Node Lifecycle Working Group.

## Stakeholder SIGs
* [SIG Apps](/sig-apps)
* [SIG Architecture](/sig-architecture)
* [SIG Autoscaling](/sig-autoscaling)
* [SIG CLI](/sig-cli)
* [SIG Cloud Provider](/sig-cloud-provider)
* [SIG Cluster Lifecycle](/sig-cluster-lifecycle)
* [SIG Network](/sig-network)
* [SIG Node](/sig-node)
* [SIG Scheduling](/sig-scheduling)
* [SIG Storage](/sig-storage)

## Meetings
*Joining the [mailing list](https://groups.google.com/a/kubernetes.io/g/wg-node-lifecycle) for the group will typically add invites for the following meetings to your calendar.*
* WG Node Lifecycle Weekly Meeting: [TBDs at TBD TBD]() (weekly). [Convert to your timezone](http://www.thetimezoneconverter.com/?t=TBD&tz=TBD).

## Organizers

* Filip Křepinský (**[@atiratree](https://github.com/atiratree)**), Red Hat
* Fabrizio Pandini (**[@fabriziopandini](https://github.com/fabriziopandini)**), VMware
* Humble Chirammal (**[@humblec](https://github.com/humblec)**), VMware
* Ryan Hallisey (**[@rthallisey](https://github.com/rthallisey)**), NVIDIA

## Contact
- Slack: [#wg-node-lifecycle](https://kubernetes.slack.com/messages/wg-node-lifecycle)
- [Mailing list](https://groups.google.com/a/kubernetes.io/g/wg-node-lifecycle)
- [Open Community Issues/PRs](https://github.com/kubernetes/community/labels/wg%2Fnode-lifecycle)
- Steering Committee Liaison: TBD (**[@TBD](https://github.com/TBD)**)
<!-- BEGIN CUSTOM CONTENT -->

<!-- END CUSTOM CONTENT -->
160 changes: 160 additions & 0 deletions wg-node-lifecycle/charter.md
@@ -0,0 +1,160 @@
# WG Node Lifecycle Charter

This charter adheres to the conventions described in the [Kubernetes Charter README] and uses
the Roles and Organization Management outlined in [wg-governance].

[Kubernetes Charter README]: /committee-steering/governance/README.md

## Scope

The Kubernetes ecosystem currently faces challenges in node maintenance scenarios, with multiple
projects independently addressing similar issues. The goal of this working group is to develop
Contributor: Can we list out references to the projects, if possible?

Member Author: This might be too heavy for the WG declaration. We have a partial list in https://github.com/atiratree/kube-enhancements/blob/improve-node-maintenance/keps/sig-apps/4212-declarative-node-maintenance/README.md#motivation and I expect other projects to be discussed/added in the future. I would prefer to track it separately once the WG is established. Let's also see what others think; if there is demand, I can add it.

Contributor: +1 to maybe listing this out, or at least a link to where we can find more information.

Reply: We have built an internal version; some others are draino and medik8s.

Member Author: OK, let's add the list here as a starting point then.

Member Author: I have added a new Relevant Projects section. Please let me know if you know of others, or want to somehow reference your internal ones.

unified APIs that the entire ecosystem can depend on, reducing the maintenance burden across
projects and addressing scenarios that impede node drain or cause improper pod termination. Our
Contributor: Is the plan to add a unified API that controls the Kubernetes features in this area (like priority classes, PDBs, etc.)? If pod scaling etc. are under consideration, do we also need to consider ResourceQuota and autoscalers like HPA here?

Member Author: Yes, autoscaling (HPA), scheduling and the API objects you have listed are in scope. Currently it is not yet clear which existing APIs will be affected, but we will take them into account.

By the way, I am not sure why the thread about the project references is not shown? Here is the link: https://github.com/kubernetes/community/pull/8396/files#r2016483839

objective is to create easily configurable, out-of-the-box solutions that seamlessly integrate with
Member: So is the goal to provide APIs for solutions to use, or to implement a solution? These two sentences seem to be at odds. Maybe mention that Kubernetes has no plans to block customers from implementing advanced use cases.

Member Author: We want to do both :) I have added another sentence there to explain it better.

existing APIs and behaviors. We will strive to make these solutions minimalistic and extensible to
support advanced use cases across the ecosystem.

To properly solve the node drain, we must first understand the node lifecycle. This includes
provisioning/sunsetting of the nodes, PodDisruptionBudgets, API-initiated eviction and node
Contributor: Should we specifically include topology spread constraints in this list as well?

Member Author: Scheduling is certainly an important part as well. I have added a mention of scheduling constraints to our goals.

shutdown. This then impacts both the node and pod autoscaling, de/scheduling, load balancing, and
the applications running in the cluster. All of these areas have issues and would benefit from a
unified approach.

### In scope
Member: Another goal should include making pods work reliably while terminating. This is important since, with the proliferation of non-live-migratable VMs with accelerators, we see more and more situations where maintenance-caused termination can take hours if not days.

Member Author: Good idea, I have added two more stories. All in all, the In scope section covers this in general, I hope.


- Explore a unified way of draining the nodes and managing node maintenance by introducing new APIs
Member: Can you please add a goal of migrating existing scenarios to the new API, so the group will be tasked with not breaking users when they upgrade?

Member Author: We do have that in scope under:

> Migrate users of the eviction based kubectl-like drain (kubectl, cluster autoscaler, karpenter), and other scenarios to use the new approach.

So far it is pretty generic until we have a clearer vision. Please let me know if you would like to see something more specific.

Reply: Is it worth including something about DRA device taints/drains?

Member Author: Seems relevant to me, as this affects the pod and device/node lifecycle. @pohly, what do you think about including and discussing kubernetes/enhancements#5055 in the WG?

Contributor: Part of the motivation for 5055 is to not have to drain entire nodes when doing maintenance that only affects one device or hardware managed by one driver, unless of course that maintenance ends up requiring a node reboot. So I guess it depends?

Reply: From what I understand about device taints, they are a way to make device health scheduler-aware. This fits into our scope because we need a way to decide which Node should be prioritized for maintenance and a plan to drain that Node; e.g. a Node with all its devices tainted is a great target for Node maintenance. However, I think we should hold onto the DRA device taints feature for when we discuss Node Maintenance and Node Drain designs. I don't think it needs to be called out in the scope, as Node Maintenance and Node Drain should cover it.

Member Author: We would also like to handle the pod lifecycle better in any descheduling scenario (not just Node Drain/Maintenance). One option is to use the EvictionRequest API, which should give more power to the applications that are being disrupted. So it might be interesting to see if we can make the disruption more graceful in 5055.

Reply:

> Part of the motivation for 5055 is to not have to drain entire nodes when doing some maintenance which only affects one device or hardware managed by one driver - unless of course that maintenance ends up with having to reboot the node.

The other end of this to consider, though, is devices that span multiple nodes.

Member Author: I have added it to the WG. I think it would be good to discuss this feature and its impact. Also, it might be better to hold off beta for some time.

and extending the current ones. This includes exploring extensions to, or interactions with, the
Node object.
- Analyze the node lifecycle, the Node API, and possible interactions. We want to explore augmenting
the Node API to expose additional state or status in order to coalesce other core Kubernetes and
community APIs around node lifecycle management.
- Improve the disruption model that is currently implemented by the API-initiated Eviction API and
PDBs. Improve the descheduling, availability and migration capabilities of today's application
workloads. Also explore the interactions with other eviction mechanisms.
- Improve Graceful/Non-Graceful Node Shutdown and consider how this affects the node lifecycle.
Graduate the [Graceful Node Shutdown](https://github.com/kubernetes/enhancements/issues/2000)
feature to GA and resolve the associated node shutdown issues.
- Improve the scheduling and pod/node autoscaling to take into account ongoing node maintenance and
the new disruption model/evictions. This includes balancing of the pods according to scheduling
constraints.
- Consider improving the pod lifecycle of DaemonSets and static pods during node maintenance.
- Explore the cloud provider use cases and how they can hook into the node lifecycle, so that
users can use the same APIs or configurations across the board.
- Migrate users of the eviction-based kubectl-like drain (kubectl, cluster autoscaler, karpenter,
...) and other scenarios to use the new approach, i.e. the unified node drain/maintenance APIs
introduced above (today's eviction-based flow is sketched after this list).
Member: Nit: make it clearer that "new approach" refers to the "unified way of draining the nodes" mentioned in the first bullet point. Either move this sentence up or clarify it here.

- Explore the possible scenarios behind why a node was terminated/drained/killed and how to
@ivelichkovich (Apr 2, 2025): It seems like everyone is solving the problem of node maintenance independently and building private in-house solutions. Improving the drain behavior is one aspect of maintenance (generally the first step after detection). There are additional steps once a node is ready to be acted on that everyone seems to have an in-house solution for (especially for people serving accelerated infra).

An example might be a system that drains the node and then reboots it when a GPU fault is detected. That's just one example; the system should be able to take arbitrary actions based on various signals after waiting for a signal that the node is safe to work on. Maybe some controller like "when you see state X, create arbitrary CR Y", so users can extend the controller for Y to take whatever remediation action they want, such as rebooting, resetting GPU drivers, resetting NICs, etc.

It seems like it would be good to come up with a community solution for how to take these actions after a node is drained and ready to be worked on. Thoughts on including this in the WG?

Member Author: Yeah, we want to include these considerations in the WG. We imply them in our goals, but I have added your suggestion as an additional user story to make it clearer.

track and react to each of them. Consider past discussions and the historical perspective
(e.g. node "tombstones").

### Out of scope

- Implementing cloud-provider-specific logic; the goal is to have a high-level API that providers
can use, hook into, or extend.
Contributor: +1

- Infrastructure provisioning/deprovisioning solutions or physical infrastructure lifecycle
management solutions.

## Stakeholders

- SIG Apps
- SIG Architecture
- SIG Autoscaling
- SIG CLI
- SIG Cloud Provider
- SIG Cluster Lifecycle
- SIG Network
- SIG Node
- SIG Scheduling
- SIG Storage

Stakeholders span from multiple SIGs to a broad set of end users,
public and private cloud providers, Kubernetes distribution providers,
and cloud provider end-users. Here are some user stories:

- As a cluster admin I want to have a simple interface to initiate a node drain/maintenance without
Member: Can we add a goal to explore the scenario of getting a historical perspective on why a node was terminated/drained/killed? This comes up very often, and maybe we can help those scenarios in this WG. Various ideas like Node object "tombstones" were discussed in the past.

Member Author: I am not really sure I fully understand this. I have added a new point to the In scope section that mentions it. Feel free to write a GitHub suggestion.

any required manual interventions. I also want to be able to observe the node drain via the API
and check on its progress. I also want to be able to discover workloads that are blocking the node
drain.
- To support the new features, node maintenance, scheduler, descheduler, pod autoscaling, kubelet
and other actors should use a new eviction API to gracefully remove pods. This should enable new
Member: Nit: change "should" to something like "I want to", given that the KEP hasn't been accepted yet.

migration strategies that prefer to surge (upscale) pods first rather than downscale them. It
should also allow other users/components to monitor pods that are gracefully removed/terminated
and provide better behaviour in terms of de/scheduling, scaling and availability.
- As a cluster admin, I want to be able to perform arbitrary actions after the node drain is
complete, such as resetting GPU drivers, resetting NICs, performing software updates or shutting
down the machine.
- As an end user, I would like more alternatives to blue-green upgrades, which are far too
expensive with special hardware accelerators. I would like to choose a strategy for how to
coordinate the node drain and the upgrade to achieve better cost-effectiveness.
- As a cloud provider, I need to perform regular maintenance on the hardware in my fleet. Enhancing
Kubernetes to help CSPs safely remove hardware will reduce operational costs.
- The cost of accelerator maintenance in today's world can be massive, and since hardware
accelerators tend to need more care, having software support to coordinate maintenance will
reduce operational costs.
- As a cluster admin, I would like to use a mixture of on-demand and temporary spot instances in my
clusters to reduce cloud expenditure. Having more reliable lifecycle and drain mechanisms for
nodes will improve cluster stability in scenarios where instances may be terminated by the cloud
provider due to cost-related thresholds.
- As a user, I want to prevent any disruption to my pet or expensive workloads (VMs, ML with
accelerators) and either prevent termination altogether or have a reliable migration path.
Features like `terminationGracePeriodSeconds` are not sufficient as the termination/migration can
take hours if not days.
Comment on lines +98 to +101:

Contributor: Can users not already do this with a PDB? Are we suggesting that node maintenance would override blocking PDBs if they block for some extended period of time?

I'm aware of k8s-shredder, an Adobe project that puts nodes into maintenance and then gives them a week to clear before removing them. I'm wondering if this case is to say, even in that scenario, don't kill my workload?

Reply: I think PDBs have a different use case, so we may need to reword. The PodDisruptionBudget protects the availability of the application. What we're saying is that there's no API that protects both the availability of the infrastructure and the availability of the application. E.g. an accelerator is degraded on a Node, so I don't want to run future workloads there, but it's OK for the current one to finish. It's in the best interest of the application and the infrastructure provider that an admin remediates the accelerator, so admin and user mutually agree on when that can occur.

Reply: Yeah, I agree there's a problem in that the eviction API / drain doesn't guarantee it will finish within a reasonable time, especially if the node is having issues (things get stuck terminating, etc.). But this at least we can do today, right?

> an accelerator is degraded on a Node, so I don't want to run future workloads there but it's ok for the current one to finish

You can just taint the node or the devices with NoSchedule.

Reply:

> You can just taint the node or the devices with NoSchedule.

Yes, that is a solution assuming that:

1. workloads will eventually drain if we wait long enough
2. all termination steps will be successful

However, 1) can theoretically always work but it is the slowest possible solution, and 2) is not guaranteed to work.

Member Author:

> I'm aware of k8s-shredder, an Adobe project that puts nodes into maintenance and then gives them a week to clear before removing them. I'm wondering if this case is to say, even in that scenario, don't kill my workload?

This is a good example. We want applications/admins to be aware of upcoming maintenance, and also pods in most descheduling scenarios, so that they are given the opportunity to migrate or clean up before termination, which is hard to do with PDBs. The goal is not to override PDBs (that is also hard to do without breaking someone); the goal is to have a smarter layer above the PDBs.

Contributor: If the eviction API has been used and the terminationGracePeriod has passed, isn't a force kill executed? Is there a scenario where the pod gets stuck for days/weeks in this case?

Member Author: Just the eviction API alone can cause pods to get stuck. All in all, I would prefer we do not dive deep into the topic and focus mostly on the scope in this PR.

Contributor @humblec (Apr 11, 2025): Hmm, out of curiosity, can you please share any reference issues where the eviction API itself gets stuck? While I understand the scope of this WG to an extent, I feel the need for a new WG itself arose from the lack of coordination around, and the complexity of, the primitives/features we have in k/k around scheduling, preemption and eviction. IMO, if the new design still does not address half of the issues in this area, it won't serve the purpose.
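
For readers following this thread, a minimal PDB looks like the sketch below (names are illustrative). It only bounds how many pods voluntary evictions may disrupt; it says nothing about upcoming maintenance, termination ordering, or post-drain actions, which is the gap the thread circles around:

```yaml
# Illustrative PodDisruptionBudget: protects application availability during
# voluntary disruptions (e.g. eviction-based drain), nothing more.
apiVersion: policy/v1
kind: PodDisruptionBudget
metadata:
  name: my-app-pdb        # hypothetical name
  namespace: default
spec:
  minAvailable: 2         # at least 2 matching pods must stay available
  selector:
    matchLabels:
      app: my-app         # hypothetical label
```

Similarly, a NoSchedule taint on the node or its devices only blocks new scheduling; neither mechanism gives the application or the admin a way to negotiate when and how the running pods are removed.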

- As a user, I want my application to finish all network and storage operations before terminating a
pod. This includes closing pod connections, removing pods from endpoints, flushing cached writes
to the underlying storage, and completing storage cleanup routines.
Comment on lines +102 to +104:

Contributor: In some scenarios, evicting/removing certain pods would prevent some of these operations. Consider if we were also evicting DaemonSet pods as an option (is that a goal? I wasn't sure); then we might need some ordering to make sure the CSI driver or CNI driver isn't removed until certain other cleanup has happened.

Reply: The way we handle this is that our internal drain API has a label selector for things to ignore, and by default it just ignores DaemonSets instead of worrying about the ordering here. DaemonSets aren't really supported under drain anyway. That should handle most cases for system-level things like CSI/CNI/etc.

Member Author @atiratree (Apr 11, 2025): Yes, DaemonSet pods should be considered as part of the maintenance scenarios, especially in cases when the node is going to shut down. I have added it to the goals to make that clear.

We also had discussions with SIG Node about static pod termination in the past, and they were generally not against it. But we lack use cases for it so far.

Contributor: IIUC, this proposal aims to define an order for static pod termination, DaemonSet pods, or other system-node-critical priority class pod termination in the node drain scenario. Is that a correct assumption?

Member Author: I am not saying that we will solve static pod termination, just that we will look into it :) And yes, I think there should definitely be an ordering for both DaemonSet and static pods.

Contributor: I think another user story is around the use of ephemeral low-cost instances on cloud providers, e.g.:

> As a cluster admin, I would like to use a mixture of on-demand and temporary spot instances in my clusters to reduce cloud expenditure. Having more reliable lifecycle and drain mechanisms for nodes will improve cluster stability in scenarios where instances may be terminated by the cloud provider due to cost-related thresholds.

Member Author: Agree, this story is also important to have. Thanks!
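
As background for the termination-related stories above: the main lever applications have today is a preStop hook bounded by `terminationGracePeriodSeconds`, roughly as below (the image and commands are placeholders). As the charter notes, this caps graceful cleanup at a fixed timeout and gives the workload no say in when the disruption starts:

```yaml
# Illustrative pod spec showing today's graceful-termination knobs.
apiVersion: v1
kind: Pod
metadata:
  name: cache-writer                      # hypothetical workload
spec:
  terminationGracePeriodSeconds: 120      # hard upper bound on graceful shutdown
  containers:
  - name: app
    image: registry.example.com/cache-writer:1.0   # placeholder image
    lifecycle:
      preStop:
        exec:
          # hypothetical cleanup: flush cached writes, close connections
          command: ["/bin/sh", "-c", "flush-cache && drain-connections"]
```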

## Deliverables

The WG will coordinate requirement gathering and design, eventually leading to
KEP(s) and code associated with the ideas.

Areas we expect to explore:
Contributor: Node lifecycle around accelerators always comes up. Is there consideration in this group to explore those areas?

Member Author: Yes, although I am not sure yet whether there will be deliverables specifically targeting accelerators. We mention them in the user stories and will consider them when creating the APIs.


- An API to express node drain/maintenance.
Member: One problem I feel we will need to address is how to transition existing drain logic in various components to this new API. Having a new API without migrating the old ways to it creates "yet another" way to do it and requires end users to understand more draining logic.

Member Author: There is still no upgrade path, since we have not even agreed on the solution. We don't want to break people using the current approaches; the main incentive to switch should be painless upgrades/maintenance and other benefits. I expect that the main components/users that use the kubectl(-like) drain should not have a hard time adopting the new solution(s). However, I am not sure what it will look like for GNS (Graceful Node Shutdown), for example.

Member: Why is the graceful termination mentioned above not listed here? Mostly curious.

Member Author: Ah, sorry, I have fixed that. I am also open to other KEPs/documents that people would like to include here.

Currently tracked in https://github.com/kubernetes/enhancements/issues/4212.
- An API to solve the problems with the API-initiated Eviction API and PDBs.
Currently tracked in https://github.com/kubernetes/enhancements/issues/4563.
- An API/mechanism to gracefully terminate pods during a node shutdown.
Graceful node shutdown feature tracked in https://github.com/kubernetes/enhancements/issues/2000.
- An API to deschedule pods that use DRA devices.
DRA: device taints and tolerations feature tracked in https://github.com/kubernetes/enhancements/issues/5055.
- An API to remove pods from endpoints before they terminate.
Currently tracked in https://docs.google.com/document/d/1t25jgO_-LRHhjRXf4KJ5xY_t8BZYdapv7MDAxVGY6R8/edit?tab=t.0#heading=h.i4lwa7rdng7y.
- Introduce enhancements across multiple Kubernetes SIGs to add support for the new APIs and solve a
wide range of issues.

We expect to provide reference implementations of the new APIs including but not limited to
controllers, API validation, integration with existing core components and extension points for the
ecosystem. This should be accompanied by E2E / Conformance tests.
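
For reference, the Graceful Node Shutdown behavior listed above is configured today through the kubelet configuration; a minimal sketch with illustrative values:

```yaml
# Illustrative KubeletConfiguration excerpt enabling Graceful Node Shutdown.
apiVersion: kubelet.config.k8s.io/v1beta1
kind: KubeletConfiguration
shutdownGracePeriod: 60s              # total time the node delays shutdown for pod termination
shutdownGracePeriodCriticalPods: 15s  # portion of that time reserved for critical pods
```

Extending and graduating this mechanism, and tying it into the broader node maintenance flow, is part of the node-shutdown deliverable tracked in kubernetes/enhancements#2000.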

## Relevant Projects

This is a list of known projects that solve similar problems in the ecosystem or would benefit from
the efforts of this WG:

- https://github.com/aws/aws-node-termination-handler
- https://github.com/foriequal0/pod-graceful-drain
- https://github.com/kubereboot/kured
- https://github.com/kubernetes/autoscaler/tree/master/cluster-autoscaler
- https://github.com/kubernetes-sigs/karpenter
- https://github.com/kubevirt/kubevirt
- https://github.com/medik8s/node-maintenance-operator
- https://github.com/Mellanox/maintenance-operator
- https://github.com/openshift/machine-config-operator
- https://github.com/planetlabs/draino
- https://github.com/strimzi/drain-cleaner

There are also internal custom solutions that companies use.

## Roles and Organization Management

This WG adheres to the Roles and Organization Management outlined in [wg-governance]
and opts-in to updates and modifications to [wg-governance].

[wg-governance]: /committee-steering/governance/wg-governance.md

## Timelines and Disbanding

The working group will disband when the KEPs we create are completed. We will
Member (suggested change): Replace

> The working group will disband when the KEPs we create are completed. We will

with

> The working group will disband once the core APIs defined in the KEPs have reached a stable state (GA) and ongoing maintenance ownership is established within the relevant SIGs. We will

review whether the working group should disband if appropriate SIG ownership
can't be reached.