Commit 15077fb

fixup! Update KEP 4381 with standard device attribute definitions.
1 parent 842945b commit 15077fb

1 file changed (+26 -15 lines changed)
  • keps/sig-node/4381-dra-structured-parameters

keps/sig-node/4381-dra-structured-parameters/README.md

@@ -502,12 +502,17 @@ In production, a similar PodTemplateSpec in a Deployment will be used.
 #### Co-locating devices based on hardware topology
 
 As a user, I want to easily request multiple devices (like a CPU and a NIC) that
-are physically related in the system's hardware topology,
+are physically connected to the same CPU in the system's hardware. This improves
+performance by avoiding costly inter-CPU interconnects for data transfer in
+multi-socket systems.
+
+For instance, in a multi-CPU (also known as multi-socket) system, I'd prefer
+that a requested NIC is attached to the CPU being allocated to improve
+performance by avoiding costly inter-CPU interconnects.
 
 I'll define a `ResourceClaim` for my workload, specifying constraints with
 `MatchAttribute` to ensure the devices share the same underlying hardware
-characteristics. For instance, if I need a CPU and a NIC that minimize
-communication latency, I'd want them on the same NUMA node:
+characteristics:
 
 ```
 apiVersion: resource.k8s.io/v1beta1
@@ -523,13 +528,13 @@ spec:
       deviceClassName: nic.vendor2.com # This is a hypothetical NIC device class
     constraints:
     - requests: ["cpu-request", "nic-request"]
-      matchAttribute: k8s.io/numaNode
+      matchAttribute: k8s.io/cpuSocketNumber
 ```
 
 I'll use one of the standard attributes provided by Kubernetes, choosing between
-`k8s.io/numaNode` or `k8s.io/pcieRoot` depending on my specific alignment needs.
-I know that even if my CPU and NIC are managed by different DRA drivers, they
-are likely publishing the information I can use for alignment through these
+`k8s.io/cpuSocketNumber` or `k8s.io/pcieRoot` depending on my specific alignment
+needs. I know that even if my CPU and NIC are managed by different DRA drivers,
+they are likely publishing the information I can use for alignment through these
 standard attributes.
 
 ### Publishing node resources
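
The `matchAttribute` constraint in the hunk above asks that every device allocated for the listed requests publishes the named attribute with the same value. As a rough illustration of that rule only (not the scheduler's actual code), here is a minimal Go sketch. The `Device` type and the `satisfiesMatchAttribute` helper are hypothetical names introduced for this example, and attribute values are simplified to strings, whereas the real API uses typed values.

```go
package main

import "fmt"

// Device is a hypothetical, simplified stand-in for a device published in a
// ResourceSlice: a name plus qualified attribute names mapped to values.
// Values are plain strings here purely to keep the sketch short.
type Device struct {
	Name       string
	Attributes map[string]string
}

// satisfiesMatchAttribute reports whether every device carries the attribute
// and all devices agree on its value. A device that does not publish the
// attribute makes the constraint fail.
func satisfiesMatchAttribute(attribute string, devices []Device) bool {
	var want string
	for i, d := range devices {
		got, ok := d.Attributes[attribute]
		if !ok {
			return false
		}
		if i == 0 {
			want = got
			continue
		}
		if got != want {
			return false
		}
	}
	return true
}

func main() {
	const attr = "k8s.io/cpuSocketNumber"
	cpu := Device{Name: "cpu-0", Attributes: map[string]string{attr: "0"}}
	nicNear := Device{Name: "nic-a", Attributes: map[string]string{attr: "0"}}
	nicFar := Device{Name: "nic-b", Attributes: map[string]string{attr: "1"}}

	fmt.Println(satisfiesMatchAttribute(attr, []Device{cpu, nicNear})) // true
	fmt.Println(satisfiesMatchAttribute(attr, []Device{cpu, nicFar}))  // false
}
```

Because the comparison is on the attribute name and value only, the CPU and the NIC can come from different DRA drivers, as the user story notes.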
@@ -616,9 +621,10 @@ We are reserving the `k8s.io/` domain (and subdomains) prefix for attributes and
 capacities for standardization by the Kubernetes project. This reservation
 allows us to define common attributes that can describe hardware characteristics
 across resources from different vendors. Currently, we are defining two such
-standard attributes: `k8s.sio/numaNode` and `k8s.io/pcieRoot`. Details on their
-meaning and how they should be exposed by DRA drivers are available in the [API
-design section under ResourceSlice's](#resourceslice) QualifiedName definition.
+standard attributes: `k8s.io/cpuSocketNumber` and `k8s.io/pcieRoot`. Details on
+their meaning and how they should be exposed by DRA drivers are available in the
+[API design section under ResourceSlice's](#resourceslice) QualifiedName
+definition.
 
 **Note:** If a driver needs to remove a device or change its attributes,
 then there is a risk that a claim gets allocated based on the old
@@ -1270,11 +1276,16 @@ const ResourceSliceMaxAttributesAndCapacitiesPerDevice = 32
 //
 // Currently, the two standard attributes are:
 //
-// 1. `k8s.io/numaNode`: An integer value referring to a Non-Uniform Memory
-// Access (NUMA) node within the system's NUMA topology. This attribute can
-// be used to describe which NUMA node a device is physically associated
-// with. DRA drivers MAY discover this value for PCI devices via
-// sysfs, for example, by reading `/sys/bus/pci/devices/<PCI_ADDRESS>/numa_node`.
+// 1. `k8s.io/cpuSocketNumber`: An integer value referring to the logical
+// identifier assigned by the operating system for the physical CPU socket
+// that a device is associated with. For a PCIe device, DRA drivers can
+// determine this value by first reading its associated NUMA node from
+// `/sys/bus/pci/devices/<PCI_ADDRESS>/numa_node`. Then, using that NUMA
+// node, the value for `cpuSocketNumber` can be found by reading the
+// `physical_package_id` from any CPU within that node (e.g.,
+// `/sys/devices/system/node/node<NUMA_NODE>/cpuX/topology/physical_package_id`).
+// Similarly, for a logical CPU X, its `cpuSocketNumber` can be identified
+// from `/sys/devices/system/cpu/cpuX/topology/physical_package_id`.
 //
 // 2. `k8s.io/pcieRoot`: A string value in the format `pci<domain>:<bus>`,
 // referring to a PCIe (Peripheral Component Interconnect Express) Root
