Commit 76a0206

low latency tuning refactor
Changes for Martin
Martin's 3rd review comments
Martin's 4th review
Final comments from Martin
More updates for Martin
Tweaks for Martin
Martin's comments 22-Apr - workload pods
Apr 30 review comments
1 parent 1c440c5 commit 76a0206

67 files changed

+896
-985
lines changed

_topic_maps/_topic_map.yml

+12-5
Original file line numberDiff line numberDiff line change
@@ -2949,14 +2949,21 @@ Topics:
29492949
File: what-huge-pages-do-and-how-they-are-consumed-by-apps
29502950
Distros: openshift-origin,openshift-enterprise
29512951
- Name: Low latency tuning
2952-
File: cnf-low-latency-tuning
2952+
Dir: low_latency_tuning
29532953
Distros: openshift-origin,openshift-enterprise
2954-
- Name: Performing latency tests for platform verification
2955-
File: cnf-performing-platform-verification-latency-tests
2954+
Topics:
2955+
- Name: Understanding low latency
2956+
File: cnf-understanding-low-latency
2957+
- Name: Tuning nodes for low latency with the performance profile
2958+
File: cnf-tuning-low-latency-nodes-with-perf-profile
2959+
- Name: Provisioning real-time and low latency workloads
2960+
File: cnf-provisioning-low-latency-workloads
2961+
- Name: Debugging low latency tuning
2962+
File: cnf-debugging-low-latency-tuning-status
2963+
- Name: Performing latency tests for platform verification
2964+
File: cnf-performing-platform-verification-latency-tests
29562965
- Name: Improving cluster stability in high latency environments using worker latency profiles
29572966
File: scaling-worker-latency-profiles
2958-
- Name: Creating a performance profile
2959-
File: cnf-create-performance-profiles
29602967
Distros: openshift-origin,openshift-enterprise
29612968
- Name: Workload partitioning
29622969
File: enabling-workload-partitioning

edge_computing/ztp-advanced-policy-config.adoc

+2-4
Original file line numberDiff line numberDiff line change
@@ -33,9 +33,7 @@ include::modules/ztp-using-pgt-to-configure-power-states.adoc[leveloffset=+1]
3333
[role="_additional-resources"]
3434
.Additional resources
3535

36-
* xref:../scalability_and_performance/cnf-low-latency-tuning.adoc#cnf-understanding-workload-hints_cnf-master[Understanding workload hints]
37-
38-
* xref:../scalability_and_performance/cnf-low-latency-tuning.adoc#configuring-workload-hints_cnf-master[Configuring workload hints manually]
36+
* xref:../scalability_and_performance/low_latency_tuning/cnf-tuning-low-latency-nodes-with-perf-profile.adoc#configuring-workload-hints_cnf-low-latency-perf-profile[Configuring node power consumption and realtime processing with workload hints]
3937
4038
include::modules/ztp-using-pgt-to-configure-performance-mode.adoc[leveloffset=+2]
4139

@@ -46,7 +44,7 @@ include::modules/ztp-using-pgt-to-configure-power-saving-mode.adoc[leveloffset=+
4644
[role="_additional-resources"]
4745
.Additional resources
4846

49-
* xref:../scalability_and_performance/cnf-low-latency-tuning.adoc#node-tuning-operator-pod-power-saving-config_cnf-master[Enabling critical workloads for power saving configurations]
47+
* xref:../scalability_and_performance/low_latency_tuning/cnf-tuning-low-latency-nodes-with-perf-profile.adoc#cnf-configuring-power-saving-for-nodes_cnf-low-latency-perf-profile[Configuring power saving for nodes that run colocated high and low priority workloads]
5048
5149
* xref:../edge_computing/ztp-reference-cluster-configuration-for-vdu.adoc#ztp-du-configuring-host-firmware-requirements_sno-configure-for-vdu[Configuring host firmware for low latency and high performance]
5250

installing/installing-preparing.adoc

+1-1
Original file line numberDiff line numberDiff line change
@@ -114,7 +114,7 @@ For a production cluster, you must configure the following integrations:
114114
[id="installing-preparing-cluster-for-workloads"]
115115
== Preparing your cluster for workloads
116116

117-
Depending on your workload needs, you might need to take extra steps before you begin deploying applications. For example, after you prepare infrastructure to support your application xref:../cicd/builds/build-strategies.adoc#build-strategies[build strategy], you might need to make provisions for xref:../scalability_and_performance/cnf-low-latency-tuning.adoc#cnf-low-latency-tuning[low-latency] workloads or to xref:../nodes/pods/nodes-pods-secrets.adoc#nodes-pods-secrets[protect sensitive workloads]. You can also configure xref:../observability/monitoring/enabling-monitoring-for-user-defined-projects.adoc#enabling-monitoring-for-user-defined-projects[monitoring] for application workloads.
117+
Depending on your workload needs, you might need to take extra steps before you begin deploying applications. For example, after you prepare infrastructure to support your application xref:../cicd/builds/build-strategies.adoc#build-strategies[build strategy], you might need to make provisions for xref:../scalability_and_performance/low_latency_tuning/cnf-tuning-low-latency-nodes-with-perf-profile.adoc#cnf-low-latency-perf-profile[low-latency] workloads or to xref:../nodes/pods/nodes-pods-secrets.adoc#nodes-pods-secrets[protect sensitive workloads]. You can also configure xref:../observability/monitoring/enabling-monitoring-for-user-defined-projects.adoc#enabling-monitoring-for-user-defined-projects[monitoring] for application workloads.
118118
If you plan to run xref:../windows_containers/enabling-windows-container-workloads.adoc#enabling-windows-container-workloads[Windows workloads], you must enable xref:../networking/ovn_kubernetes_network_provider/configuring-hybrid-networking.adoc#configuring-hybrid-networking[hybrid networking with OVN-Kubernetes] during the installation process; hybrid networking cannot be enabled after your cluster is installed.
119119

120120
[id="supported-installation-methods-for-different-platforms"]

installing/installing_openstack/installing-openstack-nfv-preparing.adoc

+2-1
Original file line numberDiff line numberDiff line change
@@ -42,4 +42,5 @@ After you perform preinstallation tasks, install your cluster by following the m
4242
* Consult the following references after you deploy your cluster to improve its performance:
4343
** xref:../../networking/hardware_networks/using-dpdk-and-rdma.adoc#nw-openstack-ovs-dpdk-testpmd-pod_using-dpdk-and-rdma[A test pod template for clusters that use OVS-DPDK on OpenStack].
4444
** xref:../../networking/hardware_networks/add-pod.adoc#nw-openstack-sr-iov-testpmd-pod_add-pod[A test pod template for clusters that use SR-IOV on OpenStack].
45-
** xref:../../scalability_and_performance/cnf-create-performance-profiles.adoc#installation-openstack-ovs-dpdk-performance-profile_cnf-create-performance-profiles[A performance profile template for clusters that use OVS-DPDK on OpenStack].
45+
** xref:../../scalability_and_performance/low_latency_tuning/cnf-tuning-low-latency-nodes-with-perf-profile.adoc#installation-openstack-ovs-dpdk-performance-profile_cnf-low-latency-perf-profile[A performance profile template for clusters that use OVS-DPDK on OpenStack]
46+
.

modules/cnf-about-irq-affinity-setting.adoc

+4-3
Original file line numberDiff line numberDiff line change
@@ -1,10 +1,11 @@
11
// Module included in the following assemblies:
22
//
3-
// scalability_and_performance/cnf-low-latency-tuning.adoc
3+
// * scalability_and_performance/cnf-low-latency-tuning.adoc
4+
// * scalability_and_performance/low_latency_tuning/cnf-tuning-low-latency-nodes-with-perf-profile.adoc
45

56
:_mod-docs-content-type: CONCEPT
67
[id="about_irq_affinity_setting_{context}"]
7-
= About support of IRQ affinity setting
8+
= Finding the effective IRQ affinity setting for a node
89

910
Some IRQ controllers lack support for IRQ affinity setting and will always expose all online CPUs as the IRQ mask. These IRQ controllers effectively run on CPU 0.
1011

@@ -60,4 +61,4 @@ $ find /proc/irq -name effective_affinity -printf "%p: " -exec cat {} \;
6061
/proc/irq/34/effective_affinity: 2
6162
----
6263

63-
Some drivers use `managed_irqs`, whose affinity is managed internally by the kernel and userspace cannot change the affinity. In some cases, these IRQs might be assigned to isolated CPUs. For more information about `managed_irqs`, see link:https://access.redhat.com/solutions/4819541[Affinity of managed interrupts cannot be changed even if they target isolated CPU].
64+
Some drivers use `managed_irqs`, whose affinity is managed internally by the kernel and userspace cannot change the affinity. In some cases, these IRQs might be assigned to isolated CPUs. For more information about `managed_irqs`, see link:https://access.redhat.com/solutions/4819541[Affinity of managed interrupts cannot be changed even if they target isolated CPU].
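
The `effective_affinity` values in this module are hexadecimal CPU bitmasks, which are easy to misread as CPU numbers. The following is a minimal sketch, not part of the commit, of how such a mask can be decoded into CPU IDs. The helper name `mask_to_cpus` is hypothetical, and the comma handling assumes the kernel's convention of splitting large masks into 32-bit groups.

```python
def mask_to_cpus(mask: str) -> list[int]:
    """Return the CPU IDs whose bits are set in a hexadecimal affinity mask."""
    # Assumption: large masks use comma-separated 32-bit groups, for example
    # "00000000,00000001"; removing the commas yields one hexadecimal number.
    value = int(mask.strip().replace(",", ""), 16)
    cpus = []
    cpu = 0
    while value:
        if value & 1:  # bit N set means CPU N is in the mask
            cpus.append(cpu)
        value >>= 1
        cpu += 1
    return cpus


# The value shown above for /proc/irq/34/effective_affinity
print(mask_to_cpus("2"))   # prints [1]
# A mask of 33 (binary 110011) leaves CPUs 2 and 3 out
print(mask_to_cpus("33"))  # prints [0, 1, 4, 5]
```

Read this way, a mask of `2` means that the interrupt is effectively served by CPU 1, not CPU 2.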

modules/cnf-about-the-profile-creator-tool.adoc

+2-2
Original file line numberDiff line numberDiff line change
@@ -1,6 +1,6 @@
11
// Module included in the following assemblies:
2-
// Epic CNF-792 (4.8)
3-
// * scalability_and_performance/cnf-create-performance-profiles.adoc
2+
//
3+
// * scalability_and_performance/low_latency_tuning/cnf-tuning-low-latency-nodes-with-perf-profile.adoc
44

55
:_mod-docs-content-type: CONCEPT
66
[id="cnf-about-the-profile-creator-tool_{context}"]

modules/cnf-about_hyperthreading_for_low_latency_and_real_time_applications.adoc

+2-1
Original file line numberDiff line numberDiff line change
@@ -1,6 +1,7 @@
11
// Module included in the following assemblies:
22
//
3-
// scalability_and_performance/cnf-low-latency-tuning.adoc
3+
// * scalability_and_performance/cnf-low-latency-tuning.adoc
4+
// * scalability_and_performance/low_latency_tuning/cnf-understanding-low-latency.adoc
45

56
:_mod-docs-content-type: CONCEPT
67
[id="about_hyperthreading_for_low_latency_and_real_time_applications_{context}"]

modules/cnf-adjusting-nic-queues-with-the-performance-profile.adoc

+3-3
Original file line numberDiff line numberDiff line change
@@ -1,6 +1,6 @@
11
// Module included in the following assemblies:
2-
//CNF-1483 (4.8)
3-
// * scalability_and_performance/low-latency-tuning.adoc
2+
//
3+
// * scalability_and_performance/low_latency_tuning/cnf-tuning-low-latency-nodes-with-perf-profile.adoc
44

55
:_mod-docs-content-type: PROCEDURE
66
[id="adjusting-nic-queues-with-the-performance-profile_{context}"]
@@ -165,4 +165,4 @@ spec:
165165
[source,terminal]
166166
----
167167
$ oc apply -f <your_profile_name>.yaml
168-
----
168+
----

modules/cnf-allocating-multiple-huge-page-sizes.adoc

+3-3
Original file line numberDiff line numberDiff line change
@@ -1,7 +1,7 @@
1-
// CNF-538 Promote Multiple Huge Pages Sizes for Pods and Containers to beta
21
// Module included in the following assemblies:
32
//
4-
// *scalability_and_performance/cnf-low-latency-tuning.adoc
3+
// * scalability_and_performance/cnf-low-latency-tuning.adoc
4+
// * scalability_and_performance/low_latency_tuning/cnf-tuning-low-latency-nodes-with-perf-profile.adoc
55

66
[id="cnf-allocating-multiple-huge-page-sizes_{context}"]
77
= Allocating multiple huge page sizes
@@ -22,4 +22,4 @@ spec:
2222
- count: 4
2323
node: 1
2424
size: 1G
25-
----
25+
----

modules/cnf-collecting-low-latency-tuning-debugging-data-for-red-hat-support.adoc

+2-2
Original file line numberDiff line numberDiff line change
@@ -1,7 +1,7 @@
1-
// CNF-643 Support and debugging tools for CNF
21
// Module included in the following assemblies:
32
//
4-
// *scalability_and_performance/cnf-low-latency-tuning.adoc
3+
// * scalability_and_performance/cnf-low-latency-tuning.adoc
4+
// * scalability_and_performance/low_latency_tuning/cnf-debugging-low-latency-tuning-status.adoc
55

66
:_mod-docs-content-type: PROCEDURE
77
[id="cnf-collecting-low-latency-tuning-debugging-data-for-red-hat-support_{context}"]
Original file line numberDiff line numberDiff line change
@@ -1,10 +1,10 @@
1-
// Module included in the following assemblies:
1+
// Module included in the following assemblies:
22
//
3-
// scalability_and_performance/cnf-low-latency-tuning.adoc
3+
// * scalability_and_performance/low_latency_tuning/cnf-tuning-low-latency-nodes-with-perf-profile.adoc
44

55
:_mod-docs-content-type: PROCEDURE
66
[id="configuring_for_irq_dynamic_load_balancing_{context}"]
7-
= Configuring a node for IRQ dynamic load balancing
7+
= Configuring node interrupt affinity
88

99
Configure a cluster node for IRQ dynamic load balancing to control which cores can receive device interrupt requests (IRQ).
1010

@@ -34,154 +34,8 @@ spec:
3434
+
3535
[NOTE]
3636
====
37-
When you configure reserved and isolated CPUs, the infra containers in pods use the reserved CPUs and the application containers use the isolated CPUs.
37+
When you configure reserved and isolated CPUs, operating system processes, kernel processes and systemd services run on reserved CPUs.
38+
Infrastructure pods run on any CPU except where the low latency workload is running.
39+
Low latency workload pods run on exclusive CPUs from the isolated pool.
40+
For more information, see "Restricting CPUs for infra and application containers".
3841
====
39-
40-
. Create the pod that uses exclusive CPUs, and set `irq-load-balancing.crio.io` and `cpu-quota.crio.io` annotations to `disable`. For example:
41-
+
42-
[source,yaml,subs="attributes+"]
43-
----
44-
apiVersion: v1
45-
kind: Pod
46-
metadata:
47-
name: dynamic-irq-pod
48-
annotations:
49-
irq-load-balancing.crio.io: "disable"
50-
cpu-quota.crio.io: "disable"
51-
spec:
52-
securityContext:
53-
runAsNonRoot: true
54-
seccompProfile:
55-
type: RuntimeDefault
56-
containers:
57-
- name: dynamic-irq-pod
58-
image: "registry.redhat.io/openshift4/cnf-tests-rhel8:v{product-version}"
59-
command: ["sleep", "10h"]
60-
resources:
61-
requests:
62-
cpu: 2
63-
memory: "200M"
64-
limits:
65-
cpu: 2
66-
memory: "200M"
67-
securityContext:
68-
allowPrivilegeEscalation: false
69-
capabilities:
70-
drop: [ALL]
71-
nodeSelector:
72-
node-role.kubernetes.io/worker-cnf: ""
73-
runtimeClassName: performance-dynamic-irq-profile
74-
# ...
75-
----
76-
77-
. Enter the pod `runtimeClassName` in the form performance-<profile_name>, where <profile_name> is the `name` from the `PerformanceProfile` YAML, in this example, `performance-dynamic-irq-profile`.
78-
. Set the node selector to target a cnf-worker.
79-
. Ensure the pod is running correctly. Status should be `running`, and the correct cnf-worker node should be set:
80-
+
81-
[source,terminal]
82-
----
83-
$ oc get pod -o wide
84-
----
85-
+
86-
.Expected output
87-
+
88-
[source,terminal]
89-
----
90-
NAME READY STATUS RESTARTS AGE IP NODE NOMINATED NODE READINESS GATES
91-
dynamic-irq-pod 1/1 Running 0 5h33m <ip-address> <node-name> <none> <none>
92-
----
93-
. Get the CPUs that the pod configured for IRQ dynamic load balancing runs on:
94-
+
95-
[source,terminal]
96-
----
97-
$ oc exec -it dynamic-irq-pod -- /bin/bash -c "grep Cpus_allowed_list /proc/self/status | awk '{print $2}'"
98-
----
99-
+
100-
.Expected output
101-
+
102-
[source,terminal]
103-
----
104-
Cpus_allowed_list: 2-3
105-
----
106-
. Ensure the node configuration is applied correctly. Log in to the node to verify the configuration.
107-
+
108-
[source,terminal]
109-
----
110-
$ oc debug node/<node-name>
111-
----
112-
+
113-
.Expected output
114-
+
115-
[source,terminal]
116-
----
117-
Starting pod/<node-name>-debug ...
118-
To use host binaries, run `chroot /host`
119-
120-
Pod IP: <ip-address>
121-
If you do not see a command prompt, try pressing enter.
122-
123-
sh-4.4#
124-
----
125-
126-
. Verify that you can use the node file system:
127-
+
128-
[source,terminal]
129-
----
130-
sh-4.4# chroot /host
131-
----
132-
+
133-
.Expected output
134-
+
135-
[source,terminal]
136-
----
137-
sh-4.4#
138-
----
139-
140-
. Ensure the default system CPU affinity mask does not include the `dynamic-irq-pod` CPUs, for example, CPUs 2 and 3.
141-
+
142-
[source,terminal]
143-
----
144-
$ cat /proc/irq/default_smp_affinity
145-
----
146-
+
147-
.Example output
148-
+
149-
[source,terminal]
150-
----
151-
33
152-
----
153-
. Ensure the system IRQs are not configured to run on the `dynamic-irq-pod` CPUs:
154-
+
155-
[source,terminal]
156-
----
157-
find /proc/irq/ -name smp_affinity_list -exec sh -c 'i="$1"; mask=$(cat $i); file=$(echo $i); echo $file: $mask' _ {} \;
158-
----
159-
+
160-
.Example output
161-
+
162-
[source,terminal]
163-
----
164-
/proc/irq/0/smp_affinity_list: 0-5
165-
/proc/irq/1/smp_affinity_list: 5
166-
/proc/irq/2/smp_affinity_list: 0-5
167-
/proc/irq/3/smp_affinity_list: 0-5
168-
/proc/irq/4/smp_affinity_list: 0
169-
/proc/irq/5/smp_affinity_list: 0-5
170-
/proc/irq/6/smp_affinity_list: 0-5
171-
/proc/irq/7/smp_affinity_list: 0-5
172-
/proc/irq/8/smp_affinity_list: 4
173-
/proc/irq/9/smp_affinity_list: 4
174-
/proc/irq/10/smp_affinity_list: 0-5
175-
/proc/irq/11/smp_affinity_list: 0
176-
/proc/irq/12/smp_affinity_list: 1
177-
/proc/irq/13/smp_affinity_list: 0-5
178-
/proc/irq/14/smp_affinity_list: 1
179-
/proc/irq/15/smp_affinity_list: 0
180-
/proc/irq/24/smp_affinity_list: 1
181-
/proc/irq/25/smp_affinity_list: 1
182-
/proc/irq/26/smp_affinity_list: 1
183-
/proc/irq/27/smp_affinity_list: 5
184-
/proc/irq/28/smp_affinity_list: 1
185-
/proc/irq/29/smp_affinity_list: 0
186-
/proc/irq/30/smp_affinity_list: 0-5
187-
----
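
The removed verification steps compare each `smp_affinity_list` entry against the CPUs reported for the pod (`2-3` in the example). A hedged sketch of that comparison follows; it is not part of the commit. The helper names `parse_cpu_list` and `overlapping_irqs` are hypothetical, the sample dictionary copies three lines from the example output, and only the plain `a-b,c` list form is handled.

```python
def parse_cpu_list(text: str) -> set[int]:
    """Expand a CPU list such as "0-1,4" into a set of CPU IDs."""
    cpus: set[int] = set()
    for part in text.strip().split(","):
        if not part:
            continue
        if "-" in part:
            start, end = part.split("-")
            cpus.update(range(int(start), int(end) + 1))
        else:
            cpus.add(int(part))
    return cpus


def overlapping_irqs(affinity_lists: dict[str, str], pod_cpus: set[int]) -> list[str]:
    """Return the IRQ paths whose affinity list includes any pod CPU."""
    return [
        path
        for path, cpu_list in affinity_lists.items()
        if parse_cpu_list(cpu_list) & pod_cpus
    ]


pod_cpus = parse_cpu_list("2-3")  # Cpus_allowed_list reported for the pod
sample = {
    "/proc/irq/0/smp_affinity_list": "0-5",
    "/proc/irq/1/smp_affinity_list": "5",
    "/proc/irq/8/smp_affinity_list": "4",
}
print(overlapping_irqs(sample, pod_cpus))  # prints ['/proc/irq/0/smp_affinity_list']
```

An entry that lists every online CPU, such as `0-5`, is not necessarily a misconfiguration: the IRQ affinity module in this commit notes that some IRQ controllers always expose all online CPUs as the mask, so `effective_affinity` is the value to check for those IRQs.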
Original file line numberDiff line numberDiff line change
@@ -0,0 +1,55 @@
1+
// Module included in the following assemblies:
2+
//
3+
// * scalability_and_performance/low_latency_tuning/cnf-provisioning-low-latency-workloads.adoc
4+
5+
:_mod-docs-content-type: PROCEDURE
6+
[id="cnf-configuring-high-priority-workload-pods_{context}"]
7+
= Disabling power saving mode for high priority pods
8+
9+
You can configure pods to ensure that high priority workloads are unaffected when you configure power saving for the node that the workloads run on.
10+
11+
When you configure a node with a power saving configuration, you must configure high priority workloads with performance configuration at the pod level, which means that the configuration applies to all the cores used by the pod.
12+
13+
By disabling P-states and C-states at the pod level, you can configure high priority workloads for best performance and lowest latency.
14+
15+
.Configuration for high priority workloads
16+
[cols="1,2,3", options="header"]
17+
18+
|===
19+
| Annotation | Possible Values | Description
20+
21+
|`cpu-c-states.crio.io:` a| * `"enable"`
22+
* `"disable"`
23+
* `"max_latency:microseconds"` | This annotation allows you to enable or disable C-states for each CPU. Alternatively, you can also specify a maximum latency in microseconds for the C-states. For example, enable C-states with a maximum latency of 10 microseconds with the setting `cpu-c-states.crio.io`: `"max_latency:10"`. Set the value to `"disable"` to provide the best performance for a pod.
24+
25+
| `cpu-freq-governor.crio.io:` | Any supported `cpufreq governor`. | Sets the `cpufreq` governor for each CPU. The `"performance"` governor is recommended for high priority workloads.
26+
|===
27+
28+
.Prerequisites
29+
30+
* You have configured power saving in the performance profile for the node where the high priority workload pods are scheduled.
31+
32+
.Procedure
33+
34+
. Add the required annotations to your high priority workload pods. The annotations override the `default` settings.
35+
+
36+
.Example high priority workload annotation
37+
[source,yaml]
38+
----
39+
apiVersion: v1
40+
kind: Pod
41+
metadata:
42+
...
43+
annotations:
44+
...
45+
cpu-c-states.crio.io: "disable"
46+
cpu-freq-governor.crio.io: "performance"
47+
...
48+
...
49+
spec:
50+
...
51+
runtimeClassName: performance-<profile_name>
52+
...
53+
----
54+
55+
. Restart the pods to apply the annotation.
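
The annotation table in this new module accepts three forms for `cpu-c-states.crio.io`: `"enable"`, `"disable"`, and `"max_latency:microseconds"`. As an illustration only, the sketch below builds the annotation map for a high priority pod and rejects values outside those three forms. The function `power_annotations` is a hypothetical helper, not a CRI-O or OpenShift API, and the governor is passed through unchecked because the supported governors depend on the node.

```python
import re

# The three value forms listed in the table: enable, disable, max_latency:<microseconds>
C_STATE_VALUE = re.compile(r"enable|disable|max_latency:\d+")


def power_annotations(c_states: str = "disable", governor: str = "performance") -> dict[str, str]:
    """Build the pod annotations that override the node power saving defaults."""
    if not C_STATE_VALUE.fullmatch(c_states):
        raise ValueError(f"unsupported cpu-c-states.crio.io value: {c_states!r}")
    return {
        "cpu-c-states.crio.io": c_states,
        "cpu-freq-governor.crio.io": governor,
    }


# Defaults match the example annotation in the module
print(power_annotations())
# C-states stay enabled, with a maximum latency of 10 microseconds
print(power_annotations(c_states="max_latency:10"))
```

The returned dictionary corresponds to the `metadata.annotations` entries in the example YAML; the pod still needs `runtimeClassName: performance-<profile_name>` in its spec.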

modules/cnf-configuring-huge-pages.adoc

+2-1
Original file line numberDiff line numberDiff line change
@@ -1,6 +1,7 @@
11
// Module included in the following assemblies:
2-
//CNF-78 (4.4)
2+
//
33
// * scalability_and_performance/cnf-low-latency-tuning.adoc
4+
// * scalability_and_performance/low_latency_tuning/cnf-tuning-low-latency-nodes-with-perf-profile.adoc
45

56
[id="cnf-configuring-huge-pages_{context}"]
67
= Configuring huge pages
