Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Clarify When to Set the timestamp and containersStatuses Fields in PodSandboxStatusResponse #129857

Open
HirazawaUi opened this issue Jan 28, 2025 · 11 comments · May be fixed by #131127
Open

Clarify When to Set the timestamp and containersStatuses Fields in PodSandboxStatusResponse #129857

HirazawaUi opened this issue Jan 28, 2025 · 11 comments · May be fixed by #131127
Assignees
Labels
kind/documentation Categorizes issue or PR as related to documentation. needs-triage Indicates an issue or PR lacks a `triage/foo` label and requires one. sig/node Categorizes an issue or PR as relevant to SIG Node.

Comments

@HirazawaUi
Copy link
Contributor

HirazawaUi commented Jan 28, 2025

When the evented PLEG is enabled, the kubelet determines whether to retrieve container statuses from the PodSandboxStatusResponse by checking if its timestamp field is empty.
Ref:

if resp.Timestamp == 0 {
// If the Evented PLEG is enabled in the kubelet, but not in the runtime
// then the pod status we get will not have the timestamp set.
// e.g. CI job 'pull-kubernetes-e2e-gce-alpha-features' will runs with
// features gate enabled, which includes Evented PLEG, but uses the
// runtime without Evented PLEG support.
klog.V(4).InfoS("Runtime does not set pod status timestamp", "pod", klog.KObj(pod))
containerStatuses, err = m.getPodContainerStatuses(ctx, uid, name, namespace)
if err != nil {
if m.logReduction.ShouldMessageBePrinted(err.Error(), podFullName) {
klog.ErrorS(err, "getPodContainerStatuses for pod failed", "pod", klog.KObj(pod))
}
return nil, err
}
} else {
// Get the statuses of all containers visible to the pod and
// timestamp from sandboxStatus.
timestamp = time.Unix(0, resp.Timestamp)
for _, cs := range resp.ContainersStatuses {
cStatus := m.convertToKubeContainerStatus(cs)
containerStatuses = append(containerStatuses, cStatus)
}
}

However, in containerd/containerd#10740, the timestamp field of PodSandboxStatusResponse was populated, while the containersStatuses field was not. This clearly breaks the evented PLEG functionality.

The maintainers of containerd were unaware of how the kubelet uses the timestamp field, and there are no existing comments documenting when these fields should be populated.

It's worth noting that the KEP mentions that the PodSandboxStatusRequest would include an includeContainers field, which we use to determine whether to return the timestamp and containerStatuses in the PodSandboxStatusResponse. However, we have not added this field in the CRI proto definitions.

To address this, I propose clarifying in the comments of PodSandboxStatusResponse the conditions under which the timestamp and containersStatuses fields should be set.

While this is a minor problem, it has already caused misunderstandings and impacted functionality. For this reason, I am opening an issue to discuss and formally document the expectations around these fields.

relate: #129818

@k8s-ci-robot k8s-ci-robot added needs-sig Indicates an issue or PR lacks a `sig/foo` label and requires one. needs-triage Indicates an issue or PR lacks a `triage/foo` label and requires one. labels Jan 28, 2025
@k8s-ci-robot
Copy link
Contributor

This issue is currently awaiting triage.

If a SIG or subproject determines this is a relevant issue, they will accept it by applying the triage/accepted label and provide further guidance.

The triage/accepted label can be added by org members by writing /triage accepted in a comment.

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes-sigs/prow repository.

@HirazawaUi
Copy link
Contributor Author

/sig node
/kind documentation

@k8s-ci-robot k8s-ci-robot added sig/node Categorizes an issue or PR as relevant to SIG Node. kind/documentation Categorizes issue or PR as related to documentation. and removed needs-sig Indicates an issue or PR lacks a `sig/foo` label and requires one. labels Jan 28, 2025
@HirazawaUi
Copy link
Contributor Author

HirazawaUi commented Feb 5, 2025

I've also opened a PR in the k/k repo to address the issue where kubelet is unable to retrieve container status from PodSandboxStatusResponse.

@HirazawaUi
Copy link
Contributor Author

@HirazawaUi
Copy link
Contributor Author

HirazawaUi commented Feb 21, 2025

Currently in CRI-O, the parameter enable_pod_events determines whether CRI-O reports container events and includes container states in the `PodSandboxStatusResponse.

However, containerd does not have this parameter. When I opened a PR requesting that containerd return both the timestamp and containersStatuses fields, the containerd maintainers have the same question regarding this issue.
ref: containerd/containerd#11408 (comment)

This would avoid returning unnecessary container states in the PodSandboxStatus interface if users do not actually require them. If we do not plan to retain the includeContainers field, containerd may need to introduce a parameter similar to CRI-O's enable_pod_events.

@cdacosta512
Copy link

/assign

@cdacosta512
Copy link

@HirazawaUi I'll go ahead and update the comments in the PodSandboxStatusResponse proto message to clarify the expectations around the timestamp and containers_statuses fields. Specifically, I’ll document that if timestamp is set (indicating Evented PLEG support), then containers_statuses must also be populated. This should help prevent future misunderstandings between the Kubelet and runtime implementations.

I'll open a PR shortly.

@HirazawaUi
Copy link
Contributor Author

@HirazawaUi I'll go ahead and update the comments in the PodSandboxStatusResponse proto message to clarify the expectations around the timestamp and containers_statuses fields. Specifically, I’ll document that if timestamp is set (indicating Evented PLEG support), then containers_statuses must also be populated. This should help prevent future misunderstandings between the Kubelet and runtime implementations.

I'll open a PR shortly.

Please wait. This issue hasn’t been discussed by sig-node yet, so we’re still unsure about what the correct behavior should be.

@cdacosta512
Copy link

@HirazawaUi Ok understood. PR was opened to address the change but what should the path forward be regarding the PR. Can it remain unreviewed or do you need me to close it for now? Tks

@HirazawaUi
Copy link
Contributor Author

@HirazawaUi Ok understood. PR was opened to address the change but what should the path forward be regarding the PR. Can it remain unreviewed or do you need me to close it for now? Tks

This issue might not be resolved in the short term. If you’re willing to push it forward, you could keep the PR open until the issue gains attention and a resolution direction is determined (though this might take quite a bit of time...).

@cdacosta512
Copy link

OK, I most likely close in the morning, but I'll pick up another issue in the interim. Tks

@SergeyKanzhelev SergeyKanzhelev moved this from Triaged to Triage in SIG Node Bugs Apr 9, 2025
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
kind/documentation Categorizes issue or PR as related to documentation. needs-triage Indicates an issue or PR lacks a `triage/foo` label and requires one. sig/node Categorizes an issue or PR as relevant to SIG Node.
Projects
Status: Triage
3 participants