-
Notifications
You must be signed in to change notification settings - Fork 277
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
fix: Don't try watching items with missing resourceVersion #704
base: master
Are you sure you want to change the base?
fix: Don't try watching items with missing resourceVersion #704
Conversation
Codecov ReportAttention: Patch coverage is
Additional details and impacted files@@ Coverage Diff @@
## master #704 +/- ##
==========================================
- Coverage 54.26% 53.33% -0.94%
==========================================
Files 64 64
Lines 6164 6433 +269
==========================================
+ Hits 3345 3431 +86
- Misses 2549 2726 +177
- Partials 270 276 +6 ☔ View full report in Codecov by Sentry. 🚀 New features to boost your workflow:
|
ff1a458
to
10b1bbe
Compare
Trying to watch items without resourceVersion will fail. Instead of failing and retrying every second, wait until the next resync interval and try again then. This reduces load and log spam. Signed-off-by: Daniel Vassdal <[email protected]>
10b1bbe
to
8d347a2
Compare
|
Does this mean an aggregated API, or something different? |
Yeah that's it, sorry for the imprecise language. If I read the Kubernetes API Conventions document right I think it's probably safe to assume only aggregated APIs will ever return resources without resourceVersion, but I'm not certain. |
Watching fails for resources missing a
resourceVersion
.The watch is then retried again a second later, ad infinitum.
This leads to extra resource use, and an obscene amount of noise in the logs.
When
resourceVersion
is missing, this is normally because it's not a real object, but rather an interface to get some information from an api extension - e.g. Antrea uses this pattern, as shown in the example log below.This issue appears to be the same thing: argoproj/argo-cd#21092
This merge request changes the behaviour for missing
resourceVersion
to wait for the next resync interval before trying again.I figure this should be fine as these resources are already not working.
If you have any concerns or would like me to go about this another way (e.g. by implementing backoff instead), let me know and I'll update the MR.
Example of log illustrating the issue
The following is a three second excerpt from a cluster with multiple resources affected by this.
Example of log with this MR applied
Note: Only contains lines containing
Deployment
andNetworkPolicy.controlplane
.