Fix: EC2 controller for vpcendpoint does not update status #191
Hi @nnbu. Thanks for your PR. I'm waiting for an aws-controllers-k8s member to verify that this patch is reasonable to test. If it is, they should reply with `/ok-to-test`. Once the patch is verified, the new status will be reflected by the `ok-to-test` label. I understand the commands that are listed here. Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository.
/ok-to-test
This is great, thank you @nnbu !
Shall we make a release for this?
build_date: "2024-05-21T22:16:46Z"
build_hash: d660ee36fe947607ebea039acd47c35477b4a836
go_version: go1.21.1
version: v0.28.0-58-gd660ee3
Can you please use the latest version of code-gen?
/hold
done
@nnbu can you also update the e2e tests? I think we might have to wait more than 5x30 seconds.
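The "5x30 seconds" above reads as a bounded poll: a fixed number of attempts with a fixed interval between them. A minimal Go sketch of that pattern follows, for illustration only. It is not the repository's e2e test code, and `waitUntil`, `noSleep`, and the attempt counts are hypothetical names and values chosen for this example.

```go
package main

import (
	"errors"
	"fmt"
	"time"
)

// noSleep is a stand-in for time.Sleep so the demo finishes instantly.
func noSleep(time.Duration) {}

// waitUntil calls check up to attempts times, pausing for interval between
// tries, and returns nil as soon as check reports true.
func waitUntil(attempts int, interval time.Duration, sleep func(time.Duration), check func() bool) error {
	for i := 0; i < attempts; i++ {
		if check() {
			return nil
		}
		if i < attempts-1 {
			sleep(interval) // no pause after the final attempt
		}
	}
	return errors.New("condition not met before attempts were exhausted")
}

func main() {
	calls := 0
	// Hypothetical check that only succeeds on the 7th poll.
	check := func() bool { calls++; return calls >= 7 }

	// 5 attempts are not enough for this fake resource.
	fmt.Println("5 attempts:", waitUntil(5, 30*time.Second, noSleep, check))

	calls = 0
	// 10 attempts leave enough headroom.
	fmt.Println("10 attempts:", waitUntil(10, 30*time.Second, noSleep, check))
}
```

Passing the sleep function as a parameter keeps the sketch fast to run; a real test would pass `time.Sleep`.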
Issue: aws-controllers-k8s/community#2075
Description of changes: The vpce CR status is not updated. When a vpce is created, it takes around a minute for it to become available in AWS, so its status is initially `pending`. Because no sync is enforced on the reconciler, the next reconciliation happens only after the default 10h period, and only then is the status set correctly. This fix enforces reconciliation so that the CR is updated as soon as vpce creation in AWS succeeds.
By submitting this pull request, I confirm that my contribution is made under the terms of the Apache 2.0 license.
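To make the timing problem concrete, here is a small Go sketch of the decision described above. It is a simplified model, not the ACK runtime: `nextReconcile` and the 30-second requeue delay are assumptions made for illustration, and only the 10h default comes from the description.

```go
package main

import (
	"fmt"
	"time"
)

const (
	// defaultResyncPeriod is the 10h default mentioned in the PR description.
	defaultResyncPeriod = 10 * time.Hour
	// notSyncedRequeue is an assumed short delay, chosen only for this sketch.
	notSyncedRequeue = 30 * time.Second
)

// nextReconcile models how long the controller waits before looking at a
// resource again: a synced resource waits for the periodic resync, while a
// resource marked not synced is requeued after a short delay.
func nextReconcile(synced bool) time.Duration {
	if synced {
		return defaultResyncPeriod
	}
	return notSyncedRequeue
}

func main() {
	for _, state := range []string{"pending", "available"} {
		// Model of the behavior in this PR: anything other than
		// "available" is marked not synced.
		synced := state == "available"
		fmt.Printf("state=%s synced=%t next reconcile in %s\n",
			state, synced, nextReconcile(synced))
	}
}
```

Without the fix, a `pending` endpoint is effectively treated like the synced case, which is why the status stayed stale until the periodic resync.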
Thank you @nnbu !
/lgtm
/test all
/unhold
[APPROVALNOTIFIER] This PR is APPROVED. This pull-request has been approved by: a-hilaly, nnbu. The full list of commands accepted by this bot can be found here. The pull request process is described here.
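I found a potential bug with this implementation: aws-controllers-k8s/community#1933 (comment) It seems that we stop syncing the state once the VPCEndpoint becomes available. However, an available VPCEndpoint can still later be rejected, and the controller will not update the state accordingly.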
The fix addresses the case where the VPC endpoint goes from `pending` to `available`.
Your latest comment is about the case where the state changes from `available` to `rejected`. As you mentioned, that transition is triggered manually and is not managed or controlled by the ACK controller.
Such cases cannot be handled without reconciling every single resource at a sufficiently high frequency. For example, if any resource is deleted or changed outside the scope of the ACK controllers, the change is not reflected in ACK immediately.
ACK has a default reconciliation period of 10 hours, so it detects such a state change only at the next periodic reconciliation.
On Wed, Jun 5, 2024, 2:55 PM Junfeng Wu wrote:
> @nnbu @a-hilaly I found a potential bug with this implementation: aws-controllers-k8s/community#1933 (comment)
> It seems that we stop syncing the state once the VPCEndpoint becomes available. However an available VPCEndpoint can still later be rejected and the controller will not update the state accordingly.
if !vpcEndpointAvailable(&resource{ko}) {
	// Setting resource synced condition to false will trigger a requeue of
	// the resource. No need to return a requeue error here.
	ackcondition.SetSynced(&resource{ko}, corev1.ConditionFalse, nil, nil)
} else {
	ackcondition.SetSynced(&resource{ko}, corev1.ConditionTrue, nil, nil)
}
@nnbu Thanks for the explanation in this comment, it mostly makes sense to me! I still have one question about this implementation.
Here we consider a VPCEndpoint resource not synced as long as it's not in the `available` state. However, a VPC endpoint can be in the `pendingAcceptance` state after creation and stay in that state forever, or later move to the `rejected` state. Either way, it never becomes `available`.
It seems to me that, from the controller's perspective, it should be considered synced in these cases. In earlier versions of the controller that was indeed the case. But with this change, the VPCEndpoint will remain unsynced after its creation unless it later becomes `available`, which may never happen in some cases.
Does it make sense to force the resource to be unsynced only when it is in the `pending` state, instead of whenever it is in a non-`available` state? This seems more in line with the endpoint state definition:
> Pending - The service provider accepted the connection request. This is the initial state if requests are automatically accepted. The VPC endpoint returns to this state if the service consumer modifies the VPC endpoint.
We have an automated pipeline that, for a VPCEndpoint requiring manual acceptance, first waits until the resource is synced and then accepts the connection through the API. This change will break that behavior.
> Does it make sense to only force the resource to be unsynced only when the resource is in pending instead of when it's in a non available state?
This makes total sense. What are the transitional states for a VPC endpoint, similar to `pending`? We should basically treat all such transitional states as `notSynced`.
@amine will be able to check with the service teams and get such states. Depending on that, we can make the change. Your PR looks good to me if `pending` is the only transitional state.
On Wed, Jun 5, 2024, 4:42 PM Junfeng Wu commented on this pull request, in templates/hooks/vpc_endpoint/sdk_read_many_post_set_output.go.tpl:
> @nnbu @a-hilaly, if it makes sense to you, here's the fix I proposed on top of this change: #198
Context: #191 (comment)
A synced VPC endpoint should only be forced to re-sync when it's in the `pending` state. Other states such as `pendingAcceptance`, `rejected`, `failed`, and `expired` could be the final state of the endpoint, so the sync should be considered complete.
By submitting this pull request, I confirm that my contribution is made under the terms of the Apache 2.0 license.
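The rule described in this follow-up can be sketched as a single state check. This is a hedged illustration rather than the code merged in #198: `shouldRequeue` is a hypothetical helper, and the state strings are written with the casing used in this thread, which is assumed to match what the API returns.

```go
package main

import "fmt"

// shouldRequeue reports whether an endpoint in the given state should be
// marked not synced so that the controller checks it again soon.
// Only "pending" is treated as transitional here, following the description
// above; every other state is treated as possibly final.
func shouldRequeue(state string) bool {
	return state == "pending"
}

func main() {
	states := []string{
		"pending", "pendingAcceptance", "available",
		"rejected", "failed", "expired",
	}
	for _, s := range states {
		fmt.Printf("%-18s requeue=%t\n", s, shouldRequeue(s))
	}
}
```

Under this rule an endpoint waiting in `pendingAcceptance` is reported as synced, so a pipeline that waits for the synced condition before accepting the connection keeps working.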