-
Notifications
You must be signed in to change notification settings - Fork 85
Amend the endpoint picker protocol to support multiple fallback endpoints #761
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: main
Are you sure you want to change the base?
Amend the endpoint picker protocol to support multiple fallback endpoints #761
Conversation
Welcome @wbpcode! |
Hi @wbpcode. Thanks for your PR. I'm waiting for a kubernetes-sigs member to verify that this patch is reasonable to test. If it is, they should reply with Once the patch is verified, the new status will be reflected by the I understand the commands that are listed here. Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes-sigs/prow repository. |
✅ Deploy Preview for gateway-api-inference-extension ready!
To edit notification comments on pull requests, go to your Netlify site configuration. |
…ints Signed-off-by: wangbaiping(wbpcode) <[email protected]>
77fbe07
to
c62d3b3
Compare
When I am reviewing the related Envoy implementation of this EPP (envoyproxy/envoy#38757), I found the current design only support single fallback endpoint. That inspire me to create this PR. And if we want to support multiple fallback endpoints, the EPP must send a list of addresses to proxy. If there must be a list of addresses, i think it would be better to make the x-gateway-destination-endpoint a list directly. Then we needn't break the list into different two headers or metadata entries. And from the proxy's pespective, I will prefer only one source to get the external specfied endpoints. And the proxy could iterate the endpoints list in order and select the first valid one. If the retrying happens, the proxy could continue the iteration from the position where it stop last time. cc @yanavlasov |
/assign @LiorLieberman for review /hold |
/ok-to-test |
/cc |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks @wbpcode!
Question - how would it work with an implementation that leverage envoy subsets?
For ref - see the istio implementation of this API
Sorry,I am not allowed to access the doc. But I think it is impossible for subset lb based solution to support multiple fallback endpoints. That's say, if the istio's implementation is used, then the EPP couldn't support multiple fallback endpoints anyway and the legacy format should be used. And I believe the subset lb based solution couldn't support retrying also. (The problem is similar with the original dst based solution.) I will also suggest the istio to update the implementation to use the new override host lb. From the reference document, seems it's an experimental implementation and hasn't implemented the fallback mechanism, so I think it's possible to take the retrying and multiple fallback endpoints into account at this timepoint. |
I think another alternative is to define a new key like The istio haven't release this feature and I think it's possible to switch subset lb to the override host lb in the future. |
} | ||
``` | ||
|
||
Single fallback endpoint also CAN be set using the key `x-gateway-destination-endpoint-fallback` in the same metadata namespace as one used for `x-gateway-destination-endpoint` as follows: |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Is the API break change acceptable or is there a way to deprecate the x-gateway-destination-endpoint-fallback
? Ideally, the EPP should has a flag to enable the new format or use the legacy format. And then we can deprecate the legacy format gradually.
- Setting different value leads to unpredictable behavior because proxies aren't guaranteed to support both paths, and so this protocol does not define what takes precedence. | ||
|
||
### Destination endpoint fallback | ||
A single fallback endpoint CAN be set using the key `x-gateway-destination-endpoint-fallback` in the same metadata namespace as one used for `x-gateway-destination-endpoint` as follows: | ||
|
||
For each HTTP request, if destination endpoint fallback is necessary or possible, the EPP CAN set the `x-gateway-destination-endpoint` HTTP header or metadata entry with multiple addresses in `<ip:port>,<ip:port>,...` format. Multiple addresses are separated by commas. The first valid endpoint in the addresses list will be used as the primary endpoint. And if retrying is happening, the proxy will try the endpoints after the selected endpoint in order. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
SGTM.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I have updated content of the ## Destination Endpoint
section.
The value of the header or metadata entry MUST contains at least one endpoint in `<ip:port>` format or multiple endpoints in `<ip:port>,<ip:port>,...` format. Multiple endpoints are separated by commas. The first valid endpoint in the value will be used. And if retrying is happening, the proxy will try the endpoints after the previously selected endpoint in order.
The x-gateway-destination-endpoint-fallback
is still be kept for backward compatibility.
- Setting different value leads to unpredictable behavior because proxies aren't guaranteed to support both paths, and so this protocol does not define what takes precedence. | ||
|
||
### Destination endpoint fallback | ||
A single fallback endpoint CAN be set using the key `x-gateway-destination-endpoint-fallback` in the same metadata namespace as one used for `x-gateway-destination-endpoint` as follows: | ||
|
||
For each HTTP request, if destination endpoint fallback is necessary or possible, the EPP CAN set the `x-gateway-destination-endpoint` HTTP header or metadata entry with multiple addresses in `<ip:port>,<ip:port>,...` format. Multiple addresses are separated by commas. The first valid endpoint in the addresses list will be used as the primary endpoint. And if retrying is happening, the proxy will try the endpoints after the selected endpoint in order. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Note that this is a breaking change that will impact the Istio implementation at least, which is based on Envoy LB subsetting and requires a single value to be specified.
Adding @LiorLieberman for feedback.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
From API's perspective, this is not a break change. But if take the implementation detail into account, this actually may break the istio implementation. But ideally, we should have a flag to enable the new format or use legacy format in the EPP.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
left a comment below
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The istio haven't release this feature and I think it's possible to switch subset lb to the override host lb in the future.
Its definitely possible to switch. I assume you are talking about envoyproxy/envoy#38757 ?
Will need to to catch up with this PR its been a month or two since I last look at it.
This PR is not even merged yet though - whats the timeline to get it a non experimental API in envoy?
We hope the envoyproxy/envoy#38757 could be used to support EPP with multiple fallback endpionts and retrying support. If we can get agreement with the multiple addresses header/metadata format ( |
cc @LiorLieberman The override host lb is merged and will be released at next envoy major version. :) With the new override host lb, both the legacy format and the new format could be supported. With following configuration, all the endpoints will be joined as an endpoints list. The first valid endpoint will be used. And if retrying is happening, the proxy will try the endpoints after the previously selected endpoint in order. By this way, multiple fallback endpoints, retrying could be supported. And it's compatible to the legacy format (every header/metadata contains only one endpoint).
or
or
|
Signed-off-by: wangbaiping(wbpcode) <[email protected]>
Signed-off-by: wangbaiping(wbpcode) <[email protected]>
Thanks, will take a deeper look tmrw. FTR I opened istio/istio#56230 and cc'd you earlier today. Also https://docs.google.com/document/d/1F82G_o3ENM94Imx5pYKG0x-m508_FSJyfbu1d8J8CCk/edit which is still in early stages (you will need to be either istio or dev@Kubernetes to view it) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I am going to take this proposal with other istio folks, hopefully we can converge on override_host_lb, but would like to get the feedback first.
@wbpcode To clarify - what is the benefit of supporting multiple keys in x-gateway-destination-endpoint
as opposed to x-gateway-destination-endpoint-fallback
or -fallback*s*
?
- This will simplify the proxy implementation. The proxy needn't to handle different endpoint sources.
- This will simplify the EPP implementation because the EPP needn't to set different endpoint sources.
- It's possible to specify the fallback endpoints by HTTP header now.
I dont see any major simplification by moving the fallbacks to x-gateway-destination-endpoint. I would in fact prefer to leave x-gateway-destination-endpoint
a single value now, and in the future maybe evolve it to support some key:value map or list formatted values. For example one random idea in mind, x-gateway-destination-endpoint: zone-a:<ip:port>,zone-b:<ip-port>
to support some version of topology. Is this too crazy?
The key benefit is the multiple fallback endpoints support. In the current proposal, only one primary endpoint and an optional fallback endpoint could be selected. With the new format, the EPP server could select a list of endpoints and the proxy could try them one by one. And from the perspective of the API, this is a compatible change.
At the initial design of the override host lb, we designed a little complex API to support fallback endpoints. This is why I think moving the fallbacks to x-gateway-destination-endpoint could simplify the proxy implementation. But finally we resolved it in the new design. But anyway, I still think single/clear/explicit endpoints source would be helpful for both proxy and EPP server.
If there is reasonable requirement, I think it's fine to support more complex format in the future. But considering the fact the |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
why do we need 2 lists of endpoints? I would expect either a single endpoint + fallback OR a list of endpoints, but not 2 lists?
@@ -58,7 +60,16 @@ dynamicMetadata: { | |||
} | |||
``` | |||
|
|||
### Why envoy.lb namespace as a default? | |||
The endpoints specified in `x-gateway-destination-endpoint-fallback` MAY be tried after the endpoints specified in `x-gateway-destination-endpoint` if all the endpoints specified in `x-gateway-destination-endpoint` are unavailable. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
When x-gateway-destination-endpoint
defines more than one IP:Port
, the additional entries act as a fallback, correct? If so, why is x-gateway-destination-endpoint-fallback
still needed?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
it's unnecessary, but not sure if it's ok or not to remove it directly from the proposal. I will removed it first.
Yeah. I think we all prefer the only one list of endpoints. In the format, Only the I have removed the |
Signed-off-by: wangbaiping(wbpcode) <[email protected]>
Signed-off-by: wangbaiping(wbpcode) <[email protected]>
@LiorLieberman, can you confirm whether |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
If there is reasonable requirement, I think it's fine to support more complex format in the future. But considering the fact the ip:port basically is the most fine-grained unit, I didn't get why the zone make sense here?
Theoretically the API can evolve to support something like "this is the best endpoint in this zone" (or any other topology key). Thats what I was referring to.
@LiorLieberman, can you confirm whether x-gateway-destination-endpoint-fallback is still required by the Istio implementation? If so, @wbpcode, we can mark it as deprecated and remove it in a future release.
Istio does not support this key currently, so it is safe from this perspective. But would be good to check with all implementations of this API, last time we had a breaking change in the EPP istio release could not function.
If you all in agreement that one list is better than two, I am fine with that. I thought a different key is slightly cleaner and leaves more room for expansion but this opinion is not very strong.
[APPROVALNOTIFIER] This PR is NOT APPROVED This pull-request has been approved by: LiorLieberman, wbpcode The full list of commands accepted by this bot can be found here.
Needs approval from an approver in each of these files:
Approvers can indicate their approval by writing |
@ahg-g, since you applied the hold, PTAL now that @LiorLieberman provided feedback. cc: @arkodg and @kfswain to confirm whether Envoy Gateway uses |
friendly ping @ahg-g |
Part of #414
x-gateway-destination-endpoint
to support multiple endpoints.