
Support full duplex streaming #450

Merged
merged 18 commits into from
Mar 6, 2025
Conversation

kfswain
Collaborator

@kfswain kfswain commented Mar 5, 2025

This PR supports the FULL_DUPLEX_STREAMED mode for ext-proc.

Fixes #388

This feature is currently only enabled via an env_var as it is still experimental.

Follow ups:

  • Support full streaming; we are currently buffering the body on the EPP. We could instead use decoder.Token() to read JSON tokens and stream the body back as it comes in.
  • Merge both server implementations together
    • If not feasible, refactor to share logic between ext-proc server implementations
  • Test coverage (this PR woefully has none; we need to fix this ASAP)

@k8s-ci-robot k8s-ci-robot added the do-not-merge/work-in-progress Indicates that a PR should not merge because it is a work in progress. label Mar 5, 2025
@k8s-ci-robot k8s-ci-robot added the cncf-cla: yes Indicates the PR's author has signed the CNCF CLA. label Mar 5, 2025
@kfswain kfswain requested a review from ahg-g March 5, 2025 02:15
@k8s-ci-robot
Contributor

[APPROVALNOTIFIER] This PR is APPROVED

This pull-request has been approved by: kfswain

The full list of commands accepted by this bot can be found here.

The pull request process is described here

Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

@k8s-ci-robot k8s-ci-robot added approved Indicates a PR has been approved by an approver from all required OWNERS files. size/XL Denotes a PR that changes 500-999 lines, ignoring generated files. labels Mar 5, 2025

netlify bot commented Mar 5, 2025

Deploy Preview for gateway-api-inference-extension ready!

Name Link
🔨 Latest commit 7c6c599
🔍 Latest deploy log https://app.netlify.com/sites/gateway-api-inference-extension/deploys/67c8e278285ad400084c6ce5
😎 Deploy Preview https://deploy-preview-450--gateway-api-inference-extension.netlify.app

@k8s-ci-robot k8s-ci-robot added size/XXL Denotes a PR that changes 1000+ lines, ignoring generated files. and removed size/XL Denotes a PR that changes 500-999 lines, ignoring generated files. labels Mar 5, 2025
# operation:
# op: replace
# path: "/default_filter_chain/filters/0/typed_config/http_filters/0/typed_config/processing_mode/response_header_mode"
# value: SEND


Should you set the request_header_mode to SEND as well?

Collaborator Author

It actually already is via:

We just need to populate the request field for it to be included as per: https://gateway.envoyproxy.io/latest/api/extension_types/#extprocprocessingmode

I suppose we could include it here for completeness, though. Open to either.

Collaborator Author

But I would like to do away with this specific patch-policy approach and use EnvoyExtensionPolicy for this. I have PRs out to add support in Envoy: envoyproxy/gateway#5349 & envoyproxy/envoy#38578. I just need to follow up on those and get them unstuck.

}

switch v := req.Request.(type) {
case *extProcPb.ProcessingRequest_RequestHeaders:


A few questions:

  1. Does the streaming server buffer the entire body, or buffer just a portion of the body, then send the response?

  2. Do you have some integration tests with Envoy <-> StreamingServer to test the end-to-end functionalities?

Collaborator Author

  1. Currently we buffer the whole body, as that was faster to implement; streaming the response back is a follow-up.
  2. Not yet. I've been testing this all manually on my own cluster. I also have that as a follow-up.


linux-foundation-easycla bot commented Mar 5, 2025

CLA Signed

The committers listed above are authorized under a signed CLA.

@k8s-ci-robot k8s-ci-robot added cncf-cla: no Indicates the PR's author has not signed the CNCF CLA. size/XL Denotes a PR that changes 500-999 lines, ignoring generated files. and removed cncf-cla: yes Indicates the PR's author has signed the CNCF CLA. size/XXL Denotes a PR that changes 1000+ lines, ignoring generated files. labels Mar 5, 2025
@kfswain
Collaborator Author

kfswain commented Mar 5, 2025

EasyCLA started happy hour a little early today: communitybridge/easycla#4605

@robscott robscott mentioned this pull request Mar 5, 2025
@robscott
Member

robscott commented Mar 5, 2025

/check-cla

Contributor

@ahg-g ahg-g left a comment


Thanks Kellen, mostly asking questions

- -grpcPort
- "9002"
- -grpcHealthPort
- "9003"
env:
- name: USE_STREAMING
value: "true"
Contributor

Can we use_streaming even if ext-proc is set up in buffered mode?

Collaborator Author

I tried it, and it didn't work out of the box. I think that's because we have to use the streaming response body.

We can probably do some generalizing (i.e., if we receive a stream in, we can assume stream out).

Contributor

OK, then we probably want to set this to false, since we are commenting out the streaming part in the patch.

type StreamRequestState int

const (
RequestReceived StreamRequestState = 0
Contributor

Can you add a comment on whether these are states we defined, or states that are part of the ext-proc streaming protocol?

Collaborator Author

These are states I defined, but based on the protocol we need to follow. I used the states because simple nil-checking of the response object we need to send back is not enough. (Imagine the case where we receive a stream chunk, but that chunk doesn't have the model field. Even though the bodyResponse is non-nil, we can't send it yet because we haven't sent the header yet.)
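The point about nil-checking can be sketched as follows. Only `RequestReceived` appears in the diff above; the other state names (`HeaderResponseSent`, `BodyResponseSent`) and the `canSendBody` helper are hypothetical, used here just to illustrate why state tracking is needed:

```go
package main

import "fmt"

// StreamRequestState tracks EPP-defined progress through a single
// request stream. These states are not part of the ext-proc protocol;
// they exist because a non-nil body response alone does not tell us
// whether the header response has already been sent.
type StreamRequestState int

const (
	RequestReceived    StreamRequestState = iota // headers seen, body still streaming
	HeaderResponseSent                           // hypothetical: header response flushed
	BodyResponseSent                             // hypothetical: full body response flushed
)

// canSendBody shows why nil-checking is insufficient: a body chunk may
// arrive (and a body response be built) before the model field has been
// parsed and the header response sent.
func canSendBody(s StreamRequestState, bodyReady bool) bool {
	return bodyReady && s >= HeaderResponseSent
}

func main() {
	fmt.Println(canSendBody(RequestReceived, true))    // false: header not sent yet
	fmt.Println(canSendBody(HeaderResponseSent, true)) // true
}
```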

case *extProcPb.ProcessingRequest_RequestBody:
loggerVerbose.Info("Incoming body chunk", "body", string(v.RequestBody.Body), "EoS", v.RequestBody.EndOfStream)
go func() {
_, err := writer.Write(v.RequestBody.Body)
Contributor

Is this where we block to get the whole stream?

Collaborator Author

Yeah, each stream chunk that comes in is passed to the writer, and the writer blocks until it's read. Once we get an EndOfStream we let the decoder use the reader to read them all, then close the reader (which closes the corresponding writer(s)).

For a true stream, we would need to leverage decoder.Token() and stream back what we can. Although, since we modify the body, it may be challenging not to buffer.
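The blocking-pipe pattern described here can be condensed into a small sketch (not the PR's code; `bufferChunks` is a hypothetical helper, and the decoder here reads concurrently rather than waiting for EndOfStream, but the blocking io.Pipe semantics are the same):

```go
package main

import (
	"encoding/json"
	"fmt"
	"io"
)

// bufferChunks hands each incoming chunk to a pipe writer from a
// goroutine (Write blocks until the reader consumes it), and the JSON
// decoder drains the reader. Closing the writer signals EOF, analogous
// to observing EndOfStream on the ext-proc request.
func bufferChunks(chunks [][]byte) (map[string]any, error) {
	reader, writer := io.Pipe()
	go func() {
		for _, c := range chunks {
			// Blocks until the decoder reads this chunk.
			if _, err := writer.Write(c); err != nil {
				writer.CloseWithError(err)
				return
			}
		}
		writer.Close()
	}()
	var body map[string]any
	err := json.NewDecoder(reader).Decode(&body)
	return body, err
}

func main() {
	body, err := bufferChunks([][]byte{
		[]byte(`{"model":`),
		[]byte(`"llama","prompt":"hi"}`),
	})
	fmt.Println(body["model"], err)
}
```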


// To buffer the full message, we create a goroutine with a writer.Write()
// call, which will block until the corresponding reader reads from it.
// We do not read until we receive the EndofStream signal, and then
// decode the entire JSON body.
Contributor

Is there a sequence ID on the chunks that we need to adhere to when writing to the pipe? Is it possible that we receive the chunks out of order?

Collaborator Author

I think the web-protocol portion is all handled before the Process() func is called, as I didn't see any race-condition issues related to this in my testing.

Additionally, the only fields exposed in the RequestBody are Body and EndOfStream: https://github.com/envoyproxy/go-control-plane/blob/66fc0a3b55b04be5eeb362fb9639a51845218b38/envoy/service/ext_proc/v3/external_processor.pb.go#L605


@kfswain kfswain force-pushed the duplex-stream branch 2 times, most recently from 4fe978e to f65f43b on March 5, 2025 23:45
kfswain added 4 commits March 5, 2025 23:46

Verified

This commit was signed with the committer’s verified signature.
kfswain Kellen Swain

kfswain added 14 commits March 5, 2025 23:46

… request and response

…ementing on both servers

@ahg-g
Contributor

ahg-g commented Mar 5, 2025

This looks good to me, but we need to add proper test coverage as a follow-up. Let me know when it's ready and I can tag it.

@kfswain kfswain changed the title [WIP] Support full duplex streaming Support full duplex streaming Mar 5, 2025
@k8s-ci-robot k8s-ci-robot removed the do-not-merge/work-in-progress Indicates that a PR should not merge because it is a work in progress. label Mar 5, 2025
@BenTheElder
Member

/check-cla

@BenTheElder
Member

Maybe communitybridge/easycla#4605

@ahg-g
Contributor

ahg-g commented Mar 6, 2025

/lgtm

@k8s-ci-robot k8s-ci-robot added the lgtm "Looks good to me", indicates that a PR is ready to be merged. label Mar 6, 2025
@kfswain kfswain merged commit 70965a0 into kubernetes-sigs:main Mar 6, 2025
6 of 8 checks passed
@kfswain
Collaborator Author

kfswain commented Mar 6, 2025

I force-merged this due to EasyCLA having an outage: communitybridge/easycla#4605

That was the only blocker holding this PR. Apologies for any issues/alarms that go off.

@jarias-lfx

/easycla

@kfswain kfswain deleted the duplex-stream branch March 27, 2025 19:39
Labels
approved Indicates a PR has been approved by an approver from all required OWNERS files. cncf-cla: no Indicates the PR's author has not signed the CNCF CLA. lgtm "Looks good to me", indicates that a PR is ready to be merged. size/XL Denotes a PR that changes 500-999 lines, ignoring generated files.
Development

Successfully merging this pull request may close these issues.

Support ext_proc FULL_DUPLEX_STREAMED mode
7 participants