Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

PoC implementation #4

Merged
merged 1 commit into from
Sep 18, 2024
Merged

PoC implementation #4

merged 1 commit into from
Sep 18, 2024

Conversation

kfswain
Copy link
Collaborator

@kfswain kfswain commented Sep 4, 2024

Implementation of PoC shared in wg-serving.

Making the code public so that other contributors can test our implementation, reproduce numbers and check assumptions.

@k8s-ci-robot k8s-ci-robot added cncf-cla: yes Indicates the PR's author has signed the CNCF CLA. size/XXL Denotes a PR that changes 1000+ lines, ignoring generated files. labels Sep 4, 2024
cluster: original_destination_cluster
timeout: 1000s # Increase route timeout
http_filters:
- name: envoy.filters.http.ext_proc
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

hey, there's a first class API for ext proc which could replace most of the bootstrap config
https://gateway.envoyproxy.io/docs/tasks/extensibility/ext-proc/

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

How widely is the API supported? If it's more widely supported than the bootstrap config (or dramatically easier for most who would want to run the demo), then I agree we should follow up with that quickly. Or present it as an alternative.

Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

the bootstrap has some hardcoded values like

                          socket_address:
                            address: 34.118.231.147
                            port_value: 9002

so this is most likely not going to work when reproduced

The Ext Proc API will eliminate the need to hardcode IPs and ports as well as reduce a lot of the envoy proxy filter config which is generated internally.

The API was introduced in v1.1, its in v1alpha1 but its here to stay, and so is the bootstrap field, the API provides a better interface to the user for this use case

@smarterclayton
Copy link
Contributor

If this runs, I would suggest we merge and move ahead and take additional feedback as comments to the merged PR or new PRs / issues. The value of the PoC is people trying it, not reviewing it.

Also, I'd suggest we follow Kube conventions for commits, which are:

One line below commit title length - summarize the issue (70? chars)
any other description in a separate paragraphs

Squash commits and rebase down to minimal diffs (you can have multiple commits if each is logically distinct, like adding a test that fails and then fixing the issue, or delivering a fix before your main feature changes).

@kfswain
Copy link
Collaborator Author

kfswain commented Sep 13, 2024

Squash commits and rebase down to minimal diffs

Ack. Can we do this at merge? Or is it typically before?

@smarterclayton
Copy link
Contributor

At merge or before

Generally recommend everyone squash before they get approval, with a good comment and description that a future version of yourself and others can interpret to understand why the code was included.

Copy link
Member

@terrytangyuan terrytangyuan left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If this runs, I would suggest we merge and move ahead and take additional feedback as comments to the merged PR or new PRs / issues. The value of the PoC is people trying it, not reviewing it.

I agree. It would be easier to test out once the PR is merged. This project is in early phase without releases yet so people should understand the risk.

@kfswain
Copy link
Collaborator Author

kfswain commented Sep 16, 2024

At merge or before

Generally recommend everyone squash before they get approval, with a good comment and description that a future version of yourself and others can interpret to understand why the code was included.

Done

- Go (for local development)

## vLLM
***This PoC uses a modified vLLM fork, the public image of the fork is here: `ghcr.io/tomatillo-and-multiverse/vllm:demo`***
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is there a reason to use a modified fork?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Document why the fork.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

And specifically link to the commits/commit-range/branch/PR that added them.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done. I linked to the fork, there is not currently a PR for the fork

@smarterclayton
Copy link
Contributor

/lgtm
/approve

@k8s-ci-robot k8s-ci-robot added the lgtm "Looks good to me", indicates that a PR is ready to be merged. label Sep 18, 2024
@k8s-ci-robot
Copy link
Contributor

[APPROVALNOTIFIER] This PR is APPROVED

This pull-request has been approved by: kfswain, smarterclayton

The full list of commands accepted by this bot can be found here.

The pull request process is described here

Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

@k8s-ci-robot k8s-ci-robot added the approved Indicates a PR has been approved by an approver from all required OWNERS files. label Sep 18, 2024
@k8s-ci-robot k8s-ci-robot merged commit 90c2b64 into kubernetes-sigs:main Sep 18, 2024
2 checks passed
@kfswain kfswain deleted the poc branch October 28, 2024 18:25
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
approved Indicates a PR has been approved by an approver from all required OWNERS files. cncf-cla: yes Indicates the PR's author has signed the CNCF CLA. lgtm "Looks good to me", indicates that a PR is ready to be merged. size/XXL Denotes a PR that changes 1000+ lines, ignoring generated files.
Projects
None yet
Development

Successfully merging this pull request may close these issues.

5 participants