e2e CI Job #259
Comments
I don't think it is required as long as we can run the test manually after each commit. By the way, this requires #244 |
I can help work on the vLLM CPU version. I will post an update soon. |
I see that there was intent for vllm to support a |
Okay, but manual post-commit testing is a bit risky. Adding the CI job should be a high priority after v0.1. |
I built one, published it to my personal Docker Hub, and ran it successfully. The only thing worth noting is that the image size is still ~9GB, which could introduce
Scripts I used
The following code snippet should work as well.
|
@Jeffwan I can reproduce #259 (comment) but I can't run the qwen model. See this gist for details. |
@danehans Let me check the gist and see if there's anything I can help with |
@Jeffwan I also tried running it. In my env it fails to run even in Docker (it also fails in an OpenShift cluster). |
@Jeffwan @danehans I was able to make a lot of progress and run the CPU-based version in OpenShift.
|
I was able to run the following. See the models REST call below:
and the
One point to note is that the completions request took about 2 minutes 23 seconds. The next step would be to expose the pods via a Gateway and HTTPRoute and then make the same calls from outside the cluster. |
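For anyone who wants to repeat the two calls described in the comment above, here is a minimal sketch. It assumes the pod runs vLLM's OpenAI-compatible server, which exposes `/v1/models` and `/v1/completions`, and that it is reachable at `http://localhost:8000` (for example through a port-forward). The base URL, the model ID placeholder, and the helper names are all hypothetical and not taken from the thread. The timeout is set well above the roughly 2.5 minutes reported for the CPU-based completion.

```python
import json
import urllib.request

# Assumption: the vLLM OpenAI-compatible server is reachable here,
# e.g. through a port-forward to the pod. Not taken from the thread.
BASE_URL = "http://localhost:8000"
# Placeholder: replace with an id returned by the /v1/models call.
MODEL = "REPLACE_WITH_MODEL_ID"


def build_models_request(base_url: str = BASE_URL) -> urllib.request.Request:
    """Build the GET request that lists the models the server is serving."""
    return urllib.request.Request(f"{base_url}/v1/models", method="GET")


def build_completions_request(
    prompt: str,
    model: str = MODEL,
    base_url: str = BASE_URL,
    max_tokens: int = 32,
) -> urllib.request.Request:
    """Build the POST request for a text completion."""
    body = json.dumps(
        {"model": model, "prompt": prompt, "max_tokens": max_tokens}
    ).encode("utf-8")
    return urllib.request.Request(
        f"{base_url}/v1/completions",
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def send(request: urllib.request.Request, timeout: float = 300.0) -> dict:
    """Send a request and decode the JSON response.

    The generous default timeout allows for slow CPU-only inference.
    """
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return json.loads(response.read().decode("utf-8"))


# Usage against a running server (not executed here):
#   print(send(build_models_request()))
#   print(send(build_completions_request("Hello, my name is")))
```

Keeping request construction separate from sending makes it easy to point the same calls at a Gateway address later by changing only `base_url`.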
the next question I have is - |
This gist should help guide you through the process: https://gist.github.com/danehans/f19f8805155571b65c68cfe057b6e09c If pulling the Qwen model requires a HF token, we'll need to reopen this: @robscott may have additional guidance to share. |
/assign |
The CPU-based example seems to work without the HF token. |
Opened issue kubernetes/test-infra#34495, but this should wait until #485 is merged. |
Create a pre-submit CI job that runs the e2e test.