Skip to content

[Bug]: Guided Decoding Backend options with the OpenAI server recently broken #17002

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Closed
1 task done
tjohnson31415 opened this issue Apr 22, 2025 · 7 comments · Fixed by #17008
Closed
1 task done

[Bug]: Guided Decoding Backend options with the OpenAI server recently broken #17002

tjohnson31415 opened this issue Apr 22, 2025 · 7 comments · Fixed by #17008
Labels
bug Something isn't working structured-output

Comments

@tjohnson31415
Copy link
Contributor

tjohnson31415 commented Apr 22, 2025

Your current environment

vLLM installed with:

pip install https://wheels.vllm.ai/5536b30a4c7877d75758d21bdaf39b3a59aa2dc2/vllm-1.0.0.dev-cp38-abi3-manylinux1_x86_64.whl

🐛 Describe the bug

After merging #16789, using "options" for guided decoding backends no longer works. Attempting to include a backend option results in:

$ vllm serve meta-llama/Llama-3.2-3B-Instruct --guided-decoding-backend xgrammar:disable-any-whitespace
INFO 04-22 18:45:12 [__init__.py:239] Automatically detected platform cuda.
usage: vllm serve [model_tag] [options]
vllm serve: error: argument --guided-decoding-backend: invalid choice: 'xgrammar:disable-any-whitespace' (choose from 'auto', 'outlines', 'lm-format-enforcer', 'xgrammar')

The new type checking of the args checks against a Literal type for the backend name, disallowing any options. For reference, backend options are briefly documented REF:

Additional backend-specific options can be supplied in a comma separated list following a colon after the backend name. For example "xgrammar:no-fallback" will not allow vLLM to fallback to a different backend on error.

Note that there are a few backend options that can be combined like guidance:disable-any-whitespace,no-fallback, so simply adding entries to the list of Literals seems untenable. I encountered this bug when writing up a PR to add another option #15949.

cc: @russellb @hmellor

Before submitting a new issue...

  • Make sure you already searched for relevant issues, and asked the chatbot living at the bottom right corner of the documentation page, which can answer lots of frequently asked questions.
@tjohnson31415 tjohnson31415 added the bug Something isn't working label Apr 22, 2025
@hmellor
Copy link
Member

hmellor commented Apr 22, 2025

Thanks for the bug report!

Personally, I'd prefer to split this into at least two args rather than making one arg do two things.

cc @joerunde whon added the backend:option1,option2 functionality.

@russellb
Copy link
Member

If we change it, we should probably have a deprecation period where the previous format still works but raises a warning. Either way, I think the previous format needs to keep working for now.

@hmellor
Copy link
Member

hmellor commented Apr 22, 2025

Here's a draft PR (unfinished) which would re-enable these options and keep the old format until we decide we can drop it #17008

@hmellor
Copy link
Member

hmellor commented Apr 23, 2025

Before I put more work into that PR, is everyone happy with that as a solution? Rather simply fixing the custom arg processing we had befoire?

@tjohnson31415
Copy link
Contributor Author

IMO, using an option string was a good quick-fix for no-fallback, but doesn't scale well as options/backends are added. If we are making a change, switching to something like --guided-decoding-config and allowing it to be a dict/JSON (with backend-specific fields) would generalize better and is more aligned with other configurations.

WDYT @russellb @joerunde?

@joerunde
Copy link
Collaborator

IMO, using an option string was a good quick-fix for no-fallback, but doesn't scale well as options/backends are added. If we are making a change, switching to something like --guided-decoding-config and allowing it to be a dict/JSON (with backend-specific fields) would generalize better and is more aligned with other configurations.

👍 , yeah I was looking for the quick fix there and it seemed like a much better idea than adding more environment variables at the time. I agree we can change it to split out these options into a second list or dict, a dict would probably be more flexible in case we need key-value pairs in the future. I'm fine if we stick a warning on the existing string options now and delete that support in v0.9.0

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
bug Something isn't working structured-output
Projects
Status: Done
Development

Successfully merging a pull request may close this issue.

4 participants