Poor quality results with reasoning models and structured output #15670
Unanswered
matthew-at-qamcom
asked this question in
Q&A
Replies: 1 comment
-
Great news! The JSON component of structured output doesn't suffer the same issues. Here are some results from different approaches I tried. Guided decoding by Regex (like in the example)Paris: 58 Guided decoding by Regex (like in the example) but this time with temperature set to zeroParis: 48 Turning off the guided Regex and going back to a temperature of a oneParis: 100 (I'm just looking to see if "Paris" or "London" appears in the response.) Trying a one-shot prompt with guided regexI modified the prompt to be Paris: 6 Wow. Same one-shot prompt without guided regexParis: 100 Testing out guided JSON
Paris: 100 Much better :o) Guided JSON that's closer to the original problem
Paris: 100 So my solution will be to use guided JSON and not Regex. |
Beta Was this translation helpful? Give feedback.
0 replies
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment
-
I'm using QwQ-32B [1] with structured outputs. I'm finding the results are of very low quality. Am I doing something wrong?
I have modified some of the example code that asks if the capital of France is either Paris or London.
With a temperate of 1, it picks London in 54 out of 100 attempts!
Changing the temperature to 0 has no real impact (44 out of 100 runs picked London).
This is less than ideal. Without structured output, I doubt QwQ would ever get the answer wrong.
Any suggestions?
Thanks in advance.
I'm running the model using:
My modifications to the example code:
[1] Specifically, I'm using ospatch/QwQ-32B-INT8-W8A8
Beta Was this translation helpful? Give feedback.
All reactions