Structured output & json mode #122

Closed
wants to merge 59 commits into from

Conversation

kieranklaassen
Contributor

@kieranklaassen kieranklaassen commented Apr 18, 2025

Solves #11

  • Only OpenAI is supported with strict mode; all other providers are supported as well (non-strict)
  • Added a `strict: false` flag for people to experiment with on non-OpenAI providers
  • Testing on prod with https://cora.computer/
cora-app(dev)* schema = {
cora-app(dev)*   type: "object",
cora-app(dev)*   properties: {
cora-app(dev)*     name: { type: "string" },
cora-app(dev)*     age: { type: "integer" },
cora-app(dev)*     interests: { 
cora-app(dev)*       type: "array", 
cora-app(dev)*       items: { type: "string" }
cora-app(dev)*     }
cora-app(dev)*   },
cora-app(dev)*   required: ["name", "age", "interests"]
cora-app(dev)> }
cora-app(dev)> 
cora-app(dev)> # Get structured output as a Hash
cora-app(dev)> response = RubyLLM.chat(model: "gemini-2.0-flash")
cora-app(dev)> .with_output_schema(schema, strict:false)
cora-app(dev)> .ask("Create a profile for a Ruby developer").content
=> {"name" => "Ruby Developer Extraordinaire", "age" => 30, "interests" => ["Ruby on Rails", "Web Development", "REST APIs", "PostgreSQL", "Software Architecture", "Agile Methodologies"]}
cora-app(dev)* schema = {
cora-app(dev)*   type: "object",
cora-app(dev)*   properties: {
cora-app(dev)*     name: { type: "string" },
cora-app(dev)*     age: { type: "integer" },
cora-app(dev)*     interests: { 
cora-app(dev)*       type: "array", 
cora-app(dev)*       items: { type: "string" }
cora-app(dev)*     }
cora-app(dev)*   },
cora-app(dev)*   required: ["name", "age", "interests"]
cora-app(dev)> }
cora-app(dev)> 
cora-app(dev)> # Get structured output as a Hash
cora-app(dev)> response = RubyLLM.chat(model: "claude-3-7-sonnet-20250219")
cora-app(dev)> .with_output_schema(schema, strict:false)
cora-app(dev)> .ask("Create a profile for a Ruby developer").content
=> {"name" => "Alex Johnson", "age" => 32, "interests" => ["Ruby on Rails", "Web development", "Open source contribution", "Hiking", "Jazz music"]}
cora-app(dev)* schema = {
cora-app(dev)*   type: "object",
cora-app(dev)*   properties: {
cora-app(dev)*     name: { type: "string" },
cora-app(dev)*     age: { type: "integer" },
cora-app(dev)*     interests: { 
cora-app(dev)*       type: "array", 
cora-app(dev)*       items: { type: "string" }
cora-app(dev)*     }
cora-app(dev)*   },
cora-app(dev)*   required: ["name", "age", "interests"]
cora-app(dev)> }
cora-app(dev)> 
cora-app(dev)> # Get structured output as a Hash
cora-app(dev)> response = RubyLLM.chat(model: "gpt-4.1-nano")
cora-app(dev)> .with_output_schema(schema)
cora-app(dev)> .ask("Create a profile for a Ruby developer").content
=> {"name" => "Alex Johnson", "age" => 30, "interests" => ["Ruby programming", "Web development", "Open source contributions", "Agile methodologies", "Continuous integration"]}
cora-app(dev)> 

kieranklaassen and others added 13 commits April 18, 2025 11:30
…emove structured output references

- Renamed `supports_structured_output?` to `supports_json_mode?` in capabilities.
- Updated the `complete` method in the chat provider to remove structured output handling.
- Adjusted tests to reflect the changes in capabilities and removed references to structured output.
- Deleted obsolete VCR cassette for structured JSON output.
This ensures all providers have compatible interfaces for working with structured output, even if they don't use it directly.
Keep consistent naming across all providers by using supports_structured_output?
…t chat parameter

This change ensures consistency across the chat providers by allowing the parse_completion_response method to accept an optional chat parameter, enhancing compatibility with structured output handling.
…re to improve clarity

- Changed the parameter name in supports_structured_output? methods across multiple providers to improve clarity.
- Enhanced error message formatting in the chat module for better readability.
- Simplified conditional checks in render_payload methods for consistency.
CLAUDE.md Outdated
@@ -0,0 +1,23 @@
# CLAUDE.md
Contributor Author

question: should we include or not?


Feels like yet another thing to manage IMO, but if you don't think it'll need to be modified often then sure, why not I guess?

Owner

no

- Included examples demonstrating how to access structured data using hash keys in the README.
- Enhanced clarity for users on utilizing the output from the chat with_output_schema method.
…ict modes

- Added detailed explanations for strict mode and non-strict mode in RubyLLM.
- Clarified the behavior of unsupported models in strict mode, including the new error raised.
- Included guidance on using non-strict mode for experimentation with various models.
- Improved overall clarity and structure of the documentation for better user understanding.
…s for structured output

- Added comprehensive implementation details for structured output in RubyLLM, including behavior for OpenAI and other providers.
- Clarified limitations regarding schema validation and response format consistency.
- Enhanced documentation to improve user understanding of structured output features and their current alpha status.
- Removed complex logic from the extract_content method, which previously handled both string content and structured JSON content.
- The method now directly returns the content, streamlining its functionality and improving readability.
- Renamed test descriptions for clarity, indicating that JSON and Hash content are passed through without modification.
- Updated assertions to verify that the content remains unchanged and is valid JSON when applicable.
- Improved test readability by clarifying the expected behavior of the `to_llm` method for different content types.
- Replaced the previous content assignment logic to directly use the message's content attribute.
- This change simplifies the code by removing unnecessary variable assignment for content value, ensuring that the content is updated correctly in the message transaction.
…nce chat providers

- Introduced a new StructuredOutputParser module to handle JSON parsing consistently across providers.
- Updated chat providers (Anthropic, Gemini, OpenAI) to utilize the new structured output parsing logic.
- Enhanced documentation for structured output, detailing behavior in strict and non-strict modes.
- Added tests to verify structured output handling for Gemini in non-strict mode.
- Improved error handling and logging for JSON parsing failures.
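
A centralized parser like the StructuredOutputParser described above might look like this minimal sketch (the module and method names here are assumptions, not the actual PR code):

```ruby
require "json"

# Sketch of a shared parsing helper: try to parse the model's reply as JSON,
# and fall back to the raw string (with a warning) if parsing fails.
module StructuredOutputParserSketch
  def self.parse(content)
    JSON.parse(content)
  rescue JSON::ParserError => e
    warn "structured output was not valid JSON: #{e.message}"
    content
  end
end

parsed   = StructuredOutputParserSketch.parse('{"name":"Alex","age":32}')
fallback = StructuredOutputParserSketch.parse("not json at all")
```

In non-strict mode this kind of fallback matters, since providers without a native schema API may still wrap prose around the JSON.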
…ured output parsing

- Added a new Utils module for the Gemini provider to centralize shared utility methods, including model ID extraction.
- Updated chat and streaming modules to utilize the new Utils module for model ID extraction.
- Enhanced structured output parsing in chat providers to handle JSON content more robustly, ensuring proper error handling.
- Improved documentation for utility methods and structured output parsing behavior.
…vider

- Deleted the render_payload method from the DeepSeek chat provider as it was not utilized in the current implementation.
- This change simplifies the codebase by eliminating unnecessary methods, improving maintainability.
- Changed the `supports_structured_output` field from `false` to `null` for various models to reflect updated support status.
- Adjusted timestamps for multiple models to correct timezone discrepancies, ensuring accurate creation dates.
- Added new models including "Computer Use Preview" and various "Davinci" models with updated attributes.
@@ -15,7 +15,7 @@ module RubyLLM
class ModelInfo
attr_reader :id, :created_at, :display_name, :provider, :metadata,
:context_window, :max_tokens, :supports_vision, :supports_functions,
:supports_json_mode, :input_price_per_million, :output_price_per_million, :type, :family


Is json_mode not a separate thing from structured output (albeit related)?

Contributor Author

Yeah it is kind of, but I don't see us using it anywhere in the code. So I decided to remove/replace it for now. Thoughts?


True. Good point. Might want to add the functionality later.

Owner

I agree with removing that.

# @return [self] Returns self for method chaining
# @raise [ArgumentError] If the schema is not a Hash or valid JSON string
# @raise [UnsupportedStructuredOutputError] If the model doesn't support structured output
def with_output_schema(schema, strict: true)


I know @crmne has already given his pov on naming, but would it make sense to name this with_response_format instead? This is consistent with how OpenAI goes about it, and would allow for a json_mode implementation later.

See here how the structured outputs vs json objects are implemented:
https://platform.openai.com/docs/guides/structured-outputs#structured-outputs-vs-json-mode

Structured output:
response_format: { type: "json_schema", json_schema: {"strict": true, "schema": ...} }

JSON mode:
response_format: { type: "json_object" }
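
To make the difference concrete, here is a minimal hypothetical helper (the name response_format_payload is made up, not OpenAI or RubyLLM API) that builds the two variants:

```ruby
# Sketch only: builds the OpenAI `response_format` field for either
# structured outputs (a JSON schema Hash) or plain JSON mode (:json).
def response_format_payload(format, strict: true)
  case format
  when :json
    { type: "json_object" } # JSON mode
  when Hash
    { type: "json_schema",  # structured outputs
      json_schema: { strict: strict, schema: format } }
  else
    raise ArgumentError, "expected :json or a schema Hash, got #{format.inspect}"
  end
end

schema = { type: "object", properties: { name: { type: "string" } } }
structured = response_format_payload(schema)
json_mode  = response_format_payload(:json)
```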

Contributor Author

Yeah that is what I originally had so I agree! @crmne Open to changing the name?

Contributor Author

I'm trying to make it a nice API as well, since the OpenAI one is so confusing. How about this:

# OpenAI schema with structured output 
chat.with_response_format(schema)

# OpenAI JSON mode
chat.with_response_format(:json)

# Unofficial non-strict mode (otherwise it will raise, since it is not supported)
chat.with_response_format(schema, strict: false)
chat.with_response_format(:json, strict: false)

Thoughts?

@danielfriis danielfriis Apr 19, 2025

I had something similar in mind! With this approach, I would use either schema or :json_object.

I did think about this API also:

# Returning a valid JSON object, but no schema definition
chat.with_response_format(:json)

# JSON adhering to schema
chat.with_response_format(:json, schema: schema)

Contributor Author

@kieranklaassen kieranklaassen Apr 19, 2025

I went with my route, but I like what you have too. I think mine makes the API a bit less verbose and cleaner? Not sure. Imagine these working in the future too:

# With a RubyLLM/Structify schema definition
chat.with_response_format(Candidate)

# With schema Hash/String
chat.with_response_format(schema)

# With json mode
chat.with_response_format(:json)

# XML mode?
chat.with_response_format(:xml)

I like that it's just one argument. @crmne, let us know what you think; API design is important here.
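
As a rough sketch of that single-argument dispatch (class and method bodies here are illustrative assumptions, not the real RubyLLM implementation):

```ruby
require "json"

# Sketch only: one polymorphic argument dispatching between JSON mode,
# a schema Hash, and a JSON string. Not the actual RubyLLM API.
class ChatSketch
  attr_reader :response_format

  def with_response_format(format, strict: true)
    @response_format =
      case format
      when :json  then { mode: :json }
      when Hash   then { mode: :schema, schema: format, strict: strict }
      when String then { mode: :schema, schema: JSON.parse(format), strict: strict }
      else raise ArgumentError, "unsupported format: #{format.inspect}"
      end
    self # return self so calls can chain, e.g. .with_response_format(...).ask(...)
  end
end

chat = ChatSketch.new.with_response_format(:json)
```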


Also, I think your original instinct of calling it with_response_schema was better. That naming is more consistent with the APIs and response_format typically means something else (xml vs json vs html).

I know you were considering merging the 2 types here but I think that violates SRP.

I do think you should add an alias anyways (in case our instincts were wrong)


We don't know if it will get deprecated and as long as it's supported, it makes sense to have RubyLLM allow for easy access to that endpoint...

The use case shouldn't matter, but I've been using it to extract metadata from various objects where I don't know exactly what to extract.

Owner

I like with_response_format


I was thinking about this, and you could even just support with_response_format(:json) which under the hood would use structured outputs type: :object.

Best of both worlds, more cross-api compatible, same result!


After doing some research I can see that using json_schema mode requires the properties keys to work with OpenAI's API, so that's the main difference.

I still wonder if that logic should be abstracted-away here?

Under the hood the provider can switch between json_mode and json_schema, but at the top level the user doesn't really care; they just want with_response_format(:object) or something.

Agree with with_response_format though since others prefer it 👍

@kieranklaassen kieranklaassen changed the title Structured output OpenAI Structured output Apr 19, 2025
Owner

@crmne crmne left a comment

Thank you so much for the effort @kieranklaassen but I feel like the PR needs significant work, in particular:

  1. Is Structured Output really implemented with system prompts in the real SDKs?
  2. Passing the chat object around seems excessive and violates many SE principles. I stopped reviewing there.

You can find more comments there.

CLAUDE.md Outdated
@@ -0,0 +1,23 @@
# CLAUDE.md
Owner

no

Comment on lines 144 to 188
# Adds system message guidance for schema-based JSON output
# If a system message already exists, it appends to it rather than replacing
# @return [self] Returns self for method chaining
def add_system_format_guidance
  guidance = <<~GUIDANCE
    You must format your output as a JSON value that adheres to the following schema:
    #{JSON.pretty_generate(@response_format)}

    Format your entire response as valid JSON that follows this schema exactly.
    Do not include explanations, markdown formatting, or any text outside the JSON.
  GUIDANCE

  update_or_create_system_message(guidance)
  self
end

# Adds guidance for simple JSON output format
# @return [self] Returns self for method chaining
def add_json_guidance
  guidance = <<~GUIDANCE
    You must format your output as a valid JSON object.
    Format your entire response as valid JSON.
    Do not include explanations, markdown formatting, or any text outside the JSON.
  GUIDANCE

  update_or_create_system_message(guidance)
  self
end

# Updates existing system message or creates a new one with the guidance
# @param guidance [String] Guidance text to add to system message
def update_or_create_system_message(guidance)
  system_message = messages.find { |msg| msg.role == :system }

  if system_message
    # Append to existing system message
    updated_content = "#{system_message.content}\n\n#{guidance}"
    @messages.delete(system_message)
    add_message(role: :system, content: updated_content)
  else
    # No system message exists, create a new one
    with_instructions(guidance)
  end
end

Owner

are you absolutely sure we need this? I would be surprised if the official Python SDK would use system prompts to implement response_format: https://github.com/search?q=repo%3Aopenai%2Fopenai-python+response_format&type=code&p=0

Contributor Author

@kieranklaassen kieranklaassen Apr 21, 2025

We need this, yes. OpenAI will give an error if you do not have these or similar instructions in the prompt. For all other providers, there is no API, so we need them. You could leave them out and hope users add these instructions themselves, but considering ease of use, I would add them. However, I'm happy not to add them if you think that's a better design.


To be precise, it's only necessary in JSON mode, not for Structured Output. Personally, I'd prefer to have control over how prompts are formulated. Keep in mind, they may also be in another language.

Contributor Author

Fixed!

@@ -15,7 +15,7 @@ module RubyLLM
class ModelInfo
attr_reader :id, :created_at, :display_name, :provider, :metadata,
:context_window, :max_tokens, :supports_vision, :supports_functions,
:supports_json_mode, :input_price_per_million, :output_price_per_million, :type, :family
Owner

I agree with removing that.

@@ -10,7 +10,7 @@ module Provider
module Methods # rubocop:disable Metrics/ModuleLength
extend Streaming

def complete(messages, tools:, temperature:, model:, &block) # rubocop:disable Metrics/MethodLength
def complete(messages, tools:, temperature:, model:, chat: nil, &block) # rubocop:disable Metrics/MethodLength
Owner

😮

@jayelkaake

2. Passing the chat object around seems excessive and violates many SE principles. I stopped reviewing there.

@kieranklaassen let me know if you want help with updating this to not pass the chat object around. I agree with @crmne and want to move your great initiative along :)

I guess conflicts need to be resolved too now.

@kieranklaassen
Contributor Author

kieranklaassen commented Apr 21, 2025

@crmne Thanks for the review. I pushed the changes requested. Do you still feel this needs a lot of work? If yes, could you explain your feeling so I can try to understand what needs to be done?

If you feel good about this direction I can finalize tests, I need to review and see coverage.

This is my first time contributing to open source, so any guidance is appreciated.

@kieranklaassen kieranklaassen requested a review from crmne April 21, 2025 14:46
@jayelkaake

jayelkaake commented Apr 21, 2025

FYI, I decided to take a stab at this and took a slightly different approach, which addresses some of @crmne's concerns: #131

Let me know what you guys think!

PS @kieranklaassen if you want to merge our PRs we can do that too. We're in this together 😄

@kieranklaassen
Contributor Author

kieranklaassen commented Apr 22, 2025

Closing this one to let @jayelkaake take it from here! LFG
