Structured output response support #11
I got one, maybe for more inspiration https://github.com/kieranklaassen/structify |
nice, was also thinking about looking into this. happy to help out if needed |
Amazing gem! Structured output support would be very helpful indeed. Maybe it could be as simple as providing the schema as a ruby hash. Here's another example: https://mdoliwa.com/how-to-use-structured-outputs-with-ruby-openai-gem |
I’ve been using alexrudall/ruby-openai#508 (comment) for quite some time. Whatever direction we take should be different, because that approach is really brittle. @kieranklaassen your thing looks great; I vote we use it as a dependency in this repo |
I think we’d want an implementation for schema definition and structured responses that doesn’t have to rely on rails given this is a Ruby gem that can optionally be used with rails. I’ve also used this before and it works well ( alexrudall/ruby-openai#508 (comment) ) but I really like the syntax of structify @kieranklaassen shared, especially the native chain of thought capability (which really helps improve output quality). Seems like maybe the essence of structify’s approach can be tapped into in this gem without the dependency of rails/rails models? |
@kieranklaassen are you comfortable if I point Claude 3.7 at your repo and have it extract/remove activerecord as a dependency? |
I actually just used repomix to pack both rubyllm and structify's repos to use with claude 3.7 to ask how to combine the capabilities, and it created a reasonably good plan from my quick skimming. I don't have time right now to create a PR, but I'll share its output here for now:

RubyLLM Gem Code Changes

lib/ruby_llm/schema.rb

module RubyLLM
# Schema class for defining structured output formats
class Schema
attr_reader :title, :description, :version, :fields, :thinking_enabled
# Class-level DSL methods for defining schemas
class << self
attr_reader :schema_title, :schema_description, :schema_version,
:schema_fields, :schema_thinking_enabled
# Set the schema title
# @param name [String] The title
def title(name)
@schema_title = name
end
# Set the schema description
# @param desc [String] The description
def description(desc)
@schema_description = desc
end
# Set the schema version
# @param num [Integer] The version number
def version(num)
@schema_version = num
end
# Enable or disable thinking mode
# @param enabled [Boolean] Whether to enable thinking mode
def thinking(enabled)
@schema_thinking_enabled = enabled
end
# Define a field in the schema
# @param name [Symbol] Field name
# @param type [Symbol] Field type (:string, :integer, :number, :boolean, :array, :object)
# @param required [Boolean] Whether field is required
# @param description [String] Field description
# @param enum [Array] Possible values for the field
# @param items [Hash] For array type, schema for array items
# @param properties [Hash] For object type, properties of the object
# @param options [Hash] Additional options
def field(name, type, required: false, description: nil, enum: nil,
items: nil, properties: nil, **options)
@schema_fields ||= []
field_def = {
name: name,
type: type,
required: required,
description: description
}
field_def[:enum] = enum if enum
field_def[:items] = items if items && type == :array
field_def[:properties] = properties if properties && type == :object
# Add any additional options
options.each { |k, v| field_def[k] = v }
@schema_fields << field_def
end
# Create an instance from class definition
def create_instance
new(
title: @schema_title,
description: @schema_description,
version: @schema_version || 1,
thinking: @schema_thinking_enabled || false,
fields: @schema_fields || []
)
end
end
def initialize(title: nil, description: nil, version: 1, thinking: false, fields: nil, &block)
@title = title
@description = description
@version = version
@thinking_enabled = thinking
@fields = fields || []
instance_eval(&block) if block_given?
end
# Define a field in the schema
# @param name [Symbol] Field name
# @param type [Symbol] Field type (:string, :integer, :number, :boolean, :array, :object)
# @param required [Boolean] Whether field is required
# @param description [String] Field description
# @param enum [Array] Possible values for the field
# @param items [Hash] For array type, schema for array items
# @param properties [Hash] For object type, properties of the object
# @param options [Hash] Additional options
def field(name, type, required: false, description: nil, enum: nil,
items: nil, properties: nil, **options)
field_def = {
name: name,
type: type,
required: required,
description: description
}
field_def[:enum] = enum if enum
field_def[:items] = items if items && type == :array
field_def[:properties] = properties if properties && type == :object
# Add any additional options
options.each { |k, v| field_def[k] = v }
@fields << field_def
end
# Enable or disable thinking mode
# @param enabled [Boolean] Whether to enable thinking mode
def thinking(enabled)
@thinking_enabled = enabled
end
# Set the schema title
# @param name [String] The title
def title(name)
@title = name
end
# Set the schema description
# @param desc [String] The description
def description(desc)
@description = desc
end
# Set the schema version
# @param num [Integer] The version number
def version(num)
@version = num
end
# Convert to JSON schema for various LLM providers
def to_json_schema
required_fields = @fields.select { |f| f[:required] }.map { |f| f[:name].to_s }
properties = {}
# Add chain_of_thought field if thinking mode is enabled
if @thinking_enabled
properties["chain_of_thought"] = {
type: "string",
description: "Explain your thinking process step by step before determining the final values."
}
end
# Add all other fields
@fields.each do |f|
prop = { type: f[:type].to_s }
prop[:description] = f[:description] if f[:description]
prop[:enum] = f[:enum] if f[:enum]
# Handle array specific properties
if f[:type] == :array && f[:items]
prop[:items] = f[:items]
end
# Handle object specific properties
if f[:type] == :object && f[:properties]
prop[:properties] = {}
object_required = []
f[:properties].each do |prop_name, prop_def|
prop[:properties][prop_name] = prop_def.dup
if prop_def[:required]
object_required << prop_name
prop[:properties][prop_name].delete(:required)
end
end
prop[:required] = object_required unless object_required.empty?
end
properties[f[:name].to_s] = prop
end
# Return the complete schema
{
name: @title,
description: @description,
parameters: {
type: "object",
required: required_fields,
properties: properties
}
}
end
end
# Response parser class that converts structured JSON responses to Ruby objects
class StructuredResponse
attr_reader :raw_data, :schema
def initialize(raw_data, schema)
@raw_data = raw_data
@schema = schema
parse_response
end
def method_missing(method, *args, &block)
key = method.to_s
if @parsed_data && @parsed_data.key?(key)
@parsed_data[key]
else
super
end
end
def respond_to_missing?(method, include_private = false)
key = method.to_s
(@parsed_data && @parsed_data.key?(key)) || super
end
private
def parse_response
@parsed_data = {}
# Skip parsing if empty response
return if @raw_data.nil? || @raw_data.empty?
# For each field in schema, extract and convert the value
@schema.fields.each do |field|
field_name = field[:name].to_s
next unless @raw_data.key?(field_name)
value = @raw_data[field_name]
@parsed_data[field_name] = convert_value(value, field)
end
# Include chain_of_thought if present
if @schema.thinking_enabled && @raw_data.key?("chain_of_thought")
@parsed_data["chain_of_thought"] = @raw_data["chain_of_thought"]
end
end
def convert_value(value, field)
case field[:type]
when :integer
value.to_i
when :number
value.to_f
when :boolean
!!value
when :array
Array(value)
when :object
value.is_a?(Hash) ? value : {}
else
value.to_s
end
end
end
end

lib/ruby_llm/chat_extensions.rb

module RubyLLM
class Chat
# Add a method for structured output
def with_structured_output(schema_or_definition = nil, &block)
schema = if schema_or_definition.is_a?(Schema)
schema_or_definition
elsif schema_or_definition.is_a?(Class) && schema_or_definition < Schema
schema_or_definition.create_instance
else
Schema.new(&block)
end
StructuredChat.new(self, schema)
end
end
# StructuredChat wraps a regular Chat but adds structured output capabilities
class StructuredChat
def initialize(chat, schema)
@chat = chat
@schema = schema
end
# Forward most methods to the underlying chat
def method_missing(method, *args, &block)
@chat.send(method, *args, &block)
end
def respond_to_missing?(method, include_private = false)
@chat.respond_to?(method, include_private) || super
end
# Override ask to handle structured responses
def ask(prompt, **options)
# Get the schema
schema = @schema.to_json_schema
# Determine which LLM provider we're using and adapt accordingly
case @chat.provider.class.name
when /OpenAI/
# For OpenAI, use JSON mode with response_format
# First, prepare our system message with schema information
system_message = "You will extract structured information according to a specific schema."
if schema[:description]
system_message += " " + schema[:description]
end
# Prepare all messages including the system message
all_messages = [
{role: 'system', content: system_message}
] + @chat.conversation.messages + [{role: 'user', content: prompt}]
# Call the API with JSON response format
response = @chat.send(:perform_request,
model: options[:model] || @chat.provider.default_model,
messages: all_messages,
response_format: {
type: "json_object",
schema: schema[:parameters]
},
temperature: options[:temperature] || 0.7
)
# Extract JSON from response
if response['choices'] && response['choices'][0]['message']['content']
begin
# Parse the JSON response
json_response = JSON.parse(response['choices'][0]['message']['content'])
# Add response to conversation
@chat.conversation.add('assistant', response['choices'][0]['message']['content'])
# Create structured response object
StructuredResponse.new(json_response, @schema)
rescue JSON::ParserError => e
# Handle parsing error
response_content = response['choices'][0]['message']['content']
@chat.conversation.add('assistant', response_content)
# Log error but return the raw response
puts "Warning: Could not parse JSON response: #{e.message}"
response_content
end
else
# Handle unexpected response format
response_content = response['choices'][0]['message']['content'] rescue "No response content"
@chat.conversation.add('assistant', response_content)
response_content
end
when /Anthropic/
# For Anthropic (Claude), use their tool calling
tool = {
type: "function",
function: {
name: "extract_structured_data",
description: schema[:description] || "Extract structured data based on the provided schema",
parameters: schema[:parameters]
}
}
# Prepare the messages
messages = @chat.conversation.messages.map do |msg|
{ role: msg[:role], content: msg[:content] }
end
messages << { role: 'user', content: prompt }
# Prepare tool choice
tool_choice = {
type: "function",
function: { name: "extract_structured_data" }
}
# Prepare request options
request_options = options.merge(
messages: messages,
tools: [tool],
tool_choice: tool_choice
)
# Perform the request
response = @chat.send(:perform_request, request_options)
# Extract tool outputs from response
if response['content'] && response['content'][0]['type'] == 'tool_use'
tool_use = response['content'][0]['tool_use']
function_args = JSON.parse(tool_use['arguments'])
# Add response to conversation
text_content = response['content'].find { |c| c['type'] == 'text' }
@chat.conversation.add('assistant', text_content ? text_content['text'] : '')
# Create structured response object
StructuredResponse.new(function_args, @schema)
else
# Handle regular response
response_content = response['content'][0]['text']
@chat.conversation.add('assistant', response_content)
response_content
end
else
# For other providers, use standard prompt engineering
system_prompt = "You must respond with a valid JSON object that follows this schema: #{schema.to_json}. " +
"Do not include any explanatory text outside the JSON."
# Add system prompt
@chat.conversation.add('system', system_prompt)
# Call the API
response = @chat.ask(prompt, **options)
# Try to parse response as JSON
begin
json_response = JSON.parse(response)
StructuredResponse.new(json_response, @schema)
rescue JSON::ParserError
# If parsing fails, return raw response
response
end
end
end
end
end

Usage Examples

require 'ruby_llm'
# Example 1: Define a schema class
class ArticleSchema < RubyLLM::Schema
title "Article Extraction"
description "Extract key information from article text"
field :title, :string, required: true, description: "The article's title"
field :summary, :text, description: "A brief summary of the article"
field :category, :string, enum: ["tech", "business", "science"]
field :tags, :array, items: { type: "string" }
field :author_info, :object, properties: {
"name" => { type: "string", required: true },
"email" => { type: "string" }
}
end
# Use the schema class
chat = RubyLLM::Chat.new
response = chat.with_structured_output(ArticleSchema)
.ask("Here's an article about AI developments: [article text...]")
# Now you can access fields as methods
puts "Title: #{response.title}"
puts "Summary: #{response.summary}"
puts "Category: #{response.category}"
puts "Tags: #{response.tags.join(', ')}"
puts "Author: #{response.author_info['name']}"
# Example 2: Define a schema inline
response = RubyLLM::Chat.new.with_structured_output do
title "Article Extraction"
description "Extract key information from article text"
thinking true # Enable chain of thought reasoning
field :title, :string, required: true, description: "The article's title"
field :summary, :text, description: "A brief summary of the article"
field :sentiment, :string, enum: ["positive", "neutral", "negative"]
end.ask("Here's a news article about climate change: [article text...]")
# Access the chain of thought reasoning
puts "Reasoning: #{response.chain_of_thought}"
puts "Title: #{response.title}"
puts "Summary: #{response.summary}"
puts "Sentiment: #{response.sentiment}"
# Example 3: Combining with tools
calculator = RubyLLM::Tool.new("Calculator", "Performs calculations") do |expression|
eval(expression).to_s
end
product_schema = RubyLLM::Schema.new do
field :name, :string, required: true
field :price, :number, required: true
field :quantity, :integer, required: true
field :total_cost, :number, required: true
field :categories, :array, items: { type: "string" }
end
# Combine tools with structured output
response = chat.with_tool(calculator)
.with_structured_output(product_schema)
.ask("I need a product entry for a MacBook Air that costs $1299. I'm ordering 3 units.")
puts "Product: #{response.name}"
puts "Price: $#{response.price}"
puts "Quantity: #{response.quantity}"
puts "Total Cost: $#{response.total_cost}" # The LLM likely used the calculator to compute this

Integrating Structured Output into RubyLLM

This guide explains how to integrate the structured output capabilities into the existing RubyLLM gem.

Step 1: Add New Files

Add these new files to the gem:
Step 2: Update the Main File

Modify:

require "ruby_llm/version"
require "ruby_llm/providers"
require "ruby_llm/conversation"
require "ruby_llm/chat"
require "ruby_llm/tool"
require "ruby_llm/schema" # Add this line
require "ruby_llm/structured_chat" # Add this line
module RubyLLM
class Error < StandardError; end
# ... rest of the file
end

Step 3: Add Method to Chat Class

Add the with_structured_output method:

module RubyLLM
class Chat
# ... existing methods
# Add a method for structured output
def with_structured_output(schema_or_definition = nil, &block)
schema = schema_or_definition.is_a?(Schema) ?
schema_or_definition :
Schema.new(&block)
StructuredChat.new(self, schema)
end
end
end

Step 4: Update Tests

Add new test files:

Step 5: Update README and Documentation

Update the README.md with examples of using structured output, similar to the usage examples provided.

Step 6: Version Update

Increment the gem version.

Step 7: Update Changelog

Add an entry to CHANGELOG.md describing the new structured output feature.

Step 8: Integration with Existing Features

Ensure that structured output works well with existing features like tools, by testing combinations of methods like with_tool and with_structured_output. |
(completely untested btw, not sure how well it would work but it's a good starting point it looks like) |
Structured outputs are definitely on the roadmap. I like Kieran's approach with structify, but RubyLLM is committed to staying Rails-optional. The code sample shared looks promising but has some rough edges - "thinking enabled" doesn't fit our design philosophy, and the response parsing should live within provider modules rather than being generic. If anyone wants to tackle this, I'd love to see a PR that maintains our core principles:
Check out sergiobayona/easy_talk and other alternatives before reinventing the wheel - but let's make sure our implementation is the cleanest in the ecosystem. |
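To make the "response parsing should live within provider modules" point concrete, here is a minimal sketch of what provider-scoped extraction could look like. All module and method names here are hypothetical illustrations, not RubyLLM's actual API; the Anthropic payload shape assumes the `tool_use`/`input` format of Claude's messages API.

```ruby
require "json"

module RubyLLM
  module Providers
    module OpenAI
      module StructuredOutputs
        # Hypothetical helper: pull the structured JSON out of an
        # OpenAI-style chat completion payload.
        def self.parse(response)
          content = response.dig("choices", 0, "message", "content")
          content && JSON.parse(content)
        end
      end
    end

    module Anthropic
      module StructuredOutputs
        # Hypothetical helper: Claude returns tool arguments in a
        # tool_use content block, so its extraction logic differs.
        def self.parse(response)
          block = Array(response["content"]).find { |c| c["type"] == "tool_use" }
          block && block["input"]
        end
      end
    end
  end
end
```

The generic `StructuredChat#ask` would then delegate to whichever module matches the active provider, keeping the `case ... when /OpenAI/` branching out of the shared code path.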
Can you elaborate on:
|
Would love to have this feature implemented!!!!!!!! |
I'm personally going to use kieranklaassen/structify with ruby_llm most likely, because I need it to integrate with Rails. I would love to see BYO here. What I do not like with most libraries is that they do too many things at once; I'd rather pick my faves and plug them together. @crmne I like the feel of your lib a lot. I have also looked at every single other schema definition lib but have not found any that I like. Maybe there is a way to extract something from kieranklaassen/structify or be inspired by it. I'll keep you all posted on how it goes using the two together. |
@adenta In the code sample shared earlier, there's parsing logic (JSON.parse, digs, etc.) in chat_extensions.rb. This doesn't align with RubyLLM's architecture, where parsing logic belongs in provider modules, as it's most likely provider-specific. I'd want to see that parsing code moved to provider-specific modules like lib/ruby_llm/providers/openai/structured_outputs.rb to maintain our clean separation of concerns.

@kieranklaassen Thank you! I've been thinking more about this, and I believe we can create something even better than what's out there. What if we had a schema definition that felt truly Ruby-native?

class Delivery < RubyLLM::Schema
datetime :timestamp
array :dimensions
string :address, required: true
end
# Then use it like:
response = chat.with_structured_output(Delivery)
.ask("Extract delivery info from: Next day delivery to 123 Main St...")
puts response.timestamp # => 2025-03-20 14:30:00
puts response.dimensions # => [12, 8, 4]
puts response.address # => "123 Main St, Springfield"

This approach would be lightweight, expressive, and completely framework-agnostic. RubyLLM would handle adapting it to each provider's specific structured output mechanism. I'd be very interested in collaborating on extracting something like this from structify. The core schema definition could be a standalone gem that other libraries could build upon. If you're up for it, let's think about what that might look like! |
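A minimal sketch of how that class-level DSL could be implemented. Everything here is an illustrative assumption rather than RubyLLM's real code: the tiny type table, the per-class field registry, and the way `datetime` maps onto JSON Schema's `string` type with a `date-time` format.

```ruby
module RubyLLM
  # Sketch: class-method DSL that records fields on each subclass
  # and compiles them into a JSON Schema hash.
  class Schema
    TYPES = { string: "string", number: "number", datetime: "string",
              array: "array", boolean: "boolean" }.freeze

    class << self
      def fields
        @fields ||= [] # per-subclass registry (class-level ivar)
      end

      # Define one DSL method per supported type, e.g. `string :address`.
      TYPES.each_key do |type|
        define_method(type) do |name, required: false|
          fields << { name: name, type: type, required: required }
        end
      end

      def to_json_schema
        properties = fields.to_h do |f|
          prop = { type: TYPES[f[:type]] }
          prop[:format] = "date-time" if f[:type] == :datetime
          [f[:name].to_s, prop]
        end
        {
          type: "object",
          required: fields.select { |f| f[:required] }.map { |f| f[:name].to_s },
          properties: properties
        }
      end
    end
  end
end

# Usage matches the proposed syntax:
class Delivery < RubyLLM::Schema
  datetime :timestamp
  array :dimensions
  string :address, required: true
end
```

`Delivery.to_json_schema` then yields the provider-ready hash, and each provider module only ever sees plain JSON Schema.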
chat.ask("Extract delivery info from: Next day delivery to 123 Main St...", with_structured_output: Delivery)

Sounds better IMO
|
After a quick search, I think we should make use of RBS https://www.honeybadger.io/blog/ruby-rbs-type-annotation/ as it's Ruby-native and very pretty:

class Delivery
@timestamp : DateTime? # ? means it can be nil
@dimensions : Array[Integer]?
@address : String
end |
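One way to gauge whether RBS-style annotations carry enough information for structured outputs is a toy mapping from type strings like `Array[Integer]?` to JSON Schema fragments. This is a hand-rolled illustration using regexes, not the `rbs` gem's actual parser, and the scalar table is an assumption:

```ruby
# Toy converter from RBS-ish type strings to JSON Schema fragments.
SCALARS = {
  "String" => "string", "Integer" => "integer",
  "Float" => "number", "DateTime" => "string", "bool" => "boolean"
}.freeze

def rbs_type_to_json_schema(type)
  nilable = type.end_with?("?")       # trailing ? means the value may be nil
  base = type.delete_suffix("?")
  schema =
    if (m = base.match(/\AArray\[(.+)\]\z/))
      { type: "array", items: rbs_type_to_json_schema(m[1]) }
    else
      { type: SCALARS.fetch(base) }
    end
  schema[:nullable] = true if nilable
  schema
end
```

Notably, things like enums and min/max bounds have no natural home in a bare type signature, which is the main trade-off raised below.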
I like the PORO route. What about nested, more complex JSON schemas? Enums, min/max, etc.? For me it also should be easy to use in Rails, since that is where most people would use and store it - that's why I created structify. But it would be great if we can create a kind of Ruby pydantic that works well for LLMs. I have not seen anything whose syntax I like. I'm not sure I love RBS type annotations - are they extensible enough? Love the collaboration here! |
I wouldn't couple Rails with the RubyLLM interface for structured outputs. I would like to be able to pass in some data structure that represents the schema I am looking for, which then gets used in the LLM API calls, that will then feedback out a JSON / hash structure with the response from the LLM. Moving in and out of Rails objects, I could do that on my own side decoupled from the interaction with the LLM for the purposes of coding my own business logic / etc. |
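The decoupled contract described above - pass in a plain data structure, get a plain hash back - could look like the following sketch. The schema hash and the `parse_structured` helper are hypothetical; the actual LLM call is elided, since only the hash-in, hash-out boundary matters here:

```ruby
require "json"

# A plain JSON Schema hash the caller owns; no Rails, no schema gem.
DELIVERY_SCHEMA = {
  type: "object",
  required: ["address"],
  properties: {
    address:    { type: "string" },
    dimensions: { type: "array", items: { type: "number" } }
  }
}.freeze

# Hypothetical post-processing: parse the raw model output and check
# that the schema's required keys are present, raising if not.
def parse_structured(raw_json, schema)
  data = JSON.parse(raw_json)
  missing = schema[:required].reject { |k| data.key?(k) }
  raise ArgumentError, "missing keys: #{missing.join(', ')}" unless missing.empty?
  data
end
```

Mapping the resulting hash onto ActiveRecord models (or anything else) then stays entirely on the caller's side.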
hey there glad to see my gem mentioned sergiobayona/easy_talk would gladly help integrating with ruby_llm. |
I'm thinking it would be really great if we can bring our own system too, since we all need something slightly different. Like we can use @sergiobayona's sergiobayona/easy_talk or my more Rails-focused kieranklaassen/structify. Maybe an interface like this:

# RubyLLM
class Delivery
@timestamp : DateTime? # ? means it can be nil
@dimensions : Array[Integer]?
@address : String
end
# Structify
class Delivery < ApplicationRecord
include Structify::Model
schema_definition do
field :timestamp, :datetime
field :dimensions, :array, items: { type: "number" }
field :address, :string, required: true
end
end
# Easytalk
class Delivery
include EasyTalk::Model
define_schema do
property :timestamp, DateTime
property :dimensions, T::Array[Float], optional: true
property :address, String
end
end
# Then use it like:
response = chat.with_structured_output(Delivery)
.ask("Extract delivery info from: Next day delivery to 123 Main St...")
puts response.timestamp # => 2025-03-20 14:30:00
puts response.dimensions # => [12, 8, 4]
puts response.address # => "123 Main St, Springfield"

We can use an adapter pattern. Thoughts? Also maybe we can add a raw_response, so you can do your own processing? |
I support the above proposal. At the end of the day the schema is getting evaluated down to jsonschema, so however one wants to get there should be fine |
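Since everything evaluates down to JSON Schema anyway, the adapter can be tiny: RubyLLM only ever asks a schema object one question. A minimal sketch, where the module name and the duck-typed method names are assumptions rather than the real APIs of easy_talk or structify:

```ruby
# Sketch of the adapter idea: wrap any schema system behind a single
# "give me your JSON Schema" entry point.
module SchemaAdapter
  def self.to_json_schema(schema)
    if schema.respond_to?(:to_json_schema)   # a RubyLLM-native schema class
      schema.to_json_schema
    elsif schema.respond_to?(:json_schema)   # e.g. an easy_talk-style model
      schema.json_schema
    elsif schema.is_a?(Hash)                 # bring-your-own raw hash
      schema
    else
      raise ArgumentError, "don't know how to turn #{schema.class} into JSON Schema"
    end
  end
end
```

`with_structured_output` would call this once, and the provider modules would never need to know which library produced the schema. A `raw_response` accessor could live alongside the parsed result for callers who want to post-process themselves.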
This is how Langchain does it:

parser = Langchain::OutputParsers::StructuredOutputParser.from_json_schema(json_schema)
prompt = Langchain::Prompt::PromptTemplate.new(template: "Generate details of a fictional character.\n{format_instructions}\nCharacter description: {description}", input_variables: ["description", "format_instructions"])
prompt_text = prompt.format(description: "Korean chemistry student", format_instructions: parser.get_format_instructions)

While I do not like this long name |
Question for the community: can you all post some JSON schemas, to get an idea of all the shapes? This is one of mine:

{
"name": "newsletter_summary",
"description": "Extracts essential metadata from a newsletter or digest-style email and returns a detailed summary. Follows these writing rules: always use past tense (e.g., 'Matt asked'), be concrete and specific, and if there is a question in the subject, provide an answer in the summary. For digest emails, summarize the main story in one extensive paragraph, then list the three most important stories or points in three concise sentences (or lines).",
"parameters": {
"type": "object",
"required": [
"title",
"summary",
"priority",
"labels"
],
"properties": {
"chain_of_thought": {
"type": "string",
"description": "Explain your thought process step by step before determining the final values."
},
"title": {
"type": "string"
},
"summary": {
"type": "text"
},
"priority": {
"type": "integer",
"description": "Priority rating for the newsletter: 0 means it's mostly promotional or unimportant, 1 indicates it has moderate usefulness, and 2 indicates high value and relevance.",
"enum": [
0,
1,
2
]
},
"newsletter_url": {
"type": "string",
"description": "Direct URL to the web version of the newsletter if available, otherwise null. Do NOT include any unsubscribe link."
},
"cover_image_url": {
"type": "string",
"description": "URL of the newsletter's cover image if present, otherwise null."
},
"labels": {
"type": "array",
"description": "Relevant thematic tags describing the newsletter content, e.g. 'AI', 'Productivity', 'Leadership'.",
"items": {
"type": "string"
}
}
}
}
}
|
Hey folks, just reviving the discussion here. Can you post some JSON schemas you are using as @kieranklaassen said? That'll help us prioritize and implement it better. |
Here is one where I was generating fake twitch chat messages to test my Pokemon Showdown bot
Picking a button in a game of pokemon fire red
Choosing a place to go to on a map based on a list of map coordinates
|
@adenta could you send us the JSON version of all of them? |
Done! |
I started implementing this a bit, see what I can do this week, pushing as I go to: #65 |
Just adding this gist for inspiration. IMO a clean way to define a schema, similar to how tools are defined now in RubyLLM.

class MathReasoning < StructuredOutputs::Schema
def initialize
super do
define :step do
string :explanation
string :output
end
array :steps, items: ref(:step)
string :final_answer
end
end
end |
Here's an example JSON schema I'm looking to implement:

{
"name": "question_extractor",
"description": "Extract structured data from a list of questions",
"parameters": {
"type": "object",
"required": [
"questions"
],
"properties": {
"questions": {
"type": "array",
"items": {
"type": "object",
"properties": {
"text": {
"type": "string"
},
"speaker": {
"type": "string"
},
"start_time": {
"type": "number",
"multipleOf": 0.01
},
"end_time": {
"type": "number",
"multipleOf": 0.01
}
},
"required": [
"text",
"speaker",
"start_time",
"end_time"
]
}
}
}
}
} |
Thanks for the excitement about structured output support and the draft PRs @kieranklaassen and @danielfriis . I need to take a deep dive into this before deciding anything since it's not something I personally use, so don't expect quick answers here or in the PRs. However, you all can help with information:
My preference - as you may have guessed - goes to clean, elegant, beautiful DSLs. |
Long response coming in; I have thoughts from needing this in my app and trying to find a solution for 2 years. I think RubyLLM is the one! Thanks @crmne. I'm completely open to however we decide to implement this. As the repo owner, you're the one who should make the final call on the approach - I'm just providing options and considerations to help inform that decision.

Design Considerations

When implementing structured output support for RubyLLM, several key considerations should guide the design:
Proposed DSL Approaches

1. Simple Class-based Schema Definition

Define schemas using plain Ruby objects with a json_schema class method:

class Delivery
attr_accessor :timestamp, :dimensions, :address
def self.json_schema
{
type: "object",
properties: {
timestamp: { type: "string", format: "date-time" },
dimensions: { type: "array", items: { type: "number" } },
address: { type: "string" }
},
required: ["address"]
}
end
end
# Usage
response = chat.with_response_format(Delivery)
.ask("Extract delivery info from: Next day delivery to 123 Main St...")
# Response object
puts response.timestamp # => "2025-03-20T14:30:00Z"
puts response.dimensions # => [12, 8, 4]
puts response.address # => "123 Main St, Springfield"

Advantages:
Disadvantages:
2. RBS Type Signatures

Leverage Ruby's type system using RBS syntax:

class Delivery
@timestamp : DateTime? # ? means optional
@dimensions : Array[Float]?
@address : String
end
# Usage
response = chat.with_response_format(Delivery)
.ask("Extract delivery info from: Next day delivery to 123 Main St...")
# Response object
puts response.timestamp # => #<DateTime: 2025-03-20T14:30:00>
puts response.dimensions # => [12.0, 8.0, 4.0]
puts response.address # => "123 Main St, Springfield"

Advantages:
Disadvantages:
3. Schema DSL with Method Chaining

A more expressive DSL for schema definition:

class Delivery < RubyLLM::Schema
string :name, required: true
number :price
array :dimensions, items: { type: "number" }
object :address do
string :street
string :city
string :zip
end
end
# Usage
response = chat.with_structured_output(Delivery)
.ask("Extract: John bought a package for $25, size 12x8x4, sent to 123 Main St, Springfield, 12345")
# Response object
puts response.name # => "John"
puts response.price # => 25
puts response.dimensions # => [12, 8, 4]
puts response.address.street # => "123 Main St"
puts response.address.city # => "Springfield"
puts response.address.zip # => "12345" Advantages:
Disadvantages:
#### 4. Adapter Pattern for Multiple Schema Systems

Support multiple schema definition styles:

```ruby
# RubyLLM's native schema definition
class Delivery < RubyLLM::Schema
  string :address, required: true
  array :dimensions
end

# Or use your own schema library:
require 'easy_talk'

class DeliveryET
  include EasyTalk::Model

  define_schema do
    property :address, String
    property :dimensions, T::Array[Float], optional: true
  end
end

# Usage with any schema system
response = chat.with_response_format(DeliveryET)
               .ask("Extract delivery info...")

# Response returns a DeliveryET instance
puts response.address    # => "123 Main St, Springfield"
puts response.dimensions # => [12.0, 8.0, 4.0]
```

Advantages:

Disadvantages:
### Real-World Examples

#### Product Information Extraction

```ruby
class Product < RubyLLM::Schema
  string :name, required: true
  string :category, enum: ["Electronics", "Clothing", "Food", "Other"]
  number :price, required: true
  array :features, items: { type: "string" }
  boolean :in_stock
end

response = chat.with_response_format(Product)
               .ask("Extract details: iPhone 15, $999, Electronics, features: A17 chip, 48MP camera. In stock: yes")

puts response.name     # => "iPhone 15"
puts response.category # => "Electronics"
puts response.price    # => 999
puts response.features # => ["A17 chip", "48MP camera"]
puts response.in_stock # => true
```

#### Customer Sentiment Analysis

```ruby
class FeedbackAnalysis < RubyLLM::Schema
  string :sentiment, enum: ["positive", "neutral", "negative"], required: true
  number :satisfaction_score, minimum: 1, maximum: 10
  array :key_concerns do
    string
  end
  array :positive_points do
    string
  end
  string :summary, required: true
end

feedback = "The product works well but shipping took too long and the packaging was damaged."
response = chat.with_response_format(FeedbackAnalysis)
               .ask("Analyze this customer feedback: #{feedback}")

puts response.sentiment          # => "neutral"
puts response.satisfaction_score # => 6
puts response.key_concerns       # => ["slow shipping", "damaged packaging"]
puts response.positive_points    # => ["product works well"]
puts response.summary            # => "Customer is satisfied with product functionality but disappointed with shipping experience"
```

### Method Naming Options

```ruby
# Option 1: Focus on response format
chat.with_response_format(Schema)
    .ask("...") # => Returns a Schema-based object

# Option 2: Focus on structured data
chat.with_structured_output(Schema)
    .ask("...") # => Returns a Schema-based object

# Option 3: Focus on schema
chat.with_schema(Schema)
    .ask("...") # => Returns a Schema-based object

# Option 4: Focus on extraction
chat.extract_as(Schema)
    .ask("...") # => Returns a Schema-based object
```

### Custom Parsers

Since we're already building a parser system for JSON, it makes sense to extend this framework to support additional response formats. The custom parser system allows for future extensibility without changing the core API.

#### Why Support Custom Parsers?

While structured JSON output handles most data extraction needs, LLMs can generate content in many other formats:
By leveraging the same parsing architecture we use for JSON, we can provide a consistent interface for all response formats.

#### Custom Parser Examples

```ruby
# XML Parser
response = chat.with_parser(:xml, tag: 'data')
               .ask("Can you provide the answer in XML? <data>42</data>")
puts response.content # => "42"

# Regex Parser
email = chat.with_parser(:regex, pattern: 'Email: ([a-zA-Z0-9._%+-]+@[a-zA-Z0-9.-]+\.[a-zA-Z]{2,})')
            .ask("My email is: Email: [email protected]")
puts email # => "[email protected]"

# CSV Parser
RubyLLM::ResponseParser.register(:csv, CsvParser)
result = chat.with_parser(:csv)
             .ask("Give me a CSV with name,age,city for 3 people")
puts result.first # => {"name"=>"John", "age"=>"30", "city"=>"New York"}
```

#### Building Your Own Parser

Custom parsers are simple modules that implement a standard interface:

```ruby
module CsvParser
  def self.parse(response, options)
    # Skip processing if not a string
    return response unless response.content.is_a?(String)

    rows = response.content.strip.split("\n")
    headers = rows.first.split(',')
    rows[1..-1].map do |row|
      values = row.split(',')
      headers.zip(values).to_h
    end
  end
end

# Register your parser
RubyLLM::ResponseParser.register(:csv, CsvParser)
```

This extensible design ensures RubyLLM can evolve to handle any response format needed in the future, while sharing the same infrastructure we build for structured JSON output.

### Recommended Approach

A hybrid approach combining a clean Ruby-native DSL with adapter support offers the best balance:
This approach provides a clean, flexible API that feels natural to Ruby developers while allowing integration with existing code.

### Open Questions

#### Rails Integration and Persistence

Since RubyLLM already has persistence built in with chat storage, how should we approach structured output persistence with Rails? The key question is whether we should store the raw response, the parsed response, or both. Storing only the parsed response is more space-efficient but loses information. Storing both provides the most flexibility but requires more storage. The best approach may depend on specific use cases and how often users need to reprocess historical responses with new schemas. |
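To make the hybrid/adapter idea above concrete, here is one way schema resolution could be sketched with plain duck-typing. All names here (`SchemaResolver`, `json_schema`, `to_json_schema`) are illustrative assumptions, not existing RubyLLM API:

```ruby
# Hypothetical sketch: resolve any schema-like input to a raw JSON Schema hash.
# Raw hashes pass through; schema classes are asked via duck-typed hooks.
module SchemaResolver
  def self.resolve(schema)
    return schema if schema.is_a?(Hash)                                  # raw JSON schema
    return schema.json_schema if schema.respond_to?(:json_schema)        # native-style class
    return schema.to_json_schema if schema.respond_to?(:to_json_schema)  # third-party adapter hook
    raise ArgumentError, "#{schema.inspect} cannot be converted to a JSON schema"
  end
end

# A class following the hypothetical native convention:
class Delivery
  def self.json_schema
    { type: "object", properties: { address: { type: "string" } }, required: ["address"] }
  end
end
```

With this shape, `with_response_format` could accept a hash, a native schema class, or any object exposing a conversion hook, without RubyLLM depending on any particular schema library.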
@kieranklaassen I'm on board with your recommended approach, but I suggest we limit the first PR to just the first method (Primary Schema Definition) and then expand from there in other PRs, just to keep changes as small and contained as possible. Similarly, I would argue for adding custom parsers in new, separate PRs. Also, while I like the idea of parsers, I'm not sure if @crmne would consider them within the scope of RubyLLM? Edit: Worth mentioning for you @crmne, as you asked about quirks with OpenAI schemas, that both |
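For reference on the OpenAI quirks mentioned: in strict structured-output mode, every object schema must set `additionalProperties: false` and list every property under `required` (optional fields are instead expressed as `["string", "null"]` type unions). A small illustrative normalizer applying those two rules (not RubyLLM code, and it ignores `items` arrays for brevity):

```ruby
# Recursively force OpenAI strict-mode rules onto a JSON-schema-like hash:
# all properties required, no additional properties allowed.
def strictify(schema)
  return schema unless schema.is_a?(Hash)

  out = schema.transform_values { |v| v.is_a?(Hash) ? strictify(v) : v }
  if out[:type] == "object" && out[:properties]
    out[:properties] = out[:properties].transform_values { |v| strictify(v) }
    out[:required] = out[:properties].keys.map(&:to_s)  # every key becomes required
    out[:additionalProperties] = false
  end
  out
end
```

Any schema DSL we pick would need a pass like this (or equivalent rules baked in) before sending schemas to OpenAI in strict mode.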
Google also supports structured outputs. And agreed on taking small steps. I just want to make sure I can actually use RubyLLM for my real-world use case, so I want to agree on the direction we go in. Regarding parsers, I think it aligns well with the batteries-included philosophy. |
+1 for @kieranklaassen approach! |
Example of code before and after a possible parser addition from above:

```ruby
# frozen_string_literal: true

# Extracts industry and time-sensitive categories from an account's emails
# and stores them as memories for future reference
class FactExtractorService
  # @param account [Account] The account to extract facts from
  def initialize(account)
    @account = account
    Current.account = @account # Set Current.account for tools
  end

  # Runs the fact extraction process and stores the results as memories
  # @return [Boolean] true if extraction was successful, false otherwise
  def run
    result = extract_facts.content
    return false unless result

    # Parse the result to extract industry and time-sensitive categories
    industry, explanation, categories = parse_result(result)

    # Store the extracted information as memories
    store_industry_memory(industry, explanation)
    store_category_memories(categories)

    true
  rescue StandardError => e
    Rails.logger.error("Error in FactExtractorService: #{e.message}")
    false
  ensure
    Current.account = nil # Clean up Current.account after use
  end

  private

  # Extracts facts about the account using the fact_extractor_prompt
  # @return [String, nil] The raw response from the LLM or nil if extraction failed
  def extract_facts
    # Use PromptReader to read the prompt with the account's context
    prompt_content = PromptReader.read("fact_extractor_prompt", **@account.to_llm_context)

    # Create an ephemeral RubyLLM instance - no chat persistence
    chat = RubyLLM.chat(model: "gemini-2.0-flash-exp")

    # Ask the question and return the response
    chat.ask(prompt_content)
  end

  # Parses the raw LLM response to extract structured information
  # @param result [String] The raw LLM response
  # @return [Array<String, String, Array>] Industry, explanation, and categories
  def parse_result(result)
    # Extract industry
    industry_match = result.match(/<determined_industry>(.*?)<\/determined_industry>/m)
    industry = industry_match ? industry_match[1].strip : "Unknown"

    # Extract explanation
    explanation_match = result.match(/<explanation>(.*?)<\/explanation>/m)
    explanation = explanation_match ? explanation_match[1].strip : ""

    # Extract time-sensitive categories
    categories_match = result.match(/<time_sensitive_categories>(.*?)<\/time_sensitive_categories>/m)
    categories_text = categories_match ? categories_match[1].strip : ""

    # Split categories text into individual category definitions
    categories = categories_text.split(/(?=- [A-Z])/).map(&:strip).reject(&:empty?)

    [industry, explanation, categories]
  end

  # Stores the industry as a memory
  # @param industry [String] The detected industry
  # @param explanation [String] The explanation for the industry detection
  # @return [Memory] The created memory
  def store_industry_memory(industry, explanation)
    # Stuff
  end

  # Stores time-sensitive categories as memories
  # @param categories [Array<String>] Array of category descriptions
  # @return [Array<Memory>] Array of created memories
  def store_category_memories(categories)
    # Stuff
  end
end
```

With parsers:

```ruby
# frozen_string_literal: true

# Extracts industry and time-sensitive categories from an account's emails
# and stores them as memories for future reference
class FactExtractorService
  # @param account [Account] The account to extract facts from
  def initialize(account)
    @account = account
    Current.account = @account # Set Current.account for tools
  end

  # Runs the fact extraction process and stores the results as memories
  # @return [Boolean] true if extraction was successful, false otherwise
  def run
    response = extract_facts
    return false unless response

    # Store the extracted information as memories
    store_industry_memory(response.industry, response.explanation)
    store_category_memories(response.categories)

    true
  rescue StandardError => e
    Rails.logger.error("Error in FactExtractorService: #{e.message}")
    false
  ensure
    Current.account = nil # Clean up Current.account after use
  end

  private

  # Extracts facts about the account using the fact_extractor_prompt with regex parser
  # @return [ParsedResponse, nil] The parsed response from the LLM or nil if extraction failed
  def extract_facts
    # Use PromptReader to read the prompt with the account's context
    prompt_content = PromptReader.read("fact_extractor_prompt", **@account.to_llm_context)

    # Create an ephemeral RubyLLM instance - no chat persistence
    chat = RubyLLM.chat(model: "gemini-2.0-flash-exp")

    # Use the regex parser pattern to extract structured data from the response
    chat.with_parser(:regex, patterns: {
      industry: /<determined_industry>(.*?)<\/determined_industry>/m,
      explanation: /<explanation>(.*?)<\/explanation>/m,
      categories_text: /<time_sensitive_categories>(.*?)<\/time_sensitive_categories>/m
    }).ask(prompt_content)
  end

  # Stores the industry as a memory
  # @param industry [String] The detected industry
  # @param explanation [String] The explanation for the industry detection
  # @return [Memory] The created memory
  def store_industry_memory(industry, explanation)
    # Stuff
  end

  # Stores time-sensitive categories as memories
  # @param categories [Array<String>] Array of category descriptions
  # @return [Array<Memory>] Array of created memories
  def store_category_memories(categories)
    # Stuff
  end
end
```
 |
I’m having a hard time following the above code. Should we be thinking of formatting prompts as a separate concern from formatting/structuring outputs? I love the idea of standard ways to format prompt data! Just unsure if we should try to tackle both at once. |
The example is just for the parser, the rest is just code I have. I'm just sharing real world code and problems it could solve. |
I just switch from |
Okay, given the awesome amount of interest, let's do the following to merge it quickly:
Let's make it simple. Who's up for it? |
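For context on what a minimal raw-schema option would actually send over the wire, this is roughly the request fragment OpenAI expects for structured outputs. The `response_format`/`json_schema`/`strict` keys follow OpenAI's documented shape; the helper itself and its name are hypothetical:

```ruby
# Build the OpenAI-style response_format payload fragment for a raw JSON schema.
# OpenAI requires a schema name and supports strict: true for guaranteed conformance.
def response_format_payload(json_schema, name: "response")
  {
    response_format: {
      type: "json_schema",
      json_schema: { name: name, schema: json_schema, strict: true }
    }
  }
end
```

A `with_output_schema` implementation would merge something like this into the provider request before calling the API.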
I can do it! Just want to make sure we do not do double work |
@danielfriis Maybe you can take the RubyLLM::Schema in a different PR? As long as we have a |
@kieranklaassen sure, I'll follow your lead. |
I think what you had is already almost done. I'll make it work like this:

```ruby
response = chat.with_output_schema(Product.json_schema)
               .ask("Extract details: iPhone 15, $999, Electronics, features: A17 chip, 48MP camera. In stock: yes")

JSON.parse(response.content)
```

They can work nicely together. |
Awesome stuff btw, just been lurking on the PR, would love to get this working with Gemini. |
Sadly we need this (#121) before we can get this working on Gemini. It works for OpenAI: |
After some collaboration with @tpaulshippy I've posted this PR #124 to add support for more complex param schemas without changing the way the gem currently does things. It is backward compatible. I really like the |
Great-looking project, very natural Ruby approach!
Thinking about how to get official structured outputs supported (as in, not just validating a JSON schema against a text response from the LLM, but actually using the model's officially supported response formatting), it looks like there are two good projects that could either be leveraged in this project or at least borrowed from for ideas:
https://github.com/nicieja/lammy
https://github.com/instructor-ai/instructor-rb (official Ruby port of Instructor from Python, though it doesn't look to have been updated recently)
Do you have initial thoughts on how you want to approach this?