Add Local Model Support via Ollama Integration #2

Closed
crmne opened this issue Mar 11, 2025 · 4 comments
Labels
new provider New provider integration

Comments

@crmne
Owner

crmne commented Mar 11, 2025

TL;DR: Add support for running fully local models via Ollama.

Background

While cloud models offer state-of-the-art capabilities, there are compelling cases for running models locally:

  1. Privacy & Compliance: Keep sensitive data entirely on-premise
  2. Cost Control: Eliminate ongoing API costs for high-volume applications
  3. Latency: Remove network overhead for latency-sensitive applications
  4. Offline Operation: Run AI features without internet connectivity

Ollama provides an excellent way to run models like Llama, Mistral, and others locally with a simple API that's compatible with our existing architecture.
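
To give a feel for that API, here's a minimal sketch of a direct call to Ollama's native chat endpoint, assuming a local server on the default port and a model such as llama2 already pulled; the provider proposed below would wrap requests like this one:

require 'net/http'
require 'json'

# Minimal direct call to Ollama's native chat endpoint (non-streaming).
uri = URI('http://localhost:11434/api/chat')
payload = {
  model: 'llama2',
  messages: [{ role: 'user', content: "What's the capital of France?" }],
  stream: false
}
response = Net::HTTP.post(uri, payload.to_json, 'Content-Type' => 'application/json')
puts JSON.parse(response.body).dig('message', 'content')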

Proposed Solution

Add a new provider interface for Ollama that implements our existing abstractions:

# Configuration
RubyLLM.configure do |config|
  config.ollama_host = "http://localhost:11434" # Default
end

# Usage remains identical to cloud models
chat = RubyLLM.chat(model: 'llama2')
chat.ask("What's the capital of France?")

# Or with embeddings
RubyLLM.embed("Ruby is a programmer's best friend", model: 'nomic-embed-text')

Technical Details

For those looking to help implement this, you'll need to:

  1. Create a new provider module in lib/ruby_llm/providers/ollama/
  2. Implement the core provider interface methods:
    • complete - For chat functionality
    • embed - For embeddings
    • api_base - Returns the Ollama API endpoint
    • capabilities - Define model capabilities
  3. Handle the payload formatting differences between Ollama and OpenAI/Claude (a rough sketch of such a provider follows this list)
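
Here is that sketch; it's purely illustrative: the method names beyond the interface above, the payload shapes, and the configuration accessor are assumptions rather than the final API.

# lib/ruby_llm/providers/ollama.rb (illustrative sketch only)
module RubyLLM
  module Providers
    module Ollama
      extend self

      # Ollama's HTTP API lives on the configured host, http://localhost:11434 by default.
      def api_base
        RubyLLM.config.ollama_host
      end

      # Describe what locally served models can do (context window, tools, vision, ...).
      def capabilities
        Capabilities # hypothetical companion module, mirroring other providers
      end

      # Translate RubyLLM messages into Ollama's native /api/chat payload.
      def render_chat_payload(messages, model:)
        {
          model: model,
          messages: messages.map { |m| { role: m.role.to_s, content: m.content } },
          stream: false
        }
      end

      # Translate an embedding request into Ollama's /api/embeddings payload.
      def render_embedding_payload(text, model:)
        { model: model, prompt: text }
      end
    end
  end
end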

The PR should include:

  • Provider implementation
  • Configuration option for Ollama host
  • Tests that can be run against a local Ollama instance
  • Documentation updates

Benefits

  • Cost efficiency: Eliminate API costs for many use cases
  • Privacy: Keep sensitive data local
  • Flexibility: Mix and match local and cloud models in the same codebase
  • Performance: Reduce latency for response-time sensitive applications
@Mizokuiam

Hey @crmne, this is a fantastic proposal! Local model support via Ollama would be a huge win for ruby_llm users. The benefits you've outlined – privacy, cost control, latency reduction, and offline capabilities – are all spot-on.

I really like the proposed configuration and usage pattern. Keeping the API consistent with existing cloud models minimizes the learning curve for users. The example snippet is clear and concise:

# Configuration
RubyLLM.configure do |config|
  config.ollama_host = "http://localhost:11434" # Default
end

# Usage remains identical to cloud models
chat = RubyLLM.chat(model: 'llama2', provider: :ollama) # Explicitly set provider
chat.ask("What's the capital of France?")

# Or with embeddings
RubyLLM.embed("Ruby is a programmer's best friend", model: 'nomic-embed-text', provider: :ollama) # Explicitly set provider

One minor suggestion: Adding an explicit provider: :ollama argument to chat and embed calls would provide more clarity and control, especially when using a mixed local/cloud setup. It also future-proofs the codebase for potential naming conflicts if cloud providers ever introduce similarly named models.

Regarding the technical details, your outline is solid. The separation into complete and embed methods makes sense, and handling payload formatting within the provider module is the right approach. Ensuring comprehensive tests against a local Ollama instance is crucial.

One thing to consider during implementation is error handling. Ollama might return different error codes and messages compared to cloud providers. The provider should gracefully handle these differences and translate them into consistent ruby_llm exceptions. We should also consider how to handle cases where the Ollama server is unavailable or returns unexpected responses.
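
To make that concrete, here's one possible shape for that translation; the error class and the way it's raised here are assumptions about ruby_llm's internals, so treat it strictly as a sketch:

require 'json'
require 'faraday'

# Illustrative only: error class names and raising conventions are assumed.
module RubyLLM
  module Providers
    module Ollama
      module ErrorHandling
        # Ollama error bodies are typically {"error": "..."}; fall back gracefully
        # when the body is empty or isn't JSON.
        def parse_error(response)
          body = JSON.parse(response.body) rescue {}
          body['error'] || 'Unexpected response from Ollama'
        end

        # A refused connection on localhost usually just means the Ollama server
        # isn't running, so surface a friendlier hint than a raw socket error.
        def with_ollama_errors
          yield
        rescue Errno::ECONNREFUSED, Faraday::ConnectionFailed => e
          raise RubyLLM::Error, "Cannot reach Ollama at #{api_base} (#{e.message}). Is the Ollama server running?"
        end
      end
    end
  end
end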

crmne marked this as a duplicate of #24 on Mar 17, 2025
crmne added the new provider label on Mar 23, 2025
@bdevel

bdevel commented Mar 25, 2025

FWIW, Ollama has an OpenAI-compatible API. In theory this should work as long as you can specify a base URL, for example:

from openai import OpenAI

client = OpenAI(
    base_url='http://localhost:11434/v1',
    api_key='ollama'  # required by the client, ignored by Ollama
)

Further, Ollama also supports tools, which would be great to integrate. The Ollama docs state that the OpenAI-compatible API supports tools as well.
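
For the Ruby side of the same idea, here's a minimal sketch that hits the OpenAI-compatible endpoint directly with the standard library. This goes through Ollama's /v1 compatibility layer rather than the native /api/chat endpoint; the model name and local server are assumptions, and the API key is required by the wire format but ignored by Ollama:

require 'net/http'
require 'json'

# Same wire format RubyLLM already speaks for OpenAI, just a different base URL.
uri = URI('http://localhost:11434/v1/chat/completions')
payload = {
  model: 'llama2',
  messages: [{ role: 'user', content: "What's the capital of France?" }]
}
response = Net::HTTP.post(
  uri,
  payload.to_json,
  'Content-Type' => 'application/json',
  'Authorization' => 'Bearer ollama' # any value works; Ollama doesn't check it
)
puts JSON.parse(response.body).dig('choices', 0, 'message', 'content')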

@MadBomber

I'm working with api.x.ai/v1, which, like Ollama and many other providers, supports the OpenAI endpoints and, in general (though you can't always trust it), the same JSON interface. Here is as far as I got tonight bringing up xAI within RubyLLM before I ran into an API key issue. I know the API key value for x.ai is good because I have the same base code working with ruby-openai and via curl.

require 'ruby_llm'
require 'pathname'

HERE = Pathname.pwd

# Assumed definitions for the constants used below (values not shown in the original script):
XAI_BASE_URL = 'https://api.x.ai/v1'
XAI_API_KEY  = ENV['XAI_API_KEY']

module RubyLLM
  module Providers
    module OpenAI
      def api_base
        XAI_BASE_URL
      end
    end
  end

  class Models
    class << self
      def models_file
        File.join(HERE, 'xai_models.json')
      end
    end
  end
end

# Configure ruby_llm with the API key
RubyLLM.configure do |config|
  config.openai_api_key = XAI_API_KEY
end

# Initialize the chat client with the model
chat = RubyLLM.chat(model: 'grok-2-latest')

# Send the message
response = chat.ask('What is the meaning of xyzzy?')

debug_me { [:response] }

The ask method does not return. An exception is raised, which looks like it happens because the API key is being validated against OpenAI rather than the overridden base URL.

/Users/dewayne/.rbenv/versions/3.4.2/lib/ruby/gems/3.4.0/gems/ruby_llm-1.0.1/lib/ruby_llm/error.rb:56:in 'RubyLLM::ErrorMiddleware.parse_error': Incorrect API key provided: xai-vloD************************************************************************G91T. You can find your API key at https://platform.openai.com/account/api-keys. (RubyLLM::UnauthorizedError)
	from /Users/dewayne/.rbenv/versions/3.4.2/lib/ruby/gems/3.4.0/gems/ruby_llm-1.0.1/lib/ruby_llm/error.rb:42:in 'block in RubyLLM::ErrorMiddleware#call'
	from /Users/dewayne/.rbenv/versions/3.4.2/lib/ruby/gems/3.4.0/gems/faraday-2.12.2/lib/faraday/response.rb:42:in 'Faraday::Response#on_complete'
	from /Users/dewayne/.rbenv/versions/3.4.2/lib/ruby/gems/3.4.0/gems/ruby_llm-1.0.1/lib/ruby_llm/error.rb:41:in 'RubyLLM::ErrorMiddleware#call'
	from /Users/dewayne/.rbenv/versions/3.4.2/lib/ruby/gems/3.4.0/gems/faraday-2.12.2/lib/faraday/middleware.rb:56:in 'Faraday::Middleware#call'
	from /Users/dewayne/.rbenv/versions/3.4.2/lib/ruby/gems/3.4.0/gems/faraday-2.12.2/lib/faraday/middleware.rb:56:in 'Faraday::Middleware#call'
	from /Users/dewayne/.rbenv/versions/3.4.2/lib/ruby/gems/3.4.0/gems/faraday-retry-2.3.1/lib/faraday/retry/middleware.rb:171:in 'block in Faraday::Retry::Middleware#call'
	from /Users/dewayne/.rbenv/versions/3.4.2/lib/ruby/gems/3.4.0/gems/faraday-retry-2.3.1/lib/faraday/retry/retryable.rb:7:in 'Faraday::Retryable#with_retries'
	from /Users/dewayne/.rbenv/versions/3.4.2/lib/ruby/gems/3.4.0/gems/faraday-retry-2.3.1/lib/faraday/retry/middleware.rb:167:in 'Faraday::Retry::Middleware#call'
	from /Users/dewayne/.rbenv/versions/3.4.2/lib/ruby/gems/3.4.0/gems/faraday-2.12.2/lib/faraday/middleware.rb:56:in 'Faraday::Middleware#call'
	from /Users/dewayne/.rbenv/versions/3.4.2/lib/ruby/gems/3.4.0/gems/faraday-2.12.2/lib/faraday/response/logger.rb:23:in 'Faraday::Response::Logger#call'
	from /Users/dewayne/.rbenv/versions/3.4.2/lib/ruby/gems/3.4.0/gems/faraday-2.12.2/lib/faraday/rack_builder.rb:152:in 'Faraday::RackBuilder#build_response'
	from /Users/dewayne/.rbenv/versions/3.4.2/lib/ruby/gems/3.4.0/gems/faraday-2.12.2/lib/faraday/connection.rb:452:in 'Faraday::Connection#run_request'
	from /Users/dewayne/.rbenv/versions/3.4.2/lib/ruby/gems/3.4.0/gems/faraday-2.12.2/lib/faraday/connection.rb:280:in 'Faraday::Connection#post'
	from /Users/dewayne/.rbenv/versions/3.4.2/lib/ruby/gems/3.4.0/gems/ruby_llm-1.0.1/lib/ruby_llm/provider.rb:73:in 'RubyLLM::Provider::Methods#post'
	from /Users/dewayne/.rbenv/versions/3.4.2/lib/ruby/gems/3.4.0/gems/ruby_llm-1.0.1/lib/ruby_llm/provider.rb:55:in 'RubyLLM::Provider::Methods#sync_response'
	from /Users/dewayne/.rbenv/versions/3.4.2/lib/ruby/gems/3.4.0/gems/ruby_llm-1.0.1/lib/ruby_llm/provider.rb:27:in 'RubyLLM::Provider::Methods#complete'
	from /Users/dewayne/.rbenv/versions/3.4.2/lib/ruby/gems/3.4.0/gems/ruby_llm-1.0.1/lib/ruby_llm/chat.rb:81:in 'RubyLLM::Chat#complete'
	from /Users/dewayne/.rbenv/versions/3.4.2/lib/ruby/gems/3.4.0/gems/ruby_llm-1.0.1/lib/ruby_llm/chat.rb:30:in 'RubyLLM::Chat#ask'
	from ./xai.rb:77:in '<main>'

@crmne
Owner Author

crmne commented Apr 23, 2025

Closed by 89c371a

crmne closed this as completed on Apr 23, 2025