Add Local Model Support via Ollama Integration #2

Closed
crmne opened this issue Mar 11, 2025 · 4 comments
Labels
new provider New provider integration

Comments

@crmne
Owner

crmne commented Mar 11, 2025

TL;DR: Add support for running fully local models via Ollama.

Background

While cloud models offer state-of-the-art capabilities, there are compelling cases for running models locally:

  1. Privacy & Compliance: Keep sensitive data entirely on-premise
  2. Cost Control: Eliminate ongoing API costs for high-volume applications
  3. Latency: Remove network overhead for latency-sensitive applications
  4. Offline Operation: Run AI features without internet connectivity

Ollama provides an excellent way to run models like Llama, Mistral, and others locally with a simple API that's compatible with our existing architecture.
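
To give a feel for that API, here's a minimal sketch of a direct call to Ollama's native chat endpoint, assuming a local server on the default port and a model such as llama2 already pulled; the provider proposed below would wrap requests like this one:

require 'net/http'
require 'json'

# Minimal direct call to Ollama's native chat endpoint (non-streaming).
uri = URI('http://localhost:11434/api/chat')
payload = {
  model: 'llama2',
  messages: [{ role: 'user', content: "What's the capital of France?" }],
  stream: false
}
response = Net::HTTP.post(uri, payload.to_json, 'Content-Type' => 'application/json')
puts JSON.parse(response.body).dig('message', 'content')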

Proposed Solution

Add a new provider interface for Ollama that implements our existing abstractions:

# Configuration
RubyLLM.configure do |config|
  config.ollama_host = "http://localhost:11434" # Default
end

# Usage remains identical to cloud models
chat = RubyLLM.chat(model: 'llama2')
chat.ask("What's the capital of France?")

# Or with embeddings
RubyLLM.embed("Ruby is a programmer's best friend", model: 'nomic-embed-text')

Technical Details

For those looking to help implement this, you'll need to:

  1. Create a new provider module in lib/ruby_llm/providers/ollama/
  2. Implement the core provider interface methods:
    • complete - For chat functionality
    • embed - For embeddings
    • api_base - Returns the Ollama API endpoint
    • capabilities - Define model capabilities
  3. Handle the payload formatting differences between Ollama and OpenAI/Claude (a rough sketch of such a provider follows this list)
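
Here is that sketch; it's purely illustrative: the method names beyond the interface above, the payload shapes, and the configuration accessor are assumptions rather than the final API.

# lib/ruby_llm/providers/ollama.rb (illustrative sketch only)
module RubyLLM
  module Providers
    module Ollama
      extend self

      # Ollama's HTTP API lives on the configured host, http://localhost:11434 by default.
      def api_base
        RubyLLM.config.ollama_host
      end

      # Describe what locally served models can do (context window, tools, vision, ...).
      def capabilities
        Capabilities # hypothetical companion module, mirroring other providers
      end

      # Translate RubyLLM messages into Ollama's native /api/chat payload.
      def render_chat_payload(messages, model:)
        {
          model: model,
          messages: messages.map { |m| { role: m.role.to_s, content: m.content } },
          stream: false
        }
      end

      # Translate an embedding request into Ollama's /api/embeddings payload.
      def render_embedding_payload(text, model:)
        { model: model, prompt: text }
      end
    end
  end
end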

The PR should include:

  • Provider implementation
  • Configuration option for Ollama host
  • Tests that can be run against a local Ollama instance
  • Documentation updates

Benefits

  • Cost efficiency: Eliminate API costs for many use cases
  • Privacy: Keep sensitive data local
  • Flexibility: Mix and match local and cloud models in the same codebase
  • Performance: Reduce latency for response-time sensitive applications
@Mizokuiam

Hey @crmne, this is a fantastic proposal! Local model support via Ollama would be a huge win for ruby_llm users. The benefits you've outlined – privacy, cost control, latency reduction, and offline capabilities – are all spot-on.

I really like the proposed configuration and usage pattern. Keeping the API consistent with existing cloud models minimizes the learning curve for users. The example snippet is clear and concise:

# Configuration
RubyLLM.configure do |config|
  config.ollama_host = "http://localhost:11434" # Default
end

# Usage remains identical to cloud models
chat = RubyLLM.chat(model: 'llama2', provider: :ollama) # Explicitly set provider
chat.ask("What's the capital of France?")

# Or with embeddings
RubyLLM.embed("Ruby is a programmer's best friend", model: 'nomic-embed-text', provider: :ollama) # Explicitly set provider

One minor suggestion: Adding an explicit provider: :ollama argument to chat and embed calls would provide more clarity and control, especially when using a mixed local/cloud setup. It also future-proofs the codebase for potential naming conflicts if cloud providers ever introduce similarly named models.

Regarding the technical details, your outline is solid. The separation into complete and embed methods makes sense, and handling payload formatting within the provider module is the right approach. Ensuring comprehensive tests against a local Ollama instance is crucial.

One thing to consider during implementation is error handling. Ollama might return different error codes and messages compared to cloud providers. The provider should gracefully handle these differences and translate them into consistent ruby_llm exceptions. We should also consider how to handle cases where the Ollama server is unavailable or returns unexpected responses.
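
To make that concrete, here's one possible shape for that translation; the error class and the way it's raised here are assumptions about ruby_llm's internals, so treat it strictly as a sketch:

require 'json'
require 'faraday'

# Illustrative only: error class names and raising conventions are assumed.
module RubyLLM
  module Providers
    module Ollama
      module ErrorHandling
        # Ollama error bodies are typically {"error": "..."}; fall back gracefully
        # when the body is empty or isn't JSON.
        def parse_error(response)
          body = JSON.parse(response.body) rescue {}
          body['error'] || 'Unexpected response from Ollama'
        end

        # A refused connection on localhost usually just means the Ollama server
        # isn't running, so surface a friendlier hint than a raw socket error.
        def with_ollama_errors
          yield
        rescue Errno::ECONNREFUSED, Faraday::ConnectionFailed => e
          raise RubyLLM::Error, "Cannot reach Ollama at #{api_base} (#{e.message}). Is the Ollama server running?"
        end
      end
    end
  end
end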

crmne marked this as a duplicate of #24 on Mar 17, 2025
crmne added the new provider label on Mar 23, 2025
@bdevel

bdevel commented Mar 25, 2025

FWIW, Ollama has an OpenAI-compatible API. In theory this should work as long as you can specify a base URL, for example:

from openai import OpenAI

client = OpenAI(
    base_url='http://localhost:11434/v1',
    api_key='ollama'  # required by the client, ignored by Ollama
)

Further, Ollama also supports tools, which would be great to integrate. The Ollama docs state that the OpenAI-compatible API supports tools as well.
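
For the Ruby side of the same idea, here's a minimal sketch that hits the OpenAI-compatible endpoint directly with the standard library. This goes through Ollama's /v1 compatibility layer rather than the native /api/chat endpoint; the model name and local server are assumptions, and the API key is required by the wire format but ignored by Ollama:

require 'net/http'
require 'json'

# Same wire format RubyLLM already speaks for OpenAI, just a different base URL.
uri = URI('http://localhost:11434/v1/chat/completions')
payload = {
  model: 'llama2',
  messages: [{ role: 'user', content: "What's the capital of France?" }]
}
response = Net::HTTP.post(
  uri,
  payload.to_json,
  'Content-Type' => 'application/json',
  'Authorization' => 'Bearer ollama' # any value works; Ollama doesn't check it
)
puts JSON.parse(response.body).dig('choices', 0, 'message', 'content')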

@MadBomber

I'm working with api.x.ai/v1, which, like Ollama and many other providers, supports the OpenAI endpoints and, in general (though you can't always trust it), the same JSON interface. Here is as far as I got tonight bringing up xAI within RubyLLM before I ran into an API key issue. I know the API key value for x.ai is good because I have the same base code working with ruby-openai and via curl.

require 'ruby_llm'
require 'pathname'

HERE = Pathname.pwd

# Assumed definitions for the constants used below (values not shown in the original script):
XAI_BASE_URL = 'https://api.x.ai/v1'
XAI_API_KEY  = ENV['XAI_API_KEY']

module RubyLLM
  module Providers
    module OpenAI
      def api_base
        XAI_BASE_URL
      end
    end
  end

  class Models
    class << self
      def models_file
        File.join(HERE, 'xai_models.json')
      end
    end
  end
end

# Configure ruby_llm with the API key
RubyLLM.configure do |config|
  config.openai_api_key = XAI_API_KEY
end

# Initialize the chat client with the model
chat = RubyLLM.chat(model: 'grok-2-latest')

# Send the message
response = chat.ask('What is the meaning of xyzzy?')

debug_me { [:response] }

The ask method does not return. An exception is raised, which looks like it happens because the API key is being validated against OpenAI rather than the overridden base URL.

/Users/dewayne/.rbenv/versions/3.4.2/lib/ruby/gems/3.4.0/gems/ruby_llm-1.0.1/lib/ruby_llm/error.rb:56:in 'RubyLLM::ErrorMiddleware.parse_error': Incorrect API key provided: xai-vloD************************************************************************G91T. You can find your API key at https://platform.openai.com/account/api-keys. (RubyLLM::UnauthorizedError)
	from /Users/dewayne/.rbenv/versions/3.4.2/lib/ruby/gems/3.4.0/gems/ruby_llm-1.0.1/lib/ruby_llm/error.rb:42:in 'block in RubyLLM::ErrorMiddleware#call'
	from /Users/dewayne/.rbenv/versions/3.4.2/lib/ruby/gems/3.4.0/gems/faraday-2.12.2/lib/faraday/response.rb:42:in 'Faraday::Response#on_complete'
	from /Users/dewayne/.rbenv/versions/3.4.2/lib/ruby/gems/3.4.0/gems/ruby_llm-1.0.1/lib/ruby_llm/error.rb:41:in 'RubyLLM::ErrorMiddleware#call'
	from /Users/dewayne/.rbenv/versions/3.4.2/lib/ruby/gems/3.4.0/gems/faraday-2.12.2/lib/faraday/middleware.rb:56:in 'Faraday::Middleware#call'
	from /Users/dewayne/.rbenv/versions/3.4.2/lib/ruby/gems/3.4.0/gems/faraday-2.12.2/lib/faraday/middleware.rb:56:in 'Faraday::Middleware#call'
	from /Users/dewayne/.rbenv/versions/3.4.2/lib/ruby/gems/3.4.0/gems/faraday-retry-2.3.1/lib/faraday/retry/middleware.rb:171:in 'block in Faraday::Retry::Middleware#call'
	from /Users/dewayne/.rbenv/versions/3.4.2/lib/ruby/gems/3.4.0/gems/faraday-retry-2.3.1/lib/faraday/retry/retryable.rb:7:in 'Faraday::Retryable#with_retries'
	from /Users/dewayne/.rbenv/versions/3.4.2/lib/ruby/gems/3.4.0/gems/faraday-retry-2.3.1/lib/faraday/retry/middleware.rb:167:in 'Faraday::Retry::Middleware#call'
	from /Users/dewayne/.rbenv/versions/3.4.2/lib/ruby/gems/3.4.0/gems/faraday-2.12.2/lib/faraday/middleware.rb:56:in 'Faraday::Middleware#call'
	from /Users/dewayne/.rbenv/versions/3.4.2/lib/ruby/gems/3.4.0/gems/faraday-2.12.2/lib/faraday/response/logger.rb:23:in 'Faraday::Response::Logger#call'
	from /Users/dewayne/.rbenv/versions/3.4.2/lib/ruby/gems/3.4.0/gems/faraday-2.12.2/lib/faraday/rack_builder.rb:152:in 'Faraday::RackBuilder#build_response'
	from /Users/dewayne/.rbenv/versions/3.4.2/lib/ruby/gems/3.4.0/gems/faraday-2.12.2/lib/faraday/connection.rb:452:in 'Faraday::Connection#run_request'
	from /Users/dewayne/.rbenv/versions/3.4.2/lib/ruby/gems/3.4.0/gems/faraday-2.12.2/lib/faraday/connection.rb:280:in 'Faraday::Connection#post'
	from /Users/dewayne/.rbenv/versions/3.4.2/lib/ruby/gems/3.4.0/gems/ruby_llm-1.0.1/lib/ruby_llm/provider.rb:73:in 'RubyLLM::Provider::Methods#post'
	from /Users/dewayne/.rbenv/versions/3.4.2/lib/ruby/gems/3.4.0/gems/ruby_llm-1.0.1/lib/ruby_llm/provider.rb:55:in 'RubyLLM::Provider::Methods#sync_response'
	from /Users/dewayne/.rbenv/versions/3.4.2/lib/ruby/gems/3.4.0/gems/ruby_llm-1.0.1/lib/ruby_llm/provider.rb:27:in 'RubyLLM::Provider::Methods#complete'
	from /Users/dewayne/.rbenv/versions/3.4.2/lib/ruby/gems/3.4.0/gems/ruby_llm-1.0.1/lib/ruby_llm/chat.rb:81:in 'RubyLLM::Chat#complete'
	from /Users/dewayne/.rbenv/versions/3.4.2/lib/ruby/gems/3.4.0/gems/ruby_llm-1.0.1/lib/ruby_llm/chat.rb:30:in 'RubyLLM::Chat#ask'
	from ./xai.rb:77:in '<main>'

@crmne
Owner Author

crmne commented Apr 23, 2025

Closed by 89c371a

crmne closed this as completed on Apr 23, 2025