
Custom URL support #9

Closed
zachlatta opened this issue Mar 12, 2025 · 19 comments
Labels
enhancement New feature or request

Comments

@zachlatta

First - wow! Thanks for making this!

Would love the ability to override the API base URL for any given model. I didn't see anything in the docs about this.

This would enable support for Llama and proxies.

@crmne
Owner

crmne commented Mar 14, 2025

Hey @zachlatta - thanks for the kind words!

RubyLLM is designed around the idea that provider-specific details should be handled inside the library, not exposed as configuration. This gives us cleaner APIs and better handling of models than the "just swap the URL" approach.

We have an open issue for Ollama (#2), but we're implementing it as a proper provider rather than through an OpenAI compatibility layer, which tends to be buggy and incomplete in my experience.

The same goes for other providers with OpenAI-compatible APIs - we'll add them as first-class providers in RubyLLM (like we did with DeepSeek) rather than making you fiddle with base URLs. This approach lets us properly handle model routing, capabilities, and pricing info.

If you're looking to use a specific provider, let us know which one and we can prioritize it. If it's about running models in your own infrastructure, that's valuable feedback for us to consider how to best support that use case.

@zachlatta
Author

zachlatta commented Mar 14, 2025 via email

@jyunderwood

This makes sense, but I'm not sure how you get around the need to configure the URL for a provider like Azure OpenAI Service, where your deployed models live at a unique endpoint for your resource.

We use Azure OpenAI at work for its unified billing and access control, and I would like to be able to use this library with our existing deployments.

@aspires

aspires commented Mar 15, 2025

Hopping in here with another use case. Fastly's LLM semantic caching product, AI Accelerator, operates as a URL/API key swap within code to enable middleware caching, currently with Python and Node.js libraries.

Disclaimer: I work for Fastly. I'd love to feature a Ruby example within the product. If there's an easy drop-in integration option, we'll find a way to feature it.

@crmne
Owner

crmne commented Mar 17, 2025

I still believe in our model-first approach rather than taking the "swap the URL in the config" shortcut. Each provider deserves proper first-class implementation with correct model routing, capabilities, and pricing info.

That said, Azure OpenAI presents a legitimate use case where URL configuration is inherent to the provider. I'll need to think about the right design to accommodate this without compromising our principles - maybe extending our OpenAI provider to handle Azure's deployment-based approach. Are pricing and capabilities the same as the original OpenAI models?

@jyunderwood

There are probably enough differences in constructing the request that it would make sense for Azure OpenAI to be its own provider.

However, the capabilities of Azure's hosted OpenAI APIs are the same as OpenAI's, and their pricing is identical, as far as I'm aware. For example, o3-mini costs $1.10 per million tokens on both platforms.

And the responses are the same structure.

Some differences:

Models can be deployed under custom names rather than their official model names. For instance, you could deploy the o3-mini model as "my-o3-mini," or whatever you want, and that would be the name used to access it instead of the official name.

The API key is sent as an api-key header when you use the static keys generated when your Azure OpenAI resource is created, or as a bearer token in the Authorization header if you use Entra ID to obtain a per-client or per-user token before sending the request.

Additionally, Azure requires an explicit API version. My understanding is that this helps Azure stay in sync with OpenAI's API in terms of features and capabilities, so if new features are added to OpenAI's API, you would need to update the API version to gain access to them.

An example Azure OpenAI request:

require "http"  # the http gem
require "json"

azure_api_key = "asdf-1234-asdf-1234"
azure_endpoint = "https://resource-name.openai.azure.com"
deployment_name = "my-deployment-name"  # Instead of the official model name
api_version = "2024-10-21" # This is the latest GA version

# Instead of "https://api.openai.com/v1/chat/completions"
url = "#{azure_endpoint}/openai/deployments/#{deployment_name}/chat/completions?api-version=#{api_version}"

# Instead of { "Authorization" => "Bearer #{ENV['OPENAI_API_KEY']}" }
headers = {
  "api-key" => azure_api_key,
  "Content-Type" => "application/json"
}

# Note: since the deployment name is in the URL, you don't need to put the model in the body payload.
body = {
  "messages" => [{ "role" => "user", "content" => "Hello, how are you?" }]
}.to_json

response = HTTP.headers(headers).post(url, body: body)
puts JSON.pretty_generate(JSON.parse(response.body.to_s))
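
For the Entra ID path mentioned above, only the auth header changes - a rough sketch, assuming the access token has already been obtained (entra_access_token is a placeholder):

# Entra ID variant: same URL and body as above, but a bearer token instead of the api-key header.
# entra_access_token is assumed to come from your Entra ID auth flow beforehand.
headers = {
  "Authorization" => "Bearer #{entra_access_token}",
  "Content-Type" => "application/json"
}

response = HTTP.headers(headers).post(url, body: body)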

crmne added the enhancement (New feature or request) label Mar 23, 2025
@crmne
Owner

crmne commented Mar 23, 2025

I appreciate the enthusiasm for RubyLLM, but I'm not particularly eager to turn this into a game of "config over code." URL overrides are essentially a backdoor that lets you bypass all the intelligence we've built into proper provider implementations.

The appeal of "just let me change the URL" is understandable, but it introduces a ton of implicit assumptions that I don't want in the codebase. When we implement a provider directly, we can ensure proper model routing, parameter handling, error management, and pricing information.

Azure OpenAI is a legitimate exception since their entire architecture is based on dynamic endpoints. We'll likely add proper Azure support in #15, but with a deliberate implementation, not by introducing a generic override.

For providers like LM Studio, Groq, or Ollama - I'd rather see purpose-built providers that correctly handle their quirks and capabilities than have people hacking around with URL swaps.

crmne closed this as completed Mar 23, 2025
@howdymatt

If it's about running models in your own infrastructure, that's valuable feedback for us to consider how to best support that use case.

I have this exact case. I plan to use LiteLLM to proxy our requests to models through an OpenAI-compatible interface. The underlying models will vary greatly. In my case, would I need to create a configuration for each application that uses the gateway, given that each application may end up using a different set of models (they may want o3-mini and amazon-nova-pro)?

@crmne
Owner

crmne commented Mar 26, 2025

@howdymatt if you're simply using an OpenAI interface for everything, you should probably use @alexrudall's https://github.com/alexrudall/ruby-openai gem.

@zachlatta
Author

zachlatta commented Mar 26, 2025 via email

@howdymatt

@howdymatt if you're simply using an OpenAI interface for everything you should probably use @alexrudall's alexrudall/ruby-openai gem.

Got it. That is the library we use currently btw, I just like the idea of yours more :). Feels more Railsy, especially the tool calling functionality.

@crmne
Owner

crmne commented Mar 26, 2025

I hear you all on the need for more flexibility with custom endpoints and models. After some thinking, I think I've found a clean solution that maintains RubyLLM's principles while giving you the escape hatches you need.

Here's what we're going to do:

# Configure a custom endpoint if you need one
RubyLLM.configure do |config|
  config.openai_api_key = ENV['OPENAI_API_KEY']
  config.openai_api_base = "https://your-proxy.example.com"
end

# Default case - everything validated and proper
chat = RubyLLM.chat(model: "gpt-4")

# Need to use a custom model name? Just skip validation
chat = RubyLLM.chat(
  model: "gpt-9",
  provider: :openai,
  assume_model_exists: true
)

chat.with_model(model: "gpt-9", provider: :openai, assume_exists: true)

This handles pretty much every case that's come up:

  • @zachlatta - Your proxy case just works with openai_api_base
  • @jyunderwood - Azure deployments can use custom model names with assume_model_exists until we add proper Azure support in #15
  • @howdymatt - LiteLLM proxying works the same way - use the custom URL and whatever model names your proxy exposes
  • @aspires - For Fastly's AI Accelerator, same deal - swap the URL and keep all the nice RubyLLM functionality

The assume_model_exists flag becomes particularly powerful when combined with provider - it lets you use any model ID with any provider, even if we don't know about it yet. This helps with new models, custom deployments, proxies, or any other case where you need to bypass our model registry.
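
For instance, the LiteLLM-style setup would look roughly like this - a sketch only, with the proxy URL, API key env var, and model name as placeholders:

# Point RubyLLM's OpenAI provider at the proxy
RubyLLM.configure do |config|
  config.openai_api_key = ENV['PROXY_API_KEY']           # whatever key your proxy expects
  config.openai_api_base = "https://litellm.example.com"  # hypothetical proxy URL
end

# Use a model name the proxy exposes, even though it's not in our registry
chat = RubyLLM.chat(
  model: "amazon-nova-pro",
  provider: :openai,
  assume_model_exists: true
)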

I still believe in proper provider implementations as the best path (we'll keep adding official providers like we did with DeepSeek), but this gives you a clean way forward when you need to work outside those bounds.

How's that sound?

crmne reopened this Mar 26, 2025
@crmne
Owner

crmne commented Mar 26, 2025

The reason we need this flag, and why I was a bit resistant to this change, is that RubyLLM requires any model you use to be registered in our models.json file (or discoverable via models.refresh!). The assume_model_exists flag lets you bypass this requirement - particularly useful when:

# Using models we don't know about yet
chat = RubyLLM.chat(
  model: "gpt-5-preview",  # New model not in models.json
  provider: :openai,
  assume_model_exists: true
)

# Using custom deployment names
chat = RubyLLM.chat(
  model: "my-deployment",  # Custom name not in models.json
  provider: :openai,
  assume_model_exists: true
)

This gives you flexibility while keeping our default behavior safe and validated.

@howdymatt

howdymatt commented Mar 26, 2025

That works for my use case.

I assume that bypassing validation still allows us to use all the functionality (tool calling, etc.) that exists for a given proxied model (as long as the proxy supports it as well)?

@crmne
Owner

crmne commented Mar 26, 2025

It should work indeed.
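
Tool calls go through the same request path as regular chats, so something along these lines should behave the same against a proxied endpoint - a sketch only, with placeholder tool and model names (the tool definition follows RubyLLM's Tool class API):

class Weather < RubyLLM::Tool
  description "Gets the current weather for a city"
  param :city, desc: "City name, e.g. 'Berlin'"

  def execute(city:)
    # Placeholder implementation - call your real weather service here.
    { city: city, temperature_c: 21 }
  end
end

chat = RubyLLM.chat(
  model: "my-proxied-model",   # whatever name your proxy exposes
  provider: :openai,
  assume_model_exists: true
)

chat.with_tool(Weather).ask("What's the weather in Berlin?")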

@zachlatta
Author

I love this proposal!

@gastonrey

Hi team, is the skip_model_validation thing already implemented? I saw you created a PR for custom URLs, but then it was closed.

@crmne
Owner

crmne commented Apr 15, 2025

Fixed in d628495 and 48f2faa

Grab the version from main and please test it!

The only change from the above message is that the option is assume_model_exists, which is more ergonomic IMO, as is openai_api_base. I updated the above message.

crmne closed this as completed Apr 15, 2025
@crmne
Owner

crmne commented Apr 15, 2025

https://rubyllm.com/guides/models#connecting-to-custom-endpoints--using-unlisted-models
