Skip to content

[rb] Leverage existing URI::Generic and Net::HTTP code for proxy handling #15782

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Open
wants to merge 4 commits into
base: trunk
Choose a base branch
from

Conversation

rgisiger
Copy link

@rgisiger rgisiger commented May 23, 2025

User description

Details:

  • Replaced custom use_proxy? method with URI::Generic.use_proxy? to use already existing functionality for handling the no_proxy string, which can include a list of comma- or space-separated elements.
  • Take advantage of the p_no_proxy variable in Net::HTTP to maintain environment variables related to proxy settings.
  • Note: With this change, wildcards are no longer usable to bypass the proxy due to limitations in the existing proxy code.

🔗 Related Issues

Related to #5004

💥 What does this PR do?

It changes the way the new_http_client is built to leverage what is already done in URI::Generic and Net::HTTP about the condition to use the proxy or not.

🔧 Implementation Notes

I tried to look up for a standardized way of using no_proxy but didn't find any. It seems like we are in the jungle and everyone has its own opinion.
However, the most common format I have found for no_proxy was a comma-separated list of strings:

Also, I used only components that are already part of the project and things could been done differently.

💡 Additional Considerations

If wildcards need to be allowed, then it would most likely implies a change on URI::Generic first. However, I don't get why it could be useful to have it since we could use no proxy instead.

As alternative for the no_proxy value, I could have kept a string in the setter method, and transformed it to Array for the as_json method.

🔄 Types of changes

  • Cleanup (formatting, renaming):
    • no_proxy value is considered as an Array directly in its setter method
  • Bug fix (backwards compatible):
    • no_proxy value can be passed as a space-separated elements, or comma-plus-space-separated elements
  • New feature (non-breaking change which adds functionality and tests!):
    • domain names can be specified partially (using second-level domain) and it can cover multiple subdomains
  • Breaking change (fix or feature that would cause existing functionality to change):
    • Global wildcard (*) is not supported anymore

PR Type

Bug fix, Enhancement


Description

  • Refactored proxy handling to use URI::Generic and Net::HTTP

    • Removed custom proxy logic in favor of Ruby stdlib methods
    • Improved parsing of no_proxy to handle lists per WebDriver spec
  • Updated serialization and Firefox profile to treat no_proxy as a list

  • Enhanced and expanded unit tests for proxy and no_proxy handling

  • Removed deprecated and redundant proxy code


Changes walkthrough 📝

Relevant files
Bug fix
proxy.rb
Refactor and spec-compliant handling of no_proxy in Proxy class

rb/lib/selenium/webdriver/common/proxy.rb

  • Updated no_proxy= to parse strings into lists per spec
  • Adjusted JSON serialization to output no_proxy as a list
  • Added reference to WebDriver proxy spec
  • +8/-2     
    Enhancement
    profile.rb
    Update Firefox profile to use list-based no_proxy               

    rb/lib/selenium/webdriver/firefox/profile.rb

  • Set Firefox network.proxy.no_proxies_on using joined list from
    no_proxy
  • Ensured compatibility with new list-based no_proxy
  • +2/-1     
    default.rb
    Refactor HTTP proxy logic to use stdlib and support no_proxy lists

    rb/lib/selenium/webdriver/remote/http/default.rb

  • Switched to using URI::Generic for proxy logic
  • Refactored proxy client creation to support list-based no_proxy
  • Removed custom use_proxy? logic in favor of stdlib
  • Improved error messages and proxy environment handling
  • +10/-29 
    Tests
    proxy_spec.rb
    Update and expand Proxy spec for list-based no_proxy         

    rb/spec/unit/selenium/webdriver/proxy_spec.rb

  • Updated tests to expect no_proxy as a list
  • Added tests for various no_proxy separators and formats
  • Improved test coverage for serialization and input handling
  • +10/-4   
    default_spec.rb
    Expand HTTP Default spec for robust no_proxy handling       

    rb/spec/unit/selenium/webdriver/remote/http/default_spec.rb

  • Enhanced tests for proxy and no_proxy environment variable handling
  • Added cases for space-separated, subnet, and wildcard no_proxy
  • Improved test clarity and coverage
  • +34/-2   
    Miscellaneous
    default.rbs
    Remove deprecated proxy method signature from RBS               

    rb/sig/lib/selenium/webdriver/remote/http/default.rbs

  • Removed signature for deprecated use_proxy? method
  • Updated type signatures to match refactored code
  • +0/-2     

    Need help?
  • Type /help how to ... in the comments thread for any questions about Qodo Merge usage.
  • Check out the documentation for more information.
  • Details:
    - Replaced custom use_proxy? method with URI::Generic.use_proxy? to use already existing functionality for handling the `no_proxy` string, which can include a list of comma- or space-separated elements.
    - Take advantage of the `p_no_proxy` variable in Net::HTTP to maintain environment variables related to proxy settings.
    - Note: With this change, wildcards are no longer usable to bypass the proxy due to limitations in the existing proxy code.
    
    Related to SeleniumHQ#5004
    @CLAassistant
    Copy link

    CLAassistant commented May 23, 2025

    CLA assistant check
    All committers have signed the CLA.

    Copy link
    Contributor

    PR Reviewer Guide 🔍

    Here are some key observations to aid the review process:

    🎫 Ticket compliance analysis ❌

    5004 - Partially compliant

    Compliant requirements:

    • Fix the noProxy capability to be a list as required by the WebDriver spec

    Non-compliant requirements:

    • Update the Java binding that currently forces the entry to a string type

    Requires further human verification:

    • Create a wrapper to maintain compatibility with both old drivers and WebDriver spec compliant drivers

    5678 - Not compliant

    Non-compliant requirements:

    • Fix "Error: ConnectFailure (Connection refused)" when instantiating ChromeDriver

    1234 - Not compliant

    Non-compliant requirements:

    • Fix issue where JavaScript in link's href is not triggered on click() in Firefox 42.0

    ⏱️ Estimated effort to review: 3 🔵🔵🔵⚪⚪
    🧪 PR contains tests
    🔒 No security concerns identified
    ⚡ Recommended focus areas for review

    Proxy Behavior Change

    The PR removes wildcard support in no_proxy by leveraging URI::Generic.use_proxy? which doesn't support wildcards. This is a behavior change that should be validated to ensure it doesn't break existing use cases.

    if proxy
      url = proxy.http
      unless proxy.respond_to?(:http) && url
        raise Error::WebDriverError,
              "expected HTTP proxy, got #{proxy.inspect}"
      end
    
      proxy_url = URI.parse(url)
    
      Net::HTTP.new(server_url.host, server_url.port,
                    proxy_url.host, proxy_url.port, proxy_url.user, proxy_url.password,
                    proxy.no_proxy&.join(','))
    else
      Net::HTTP.new(server_url.host, server_url.port, nil)
    end
    String Parsing

    The regex used to parse no_proxy strings may not handle all edge cases. The pattern value.scan(/([^:,\s]+)(:\d+)?/).map(&:join) should be reviewed to ensure it correctly handles all valid no_proxy formats.

    @no_proxy = if value.is_a?(String)
                  value.scan(/([^:,\s]+)(:\d+)?/).map(&:join) # adapted from URI::Generic.use_proxy?
                else
                  value
                end
    Potential Null Reference

    The expression proxy.no_proxy&.join(',') || '' could be simplified to avoid potential issues if no_proxy is an empty array rather than nil.

      self['network.proxy.no_proxies_on'] = proxy.no_proxy&.join(',') || ''
    when :pac

    Copy link
    Contributor

    qodo-merge-pro bot commented May 23, 2025

    PR Code Suggestions ✨

    Explore these optional code suggestions:

    CategorySuggestion                                                                                                                                    Impact
    General
    Clarify test description
    Suggestion Impact:The commit directly implemented the suggested test name change to clarify that wildcard characters in no_proxy are ignored and the proxy is still used

    code diff:

    -            it "uses the specified proxy if the wildcard character is used in #{no_proxy_var}" do
    +            it "ignores wildcard characters in #{no_proxy_var} and uses the proxy" do

    The test name suggests that a wildcard in no_proxy should use the proxy, but
    this behavior contradicts standard proxy behavior. Wildcards in no_proxy
    typically mean "don't use proxy for these domains". The test should be renamed
    to clarify the actual behavior being tested.

    rb/spec/unit/selenium/webdriver/remote/http/default_spec.rb [147-157]

    -it "uses the specified proxy if the wildcard character is used in #{no_proxy_var}" do
    +it "ignores wildcard characters in #{no_proxy_var} and uses the proxy" do
       with_env('http_proxy' => 'proxy.org:8080', no_proxy_var => '*') do
         http = client.send :http
         expect(http).to be_proxy
       end
     
       with_env('http_proxy' => 'proxy.org:8080', no_proxy_var => '*.example.com') do
         http = client.send :http
         expect(http).to be_proxy
       end
     end

    [Suggestion processed]

    Suggestion importance[1-10]: 6

    __

    Why: Renaming the test to clarify that wildcards in no_proxy do not prevent proxy usage improves test readability and developer understanding. The code logic remains unchanged, so the impact is moderate but helpful for maintainability.

    Low
    Handle nil proxy values

    The no_proxy parameter is passed directly to Net::HTTP, but proxy.no_proxy might
    be nil. While &.join(',') handles nil arrays, it's safer to ensure we're passing
    a properly formatted string or nil to Net::HTTP to avoid potential errors.

    rb/lib/selenium/webdriver/remote/http/default.rb [131-133]

     Net::HTTP.new(server_url.host, server_url.port,
                   proxy_url.host, proxy_url.port, proxy_url.user, proxy_url.password,
    -              proxy.no_proxy&.join(','))
    +              proxy.no_proxy.nil? ? nil : proxy.no_proxy.join(','))
    • Apply / Chat
    Suggestion importance[1-10]: 5

    __

    Why: This suggestion makes the handling of a nil no_proxy value more explicit, which can improve code clarity and prevent subtle bugs. However, the use of &.join(',') already safely handles nil, so the improvement is minor.

    Low
    Learned
    best practice
    Add server_url null check
    Suggestion Impact:The commit directly implemented the suggested null check for server_url before it's used, adding the exact line 'raise Error::WebDriverError, 'server_url not set' unless server_url' followed by a blank line

    code diff:

    +            raise Error::WebDriverError, 'server_url not set' unless server_url
    +

    Add a null check for server_url before accessing its properties to prevent
    potential nil reference errors. The method assumes server_url is always
    initialized, but it's safer to validate this assumption.

    rb/lib/selenium/webdriver/remote/http/default.rb [121-136]

     def new_http_client
    +  raise Error::WebDriverError, "server_url not set" unless server_url
    +  
       if proxy
         url = proxy.http
         unless proxy.respond_to?(:http) && url
           raise Error::WebDriverError,
                 "expected HTTP proxy, got #{proxy.inspect}"
         end
     
         proxy_url = URI.parse(url)
     
         Net::HTTP.new(server_url.host, server_url.port,
                       proxy_url.host, proxy_url.port, proxy_url.user, proxy_url.password,
                       proxy.no_proxy&.join(','))
       else
         Net::HTTP.new(server_url.host, server_url.port, nil)
       end
     end

    [Suggestion processed]

    Suggestion importance[1-10]: 6

    __

    Why:
    Relevant best practice - Add null checks for parameters and properties before using them to prevent NullReferenceExceptions.

    Low
    • Update

    @selenium-ci selenium-ci added the C-rb Ruby Bindings label May 23, 2025
    @rgisiger rgisiger changed the title Leverage existing URI::Generic and Net::HTTP code for proxy handling [rb] leverage existing URI::Generic and Net::HTTP code for proxy handling May 23, 2025
    @rgisiger rgisiger changed the title [rb] leverage existing URI::Generic and Net::HTTP code for proxy handling [rb] Leverage existing URI::Generic and Net::HTTP code for proxy handling May 23, 2025
    Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
    Labels
    Projects
    None yet
    Development

    Successfully merging this pull request may close these issues.

    3 participants