Skip to content

Commit 781e400

Browse files
committed
Improve BufferedIO performance
`BufferedIO` is a bit inefficient for reading large responses because it use the classic `buffer.slice!` technique which cause a lot of unnecessary string copying. This is particularly visible on line based protocol when reading line by line. Instead of repeatedly shifting the string, we can keep track of which offset we're at, to know how many bytes are left in the buffer. This change also open the door to further optimization by increasing the buffer size, as previously `slice!` would get slower the larger the buffer is. Benchmark results: ``` === 1k === Warming up -------------------------------------- 1k 1.234k i/100ms 1k opt 1.283k i/100ms Calculating ------------------------------------- 1k 12.615k (± 0.9%) i/s - 64.168k in 5.086995s 1k opt 12.856k (± 0.9%) i/s - 65.433k in 5.090051s Comparison: 1k: 12615.2 i/s 1k opt: 12856.0 i/s - 1.02x (± 0.00) faster === 10k === Warming up -------------------------------------- 10k 1.165k i/100ms 10k opt 1.269k i/100ms Calculating ------------------------------------- 10k 11.550k (± 2.4%) i/s - 58.250k in 5.046378s 10k opt 12.736k (± 1.0%) i/s - 64.719k in 5.081969s Comparison: 10k: 11550.3 i/s 10k opt: 12736.3 i/s - 1.10x (± 0.00) faster === 100k === Warming up -------------------------------------- 100k 809.000 i/100ms 100k opt 926.000 i/100ms Calculating ------------------------------------- 100k 8.054k (± 3.0%) i/s - 40.450k in 5.028299s 100k opt 9.286k (± 2.2%) i/s - 47.226k in 5.088841s Comparison: 100k: 8053.6 i/s 100k opt: 9285.5 i/s - 1.15x (± 0.00) faster === 1M === Warming up -------------------------------------- 1M 249.000 i/100ms 1M opt 315.000 i/100ms Calculating ------------------------------------- 1M 2.448k (± 2.5%) i/s - 12.450k in 5.089744s 1M opt 3.119k (± 2.6%) i/s - 15.750k in 5.053772s Comparison: 1M: 2447.8 i/s 1M opt: 3118.8 i/s - 1.27x (± 0.00) faster ``` Profiling before (1MB responses): ``` ================================== Mode: wall(1000) Samples: 5276 (0.00% miss rate) GC: 394 (7.47%) ================================== TOTAL (pct) SAMPLES (pct) FRAME 1622 (30.7%) 1622 (30.7%) IO#wait_readable 777 (14.7%) 777 (14.7%) IO#read_nonblock 365 (6.9%) 365 (6.9%) (sweeping) 2705 (51.3%) 364 (6.9%) Net::BufferedIO#rbuf_fill 264 (5.0%) 264 (5.0%) String#index 223 (4.2%) 223 (4.2%) String#sub 221 (4.2%) 221 (4.2%) String#slice! 185 (3.5%) 185 (3.5%) String#split 108 (2.0%) 108 (2.0%) IO#write_nonblock 101 (1.9%) 101 (1.9%) String#downcase 66 (1.3%) 66 (1.3%) Net::BufferedIO#LOG 57 (1.1%) 57 (1.1%) String#count 51 (1.0%) 51 (1.0%) String#to_s 391 (7.4%) 50 (0.9%) Net::HTTPGenericRequest#write_header 50 (0.9%) 50 (0.9%) String#capitalize 49 (0.9%) 49 (0.9%) Array#join 47 (0.9%) 47 (0.9%) String#b 106 (2.0%) 36 (0.7%) Net::HTTPHeader#set_field 34 (0.6%) 34 (0.6%) Module#=== 33 (0.6%) 33 (0.6%) String#[] 140 (2.7%) 29 (0.5%) Net::BufferedIO#write0 29 (0.5%) 29 (0.5%) (marking) 281 (5.3%) 27 (0.5%) Net::BufferedIO#rbuf_consume 1195 (22.6%) 25 (0.5%) Net::HTTPResponse#read_body 1024 (19.4%) 25 (0.5%) Net::HTTPResponse.each_response_header 86 (1.6%) 24 (0.5%) Net::HTTPHeader#set_field 23 (0.4%) 23 (0.4%) Net::HTTP#proxy_uri 51 (1.0%) 23 (0.4%) Net::HTTPHeader#initialize_http_header 2225 (42.2%) 22 (0.4%) Net::BufferedIO#readuntil 20 (0.4%) 20 (0.4%) Regexp#=== ``` Profiling after (1MB responses): ``` ================================== Mode: wall(1000) Samples: 15180 (0.00% miss rate) GC: 1688 (11.12%) ================================== TOTAL (pct) SAMPLES (pct) FRAME 4534 (29.9%) 4534 (29.9%) IO#read_nonblock 10650 (70.2%) 3944 (26.0%) Net::HTTPOpt::BufferedIOOpt#rbuf_fill 2101 (13.8%) 2101 (13.8%) IO#wait_readable 1442 (9.5%) 1442 (9.5%) (sweeping) 360 (2.4%) 360 (2.4%) String#sub 312 (2.1%) 312 (2.1%) String#split 265 (1.7%) 265 (1.7%) String#bytesize 246 (1.6%) 246 (1.6%) (marking) 151 (1.0%) 151 (1.0%) IO#write_nonblock 125 (0.8%) 125 (0.8%) String#downcase 116 (0.8%) 116 (0.8%) String#index 113 (0.7%) 113 (0.7%) Module#=== 162 (1.1%) 89 (0.6%) Net::HTTPOpt::BufferedIOOpt#rbuf_consume_all_shareable! 158 (1.0%) 65 (0.4%) Net::HTTPHeader#set_field 63 (0.4%) 63 (0.4%) String#capitalize 63 (0.4%) 63 (0.4%) BasicObject#equal? 58 (0.4%) 58 (0.4%) Regexp#match 58 (0.4%) 58 (0.4%) String#[] 449 (3.0%) 56 (0.4%) Net::HTTPGenericRequest#write_header 53 (0.3%) 53 (0.3%) String#to_s 52 (0.3%) 52 (0.3%) Net::HTTPOpt::BufferedIOOpt#LOG 52 (0.3%) 52 (0.3%) String#count 44 (0.3%) 44 (0.3%) String#byteslice 44 (0.3%) 44 (0.3%) Array#join 1096 (7.2%) 42 (0.3%) Net::HTTPResponse.each_response_header 2617 (17.2%) 40 (0.3%) Net::HTTPOpt::BufferedIOOpt#readuntil 132 (0.9%) 30 (0.2%) Net::HTTPOpt::BufferedIOOpt#rbuf_consume 28 (0.2%) 28 (0.2%) Regexp#=== 27 (0.2%) 27 (0.2%) Net::HTTP#proxy_uri 8862 (58.4%) 27 (0.2%) Net::HTTPResponse#read_body ```` Benchmark code: ```ruby require "fileutils" DIR = "/tmp/www" FileUtils.mkdir_p(DIR) HOST = "127.0.0.1" PORT = 8080 CONF = <<~EOS daemon off; worker_processes 2; events { worker_connections 128; } http { server_tokens off; charset utf-8; server { server_name localhost; listen #{HOST}:#{PORT}; keepalive_requests 10000000; keepalive_timeout 3600s; error_page 500 502 503 504 /50x.html; location / { root #{DIR}; } } } EOS File.write(File.join(DIR, "1k.txt"), 'a' * 1024) File.write(File.join(DIR, "10k.txt"), 'a' * 1024 * 10) File.write(File.join(DIR, "100k.txt"), 'a' * 1024 * 100) File.write(File.join(DIR, "1M.txt"), 'a' * 1024 * 1024) File.write(File.join(DIR, "nginx.conf"), CONF) require "benchmark/ips" require "net/http" nginx_pid = Process.spawn('nginx', '-c', File.join(DIR, "nginx.conf")) module Net class HTTPOpt < HTTP class BufferedIOOpt < ::Net::BufferedIO #:nodoc: internal use only def initialize(io, read_timeout: 60, write_timeout: 60, continue_timeout: nil, debug_output: nil) @io = io @read_timeout = read_timeout @write_timeout = write_timeout @continue_timeout = continue_timeout @debug_output = debug_output @rbuf = ''.b @rbuf_offset = 0 end attr_reader :io attr_accessor :read_timeout attr_accessor :write_timeout attr_accessor :continue_timeout attr_accessor :debug_output def inspect "#<#{self.class} io=#{@io}>" end def eof? @io.eof? end def closed? @io.closed? end def close @io.close end # # Read # public def read(len, dest = ''.b, ignore_eof = false) LOG "reading #{len} bytes..." read_bytes = 0 begin while read_bytes + rbuf_size < len if s = rbuf_consume_all_shareable! read_bytes += s.bytesize dest << s end rbuf_fill end s = rbuf_consume(len - read_bytes) read_bytes += s.bytesize dest << s rescue EOFError raise unless ignore_eof end LOG "read #{read_bytes} bytes" dest end def read_all(dest = ''.b) LOG 'reading all...' read_bytes = 0 begin while true if s = rbuf_consume_all_shareable! read_bytes += s.bytesize dest << s end rbuf_fill end rescue EOFError ; end LOG "read #{read_bytes} bytes" dest end def readuntil(terminator, ignore_eof = false) offset = @rbuf_offset begin until idx = @rbuf.index(terminator, offset) offset = @rbuf.bytesize rbuf_fill end return rbuf_consume(idx + terminator.bytesize - @rbuf_offset) rescue EOFError raise unless ignore_eof return rbuf_consume end end def readline readuntil("\n").chop end private BUFSIZE = 1024 * 16 def rbuf_fill tmp = @rbuf_empty ? @rbuf : nil case rv = @io.read_nonblock(BUFSIZE, tmp, exception: false) when String @rbuf_empty = false if rv.equal?(tmp) @rbuf_offset = 0 else @rbuf << rv rv.clear end return when :wait_readable (io = @io.to_io).wait_readable(@read_timeout) or raise Net::ReadTimeout.new(io) # continue looping when :wait_writable # OpenSSL::Buffering#read_nonblock may fail with IO::WaitWritable. # http://www.openssl.org/support/faq.html#PROG10 (io = @io.to_io).wait_writable(@read_timeout) or raise Net::ReadTimeout.new(io) # continue looping when nil raise EOFError, 'end of file reached' end while true end def rbuf_flush if @rbuf_empty @rbuf.clear @rbuf_offset = 0 end nil end def rbuf_size @rbuf.bytesize - @rbuf_offset end # Warning: this method may share the buffer to avoid # copying. The caller must no longer use the returned # string once rbuf_fill has been called again def rbuf_consume_all_shareable! @rbuf_empty = true buf = if @rbuf_offset == 0 @rbuf else @rbuf.byteslice(@rbuf_offset..-1) end @rbuf_offset = @rbuf.bytesize buf end def rbuf_consume(len = nil) if @rbuf_offset == 0 && (len.nil? || len == @rbuf.bytesize) s = @rbuf @rbuf = ''.b @rbuf_offset = 0 @rbuf_empty = true elsif len.nil? s = @rbuf.byteslice(@rbuf_offset..-1) @rbuf = ''.b @rbuf_offset = 0 @rbuf_empty = true else s = @rbuf.byteslice(@rbuf_offset, len) @rbuf_offset += len @rbuf_empty = @rbuf_offset == @rbuf.bytesize rbuf_flush end @debug_output << %Q[-> #{s.dump}\n] if @debug_output s end # # Write # public def write(*strs) writing { write0(*strs) } end alias << write def writeline(str) writing { write0 str + "\r\n" } end private def writing @written_bytes = 0 @debug_output << '<- ' if @debug_output yield @debug_output << "\n" if @debug_output bytes = @written_bytes @written_bytes = nil bytes end def write0(*strs) @debug_output << strs.map(&:dump).join if @debug_output orig_written_bytes = @written_bytes strs.each_with_index do |str, i| need_retry = true case len = @io.write_nonblock(str, exception: false) when Integer @written_bytes += len len -= str.bytesize if len == 0 if strs.size == i+1 return @written_bytes - orig_written_bytes else need_retry = false # next string end elsif len < 0 str = str.byteslice(len, -len) else # len > 0 need_retry = false # next string end # continue looping when :wait_writable (io = @io.to_io).wait_writable(@write_timeout) or raise Net::WriteTimeout.new(io) # continue looping end while need_retry end end # # Logging # private def LOG_off @save_debug_out = @debug_output @debug_output = nil end def LOG_on @debug_output = @save_debug_out end def LOG(msg) return unless @debug_output @debug_output << msg + "\n" end end BufferedIO = BufferedIOOpt # Unchanged from ruby 3.1.1, only allow to lookup the mofidied BufferedIO def connect if use_ssl? # reference early to load OpenSSL before connecting, # as OpenSSL may take time to load. @ssl_context = OpenSSL::SSL::SSLContext.new end if proxy? then conn_addr = proxy_address conn_port = proxy_port else conn_addr = conn_address conn_port = port end D "opening connection to #{conn_addr}:#{conn_port}..." begin s = Socket.tcp conn_addr, conn_port, @local_host, @local_port, connect_timeout: @open_timeout rescue => e e = Net::OpenTimeout.new(e) if e.is_a?(Errno::ETIMEDOUT) #for compatibility with previous versions raise e, "Failed to open TCP connection to " + "#{conn_addr}:#{conn_port} (#{e.message})" end s.setsockopt(Socket::IPPROTO_TCP, Socket::TCP_NODELAY, 1) D "opened" if use_ssl? if proxy? plain_sock = BufferedIO.new(s, read_timeout: @read_timeout, write_timeout: @write_timeout, continue_timeout: @continue_timeout, debug_output: @debug_output) buf = "CONNECT #{conn_address}:#{@PORT} HTTP/#{HTTPVersion}\r\n" buf << "Host: #{@address}:#{@PORT}\r\n" if proxy_user credential = ["#{proxy_user}:#{proxy_pass}"].pack('m0') buf << "Proxy-Authorization: Basic #{credential}\r\n" end buf << "\r\n" plain_sock.write(buf) HTTPResponse.read_new(plain_sock).value # assuming nothing left in buffers after successful CONNECT response end ssl_parameters = Hash.new iv_list = instance_variables SSL_IVNAMES.each_with_index do |ivname, i| if iv_list.include?(ivname) value = instance_variable_get(ivname) unless value.nil? ssl_parameters[SSL_ATTRIBUTES[i]] = value end end end @ssl_context.set_params(ssl_parameters) @ssl_context.session_cache_mode = OpenSSL::SSL::SSLContext::SESSION_CACHE_CLIENT | OpenSSL::SSL::SSLContext::SESSION_CACHE_NO_INTERNAL_STORE @ssl_context.session_new_cb = proc {|sock, sess| @ssl_session = sess } D "starting SSL for #{conn_addr}:#{conn_port}..." s = OpenSSL::SSL::SSLSocket.new(s, @ssl_context) s.sync_close = true # Server Name Indication (SNI) RFC 3546 s.hostname = @address if s.respond_to? :hostname= if @ssl_session and Process.clock_gettime(Process::CLOCK_REALTIME) < @ssl_session.time.to_f + @ssl_session.timeout s.session = @ssl_session end ssl_socket_connect(s, @open_timeout) if (@ssl_context.verify_mode != OpenSSL::SSL::VERIFY_NONE) && @ssl_context.verify_hostname s.post_connection_check(@address) end D "SSL established, protocol: #{s.ssl_version}, cipher: #{s.cipher[0]}" end @socket = BufferedIO.new(s, read_timeout: @read_timeout, write_timeout: @write_timeout, continue_timeout: @continue_timeout, debug_output: @debug_output) @last_communicated = nil on_connect rescue => exception if s D "Conn close because of connect error #{exception}" s.close end raise end private :connect end end begin sleep 0.2 connection = Net::HTTP.start(HOST, PORT) connection.keep_alive_timeout = 3600 connection_opt = Net::HTTPOpt.start(HOST, PORT) connection_opt.keep_alive_timeout = 3600 unless connection.request_get("/100k.txt").body == connection_opt.request_get("/100k.txt").body abort("bug?") end if ARGV.first == "profile" require 'stackprof' require 'json' StackProf.run(mode: :wall, out: "/tmp/stackprof-net-http.dump", raw: true) do 40_000.times do connection.request_get("/1M.txt").body end end File.write("/tmp/stackprof-net-http.json", JSON.dump(Marshal.load(File.binread("/tmp/stackprof-net-http.dump")))) system("stackprof", "/tmp/stackprof-net-http.rb") StackProf.run(mode: :wall, out: "/tmp/stackprof-net-http-opt.dump", raw: true) do 40_000.times do connection_opt.request_get("/1M.txt").body end end File.write("/tmp/stackprof-net-http-opt.json", JSON.dump(Marshal.load(File.binread("/tmp/stackprof-net-http-opt.dump")))) system("stackprof", "/tmp/stackprof-net-http-opt.dump") else %w(1k 10k 100k 1M).each do |size| puts "=== #{size} ===" Benchmark.ips do |x| path = "/#{size}.txt" x.report("#{size}") { connection.request_get(path).body } x.report("#{size} opt") { connection_opt.request_get(path).body } x.compare!(order: :baseline) end puts end end ensure Process.kill('TERM', nginx_pid) Process.wait(nginx_pid) end ```
1 parent 9cf40af commit 781e400

File tree

1 file changed

+64
-18
lines changed

1 file changed

+64
-18
lines changed

lib/net/protocol.rb

Lines changed: 64 additions & 18 deletions
Original file line numberDiff line numberDiff line change
@@ -120,6 +120,7 @@ def initialize(io, read_timeout: 60, write_timeout: 60, continue_timeout: nil, d
120120
@continue_timeout = continue_timeout
121121
@debug_output = debug_output
122122
@rbuf = ''.b
123+
@rbuf_offset = 0
123124
end
124125

125126
attr_reader :io
@@ -154,14 +155,15 @@ def read(len, dest = ''.b, ignore_eof = false)
154155
LOG "reading #{len} bytes..."
155156
read_bytes = 0
156157
begin
157-
while read_bytes + @rbuf.size < len
158-
s = rbuf_consume(@rbuf.size)
159-
read_bytes += s.size
160-
dest << s
158+
while read_bytes + rbuf_size < len
159+
if s = rbuf_consume_all_shareable!
160+
read_bytes += s.bytesize
161+
dest << s
162+
end
161163
rbuf_fill
162164
end
163165
s = rbuf_consume(len - read_bytes)
164-
read_bytes += s.size
166+
read_bytes += s.bytesize
165167
dest << s
166168
rescue EOFError
167169
raise unless ignore_eof
@@ -175,9 +177,10 @@ def read_all(dest = ''.b)
175177
read_bytes = 0
176178
begin
177179
while true
178-
s = rbuf_consume(@rbuf.size)
179-
read_bytes += s.size
180-
dest << s
180+
if s = rbuf_consume_all_shareable!
181+
read_bytes += s.bytesize
182+
dest << s
183+
end
181184
rbuf_fill
182185
end
183186
rescue EOFError
@@ -188,14 +191,16 @@ def read_all(dest = ''.b)
188191
end
189192

190193
def readuntil(terminator, ignore_eof = false)
194+
offset = @rbuf_offset
191195
begin
192-
until idx = @rbuf.index(terminator)
196+
until idx = @rbuf.index(terminator, offset)
197+
offset = @rbuf.bytesize
193198
rbuf_fill
194199
end
195-
return rbuf_consume(idx + terminator.size)
200+
return rbuf_consume(idx + terminator.bytesize - @rbuf_offset)
196201
rescue EOFError
197202
raise unless ignore_eof
198-
return rbuf_consume(@rbuf.size)
203+
return rbuf_consume
199204
end
200205
end
201206

@@ -208,12 +213,16 @@ def readline
208213
BUFSIZE = 1024 * 16
209214

210215
def rbuf_fill
211-
tmp = @rbuf.empty? ? @rbuf : nil
216+
tmp = @rbuf_empty ? @rbuf : nil
212217
case rv = @io.read_nonblock(BUFSIZE, tmp, exception: false)
213218
when String
214-
return if rv.equal?(tmp)
215-
@rbuf << rv
216-
rv.clear
219+
@rbuf_empty = false
220+
if rv.equal?(tmp)
221+
@rbuf_offset = 0
222+
else
223+
@rbuf << rv
224+
rv.clear
225+
end
217226
return
218227
when :wait_readable
219228
(io = @io.to_io).wait_readable(@read_timeout) or raise Net::ReadTimeout.new(io)
@@ -228,13 +237,50 @@ def rbuf_fill
228237
end while true
229238
end
230239

231-
def rbuf_consume(len)
232-
if len == @rbuf.size
240+
def rbuf_flush
241+
if @rbuf_empty
242+
@rbuf.clear
243+
@rbuf_offset = 0
244+
end
245+
nil
246+
end
247+
248+
def rbuf_size
249+
@rbuf.bytesize - @rbuf_offset
250+
end
251+
252+
# Warning: this method may share the buffer to avoid
253+
# copying. The caller must no longer use the returned
254+
# string once rbuf_fill has been called again
255+
def rbuf_consume_all_shareable!
256+
@rbuf_empty = true
257+
buf = if @rbuf_offset == 0
258+
@rbuf
259+
else
260+
@rbuf.byteslice(@rbuf_offset..-1)
261+
end
262+
@rbuf_offset = @rbuf.bytesize
263+
buf
264+
end
265+
266+
def rbuf_consume(len = nil)
267+
if @rbuf_offset == 0 && (len.nil? || len == @rbuf.bytesize)
233268
s = @rbuf
234269
@rbuf = ''.b
270+
@rbuf_offset = 0
271+
@rbuf_empty = true
272+
elsif len.nil?
273+
s = @rbuf.byteslice(@rbuf_offset..-1)
274+
@rbuf = ''.b
275+
@rbuf_offset = 0
276+
@rbuf_empty = true
235277
else
236-
s = @rbuf.slice!(0, len)
278+
s = @rbuf.byteslice(@rbuf_offset, len)
279+
@rbuf_offset += len
280+
@rbuf_empty = @rbuf_offset == @rbuf.bytesize
281+
rbuf_flush
237282
end
283+
238284
@debug_output << %Q[-> #{s.dump}\n] if @debug_output
239285
s
240286
end

0 commit comments

Comments
 (0)