Commit 3d66b77

Use StringScanner#peek_byte to get double or single quotation mark
## Why?

`StringScanner#peek_byte` is fast because it does not generate a String object.

## Benchmark

```
RUBYLIB= BUNDLER_ORIG_RUBYLIB= /Users/naitoh/.rbenv/versions/3.3.4/bin/ruby -v -S benchmark-driver /Users/naitoh/ghq/github.com/naitoh/rexml/benchmark/parse.yaml
ruby 3.3.4 (2024-07-09 revision be1089c8ec) [arm64-darwin22]
Calculating -------------------------------------
                         before       after  before(YJIT)  after(YJIT)
                 dom     19.896      19.847        35.010       36.082 i/s -     100.000 times in 5.026136s 5.038531s 2.856343s 2.771448s
                 sax     30.249      31.268        54.243       55.914 i/s -     100.000 times in 3.305870s 3.198205s 1.843540s 1.788471s
                pull     34.433      35.661        63.782       68.741 i/s -     100.000 times in 2.904195s 2.804202s 1.567841s 1.454734s
              stream     34.127      35.278        58.097       64.829 i/s -     100.000 times in 2.930236s 2.834602s 1.721255s 1.542522s

Comparison:
                              dom
         after(YJIT):        36.1 i/s
        before(YJIT):        35.0 i/s - 1.03x  slower
              before:        19.9 i/s - 1.81x  slower
               after:        19.8 i/s - 1.82x  slower

                              sax
         after(YJIT):        55.9 i/s
        before(YJIT):        54.2 i/s - 1.03x  slower
               after:        31.3 i/s - 1.79x  slower
              before:        30.2 i/s - 1.85x  slower

                             pull
         after(YJIT):        68.7 i/s
        before(YJIT):        63.8 i/s - 1.08x  slower
               after:        35.7 i/s - 1.93x  slower
              before:        34.4 i/s - 2.00x  slower

                           stream
         after(YJIT):        64.8 i/s
        before(YJIT):        58.1 i/s - 1.12x  slower
               after:        35.3 i/s - 1.84x  slower
              before:        34.1 i/s - 1.90x  slower
```

- YJIT=ON : 1.03x - 1.12x faster
- YJIT=OFF : 1.00x - 1.03x faster
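
The summary ratios can be rechecked from the iterations-per-second figures in the benchmark output. The snippet below is only a sketch of that arithmetic: the `RESULTS` hash and the `speedup` helper are names introduced here for illustration, and the numbers are copied from the table, not re-measured.

```ruby
# Iterations per second, copied from the benchmark output in this commit message.
RESULTS = {
  "dom"    => { before: 19.896, after: 19.847, before_yjit: 35.010, after_yjit: 36.082 },
  "sax"    => { before: 30.249, after: 31.268, before_yjit: 54.243, after_yjit: 55.914 },
  "pull"   => { before: 34.433, after: 35.661, before_yjit: 63.782, after_yjit: 68.741 },
  "stream" => { before: 34.127, after: 35.278, before_yjit: 58.097, after_yjit: 64.829 },
}

# Hypothetical helper: speedup is after / before, rounded to two decimals.
def speedup(before, after)
  (after / before).round(2)
end

RESULTS.each do |name, r|
  plain = speedup(r[:before], r[:after])
  yjit = speedup(r[:before_yjit], r[:after_yjit])
  puts format("%-6s YJIT=OFF %.2fx  YJIT=ON %.2fx", name, plain, yjit)
end
```

With these figures the YJIT ratios span 1.03x to 1.12x. Without YJIT, `dom` rounds to 1.00x and `pull` comes out at roughly 1.04x, marginally above the 1.03x upper bound quoted in the summary.
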
1 parent bb0bedd commit 3d66b77

2 files changed, 27 additions and 2 deletions

lib/rexml/parsers/baseparser.rb

Lines changed: 19 additions & 2 deletions
@@ -157,6 +157,11 @@ module Private
           DEFAULT_ENTITIES_PATTERNS[term] = /&#{term};/
         end
         XML_PREFIXED_NAMESPACE = "http://www.w3.org/XML/1998/namespace"
+
+        QUOTE = [].tap { |x|
+          x['"'.ord] = '"'
+          x["'".ord] = "'"
+        }
       end
       private_constant :Private
 
@@ -766,6 +771,19 @@ def process_instruction
         [:processing_instruction, name, content]
       end
 
+      if StringScanner::Version < "3.1.1"
+        def get_quote
+          @source.match(/(['"])/, true)&.[](1)
+        end
+      else
+        def get_quote
+          if quote = Private::QUOTE[@source.peek_byte]
+            @source.scan_byte
+          end
+          quote
+        end
+      end
+
       def parse_attributes(prefixes)
         attributes = {}
         expanded_names = {}
@@ -785,11 +803,10 @@ def parse_attributes(prefixes)
               message = "Missing attribute equal: <#{name}>"
               raise REXML::ParseException.new(message, @source)
             end
-            unless match = @source.match(/(['"])/, true)
+            unless quote = get_quote
               message = "Missing attribute value start quote: <#{name}>"
               raise REXML::ParseException.new(message, @source)
             end
-            quote = match[1]
             start_position = @source.position
             value = @source.read_until(quote)
             unless value.chomp!(quote)
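
The byte-indexed lookup that the new `get_quote` performs can be tried outside REXML. The following is a minimal sketch, not the parser's code: `QUOTE_TABLE`, `read_quote` and the `respond_to?` fallback are hypothetical names chosen for illustration, and the nil check is an added guard because `peek_byte` returns nil at end of input.

```ruby
require "strscan"

# Sparse array indexed by byte value: only the slots for `"` (34) and `'` (39)
# are filled, so any other byte looks up as nil.
QUOTE_TABLE = [].tap do |table|
  table['"'.ord] = '"'
  table["'".ord] = "'"
end

# Hypothetical helper: consume one quote character at the scanner position
# and return it, or return nil and leave the position unchanged.
def read_quote(scanner)
  if scanner.respond_to?(:peek_byte)
    byte = scanner.peek_byte            # Integer, or nil at end of input
    quote = byte && QUOTE_TABLE[byte]   # guard: Array#[] rejects a nil index
    scanner.scan_byte if quote          # advance past the quote byte
    quote
  else
    # Assumed fallback for strscan versions without peek_byte/scan_byte.
    scanner.scan(/['"]/)
  end
end

scanner = StringScanner.new(%q('value' rest))
quote = read_quote(scanner)              # the opening quote character
value = scanner.scan_until(/#{quote}/)   # text up to and including the closing quote
```

The table returns the same two preallocated strings on every hit, which is the point of the change: the fast path never builds a match result just to learn which quote character was seen.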

lib/rexml/source.rb

Lines changed: 8 additions & 0 deletions
@@ -158,6 +158,14 @@ def position=(pos)
       @scanner.pos = pos
     end
 
+    def peek_byte
+      @scanner.peek_byte
+    end
+
+    def scan_byte
+      @scanner.scan_byte
+    end
+
     # @return true if the Source is exhausted
     def empty?
       @scanner.eos?
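
The two new methods simply forward to the wrapped scanner. A self-contained sketch of that delegation follows; `MiniSource` is a hypothetical stand-in and not `REXML::Source`, and the `else` branch is an assumed fallback so the example also runs where strscan lacks `peek_byte` and `scan_byte`.

```ruby
require "strscan"

# Hypothetical stand-in for a source wrapper; not REXML::Source itself.
class MiniSource
  def initialize(text)
    @scanner = StringScanner.new(text)
  end

  def position
    @scanner.pos
  end

  if StringScanner.method_defined?(:peek_byte)
    # Forward straight to the scanner, as the diff does.
    def peek_byte
      @scanner.peek_byte
    end

    def scan_byte
      @scanner.scan_byte
    end
  else
    # Assumed fallback for older strscan: read the byte as an Integer,
    # then advance the position by hand.
    def peek_byte
      @scanner.string.getbyte(@scanner.pos)
    end

    def scan_byte
      byte = peek_byte
      @scanner.pos += 1 if byte
      byte
    end
  end
end

source = MiniSource.new("'a'")
first = source.peek_byte   # looks at the byte without consuming it
taken = source.scan_byte   # consumes the same byte
```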
