Skip to content

Reuse the executor in firstMatch #489

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Closed
wants to merge 4 commits into from

Conversation

rctcwyvrn
Copy link
Contributor

Reset the processor and it's registers instead of recreating them from scratch each iteration in firstMatch

Using this benchmark https://github.com/rctcwyvrn/swift-experimental-string-processing/blob/lily-benchmarker/Sources/RegexBenchmark/Suite/FirstMatch.swift we get a pretty big speedup, 60ms to 10ms on my machine

There isn't much benefit in the original test case from #483 (BasicBacktrack in the suite) but maybe it'll be more pronounced once the other patch mentioned in that issue is implemented. There is also some slowdown in the other benchmarks and I'm not sure why. Since wholeMatch calls into _firstMatch we also see some speedup in the css benchmark

Before

Running
- ReluctantQuant 36.4ms
- ReluctantQuantWithTerminal 151ms
- EagarQuantWithTerminal 37.8ms
- BasicBacktrack 299ms
- BasicBacktrackNS 617µs
- BasicBacktrackFirstMatch 297ms
- BasicBacktrackNSFirstMatch 598µs
- cssRegex 174ms
- cssRegexNS 1.02ms
- FirstMatch 61.4ms
- FirstMatchNS 256µs
- AllMatches 295ms
- AllMatchesNS 819µs

After

Running
- ReluctantQuant 44.6ms
- ReluctantQuantWithTerminal 158ms
- EagarQuantWithTerminal 44.3ms
- BasicBacktrack 292ms
- BasicBacktrackNS 598µs
- BasicBacktrackFirstMatch 292ms
- BasicBacktrackNSFirstMatch 599µs
- cssRegex 155ms
- cssRegexNS 1.02ms
- FirstMatch 10.5ms
- FirstMatchNS 254µs
- AllMatches 296ms
- AllMatchesNS 813µs

@rctcwyvrn rctcwyvrn requested a review from milseman June 15, 2022 20:17
@rctcwyvrn
Copy link
Contributor Author

Looking at ReluctantQuant the regression seems to be due to an extra memmove in Processor.cycle(). Not sure what's causing this extra move though

Before

3.21 s   70.1%	71.00 ms	 	             specialized Processor.cycle()
1.13 s   24.7%	32.00 ms	 	              closure #1 in Compiler.ByteCodeGen.emitAny()
512.00 ms   11.1%	378.00 ms	 	              specialized Processor.cycle()
426.00 ms    9.3%	426.00 ms	 	              swift_release

After

4.08 s   73.1%	86.00 ms	 	              specialized Processor.cycle()
1.15 s   20.6%	25.00 ms	 	               closure #1 in Compiler.ByteCodeGen.emitAny()
736.00 ms   13.1%	736.00 ms	 	               _platform_memmove
439.00 ms    7.8%	439.00 ms	 	               swift_release
420.00 ms    7.5%	244.00 ms	 	               specialized Processor.cycle()

@@ -32,7 +32,7 @@ struct Processor<
typealias Element = Input.Element

let input: Input
let bounds: Range<Position>
var bounds: Range<Position>
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We have a bug or series of bugs that @natecook1000 is working on, but we probably won't want to be changing the bounds here.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

So we're fine with leaving the lower bound at the beginning of the string? I guess it still makes sense as the bounds of the string and we just need to set currentPosition to be where we want to start from when we reset?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Right, though it might temporarily break some firstMatch tests until some anchors get fixed. @natecook1000 can you work with Lily here?

for idx in xs.indices {
xs[idx] = v
}
}
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Or as an extension on MutableCollection. @natecook1000 does such an algorithm exist?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We could try these out here before adding them to swift-algorithms:

extension MutableCollection {
  mutating func setAll(to element: Element) {
    self.withEach { $0 = element }
  }
  
  mutating func withEach(_ body: (inout Element) throws -> Void) rethrows {
    var i = startIndex
    while i < endIndex {
      try body(&self[i])
      formIndex(after: &i)
    }
  }
}

// usage:
self.bools.setAll(to: false)

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Also considering an underscore them so that no one accidentally ships it 😅

Comment on lines +47 to +49
input,
&cpu,
startingFrom: low
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
input,
&cpu,
startingFrom: low
input, &cpu, startingFrom: low

@milseman
Copy link
Member

The memmoves come from having to re-grow the save point stack, etc. Have you tried measuring now that we don't make new arrays?

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants