[Benchmark] Excessive (and expensive) backtracking #483
@rctcwyvrn, can you add this to the benchmark suite you're working on? Lots of opportunities exposed here. We don't have a compiler IR yet, so we don't want to layer on excess complexity while the execution model is still settling down. But, there's a ton of low-hanging fruit here.
@Azoy, can you try to spin up a PR incorporating this patch and make sure we're testing the boundaries correctly?
This patch gives an additional 10% improvement on this benchmark, on top of the 7x from above that came from reusing the same executor. It still creates a new processor every time, though; avoiding that would give us another 30% beyond the above. @rctcwyvrn, can you work with me on integrating it? Can you also work with me on crafting targeted benchmarks that demonstrate these wins prominently, as well as on reusing the same processor?

After that, we're getting absolutely killed by ARC, which accounts for ~70% of the remaining time after the above improvements. Part of this is how we store […]. We still end up ARCing the executor after the above fixes, and somehow […]. And there is a lot of dumb ARC going on for things like […]. @rctcwyvrn, would you be interested in tracking these down with me and fixing them?

Intermediate-term, we should peephole optimize a single-scalar normalization-invariant […]. @Azoy and @natecook1000, can you think through where we'd need to check for grapheme cluster boundaries, where or how those checks could be elided, and how we could make them faster? Similarly for verbatim sequences.

We should explore instructions for simple repetitions, especially those around a single-scalar Character. This benchmark doesn't yet demonstrate the need, but hopefully after tackling the other low-hanging fruit this becomes more salient. We can also exploit the fact that grapheme breaking for […]. Similarly, we need much cheaper save-point representations and backtracking.

We should switch at some point to prioritizing, or otherwise advantaging, contiguous validly-encoded UTF-8. That is, we branch once at the top and not during every read. This will require some more design and architecture. Currently, the low-hanging fruit would drown out the improvements here, and this would add complexity, so it makes sense to wait until we tackle those other issues.

After we have an IR, we should do post-dominating terminal analysis so that we can replace consume loops with scanning, eliding needless save points and backtracking. Similarly for eliding grapheme boundary checks and save points. It makes sense to wait until we settle our execution model a little before doing this. Similarly for anchor analysis and exiting early for first-match-style operations.
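To make that last point concrete, here is a minimal sketch of the consume-loop-to-scan idea in plain Swift (the function names and the `[Character]` input are illustrative assumptions, not the engine's actual representation): for a pattern like `(\d+)h`, `\d` can never match `h`, so only the maximal digit run can succeed and the quantifier needs no save points at all.

```swift
// A minimal sketch of consume-loop-to-scan conversion, in plain Swift.
// `matchNaive` and `matchScanning` are illustrative, not engine API.

// Naive backtracking for `\d+h` (greedy): every digit consumed by the `+`
// loop records a save point, so matching n digits costs n save points that
// are popped one by one while looking for the trailing `h`.
func matchNaive(_ input: [Character], at start: Int) -> Range<Int>? {
    var savePoints: [Int] = []
    var pos = start
    while pos < input.count, input[pos].isNumber { // stand-in for \d
        pos += 1
        savePoints.append(pos) // one save point per consumed digit
    }
    while let resume = savePoints.popLast() { // greedy: longest run first
        if resume < input.count, input[resume] == "h" {
            return start..<(resume + 1)
        }
    }
    return nil
}

// With post-dominating terminal analysis: `\d` can never match `h`, so the
// only split of the digit run that can succeed is the maximal one. The loop
// becomes a plain scan with zero save points and zero backtracking.
func matchScanning(_ input: [Character], at start: Int) -> Range<Int>? {
    var pos = start
    while pos < input.count, input[pos].isNumber { pos += 1 }
    guard pos > start, pos < input.count, input[pos] == "h" else { return nil }
    return start..<(pos + 1)
}
```

Both functions find the same match in an input like `215553h…`, but the scanning form does constant bookkeeping instead of recording and popping one save point per digit; that per-element save-point traffic is the excessive backtracking this issue's title refers to.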
20x improvement to this micro (nano?) benchmark we extracted from this report: #494

A compounding 2x in #495

A compounding 2-7x, depending, in #497
Sorry for walking into this old thread; I'm happy to move the conversation to other channels, I'm just not entirely sure where would be best, and this thread came up when looking for known issues.

Context: I'm evaluating Swift on the server side (Linux) and I wrote a simple program. I was very excited to discover the RegexBuilder APIs and got off the ground running quickly. However, I noticed that the regexes were noticeably slower than my prior art. To be specific, a simple toy case like

```swift
let msg = "FLRD0056B>OGFLR,qAS,LSZI2:/215553h4730.50N\00757.08En000/000/A=001975 !W56! id3ED0056B -019fpm +0.0rot 22.5dB -9.0kHz gps1x1"
let re = /(?<time>\d+)h/
let N = 1000000
for _ in 0..<N {
    let match = try? re.firstMatch(in: msg)
    _ = match?.time // read the named capture
}
```

built and run in release mode:
It takes 5.7 seconds, which, compared to other runtimes and regex engines, was quite a bit slower. Arguably, the performance of the standard-library regex engine doesn't say much about the language's performance itself (and vice versa), but the factors were certainly larger than what I had expected, and larger than the gaps between the tested alternatives.
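For anyone reproducing this, here is a self-contained harness along the lines of the snippet above. The use of `ContinuousClock` (Swift 5.7+) is my assumption about how to time it, not necessarily how the 5.7 s figure was produced:

```swift
// Hypothetical reproduction harness; the timing method is an assumption.
let msg = "FLRD0056B>OGFLR,qAS,LSZI2:/215553h4730.50N\00757.08En000/000/A=001975 !W56! id3ED0056B -019fpm +0.0rot 22.5dB -9.0kHz gps1x1"
let re = /(?<time>\d+)h/

let clock = ContinuousClock()
let elapsed = clock.measure {
    for _ in 0..<1_000_000 {
        // firstMatch(in:) can throw; ignore errors here, as in the original.
        let match = try? re.firstMatch(in: msg)
        _ = match?.time // named capture `time`, a Substring?
    }
}
print("1,000,000 firstMatch calls: \(elapsed)")
```

Note that the regex is created once outside the loop, so the measured cost is per-match execution, not regex parsing or compilation.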
From the forums: