Improve time complexity of tokenization regex used in diffSentences #580

ExplodingCabbage · 2025-02-08T21:38:05Z

The performance bug here is essentially the same as the one I describe at https://markamery.com/blog/quadratic-time-regexes/: a regex search using a regex that starts with a pattern repeated with + or * is by default a quadratic-time operation, unless you put a lookbehind before it, which we weren't doing.
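A minimal sketch of that effect follows. The two patterns are hypothetical stand-ins chosen for illustration, not the regex used in diffSentences, and the helper names `firstMatch` and `timeSearch` are made up for this example. Both patterns find the same first match, but on an input where the search must fail, the unguarded one retries from every offset in the run and rescans the rest of the run each time.

```javascript
// Illustrative patterns only; these are NOT the regexes used by diffSentences.
const quadratic = /a+b/;      // starts with a repeated pattern
const linear = /(?<!a)a+b/;   // same first match, but a lookbehind guards the start

// Return the first match as { index, text }, or null when there is none.
function firstMatch(regex, input) {
  const m = regex.exec(input);
  return m === null ? null : { index: m.index, text: m[0] };
}

// Time a single search in milliseconds (coarse; only the trend matters).
function timeSearch(regex, input) {
  const start = Date.now();
  regex.exec(input);
  return Date.now() - start;
}

// Worst case: a long run of 'a' with no 'b' anywhere, so the search must fail.
// For /a+b/, a backtracking engine tries every start offset and rescans the
// remainder of the run from each one. With the lookbehind, every offset after
// the first is rejected immediately because it is preceded by 'a'.
for (const n of [2000, 4000, 8000]) {
  const input = 'a'.repeat(n);
  console.log(
    `n=${n}`,
    `unguarded: ${timeSearch(quadratic, input)}ms`,
    `with lookbehind: ${timeSearch(linear, input)}ms`
  );
}
```

Exact timings depend on the engine and machine. The thing to look for is that doubling `n` roughly quadruples the time for the unguarded pattern, while the guarded one should grow roughly linearly.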

While adding tests to check that I wasn't breaking anything with this fix, I realised that some of the existing behaviour was perverse, so I settled on a fix that changes some of that perverse behaviour while also fixing the "ReDOS" issue (which Snyk reported as such via email).

(As with many "ReDOS" "vulns", I have conflicting & complicated thoughts on whether it's reasonable to consider this a security vulnerability in the first place, as Snyk do. That question of labeling doesn't change that it's bad and should be fixed, though.)

Improve time complexity of tokenization regex used in diffSentences

1f94e14

ExplodingCabbage self-assigned this Feb 8, 2025

ExplodingCabbage added 2 commits February 12, 2025 10:34

Better fix

60fa6c7

Document behaviour changes

0ea4aa0

ExplodingCabbage marked this pull request as ready for review February 14, 2025 16:26

ExplodingCabbage merged commit 4c8f444 into master Feb 14, 2025

ExplodingCabbage deleted the fix-bad-time-complexity-of-diffSentences-regex branch February 14, 2025 16:55

ExplodingCabbage mentioned this pull request Feb 14, 2025

Purge more quadratic-time regexes #581

Merged
