Catastrophic backtracking #820
@Isildur46 Thanks for the translation! There are a few small formatting issues; please take another look.
The typical symptom -- a regular expression works fine sometimes, but for certain strings it "hangs", consuming 100% of CPU.
典型的症状就是 -- 一个正则表达式有时能正常工作,但在某些特定字符串中就会消耗 100% 的 CPU 算力,出现「挂起」现象,。
Suggested change:
- 典型的症状就是 -- 一个正则表达式有时能正常工作,但在某些特定字符串中就会消耗 100% 的 CPU 算力,出现「挂起」现象,。
+ 典型的症状就是 - 一个正则表达式有时能正常工作,但在某些特定字符串中就会消耗 100% 的 CPU 算力,出现「挂起」现象。
Let's say we have a string, and we'd like to check if it consists of words `pattern:\w+` with an optional space `pattern:\s?` after each.
比如我们现在有一个字符串,我们想检查其中是否包含一些字符 `pattern:\w+` ,我们允许字符后跟着可选的空格符 `pattern:\s?` 。
Suggested change:
- 比如我们现在有一个字符串,我们想检查其中是否包含一些字符 `pattern:\w+` ,我们允许字符后跟着可选的空格符 `pattern:\s?` 。
+ 比如我们现在有一个字符串,我们想检查其中是否包含一些字符 `pattern:\w+`,我们允许字符后跟着可选的空格符 `pattern:\s?`。
Could it be translated as follows instead? Judging by the example further down, "words" here is better rendered as 单词 (words) than as 字符 (characters):
比如我们现在有一个字符串,我们想检查该字符串是不是由一些后面带有可选空格 `pattern:\s?` 的单词 `pattern:\w+` 所组成。
We'll use a regexp `pattern:^(\w+\s?)*$`, it specifies 0 or more such words.
我们使用一个这样的正则 `pattern:^(\w+\s?)*$` ,它指定了 0 个或更多个此类的字符。
Suggested change:
- 我们使用一个这样的正则 `pattern:^(\w+\s?)*$` ,它指定了 0 个或更多个此类的字符。
+ 我们使用一个这样的正则 `pattern:^(\w+\s?)*$`,它指定了 0 个或更多个此类的字符。
And, to make things more obvious, let's replace `pattern:\w` with `pattern:\d`. The resulting regular expression still hangs, for instance:
同时为了让问题更显著,再用 `pattern:\d` 替换掉 `pattern:\w` 。这个新的正则表达式执行时仍然会挂起,比如:
Suggested change:
- 同时为了让问题更显著,再用 `pattern:\d` 替换掉 `pattern:\w` 。这个新的正则表达式执行时仍然会挂起,比如:
+ 同时为了让问题更显著,再用 `pattern:\d` 替换掉 `pattern:\w`。这个新的正则表达式执行时仍然会挂起,比如:
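For context, the behaviour this hunk describes can be tried with a short script. This is a minimal sketch and not part of the PR: the sample strings are made up for illustration and kept deliberately short so that the failing case still returns quickly.

```js
// The regexp from the article, with \w replaced by \d as described in the hunk.
const regexp = /^(\d+\s?)*$/;

// Hypothetical sample inputs (not taken from the article).
const valid = "123 456 789";
const invalid = "123 456!";

console.log(regexp.test(valid));   // true
console.log(regexp.test(invalid)); // false

// Caution: a long run of digits followed by a non-matching character
// (for example 30 digits and then "!") makes this regexp backtrack heavily
// and may appear to hang, so that case is intentionally not executed here.
```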
```
\d+.......\d+
(12345678)(9)!
```
The engine tries to match `pattern:$` again, but fails, because meets `subject:!`:
引擎再次去尝试匹配 `pattern:$` ,但是失败了,因为它遇到了 `subject:!` :
Suggested change:
- 引擎再次去尝试匹配 `pattern:$` ,但是失败了,因为它遇到了 `subject:!` :
+ 引擎再次去尝试匹配 `pattern:$`,但是失败了,因为它遇到了 `subject:!` :
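To make the backtracking in this hunk concrete, the sketch below lists every way a run of digits can be cut into consecutive `pattern:\d+` groups. These are the combinations the engine may end up trying before it gives up on a string that ends in `subject:!`. The helper `splits` is hypothetical and written only for illustration; it is not how a regexp engine is implemented.

```js
// Return every way to cut `digits` into one or more non-empty consecutive parts.
// Each part stands for one repetition of \d+ inside (\d+)*.
function splits(digits) {
  if (digits.length === 0) return [[]]; // nothing left: one way, with no parts
  const result = [];
  for (let i = 1; i <= digits.length; i++) {
    const head = digits.slice(0, i);            // first group takes i digits
    for (const rest of splits(digits.slice(i))) {
      result.push([head, ...rest]);             // then split the remainder
    }
  }
  return result;
}

// 4 ways: "1 2 3", "1 23", "12 3", "123"
console.log(splits("123").map(parts => parts.join(" ")));

// The count doubles with every extra digit: 9 digits give 2^8 = 256 ways.
console.log(splits("123456789").length); // 256
```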
For example, the previous pattern `pattern:(\w+\s?)*` could match the word `subject:string` as two `pattern:\w+`:
举个例子,之前那个模式 `pattern:(\w+\s?)*` 可能以两个 `pattern:\w+` 的方式来匹配单词 `subject:string` :
Suggested change:
- 举个例子,之前那个模式 `pattern:(\w+\s?)*` 可能以两个 `pattern:\w+` 的方式来匹配单词 `subject:string` :
+ 举个例子,之前那个模式 `pattern:(\w+\s?)*` 可能以两个 `pattern:\w+` 的方式来匹配单词 `subject:string`:
```js run
\w+\w+
string
```
The previous pattern, due to the optional `pattern:\s` allowed variants `pattern:\w+`, `pattern:\w+\s`, `pattern:\w+\w+` and so on.
之前那个模式,由于存在可选的 `pattern:\s` ,它允许 `pattern:\w+`,`pattern:\w+\s`,`pattern:\w+\w+` 等等的变体形式。
Suggested change:
- 之前那个模式,由于存在可选的 `pattern:\s` ,它允许 `pattern:\w+`,`pattern:\w+\s`,`pattern:\w+\w+` 等等的变体形式。
+ 之前那个模式,由于存在可选的 `pattern:\s`,它允许 `pattern:\w+`、`pattern:\w+\s` 和 `pattern:\w+\w+` 等等的变体形式。
E.g. in the regexp `pattern:(\d+)*$` it's obvious for a human, that `pattern:+` shouldn't backtrack. If we replace one `pattern:\d+` with two separate `pattern:\d+\d+`, nothing changes:
比如,正则 `pattern:(\d+)*$` 中 `pattern:+` 对于我们人类来说很明显不应去回溯,就算我们用两个独立的 `pattern:\d+\d+` 去替换一个 `pattern:\d+`, 也是根本没作用的:
Suggested change:
- 比如,正则 `pattern:(\d+)*$` 中 `pattern:+` 对于我们人类来说很明显不应去回溯,就算我们用两个独立的 `pattern:\d+\d+` 去替换一个 `pattern:\d+`, 也是根本没作用的:
+ 比如,正则 `pattern:(\d+)*$` 中 `pattern:+` 对于我们人类来说很明显不应去回溯,就算我们用两个独立的 `pattern:\d+\d+` 去替换一个 `pattern:\d+`,也是根本没作用的:
For instance, in the word `subject:JavaScript` it may not only match `match:Java`, but leave out `match:Script` to match the rest of the pattern.
例如,在单词 `subject:JavaScript` 不仅可以匹配 `match:Java`,而且可以忽略 `match:Script` ,匹配模式的其余部分。
Suggested change:
- 例如,在单词 `subject:JavaScript` 不仅可以匹配 `match:Java`,而且可以忽略 `match:Script` ,匹配模式的其余部分。
+ 例如,在单词 `subject:JavaScript` 中不仅可以匹配 `match:Java`,而且可以忽略 `match:Script`,匹配模式的其余部分。
Here `pattern:\2` is used instead of `pattern:\1`, because there are additional outer parentheses. To avoid messing up with the numbers, we can give the parentheses a name, e.g. `pattern:(?<word>\w+)`.
这里我们用 `pattern:\2` 代替 `pattern:\1`,因为这里附加了额外的外部括号。为了防止数字产生混淆,,我们可以给括号命名,例如 `pattern:(?<word>\w+)`。
Suggested change:
- 这里我们用 `pattern:\2` 代替 `pattern:\1`,因为这里附加了额外的外部括号。为了防止数字产生混淆,,我们可以给括号命名,例如 `pattern:(?<word>\w+)`。
+ 这里我们用 `pattern:\2` 代替 `pattern:\1`,因为这里附加了额外的外部括号。为了防止数字产生混淆,我们可以给括号命名,例如 `pattern:(?<word>\w+)`。
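A minimal sketch of the two spellings this hunk mentions, numbered and named. It assumes the regexp under discussion is the lookahead-based `pattern:^((?=(\w+))\2\s?)*$`, which is not shown in this thread; the sample strings are made up for illustration.

```js
// Numbered form (assumed pattern): the outer parentheses are group 1,
// so the word captured inside the lookahead is group 2.
const numbered = /^((?=(\w+))\2\s?)*$/;

// Named form: the same regexp, with the inner group called "word"
// and referenced through \k<word> instead of a number.
const named = /^((?=(?<word>\w+))\k<word>\s?)*$/;

// Hypothetical sample inputs.
const good = "A good string";
const bad = "A bad string!";

console.log(numbered.test(good), named.test(good)); // true true
console.log(numbered.test(bad), named.test(bad));   // false false
```

Both forms should behave identically; the named one only avoids recounting parentheses when the pattern is edited later.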
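Please make the requested changes. After that, add a comment "/done".
The formatting has been fixed and the PR updated. @bemself
/done
/done
Thank you 💖 I updated the Progress Issue #324 🎉 🎉 🎉
Thanks for the translation and the review 👍 The later part of the chapter has since been updated and changed, so I have merged this as it is; further updates and polishing will follow. You are welcome to keep sending PRs for more content.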
Target chapter: 9-regular-expressions/15-regexp-catastrophic-backtracking
Latest upstream commit: javascript-tutorial/en.javascript.info@32e20fc#diff-daa73ffcb00f33284c228961daab9209
The changes made in this PR are as follows: