[SE-0351] Revise regex builder proposal #1634

Merged: 12 commits, Jul 6, 2022

Contributor

@rxwei rxwei commented Apr 25, 2022

Revision:

Clarification:

  • Clarify that RegexComponent and Regex: RegexComponent will be in the stdlib, not in RegexBuilder.
  • Make detailed design driven by example and move API definition to a collapsible. This is so that the complex result builder machinery won't obscure the API design.
  • Add alternative considered section about unifying Capture and TryCapture.

@rxwei rxwei requested review from milseman, natecook1000 and Azoy April 25, 2022 08:55
@rxwei rxwei force-pushed the 0351-dsl-update branch 6 times, most recently from e02cfbf to f73b130, April 25, 2022 09:05
Member

@milseman milseman left a comment

Quick review of the first two-thirds of this PR. Sorry it's not more in-depth, but I wanted to get the feedback out early.


By conforming standard library types to `RegexComponent`, we allow them to be used inside the regex builder DSL as a match target.
Note:
- `RegexComponent` and `Regex`'s conformance to `RegexComponent` are available without importing `RegexBuilder`. All other types and conformances introduced in this proposal are in the `RegexBuilder` module.
Member

Is this where we landed, with the String conformance not being in the stdlib proper (i.e. Swift)? I thought we were leaning the other way but I'll have to crawl over conversation threads to see.

Either way, let's be very explicit what we mean by "other types". Perhaps pointing out String but also b, c, and d.

There's also a very real possibility of CharacterClass getting pulled into the Swift module at some point, perhaps even during the Unicode review.

Contributor Author

@rxwei rxwei Apr 26, 2022

> Is this where we landed, with the String conformance not being in the stdlib proper (i.e. Swift)?

Yes.

> Either way, let's be very explicit what we mean by "other types".

"All other types" seems pretty clear to me. The sentence below it already says explicitly:

> By conforming standard library types to RegexComponent, we allow them to .... These conformances are available in the RegexBuilder module.
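
For readers following this thread, here is a minimal sketch of what the quoted sentence means in practice, assuming Swift 5.7 or later with the shipped `RegexBuilder` module. The name `versionTag` and the sample input are made up for illustration and are not taken from the proposal.

```swift
import RegexBuilder

// String literals and CharacterClass values can be written directly in the
// builder body because RegexBuilder supplies their RegexComponent conformances.
let versionTag = Regex {
    "v"                           // a String used as a verbatim component
    Capture(OneOrMore(.digit))    // CharacterClass.digit inside a quantifier
    "."                           // matched literally, not as "any character"
    Capture(OneOrMore(.digit))
}

if let match = "v12.7".wholeMatch(of: versionTag) {
    // Output is a tuple: the whole match followed by one element per capture.
    let (whole, major, minor) = match.output
    print(whole, major, minor)
}
```

Without `import RegexBuilder`, only `Regex` itself is usable as a `RegexComponent`, which is the split the note in the diff describes.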

Contributor

@Azoy Azoy left a comment

I think this reads really well now (including Michael's comments). I'm super excited for this!

@milseman
Member

milseman commented Apr 26, 2022

Argh, GitHub comment threads are so hard to see over time. The remaining point is:

> A regex is considered invalid when it contains a use of reference without it ever being captured in the regex. When this occurs in the regex builder DSL, a runtime error will be reported. Similarly, the use of a reference in a Regex.Match.subscript(_:) must have been captured in the regex that produced the match.

Is the intent to trap if that branch of an alternation wasn't taken, or to trap if the reference wasn't used with the regex?

edit: to clarify, "ever being captured" could be confused as meaning it was set to a value instead of nil. I.e. captured at run-time instead of compile-time (noting that compilation time is not Swift compilation time but regex compilation time).

Maybe we can define "regex compilation time" as a term and use that to clarify?
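
To make the distinction concrete, the sketch below shows only the uncontroversial case: a reference that is bound by a `Capture` when the regex is built and then used to subscript the resulting match. It assumes Swift 5.7 or later; the names `year` and `dated` are hypothetical. It deliberately does not exercise the alternation case asked about above, since that behavior is the open question in this thread.

```swift
import RegexBuilder

// A typed reference, created before the regex is built.
let year = Reference(Substring.self)

let dated = Regex {
    // The reference is tied to this capture when the regex is constructed,
    // i.e. at what the comment above calls "regex compilation time".
    Capture(as: year) {
        OneOrMore(.digit)
    }
    "-"
    OneOrMore(.digit)
}

if let match = "2022-04".wholeMatch(of: dated) {
    // Valid lookup: `year` was captured in the regex that produced `match`.
    print(match[year])
}
```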

@rxwei rxwei force-pushed the 0351-dsl-update branch from f73b130 to f2dfbab, April 26, 2022 01:46
Member

@milseman milseman left a comment

Temporarily marking this as requesting changes concerning the trapping behavior of reference subscript and some other concerns. Don't want this to be merged accidentally.

@rxwei rxwei force-pushed the 0351-dsl-update branch from 23379d7 to 1198ce2, April 27, 2022 04:59
Revision:
- Capture takes throwing closures. (swiftlang/swift-experimental-string-processing#261)
- Rename `Output` associated type to `RegexOutput`. (swiftlang/swift-experimental-string-processing#281)
- Define primary associated type for `RegexComponent`.

Clarification:
- Clarify that `RegexComponent` and `Regex: RegexComponent` will be in the stdlib, not in `RegexBuilder`.
- Make detailed design driven by example and move API definition to a collapsible. This is so that the complex result builder machinery won't obscure the API design.
- Add alternative considered section about unifying `Capture` and `TryCapture`.
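
As a rough illustration of the first revision item, the sketch below passes a throwing transform to `Capture` and an optional-returning transform to `TryCapture`. It assumes Swift 5.7 or later; `InvalidNumber`, `parseComponent`, and `version` are hypothetical names introduced only for this example.

```swift
import RegexBuilder

// Hypothetical error type and helper, used only for illustration.
struct InvalidNumber: Error {}

func parseComponent(_ text: Substring) throws -> Int {
    guard let value = Int(text) else { throw InvalidNumber() }
    return value
}

let version = Regex {
    // Capture accepts a throwing transform closure.
    Capture {
        OneOrMore(.digit)
    } transform: { try parseComponent($0) }
    "."
    // TryCapture's transform returns an optional; nil rejects the capture.
    TryCapture {
        OneOrMore(.digit)
    } transform: { Int($0) }
}

if let match = "12.7".wholeMatch(of: version) {
    // Both captures arrive already converted to Int.
    let (_, major, minor) = match.output
    print(major, minor)
}
```

The output tuple here is `(Substring, Int, Int)`, which is the type the renamed `RegexOutput` associated type reports for this regex.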
@rxwei rxwei force-pushed the 0351-dsl-update branch from 1198ce2 to 5478972, April 27, 2022 05:38
/// An anchor that matches at the first position of a match in the input
/// string.
///
/// This anchor is equivalent to `\y` in regex syntax.
Member

Suggested change
- /// This anchor is equivalent to `\y` in regex syntax.
+ /// This anchor is equivalent to `\G` in regex syntax.

IIRC

Contributor Author

\y is correct according to @natecook1000

}
}.map { _, c1, c2, c3 in
    SemanticVersion(major: c1, minor: c2, patch: c3 ?? 0)
}
Member

This is such a compelling use

@milseman milseman self-requested a review May 4, 2022 01:40
@benrimmington
Contributor

In the "Future directions" section, two instances of CustomMatchingRegexComponent could be updated to CustomConsumingRegexComponent.


At the end of the document, [Regex Syntax] is a broken link, and [String Processing Algorithms] could also be updated to:

[Regex Syntax]:                 https://github.com/apple/swift-evolution/blob/main/proposals/0355-regex-syntax-run-time-construction.md
[String Processing Algorithms]: https://github.com/apple/swift-evolution/blob/main/proposals/0357-regex-string-processing-algorithms.md

rxwei added a commit to rxwei/swift-experimental-string-processing that referenced this pull request Jun 14, 2022
`buildEither` was removed from the regex builder DSL proposal. See swiftlang/swift-evolution#1634.
@airspeedswift airspeedswift merged commit 164ca7f into swiftlang:main Jul 6, 2022

7 participants