Fully generalize "whole match" in the engine and enable transforming custom types #470
5b413ac to 381825d
99d01fc to 8437579
@swift-ci please test
@swift-ci please test Linux
For some reason the CI failure only occurs on Linux, even though both platforms have the same compiler.

[252/256] Compiling RegexBuilderTests RegexDSLTests.swift
/build/swift-experimental-string-processing/Tests/RegexBuilderTests/RegexDSLTests.swift:586:18: error: the compiler is unable to type-check this expression in reasonable time; try breaking up the expression into distinct sub-expressions
let regex3 = Regex {
             ^~~~~~~

Apple Swift version 5.8-dev (LLVM 278d67f38c6a910, Swift ee312bc1e20eb01)
Target: x86_64-apple-macosx11.0

Swift version 5.8-dev (LLVM 278d67f38c6a910, Swift ee312bc1e20eb01)
Target: x86_64-unknown-linux-gnu
8437579 to fef3831
@swift-ci please test
Overall LGTM. I'm not sure why `transform` is back on `capture`; it seems to complicate the model and make it harder to drop the capture enum. Otherwise it looks good.
        return nil
      }
      assert(type(of: result) == resultType)
      return result
    }
  }

  func callAsFunction(_ input: Substring) throws -> Any? {
Out of curiosity, why are these closures all union-ed together into a run-time value?
I suppose we can define a separate CaptureTransform for every capture type, e.g. CaptureTransform and FailableCaptureTransform. But they'd also end up getting unioned as different instruction cases.
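To make the trade-off in this thread concrete, here is a minimal sketch of closures being unioned into a single run-time value. The type `CaptureTransformSketch`, its `Closure` cases, and the two sample transforms are hypothetical names invented for illustration; only the `callAsFunction(_:)` signature and the `resultType` assertion mirror the diff quoted above, and the engine's real type may differ.

```swift
// Hypothetical sketch of a type-erased capture transform.
// Not the engine's actual `CaptureTransform`; names are assumptions.
struct CaptureTransformSketch {
  enum Closure {
    // Always produces a value.
    case nonfailable((Substring) throws -> Any)
    // May return nil to signal that the capture (and the match) fails.
    case failable((Substring) throws -> Any?)
  }

  let resultType: Any.Type
  let closure: Closure

  // One entry point for every kind of closure, so the bytecode only needs
  // a single "transform" instruction payload.
  func callAsFunction(_ input: Substring) throws -> Any? {
    switch closure {
    case .nonfailable(let transform):
      let result = try transform(input)
      assert(type(of: result) == resultType)
      return result
    case .failable(let transform):
      guard let result = try transform(input) else {
        return nil
      }
      assert(type(of: result) == resultType)
      return result
    }
  }
}

// Sample transforms (illustrative only).
let parseInt = CaptureTransformSketch(
  resultType: Int.self,
  closure: .failable({ text in Int(text).map { $0 as Any } })
)
let uppercase = CaptureTransformSketch(
  resultType: String.self,
  closure: .nonfailable({ text in text.uppercased() })
)
```

Splitting this into one struct per closure kind would move the `switch` from `callAsFunction` into the instruction set, which is the "unioned as different instruction cases" point made above.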
0adbff7 to 1d5db25
IMO it's the opposite. I don't think a standalone
@swift-ci please test
8c35b1a to fc678d2
@swift-ci please test
fc678d2 to d645c5f
@swift-ci please test
…custom types

* Track the whole match as an element of the "capture list" in the matching engine. Do so by emitting code as an implicit `capture` around the root node.
* No longer handle `matcher` as a special case within `capture` lowering, because the matcher can be arbitrarily nested within "output-forwarding" nodes, such as a `changeMatchingOptions` non-capturing group. Instead, make the bytecode emitter carry a result value so that a custom output can be propagated through any forwarding nodes.

  ```swift
  Regex {
    Capture(
      SemanticVersionParser()
        .ignoringCase()
        .matchingSemantics(.unicodeScalar)
    ) // This would not work previously.
  }
  ```
* Collapse DSLTree node `transform` into `capture`, because a transform can never be standalone (without a `capture` parent). This greatly simplifies `capture` lowering.
* Make the bytecode's capture transform use type `(Input, _StoredCapture) -> Any` so that it can transform any whole match, not just `Substring`. This means you can now transform any captured value, including a custom-consuming regex component's result!

  ```swift
  Regex {
    "version:"
    OneOrMore(.whitespace)
    Capture {
      SemanticVersionParser() // Regex<SemanticVersion>
    } transform: {
      // (SemanticVersion) -> SomethingElse
    }
  }
  ```

  The transforms of `Capture` and `TryCapture` are now generalized from taking `Substring` to taking generic parameter `W` (the whole match).
* Fix an issue where initial options were applied based solely on whether the bytecode had any instructions, failing examples such as `((?i:.))`. It now checks whether the first matchable atom has been emitted.
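The lowering changes described in this commit message can be pictured with a toy model. This is only a sketch under stated assumptions: `Node`, `Instruction`, `Emitter`, and the register scheme are hypothetical and far simpler than the package's real DSLTree and bytecode. It shows an emitter that returns a result value so a matcher's custom output passes through forwarding nodes, a transform stored on the capture node itself, and an implicit capture around the root.

```swift
// Hypothetical toy model of the lowering; all names are assumptions.
indirect enum Node {
  case literal(String)                    // plain match, no custom output
  case matcher(name: String)              // custom-consuming component
  case changeOptions(Node)                // output-forwarding, non-capturing
  case capture(Node, transform: String?)  // transform lives on the capture
}

enum Instruction: Equatable {
  case match(String)
  case runMatcher(name: String, into: Int)  // writes output to a value register
  case beginCapture
  case endCapture
  case captureValue(register: Int)          // capture takes the custom output
  case transformCapture(String)
}

struct Emitter {
  var instructions: [Instruction] = []
  var nextRegister = 0

  // Returns the register holding the node's custom output, if it has one.
  mutating func emit(_ node: Node) -> Int? {
    switch node {
    case .literal(let text):
      instructions.append(.match(text))
      return nil
    case .matcher(let name):
      let register = nextRegister
      nextRegister += 1
      instructions.append(.runMatcher(name: name, into: register))
      return register
    case .changeOptions(let child):
      // Forwarding node: propagate whatever the child produced.
      return emit(child)
    case .capture(let child, let transform):
      instructions.append(.beginCapture)
      let value = emit(child)
      instructions.append(.endCapture)
      if let value = value {
        instructions.append(.captureValue(register: value))
      }
      if let transform = transform {
        instructions.append(.transformCapture(transform))
      }
      return nil
    }
  }

  // The whole match is tracked as an implicit capture around the root.
  mutating func emitRoot(_ root: Node) {
    _ = emit(.capture(root, transform: nil))
  }
}

// A matcher nested inside a forwarding node, inside a transformed capture.
var emitter = Emitter()
emitter.emitRoot(
  .capture(
    .changeOptions(.matcher(name: "SemanticVersionParser")),
    transform: "toSomethingElse"
  )
)
```

Because `emit` returns the register instead of `capture` pattern-matching on a direct `matcher` child, any number of forwarding nodes can sit between the two without special cases.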
d645c5f to 9d5280e
@swift-ci please test