Commit a91b951

DSL pitch fixes (#209)
1 parent 69d23a8 commit a91b951

1 file changed

+120
-59
lines changed

Documentation/Evolution/RegexBuilderDSL.md

+120-59
@@ -1,17 +1,10 @@
11
# Regex builder DSL
22

33
* Proposal: [SE-NNNN](NNNN-filename.md)
4-
* Authors: [Richard Wei](https://github.com/rxwei), ...
4+
* Authors: [Richard Wei](https://github.com/rxwei)
55
* Review Manager: TBD
6-
* Status: **Awaiting implementation**
7-
8-
*During the review process, add the following fields as needed:*
9-
10-
* Implementation: [apple/swift#NNNNN](https://github.com/apple/swift/pull/NNNNN) or [apple/swift-evolution-staging#NNNNN](https://github.com/apple/swift-evolution-staging/pull/NNNNN)
11-
* Decision Notes: [Rationale](https://forums.swift.org/), [Additional Commentary](https://forums.swift.org/)
12-
* Bugs: [SR-NNNN](https://bugs.swift.org/browse/SR-NNNN), [SR-MMMM](https://bugs.swift.org/browse/SR-MMMM)
13-
* Previous Revision: [1](https://github.com/apple/swift-evolution/blob/...commit-ID.../proposals/NNNN-filename.md)
14-
* Previous Proposal: [SE-XXXX](XXXX-filename.md**
6+
* Implementation: [apple/swift-experimental-string-processing](https://github.com/apple/swift-experimental-string-processing/tree/main/Sources/_StringProcessing/RegexDSL)
7+
* Status: **Pitch**
158

169
**Table of Contents**
1710
- [Introduction](#introduction)
@@ -340,36 +333,43 @@ public enum RegexComponentBuilder {
340333
line: Int = #line,
341334
column: Int = #column
342335
) -> Component<R>
343-
344-
/// Provides support for “if” statements in multi-statement closures, producing
345-
/// conditional content for the “then” branch.
346-
public static func buildEither<R: RegexComponent>(
347-
first component: Component<R>
348-
) -> Regex<R.Output> {
349-
component
350-
}
351-
352-
/// Provides support for “if-else” statements in multi-statement closures,
353-
/// producing conditional content for the “else” branch.
354-
public static func buildEither<R: RegexComponent>(
355-
second component: Component<R>
356-
) -> Regex<R.Output> {
357-
component
358-
}
359336
}
360337
```
361338

362339
When it comes to concatenation, `RegexComponentBuilder` utilizes the [recently proposed `buildPartialBlock` feature](https://forums.swift.org/t/pitch-buildpartialblock-for-result-builders/55561/1) to be able to concatenate all components' capture types to a single result tuple. `buildPartialBlock(first:)` provides support for creating a regex from a single component, and `buildPartialBlock(accumulated:next:)` support for creating a regex from multiple results.
363340

364-
Before Swift supports variadic generics, `buildPartialBlock(accumulated:next:)` must be overloaded to support concatenating regexes of supported capture quantities (arities). Due to the need for concatenating any pair of regexes that make up 10 captures, `buildPartialBlock(accumulated:next:)` is overloaded up to `arity^2` times.
341+
Before Swift supports variadic generics, `buildPartialBlock(first:)` and `buildPartialBlock(accumulated:next:)` must be overloaded to support concatenating regexes of supported capture quantities (arities).
342+
- `buildPartialBlock(first:)` is overloaded `arity` times such that a unary block with a component of any supported capture arity will produce a regex with capture type `Substring` followed by the component's capture types. The base overload, `buildPartialBlock<R>(first:) -> Regex<Substring>`, must be marked with `@_disfavoredOverload` to prevent it from shadowing other overloads.
343+
- `buildPartialBlock(accumulated:next:)` is overloaded up to `arity^2` times to account for all possible pairs of regexes that together make up to 10 captures.
365344

366-
In the initial version of the DSL, we plan to support regexes with up to 10 captures, as 10 captures are sufficient for most use cases. These overloads can be superceded by a variadic version of `buildPartialBlock(accumulated:next:)` in a future release.
345+
In the initial version of the DSL, we plan to support regexes with up to 10 captures, as 10 captures are sufficient for most use cases. These overloads can be superseded by variadic versions of `buildPartialBlock(first:)` and `buildPartialBlock(accumulated:next:)` in a future release.
367346

368347
```swift
369348
extension RegexComponentBuilder {
349+
// The following builder methods implement what would be possible with
350+
// variadic generics (using imaginary syntax) as a single method:
351+
//
352+
// public static func buildPartialBlock<
353+
// R, WholeMatch, Capture...
354+
// >(
355+
// first component: Component<R>
356+
// ) -> Regex<(Substring, Capture...)>
357+
// where R.Output == (WholeMatch, Capture...)
358+
359+
@_disfavoredOverload
370360
public static func buildPartialBlock<R: RegexComponent>(
371-
first r: Compoment<R>
372-
) -> Regex<R.Output>
361+
first r: Component<R>
362+
) -> Regex<Substring>
363+
364+
public static func buildPartialBlock<W, C0, R: RegexComponent>(
365+
first r: Component<R>
366+
) -> Regex<(Substring, C0)> where R.Output == (W, C0)
367+
368+
public static func buildPartialBlock<W, C0, C1, R: RegexComponent>(
369+
first r: Component<R>
370+
) -> Regex<(Substring, C0, C1)> where R.Output == (W, C0, C1)
371+
372+
// ... `O(arity)` overloads of `buildPartialBlock(first:)`
373373

374374
// The following builder methods implement what would be possible with
375375
// variadic generics (using imaginary syntax) as a single method:
@@ -379,18 +379,18 @@ extension RegexComponentBuilder {
379379
// AccumulatedCapture..., NextCapture...,
380380
// Accumulated: RegexComponent, Next: RegexComponent
381381
// >(
382-
// accumulated: Accumulated, next: Next
382+
// accumulated: Accumulated, next: Component<Next>
383383
// ) -> Regex<(Substring, AccumulatedCapture..., NextCapture...)>
384384
// where Accumulated.Output == (AccumulatedWholeMatch, AccumulatedCapture...),
385385
// Next.Output == (NextWholeMatch, NextCapture...)
386386

387387
public static func buildPartialBlock<W0, W1, C0, R0: RegexComponent, R1: RegexComponent>(
388388
accumulated: R0, next: Component<R1>
389-
) -> Regex<(Substring, C0)> where R0.Output == W0, R1.Output == (W1, C0)
389+
) -> Regex<(Substring, C0)> where R0.Output == W0, R1.Output == (W1, C0)
390390

391391
public static func buildPartialBlock<W0, W1, C0, C1, R0: RegexComponent, R1: RegexComponent>(
392392
accumulated: R0, next: Component<R1>
393-
) -> Regex<(Substring, C0, C1)> where R0.Output == W0, R1.Output == (W1, C0, C1)
393+
) -> Regex<(Substring, C0, C1)> where R0.Output == W0, R1.Output == (W1, C0, C1)
394394

395395
public static func buildPartialBlock<W0, W1, C0, C1, C2, R0: RegexComponent, R1: RegexComponent>(
396396
accumulated: R0, next: Component<R1>
@@ -400,10 +400,68 @@ extension RegexComponentBuilder {
400400
}
401401
```
402402

403-
To support `if` statements, `buildOptional(_:)` is defined with overloads to support up to 10 captures because each capture type needs to be transformed to an optional. The overload for non-capturing regexes, due to the lack of generic constraints, must be annotated with `@_disfavoredOverload` in order not to become the default choice by the compiler. We expect that a variadic-generic version of this method will eventually superceded all of these overloads.
403+
To support `if` statements, `buildEither(first:)`, `buildEither(second:)` and `buildOptional(_:)` are defined with overloads to support up to 10 captures because each capture type needs to be transformed to an optional. The overload for non-capturing regexes, due to the lack of generic constraints, must be annotated with `@_disfavoredOverload` in order not to shadow other overloads. We expect that variadic-generic versions of these methods will eventually supersede all of these overloads.
404404

405405
```swift
406406
extension RegexComponentBuilder {
407+
// The following builder methods implement what would be possible with
408+
// variadic generics (using imaginary syntax) as a single method:
409+
//
410+
// public static func buildEither<
411+
// Component, WholeMatch, Capture...
412+
// >(
413+
// first component: Component
414+
// ) -> Regex<(Substring, Capture...)>
415+
// where Component.Output == (WholeMatch, Capture...)
416+
417+
public static func buildEither<R: RegexComponent>(
418+
first component: Component<R>
419+
) -> Regex<Substring> {
420+
component
421+
}
422+
423+
public static func buildEither<W, C0, R: RegexComponent>(
424+
first component: Component<R>
425+
) -> Regex<(Substring, C0)> where R.Output == (W, C0) {
426+
component
427+
}
428+
429+
public static func buildEither<W, C0, C1, R: RegexComponent>(
430+
first component: Component<R>
431+
) -> Regex<(Substring, C0, C1)> where R.Output == (W, C0, C1) {
432+
component
433+
}
434+
435+
// The following builder methods implement what would be possible with
436+
// variadic generics (using imaginary syntax) as a single method:
437+
//
438+
// public static func buildEither<
439+
// Component, WholeMatch, Capture...
440+
// >(
441+
// second component: Component
442+
// ) -> Regex<(Substring, Capture...)>
443+
// where Component.Output == (WholeMatch, Capture...)
444+
445+
public static func buildEither<R: RegexComponent>(
446+
second component: Component<R>
447+
) -> Regex<Substring> {
448+
component
449+
}
450+
451+
public static func buildEither<W, C0, R: RegexComponent>(
452+
second component: Component<R>
453+
) -> Regex<(Substring, C0)> where R.Output == (W, C0) {
454+
component
455+
}
456+
457+
public static func buildEither<W, C0, C1, R: RegexComponent>(
458+
second component: Component<R>
459+
) -> Regex<(Substring, C0, C1)> where R.Output == (W, C0, C1) {
460+
component
461+
}
462+
463+
// ... `O(arity)` overloads of `buildEither(first:)` and `buildEither(second:)`
464+
407465
// The following builder methods implement what would be possible with
408466
// variadic generics (using imaginary syntax) as a single method:
409467
//
@@ -427,10 +485,6 @@ extension RegexComponentBuilder {
427485
) -> Regex<(Substring, C0?, C1?)>
428486

429487
// ... `O(arity)` overloads of `buildOptional(_:)`
430-
431-
public static func buildOptional<W, C0, C1, C2, C3, C4, C5, C6, C7, C8, C9, R: RegexComponent>(
432-
_ component: Component<R>?
433-
) -> Regex<(Substring, C0?, C1?, C2?, C3?, C4?, C5?, C6?, C7?, C8, C9?)> where R.Output == (W, C0, C1, C2, C3, C4, C5, C6, C7, C8, C9)
434488
}
435489
```
436490

@@ -461,10 +515,6 @@ extension RegexComponentBuilder {
461515
) -> Regex<(Substring, C0?, C1?)>
462516

463517
// ... `O(arity)` overloads of `buildLimitedAvailability(_:)`
464-
465-
public static func buildLimitedAvailability<W, C0, C1, C2, C3, C4, C5, C6, C7, C8, C9, R: RegexComponent>(
466-
_ component: Component<R>
467-
) -> Regex<(Substring, C0?, C1?, C2?, C3?, C4?, C5?, C6?, C7?, C8, C9?)> where R.Output == (W, C0, C1, C2, C3, C4, C5, C6, C7, C8, C9)
468518
}
469519
```
470520

@@ -505,15 +555,7 @@ public struct ChoiceOf<Output>: RegexComponent {
505555
```swift
506556
@resultBuilder
507557
public enum AlternationBuilder {
508-
/// A builder component that stores a regex component and its source location
509-
/// for debugging purposes.
510-
public struct Component<Value: RegexComponent> {
511-
public var value: Value
512-
public var file: String
513-
public var function: String
514-
public var line: Int
515-
public var column: Int
516-
}
558+
public typealias Component<Value> = RegexComponentBuilder.Component<Value>
517559

518560
/// Returns a component by wrapping the component regex in `Component` and
519561
/// recording its source location.
@@ -525,9 +567,28 @@ public enum AlternationBuilder {
525567
column: Int = #column
526568
) -> Component<R>
527569

570+
// The following builder methods implement what would be possible with
571+
// variadic generics (using imaginary syntax) as a single method:
572+
//
573+
// public static func buildPartialBlock<
574+
// R, WholeMatch, Capture...
575+
// >(
576+
// first component: Component<R>
577+
// ) -> Regex<(Substring, Capture?...)>
578+
// where R.Output == (WholeMatch, Capture...)
579+
580+
@_disfavoredOverload
528581
public static func buildPartialBlock<R: RegexComponent>(
529-
first: Component<R>
530-
) -> Regex<R.Output>
582+
first r: Component<R>
583+
) -> Regex<Substring>
584+
585+
public static func buildPartialBlock<W, C0, R: RegexComponent>(
586+
first r: Component<R>
587+
) -> Regex<(Substring, C0?)> where R.Output == (W, C0)
588+
589+
public static func buildPartialBlock<W, C0, C1, R: RegexComponent>(
590+
first r: Component<R>
591+
) -> Regex<(Substring, C0?, C1?)> where R.Output == (W, C0, C1)
531592

532593
// The following builder methods implement what would be possible with
533594
// variadic generics (using imaginary syntax) as a single method:
@@ -977,7 +1038,7 @@ extension Repeat {
9771038
_ behavior: QuantificationBehavior = .eagerly
9781039
)
9791040
where Output == (Substring, C0),
980-
Compoment.Output == (Substring, C0),
1041+
Component.Output == (Substring, C0),
9811042
R.Bound == Int
9821043

9831044
public init<W, C0, Component: RegexComponent>(
@@ -986,7 +1047,7 @@ extension Repeat {
9861047
@RegexComponentBuilder _ component: () -> Component
9871048
)
9881049
where Output == (Substring, C0),
989-
Compoment.Output == (Substring, C0),
1050+
Component.Output == (Substring, C0),
9901051
R.Bound == Int
9911052

9921053
public init<W, C0, Component: RegexComponent, RE: RangeExpression>(
@@ -1125,7 +1186,7 @@ let regex = Regex {
11251186
}
11261187
```
11271188

1128-
Variants of `capture` and `tryCapture` accept a `Reference` argument. References can be used to achieve named captures and named backreferences from textual regexes.
1189+
Variants of `Capture` and `TryCapture` accept a `Reference` argument. References can be used to achieve named captures and named backreferences from textual regexes.
11291190

11301191
```swift
11311192
public struct Reference<Capture>: RegexComponent {
@@ -1160,7 +1221,7 @@ A regex is considered invalid when it contains a use of reference without it eve
11601221
In textual regex, one can refer to a subpattern to avoid duplicating the subpattern, for example:
11611222

11621223
```
1163-
(you|I) say (goodbye|hello); (?0) say (?1)
1224+
(you|I) say (goodbye|hello); (?1) say (?2)
11641225
```
11651226

11661227
The above regex is equivalent to
@@ -1238,7 +1299,7 @@ The proposed feature relies heavily upon overloads of `buildBlock` and `buildPar
12381299

12391300
## Alternatives considered
12401301

1241-
### Operators for quantification and alterantion
1302+
### Operators for quantification and alternation
12421303

12431304
While `ChoiceOf` and quantifier functions provide a general way of creating alternations and quantifications, we recognize that some syntactic sugar can be useful for creating one-liners like in textual regexes, e.g. infix operator `|`, postfix operator `*`, etc.
12441305

@@ -1373,7 +1434,7 @@ However, given that one-or-more (`+`), zero-or-more (`*`) and optional (`?`) are
13731434

13741435
One could argue that a type such as `OneOrMore<Output>` could be defined as a top-level function that returns `Regex`. While it is entirely possible to do so, it would lose the name scoping benefits of a type and pollute the top-level namespace with `O(arity^2)` overloads of quantifiers, `capture`, `tryCapture`, etc. This could be detrimental to the usefulness of code completion.
13751436

1376-
Another reason to use types instead of free functions is consistency with existing result-buidler-based DSLs such as SwiftUI.
1437+
Another reason to use types instead of free functions is consistency with existing result-builder-based DSLs such as SwiftUI.
13771438

13781439
[Declarative String Processing]: https://github.com/apple/swift-experimental-string-processing/blob/main/Documentation/DeclarativeStringProcessing.md
13791440
[Strongly Typed Regex Captures]: https://github.com/apple/swift-experimental-string-processing/blob/main/Documentation/Evolution/StronglyTypedCaptures.md
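
The arity-overloading scheme that this diff describes for `buildPartialBlock(first:)` and `buildPartialBlock(accumulated:next:)` can be illustrated with a small, self-contained sketch. It is not the proposal's implementation: `Pattern`, `PatternBuilder` and `build` are hypothetical names, the `Component` wrapper and the `RegexComponent` constraints are left out, and only the overloads for zero or one capture on each side are written. It assumes Swift 5.7 or later, where result builders support `buildPartialBlock`.

```swift
// Toy stand-in for `Regex<Output>`; hypothetical, for illustration only.
struct Pattern<Output> {
    var source: String
}

@resultBuilder
enum PatternBuilder {
    // A single component starts the accumulation unchanged.
    static func buildPartialBlock<Output>(first: Pattern<Output>) -> Pattern<Output> {
        first
    }

    // 0 captures + 0 captures.
    static func buildPartialBlock(
        accumulated: Pattern<Substring>, next: Pattern<Substring>
    ) -> Pattern<Substring> {
        Pattern(source: accumulated.source + next.source)
    }

    // 0 captures + 1 capture.
    static func buildPartialBlock<C0>(
        accumulated: Pattern<Substring>, next: Pattern<(Substring, C0)>
    ) -> Pattern<(Substring, C0)> {
        Pattern(source: accumulated.source + next.source)
    }

    // 1 capture + 0 captures.
    static func buildPartialBlock<C0>(
        accumulated: Pattern<(Substring, C0)>, next: Pattern<Substring>
    ) -> Pattern<(Substring, C0)> {
        Pattern(source: accumulated.source + next.source)
    }

    // 1 capture + 1 capture: the capture types are concatenated into one tuple.
    static func buildPartialBlock<C0, C1>(
        accumulated: Pattern<(Substring, C0)>, next: Pattern<(Substring, C1)>
    ) -> Pattern<(Substring, C0, C1)> {
        Pattern(source: accumulated.source + next.source)
    }
}

// Hypothetical entry point that applies the builder to a closure.
func build<Output>(
    @PatternBuilder _ content: () -> Pattern<Output>
) -> Pattern<Output> {
    content()
}

let word = Pattern<Substring>(source: "\\w+")
let number = Pattern<(Substring, Int)>(source: "(\\d+)")
let flag = Pattern<(Substring, Bool)>(source: "(true|false)")

// The annotation compiles only because the overloads above flatten the
// captures of `number` and `flag` into a single output tuple.
let combined: Pattern<(Substring, Int, Bool)> = build {
    word
    number
    flag
}
print(combined.source)
```

Each additional supported capture count needs one more overload per pair of arities, which is why the overload count grows quadratically until a variadic-generic version can replace them.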
