Support for formatting a selection #708

DaveEwing · 2024-03-22T20:27:54Z

Work in progress at the moment. I have a handful of tweaks I want to make, but it appears to work pretty well overall.

allevato

This is looking really great! A few high-level comments to start that are mainly API/UX things.

Can you plumb through a test or two that shows the whitespace issues you're seeing? That would make it easier to help debug what's going on.

Sources/SwiftFormat/API/Selection.swift

allevato · 2024-03-26T19:37:46Z

Sources/swift-format/Subcommands/LintFormatOptions.swift

+      A JSON string containing an array of {"offset":<Int>, "length":<Int>} pairs specifying \
+      the source code ranges to format.
+      """)
+  var selection: String?


This might just be for testing, but I think we can improve the user interface a little bit here; how about something like --lines <start line>:<end line> for line-based ranges and --offsets <start offset>:<end offset> for offset-based ranges? (I don't think "length" is a particularly good way to express this on the command line.) Those flags are a bit closer to what clang-format provides for its command line, and we could accept multiple occurrences of it.

To deal with parsing the s:e format, you can use swift-argument-parser's ExpressibleByArgument protocol instead of JSON decoding. Maybe the option type is just a pair of ints that implements ExpressibleByArgument and then the validate method constructs the Selection from any --lines and --offsets that it's given.
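As a rough illustration of the "pair of ints" idea, here is a minimal sketch of parsing a `start:end` argument. `OffsetPair` is a hypothetical name, and the half-open interpretation of the pair is an assumption. The initializer has the same shape as the `init?(argument:)` requirement of `ExpressibleByArgument`, but the conformance itself is left out so the snippet compiles without swift-argument-parser.

```swift
/// A pair of integers parsed from a "start:end" command-line argument.
/// `OffsetPair` is a hypothetical name used only for this sketch.
struct OffsetPair: Equatable {
  let start: Int
  let end: Int

  /// Same shape as the `init?(argument:)` requirement of `ExpressibleByArgument`.
  init?(argument: String) {
    let parts = argument.split(separator: ":", omittingEmptySubsequences: false)
    guard parts.count == 2,
      let start = Int(parts[0]),
      let end = Int(parts[1]),
      start >= 0, start <= end
    else { return nil }
    self.start = start
    self.end = end
  }

  /// Assumption: the pair is treated as a half-open range.
  var range: Range<Int> { start..<end }
}

// Malformed or reversed pairs are rejected by returning nil.
let parsedPairs = ["10:42", "7", "9:3", "a:b"].map { OffsetPair(argument: $0)?.range }
print(parsedPairs)
```

With the real package, a `validate()` method could then collect every parsed pair into a single selection.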

DaveEwing · 2024-03-27

I like the idea of supporting --lines <start line>:<end line> as clang-format does. This seems much easier for specifying ranges at the command line. The one issue there is having to convert to AbsolutePositions, since that's what we have with a Syntax struct. But we could do that conversion at the start. (I'm already worried about performance of all the range comparisons that happen. We don't want to make that harder.) Anyhow, maybe this can be a second PR?

I was modeling the offset/length bit (kinda) off what clang-format does. It can take multiple --offset/--length pairs. (Would swift-argument-parser support that? I assumed not, but I admit that I never looked.) I ended up using JSON decoding because it was easy. 🤷‍♂️ clang-format doesn't support --offsets <start offset>:<end offset>, so there's no precedent there. My expectation is that a human wouldn't ever specify ranges with offsets (or offsets/lengths). It's just not convenient. But it is convenient for an editor shelling out to swift-format.

If I do manage to change Selection.Range to be a typealias to Range<AbsolutePosition>, then it might be a bit more convenient to pass two offsets, rather than offset/length. Do you have a strong opinion as to whether I should bother with this refactoring? I made a start down that road, and it was a kinda annoying set of changes. 🤣

allevato · 2024-03-28

My own personal opinion is that ranges are a bit more natural than an offset/length pair. I'm not sure what motivated clang-format's --offset and --length flags unless it was some very specific workflow that already had the information in that form, but I don't think ordered pairs of --offset X --length Y ... would work well in swift-argument-parser or for human users.

ahoppen · 2024-03-29

FWIW, I would also prefer --lines <start line>:<end line>. And if we do the conversion up front with a SourceLocationConverter, I don’t think it should be too expensive. SourceLocationConverter needs to build up a line table but I would be surprised if that’s the same order of magnitude as formatting (would be nice to confirm though).

allevato · 2024-03-29

The SourceLocationConverter is already being created in other code paths anyway, so using it to map the line numbers to offsets or vice versa shouldn't create additional overhead; we just need to make sure we're using the same one and not creating a separate one just for that purpose.
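To make the up-front conversion concrete, here is a minimal sketch that maps a line range to a UTF-8 offset range with a hand-built line table. It is a plain-Swift stand-in, not the `SourceLocationConverter` API; both function names are hypothetical, and only `\n` is treated as a line terminator.

```swift
/// Byte offsets at which each line starts. A hypothetical stand-in for the
/// line table that a source location converter would build.
func lineStartOffsets(of source: String) -> [Int] {
  var starts = [0]
  for (offset, byte) in source.utf8.enumerated() where byte == UInt8(ascii: "\n") {
    starts.append(offset + 1)
  }
  return starts
}

/// Converts a 1-based, inclusive line range into a half-open UTF-8 offset range.
/// Returns nil when the requested lines fall outside the source.
func utf8Range(forLines lines: ClosedRange<Int>, in source: String) -> Range<Int>? {
  let starts = lineStartOffsets(of: source)
  guard lines.lowerBound >= 1, lines.upperBound <= starts.count else { return nil }
  let lower = starts[lines.lowerBound - 1]
  // The range ends where the line after the last selected one begins.
  let upper = lines.upperBound < starts.count ? starts[lines.upperBound] : source.utf8.count
  return lower..<upper
}

let sampleSource = "let a = 1\nlet b = 2\nlet c = 3\n"
print(utf8Range(forLines: 2...3, in: sampleSource) as Any)
```

The table is built in one pass over the bytes, which is why the conversion is expected to be cheap compared with formatting.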

DaveEwing · 2024-04-01

Yeah, I like --lines <start line>:<end line> as well. For an actual user, this is way more useful than offsets. But offsets work well when integrating with other tools. (It lets you, for instance, reformat spacing within a single line without affecting anything else. That's something you might want to do from within an editor.) Chatting with Tony on Friday we also agreed that using --offsets <start offset>:<end offset> is cleaner. I'll make that change in this PR. Supporting --lines <start line>:<end line> can come in a followup PR.

DaveEwing · 2024-05-24

I may add the --lines <start line>:<end line> support in this PR... at least for the testing support. Right now I've disabled the idempotency testing for non-infinite selection tests. If we support specifying lines, then we can (usually) check for idempotency.

xedin · 2024-07-25

@DaveEwing Alex pointed me to this conversation because I'm trying to implement git swift-format. We do need --lines for that to work, because that's the only easily extractable information from git diff-index.

Sources/SwiftFormat/PrettyPrint/PrettyPrint.swift

DaveEwing · 2024-03-27T17:15:00Z

> This is looking really great! A few high-level comments to start that are mainly API/UX things.
>
> Can you plumb through a test or two that shows the whitespace issues you're seeing? That would make it easier to help debug what's going on.

I've got testing support plumbed through, but I fixed the whitespace issues I was seeing.

DaveEwing · 2024-03-27T21:11:48Z

A note on performance.

To be honest, I was rather worried about the performance of this approach. So, I did a couple of tests on a release build of swift-format using TokenStreamCreator.swift from the project (which is 4268 lines long). I used the --measure-instructions option to show how many instructions are executed when formatting the file. I compared this branch to main.

The tests are:

formatting the whole file (no selection)
formatting a selection of the whole file
formatting 200 lines of the file (starting at line 2000)

Here are the results:

| | No selection | Whole file selected | Selection lines 2000-2200 |
|---|---|---|---|
| Before | 4073650908 | | |
| After | 4166960035 | 4230133360 | 1552009297 |

In all cases, formatting finished in well under a second in real time. (I didn't precisely measure this.)

Formatting the whole file without a selection is about 2% slower. Formatting the whole file with a selection is less than 4% slower. Formatting a small range is significantly faster. That shouldn't really be a surprise, but it is nice to confirm!
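The percentages follow directly from the instruction counts reported above. A small sketch of the arithmetic, with every figure compared against the "Before" count (the variable and function names are only for illustration):

```swift
// Instruction counts reported by --measure-instructions.
let before = 4_073_650_908.0             // main, no selection
let afterNoSelection = 4_166_960_035.0   // this branch, no selection
let afterWholeFile = 4_230_133_360.0     // this branch, whole file selected
let afterLines2000To2200 = 1_552_009_297.0

/// Relative change of `value` against `baseline`, in percent.
func percentChange(_ value: Double, relativeTo baseline: Double) -> Double {
  (value - baseline) / baseline * 100
}

print(percentChange(afterNoSelection, relativeTo: before))      // roughly +2.3
print(percentChange(afterWholeFile, relativeTo: before))        // roughly +3.8
print(percentChange(afterLines2000To2200, relativeTo: before))  // roughly -61.9
```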

ahoppen · 2024-04-03T20:35:04Z

Sources/_SwiftFormatTestSupport/MarkedText.swift

+  /// If the marked text contains "➡️" and "⬅️", they're used to create a selection
+  public var selection: Selection?
+


Since all the markers are represented as Int that signify UTF-8 offsets, I think it would make sense to also store the ranges as Range<Int>.

DaveEwing · 2024-05-24

I'm now using Range<Int> as they're built up, but create a Selection in the initializer. I may want to revisit this (and not have a selection property) if we support line ranges.
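A minimal sketch of building up `Range<Int>` values while scanning marked text follows. `extractSelections` is a hypothetical helper, not the code in MarkedText.swift. The marker characters are parameters, so either pair of emoji mentioned in this thread can be passed in.

```swift
/// Strips selection markers from `markedText` and returns the cleaned text
/// together with the selected ranges as UTF-8 offsets into the cleaned text.
/// An end marker with no preceding start marker is kept as ordinary text.
func extractSelections(
  from markedText: String,
  start: Character = "⏩",
  end: Character = "⏪"
) -> (text: String, ranges: [Range<Int>]) {
  var text = ""
  var ranges: [Range<Int>] = []
  var openOffset: Int? = nil

  for character in markedText {
    if character == start {
      // Remember where the selection begins in the marker-free text.
      openOffset = text.utf8.count
    } else if character == end, let lower = openOffset {
      ranges.append(lower..<text.utf8.count)
      openOffset = nil
    } else {
      text.append(character)
    }
  }
  return (text, ranges)
}

let marked = extractSelections(from: "func foo() ⏩{}⏪")
print(marked.text, marked.ranges)
```

Keeping the ranges as plain integers until the end means a `Selection` only has to be constructed once, after all markers are known.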

DaveEwing · 2024-05-23T21:19:29Z

Pushed changes that respond to most of the comments from @allevato and @ahoppen. (I need to take another pass through the comments to see what I've missed!) I think the changes really are improvements, so thanks you two! I also need to do more testing locally... probably adding more tests - and getting the one disabled test working.

Sources/SwiftFormat/API/Selection.swift

The basic idea here is to insert `enableFormatting` and `disableFormatting` tokens into the print stream when we enter or leave the selection. When formatting is enabled, we print out the tokens as usual. When formatting is disabled, we turn off any output until the next `enableFormatting` token. When that token is hit, we write the original source text from the location of the last `disableFormatting` to the current location. Note that this means that all the APIs need the original source text to be passed in.

A `Selection` is represented as an enum with an `.infinite` case and a `.ranges` case, indicating either a selection of the entire file or an array of start/end UTF-8 offsets. The offset pairs are given with `Range<AbsolutePosition>`, matching the (now common) usage in swift-syntax.

For testing, allow marked text to use `⏩` and `⏪` to delineate the start/end of a range of a selection.

The command line now takes an `--offsets` option of comma-separated "start:end" pairs to set the selection for formatting.
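The enable/disable mechanism described above can be sketched in a few lines. This is a simplified model under stated assumptions, not the pretty printer's code: `Token` and `render` are hypothetical, offsets are plain UTF-8 integers, and copying the remainder of the file when formatting is still disabled at the end is an assumption.

```swift
/// Simplified stand-ins for the pretty printer's tokens (hypothetical shapes).
enum Token {
  case text(String)            // already-formatted output
  case disableFormatting(Int)  // UTF-8 offset where a selection ends
  case enableFormatting(Int)   // UTF-8 offset where a selection begins
}

func render(_ tokens: [Token], originalSource: String) -> String {
  let bytes = Array(originalSource.utf8)
  var output = ""
  var disabledSince: Int? = nil

  for token in tokens {
    switch token {
    case .text(let formatted):
      // Formatted output is suppressed while formatting is disabled.
      if disabledSince == nil { output += formatted }
    case .disableFormatting(let offset):
      disabledSince = offset
    case .enableFormatting(let offset):
      // Copy the untouched original text for the region that was skipped.
      if let start = disabledSince {
        output += String(decoding: bytes[start..<offset], as: UTF8.self)
        disabledSince = nil
      }
    }
  }
  // Assumption: if formatting is still disabled at the end, copy the rest verbatim.
  if let start = disabledSince {
    output += String(decoding: bytes[start...], as: UTF8.self)
  }
  return output
}

// Only the second line is selected, so the first line is copied as-is.
let original = "let  a=1\nlet  b=2\n"
let stream: [Token] = [
  .disableFormatting(0),
  .text("let a = 1\n"),
  .enableFormatting(9),
  .text("let b = 2\n"),
]
let rendered = render(stream, originalSource: original)
print(rendered)
```

The sketch also shows why the original source text has to be passed to the printer: the skipped regions are reproduced from it rather than from the token stream.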

Sources/swift-format/Subcommands/LintFormatOptions.swift

Sources/SwiftFormat/API/Selection.swift

Sources/SwiftFormat/API/SwiftFormatter.swift

ahoppen · 2024-05-29T18:13:36Z

Sources/SwiftFormat/PrettyPrint/PrettyPrint.swift

+        if let nonWhitespace = text.rangeOfCharacter(
+          from: CharacterSet.whitespaces.inverted, options: .backwards) {
+          text = String(text[..<nonWhitespace.upperBound])
+        }


I’m still missing how the whitespace characters that we’re shaving off with this are being printed.

If you have the following

func foo() ⏩{}⏪

Then I think this only adds func foo() to the output stream, right (no whitespace)? But the next token that we print is the {, which doesn’t have any leading trivia (spaces are trailing trivia to )).

If those examples work because of some trivia re-attribution rule, how about the following?

func foo() /**/ ⏩{}⏪

func foo() /**/ ⏩{}⏪

ahoppen · 2024-06-05T00:08:35Z

Looking through my comments again, I think there are 3 open comments (noting them here as much for my own sake as everybody else’s)

Start in disabled state and explicitly enable formatting in the full-document case: Support for formatting a selection #708 (comment)
Enable formatting for all trivia: Support for formatting a selection #708 (comment)
I still don’t understand how whitespace that is being stripped will be printed: Support for formatting a selection #708 (comment)

DaveEwing · 2024-06-05T18:15:55Z

GitHub won't let me comment on "I’m still missing how the whitespace characters that we’re shaving off with this are being printed", so I'll do it here. As I alluded to here, we strip off trailing whitespace when we copy the original code so that we can let the normal mechanisms do all the whitespace generation magic instead. This is actually pretty simple. It's a matter of setting just a couple of state variables. This is PrettyPrint.swift, lines 610-617.

DaveEwing · 2024-06-05T18:39:12Z

And one additional comment. Currently all the rules that do "non-whitespace" formatting are disabled when there's a selection. For instance, formatting won't remove semicolons when there's a selection. I think that's fine for a first pass, but I want to get that working too.

My intention, and what the code is currently trying to do but is failing, was to only enable processing of nodes that are fully contained inside individual ranges. (By "failing", I mean that it just ends up disabling all rules.) Instead, I'm now looking at enabling the rules all the time, and using the existing mechanism to turn printing on/off. That's not working yet because the source ranges of the rewritten nodes no longer match the original source. I've got some work to do to figure out how to get it right....
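The "fully contained inside individual ranges" check could look roughly like the sketch below. It models only the stated intention, not the PR's code: this `Selection` uses plain `Range<Int>` offsets instead of `Range<AbsolutePosition>`, and `fullyContains` is a hypothetical method name.

```swift
/// Hypothetical stand-in for the PR's `Selection`, using plain UTF-8 offsets.
enum Selection {
  case infinite
  case ranges([Range<Int>])

  /// True when `nodeRange` lies entirely within a single selected range.
  func fullyContains(_ nodeRange: Range<Int>) -> Bool {
    switch self {
    case .infinite:
      return true
    case .ranges(let ranges):
      return ranges.contains { range in
        range.lowerBound <= nodeRange.lowerBound && nodeRange.upperBound <= range.upperBound
      }
    }
  }
}

let selection = Selection.ranges([10..<20, 40..<50])
print(selection.fullyContains(12..<18))  // inside the first range
print(selection.fullyContains(18..<42))  // straddles two ranges
```

A check like this depends on node ranges still referring to the original source, which is the difficulty described above once rules have rewritten the tree.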

ahoppen · 2024-06-05T22:29:07Z

> GitHub won't let me comment on "I’m still missing how the whitespace characters that we’re shaving off with this are being printed", so I'll do it here. As I alluded to here, we strip off trailing whitespace when we copy the original code so that we can let the normal mechanisms do all the whitespace generation magic instead. This is actually pretty simple. It's a matter of setting just a couple of state variables. This is PrettyPrint.swift, lines 610-617.

OK, I think I understand what was throwing me off now

I thought that the space and tab trivia from the syntax tree were also added to the token stream but they are not.

Also, you assume that CharacterSet.whitespaces is equivalent to the .spaces, .tabs, .formfeeds and .verticalTabs cases in Trivia, right? But that’s not generally true. Two examples:

NO-BREAK SPACE is considered whitespace in CharacterSet.whitespaces but it’s parsed as unexpected text in swift-syntax (same for other characters from Unicode General Category Zs)
If you have a line comment with a trailing space, we consider that space as part of the comment and not as separate whitespace in the trivia.
Because of this, when the formatted selection ends after a line comment with a trailing whitespace, you would remove it while whole file formatting would not trim it, I think. Similarly, if the formatted selection ends after a non-breaking space, you would remove it while whole file formatting would not.

I think we have two solutions here:

1. Inspect the syntax tree to find characters that swift-syntax has classified as spaces or tabs
2. Accept that there is a discrepancy between whole-file formatting and range-based formatting here. If we do that, I think we should
   a. Add a comment explaining the trade-off being made
   b. Only trim space, tab, formfeed and vertical tab instead of all characters in CharacterSet.whitespaces to align this to swift-syntax as closely as possible.
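Option (b) could be sketched as follows. This is a minimal illustration, not the code in PrettyPrint.swift: `trimmableScalars` and `trimmingTrailingTriviaWhitespace` are hypothetical names, and the set contains only the four characters named above.

```swift
/// Scalars trimmed in this sketch: space, tab, vertical tab, form feed.
let trimmableScalars: Set<Unicode.Scalar> = [" ", "\t", "\u{000B}", "\u{000C}"]

/// Removes trailing space, tab, vertical tab, and form feed characters,
/// leaving every other character (including NO-BREAK SPACE) in place.
func trimmingTrailingTriviaWhitespace(_ text: String) -> String {
  var scalars = text.unicodeScalars
  while let last = scalars.last, trimmableScalars.contains(last) {
    scalars.removeLast()
  }
  return String(scalars)
}

print(trimmingTrailingTriviaWhitespace("func foo() \t"))
```

Unlike trimming with `CharacterSet.whitespaces`, this leaves a trailing NO-BREAK SPACE untouched. A trailing space inside a line comment would still be removed, which is the remaining discrepancy that option (a) would document.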

DaveEwing · 2024-06-11T17:09:44Z

> Also, you assume that CharacterSet.whitespaces is equivalent to the .spaces, .tabs, .formfeeds and .verticalTabs cases in Trivia, right? But that’s not generally true. Two examples:
>
> NO-BREAK SPACE is considered whitespace in CharacterSet.whitespaces but it’s parsed as unexpected text in swift-syntax (same for other characters from Unicode General Category Zs)
>
> If you have a line comment with a trailing space, we consider that space as part of the comment and not as separate whitespace in the trivia.
>
> Because of this, when the formatted selection ends after a line comment with a trailing whitespace, you would remove it while whole file formatting would not trim it, I think. Similarly, if the formatted selection ends after a non-breaking space, you would remove it while whole file formatting would not.
>
> I think we have two solutions here:
>
> 1. Inspect the syntax tree to find characters that swift-syntax has classified as spaces or tabs
> 2. Accept that there is a discrepancy between whole-file formatting and range-based formatting here. If we do that, I think we should
>    a. Add a comment explaining the trade-off being made
>    b. Only trim space, tab, formfeed and vertical tab instead of all characters in CharacterSet.whitespaces to align this to swift-syntax as closely as possible.

Once again, GitHub doesn't let me comment directly. Weird.

In any case, adding a non-breaking space to your source causes swift-format to fail completely. It just outputs an empty file. This is true even before the changes in this PR. I haven't bothered looking into what it's doing.

I've tried cases where a comment ends with a space, or when block comments have a trailing space. I haven't found any differences between the behavior with or without a selection. The spaces are always removed. (And the behavior matches the behavior before the changes in this PR.)

While there might be some edge cases here, they don't seem important to me.

ahoppen

I agree I am very pedantic here. A lot of it was me trying to understand how things fit together to see if there are any more significant issues lurking somewhere. But it seems like there aren’t. Let’s get this merged and we can always fix it later if it’s causing any issues in practice (which I agree would probably be very niche anyway).

allevato · 2024-06-12T11:36:29Z

> Currently all the rules that do "non-whitespace" formatting are disabled when there's a selection. For instance, formatting won't remove semicolons when there's a selection. I think that's fine for a first pass, but I want to get that working too.

@DaveEwing Were you still working on this, or did you want to land this change and then come back to it? I'm fine either way (since it's not immediately obvious how deep the change will be to make that work).

I'll go ahead and kick off CI again, and if it comes back clean and you're happy with the current state of the PR, we can get it merged.

allevato

CI is green, so let me know if you'd like this merged now!

DaveEwing · 2024-06-14T02:09:08Z

> Currently all the rules that do "non-whitespace" formatting are disabled when there's a selection. For instance, formatting won't remove semicolons when there's a selection. I think that's fine for a first pass, but I want to get that working too.
>
> @DaveEwing Were you still working on this, or did you want to land this change and then come back to it? I'm fine either way (since it's not immediately obvious how deep the change will be to make that work).

Part of me would like to clean this up first (maybe not fix it, but just make it more obvious in the code, and to users, that the rules are disabled). But I'd gladly do that in a second PR.

If you're up for it @allevato, let's merge, and I'll continue with polish over the next few weeks/months.

allevato · 2024-06-14T11:50:24Z

I'll go ahead and merge then. Since this feature is only on main, we have time to iterate on it. Thanks for working on this—it's something I've wanted for a long time!

Support for formatting a selection

DaveEwing added New Feature swift-format labels Mar 22, 2024

DaveEwing requested review from allevato, bnbarham and ahoppen March 22, 2024 20:27

DaveEwing linked an issue Mar 22, 2024 that may be closed by this pull request

[SR-15880] Add ability to swift-format to only re-format certain ranges of a file #297

Closed

allevato reviewed Mar 26, 2024

allevato reviewed Mar 28, 2024

ahoppen reviewed Apr 3, 2024

DaveEwing commented May 24, 2024

DaveEwing force-pushed the dewing/FormatRanges branch 2 times, most recently from 4558882 to a1b10de Compare May 28, 2024 20:07

DaveEwing marked this pull request as ready for review May 28, 2024 20:13

allevato mentioned this pull request May 28, 2024

[DO NOT MERGE] allevato's dummy PR for swift-format CI swiftlang/swift-syntax#1086

Open

ahoppen force-pushed the dewing/FormatRanges branch from a1b10de to 6640067 Compare May 29, 2024 16:27

ahoppen reviewed May 29, 2024

DaveEwing added 2 commits June 3, 2024 16:08

Some small refactorings and updates from review feedback. Add a few m…

88beb85

…ore test cases. For formatting a selection (<#297>).

Change the --offsets argument to take a single pair of offsets, and…

0c0977d

… support passing multiple of them.

DaveEwing force-pushed the dewing/FormatRanges branch from 7e7d17c to 0c0977d Compare June 4, 2024 15:11

MahdiBM mentioned this pull request Jun 8, 2024

[Getting started with vscode swift] Formatting Section swiftlang/swift-org-website#705

Open

ahoppen approved these changes Jun 12, 2024

allevato approved these changes Jun 12, 2024

allevato merged commit 53279ee into main Jun 14, 2024

allevato deleted the dewing/FormatRanges branch June 14, 2024 11:50

ahoppen mentioned this pull request Jun 17, 2024

Cherry-pick #708 and #750 to release/6.0 #751

Closed

ahoppen pushed a commit to ahoppen/swift-format that referenced this pull request Jun 17, 2024

Merge pull request swiftlang#708 from apple/dewing/FormatRanges

e5360f4

Support for formatting a selection

plemarquand mentioned this pull request Jun 21, 2024

[SR-16106] Once SourceKit-LSP supports formatting, run formatter on text inserted by CodeActions swiftlang/sourcekit-lsp#496

Open

ahoppen mentioned this pull request Jun 27, 2024

Merge main into release/6.0 #760

Merged
