SSR: Support statement matching and replacing #6587

MarijnS95 · 2020-11-17T21:44:52Z

Hi!

This is a smaller initial patchset that came up while working on support for statement lists (and my first time working on RA 😁). It has me stuck on trailing semicolons for which I hope to receive some feedback. Matching (and replacing) let bindings with a trailing semicolon works fine, but trying to omit these (to make patterns more ergonomic) turns out more complex than expected.

The "optional trailing semicolon solution" implemented in this PR is ugly because Matcher::attempt_match_token should only consume a trailing ; when parsing let bindings to prevent other code from breaking. That at the same time has a nasty side-effect of ; ending up in the matched code: any replacements on that should include the trailing semicolon as well even if it was not in the pattern. A better example is in the tests:

https://github.com/rust-analyzer/rust-analyzer/blob/3ae1649c24a689473b874c331f5f176e5839978e/crates/ssr/src/tests.rs#L178-L184

The end result to achieve is (I guess) allowing replacement of let bindings without trailing semicolon like let x = $a ==>> let x = 1 (but including them on both sides is still fine), and should make replacement in a macro call (where foo!(let a = 2;) for a $x:stmt is invalid syntax) possible as well. That should allow to enable/fix these tests:

https://github.com/rust-analyzer/rust-analyzer/blob/3ae1649c24a689473b874c331f5f176e5839978e/crates/ssr/src/tests.rs#L201-L214

A possible MVP of this PR might be to drop this optional `;' handling entirely and only allow an SSR pattern/template with semicolons on either side.

MarijnS95 · 2020-12-29T22:38:15Z

Marking this PR ready for review to capture suggestions and a general (n)ack on the way this is implemented. Note that the code and commit history are still messy, that will be cleaned up in due time when the path forward is cleared up 🙂

matklad · 2020-12-30T10:14:49Z

cc @davidlattimore

crates/ssr/src/lib.rs

davidlattimore · 2020-12-31T01:20:51Z

Nice to see this being worked on! You're right that the optional semicolon handling is adding quite a bit of complexity. My personal feeling would be that making semicolons non-optional would be reasonable for a first cut of the feature.

If you do want to make semicolons optional, I think it could be done more simply. I'd be inclined to allow let $a = $b to match let x = 10; (including the semicolon). The semicolon is part of the let statement and trying to match only part of a syntax node is I think what's adding lots of the complexity. Allowing this would only require a couple of extra lines of code in attempt_match_token - if the pattern is at the end, the current code token is a semicolon and the parent node is a let-statement then accept the match.

You'd then need a little extra code in replacing.rs to add an extra semicolon if the replacement node is a statement, but doesn't end in a semicolon.

MarijnS95 · 2020-12-31T16:29:35Z

@davidlattimore That suggestion is implemented in the first 5 commits, up to and including be6b4c3. That last commit makes sure optional trailing semicolons are only allowed for let patterns, otherwise tons of tests break.

With that, as you say the replacement code needs an update to insert missing trailing semicolons where necessary.

I'm hesitant to implement it that way though because we're going to need partial match ranges anyway for statements lists - where one usually only wants to match a few consecutive statements in a block. At that point this arguably hacky approach can be stripped out again, and have a consistent user experience as well (ie. replacing let x=y; ==>> let x=y should make the semicolon disappear, replacing let x=y ==>> let x=y; should make a second one appear). Also, what would - and should - happen when replacing a let binding with something different entirely? If we insert a trailing semicolon for every statement that lacks one it might not be what the user intended. It can possibly be limited to instances where the matched block was a let binding with semicolon.

In the end we should start with some test cases covering these, especially where the user intends to change trailing colon rules - and compare the simplified approach with match ranges.

crates/ssr/src/matching.rs

davidlattimore · 2020-12-31T22:56:15Z

Do you think it might make sense then to leave optional semicolons until after multi-statement support is implemented?

MarijnS95 · 2021-01-01T13:43:27Z

@davidlattimore I am not sure. Implementing it the way you suggested turns out fairly trivial, but the aforementioned hypothetical test/usecases fail as expected:

https://github.com/rust-analyzer/rust-analyzer/blob/a87293b45f1679d42a9f1568ef65ff76d98c61c6/crates/ssr/src/tests.rs#L222-L227

https://github.com/rust-analyzer/rust-analyzer/blob/a87293b45f1679d42a9f1568ef65ff76d98c61c6/crates/ssr/src/tests.rs#L239-L244

Afaik this should be fixable earlier in the pipeline, when having the search pattern and template side-by-side: if the search pattern includes a trailing semicolon, the replacement statement should receive one as well.

Unsurprisingly the partial match implementation implicitly succeeds all these tests, see the cleaned up version here: https://github.com/MarijnS95/rust-analyzer/compare/stmt..let-stmt-partial

Do you think these theoretical usecases are worth fixing? Otherwise, we might do this in 3 stages:

Disable these test cases and take this PR as-is with insertion code in replacing.rs;
Re-enable the testcases with the partial match range implementation, reviewed in a separate PR;
Implement and review partial statement-list matching in parallel.

davidlattimore · 2021-01-01T22:11:51Z

I probably wouldn't worry about the hypothetical use cases. SSR is really intended for transforming syntactically valid code into different syntactically valid code. I struggle to think of a case where removing semicolons from valid code would leave the code valid. Even if there is a case, it's probably pretty obscure, so probably isn't a high priority to support.

I assume the motivation for wanting to support optional trailing semicolons is to make writing the rule easier, since you don't need to remember to add the semicolons? Assuming that's the case, you could make it an error to supply a rule that has a trailing semicolon on one side, but not the other. Not sure if it's worth enforcing such a rule though. I'd probably just treat rules with / without the trailing semicolon as semantically equivalent.

MarijnS95 · 2021-01-02T00:00:18Z

@davidlattimore I mostly care about consistency to the user. If they specify a semicolon on one side but not the other there must be a meaning in that. I won't be surprised if someone ends up in a situation with a single let statement in a block that they wish to replace with some expression without trailing semicolon to use as the final expression in a block.
(Next to that I care about code cleanliness here in the codebase. Not a fan of littering some if LET_STMT edgecases around if it can be solved more or less "generic" with a concept needed later).

As for making it optional, yes, just convenience. Just as it might be convenient to match just the let foo part for renames, even though that can't be substituted for anything other than a partial let <name> statement or the output becomes invalid (well, assigning the RHS to something that's not a let binding is valid. That's possible now with a pattern like: let foo = $b ==>> something_mut = $b).

In the end, unlike expressions a let binding without trailing semicolon is invalid so we might as well treat it as such: simply disallow it from being specified (parsed, really) in the search pattern nor replacement template.

davidlattimore · 2021-01-02T04:08:08Z

Disallowing let expressions without a trailing semicolon sounds good to me :)

MarijnS95 · 2021-01-02T15:09:17Z

@davidlattimore This should have been easy to solve by converting the statement parser to expressions::StmtWithSemi::Yes but breaks parsing the RHS in one of the simple test cases: let $a = $b; ==>> $b. I'm leaving it at Optional and have instead removed StmtWithSemi handling from the let_stmt parser. If I understand our conversations correctly thus far it should be impossible/invalid for a let statement to contain a trailing semicolon, ever.

MarijnS95 · 2021-01-02T15:39:33Z

Unsurprisingly this breaks macro calls, where let statements without trailing semicolon are allowed. We can use Optional after all, and I've removed the (now failing) test case asserting let $a = $b to be invalid again.

davidlattimore · 2021-01-02T21:58:49Z

crates/syntax/src/lib.rs

@@ -212,6 +212,13 @@ impl ast::Attr {
    }
 }

+impl ast::Stmt {
+    /// Returns `text`, parsed as statement, but only if it has no errors.
+    pub fn parse(text: &str) -> Result<Self, ()> {


It'd be nice to add a test for this function. The other parse functions in this file have tests - see test.rs in this directory

Fair point. Doesn't seem like every type (ast::Attr right above) has an explicit test though.

Not sure what to add for the err case. Could add some garbage, but I guess we rather want bits of code that should explicitly not be parseable as Stmt. Suggestions?

Ah, I hadn't seen the parser for ast::Attr, that must have been added since I implemented the others. I was thinking of the ast::Item::parse etc, which have tests - e.g. item_parser_tests.

For error cases, I guess:

Something that isn't a statement. e.g. and attribute #[foo]

Multiple statements: a(); b();

Something unparsable: )

Just letting you know in case you feel like adding more tests to the other parsers.

Apart your suggestions I've added a bunch of items, expressions and blocks to the mix that can be parsed as Stmt.

crates/ssr/src/tests.rs

davidlattimore

How fast are all those parser tests? Given that the parser is already pretty well tested elsewhere and the new API is just adding a thin layer of extra code, I'm wondering if that many tests might be too much. But if they run quickly, perhaps it doesn't matter.

Otherwise, this looks good to me.

crates/syntax/src/tests.rs

Adjusting `grammar::fragments::stmt` to Optional or Yes will break original functionality and tests.

MarijnS95 · 2021-01-03T11:10:54Z

@davidlattimore The timing difference is pretty much negligible and within margin of error:

test result: ok. 47 passed; 0 failed; 0 ignored; 0 measured; 0 filtered out

cargo t -p syntax  21.40s user 0.09s system 2297% cpu 0.935 total

test result: ok. 48 passed; 0 failed; 0 ignored; 0 measured; 0 filtered out

cargo t -p syntax  20.98s user 0.20s system 2211% cpu 0.957 total

(Measurements taken after building the testing binary once, of course)

EDIT: However, fair to say littering tests around when they have little significance only confuses the reader above all. In the end I guess these serve to test the parser "setup" like whether trailing semicolons are allowed (which doesn't make sense on Items, those should be allowed regardless. But then those tests exist elsewhere as well).

Now that statements can be matched and replaced (rust-lang#6587) some usecases require expressions to be replaced with statements as well. This happens when something that can ambiguously be an expression or statement like `if` blocks and loops appear in the last position of a block: in this case a replacement pattern of the form `if foo(){$a();}==>>$a();` will only substitute `if` blocks in the list of statements but not if they (implicitly) end up in the trailing expression, where they are not wrapped by an EXPR_STMT (but the pattern and template are as only succeeds for the `stmt ==>> stmt` case). Instead of adding two rules that match an expression - and emit duplicate matching errors - allow the template for expressions to be a statement if it fails to parse as an expression.

Now that statements can be matched and replaced (rust-lang#6587) some usecases require expressions to be replaced with statements as well. This happens when something that can ambiguously be an expression or statement like `if` and loop blocks appear in the last position of a block: in this case a replacement pattern of the form `if foo(){$a();}==>>$a();` will only substitute `if` blocks in the list of statements but not if they (implicitly) end up in the trailing expression, where they are not wrapped by an EXPR_STMT (but the pattern and template are as only succeeds for the `stmt ==>> stmt` case). Instead of adding two rules that match an expression - and emit duplicate matching errors - allow the template for expressions to be a statement if it fails to parse as an expression.

Now that statements can be matched and replaced (rust-lang#6587) some usecases require expressions to be replaced with statements as well. This happens when something that can ambiguously be an expression or statement like `if` and loop blocks appear in the last position of a block, as trailing expression. In this case a replacement pattern of the form `if foo(){$a();}==>>$a();` will only substitute `if` blocks in the list of statements but not if they (implicitly) end up in the trailing expression, where they are not wrapped by an EXPR_STMT (but the pattern and template are, as parsing only succeeds for the `stmt ==>> stmt` case). Instead of adding two rules that match an expression - and emit duplicate matching errors - allow the template for expressions to be a statement if it fails to parse as an expression.

matklad · 2021-01-04T11:01:05Z

let me

bors d=davidlattimore

bors · 2021-01-04T11:01:06Z

✌️ davidlattimore can now approve this pull request. To approve and merge a pull request, simply reply with bors r+. More detailed instructions are available here.

davidlattimore · 2021-01-04T11:14:25Z

bors r+

bors · 2021-01-04T11:23:23Z

Build succeeded:

@davidlattimore

7147: ssr: Allow replacing expressions with statements r=davidlattimore a=MarijnS95 Depends on #6587 Until that is merged, the diff is https://github.com/MarijnS95/rust-analyzer/compare/stmt..replace-expr-with-stmt --- Now that statements can be matched and replaced (#6587) some usecases require expressions to be replaced with statements as well. This happens when something that can ambiguously be an expression or statement like `if` and loop blocks appear in the last position of a block, as trailing expression. In this case a replacement pattern of the form `if foo(){$a();}==>>$a();` will only substitute `if` blocks in the list of statements but not if they (implicitly) end up in the trailing expression, where they are not wrapped by an EXPR_STMT (but the pattern and template are, as parsing only succeeds for the `stmt ==>> stmt` case). Instead of adding two rules that match an expression - and emit duplicate matching errors - allow the template for expressions to be a statement if it fails to parse as an expression. --- Another gross change that does not seem to break any tests currently, but perhaps a safeguard should be added to only allow this kind of replacement in blocks by "pushing" the replacement template to the statement list and clearing the trailing expression? CC @davidlattimore Co-authored-by: Marijn Suijten <[email protected]>

Now that statements can be matched and replaced (rust-lang#6587) some usecases require expressions to be replaced with statements as well. This happens when something that can ambiguously be an expression or statement like `if` and loop blocks appear in the last position of a block, as trailing expression. In this case a replacement pattern of the form `if foo(){$a();}==>>$a();` will only substitute `if` blocks in the list of statements but not if they (implicitly) end up in the trailing expression, where they are not wrapped by an EXPR_STMT (but the pattern and template are, as parsing only succeeds for the `stmt ==>> stmt` case). Instead of adding two rules that match an expression - and emit duplicate matching errors - allow the template for expressions to be a statement if it fails to parse as an expression.

MarijnS95 force-pushed the stmt branch from b38fb1a to 21bd02b Compare November 29, 2020 23:33

MarijnS95 force-pushed the stmt branch from d5b0995 to de8f6d8 Compare December 29, 2020 22:04

MarijnS95 marked this pull request as ready for review December 29, 2020 22:38

lnicola reviewed Dec 30, 2020

View reviewed changes

crates/ssr/src/lib.rs Outdated Show resolved Hide resolved

MarijnS95 force-pushed the stmt branch from de8f6d8 to c495a05 Compare December 30, 2020 10:28

MarijnS95 force-pushed the stmt branch from 3168adf to 6ca8538 Compare December 31, 2020 16:30

davidlattimore reviewed Dec 31, 2020

View reviewed changes

crates/ssr/src/matching.rs Outdated Show resolved Hide resolved

MarijnS95 force-pushed the stmt branch 3 times, most recently from 574db30 to f8bc518 Compare January 1, 2021 13:38

MarijnS95 force-pushed the stmt branch from f8bc518 to 7d1b725 Compare January 2, 2021 15:09

MarijnS95 force-pushed the stmt branch from 7d1b725 to f38dbce Compare January 2, 2021 15:37

davidlattimore reviewed Jan 2, 2021

View reviewed changes

MarijnS95 force-pushed the stmt branch 2 times, most recently from 380b872 to 253d651 Compare January 3, 2021 10:25

syntax,ssr: Implement statement parsing

cc081b7

MarijnS95 force-pushed the stmt branch from 253d651 to 86c8dea Compare January 3, 2021 10:29

davidlattimore reviewed Jan 3, 2021

View reviewed changes

crates/syntax/src/tests.rs Outdated Show resolved Hide resolved

parser,syntax: Add separate parser for stmt with optional semicolon

42da26e

Adjusting `grammar::fragments::stmt` to Optional or Yes will break original functionality and tests.

ssr: Add tests for raw LetStmt matching

b87699d

MarijnS95 force-pushed the stmt branch from 86c8dea to b87699d Compare January 3, 2021 11:07

davidlattimore approved these changes Jan 3, 2021

View reviewed changes

MarijnS95 changed the title ~~SSR: Preliminary support for let statement matching~~ SSR: Support statement matching and replacing Jan 4, 2021

MarijnS95 mentioned this pull request Jan 4, 2021

ssr: Allow replacing expressions with statements #7147

Merged

bors bot merged commit ac123ac into rust-lang:master Jan 4, 2021

MarijnS95 deleted the stmt branch January 4, 2021 11:28

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

SSR: Support statement matching and replacing #6587

SSR: Support statement matching and replacing #6587

MarijnS95 commented Nov 17, 2020

MarijnS95 commented Dec 29, 2020

matklad commented Dec 30, 2020

davidlattimore commented Dec 31, 2020

MarijnS95 commented Dec 31, 2020

davidlattimore commented Dec 31, 2020

MarijnS95 commented Jan 1, 2021

davidlattimore commented Jan 1, 2021

MarijnS95 commented Jan 2, 2021

davidlattimore commented Jan 2, 2021

MarijnS95 commented Jan 2, 2021

MarijnS95 commented Jan 2, 2021

davidlattimore Jan 2, 2021

MarijnS95 Jan 3, 2021

davidlattimore Jan 3, 2021

MarijnS95 Jan 3, 2021

davidlattimore left a comment

MarijnS95 commented Jan 3, 2021 •

edited

Loading

matklad commented Jan 4, 2021

bors bot commented Jan 4, 2021

davidlattimore commented Jan 4, 2021

bors bot commented Jan 4, 2021

SSR: Support statement matching and replacing #6587

SSR: Support statement matching and replacing #6587

Conversation

MarijnS95 commented Nov 17, 2020

MarijnS95 commented Dec 29, 2020

matklad commented Dec 30, 2020

davidlattimore commented Dec 31, 2020

MarijnS95 commented Dec 31, 2020

davidlattimore commented Dec 31, 2020

MarijnS95 commented Jan 1, 2021

davidlattimore commented Jan 1, 2021

MarijnS95 commented Jan 2, 2021

davidlattimore commented Jan 2, 2021

MarijnS95 commented Jan 2, 2021

MarijnS95 commented Jan 2, 2021

davidlattimore Jan 2, 2021

Choose a reason for hiding this comment

MarijnS95 Jan 3, 2021

Choose a reason for hiding this comment

davidlattimore Jan 3, 2021

Choose a reason for hiding this comment

MarijnS95 Jan 3, 2021

Choose a reason for hiding this comment

davidlattimore left a comment

Choose a reason for hiding this comment

MarijnS95 commented Jan 3, 2021 • edited Loading

matklad commented Jan 4, 2021

bors bot commented Jan 4, 2021

davidlattimore commented Jan 4, 2021

bors bot commented Jan 4, 2021

MarijnS95 commented Jan 3, 2021 •

edited

Loading