Skip to content

SSR: Support statement matching and replacing #6587

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged
merged 3 commits into from
Jan 4, 2021
Merged

Conversation

MarijnS95
Copy link
Contributor

For #3186

Hi!

This is a smaller initial patchset that came up while working on support for statement lists (and my first time working on RA 😁). It has me stuck on trailing semicolons for which I hope to receive some feedback. Matching (and replacing) let bindings with a trailing semicolon works fine, but trying to omit these (to make patterns more ergonomic) turns out more complex than expected.

The "optional trailing semicolon solution" implemented in this PR is ugly because Matcher::attempt_match_token should only consume a trailing ; when parsing let bindings to prevent other code from breaking. That at the same time has a nasty side-effect of ; ending up in the matched code: any replacements on that should include the trailing semicolon as well even if it was not in the pattern. A better example is in the tests:

https://github.com/rust-analyzer/rust-analyzer/blob/3ae1649c24a689473b874c331f5f176e5839978e/crates/ssr/src/tests.rs#L178-L184

The end result to achieve is (I guess) allowing replacement of let bindings without trailing semicolon like let x = $a ==>> let x = 1 (but including them on both sides is still fine), and should make replacement in a macro call (where foo!(let a = 2;) for a $x:stmt is invalid syntax) possible as well. That should allow to enable/fix these tests:

https://github.com/rust-analyzer/rust-analyzer/blob/3ae1649c24a689473b874c331f5f176e5839978e/crates/ssr/src/tests.rs#L201-L214

A possible MVP of this PR might be to drop this optional `;' handling entirely and only allow an SSR pattern/template with semicolons on either side.

@MarijnS95
Copy link
Contributor Author

Marking this PR ready for review to capture suggestions and a general (n)ack on the way this is implemented. Note that the code and commit history are still messy, that will be cleaned up in due time when the path forward is cleared up 🙂

@MarijnS95 MarijnS95 marked this pull request as ready for review December 29, 2020 22:38
@matklad
Copy link
Member

matklad commented Dec 30, 2020

cc @davidlattimore

@davidlattimore
Copy link
Contributor

Nice to see this being worked on! You're right that the optional semicolon handling is adding quite a bit of complexity. My personal feeling would be that making semicolons non-optional would be reasonable for a first cut of the feature.

If you do want to make semicolons optional, I think it could be done more simply. I'd be inclined to allow let $a = $b to match let x = 10; (including the semicolon). The semicolon is part of the let statement and trying to match only part of a syntax node is I think what's adding lots of the complexity. Allowing this would only require a couple of extra lines of code in attempt_match_token - if the pattern is at the end, the current code token is a semicolon and the parent node is a let-statement then accept the match.

You'd then need a little extra code in replacing.rs to add an extra semicolon if the replacement node is a statement, but doesn't end in a semicolon.

@MarijnS95
Copy link
Contributor Author

@davidlattimore That suggestion is implemented in the first 5 commits, up to and including be6b4c3. That last commit makes sure optional trailing semicolons are only allowed for let patterns, otherwise tons of tests break.

With that, as you say the replacement code needs an update to insert missing trailing semicolons where necessary.

I'm hesitant to implement it that way though because we're going to need partial match ranges anyway for statements lists - where one usually only wants to match a few consecutive statements in a block. At that point this arguably hacky approach can be stripped out again, and have a consistent user experience as well (ie. replacing let x=y; ==>> let x=y should make the semicolon disappear, replacing let x=y ==>> let x=y; should make a second one appear). Also, what would - and should - happen when replacing a let binding with something different entirely? If we insert a trailing semicolon for every statement that lacks one it might not be what the user intended. It can possibly be limited to instances where the matched block was a let binding with semicolon.

In the end we should start with some test cases covering these, especially where the user intends to change trailing colon rules - and compare the simplified approach with match ranges.

@davidlattimore
Copy link
Contributor

Do you think it might make sense then to leave optional semicolons until after multi-statement support is implemented?

@MarijnS95 MarijnS95 force-pushed the stmt branch 3 times, most recently from 574db30 to f8bc518 Compare January 1, 2021 13:38
@MarijnS95
Copy link
Contributor Author

@davidlattimore I am not sure. Implementing it the way you suggested turns out fairly trivial, but the aforementioned hypothetical test/usecases fail as expected:

https://github.com/rust-analyzer/rust-analyzer/blob/a87293b45f1679d42a9f1568ef65ff76d98c61c6/crates/ssr/src/tests.rs#L222-L227

https://github.com/rust-analyzer/rust-analyzer/blob/a87293b45f1679d42a9f1568ef65ff76d98c61c6/crates/ssr/src/tests.rs#L239-L244

Afaik this should be fixable earlier in the pipeline, when having the search pattern and template side-by-side: if the search pattern includes a trailing semicolon, the replacement statement should receive one as well.

Unsurprisingly the partial match implementation implicitly succeeds all these tests, see the cleaned up version here: https://github.com/MarijnS95/rust-analyzer/compare/stmt..let-stmt-partial

Do you think these theoretical usecases are worth fixing? Otherwise, we might do this in 3 stages:

  1. Disable these test cases and take this PR as-is with insertion code in replacing.rs;
  2. Re-enable the testcases with the partial match range implementation, reviewed in a separate PR;
  3. Implement and review partial statement-list matching in parallel.

@davidlattimore
Copy link
Contributor

I probably wouldn't worry about the hypothetical use cases. SSR is really intended for transforming syntactically valid code into different syntactically valid code. I struggle to think of a case where removing semicolons from valid code would leave the code valid. Even if there is a case, it's probably pretty obscure, so probably isn't a high priority to support.

I assume the motivation for wanting to support optional trailing semicolons is to make writing the rule easier, since you don't need to remember to add the semicolons? Assuming that's the case, you could make it an error to supply a rule that has a trailing semicolon on one side, but not the other. Not sure if it's worth enforcing such a rule though. I'd probably just treat rules with / without the trailing semicolon as semantically equivalent.

@MarijnS95
Copy link
Contributor Author

@davidlattimore I mostly care about consistency to the user. If they specify a semicolon on one side but not the other there must be a meaning in that. I won't be surprised if someone ends up in a situation with a single let statement in a block that they wish to replace with some expression without trailing semicolon to use as the final expression in a block.
(Next to that I care about code cleanliness here in the codebase. Not a fan of littering some if LET_STMT edgecases around if it can be solved more or less "generic" with a concept needed later).

As for making it optional, yes, just convenience. Just as it might be convenient to match just the let foo part for renames, even though that can't be substituted for anything other than a partial let <name> statement or the output becomes invalid (well, assigning the RHS to something that's not a let binding is valid. That's possible now with a pattern like: let foo = $b ==>> something_mut = $b).

In the end, unlike expressions a let binding without trailing semicolon is invalid so we might as well treat it as such: simply disallow it from being specified (parsed, really) in the search pattern nor replacement template.

@davidlattimore
Copy link
Contributor

Disallowing let expressions without a trailing semicolon sounds good to me :)

@MarijnS95
Copy link
Contributor Author

@davidlattimore This should have been easy to solve by converting the statement parser to expressions::StmtWithSemi::Yes but breaks parsing the RHS in one of the simple test cases: let $a = $b; ==>> $b. I'm leaving it at Optional and have instead removed StmtWithSemi handling from the let_stmt parser. If I understand our conversations correctly thus far it should be impossible/invalid for a let statement to contain a trailing semicolon, ever.

@MarijnS95
Copy link
Contributor Author

Unsurprisingly this breaks macro calls, where let statements without trailing semicolon are allowed. We can use Optional after all, and I've removed the (now failing) test case asserting let $a = $b to be invalid again.

@@ -212,6 +212,13 @@ impl ast::Attr {
}
}

impl ast::Stmt {
/// Returns `text`, parsed as statement, but only if it has no errors.
pub fn parse(text: &str) -> Result<Self, ()> {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It'd be nice to add a test for this function. The other parse functions in this file have tests - see test.rs in this directory

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Fair point. Doesn't seem like every type (ast::Attr right above) has an explicit test though.

Not sure what to add for the err case. Could add some garbage, but I guess we rather want bits of code that should explicitly not be parseable as Stmt. Suggestions?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ah, I hadn't seen the parser for ast::Attr, that must have been added since I implemented the others. I was thinking of the ast::Item::parse etc, which have tests - e.g. item_parser_tests.

For error cases, I guess:

  • Something that isn't a statement. e.g. and attribute #[foo]
  • Multiple statements: a(); b();
  • Something unparsable: )

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Just letting you know in case you feel like adding more tests to the other parsers.

Apart your suggestions I've added a bunch of items, expressions and blocks to the mix that can be parsed as Stmt.

@MarijnS95 MarijnS95 force-pushed the stmt branch 2 times, most recently from 380b872 to 253d651 Compare January 3, 2021 10:25
Copy link
Contributor

@davidlattimore davidlattimore left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

How fast are all those parser tests? Given that the parser is already pretty well tested elsewhere and the new API is just adding a thin layer of extra code, I'm wondering if that many tests might be too much. But if they run quickly, perhaps it doesn't matter.

Otherwise, this looks good to me.

Adjusting `grammar::fragments::stmt` to Optional or Yes will break
original functionality and tests.
@MarijnS95
Copy link
Contributor Author

MarijnS95 commented Jan 3, 2021

@davidlattimore The timing difference is pretty much negligible and within margin of error:

test result: ok. 47 passed; 0 failed; 0 ignored; 0 measured; 0 filtered out

cargo t -p syntax  21.40s user 0.09s system 2297% cpu 0.935 total
test result: ok. 48 passed; 0 failed; 0 ignored; 0 measured; 0 filtered out

cargo t -p syntax  20.98s user 0.20s system 2211% cpu 0.957 total

(Measurements taken after building the testing binary once, of course)

EDIT: However, fair to say littering tests around when they have little significance only confuses the reader above all. In the end I guess these serve to test the parser "setup" like whether trailing semicolons are allowed (which doesn't make sense on Items, those should be allowed regardless. But then those tests exist elsewhere as well).

@MarijnS95 MarijnS95 changed the title SSR: Preliminary support for let statement matching SSR: Support statement matching and replacing Jan 4, 2021
MarijnS95 added a commit to MarijnS95/rust-analyzer that referenced this pull request Jan 4, 2021
Now that statements can be matched and replaced (rust-lang#6587) some usecases
require expressions to be replaced with statements as well. This happens
when something that can ambiguously be an expression or statement like
`if` blocks and loops appear in the last position of a block: in this
case a replacement pattern of the form `if foo(){$a();}==>>$a();` will
only substitute `if` blocks in the list of statements but not if they
(implicitly) end up in the trailing expression, where they are not
wrapped by an EXPR_STMT (but the pattern and template are as only
succeeds for the `stmt ==>> stmt` case).

Instead of adding two rules that match an expression - and emit
duplicate matching errors - allow the template for expressions to be a
statement if it fails to parse as an expression.
MarijnS95 added a commit to MarijnS95/rust-analyzer that referenced this pull request Jan 4, 2021
Now that statements can be matched and replaced (rust-lang#6587) some usecases
require expressions to be replaced with statements as well. This happens
when something that can ambiguously be an expression or statement like
`if` and loop blocks appear in the last position of a block: in this
case a replacement pattern of the form `if foo(){$a();}==>>$a();` will
only substitute `if` blocks in the list of statements but not if they
(implicitly) end up in the trailing expression, where they are not
wrapped by an EXPR_STMT (but the pattern and template are as only
succeeds for the `stmt ==>> stmt` case).

Instead of adding two rules that match an expression - and emit
duplicate matching errors - allow the template for expressions to be a
statement if it fails to parse as an expression.
MarijnS95 added a commit to MarijnS95/rust-analyzer that referenced this pull request Jan 4, 2021
Now that statements can be matched and replaced (rust-lang#6587) some usecases
require expressions to be replaced with statements as well. This happens
when something that can ambiguously be an expression or statement like
`if` and loop blocks appear in the last position of a block, as trailing
expression. In this case a replacement pattern of the form `if
foo(){$a();}==>>$a();` will only substitute `if` blocks in the list of
statements but not if they (implicitly) end up in the trailing
expression, where they are not wrapped by an EXPR_STMT (but the pattern
and template are, as parsing only succeeds for the `stmt ==>> stmt`
case).

Instead of adding two rules that match an expression - and emit
duplicate matching errors - allow the template for expressions to be a
statement if it fails to parse as an expression.
@matklad
Copy link
Member

matklad commented Jan 4, 2021

let me

bors d=davidlattimore

@bors
Copy link
Contributor

bors bot commented Jan 4, 2021

✌️ davidlattimore can now approve this pull request. To approve and merge a pull request, simply reply with bors r+. More detailed instructions are available here.

@davidlattimore
Copy link
Contributor

bors r+

@bors
Copy link
Contributor

bors bot commented Jan 4, 2021

@bors bors bot merged commit ac123ac into rust-lang:master Jan 4, 2021
@MarijnS95 MarijnS95 deleted the stmt branch January 4, 2021 11:28
bors bot added a commit that referenced this pull request Jan 4, 2021
7147: ssr: Allow replacing expressions with statements r=davidlattimore a=MarijnS95

Depends on #6587

Until that is merged, the diff is https://github.com/MarijnS95/rust-analyzer/compare/stmt..replace-expr-with-stmt

---

Now that statements can be matched and replaced (#6587) some usecases require expressions to be replaced with statements as well. This happens when something that can ambiguously be an expression or statement like `if` and loop blocks appear in the last position of a block, as trailing expression. In this case a replacement pattern of the form `if foo(){$a();}==>>$a();` will only substitute `if` blocks in the list of statements but not if they (implicitly) end up in the trailing expression, where they are not wrapped by an EXPR_STMT (but the pattern and template are, as parsing only succeeds for the `stmt ==>> stmt` case).

Instead of adding two rules that match an expression - and emit duplicate matching errors - allow the template for expressions to be a statement if it fails to parse as an expression.

---

Another gross change that does not seem to break any tests currently, but perhaps a safeguard should be added to only allow this kind of replacement in blocks by "pushing" the replacement template to the statement list and clearing the trailing expression?

CC @davidlattimore 

Co-authored-by: Marijn Suijten <[email protected]>
ivan770 pushed a commit to ivan770/rust-analyzer that referenced this pull request Jan 22, 2021
Now that statements can be matched and replaced (rust-lang#6587) some usecases
require expressions to be replaced with statements as well. This happens
when something that can ambiguously be an expression or statement like
`if` and loop blocks appear in the last position of a block, as trailing
expression. In this case a replacement pattern of the form `if
foo(){$a();}==>>$a();` will only substitute `if` blocks in the list of
statements but not if they (implicitly) end up in the trailing
expression, where they are not wrapped by an EXPR_STMT (but the pattern
and template are, as parsing only succeeds for the `stmt ==>> stmt`
case).

Instead of adding two rules that match an expression - and emit
duplicate matching errors - allow the template for expressions to be a
statement if it fails to parse as an expression.
Matthias-Fauconneau pushed a commit to Matthias-Fauconneau/rust-analyzer that referenced this pull request Feb 7, 2021
Now that statements can be matched and replaced (rust-lang#6587) some usecases
require expressions to be replaced with statements as well. This happens
when something that can ambiguously be an expression or statement like
`if` and loop blocks appear in the last position of a block, as trailing
expression. In this case a replacement pattern of the form `if
foo(){$a();}==>>$a();` will only substitute `if` blocks in the list of
statements but not if they (implicitly) end up in the trailing
expression, where they are not wrapped by an EXPR_STMT (but the pattern
and template are, as parsing only succeeds for the `stmt ==>> stmt`
case).

Instead of adding two rules that match an expression - and emit
duplicate matching errors - allow the template for expressions to be a
statement if it fails to parse as an expression.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants