Skip to content

More context on diagnostics #12583

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Closed
josevalim opened this issue May 22, 2023 · 28 comments
Closed

More context on diagnostics #12583

josevalim opened this issue May 22, 2023 · 28 comments

Comments

@josevalim
Copy link
Member

@josevalim I have another ask, I'll describe the problem here and we can create another issue if you so desire.

I was looking at a bug in lexical with code actions. Given the following:

case something() do 
  {:ok, unused} -> 
      :ok 
  :error -> 
    :error 
end

The second line will report an unused variable, unused. A LSP server can then emit "code actions" that fix these problems, mainly by replacing the variables with an underscore. The way this is currently implemented is as follows.

  1. Get the line with the warning on it {:ok, unused} ->
  2. Turn that into AST
  3. Look for the variable name in the ast
  4. Replace that variable with the underscored version
  5. Turn the resulting fixed AST back into a string
  6. Run a diff on the original string and the fixed string to get an Edit
  7. Emit the edit back to the client

For this bit of code, the above steps fail at the second step, as {:ok, unused} -> is not parseable by any of the methods that I'm familiar with. I currently use ElixirSense to parse because it fixes errors, but it can't handle this. I also tried Code.string_to_quoted which failed (obviously) as did everything in Code.Fragment.

Previously, I asked for a function in Code.Fragment that would emit the AST until the cursor position. I think that would be helpful in this case as well. Alternately, having the ability to parse a chunk of code that's known to likely be invalid would also be helpful, but I think I understand why this will be much harder.

Originally posted by @scohen in #12566 (comment)

@josevalim
Copy link
Member Author

@scohen wouldnt it be better to include the column in the diagnostic? This way you know exactly where to fix it?

@scohen
Copy link
Contributor

scohen commented May 22, 2023

Right now, due to the other issue that I'm not remembering (the one where warnings are just printed to the console rather than being returned) I have to parse out the line number and variable name from a string, so that compounds the problem.

In this case, I think that using string replacements is much less safe than using an AST transform. One of the nicest things about elixir is the ability to change code with code, which enables a much more reliable transformation. I'm trying to build a language server that provides abstractions around doing transformations like this, and I would much rather that extensions deal with transforming AST rather than using string manipulation. Replacing variables with and underscored version is only the beginning here, I'd like to include refactoring support as well, and having two ways to do code modification (AST and string manipulation) doesn't seem ideal to me.

@scohen
Copy link
Contributor

scohen commented May 22, 2023

For example, the entire code replacement for the above is just this:

    underscored_variable_name = :"_#{unused_variable_name}"
    Macro.postwalk(quoted_ast, fn
      {^unused_variable_name, meta, nil} ->
        {underscored_variable_name, meta, nil}

      other ->
        other
    end)
    |> Macro.to_string()

That's pretty clean.

@josevalim
Copy link
Member Author

Got it. Elixir v1.15 has Code.with_diagnostics for capturing the warning. The parsing is meant to be tackled in #11967. :)

@josevalim
Copy link
Member Author

@scohen said:

I see how this gets tricky; Tuples have a different AST representation than something like a map. In this case, I feel what you have there should be equivalent to what I'd desire (but with a cursor in there) Same with the do/end. I really only care about the things before the cursor, and I would likely be feeding

def foo do 
  {foo, 

to the function. I would say that the cursor position is very relevant to me.
This would make me happy:

{:def, [line: 1], 
  [{:foo, [line: 1], nil}, 
    [do: 
      {:foo, {:__cursor__, [line: 2, column: 8], []}]]}

@josevalim
Copy link
Member Author

I think I am missing a couple things:

  1. You say that you need only part of the AST, but don’t you need the full AST in order to rewrite and traverse it?
  2. What happens when the AST has the same variable elsewhere but it is used? Don’t you need your traversal to be precise in regards to its scope?

@scohen
Copy link
Contributor

scohen commented May 22, 2023

I think there are two conflicting things at work here.
The first is that I'm trying to provide you with a concrete example where the current AST-based approach fails and seemingly has no way forward. That's been done with the above single-line example.
The second thing is I'm trying to anticipate other ways in which the AST approach will work, which is admittedly harder as it's not actually implemented, and I can't anticipate the stumbling blocks as well.

To answer your questions directly:

don’t you need the full AST in order to rewrite and traverse it?

Not really, if I have a warning on a line, I really only need the AST from that line to fix the warning

What happens when the AST has the same variable elsewhere but it is used? Don’t you need your traversal to be precise in regards to its scope?

If I read this correctly, I think this would be along the lines of a "rename variable" kind of fix. In that case, yes, I would need more of the AST so that I could anticipate scoping correctly. In that case, I'd likely need the entire file, which is definitely possible.

@josevalim
Copy link
Member Author

Not really, if I have a warning on a line, I really only need the AST from that line to fix the warning

I am not convinced this will work. Elixir has no tooling to print incomplete code. Imagine you have a three element tuple and the one in the middle needs to be fixed. How are you going to amend the AST up to the cursor and join it with the rest of the file? The AST stitching would be very complex.

Alternatively you could also print the partial AST only up to said line, but then you would have to print an incomplete tuple, which is also not an option.

so I don’t see how this could be done without the full AST. And even the full AST comes with the pitfall that you will format the user code.

in this case, I think a textual fix is the simpler. And other workflows could use the AST fix.

@scohen
Copy link
Contributor

scohen commented May 22, 2023

I'm really confused by your response. I was answering the question of if I need to complete AST in order to replace an unused variable on a specific line. By definition, I only need the one line (my assumption is that an unused variable can only appear on one line). In my above example, I would use the context at the end of the line, for reasons you pointed out.

@scohen
Copy link
Contributor

scohen commented May 22, 2023

in this case, I think a textual fix is the simpler. And other workflows could use the AST fix.

I would really prefer to present one interface to the implementers of code actions. Knowing when and how to switch between the two will make building for the project a lot more difficult.

@josevalim
Copy link
Member Author

josevalim commented May 22, 2023

What I am saying is that this code (#12583 (comment)) with a partial AST won’t ever work or have many pitfalls.

For example, imagine you have this:

def foo({var1, var2, var3}) do
  bar
end

And var2 is unused. The AST would be:

{:def, [], [{:foo, [], [{:{}, [], [{:var1, [], nil]}, {:var2, [], nil]}]}

Even if you rename var2 to _var2, there is no way for you to say: “print it only until var2, not closing any of the parens or tuple along the way”. And doing so would be complex.

And even if we give a partial or complete AST, you can still have “var2 = :foo” just before that definition which you don’t want to rename.

Given the input above, can you tell me what are the inputs and outputs of all of the steps you outlined in your original comment?

@scohen
Copy link
Contributor

scohen commented May 22, 2023

You're saying that I'm doing something that I'm not doing ;).

I'm not printing out a subset of the line. I'm taking the entire line's AST, and running it through a replacement function, then I emit the line (and yes, it will emit an end block, which I remove). I then do a textual diff against the original line and emit that as an edit.

The example you cite above works perfectly in lexical. This approach falls apart for those circumstances when the single line can't be parsed. That's what I'm asking for help with.

@scohen
Copy link
Contributor

scohen commented May 22, 2023

And even if we give a partial or complete AST, you can still have “var2 = :foo” just before that definition which you don’t want to rename.

Can you give a concrete example?

@josevalim
Copy link
Member Author

I don’t think we will ever be able to parse a single line. For example, how to parse the second line here:

{
  var1, var2, var3
}

and then, how to format the second line which can be incomplete in all kinds of ways?

even building the AST for a partial code would be very hard. cursor parsing requires all code up to the cursor, and that’s what the assumption was based on. And even that is hard per the above.

@josevalim
Copy link
Member Author

Another way to put it: it is impossible to parse a single line. :) so that can’t happen.

And then you can have this line:

var2 = foo(var2)

where only one of them is unused.

@scohen
Copy link
Contributor

scohen commented May 22, 2023

Another way to put it: it is impossible to parse a single line. :) so that can’t happen.

Ok, can I parse up to a line with a cursor then? Will / can that be made to work? Or, maybe if I'm doing that, container_cursor_to_quoted might be an appropriate option here.

where only one of them is unused.

Yes, this is correct, the current approach will replace both. Having columns would also be appreciated.

@scohen
Copy link
Contributor

scohen commented May 22, 2023

Heck let's say that the code does compile ; I'd even be cool with having something that inserts a cursor into the AST at a position I specify so I can pull things out easily.

@josevalim
Copy link
Member Author

Ok, can I parse up to a line with a cursor then? Will / can that be made to work? Or, maybe if I'm doing that, container_cursor_to_quoted might be an appropriate option here.

Exactly but my point in this earlier comment is that stitching the result of container_cursor_to_quoted back into the original code will be equally hard. I don't think you can escape from going with the full AST and then finding the particular variable by line and columns.

Heck let's say that the code does compile ;

The code does compile though. Otherwise we would not have gotten warnings. :)

I'd even be cool with having something that inserts a cursor into the AST at a position I specify so I can pull things out easily.

I think you should be able to do that with container_cursor_to_quoted. You need a function that count lines, count columns, and compute the byte offset. Pseudo code:

byte_offset = byte_offset_from_line_column(source, line, column)
Code.Fragment.container_cursor_to_quoted(source[0..byte_offset-1])

I still think, for this particular case, the textual approach is simpler. I wonder if we should make this part of the existing diagnostics? What if we have a :fix field in diagnostic, with a module, function, args to be called that will fix the source code?

@scohen
Copy link
Contributor

scohen commented May 22, 2023

The code does compile though

I know, I was saying, "let's forget the case when it doesn't compile". We have a couple cases here, the concrete warnings-based case and the abstract "i have a general fix for something" case.

What if we have a :fix field in diagnostic, with a module, function, args to be called that will fix the source code?

That would be cool indeed.

I really understand what you're trying to accomplish here, and it indeed might be simpler in this particular case to apply the fix textually, but I kind of have some lofty goals for what lexical will be able to do, and having a single way to patch and update code is very core to that. In the future, I'd like to distribute a library that enables authors of other libraries to generate diagnostics / generate code actions and refactorings and other things just by including our library and implementing some functions. Exposing a single API for source code translation will make this a lot easier. I've been bitten a lot by the strategy of textual replacement in code in the past, and perhaps my scars are showing.

I think I actually have enough written in lexical right now to support the suggestion you gave above. I'll play with it and see and report back to you.

@scohen
Copy link
Contributor

scohen commented May 22, 2023

@josevalim I played around a little bit and it's so close, but I don't think it's going to work. Perhaps you can shed light on what if anything I'm doing incorrectly.
Say I have the following code:

defmodule UnderTest do 
  def func(thing) do 
     case thing do 
       {:ok, unused} -> 
           :ok 
       _other -> 
          :error
     end
  end
end

I sent everything up to and including line 4 {:ok, unused} -> to Code.Fragment.container_cursor_to_quoted but got this back:

 {:defmodule, [line: 1],
  [
    {:__aliases__, [line: 1], [:UnderTest]},
    [
      do: {:def, [line: 2],
       [
         {:func, [line: 2], [{:thing, [line: 2], nil}]},
         [
           do: {:case, [line: 3],
            [{:thing, [line: 3], nil}, [do: {:__cursor__, [line: 4], []}]]}
         ]
       ]}
    ]
  ]}

This is so tantalizingly close, but the part of line 4 that I would like to examine ({:ok, unused} -> :ok) isn't present. Is there a way to make it so I can see what's actually before the cursor?

@scohen
Copy link
Contributor

scohen commented May 22, 2023

Alternately, since the thing compiles, I can just use Code.string_to_quoted and find the line that way.

@josevalim
Copy link
Member Author

I really understand what you're trying to accomplish here, and it indeed might be simpler in this particular case to apply the fix textually,

I understand your side too. You are not trying to fix the particular variable renaming problem, you want to provide a framework for people to solve this problem. And solving the variable problem is a way for you dogfood your own API.

I think both problems are worth solving!

Alternately, since the thing compiles, I can just use Code.string_to_quoted and find the line that way.

Yes. 100% the way to go and that solves a whole bunch of other issues. If you want to use container_cursor_to_quoted, put the cursor immediately after the variable (and skipping the rest of the line).

@scohen
Copy link
Contributor

scohen commented May 22, 2023

You are not trying to fix the particular variable renaming problem, you want to provide a framework for people to solve this problem. And solving the variable problem is a way for you dogfood your own API.

Yep, that's exactly it.

Yes. 100% the way to go

I'm going to play around a bit, and I wonder if I can provide a lens-like approach where the whole document gets parsed, but we're able to limit the changes to a range to be examined, changed and pulled back out when turned back into a string.

@josevalim
Copy link
Member Author

@scohen can you provide examples of other warnings and errors given by the Elixir compiler that could be automatically fixed?

@scohen
Copy link
Contributor

scohen commented May 22, 2023

Is there a place in the code where errors and warnings are enumerated? I suspect a lot of them can be fixed not completely automatically, but possibly with some user intervention. One candidate would be "this clause cannot match because the other on line x always matches". The fix would be to swap the branches.

This seems icky and scary, but keep in mind, the user would have to approve the action.

@scohen
Copy link
Contributor

scohen commented May 23, 2023

@josevalim Another thing that can sometimes be fixed are missing ends. We currently parse the error message so we can see the source of the error, for example:

Enum.map(foos, fn x -> 
  x + 1
)

We will highlight both the ending paren, which is where the diagnostic is emitted and also the beginning fn x-> to provide a more relevant context (the error message says something like "the fn here is missing an end"). I believe one fix that would work in a bunch of cases would be to add an end before the missing token. Again, it wouldn't fix every case, but that's OK, since the user has to select the code actions.

@josevalim
Copy link
Member Author

You can search for format_error functions at the bottom of the .erl files: https://github.com/elixir-lang/elixir/blob/main/lib/elixir/src/elixir_aliases.erl#L174

@scohen
Copy link
Contributor

scohen commented May 23, 2023

thank you!
Another one:
<function name> is undefined or private. Did you mean: <list of candidates>

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Development

No branches or pull requests

2 participants