Skip to content

Include end column / range in message output #5336

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Closed
6 tasks done
cdce8p opened this issue Nov 18, 2021 · 28 comments
Closed
6 tasks done

Include end column / range in message output #5336

cdce8p opened this issue Nov 18, 2021 · 28 comments
Labels
Enhancement ✨ Improvement to a component High priority Issue with more than 10 reactions
Milestone

Comments

@cdce8p
Copy link
Member

cdce8p commented Nov 18, 2021

Current problem

At the moment, pylint only include a column offset in the output. That works find for command line or CI uses but doesn't work too well in IDEs. They depend on us to output an error range so they can effectively highlight the issue.
microsoft/vscode-python#18027

Desired solution

We already have node information for most / all (?) errors. Those include end column information. It should be fairly strait forward to add a new formatting option to the pylint output.

A Todo list:

Additional context

Maybe we can release a beta version first. With that IDEs and extensions could be updated and then it's up to us.
For that first version we wouldn't need to update the existing tests. It should be enough to enable the output.

--
Side note: IMO, this should be completely optional. It doesn't make much sense to update the default output.

@cdce8p cdce8p added Enhancement ✨ Improvement to a component Needs triage 📥 Just created, needs acknowledgment, triage, and proper labelling labels Nov 18, 2021
@cdce8p
Copy link
Member Author

cdce8p commented Nov 18, 2021

/CC: @Pierre-Sassoulas @DanielNoord This has come up recently with the last vscode update. I believe it would be worth prioritizing it and maybe even to include the beta version in 2.12.0. If we release it this weekend, there might be enough time to get it into vscode-python just in time for the last release of the year.

@DanielNoord
Copy link
Collaborator

Can we use end_col_offset and end_lineno?

I'm not sure what vscode (and other IDEs) need exactly and how they handle multi-line statements, but it seems ast might help us here.

@cdce8p
Copy link
Member Author

cdce8p commented Nov 18, 2021

Can we use end_col_offset and end_lineno?

That's what I was thinking

@Pierre-Sassoulas Pierre-Sassoulas added this to the 2.12.0 milestone Nov 18, 2021
@Pierre-Sassoulas Pierre-Sassoulas removed the Needs triage 📥 Just created, needs acknowledgment, triage, and proper labelling label Nov 18, 2021
@Pierre-Sassoulas
Copy link
Member

Decide what to output if we don't have an end column.

I would say only one char as it minimize visual disturbance when we don't know exactly ? There will be a lot of case where a full line would make more sense but it's hard to guess. We'd have to think about it for each message, right ?

@cdce8p cdce8p added the High priority Issue with more than 10 reactions label Nov 18, 2021
@DanielNoord
Copy link
Collaborator

@cdce8p Are you working on this? If not I can start the process of adding end_lineno and end_col_offset to nodes.Statement in astroid. That should be the first step I think.

@cdce8p
Copy link
Member Author

cdce8p commented Nov 19, 2021

@DanielNoord I've started working on the astroid side. Hope to have a first draft ready tomorrow. Would you like to take a look at the pylint side? Expect end_lineno: int | None and end_col_offset: int | None as attributes on NodeNG.

@cdce8p
Copy link
Member Author

cdce8p commented Nov 19, 2021

Opened pylint-dev/astroid#1258. In theory that should already work for most nodes. Haven't tested it yet.

The change got a bit larger than I expected initially. I somehow though that end_lineno and end_col_offset had already been added to the astroid nodes. As it's larger, we might want to reconsider the timing. Does it make sense to move it to 2.13?

@Pierre-Sassoulas
Copy link
Member

Does it make sense to move it to 2.13?

Depends on the number of issue we face when integrating. How doable is this before vscode pylint releases ? Also, is the vscode team ready to integrate pylint 2.12 at the last minute ? If there's little chance to make it in vscode in time then I'd say we push it to 2.13. The decision will probably be easier to take once pylint-dev/astroid#1258 is pylint tested (or pylint failed).

@DanielNoord
Copy link
Collaborator

@DanielNoord I've started working on the astroid side. Hope to have a first draft ready tomorrow. Would you like to take a look at the pylint side? Expect end_lineno: int | None and end_col_offset: int | None as attributes on NodeNG.

I willl start looking into it today!

@DanielNoord
Copy link
Collaborator

DanielNoord commented Nov 19, 2021

@Pierre-Sassoulas did you, by any chance, ever figure out how to update all functional test with the --update-functional-output option? I think it will be easier to compare the diff if we ran that over all files first to add the confidence and subsequently do a PR where we add the end line and column. Otherwise the diff will become extra large and might be more difficult to spot mistakes!

Edit: Found it! If we want, it is available on DanielNoord/tests, see DanielNoord#42

@Pierre-Sassoulas
Copy link
Member

did you, by any chance, ever figure out how to update all functional test with the --update-functional-output option?

😱 You never used that before ? Did you upgrade everything by hand ? (I figured it existed before doing #4004)

@DanielNoord
Copy link
Collaborator

Luckily I used it before, but some if-statements in the code prevented updating if the output was already "correct", even if it was missing the confidence. I thought it had something to do with only being able to do one test at a time, but I was incorrect. I forced it to overwrite everything for DanielNoord#42, which fixed it 😄

@Pierre-Sassoulas
Copy link
Member

the code prevented updating if the output was already "correct" ", even if it was missing the confidence

Ho, I understand now. I would have removed all the .txt but there are a lot of solution. Refactoring so the confidence is required when it's the default HIGH confidence would be one permanent solution (in 3.0 ?).

@DanielNoord
Copy link
Collaborator

I think we should do that anyway. As far as I know it is not a breaking change as this is only part of our own test validation.
Shall I open the PR with the confidence change against main? Or do you want to do it together with end column? You can look in the PR I linked to see what it would look like!

@Pierre-Sassoulas
Copy link
Member

Pierre-Sassoulas commented Nov 20, 2021

this is only part of our own test validation

Not exactly as testutil could be used by pylint plugin developers. But it's a small breaking change as the fix is easy : You only have to update the expected output with the update command . Thoughts ?

Or do you want to do it together with end column?

I think it make more sense to change everything at once as we need to upgrade for the column anyway to minimize conflicts when rebasing . We'll have to enforce rebasing after that change before merging to master or it will break master. One conflict vs 2 will save a lot of time when factoring all the branches and all the contributors. (It would also have been automatic as the output would not be correct anymore, but you already did the research and the work to upgrade without changes :)

@DanielNoord
Copy link
Collaborator

DanielNoord commented Nov 20, 2021

Not exactly as testutil could be used by pylint plugin developers. But it's a small breaking change as the fix is easy : You only have to update the expected output with the update command . Thoughts ?

Yeah, personally I think this can go in a minor release. pylint already handles cases where confidence is not supplied, developers don't need to change the way they are creating Message or anything. The test validator just ignores it. As it is so easy to adopt to with the command I think we could do this indeed.

I think it make more sense to change everything at once as we need to upgrade for the column anyway to minimize conflicts when rebasing . We'll have to enforce rebasing after that change before merging to master or it will break master. One conflict vs 2 will save a lot of time when factoring all the branches and all the contributors. (It would also have been automatic as the output would not be correct anymore, but you already did the research and the work to upgrade without changes :)

The thing is that the validator is currently not completely straightforward, as it needs to allow both len() == 5 and len() == 6. If we did the confidence PR first I could then do the change to the validator to 6 and then to 8. Instead of building on a somewhat "broken" validator we could first "fix" it and then add the end column and end line. That would make the review process easier I think. But since you (and Marc) are probably the ones reviewing I'll leave this up to you.

@Pierre-Sassoulas
Copy link
Member

Pierre-Sassoulas commented Nov 20, 2021

It I could then do the change to the validator to 6 and then to 8. Instead of building on a somewhat "broken" validator we could first "fix" it and then add the end column and end line.

I agree, could we do both change in a single MR in order to be able to squash later and have only one actual conflict during rebase later ? (I mean we can review each commits during the MR review)

@DanielNoord
Copy link
Collaborator

Just out of curiosity: would we ever consider making this a breaking change and go to 3.0?

We could sort out a number of issues:

  1. column being arbitrarily set to 0 if None is supplied -> We could bring this in line with end_col_offset which I would propose to make Optional[int] from the beginning.
  2. confidence not being checked by testutils

Furthermore we could implement this more easily and with more consistency:

  1. Add end_line and end_col_offset after line and col_offset in TextReporter output instead of after confidence.
  2. Break support for Message without MessageLocationTuple.

It would also make the final product offered to vscode and other IDEs more elegant and easier to interpret:
A pylint warning always has an associated line number. If we know the col_offset it is an int, otherwise it is None. If we know the end_line it is an int, otherwise it is None. If we know the end_col_offset it is an int, otherwise it is None.
Whether they want to underline the full line for a message without a col_offset would then be a visual decision up to them. Right now there is no way for them to infer if a message truly starts on col 0 or if we return 0 because there is no actual col_offset. Making this a breaking change would fix that.

I know we tend to not do major versions too often, but I think the improved consistency in output returned to IDEs might be something worth considering.

@cdce8p
Copy link
Member Author

cdce8p commented Nov 20, 2021

Disclaimer: I haven't followed the recent discussion closely.

IMO it's better to return 0 than None for unknown values. vscode-python only does some basic regex parsing of the output, so we should try to only emit one datatype. Even if it means that it can't be represented quite as it should be.

Form what I've seen while working on my PR, every node that doesn't have it's lineno and col_offset explicitly set to None, does have these. That's still true for end_lineno and end_col_offset. Maybe it would be better to fix the remaining once by inferring them ourselves. I.e. Arguments, Comprehensions, and MatchCase.

@Pierre-Sassoulas
Copy link
Member

would we ever consider making this a breaking change and go to 3.0?

Now that you say it it seems like if we do not make a major for adding line end / column end to the output we would hardly ever do a major. This change the way pylint's output will be consumed (unless I'm mistaken ?). There's a lot of thing to consider if we must release 3.0 in particular there's some other blocker on the json output and a lot of project management to update and keep consistent. Then integrator needs to update the way they use pylint (clearly it indicate a breaking change unless we add an option do get the new output). Even without taking into account the change to end line end column itself, I don't think we can do that in time for a vscode integration this week end.

@DanielNoord
Copy link
Collaborator

DanielNoord commented Nov 20, 2021

I've only taken a very quick look at the regex @cdce8p mentioned for vscode-python:
https://github.com/microsoft/vscode-python/blob/24fb1c22763663055655f237510f82148250bb48/src/client/linters/pylint.ts#L11

I believe it is:
const REGEX = '(?<line>\\d+),(?<column>-?\\d+),(?<type>\\w+),(?<code>[\\w-]+):(?<message>.*)\\r?(\\n|$)';

I think we can't avoid to make them (or us with a PR) change how they interpret the output. To me it makes no sense to put end_col after message, so we will need to insert it in the standard pattern anyway.

They are making this regex themselves btw a couple lines below with:
"--msg-template='{line},{column},{category},{symbol}:{msg}'". So it should be quite easy to change this.

IMO it's better to return 0 than None for unknown values. vscode-python only does some basic regex parsing of the output, so we should try to only emit one datatype. Even if it means that it can't be represented quite as it should be.

Based on this pattern we would be fine if we would use "" in the final string output. It would then become something like:
const REGEX = '(?<line>\\d+),(?<column>-?\\d+),(?<end_line>-?\\d+), (?<end_column>-?\\d+),(?<type>\\w+),(?<code>[\\w-]+):(?<message>.*)\\r?(\\n|$)';.
I need to brush up my javascript regex, but I think the ,'s allow us to demarcate the end of end_column even if it is empty.

My suggestion would be to leave col_offset None until the moment we actually print an output. Currently if somebody wrote their own Reporter they could not infer whether col_offset is truly 0 or not. I think there is a possibility to do so while still allowing regex interpretation of our final printed output.

@cdce8p
Copy link
Member Author

cdce8p commented Nov 20, 2021

would we ever consider making this a breaking change and go to 3.0?

Now that you say it it seems like if we do not make a major for adding line end / column end to the output we would hardly ever do a major. This change the way pylint's output will be consumed (unless I'm mistaken ?).

My proposal was only to allow the output of end_line, end_column. Not to change the default output. I.e. if someone wants it, like IDE extensions, they would need to specify it in --msg-template. Thus this shouldn't be a breaking change.

Even without taking into account the change to end line end column itself, I don't think we can do that in time for a vscode integration this week end.

A new release is usually in the first week of a month. So 1,5 - 2 weeks from now. If we manage to get a new release out before then, I think I would be able to update vscode-python myself. It would be ok if that release only includes the first beta version, we don't need to check each message for it. It's only important that the output option is included. After the update to vscode-python, we don't depend on their schedule anymore for updates.

They are making this regex themselves btw a couple lines below with:
"--msg-template='{line},{column},{category},{symbol}:{msg}'". So it should be quite easy to change this.

Yeah, the msg-template should be updated.

My suggestion would be to leave col_offset None until the moment we actually print an output. Currently if somebody wrote their own Reporter they could not infer whether col_offset is truly 0 or not. I think there is a possibility to do so while still allowing regex interpretation of our final printed output.

That might work.

@DanielNoord
Copy link
Collaborator

Making None output as "" in our TextReporter could be done by changing to following code:
https://github.com/PyCQA/pylint/blob/bcdb98a68407f57243b509dd35ab24c418485138/pylint/reporters/text.py#L190-L192

To something like:

    def write_message(self, msg: Message) -> None:
        """Convenience method to write a formatted message with class default template"""
        self_dict = msg._asdict()
        for key, value in self_dict.items():
            if value is None:
                self_dict[key] = ""
        self.writeln(self._template.format(**self_dict))

@cdce8p
Copy link
Member Author

cdce8p commented Nov 22, 2021

A little preview ;)

Screen Shot 2021-11-22 at 17 35 47

@cdce8p
Copy link
Member Author

cdce8p commented Dec 10, 2021

My PR for vscode-python has been merged yesterday. Unfortunately, we just missed the release this month and due to the holidays there won't be one in January 😞

If someone would like to try it earlier though, it is included in the latest insider release. Just add this to your VS Code config

"python.insidersChannel": "daily"

@Pierre-Sassoulas
Copy link
Member

Unfortunately, we just missed the release this month

Ho, that's too bad ! It's still merged though which is very good. By the way I did not know you were also fluent in typescript, impressive ! 😄

@DanielNoord
Copy link
Collaborator

With #5383 I think we have finished all that needs to be done for this issue on our side?

@cdce8p cdce8p modified the milestones: 2.13.0, 2.12.2 Dec 13, 2021
@cdce8p
Copy link
Member Author

cdce8p commented Dec 13, 2021

The next step would probably to check existing messages, but that can be a gradual process. There is also #5466 which ultimately needs some astroid love. Although a temporary fix in pylint is probably better for the moment.

I've change the milestone to 2.12.2. Let's close the issue.
@DanielNoord @Pierre-Sassoulas Thanks you both for helping out! 🚀

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Enhancement ✨ Improvement to a component High priority Issue with more than 10 reactions
Projects
None yet
Development

No branches or pull requests

3 participants