Skip to content

Allow passing an output to Run() to capture the output of a run #5295

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Closed
DanielNoord opened this issue Nov 12, 2021 · 12 comments · Fixed by #5335
Closed

Allow passing an output to Run() to capture the output of a run #5295

DanielNoord opened this issue Nov 12, 2021 · 12 comments · Fixed by #5335
Assignees
Labels
Documentation 📗 Enhancement ✨ Improvement to a component Help wanted 🙏 Outside help would be appreciated, good for new contributors High effort 🏋 Difficult solution or problem to solve
Milestone

Comments

@DanielNoord
Copy link
Collaborator

DanielNoord commented Nov 12, 2021

Current problem

It seems logical to be able to capture the output of a pylint run with code that is similar to this:

from io import StringIO
from pylint.reporters import text
from pylint.lint import Run


def main():
    pylint_output = StringIO()
    reporter = text.TextReporter(pylint_output)
    Run(["test_file.py"], reporter=reporter, do_exit=False)
    print(pylint_output.read()) # Show messages emitted by this pylint run


if __name__ == '__main__':
    main()

See for reference:
This StackOverflow question and #5260

Desired solution

A little investigation showed that the output argument passed to the reporter doesn't actually get used as the output of the PyLinter class. By setting PyLinter._output the expected behaviour might actually be triggerable, but normally we don't want users to directly access or initialise that class.
As @Pierre-Sassoulas noted in #5260 (comment) it might be something to do with kwargs not being taken into account.

To achieve the expected behaviour we could either 1) make the output passed to the reporter of the Run instance be used as the PyLinter output or 2) allow passing a new output argument to Run which achieves the same thing.

Additional context

No response

@DanielNoord DanielNoord added Enhancement ✨ Improvement to a component Help wanted 🙏 Outside help would be appreciated, good for new contributors Documentation 📗 High effort 🏋 Difficult solution or problem to solve labels Nov 12, 2021
@areveny
Copy link
Contributor

areveny commented Nov 13, 2021

Looking at this.

@areveny
Copy link
Contributor

areveny commented Nov 14, 2021

Well, I've learned a lot about the reporters but it seems to work as expected. The passed-in TextReporter with the file object output does overwrite the default output and get written to by the linter.

I think the problem is in the proposed code, which has to seek to the beginning of the file object before reading from it. The below code seems to work as expected:

def main():
    pylint_output = StringIO()
    reporter = text.TextReporter(pylint_output)
    Run(["test_file.py"], reporter=reporter, do_exit=False)
    pylint_output.seek(0)
    print(pylint_output.read()) # Show messages emitted by this pylint run

We could set the stream position to 0 when a file-like object is being used after all the checking but I'm not sure about the standard for if the user or module is responsible for this.

If we don't want to handle it then this issue can be closed and I'll leave a reply on the StackOverflow question.

@Pierre-Sassoulas
Copy link
Member

Thank you for clearing that up @areveny (I think you just unstuck me on another issue regarding json report for 3.0). That is very counter intuitive, I think we should change the design so that it works like programmers expect it to work. What do you think ? If we change something the stackoverflow answer would still need to detail the behavior by pylint version.

@DanielNoord
Copy link
Collaborator Author

Personally, I think the example shown in the StackOverflow question and in the OP of this issue should work (perhaps with minor modification). The way the class and arguments are named creates the expectation that users should be able to do this.

@areveny
Copy link
Contributor

areveny commented Nov 14, 2021

Thanks for the feedback.

In preparing to implement this, I have a case for not making a change.

Most notably StringIO has a method getvalue that returns the contents of the entire buffer, preserving stream position.

def main():
    pylint_output = StringIO()
    reporter = text.TextReporter(pylint_output)
    Run(["test_file.py"], reporter=reporter, do_exit=False)
    print(pylint_output.getvalue())

This makes me feel that if a user is working with lower level text streams which expose an interface for seek, read, getvalue and so forth they should have an awareness of stream position, or at least know to use getvalue for safe reading (i.e. to use it over read when passing a stream to a program that makes no promises about cursor position.)

One could also imagine another program that relies on the current behavior of the cursor being left at the end of the file to continue writing additional text to the same stream. That would be a correct use case broken by this change.

@Pierre-Sassoulas
Copy link
Member

Look like the correct fix to do is to document the get_value as the right way to do it in our documentation, then ? I think it's safe to say that we're not the only one who needs to dig the documentation about stream to be able to use them.

Or maybe (as we can't touch the stream), we can add a __str__ function to Reporters so we can do this ?

from io import StringIO
from pylint.reporters import text
from pylint.lint import Run


def main():
    pylint_output = StringIO()
    reporter = text.TextReporter(pylint_output)
    Run(["test_file.py"], reporter=reporter, do_exit=False)
    print(reporter) # Show messages emitted by this pylint run


if __name__ == '__main__':
    main()

@areveny
Copy link
Contributor

areveny commented Nov 14, 2021

I would like to update the documentation, most likely output.rst to describe some of the behaviors around reporters and injecting a custom reporter. To me, the stream interface and behavior we have now, used properly, is clean and well encapsulated.

I think adding __str__ and passing through to the underlying stream would be confusing, because if somebody adds more to the stream, then print(reporter) would include all the excess. An alternative implementation that saves messages in memory would be overhead on the default TextReporter to store messages that will rarely be accessed.

@DanielNoord
Copy link
Collaborator Author

@areveny I think everybody would be very glad with updated documentation. That's often one of the things that's left for "soon".

Since you clearly understand this particular part of codebase much better than I do I wanted to ask your opinion on one more solution: allow passing a keyword argument output_file to Run. We could leave the intricacies of streams and reporters and only add a little if-statement plus some handling to the end of each Run. Is this feasible? Or am I looking at this too simply?

@areveny
Copy link
Contributor

areveny commented Nov 15, 2021

There are a couple of existing options for adding an external file output. --output-format=file:report.txt for the linter or --output:report.txt on the call to Run. Both of these fold pretty well into the existing reporter framework. I think any other file-like object can be folded into TextReporter as in the example pretty well, and I think the existing code already does a good job of being pretty abstracted from the specifics of that stream object.

But I do think there are some interesting ideas. Maybe I will come back to the reporter/output infrastructure using what I've learned to handle some of these different cases and output options. Possibly as part of the refactor of PyLinter. For example, I there may be options that are redundant or clobber each other silently.

@felixvd
Copy link
Contributor

felixvd commented Nov 15, 2021

Just for the record, I ended up simply calling via the command line:

ignored_violations = self.override_ignore.split(',') if self.override_ignore else [
   "R",  # Refactoring warnings
    "attribute-defined-outside-init", 
]
command = ["python3", "-m", "pylint", "--reports=n", "--disable=" + ",".join(ignored_violations), "--exit-zero", "--msg-template=\"{path}:{line}:{column}: {category} ({msg_id}, {symbol}, {obj}) {msg}\""] + filenames
# print(" ".join(command))  # To compare against the command line call
import subprocess
output = subprocess.check_output(command, env=env).decode('utf-8')

(For completeness, since I had to run this on legacy system with python2:)

try:
    output = subprocess.check_output(command, env=env).decode('utf-8')
except Exception as e:  # --exit-zero is not available on Python2
    output = e.output

If CLI is the intended interface, then just referring to it like this may be the easiest solution. The split between epylint and lint is somewhat confusing.

@Pierre-Sassoulas
Copy link
Member

Thank you for clarifying @areveny, regarding the output file, would a context manager be a possibility ?

with open("myfile.out", "w") as f:
    reporter = text.TextReporter(f)
    Run(["test_file.py"], reporter=reporter, do_exit=False)
    # Something to do here ?

@areveny
Copy link
Contributor

areveny commented Nov 15, 2021

That should be an already functional way of writing to an external file that is pretty similar to what happens when one of the external file command line options is set.

TextReporter contains a text stream, so it leaves the cursor at the end when it's done, but we close the stream and read from the file so the stream position is irrelevant.

But this setup would allow you to append to a longer stream, if you want to build a report of some sort, you could continue to write to the same f stream before closing it.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Documentation 📗 Enhancement ✨ Improvement to a component Help wanted 🙏 Outside help would be appreciated, good for new contributors High effort 🏋 Difficult solution or problem to solve
Projects
None yet
Development

Successfully merging a pull request may close this issue.

4 participants