Allow passing an `output` to `Run()` to capture the output of a run #5295

DanielNoord · 2021-11-12T20:07:15Z

Current problem

It seems logical to be able to capture the output of a pylint run with code that is similar to this:

from io import StringIO
from pylint.reporters import text
from pylint.lint import Run


def main():
    pylint_output = StringIO()
    reporter = text.TextReporter(pylint_output)
    Run(["test_file.py"], reporter=reporter, do_exit=False)
    print(pylint_output.read()) # Show messages emitted by this pylint run


if __name__ == '__main__':
    main()

See for reference:
This StackOverflow question and #5260

Desired solution

A little investigation showed that the output argument passed to the reporter doesn't actually get used as the output of the PyLinter class. By setting PyLinter._output the expected behaviour might actually be triggerable, but normally we don't want users to directly access or initialise that class.
As @Pierre-Sassoulas noted in #5260 (comment) it might be something to do with kwargs not being taken into account.

To achieve the expected behaviour we could either 1) make the output passed to the reporter of the Run instance be used as the PyLinter output or 2) allow passing a new output argument to Run which achieves the same thing.

Additional context

No response

The text was updated successfully, but these errors were encountered:

areveny · 2021-11-13T15:14:47Z

Looking at this.

areveny · 2021-11-14T04:34:56Z

Well, I've learned a lot about the reporters but it seems to work as expected. The passed-in TextReporter with the file object output does overwrite the default output and get written to by the linter.

I think the problem is in the proposed code, which has to seek to the beginning of the file object before reading from it. The below code seems to work as expected:

def main():
    pylint_output = StringIO()
    reporter = text.TextReporter(pylint_output)
    Run(["test_file.py"], reporter=reporter, do_exit=False)
    pylint_output.seek(0)
    print(pylint_output.read()) # Show messages emitted by this pylint run

We could set the stream position to 0 when a file-like object is being used after all the checking but I'm not sure about the standard for if the user or module is responsible for this.

If we don't want to handle it then this issue can be closed and I'll leave a reply on the StackOverflow question.

Pierre-Sassoulas · 2021-11-14T06:39:43Z

Thank you for clearing that up @areveny (I think you just unstuck me on another issue regarding json report for 3.0). That is very counter intuitive, I think we should change the design so that it works like programmers expect it to work. What do you think ? If we change something the stackoverflow answer would still need to detail the behavior by pylint version.

DanielNoord · 2021-11-14T11:52:15Z

Personally, I think the example shown in the StackOverflow question and in the OP of this issue should work (perhaps with minor modification). The way the class and arguments are named creates the expectation that users should be able to do this.

areveny · 2021-11-14T17:20:05Z

Thanks for the feedback.

In preparing to implement this, I have a case for not making a change.

Most notably StringIO has a method getvalue that returns the contents of the entire buffer, preserving stream position.

def main():
    pylint_output = StringIO()
    reporter = text.TextReporter(pylint_output)
    Run(["test_file.py"], reporter=reporter, do_exit=False)
    print(pylint_output.getvalue())

This makes me feel that if a user is working with lower level text streams which expose an interface for seek, read, getvalue and so forth they should have an awareness of stream position, or at least know to use getvalue for safe reading (i.e. to use it over read when passing a stream to a program that makes no promises about cursor position.)

One could also imagine another program that relies on the current behavior of the cursor being left at the end of the file to continue writing additional text to the same stream. That would be a correct use case broken by this change.

Pierre-Sassoulas · 2021-11-14T22:27:01Z

Look like the correct fix to do is to document the get_value as the right way to do it in our documentation, then ? I think it's safe to say that we're not the only one who needs to dig the documentation about stream to be able to use them.

Or maybe (as we can't touch the stream), we can add a __str__ function to Reporters so we can do this ?

from io import StringIO
from pylint.reporters import text
from pylint.lint import Run


def main():
    pylint_output = StringIO()
    reporter = text.TextReporter(pylint_output)
    Run(["test_file.py"], reporter=reporter, do_exit=False)
    print(reporter) # Show messages emitted by this pylint run


if __name__ == '__main__':
    main()

areveny · 2021-11-14T23:15:06Z

I would like to update the documentation, most likely output.rst to describe some of the behaviors around reporters and injecting a custom reporter. To me, the stream interface and behavior we have now, used properly, is clean and well encapsulated.

I think adding __str__ and passing through to the underlying stream would be confusing, because if somebody adds more to the stream, then print(reporter) would include all the excess. An alternative implementation that saves messages in memory would be overhead on the default TextReporter to store messages that will rarely be accessed.

DanielNoord · 2021-11-14T23:27:35Z

@areveny I think everybody would be very glad with updated documentation. That's often one of the things that's left for "soon".

Since you clearly understand this particular part of codebase much better than I do I wanted to ask your opinion on one more solution: allow passing a keyword argument output_file to Run. We could leave the intricacies of streams and reporters and only add a little if-statement plus some handling to the end of each Run. Is this feasible? Or am I looking at this too simply?

areveny · 2021-11-15T00:04:49Z

There are a couple of existing options for adding an external file output. --output-format=file:report.txt for the linter or --output:report.txt on the call to Run. Both of these fold pretty well into the existing reporter framework. I think any other file-like object can be folded into TextReporter as in the example pretty well, and I think the existing code already does a good job of being pretty abstracted from the specifics of that stream object.

But I do think there are some interesting ideas. Maybe I will come back to the reporter/output infrastructure using what I've learned to handle some of these different cases and output options. Possibly as part of the refactor of PyLinter. For example, I there may be options that are redundant or clobber each other silently.

felixvd · 2021-11-15T05:32:29Z

Just for the record, I ended up simply calling via the command line:

ignored_violations = self.override_ignore.split(',') if self.override_ignore else [
   "R",  # Refactoring warnings
    "attribute-defined-outside-init", 
]
command = ["python3", "-m", "pylint", "--reports=n", "--disable=" + ",".join(ignored_violations), "--exit-zero", "--msg-template=\"{path}:{line}:{column}: {category} ({msg_id}, {symbol}, {obj}) {msg}\""] + filenames
# print(" ".join(command))  # To compare against the command line call
import subprocess
output = subprocess.check_output(command, env=env).decode('utf-8')

(For completeness, since I had to run this on legacy system with python2:)

try:
    output = subprocess.check_output(command, env=env).decode('utf-8')
except Exception as e:  # --exit-zero is not available on Python2
    output = e.output

If CLI is the intended interface, then just referring to it like this may be the easiest solution. The split between epylint and lint is somewhat confusing.

Pierre-Sassoulas · 2021-11-15T07:10:45Z

Thank you for clarifying @areveny, regarding the output file, would a context manager be a possibility ?

with open("myfile.out", "w") as f:
    reporter = text.TextReporter(f)
    Run(["test_file.py"], reporter=reporter, do_exit=False)
    # Something to do here ?

areveny · 2021-11-15T14:48:37Z

That should be an already functional way of writing to an external file that is pretty similar to what happens when one of the external file command line options is set.

TextReporter contains a text stream, so it leaves the cursor at the end when it's done, but we close the stream and read from the file so the stream position is irrelevant.

But this setup would allow you to append to a longer stream, if you want to build a report of some sort, you could continue to write to the same f stream before closing it.

DanielNoord added Enhancement ✨ Improvement to a component Help wanted 🙏 Outside help would be appreciated, good for new contributors Documentation 📗 High effort 🏋 Difficult solution or problem to solve labels Nov 12, 2021

Pierre-Sassoulas assigned areveny Nov 13, 2021

areveny mentioned this issue Nov 18, 2021

Update documentation for command-line output options and custom reporters #5335

Merged

Pierre-Sassoulas added this to the 2.12.0 milestone Nov 18, 2021

Pierre-Sassoulas closed this as completed in #5335 Nov 19, 2021

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Allow passing an `output` to `Run()` to capture the output of a run #5295

Allow passing an `output` to `Run()` to capture the output of a run #5295

DanielNoord commented Nov 12, 2021 •

edited

Loading

areveny commented Nov 13, 2021

areveny commented Nov 14, 2021

Pierre-Sassoulas commented Nov 14, 2021

DanielNoord commented Nov 14, 2021

areveny commented Nov 14, 2021

Pierre-Sassoulas commented Nov 14, 2021

areveny commented Nov 14, 2021

DanielNoord commented Nov 14, 2021

areveny commented Nov 15, 2021

felixvd commented Nov 15, 2021

Pierre-Sassoulas commented Nov 15, 2021

areveny commented Nov 15, 2021

Allow passing an output to Run() to capture the output of a run #5295

Allow passing an output to Run() to capture the output of a run #5295

Comments

DanielNoord commented Nov 12, 2021 • edited Loading

Current problem

Desired solution

Additional context

areveny commented Nov 13, 2021

areveny commented Nov 14, 2021

Pierre-Sassoulas commented Nov 14, 2021

DanielNoord commented Nov 14, 2021

areveny commented Nov 14, 2021

Pierre-Sassoulas commented Nov 14, 2021

areveny commented Nov 14, 2021

DanielNoord commented Nov 14, 2021

areveny commented Nov 15, 2021

felixvd commented Nov 15, 2021

Pierre-Sassoulas commented Nov 15, 2021

areveny commented Nov 15, 2021

Allow passing an `output` to `Run()` to capture the output of a run #5295

Allow passing an `output` to `Run()` to capture the output of a run #5295

DanielNoord commented Nov 12, 2021 •

edited

Loading