Skip to content

errors: speedup for large error counts #12631

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged
merged 5 commits into from
Apr 29, 2022
Merged
Changes from 1 commit
Commits
File filter

Filter by extension

Filter by extension

Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
17 changes: 13 additions & 4 deletions mypy/errors.py
Original file line number Diff line number Diff line change
Expand Up @@ -134,6 +134,11 @@ class Errors:
# files were processed.
error_info_map: Dict[str, List[ErrorInfo]]

# optimization for legacy codebases with many files with errors
has_real_errors: bool = False
has_blockers: bool = False
total_error_count: int = 0

# Files that we have reported the errors for
flushed_files: Set[str]

Expand Down Expand Up @@ -235,7 +240,7 @@ def copy(self) -> 'Errors':
return new

def total_errors(self) -> int:
return sum(len(errs) for errs in self.error_info_map.values())
return self.total_error_count
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is only used in one place, where we check that the error count before and after some operation is the same. I wonder if we can do something simpler there.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Tried something a little more complex but which doesn't require maintaining a state that might diverge


def set_ignore_prefix(self, prefix: str) -> None:
"""Set path prefix that will be removed from all paths."""
Expand Down Expand Up @@ -357,6 +362,11 @@ def _add_error_info(self, file: str, info: ErrorInfo) -> None:
if file not in self.error_info_map:
self.error_info_map[file] = []
self.error_info_map[file].append(info)
self.total_error_count += 1
if info.severity == 'error':
self.has_real_errors = True
if info.blocker:
self.has_blockers = True
if info.code is IMPORT:
self.seen_import_error = True

Expand Down Expand Up @@ -553,12 +563,11 @@ def is_errors(self) -> bool:

def is_real_errors(self) -> bool:
"""Are there any generated errors (not just notes, for example)?"""
return any(info.severity == 'error'
for infos in self.error_info_map.values() for info in infos)
return self.has_real_errors

def is_blockers(self) -> bool:
"""Are the any errors that are blockers?"""
return any(err for errs in self.error_info_map.values() for err in errs if err.blocker)
return self.has_blockers
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This one makes sense to me. We check for blockers after processing every single file, so with a lot of errors that naturally gets quadratic.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Add to change from a single boolean to a path-indexed Set to accommodate removal of errors


def blocker_module(self) -> Optional[str]:
"""Return the module with a blocking error, or None if not possible."""
Expand Down