-
-
Notifications
You must be signed in to change notification settings - Fork 1.2k
[pyreverse] Fix duplicate arrows when class attribute is assigned more than once #10333
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. Weβll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: main
Are you sure you want to change the base?
Conversation
for more information, see https://pre-commit.ci
Codecov ReportAttention: Patch coverage is
β Your patch check has failed because the patch coverage (76.92%) is below the target coverage (100.00%). You can increase the patch coverage or adjust the target coverage. Additional details and impacted files@@ Coverage Diff @@
## main #10333 +/- ##
==========================================
- Coverage 95.90% 95.85% -0.05%
==========================================
Files 176 176
Lines 19108 19145 +37
==========================================
+ Hits 18325 18352 +27
- Misses 783 793 +10
π New features to boost your workflow:
|
This comment has been minimized.
This comment has been minimized.
for more information, see https://pre-commit.ci
for more information, see https://pre-commit.ci
Hi @pranav-data, thanks for the PR! π |
This comment has been minimized.
This comment has been minimized.
@DudeNr33 I have removed the files and pushed. |
I have been noticing duplicate class definitions captured, in a project I am working on, but unable to reproduce it in a simpler example for a unit test. The parts where code coverage is lacking is addressing that issue. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I have a few remarks before we can merge this:
- The logic to filter out the duplications is better suited in
diagrams.ClassDiagram
. This is where the logic to extract the relationships lives.diadefslib
is more about handling the configuration options provided by the user when generating the diagrams.
Ideally, the duplicates are not added to the relationships in the first place instead of doing post-processing to get rid of them. - The description links multiple issues as 'closed', but there are no testcases for the examples given in Pyreverse: Duplicated class variablesΒ #8046 - could add them to verify if this solution also solves the problems mentioned in the isse? Pay attention to this comment as there are some caveats when differentiating class and instance attributes.
- New code without code coverage (removal of duplicated classes) is not ideal. It is better to split this PR up into two parts: first MR that focuses on the duplicated relationship arrows (where we have reproduction examples from the linked issues). The second MR for the duplicated classes, but only once a minimal reproduction example can be provided.
Thank you for taking the time fixing bugs in pyreverse
!
π€ According to the primer, this change has no effect on the checked open source code. π€π This comment was generated for commit 608bbc0 |
Removing the duplicate from diagrams.py could clean up the classes but messes up the relationships. From my debugging, it appears that the duplicate originates from how astroid parses the repository for dependencies, where one copy of the class is required to establish the relationship. Breaking it down to 2 separate fixes will fix one while breaking the other. |
I do not see yet how moving the logic to I tried it locally and the test still passes, expand to see the diff of my changes: diff --git a/pylint/pyreverse/diadefslib.py b/pylint/pyreverse/diadefslib.py
index f91a89eca..d3b045437 100644
--- a/pylint/pyreverse/diadefslib.py
+++ b/pylint/pyreverse/diadefslib.py
@@ -68,10 +68,7 @@ class DiaDefGenerator:
leaf_nodes = [
module
for module in self.args
- if not any(
- other != module and other.startswith(module + ".")
- for other in self.args
- )
+ if not any(other != module and other.startswith(module + ".") for other in self.args)
]
return leaf_nodes
@@ -120,12 +117,8 @@ class DiaDefGenerator:
absolute_depth = name.count(".")
# Retrieve the base depth to compare against
- relative_depth = next(
- (v for k, v in self.args_depths.items() if name.startswith(k)), None
- )
- return relative_depth is not None and bool(
- (absolute_depth - relative_depth) <= self.config.max_depth
- )
+ relative_depth = next((v for k, v in self.args_depths.items() if name.startswith(k)), None)
+ return relative_depth is not None and bool((absolute_depth - relative_depth) <= self.config.max_depth)
def show_node(self, node: nodes.ClassDef) -> bool:
"""Determine if node should be shown based on config."""
@@ -143,9 +136,7 @@ class DiaDefGenerator:
self.linker.visit(node)
self.classdiagram.add_object(self.get_title(node), node)
- def get_ancestors(
- self, node: nodes.ClassDef, level: int
- ) -> Generator[nodes.ClassDef]:
+ def get_ancestors(self, node: nodes.ClassDef, level: int) -> Generator[nodes.ClassDef]:
"""Return ancestor nodes of a class node."""
if level == 0:
return
@@ -154,9 +145,7 @@ class DiaDefGenerator:
continue
yield ancestor
- def get_associated(
- self, klass_node: nodes.ClassDef, level: int
- ) -> Generator[nodes.ClassDef]:
+ def get_associated(self, klass_node: nodes.ClassDef, level: int) -> Generator[nodes.ClassDef]:
"""Return associated nodes of a class node."""
if level == 0:
return
@@ -170,9 +159,7 @@ class DiaDefGenerator:
continue
yield node
- def extract_classes(
- self, klass_node: nodes.ClassDef, anc_level: int, association_level: int
- ) -> None:
+ def extract_classes(self, klass_node: nodes.ClassDef, anc_level: int, association_level: int) -> None:
"""Extract recursively classes related to klass_node."""
if self.classdiagram.has_node(klass_node) or not self.show_node(klass_node):
return
@@ -203,9 +190,7 @@ class DefaultDiadefGenerator(LocalsVisitor, DiaDefGenerator):
"""
mode = self.config.mode
if len(node.modules) > 1:
- self.pkgdiagram: PackageDiagram | None = PackageDiagram(
- f"packages {node.name}", mode
- )
+ self.pkgdiagram: PackageDiagram | None = PackageDiagram(f"packages {node.name}", mode)
else:
self.pkgdiagram = None
self.classdiagram = ClassDiagram(f"classes {node.name}", mode)
@@ -273,99 +258,6 @@ class DiadefsHandler:
self.config = config
self.args = args
- def _get_object_name(self, obj: Any) -> str:
- """Get object name safely handling both title attributes and strings.
-
- :param obj: Object to get name from
- :return: Object name/title
- :rtype: str
- """
- return obj.title if hasattr(obj, "title") else str(obj)
-
- def _process_relationship(
- self, relationship: Relationship, unique_classes: dict[str, Any], obj: Any
- ) -> None:
- """Process a single relationship for deduplication.
-
- :param relationship: Relationship to process
- :param unique_classes: Dict of unique classes
- :param obj: Current object being processed
- """
- if relationship.from_object == obj:
- relationship.from_object = unique_classes[obj.node.qname()]
- if relationship.to_object == obj:
- relationship.to_object = unique_classes[obj.node.qname()]
-
- def _process_class_relationships(
- self, diagram: ClassDiagram, obj: Any, unique_classes: dict[str, Any]
- ) -> None:
- """Merge relationships for a class.
-
- :param diagram: Current diagram
- :param obj: Object whose relationships to process
- :param unique_classes: Dict of unique classes
- """
- for rel_type in ("specialization", "association", "aggregation"):
- for rel in diagram.get_relationships(rel_type):
- self._process_relationship(rel, unique_classes, obj)
-
- def deduplicate_classes(self, diagrams: list[ClassDiagram]) -> list[ClassDiagram]:
- """Remove duplicate classes from diagrams."""
- for diagram in diagrams:
- # Track unique classes by qualified name
- unique_classes: dict[str, Any] = {}
- duplicate_classes: Any = set()
-
- # First pass - identify duplicates
- for obj in diagram.objects:
- qname = obj.node.qname()
- if qname in unique_classes:
- duplicate_classes.add(obj)
- self._process_class_relationships(diagram, obj, unique_classes)
- else:
- unique_classes[qname] = obj
-
- # Second pass - filter out duplicates
- diagram.objects = [
- obj for obj in diagram.objects if obj not in duplicate_classes
- ]
-
- return diagrams
-
- def _process_relationship_type(
- self,
- rel_list: list[Relationship],
- seen: set[tuple[str, str, str, Any | None]],
- unique_rels: dict[str, list[Relationship]],
- rel_name: str,
- ) -> None:
- """Process a list of relationships of a single type.
-
- :param rel_list: List of relationships to process
- :param seen: Set of seen relationships
- :param unique_rels: Dict to store unique relationships
- :param rel_name: Name of relationship type
- """
- for rel in rel_list:
- key = (
- self._get_object_name(rel.from_object),
- self._get_object_name(rel.to_object),
- type(rel).__name__,
- getattr(rel, "name", None),
- )
- if key not in seen:
- seen.add(key)
- unique_rels[rel_name].append(rel)
-
- def deduplicate_relationships(self, diagram: ClassDiagram) -> None:
- """Remove duplicate relationships between objects."""
- seen: set[tuple[str, str, str, Any | None]] = set()
- unique_rels: dict[str, list[Relationship]] = defaultdict(list)
- for rel_name, rel_list in diagram.relationships.items():
- self._process_relationship_type(rel_list, seen, unique_rels, rel_name)
-
- diagram.relationships = dict(unique_rels)
-
def get_diadefs(self, project: Project, linker: Linker) -> list[ClassDiagram]:
"""Get the diagram's configuration data.
@@ -386,5 +278,4 @@ class DiadefsHandler:
diagrams = DefaultDiadefGenerator(linker, self).visit(project)
for diagram in diagrams:
diagram.extract_relationships()
- self.deduplicate_relationships(diagram)
- return self.deduplicate_classes(diagrams)
+ return diagrams
diff --git a/pylint/pyreverse/diagrams.py b/pylint/pyreverse/diagrams.py
index 6db056300..73ec71ba0 100644
--- a/pylint/pyreverse/diagrams.py
+++ b/pylint/pyreverse/diagrams.py
@@ -6,6 +6,7 @@
from __future__ import annotations
+from collections import defaultdict
from collections.abc import Iterable
from typing import Any
@@ -45,9 +46,7 @@ class DiagramEntity(Figure):
default_shape = ""
- def __init__(
- self, title: str = "No name", node: nodes.NodeNG | None = None
- ) -> None:
+ def __init__(self, title: str = "No name", node: nodes.NodeNG | None = None) -> None:
super().__init__()
self.title = title
self.node: nodes.NodeNG = node or nodes.NodeNG(
@@ -109,9 +108,7 @@ class ClassDiagram(Figure, FilterMixIn):
rel = Relationship(from_object, to_object, relation_type, name)
self.relationships.setdefault(relation_type, []).append(rel)
- def get_relationship(
- self, from_object: DiagramEntity, relation_type: str
- ) -> Relationship:
+ def get_relationship(self, from_object: DiagramEntity, relation_type: str) -> Relationship:
"""Return a relationship or None."""
for rel in self.relationships.get(relation_type, ()):
if rel.from_object is from_object:
@@ -126,14 +123,11 @@ class ClassDiagram(Figure, FilterMixIn):
properties = {
local_name: local_node
for local_name, local_node in node.items()
- if isinstance(local_node, nodes.FunctionDef)
- and decorated_with_property(local_node)
+ if isinstance(local_node, nodes.FunctionDef) and decorated_with_property(local_node)
}
# Add instance attributes to properties
- for attr_name, attr_type in list(node.locals_type.items()) + list(
- node.instance_attrs_type.items()
- ):
+ for attr_name, attr_type in list(node.locals_type.items()) + list(node.instance_attrs_type.items()):
if attr_name not in properties:
properties[attr_name] = attr_type
@@ -142,9 +136,7 @@ class ClassDiagram(Figure, FilterMixIn):
continue
# Handle property methods differently to correctly extract return type
- if isinstance(
- associated_nodes, nodes.FunctionDef
- ) and decorated_with_property(associated_nodes):
+ if isinstance(associated_nodes, nodes.FunctionDef) and decorated_with_property(associated_nodes):
if associated_nodes.returns:
type_annotation = get_annotation_label(associated_nodes.returns)
node_name = f"{node_name} : {type_annotation}"
@@ -184,9 +176,7 @@ class ClassDiagram(Figure, FilterMixIn):
if isinstance(node, astroid.Instance):
node = node._proxied
if (
- isinstance(
- node, (nodes.ClassDef, nodes.Name, nodes.Subscript, nodes.BinOp)
- )
+ isinstance(node, (nodes.ClassDef, nodes.Name, nodes.Subscript, nodes.BinOp))
and hasattr(node, "name")
and not self.has_node(node)
):
@@ -194,11 +184,7 @@ class ClassDiagram(Figure, FilterMixIn):
node_name = node.name
names.append(node_name)
# sorted to get predictable (hence testable) results
- return sorted(
- name
- for name in names
- if all(name not in other or name == other for other in names)
- )
+ return sorted(name for name in names if all(name not in other or name == other for other in names))
def has_node(self, node: nodes.NodeNG) -> bool:
"""Return true if the given node is included in the diagram."""
@@ -237,9 +223,7 @@ class ClassDiagram(Figure, FilterMixIn):
# associations & aggregations links
for name, values in list(node.aggregations_type.items()):
for value in values:
- self.assign_association_relationship(
- value, obj, name, "aggregation"
- )
+ self.assign_association_relationship(value, obj, name, "aggregation")
associations = node.associations_type.copy()
@@ -249,9 +233,9 @@ class ClassDiagram(Figure, FilterMixIn):
for name, values in associations.items():
for value in values:
- self.assign_association_relationship(
- value, obj, name, "association"
- )
+ self.assign_association_relationship(value, obj, name, "association")
+ self.deduplicate_relationships()
+ self.deduplicate_classes()
def assign_association_relationship(
self, value: astroid.NodeNG, obj: ClassEntity, name: str, type_relationship: str
@@ -266,6 +250,92 @@ class ClassDiagram(Figure, FilterMixIn):
except KeyError:
return
+ def _get_object_name(self, obj: Any) -> str:
+ """Get object name safely handling both title attributes and strings.
+
+ :param obj: Object to get name from
+ :return: Object name/title
+ :rtype: str
+ """
+ return obj.title if hasattr(obj, "title") else str(obj)
+
+ def _process_relationship(self, relationship: Relationship, unique_classes: dict[str, Any], obj: Any) -> None:
+ """Process a single relationship for deduplication.
+
+ :param relationship: Relationship to process
+ :param unique_classes: Dict of unique classes
+ :param obj: Current object being processed
+ """
+ if relationship.from_object == obj:
+ relationship.from_object = unique_classes[obj.node.qname()]
+ if relationship.to_object == obj:
+ relationship.to_object = unique_classes[obj.node.qname()]
+
+ def _process_class_relationships(
+ self, diagram: ClassDiagram, obj: Any, unique_classes: dict[str, Any]
+ ) -> None:
+ """Merge relationships for a class.
+
+ :param diagram: Current diagram
+ :param obj: Object whose relationships to process
+ :param unique_classes: Dict of unique classes
+ """
+ for rel_type in ("specialization", "association", "aggregation"):
+ for rel in diagram.get_relationships(rel_type):
+ self._process_relationship(rel, unique_classes, obj)
+
+ def deduplicate_classes(self) -> None:
+ """Remove duplicate classes from diagrams."""
+ # Track unique classes by qualified name
+ unique_classes: dict[str, Any] = {}
+ duplicate_classes: Any = set()
+
+ # First pass - identify duplicates
+ for obj in self.objects:
+ qname = obj.node.qname()
+ if qname in unique_classes:
+ duplicate_classes.add(obj)
+ self._process_class_relationships(self, obj, unique_classes)
+ else:
+ unique_classes[qname] = obj
+
+ # Second pass - filter out duplicates
+ self.objects = [obj for obj in self.objects if obj not in duplicate_classes]
+
+ def _process_relationship_type(
+ self,
+ rel_list: list[Relationship],
+ seen: set[tuple[str, str, str, Any | None]],
+ unique_rels: dict[str, list[Relationship]],
+ rel_name: str,
+ ) -> None:
+ """Process a list of relationships of a single type.
+
+ :param rel_list: List of relationships to process
+ :param seen: Set of seen relationships
+ :param unique_rels: Dict to store unique relationships
+ :param rel_name: Name of relationship type
+ """
+ for rel in rel_list:
+ key = (
+ self._get_object_name(rel.from_object),
+ self._get_object_name(rel.to_object),
+ type(rel).__name__,
+ getattr(rel, "name", None),
+ )
+ if key not in seen:
+ seen.add(key)
+ unique_rels[rel_name].append(rel)
+
+ def deduplicate_relationships(self) -> None:
+ """Remove duplicate relationships between objects."""
+ seen: set[tuple[str, str, str, Any | None]] = set()
+ unique_rels: dict[str, list[Relationship]] = defaultdict(list)
+ for rel_name, rel_list in self.relationships.items():
+ self._process_relationship_type(rel_list, seen, unique_rels, rel_name)
+
+ self.relationships = dict(unique_rels)
+
class PackageDiagram(ClassDiagram):
"""Package diagram handling.""" There is also no problem when removing the call to Maybe there was some misunderstanding, so can you please check the proposed diff and comment if your concerns about moving the logic and splitting the MR are still valid? |
Type of Changes
Description
Postprocessing relationships and classes to find replication across relationship types / attributes.
Output for example:
Closes #9267
Closes: #8046
Fixes closed issue: #8189