[pyreverse] Fix duplicate arrows when class attribute is assigned more than once #10333

pranav-data · 2025-04-09T20:23:17Z

Type of Changes

	Type
✓	🐛 Bug fix
	✨ New feature
	🔨 Refactoring
	📜 Docs

Description

Postprocessing relationships and classes to find replication across relationship types / attributes.

Output for example:

classDiagram
  class A {
    var : int
  }
  class B {
    a_obj
    func()
  }
  A --* B : a_obj

Closes #9267
Closes: #8046
Fixes closed issue: #8189

This reverts commit 1796316.

This reverts commit 8690160.

for more information, see https://pre-commit.ci

codecov · 2025-04-09T20:55:23Z

Codecov Report

Attention: Patch coverage is 76.92308% with 9 lines in your changes missing coverage. Please review.

Project coverage is 95.85%. Comparing base (6faf5b4) to head (608bbc0).
Report is 2 commits behind head on main.

Files with missing lines	Patch %	Lines
pylint/pyreverse/diadefslib.py	76.92%	9 Missing ⚠️

❌ Your patch check has failed because the patch coverage (76.92%) is below the target coverage (100.00%). You can increase the patch coverage or adjust the target coverage.

Additional details and impacted files

@@            Coverage Diff             @@
##             main   #10333      +/-   ##
==========================================
- Coverage   95.90%   95.85%   -0.05%     
==========================================
  Files         176      176              
  Lines       19108    19145      +37     
==========================================
+ Hits        18325    18352      +27     
- Misses        783      793      +10

Files with missing lines	Coverage Δ
pylint/pyreverse/diadefslib.py	`94.47% <76.92%> (-4.84%)`	⬇️

... and 1 file with indirect coverage changes

🚀 New features to boost your workflow:

❄️ Test Analytics: Detect flaky tests, report on failures, and find test suite problems.

for more information, see https://pre-commit.ci

DudeNr33 · 2025-04-10T18:38:43Z

Hi @pranav-data, thanks for the PR! 🙂
Can you remove the additions to the .gitignore and the pyenv.cfg file? These are custom to your own environment and should not be added to the repo.

pranav-data · 2025-04-10T22:01:36Z

@DudeNr33 I have removed the files and pushed.

pranav-data · 2025-04-10T22:03:35Z

I have been noticing duplicate class definitions captured, in a project I am working on, but unable to reproduce it in a simpler example for a unit test. The parts where code coverage is lacking is addressing that issue.

DudeNr33

I have a few remarks before we can merge this:

The logic to filter out the duplications is better suited in diagrams.ClassDiagram. This is where the logic to extract the relationships lives. diadefslib is more about handling the configuration options provided by the user when generating the diagrams.
Ideally, the duplicates are not added to the relationships in the first place instead of doing post-processing to get rid of them.
The description links multiple issues as 'closed', but there are no testcases for the examples given in Pyreverse: Duplicated class variables #8046 - could add them to verify if this solution also solves the problems mentioned in the isse? Pay attention to this comment as there are some caveats when differentiating class and instance attributes.
New code without code coverage (removal of duplicated classes) is not ideal. It is better to split this PR up into two parts: first MR that focuses on the duplicated relationship arrows (where we have reproduction examples from the linked issues). The second MR for the duplicated classes, but only once a minimal reproduction example can be provided.

Thank you for taking the time fixing bugs in pyreverse!

github-actions · 2025-04-12T12:45:15Z

🤖 According to the primer, this change has no effect on the checked open source code. 🤖🎉

This comment was generated for commit 608bbc0

pranav-data · 2025-04-14T16:47:04Z

Removing the duplicate from diagrams.py could clean up the classes but messes up the relationships. From my debugging, it appears that the duplicate originates from how astroid parses the repository for dependencies, where one copy of the class is required to establish the relationship.

Breaking it down to 2 separate fixes will fix one while breaking the other.

DudeNr33 · 2025-04-14T18:37:00Z

I do not see yet how moving the logic to diagrams.py would be a problem, as all the post-processing happens on instance attributes of the ClassDiagram objects.

I tried it locally and the test still passes, expand to see the diff of my changes:

diff --git a/pylint/pyreverse/diadefslib.py b/pylint/pyreverse/diadefslib.py
index f91a89eca..d3b045437 100644
--- a/pylint/pyreverse/diadefslib.py
+++ b/pylint/pyreverse/diadefslib.py
@@ -68,10 +68,7 @@ class DiaDefGenerator:
         leaf_nodes = [
             module
             for module in self.args
-            if not any(
-                other != module and other.startswith(module + ".")
-                for other in self.args
-            )
+            if not any(other != module and other.startswith(module + ".") for other in self.args)
         ]
         return leaf_nodes
 
@@ -120,12 +117,8 @@ class DiaDefGenerator:
         absolute_depth = name.count(".")
 
         # Retrieve the base depth to compare against
-        relative_depth = next(
-            (v for k, v in self.args_depths.items() if name.startswith(k)), None
-        )
-        return relative_depth is not None and bool(
-            (absolute_depth - relative_depth) <= self.config.max_depth
-        )
+        relative_depth = next((v for k, v in self.args_depths.items() if name.startswith(k)), None)
+        return relative_depth is not None and bool((absolute_depth - relative_depth) <= self.config.max_depth)
 
     def show_node(self, node: nodes.ClassDef) -> bool:
         """Determine if node should be shown based on config."""
@@ -143,9 +136,7 @@ class DiaDefGenerator:
         self.linker.visit(node)
         self.classdiagram.add_object(self.get_title(node), node)
 
-    def get_ancestors(
-        self, node: nodes.ClassDef, level: int
-    ) -> Generator[nodes.ClassDef]:
+    def get_ancestors(self, node: nodes.ClassDef, level: int) -> Generator[nodes.ClassDef]:
         """Return ancestor nodes of a class node."""
         if level == 0:
             return
@@ -154,9 +145,7 @@ class DiaDefGenerator:
                 continue
             yield ancestor
 
-    def get_associated(
-        self, klass_node: nodes.ClassDef, level: int
-    ) -> Generator[nodes.ClassDef]:
+    def get_associated(self, klass_node: nodes.ClassDef, level: int) -> Generator[nodes.ClassDef]:
         """Return associated nodes of a class node."""
         if level == 0:
             return
@@ -170,9 +159,7 @@ class DiaDefGenerator:
                     continue
                 yield node
 
-    def extract_classes(
-        self, klass_node: nodes.ClassDef, anc_level: int, association_level: int
-    ) -> None:
+    def extract_classes(self, klass_node: nodes.ClassDef, anc_level: int, association_level: int) -> None:
         """Extract recursively classes related to klass_node."""
         if self.classdiagram.has_node(klass_node) or not self.show_node(klass_node):
             return
@@ -203,9 +190,7 @@ class DefaultDiadefGenerator(LocalsVisitor, DiaDefGenerator):
         """
         mode = self.config.mode
         if len(node.modules) > 1:
-            self.pkgdiagram: PackageDiagram | None = PackageDiagram(
-                f"packages {node.name}", mode
-            )
+            self.pkgdiagram: PackageDiagram | None = PackageDiagram(f"packages {node.name}", mode)
         else:
             self.pkgdiagram = None
         self.classdiagram = ClassDiagram(f"classes {node.name}", mode)
@@ -273,99 +258,6 @@ class DiadefsHandler:
         self.config = config
         self.args = args
 
-    def _get_object_name(self, obj: Any) -> str:
-        """Get object name safely handling both title attributes and strings.
-
-        :param obj: Object to get name from
-        :return: Object name/title
-        :rtype: str
-        """
-        return obj.title if hasattr(obj, "title") else str(obj)
-
-    def _process_relationship(
-        self, relationship: Relationship, unique_classes: dict[str, Any], obj: Any
-    ) -> None:
-        """Process a single relationship for deduplication.
-
-        :param relationship: Relationship to process
-        :param unique_classes: Dict of unique classes
-        :param obj: Current object being processed
-        """
-        if relationship.from_object == obj:
-            relationship.from_object = unique_classes[obj.node.qname()]
-        if relationship.to_object == obj:
-            relationship.to_object = unique_classes[obj.node.qname()]
-
-    def _process_class_relationships(
-        self, diagram: ClassDiagram, obj: Any, unique_classes: dict[str, Any]
-    ) -> None:
-        """Merge relationships for a class.
-
-        :param diagram: Current diagram
-        :param obj: Object whose relationships to process
-        :param unique_classes: Dict of unique classes
-        """
-        for rel_type in ("specialization", "association", "aggregation"):
-            for rel in diagram.get_relationships(rel_type):
-                self._process_relationship(rel, unique_classes, obj)
-
-    def deduplicate_classes(self, diagrams: list[ClassDiagram]) -> list[ClassDiagram]:
-        """Remove duplicate classes from diagrams."""
-        for diagram in diagrams:
-            # Track unique classes by qualified name
-            unique_classes: dict[str, Any] = {}
-            duplicate_classes: Any = set()
-
-            # First pass - identify duplicates
-            for obj in diagram.objects:
-                qname = obj.node.qname()
-                if qname in unique_classes:
-                    duplicate_classes.add(obj)
-                    self._process_class_relationships(diagram, obj, unique_classes)
-                else:
-                    unique_classes[qname] = obj
-
-            # Second pass - filter out duplicates
-            diagram.objects = [
-                obj for obj in diagram.objects if obj not in duplicate_classes
-            ]
-
-        return diagrams
-
-    def _process_relationship_type(
-        self,
-        rel_list: list[Relationship],
-        seen: set[tuple[str, str, str, Any | None]],
-        unique_rels: dict[str, list[Relationship]],
-        rel_name: str,
-    ) -> None:
-        """Process a list of relationships of a single type.
-
-        :param rel_list: List of relationships to process
-        :param seen: Set of seen relationships
-        :param unique_rels: Dict to store unique relationships
-        :param rel_name: Name of relationship type
-        """
-        for rel in rel_list:
-            key = (
-                self._get_object_name(rel.from_object),
-                self._get_object_name(rel.to_object),
-                type(rel).__name__,
-                getattr(rel, "name", None),
-            )
-            if key not in seen:
-                seen.add(key)
-                unique_rels[rel_name].append(rel)
-
-    def deduplicate_relationships(self, diagram: ClassDiagram) -> None:
-        """Remove duplicate relationships between objects."""
-        seen: set[tuple[str, str, str, Any | None]] = set()
-        unique_rels: dict[str, list[Relationship]] = defaultdict(list)
-        for rel_name, rel_list in diagram.relationships.items():
-            self._process_relationship_type(rel_list, seen, unique_rels, rel_name)
-
-        diagram.relationships = dict(unique_rels)
-
     def get_diadefs(self, project: Project, linker: Linker) -> list[ClassDiagram]:
         """Get the diagram's configuration data.
 
@@ -386,5 +278,4 @@ class DiadefsHandler:
             diagrams = DefaultDiadefGenerator(linker, self).visit(project)
         for diagram in diagrams:
             diagram.extract_relationships()
-            self.deduplicate_relationships(diagram)
-        return self.deduplicate_classes(diagrams)
+        return diagrams
diff --git a/pylint/pyreverse/diagrams.py b/pylint/pyreverse/diagrams.py
index 6db056300..73ec71ba0 100644
--- a/pylint/pyreverse/diagrams.py
+++ b/pylint/pyreverse/diagrams.py
@@ -6,6 +6,7 @@
 
 from __future__ import annotations
 
+from collections import defaultdict
 from collections.abc import Iterable
 from typing import Any
 
@@ -45,9 +46,7 @@ class DiagramEntity(Figure):
 
     default_shape = ""
 
-    def __init__(
-        self, title: str = "No name", node: nodes.NodeNG | None = None
-    ) -> None:
+    def __init__(self, title: str = "No name", node: nodes.NodeNG | None = None) -> None:
         super().__init__()
         self.title = title
         self.node: nodes.NodeNG = node or nodes.NodeNG(
@@ -109,9 +108,7 @@ class ClassDiagram(Figure, FilterMixIn):
         rel = Relationship(from_object, to_object, relation_type, name)
         self.relationships.setdefault(relation_type, []).append(rel)
 
-    def get_relationship(
-        self, from_object: DiagramEntity, relation_type: str
-    ) -> Relationship:
+    def get_relationship(self, from_object: DiagramEntity, relation_type: str) -> Relationship:
         """Return a relationship or None."""
         for rel in self.relationships.get(relation_type, ()):
             if rel.from_object is from_object:
@@ -126,14 +123,11 @@ class ClassDiagram(Figure, FilterMixIn):
         properties = {
             local_name: local_node
             for local_name, local_node in node.items()
-            if isinstance(local_node, nodes.FunctionDef)
-            and decorated_with_property(local_node)
+            if isinstance(local_node, nodes.FunctionDef) and decorated_with_property(local_node)
         }
 
         # Add instance attributes to properties
-        for attr_name, attr_type in list(node.locals_type.items()) + list(
-            node.instance_attrs_type.items()
-        ):
+        for attr_name, attr_type in list(node.locals_type.items()) + list(node.instance_attrs_type.items()):
             if attr_name not in properties:
                 properties[attr_name] = attr_type
 
@@ -142,9 +136,7 @@ class ClassDiagram(Figure, FilterMixIn):
                 continue
 
             # Handle property methods differently to correctly extract return type
-            if isinstance(
-                associated_nodes, nodes.FunctionDef
-            ) and decorated_with_property(associated_nodes):
+            if isinstance(associated_nodes, nodes.FunctionDef) and decorated_with_property(associated_nodes):
                 if associated_nodes.returns:
                     type_annotation = get_annotation_label(associated_nodes.returns)
                     node_name = f"{node_name} : {type_annotation}"
@@ -184,9 +176,7 @@ class ClassDiagram(Figure, FilterMixIn):
             if isinstance(node, astroid.Instance):
                 node = node._proxied
             if (
-                isinstance(
-                    node, (nodes.ClassDef, nodes.Name, nodes.Subscript, nodes.BinOp)
-                )
+                isinstance(node, (nodes.ClassDef, nodes.Name, nodes.Subscript, nodes.BinOp))
                 and hasattr(node, "name")
                 and not self.has_node(node)
             ):
@@ -194,11 +184,7 @@ class ClassDiagram(Figure, FilterMixIn):
                     node_name = node.name
                     names.append(node_name)
         # sorted to get predictable (hence testable) results
-        return sorted(
-            name
-            for name in names
-            if all(name not in other or name == other for other in names)
-        )
+        return sorted(name for name in names if all(name not in other or name == other for other in names))
 
     def has_node(self, node: nodes.NodeNG) -> bool:
         """Return true if the given node is included in the diagram."""
@@ -237,9 +223,7 @@ class ClassDiagram(Figure, FilterMixIn):
             # associations & aggregations links
             for name, values in list(node.aggregations_type.items()):
                 for value in values:
-                    self.assign_association_relationship(
-                        value, obj, name, "aggregation"
-                    )
+                    self.assign_association_relationship(value, obj, name, "aggregation")
 
             associations = node.associations_type.copy()
 
@@ -249,9 +233,9 @@ class ClassDiagram(Figure, FilterMixIn):
 
             for name, values in associations.items():
                 for value in values:
-                    self.assign_association_relationship(
-                        value, obj, name, "association"
-                    )
+                    self.assign_association_relationship(value, obj, name, "association")
+        self.deduplicate_relationships()
+        self.deduplicate_classes()
 
     def assign_association_relationship(
         self, value: astroid.NodeNG, obj: ClassEntity, name: str, type_relationship: str
@@ -266,6 +250,92 @@ class ClassDiagram(Figure, FilterMixIn):
         except KeyError:
             return
 
+    def _get_object_name(self, obj: Any) -> str:
+        """Get object name safely handling both title attributes and strings.
+
+        :param obj: Object to get name from
+        :return: Object name/title
+        :rtype: str
+        """
+        return obj.title if hasattr(obj, "title") else str(obj)
+
+    def _process_relationship(self, relationship: Relationship, unique_classes: dict[str, Any], obj: Any) -> None:
+        """Process a single relationship for deduplication.
+
+        :param relationship: Relationship to process
+        :param unique_classes: Dict of unique classes
+        :param obj: Current object being processed
+        """
+        if relationship.from_object == obj:
+            relationship.from_object = unique_classes[obj.node.qname()]
+        if relationship.to_object == obj:
+            relationship.to_object = unique_classes[obj.node.qname()]
+
+    def _process_class_relationships(
+        self, diagram: ClassDiagram, obj: Any, unique_classes: dict[str, Any]
+    ) -> None:
+        """Merge relationships for a class.
+
+        :param diagram: Current diagram
+        :param obj: Object whose relationships to process
+        :param unique_classes: Dict of unique classes
+        """
+        for rel_type in ("specialization", "association", "aggregation"):
+            for rel in diagram.get_relationships(rel_type):
+                self._process_relationship(rel, unique_classes, obj)
+
+    def deduplicate_classes(self) -> None:
+        """Remove duplicate classes from diagrams."""
+        # Track unique classes by qualified name
+        unique_classes: dict[str, Any] = {}
+        duplicate_classes: Any = set()
+
+        # First pass - identify duplicates
+        for obj in self.objects:
+            qname = obj.node.qname()
+            if qname in unique_classes:
+                duplicate_classes.add(obj)
+                self._process_class_relationships(self, obj, unique_classes)
+            else:
+                unique_classes[qname] = obj
+
+        # Second pass - filter out duplicates
+        self.objects = [obj for obj in self.objects if obj not in duplicate_classes]
+
+    def _process_relationship_type(
+        self,
+        rel_list: list[Relationship],
+        seen: set[tuple[str, str, str, Any | None]],
+        unique_rels: dict[str, list[Relationship]],
+        rel_name: str,
+    ) -> None:
+        """Process a list of relationships of a single type.
+
+        :param rel_list: List of relationships to process
+        :param seen: Set of seen relationships
+        :param unique_rels: Dict to store unique relationships
+        :param rel_name: Name of relationship type
+        """
+        for rel in rel_list:
+            key = (
+                self._get_object_name(rel.from_object),
+                self._get_object_name(rel.to_object),
+                type(rel).__name__,
+                getattr(rel, "name", None),
+            )
+            if key not in seen:
+                seen.add(key)
+                unique_rels[rel_name].append(rel)
+
+    def deduplicate_relationships(self) -> None:
+        """Remove duplicate relationships between objects."""
+        seen: set[tuple[str, str, str, Any | None]] = set()
+        unique_rels: dict[str, list[Relationship]] = defaultdict(list)
+        for rel_name, rel_list in self.relationships.items():
+            self._process_relationship_type(rel_list, seen, unique_rels, rel_name)
+
+        self.relationships = dict(unique_rels)
+
 
 class PackageDiagram(ClassDiagram):
     """Package diagram handling."""

There is also no problem when removing the call to deduplicate_classes.

Maybe there was some misunderstanding, so can you please check the proposed diff and comment if your concerns about moving the logic and splitting the MR are still valid?

Pranav Kumar Seerala added 12 commits April 9, 2025 11:33

remove duplicate classes

1796316

write_classes wdit

8690160

deduplicate classes

8338f9c

deduplicate relationships as well

9c86b3b

attempt1 to fix

364da44

use DiagramEntity title

997afb6

list iteration in relationships

5212bcd

bugfix relationships

27cb36e

error fix

1e6d020

default dict

9fd85dc

Revert "remove duplicate classes"

e373aa6

This reverts commit 1796316.

Revert "write_classes wdit"

c7bb438

This reverts commit 8690160.

pranav-data marked this pull request as ready for review April 9, 2025 20:24

pranav-data requested a review from DudeNr33 as a code owner April 9, 2025 20:24

[pre-commit.ci] auto fixes from pre-commit.com hooks

cdcdbbf

for more information, see https://pre-commit.ci

Pierre-Sassoulas added Bug 🪲 pyreverse Related to pyreverse component labels Apr 9, 2025

Pierre-Sassoulas added this to the 3.3.7 milestone Apr 9, 2025

Pierre-Sassoulas added the backport maintenance/3.3.x label Apr 9, 2025

Pierre-Sassoulas changed the title ~~Closing #9267~~ [pyreverse] Fix duplicate arrows when class attribute is assigned more than once Apr 9, 2025

This comment has been minimized.

Sign in to view

Pranav Kumar Seerala and others added 8 commits April 10, 2025 10:40

Made mypy suggestions and added test

0aedc80

Merge branch 'main' of https://github.com/pranav-data/pylint

b64b36c

[pre-commit.ci] auto fixes from pre-commit.com hooks

4d83b3a

for more information, see https://pre-commit.ci

removed deprecated Tuple

eeccaf1

Merge branch 'main' of https://github.com/pranav-data/pylint

3748219

[pre-commit.ci] auto fixes from pre-commit.com hooks

442d2f5

for more information, see https://pre-commit.ci

tuple from builtins

c0f0cc0

Merge branch 'main' of https://github.com/pranav-data/pylint

54a6e25

pre-commit-ci bot and others added 4 commits April 10, 2025 15:02

[pre-commit.ci] auto fixes from pre-commit.com hooks

b85d720

for more information, see https://pre-commit.ci

Fixed formatting of .mmd for test

0c4f401

Merge branch 'main' of https://github.com/pranav-data/pylint

be38db4

Passed all linter and mypy checks

0d89cc9

This comment has been minimized.

Sign in to view

Pranav Kumar Seerala added 2 commits April 10, 2025 15:49

Removed unnecessary block

5e8223c

Remove pyvenv.cfg from version control

608bbc0

DudeNr33 requested changes Apr 12, 2025

View reviewed changes

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

[pyreverse] Fix duplicate arrows when class attribute is assigned more than once #10333

[pyreverse] Fix duplicate arrows when class attribute is assigned more than once #10333

pranav-data commented Apr 9, 2025 •

edited

Loading

codecov bot commented Apr 9, 2025 •

edited

Loading

This comment has been minimized.

DudeNr33 commented Apr 10, 2025

This comment has been minimized.

pranav-data commented Apr 10, 2025

pranav-data commented Apr 10, 2025

DudeNr33 left a comment

github-actions bot commented Apr 12, 2025

pranav-data commented Apr 14, 2025

DudeNr33 commented Apr 14, 2025

[pyreverse] Fix duplicate arrows when class attribute is assigned more than once #10333

Are you sure you want to change the base?

[pyreverse] Fix duplicate arrows when class attribute is assigned more than once #10333

Conversation

pranav-data commented Apr 9, 2025 • edited Loading

Type of Changes

Description

codecov bot commented Apr 9, 2025 • edited Loading

Codecov Report

This comment has been minimized.

DudeNr33 commented Apr 10, 2025

This comment has been minimized.

pranav-data commented Apr 10, 2025

pranav-data commented Apr 10, 2025

DudeNr33 left a comment

Choose a reason for hiding this comment

github-actions bot commented Apr 12, 2025

pranav-data commented Apr 14, 2025

DudeNr33 commented Apr 14, 2025

pranav-data commented Apr 9, 2025 •

edited

Loading

codecov bot commented Apr 9, 2025 •

edited

Loading