
Fix: Prevent name collision between coords and data variables #7794


Open
wants to merge 1 commit into
base: main
from

Member

@twiecki commented on May 21, 2025

Fix: Prevent name collision between coords and data variables

Adds checks to prevent you from defining a coordinate with the same name as a data variable, or vice-versa.

This addresses issue #7788, where such name collisions could lead to downstream errors, particularly with sample_posterior_predictive returning prior predictive samples instead of posterior predictive samples.

The following changes were made:

  • Modified Model.add_named_variable to check if the proposed variable name already exists as a coordinate name.
  • Modified Model.add_coord to check if the proposed coordinate name already exists as a variable name.
  • Added a test case to tests/model/test_core.py to verify these checks.
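The two checks listed above can be sketched with a small stand-in class. `ToyModel` and `collides` are hypothetical names used only for illustration; the class borrows the attribute names seen in the diff (`named_vars`, `coords`) but is not PyMC's `Model`, and the error messages are paraphrased.

```python
class ToyModel:
    """Hypothetical stand-in for a model; not PyMC's Model class."""

    def __init__(self):
        self.coords = {}      # dimension name -> coordinate values
        self.named_vars = {}  # variable name -> variable object

    def add_coord(self, name, values=None):
        # Reject a dimension name that is already used by a variable.
        if name in self.named_vars:
            raise ValueError(f"Name '{name}' already exists as a variable name.")
        self.coords[name] = values

    def add_named_variable(self, name, var):
        # Reject a variable name that is already used by a dimension.
        if name in self.coords:
            raise ValueError(f"Name '{name}' already exists as a coordinate name.")
        self.named_vars[name] = var


def collides(action):
    """Return True if calling `action` raises ValueError."""
    try:
        action()
    except ValueError:
        return True
    return False


# Direction 1: the coordinate exists first, then a variable reuses the name.
m1 = ToyModel()
m1.add_coord("year", [2020, 2021])
coord_then_var = collides(lambda: m1.add_named_variable("year", object()))

# Direction 2: the variable exists first, then a coordinate reuses the name.
m2 = ToyModel()
m2.add_named_variable("year", object())
var_then_coord = collides(lambda: m2.add_coord("year", [2020, 2021]))
```

Checking in both registration methods means the collision is caught regardless of which name is registered first.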

📚 Documentation preview 📚: https://pymc--7794.org.readthedocs.build/en/7794/

Removes two unintentionally duplicated instances of the `test_name_conflict_variable_and_coord` test method from the `TestNested` class in `tests/model/test_core.py`.

This ensures the test suite is clean and avoids redundant test executions.

@twiecki requested a review from Copilot on May 21, 2025, 02:07

@Copilot (Copilot AI) left a comment


Pull Request Overview

This PR adds safeguards against naming conflicts between model coordinates and variables, and includes tests to verify these checks.

  • Introduces ValueError in Model.add_coord if the coordinate name exists among variables.
  • Introduces ValueError in Model.add_named_variable if the variable name exists among coordinates.
  • Adds a test covering both conflict scenarios between data variables and coordinates.

Reviewed Changes

Copilot reviewed 2 out of 2 changed files in this pull request and generated 1 comment.

File                      Description
tests/model/test_core.py  Added test_name_conflict_variable_and_coord to cover both conflict directions.
pymc/model/core.py        Implemented collision checks in add_coord and add_named_variable.

Comments suppressed due to low confidence (2)

pymc/model/core.py:948

  • Update the add_coord docstring to mention that coordinate names cannot collide with existing variable names and that a ValueError is raised when such a conflict occurs.
def add_coord(

tests/model/test_core.py:102

  • [nitpick] Consider adding a similar test for collisions involving random variable names (e.g., using pm.Normal) against existing coordinates to ensure full coverage across all variable types.
def test_name_conflict_variable_and_coord(self):

@@ -948,6 +948,11 @@ def add_coord(
                FutureWarning,
            )

        if name in self.named_vars:

Copilot AI May 21, 2025


[nitpick] The collision check logic in add_coord and add_named_variable is duplicated. Consider extracting a shared helper method to centralize name-conflict validation and maintain consistent error messages.
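One way to read this suggestion is sketched below. `check_name_available` and `ToyModel` are hypothetical names, not part of the PR or of PyMC; the message template follows the wording proposed elsewhere in this review ("dimension" rather than "coordinate"), which is an assumption about the final text.

```python
def check_name_available(name, taken, taken_kind, new_kind):
    """Raise ValueError if `name` is already present in `taken`.

    Hypothetical helper: one message template serves both directions.
    """
    if name in taken:
        raise ValueError(
            f"Name '{name}' already exists as a {taken_kind} name in the model. "
            f"Please choose a different name for the {new_kind}."
        )


class ToyModel:
    """Hypothetical stand-in; not PyMC's Model class."""

    def __init__(self):
        self.coords = {}
        self.named_vars = {}

    def add_coord(self, name, values=None):
        # A new dimension must not reuse a variable name.
        check_name_available(name, self.named_vars, "variable", "dimension")
        self.coords[name] = values

    def add_named_variable(self, name, var):
        # A new variable must not reuse a dimension name.
        check_name_available(name, self.coords, "dimension", "variable")
        self.named_vars[name] = var
```

With a single template, the two messages cannot drift apart when one of them is reworded later.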


Member

@ricardoV94 left a comment


Looks good. Just a small change to the error message, and I agree with Copilot that this should be tested with a distribution as well as with data.
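A test covering both variable kinds could look roughly like the sketch below. It runs against a hypothetical `ToyModel` rather than PyMC, so `add_data` and `add_rv` are illustrative stand-ins for registering a data variable and a distribution; the assumption that both route through one registration method holds for this toy only.

```python
import unittest


class ToyModel:
    """Hypothetical stand-in; not PyMC's Model class."""

    def __init__(self, coords=None):
        self.coords = dict(coords or {})
        self.named_vars = {}

    def add_coord(self, name, values=None):
        if name in self.named_vars:
            raise ValueError(f"Name '{name}' already exists as a variable name.")
        self.coords[name] = values

    def add_named_variable(self, name, var):
        if name in self.coords:
            raise ValueError(f"Name '{name}' already exists as a dimension name.")
        self.named_vars[name] = var

    # In this toy, both kinds of variable register through the same method.
    def add_data(self, name, value=None):
        self.add_named_variable(name, ("data", value))

    def add_rv(self, name):
        self.add_named_variable(name, ("rv", None))


class TestNameConflict(unittest.TestCase):
    KINDS = ("add_data", "add_rv")

    def test_variable_after_coord(self):
        for kind in self.KINDS:
            with self.subTest(kind=kind):
                model = ToyModel(coords={"a": [1, 2, 3]})
                with self.assertRaisesRegex(ValueError, "dimension name"):
                    getattr(model, kind)("a")

    def test_coord_after_variable(self):
        for kind in self.KINDS:
            with self.subTest(kind=kind):
                model = ToyModel()
                getattr(model, kind)("a")
                with self.assertRaisesRegex(ValueError, "variable name"):
                    model.add_coord("a", [1, 2, 3])
```

Looping over the variable kinds with `subTest` keeps the two conflict directions as separate tests while still reporting which kind failed.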

@@ -948,6 +948,11 @@ def add_coord(
                FutureWarning,
            )

        if name in self.named_vars:
            raise ValueError(
                f"Name '{name}' already exists as a variable name in the model. Please choose a different name for the coordinate."

Suggested change
- f"Name '{name}' already exists as a variable name in the model. Please choose a different name for the coordinate."
+ f"Name '{name}' already exists as a variable name in the model. Please choose a different name for the dimension."

@@ -1463,6 +1468,10 @@ def add_named_variable(self, var, dims: tuple[str | None, ...] | None = None):
        """
        if var.name is None:
            raise ValueError("Variable is unnamed.")
        if var.name in self.coords:
            raise ValueError(
                f"Name '{var.name}' already exists as a coordinate name in the model. Please choose a different name for the variable."

Suggested change
- f"Name '{var.name}' already exists as a coordinate name in the model. Please choose a different name for the variable."
+ f"Name '{var.name}' already exists as a dimension name in the model. Please choose a different name for the variable."

@@ -99,6 +99,16 @@ def test_setattr_properly_works(self):
assert len(submodel.value_vars) == 2
assert len(model.value_vars) == 3

def test_name_conflict_variable_and_coord(self):

This is the wrong test class for this test; this class is for nested-model tests.

@ricardoV94
Member

A pre-existing test is now failing because of the new error message. Importantly, this suggests there was already some code handling this case. We should check why it wasn't sufficient and remove it, since it is now probably duplicated.


2 participants