Remove rarely used shape utilities #1016

Merged
merged 2 commits into pymc-devs:main on Oct 7, 2024

Conversation

ricardoV94
Member

This PR cleans up the use of obscure shape utilities, removing them from Unique.infer_shape and other places where they are not needed.

@ricardoV94 ricardoV94 removed the request for review from Armavica October 6, 2024 08:49

codecov bot commented Oct 6, 2024

Codecov Report

Attention: Patch coverage is 90.00000% with 3 lines in your changes missing coverage. Please review.

Project coverage is 81.73%. Comparing base (fa0ab9d) to head (59ff115).
Report is 108 commits behind head on main.

Files with missing lines Patch % Lines
pytensor/tensor/extra_ops.py 88.46% 1 Missing and 2 partials ⚠️

@@            Coverage Diff             @@
##             main    #1016      +/-   ##
==========================================
- Coverage   81.75%   81.73%   -0.03%     
==========================================
  Files         183      183              
  Lines       47756    47734      -22     
  Branches    11620    11611       -9     
==========================================
- Hits        39044    39016      -28     
- Misses       6519     6525       +6     
  Partials     2193     2193              
Files with missing lines Coverage Δ
pytensor/tensor/rewriting/shape.py 82.74% <100.00%> (ø)
pytensor/tensor/shape.py 90.75% <ø> (-1.94%) ⬇️
pytensor/tensor/subtensor.py 89.32% <100.00%> (+<0.01%) ⬆️
pytensor/tensor/extra_ops.py 87.77% <88.46%> (+0.13%) ⬆️

@ricardoV94 ricardoV94 marked this pull request as ready for review October 6, 2024 09:45
@ricardoV94 ricardoV94 requested review from twiecki and Armavica October 6, 2024 09:45
Comment on lines +1197 to +1198
if axis is not None and axis < 0:
raise ValueError("Axis cannot be negative.")
Member

Why remove this possibility?

Member Author

Because it simplifies the logic in the Op. The user-facing helper pt.unique handles the negative axis and passes a non-negative one to the Op; users don't typically create Ops themselves.
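For illustration, here is a minimal sketch of the normalization the helper performs before handing the axis to the Op (assumed behavior, mirroring NumPy's normalize_axis_index; the function name and error message are hypothetical):

```python
def normalize_axis(axis: int, ndim: int) -> int:
    """Map a possibly-negative axis to its non-negative equivalent,
    so the Op itself only ever sees axis >= 0."""
    if not -ndim <= axis < ndim:
        raise ValueError(f"axis {axis} is out of bounds for an array of dimension {ndim}")
    # For a valid negative axis, axis % ndim yields the positive index,
    # e.g. -1 -> ndim - 1.
    return axis % ndim

print(normalize_axis(-1, 3))  # → 2
```

With this split, the Op can simply reject negative axes, as in the snippet above.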

ret[0] = tuple(
fgraph.shape_feature.shape_ir(i, node.outputs[0]) for i in range(ndim)
)
[x_shape] = i0_shapes
Member

I am not sure I understand what is happening in this function, but just to check, shouldn't there be a case for return_index and return_counts as well?

Member Author

There is. i0_shapes are the input dimensions, so those don't change with the number of outputs. return_index/counts are outputs, and they are always vectors.

We set out_shapes = [out.shape[0] for out in node.outputs] by default, which will always work for return_index and return_counts. Then we have special logic for the main output when axis is not None, and for return_inverse, which is not just out.shape[0].
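The claim that return_index and return_counts are always vectors can be checked with numpy.unique, whose outputs mirror the Op's (a sketch for intuition, not the PyTensor code itself):

```python
import numpy as np

x = np.array([[1, 0], [1, 0], [2, 3]])
values, index, counts = np.unique(x, axis=0, return_index=True, return_counts=True)

# index and counts are always 1-D, one entry per unique row, so a
# shape graph of out.shape[0] is always valid for them.
assert index.ndim == 1 and counts.ndim == 1

# The main output is the exception: it keeps the non-axis dimensions
# of the input, (n_unique, 2) here.
assert values.shape == (2, 2)
```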

Member Author

@ricardoV94 ricardoV94 Oct 7, 2024

The big picture is that the function tries to return a graph for the shape of the outputs given the input shapes (and possibly values, which can be retrieved from node.inputs). The default shape graph is just output.shape, which we try to avoid when possible, since we would rather not compute the Op just to find out its shape.

For unique we can do that for some of the output dimensions, but not all (we only know how many repeated values there are if we evaluate Unique).

This method combines the dims we can infer from the input shapes with those we can only get after computing the outputs, via out.shape[0] or out.shape[x].

@@ -1293,6 +1269,9 @@ def unique(
* the number of times each unique value comes up in the input array

"""
ar = as_tensor_variable(ar)
if axis is not None:
axis = normalize_axis_index(axis, ar.ndim)
Member Author

@Armavica here is where we allow negative axis for the user

@ricardoV94 ricardoV94 merged commit 1c2bc8f into pymc-devs:main Oct 7, 2024
60 of 61 checks passed