Implement several subtensor lift rewrites #1158

Merged
ricardoV94 merged 9 commits into pymc-devs:main from subtensor_lift on May 9, 2025

Conversation

Member

@ricardoV94 ricardoV94 commented Jan 20, 2025

This allows reducing computations on batch dimensions by lifting simple indexing operations closer to the inputs.

An obvious example is:

import numpy as np
import pytensor
import pytensor.tensor as pt
from pytensor.compile.mode import get_default_mode

mode = get_default_mode()

x = pt.matrix("x", shape=(512, 512))
x_test = np.random.normal(size=x.type.shape)

x_sum = x.sum(axis=1)
out = x_sum[0]

fn_before = pytensor.function([x], out, mode=mode.excluding("local_subtensor_of_reduce"))
fn_before.dprint(print_type=True)
%timeit fn_before(x_test)
# Subtensor{i} [id A] <Scalar(float64, shape=())> 1
#  ├─ Sum{axis=1} [id B] <Vector(float64, shape=(512,))> 0
#  │  └─ x [id C] <Matrix(float64, shape=(512, 512))>
#  └─ 0 [id D] <uint8>
# 762 μs ± 7.55 μs per loop (mean ± std. dev. of 7 runs, 1,000 loops each)

print()
fn_after = pytensor.function([x], out, mode=mode)
fn_after.dprint(print_type=True)
%timeit fn_after(x_test)
# Sum{axes=None} [id A] <Scalar(float64, shape=())> 1
#  └─ Subtensor{i} [id B] <Vector(float64, shape=(512,))> 0
#     ├─ x [id C] <Matrix(float64, shape=(512, 512))>
#     └─ 0 [id D] <uint8>
# 5.26 μs ± 86 ns per loop (mean ± std. dev. of 7 runs, 100,000 loops each)

📚 Documentation preview 📚: https://pytensor--1158.org.readthedocs.build/en/1158/

@ricardoV94 ricardoV94 force-pushed the subtensor_lift branch 3 times, most recently from cbe0c96 to 9b47cee Compare January 20, 2025 17:56
@ricardoV94 ricardoV94 changed the title from "Implement sereval subtensor lift rewrites" to "Implement several subtensor lift rewrites" Jan 21, 2025
@ricardoV94 ricardoV94 force-pushed the subtensor_lift branch 2 times, most recently from d72b5c2 to 23870fa Compare January 30, 2025 14:42
@ricardoV94 ricardoV94 force-pushed the subtensor_lift branch 4 times, most recently from a022812 to 580149b Compare March 11, 2025 13:29

codecov bot commented Mar 11, 2025

Codecov Report

Attention: Patch coverage is 90.97561% with 37 lines in your changes missing coverage. Please review.

Project coverage is 82.07%. Comparing base (5335a68) to head (4b545cf).
Report is 12 commits behind head on main.

Files with missing lines                      Patch %   Lines
pytensor/tensor/rewriting/subtensor_lift.py   90.77%    17 Missing and 20 partials ⚠️

❌ Your patch status has failed because the patch coverage (90.97%) is below the target coverage (100.00%). You can increase the patch coverage or adjust the target coverage.

Additional details and impacted files

Impacted file tree graph

@@            Coverage Diff             @@
##             main    #1158      +/-   ##
==========================================
+ Coverage   82.02%   82.07%   +0.05%     
==========================================
  Files         207      208       +1     
  Lines       49301    49517     +216     
  Branches     8747     8785      +38     
==========================================
+ Hits        40440    40642     +202     
- Misses       6695     6702       +7     
- Partials     2166     2173       +7     
Files with missing lines                      Coverage Δ
pytensor/tensor/rewriting/subtensor.py        90.08% <100.00%> (+0.49%) ⬆️
pytensor/tensor/rewriting/subtensor_lift.py   90.77% <90.77%> (ø)
@ricardoV94 ricardoV94 requested a review from lucianopaz March 11, 2025 16:20
Member

@lucianopaz lucianopaz left a comment

I finally got around to reviewing this. Sorry @ricardoV94 for the huge delay! I left a few questions but it looks good.

Comment on lines +113 to +170
@register_canonicalize("shape_unsafe")
@register_specialize("shape_unsafe")
Member

I don't know the difference between canonicalize and specialize.

Member Author

Canonicalize happens before stabilize and before specialize. Canonicalize should simplify graphs and convert different forms into a canonical one. Specialize is allowed to go bananas and produce very specialized code. Many rewrites we apply in multiple phases so they can interact with others that are specific to that phase.
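For reference, a minimal sketch of what registering one node rewriter in both phases looks like, mirroring the decorators in the excerpt above (placeholder body; the import paths are my assumption about pytensor's rewriting modules):

from pytensor.graph.rewriting.basic import node_rewriter
from pytensor.tensor.rewriting.basic import register_canonicalize, register_specialize
from pytensor.tensor.subtensor import Subtensor

@register_canonicalize("shape_unsafe")  # applied during the canonicalize phase
@register_specialize("shape_unsafe")    # and again during the specialize phase
@node_rewriter([Subtensor])
def local_placeholder_rewrite(fgraph, node):
    # A real rewrite returns a list of replacement outputs; None means "no change"
    return None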

copy_stack_trace(node.outputs[0], subt_x)
elem_inputs = elem.owner.inputs
elem_bcast = elem.type.broadcastable
if all(inp.type.broadcastable == elem_bcast for inp in elem_inputs):
Member

Will the elementwise inputs already have the same number of dimensions as the output at this point?

Member Author

Yes, Elemwise.make_node adds any expand_dims needed so that all inputs/outputs have the same ndim.
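A quick way to see this (an illustrative snippet, not code from the PR):

import pytensor.tensor as pt

x = pt.matrix("x")
out = x + pt.scalar("y")  # Elemwise add of a 2d and a 0d variable
# Elemwise.make_node wraps the lower-ndim input in a DimShuffle (expand_dims),
# so both inputs of the Elemwise node in the graph end up with ndim == 2
print([inp.type.ndim for inp in out.owner.inputs])  # expected: [2, 2]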

[old_out] = node.outputs

# Copy stack trace to new inputs
[copy_stack_trace(old_out, new_inp) for new_inp in indexed_inputs]
Member

What does this function do?

Member Author

When there's a runtime error evaluating a node, it shows the stack trace of where the original variable was first defined by the user, even if that variable was later replaced by rewrites and no longer really exists in the computational graph.
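In rewrite code the call is just copy_stack_trace(old_var, new_var). A toy illustration (assuming the import path pytensor.graph.rewriting.basic):

import pytensor.tensor as pt
from pytensor.graph.rewriting.basic import copy_stack_trace

x = pt.vector("x")
old_out = pt.exp(x)        # stand-in for the output a rewrite is replacing
new_out = pt.exp(x) * 1.0  # stand-in for the replacement the rewrite builds
# Copies the recorded creation trace (tag.trace), if any, from old_out to
# new_out, so error messages keep pointing at the user's original definition
copy_stack_trace(old_out, new_out)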


For now rewrite is restricted to single axis of reduction, for simplicity.

sum(x, axis=1)[0] -> sum(x[0], axis=0)
Member

Does this rewrite handle the case where keepdims=True? Or does that kwarg produce a different Op than CAReduce?

Member Author

keepdims is not part of CAReduce; it's added as a separate expand_dims after the reduction.
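This is easy to check (illustrative snippet; the printed op names are what I'd expect, not verified against this branch):

import pytensor.tensor as pt

x = pt.matrix("x")
out = x.sum(axis=1, keepdims=True)
# The reduction Op never sees keepdims: the reduced axis is added back by an
# expand_dims (a DimShuffle) sitting on top of the Sum
print(type(out.owner.op).__name__)                  # expected: DimShuffle
print(type(out.owner.inputs[0].owner.op).__name__)  # expected: Sum (a CAReduce)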


if len(fgraph.clients[adv_subtensor]) > 1:
# AdvancedSubtensor involves a full_copy, so we don't want to do it twice
return None
Member

Add pragma: no cover to make codecov shut up


@node_rewriter([Subtensor])
def local_subtensor_of_adv_subtensor(fgraph, node):
"""Lift a simple Subtensor through an AdvancedSubtensor, when basic index dimensions are to the left of any advanced ones.
Member

This rewrite made me question something of the other commits. If this rewrite is lifting the subtensor out of the advanced subtensor operation, aren't some of your other rewrites actually lowering the subtensor so that it gets applied before the other ops?

Member Author

When we say lifting we mean applying closer to the inputs of the function (before other operations). What you're describing would be sinking/lowering. In general we want to apply the subtensor before other elementwise operations, so as to compute fewer things. AdvancedIndexing could be a case to lower, if we end up with more rows than before, but we don't have any rewrites to reason about that.
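For a concrete sense of the direction, indexing the result of an elementwise op is rewritten so the indexing happens first (illustrative; the exact dprint labels may differ):

import pytensor
import pytensor.tensor as pt

x = pt.matrix("x", shape=(512, 512))
out = pt.exp(x)[0]  # exp over the whole matrix, then take one row

fn = pytensor.function([x], out)
fn.dprint()
# The Subtensor is lifted towards the input, so exp only runs on one row, e.g.:
# Exp [id A]
#  └─ Subtensor{i} [id B]
#     ├─ x [id C]
#     └─ 0 [id D]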

@ricardoV94 ricardoV94 marked this pull request as ready for review May 9, 2025 08:38
@ricardoV94 ricardoV94 merged commit 7b0a392 into pymc-devs:main May 9, 2025
72 of 73 checks passed
@ricardoV94 ricardoV94 deleted the subtensor_lift branch May 19, 2025 13:32