## 🚀 Feature

Improve the `_bincount` utility to avoid an unnecessary fallback on CUDA when deterministic mode is enabled and conditions are safe for native `torch.bincount` use (PyTorch ≥ 2.1).

## Motivation
The current `_bincount` implementation in TorchMetrics falls back to a slower and more memory-intensive workaround whenever `torch.are_deterministic_algorithms_enabled()` returns `True`, regardless of the PyTorch version or backend.
However, since PyTorch v2.1, `torch.bincount` is allowed in deterministic mode on CUDA as long as:

- No weights are passed
- Gradients are not required
Avoiding the fallback in this case would improve performance and reduce memory usage.
This is particularly relevant when running large-scale evaluations on modern GPU systems.
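For a sense of the scale of that overhead, here is a minimal sketch of the broadcast-compare style of deterministic workaround (illustrative only, not necessarily TorchMetrics' exact code): it materializes an intermediate of shape `(len(x), minlength)`, so memory grows with both the input size and the number of bins.

```python
import torch

def deterministic_bincount(x: torch.Tensor, minlength: int) -> torch.Tensor:
    # Compare every element against every bin index, then reduce.
    # The (len(x), minlength) boolean intermediate is what makes this
    # workaround memory-hungry compared to native torch.bincount.
    bins = torch.arange(minlength, device=x.device)
    return (x.unsqueeze(-1) == bins).sum(dim=0)
```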
## Pitch
Update the `_bincount` utility logic as follows (a code sketch follows the list):

- Use native `torch.bincount` if:
  - `x.is_cuda` is `True`
  - `torch.__version__ >= 2.1`
  - no weights are involved
  - gradients are not required
- Only fall back when:
  - `x.is_mps`, or
  - the XLA backend is detected, or
  - the PyTorch version is < 2.1 and deterministic algorithms are enabled
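A minimal sketch of that gating, under stated assumptions: the `_is_xla` helper and the `packaging`-based version guard below are illustrative placeholders, not TorchMetrics' actual internals.

```python
from typing import Optional

import torch
from packaging.version import Version

_TORCH_GE_2_1 = Version(torch.__version__) >= Version("2.1.0")


def _is_xla(x: torch.Tensor) -> bool:
    # Hypothetical placeholder; TorchMetrics has its own XLA detection.
    return x.device.type == "xla"


def _bincount(x: torch.Tensor, minlength: Optional[int] = None) -> torch.Tensor:
    """Sketch of the proposed gating, not the final implementation."""
    if minlength is None:
        minlength = int(x.max().item()) + 1 if x.numel() > 0 else 0
    # Native path: CUDA + PyTorch >= 2.1 is deterministic-safe as long as
    # no weights are passed and no gradients are required (PR #105244).
    if x.is_cuda and _TORCH_GE_2_1 and not x.requires_grad:
        return torch.bincount(x, minlength=minlength)
    # Fall back only for MPS, XLA, or older PyTorch under deterministic mode.
    if (
        x.is_mps
        or _is_xla(x)
        or (not _TORCH_GE_2_1 and torch.are_deterministic_algorithms_enabled())
    ):
        bins = torch.arange(minlength, device=x.device)
        return (x.unsqueeze(-1) == bins).sum(dim=0)
    return torch.bincount(x, minlength=minlength)
```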
## Alternatives
Continue using the current fallback unconditionally under deterministic mode, but this leads to unnecessary compute and memory overhead on newer CUDA-enabled systems.
## Additional context
This proposed change aligns with the improvements introduced in PyTorch PR #105244, which enabled deterministic torch.bincount on CUDA under safe conditions starting from v2.1.
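As a quick illustration of that behavioural change (assuming a CUDA machine; this snippet is a sanity check, not part of the proposal):

```python
import torch

torch.use_deterministic_algorithms(True)

x = torch.randint(0, 10, (1_000,), device="cuda")
# On PyTorch >= 2.1 this runs natively and deterministically; on earlier
# versions it raises a RuntimeError because torch.bincount had no
# deterministic CUDA implementation.
counts = torch.bincount(x, minlength=10)
print(counts)
```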
A PR will follow shortly to implement this enhancement.