Commit d422aa9

r-barnes authored and jimpang committed
[Bugfix] Fix numel() downcast in fused_layernorm_dynamic_per_token_quant.cu (vllm-project#17316)
1 parent f0edab1 commit d422aa9

1 file changed: +1 -1 lines changed

csrc/quantization/fused_kernels/fused_layernorm_dynamic_per_token_quant.cu

Lines changed: 1 addition & 1 deletion
@@ -96,7 +96,7 @@ void rms_norm_dynamic_per_token_quant_dispatch(
     std::optional<at::Tensor> const& scale_ub,
     std::optional<at::Tensor>& residual) {
   int32_t hidden_size = input.size(-1);
-  int32_t num_tokens = input.numel() / hidden_size;
+  auto num_tokens = input.numel() / hidden_size;
 
   dim3 grid(num_tokens);
   dim3 block(std::min(hidden_size, 1024));
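
The downcast named in the commit title can be illustrated outside of PyTorch. `at::Tensor::numel()` returns `int64_t`, so storing the quotient in an `int32_t` narrows it, while `auto` keeps the 64-bit type. The following is a minimal sketch only: `num_tokens_old` and `num_tokens_new` are hypothetical helper names, and the element count is passed in as a plain `int64_t` rather than read from a real tensor.

```cpp
#include <cstdint>
#include <type_traits>

// Hypothetical helper mirroring the old line: the int64_t quotient is
// narrowed to int32_t, which cannot represent values above 2^31 - 1.
int32_t num_tokens_old(int64_t numel, int32_t hidden_size) {
  return static_cast<int32_t>(numel / hidden_size);
}

// Hypothetical helper mirroring the new line: `auto` deduces the type of
// the expression int64_t / int32_t, which is int64_t after promotion.
auto num_tokens_new(int64_t numel, int32_t hidden_size) {
  return numel / hidden_size;
}

// Compile-time check that the deduced type really is 64-bit.
static_assert(
    std::is_same<decltype(num_tokens_new(0, 1)), int64_t>::value,
    "auto should deduce int64_t for int64_t / int32_t");
```

For small inputs both helpers agree. With an element count of 2^33 and a hidden size of 2, `num_tokens_new` returns 2^32, a value that `num_tokens_old` cannot return because it does not fit in `int32_t`.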
