
Commit 56d4927

pytorchbot authored and kirklandsign committed
[Executorch][sdpa] Add accidentally removed flash attention args check (#9910)
Mostly preparing for adding quantized SDPA.
Differential Revision: [D71370596](https://our.internmc.facebook.com/intern/diff/D71370596/)
ghstack-source-id: 276012277
Pull Request resolved: #9886
1 parent 2b105f9 commit 56d4927

File tree

1 file changed: +7 −0 lines changed


extension/llm/custom_ops/op_sdpa.cpp

@@ -294,6 +294,13 @@ Tensor& custom_sdpa_out(
       output,
       "attn_mask and is_causal cannot be set at the same time");
 
+  ET_KERNEL_CHECK_MSG(
+      ctx,
+      validate_flash_attention_args(q, k, v, attn_mask),
+      InvalidArgument,
+      output,
+      "Invalid arguments");
+
   ET_CHECK_MSG(q.dim() == 4, "query must be a 4D tensor");
 
   const int64_t seq_len = q.size(1);
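For context, the restored guard routes an argument validator through the same ET_KERNEL_CHECK_MSG error-reporting path as the preceding attn_mask/is_causal check. The body of validate_flash_attention_args is not shown in this diff; the sketch below is a hypothetical, self-contained illustration of the kind of preconditions such a validator typically enforces (4-D inputs, matching dtypes, consistent head dimension, and a rank check on the optional mask). The TensorView struct and the function name with the `_sketch` suffix are stand-ins, not the ExecuTorch API, and the real checks in op_sdpa.cpp may differ.

```cpp
// Hypothetical sketch only: a stand-in for validate_flash_attention_args.
// TensorView is a minimal substitute for the ExecuTorch tensor type.
#include <cstdint>
#include <optional>
#include <vector>

struct TensorView {
  std::vector<int64_t> sizes;  // e.g. {batch, seq_len, num_heads, head_dim}
  int dtype = 0;               // opaque dtype tag for this sketch
  int64_t dim() const { return static_cast<int64_t>(sizes.size()); }
  int64_t size(int64_t d) const { return sizes[d]; }
};

// Returns true when q/k/v (and the optional mask) look usable by a
// flash-attention style kernel; the actual checks in op_sdpa.cpp may differ.
bool validate_flash_attention_args_sketch(
    const TensorView& q,
    const TensorView& k,
    const TensorView& v,
    const std::optional<TensorView>& attn_mask) {
  // All inputs must be 4-D: [batch, seq_len, num_heads, head_dim].
  if (q.dim() != 4 || k.dim() != 4 || v.dim() != 4) {
    return false;
  }
  // Query, key, and value must share a dtype.
  if (q.dtype != k.dtype || q.dtype != v.dtype) {
    return false;
  }
  // Head dimension must agree across q, k, and v.
  if (q.size(3) != k.size(3) || q.size(3) != v.size(3)) {
    return false;
  }
  // If a mask is supplied, only 2-D masks are accepted in this sketch.
  if (attn_mask.has_value() && attn_mask->dim() != 2) {
    return false;
  }
  return true;
}
```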
