Raise warning for 24 compressed sparse-only models (#1107)
In a recent update (vllm-project/vllm#12417), we disabled the CUTLASS
kernels for sparse-only models. As a result, compressed sparse-24-only
models are no longer runnable in vLLM.
This PR introduces a warning message to inform users when compression is
enabled in scenarios where sparse-only models are unsupported. This
ensures clarity and avoids unexpected behavior when using sparse-24
configurations with vLLM.
Changes:
- Added a warning to notify users when attempting to enable compression
with sparse-only models in unsupported configurations.
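
The check described above might look roughly like the following. This is a minimal sketch, not the code from this PR: the function name `warn_if_sparse_only_compression`, its parameters, and the warning text are all hypothetical, and it assumes a model counts as "sparse-24-only" when it has 2:4 sparsity and no quantization.

```python
import warnings
from typing import Optional


def warn_if_sparse_only_compression(
    sparsity_structure: Optional[str],
    is_quantized: bool,
    save_compressed: bool,
) -> bool:
    """Warn when a sparse-24-only model is about to be saved compressed.

    Hypothetical helper: returns True if the warning was emitted.
    """
    # Assumption: "sparse-only" means 2:4 sparsity with no quantization.
    is_sparse_24_only = sparsity_structure == "2:4" and not is_quantized

    if save_compressed and is_sparse_24_only:
        warnings.warn(
            "Compressed sparse-24-only models are not currently runnable "
            "in vLLM; consider saving the model uncompressed.",
            UserWarning,
            stacklevel=2,
        )
        return True
    return False
```

Emitting a warning rather than raising keeps the save path working for users who do not target vLLM, which matches the intent stated above of informing users rather than blocking them.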
---------
Signed-off-by: Rahul Tuli <[email protected]>