Commit 3df784b

Vulkan: VK_KHR_cooperative_matrix support to speed up prompt processing (ggml-org#10597)

* Vulkan: Implement VK_KHR_cooperative_matrix support in the matrix matrix multiplication shader
* Improve performance with better q4_k and q5_k dequant and store unrolling
* Add Vulkan MUL_MAT and MUL_MAT_ID accumulator precision selection
* Rework mulmat shader selection and compilation logic, avoid compiling shaders that won't get used by device
* Vulkan: Implement accumulator switch for specific mul mat mat shaders
* Vulkan: Unroll more loops for more mul mat mat performance
* Vulkan: Add VK_AMD_shader_core_properties2 support to read Compute Unit count for split_k logic
* Disable coopmat support on AMD proprietary driver
* Remove redundant checks
* Add environment variable GGML_VK_DISABLE_COOPMAT to disable VK_KHR_cooperative_matrix support
* Fix rebase typo
* Fix coopmat2 MUL_MAT_ID pipeline selection
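
The GGML_VK_DISABLE_COOPMAT switch mentioned in the message could be sketched roughly as follows. This is a minimal illustration, not the code from this commit: the helper names `coopmat_enabled` and `coopmat_enabled_from_env` are hypothetical, and it assumes that the mere presence of the variable (with any value) disables the feature.

```cpp
#include <cstdlib>

// Hypothetical helper, not taken from ggml-vulkan.cpp.
// `device_supports_coopmat` stands for whatever the backend learned when it
// queried the device for VK_KHR_cooperative_matrix.
// `env_value` is the result of std::getenv("GGML_VK_DISABLE_COOPMAT"),
// which is nullptr when the variable is unset.
bool coopmat_enabled(bool device_supports_coopmat, const char * env_value) {
    // Assumption: any value, even an empty string, counts as "disable".
    if (env_value != nullptr) {
        return false;
    }
    return device_supports_coopmat;
}

// Convenience wrapper that reads the process environment directly.
bool coopmat_enabled_from_env(bool device_supports_coopmat) {
    return coopmat_enabled(device_supports_coopmat,
                           std::getenv("GGML_VK_DISABLE_COOPMAT"));
}
```

Under these assumptions, starting the program with `GGML_VK_DISABLE_COOPMAT=1` in the environment would make `coopmat_enabled_from_env(true)` return `false`, so the backend would fall back to its non-cooperative-matrix shaders. Passing the environment value as a parameter keeps the decision testable without mutating the process environment.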

1 parent 86a1934, commit 3df784b

3 files changed, +750 -397 lines changed

ggml/src/ggml-vulkan
- ggml-vulkan.cpp
- vulkan-shaders
  - mul_mm.comp
  - vulkan-shaders-gen.cpp
