
Why does GGML_VK_PREFER_HOST_MEMORY help with iGPUs when using Vulkan? #12770

Answered by wbruna
ddpasa asked this question in Q&A

The PR has a bit more info about the option: #11592. But in a nutshell: for some inference operations, back-and-forth data transfers between dedicated VRAM (faster) and host shared memory (slower) may end up much slower than just using host shared memory for everything. That env var just reverses the default logic, from "allocate dedicated VRAM, with host shared memory as fallback" to "allocate host shared memory, with dedicated VRAM as fallback".
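To make the reversed preference order concrete, here is a minimal Python sketch. It is a toy model, not the actual ggml Vulkan backend code: the names `allocation_order`, `allocate`, `DEVICE_LOCAL` and `HOST_SHARED` are hypothetical, and it assumes the env var counts as enabled whenever it is set at all.

```python
import os

# Hypothetical memory kinds; these names are illustrative, not ggml identifiers.
DEVICE_LOCAL = "dedicated VRAM"
HOST_SHARED = "host shared memory"


def allocation_order(env=None):
    """Return the memory kinds to try, most preferred first."""
    env = os.environ if env is None else env
    # Assumption: the variable counts as enabled whenever it is set at all.
    if "GGML_VK_PREFER_HOST_MEMORY" in env:
        return [HOST_SHARED, DEVICE_LOCAL]
    return [DEVICE_LOCAL, HOST_SHARED]


def allocate(size, free_bytes, env=None):
    """Pick the first memory kind in preference order with enough free space."""
    for kind in allocation_order(env):
        if free_bytes.get(kind, 0) >= size:
            free_bytes[kind] -= size  # record the allocation in the toy budget
            return kind
    raise MemoryError(f"no memory kind can hold {size} bytes")
```

With a budget such as `{DEVICE_LOCAL: 512, HOST_SHARED: 4096}`, a 256-byte request lands in dedicated VRAM by default and in host shared memory when the variable is set. A 1024-byte request falls back to host shared memory either way, since the fallback still applies in both orders.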

And it's not the default because systems that could comfortably allocate everything on dedicated VRAM would take a big performance hit, while potentially under-utilizing VRAM and leaving less general memory available for other applications. The allocation heuris…

Answer selected by ddpasa

0cc4m Apr 7, 2025
Collaborator
