Commit 7469f20

use lowvram flag for offload qkv
1 parent ec21fa7 commit 7469f20

1 file changed, +1 -1 lines changed

Diff for: gpttype_adapter.cpp

@@ -895,7 +895,7 @@ ModelLoadResult gpttype_load_model(const load_model_inputs inputs, FileFormat in
     //llama_ctx_paran_parts = -1;
     llama_ctx_params.seed = -1;
     //llama_ctx_params.f16_kv = true;
-    //llama_ctx_params.low_vram = inputs.low_vram;
+    llama_ctx_params.offload_kqv = !inputs.low_vram;
     llama_ctx_params.mul_mat_q = inputs.use_mmq;
     llama_ctx_params.logits_all = false;
     model_params.use_mmap = inputs.use_mmap;
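
Note on the changed line: the flag is inverted, so setting `inputs.low_vram` turns `offload_kqv` off. Assuming `offload_kqv` has its upstream llama.cpp meaning (whether the KQV ops, including the KV cache, are offloaded to the GPU), this restores the old low-VRAM behaviour through the newer field. A minimal sketch of that mapping follows. The names `load_model_inputs_sketch`, `context_params_sketch`, `make_context_params`, and `check_mapping` are hypothetical stand-ins and do not appear in the repository; only the fields touched by this hunk are modeled.

```cpp
#include <cassert>

// Hypothetical stand-in for load_model_inputs: only the two flags
// read in this hunk are modeled.
struct load_model_inputs_sketch {
    bool low_vram = false;
    bool use_mmq = false;
};

// Hypothetical stand-in for the context params struct: only the two
// fields written in this hunk are modeled.
struct context_params_sketch {
    bool offload_kqv = true;
    bool mul_mat_q = false;
};

// Mirrors the assignments in the diff. low_vram asks for less GPU
// memory use, so KQV offload is disabled whenever it is set.
inline context_params_sketch make_context_params(const load_model_inputs_sketch &inputs) {
    context_params_sketch params;
    params.offload_kqv = !inputs.low_vram;
    params.mul_mat_q = inputs.use_mmq;
    return params;
}

// Small self-check covering both values of the flag.
inline void check_mapping() {
    load_model_inputs_sketch low;
    low.low_vram = true;
    assert(!make_context_params(low).offload_kqv);

    load_model_inputs_sketch normal;
    normal.low_vram = false;
    assert(make_context_params(normal).offload_kqv);
}
```

Calling `check_mapping()` from a test or from `main` exercises both cases. The point of the sketch is only the negation: a caller that previously passed `low_vram = true` now gets `offload_kqv = false`.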
