Increase 3B scratch buffers. #1698

Merged
merged 1 commit into master on Jun 5, 2023

Conversation

SlyEcho (Collaborator) commented Jun 5, 2023

The 128 MB was too optimistic.
Too bad it is not dynamically computed.

Ref: #1588 (comment)

SlyEcho requested a review from Green-Sky, June 5, 2023 09:09
SlyEcho (Collaborator, Author) commented Jun 5, 2023

@LostRuins, you suggested increasing to 256MB.

Is that going to be enough? What is the best way to test it?

SlyEcho (Collaborator, Author) commented Jun 5, 2023

It seems to run for me with both Q4_0 and Q5_1 on context 2048 and batch 512 with only 128MB.

LostRuins (Collaborator) commented Jun 5, 2023

Hi @SlyEcho, I guess the best way to test it is to download and run your OpenLLaMA 3B ggml quant (which I don't know if I am allowed to link here). Running it as q4_0 with a 256 MB scratch buffer at batch size 512 and 2048 context, it seems to work for me. I don't know if there may be some boundary parameter that could still fail.

128 MB crashes for me at around the 1.5k token mark.

LostRuins (Collaborator) left a comment


lgtm

SlyEcho (Collaborator, Author) commented Jun 5, 2023

OK, I will merge it. Memory use is probably also dependent on the user's system and build.

SlyEcho merged commit 5220a99 into master, Jun 5, 2023
SlyEcho deleted the fix-3b-mem-req branch, June 5, 2023 10:43
Green-Sky (Collaborator) commented

"download and run your OpenLLAMA 3B ggml quant (which I don't know if I am allowed to link here)"

Yes, you are :) OpenLLaMA is an open-source reproduction, licensed under the Apache 2.0 license.


3 participants