Increase 3B scratch buffers. #1698

Merged
merged 1 commit into master on Jun 5, 2023

Conversation

SlyEcho (Collaborator) commented Jun 5, 2023

The 128 MB was too optimistic.
Too bad it is not dynamically computed.

Ref: #1588 (comment)

SlyEcho requested a review from Green-Sky, June 5, 2023 09:09
SlyEcho (Collaborator, Author) commented Jun 5, 2023

@LostRuins, you suggested increasing to 256MB.

Is that going to be enough? What is the best way to test it?

SlyEcho (Collaborator, Author) commented Jun 5, 2023

It seems to run for me with both Q4_0 and Q5_1 on context 2048 and batch 512 with only 128MB.

LostRuins (Collaborator) commented Jun 5, 2023

Hi @SlyEcho, I guess the best way to test it is to download and run your OpenLLaMA 3B ggml quant (which I don't know if I am allowed to link here). Running it as q4_0 with a 256 MB scratch buffer at batch size 512 and 2048 context, it seems to work for me. I don't know if there may be some boundary parameter that could still fail.

128 MB crashes for me at around the 1.5k token mark.

LostRuins (Collaborator) left a comment


lgtm

SlyEcho (Collaborator, Author) commented Jun 5, 2023

OK, I will merge it. Memory use is probably also dependent on the user's system and build.

SlyEcho merged commit 5220a99 into master, Jun 5, 2023
SlyEcho deleted the fix-3b-mem-req branch, June 5, 2023 10:43
Green-Sky (Collaborator) commented

"download and run your OpenLLAMA 3B ggml quant (which I don't know if I am allowed to link here)"

Yes, you are :) OpenLLaMA is an open-source reproduction, licensed under the Apache 2.0 license.


3 participants