[User] GGUF conversion, stop sequence Problem #2711
Comments
I think you need to install the python tokenizer:
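For context, llama.cpp's convert scripts read `tokenizer.model` with the sentencepiece library, so a quick sanity check that it's installed and working might look like this (the path is an assumption):

```python
# Minimal sketch: verify the Python tokenizer dependency (sentencepiece)
# is installed and can read the model's tokenizer.model (path assumed).
import sentencepiece as spm

sp = spm.SentencePieceProcessor(model_file="tokenizer.model")
# For the original LLaMA tokenizer this prints: <unk> <s> </s>
print(sp.id_to_piece(0), sp.id_to_piece(1), sp.id_to_piece(2))
```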
Ah, that's likely why I can't convert with metadata. Thank you. I got a crazy error, so I'll troubleshoot with the Termux people.
Yeah, unfortunately this is a real problem, but I'm not sure how to get around it. As far as I can see from the debug output of dumping the vocab items from the GGML model, the string values for the special tokens are simply gone:
So I don't think there's a way to recover that. I could possibly add arguments to override the value for specific vocab items, but if you're going through that much effort you might as well get the metadata and use that, which is likely to be a lot more reliable. I'm not 100% sure this should be closed since it's a legit issue, but I don't know how to fix it either. (With the k-quants improvements on the horizon, converting GGML models is looking less appealing also.)
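For what it's worth, a hypothetical shape for such override arguments (the flag name and format are invented purely for illustration):

```python
import argparse

# Hypothetical converter flag for overriding the string value of specific
# vocab ids, e.g. --token-override 1=<s> --token-override 2=</s>
parser = argparse.ArgumentParser()
parser.add_argument("--token-override", action="append", default=[],
                    metavar="ID=TEXT",
                    help="override the string value for a vocab id")

args = parser.parse_args(["--token-override", "1=<s>", "--token-override", "2=</s>"])
overrides = {int(k): v for k, v in (item.split("=", 1) for item in args.token_override)}
print(overrides)  # {1: '<s>', 2: '</s>'}
```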
That looks like something specific to compiling on Android.
We could have default values for BOS, EOS, etc. using the same tokens and token ids as the original LLaMA models.
All good - at least you communicated that beforehand, so we can chat about it. I considered that you might override the values, but like you mentioned, metadata is obviously more reliable. I'm converting 'cuz there's barely any GGUF models out there right now.
Yes, it's not a llama.cpp issue.
llama.cpp defaults the token ids to the original token ids, so I suggest also defaulting the token strings:
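Something along these lines, as a sketch (the function name and where it hooks into the converter are assumptions):

```python
# Original LLaMA special-token strings, keyed by token id.
LLAMA_DEFAULT_TOKENS = {0: "<unk>", 1: "<s>", 2: "</s>"}

def fill_default_token_strings(vocab: list[bytes]) -> list[bytes]:
    # Apply the LLaMA defaults wherever the GGML vocab entry is a
    # 0-length string, leaving all other entries untouched.
    for token_id, text in LLAMA_DEFAULT_TOKENS.items():
        if token_id < len(vocab) and len(vocab[token_id]) == 0:
            vocab[token_id] = text.encode("utf-8")
    return vocab
```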
Conversion with the special token mapping KV works by writing the special token ids as key-value entries:
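A minimal sketch of those KV entries using the `gguf` Python package from the llama.cpp repo (method names as I understand them; double-check against the current package):

```python
import gguf

# Write the special-token mapping as GGUF KV entries.
writer = gguf.GGUFWriter("model.gguf", arch="llama")
writer.add_tokenizer_model("llama")
writer.add_unk_token_id(0)  # <unk>
writer.add_bos_token_id(1)  # <s>
writer.add_eos_token_id(2)  # </s>
# ... then add the token list and tensors and write the file out.
```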
The folks at Termux solved the build problem. Here's a new error; excuse me if it's dumb or obvious, but I don't see what to do:
Thank you.
Sure, I can add something like that. So just set the token value to the original LLaMA defaults when it's missing?
OpenLLaMA-v2 3B and 7B: same ids and values, but they incorrectly map unknown to padding.
Just to be clear, you're saying they're doing something weird, but it should be like what you pasted in the generated GGUF file? In other words, if I converted an OpenLLaMA 3B or 7B GGML to GGUF and:
I would have made a mistake. Is that correct?
The mapping of token value to token id is the same, so use the mapping I posted first for the original LLaMA. The other identifiers (BOS, EOS, PAD, UNK) are the mapping for which token id is used for each special token:
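To make that concrete, these are two separate mappings (values are the original LLaMA defaults; the PAD value being unset is an assumption):

```python
# Mapping 1: token id -> string value (the vocab entries themselves).
TOKEN_STRINGS = {0: "<unk>", 1: "<s>", 2: "</s>"}

# Mapping 2: special-token role -> token id (which id fills each role).
SPECIAL_IDS = {"UNK": 0, "BOS": 1, "EOS": 2, "PAD": -1}  # -1 = unset
```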
Sorry, I'm still not sure I fully understand. When converting the GGML file without metadata (which is what we're talking about here), all I have is the token ID and the value (bytes, the string value for the token). Just as an example, dumping the information from the GGML file for
Anyway, at least for the model I looked at, the values for tokens 1 and 2 were just 0-length strings. As far as I know, the problem here is with the string values of the tokens, not anything to do with the IDs. Since llama.cpp prints out
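For context, the vocab section of a GGJT-style GGML file stores each token as a length-prefixed byte string followed by a score, so a dump like that boils down to something like this sketch (header parsing omitted; layout assumed from the converter):

```python
import struct

# Sketch: read n_vocab entries from a GGJT-style GGML file, assuming the
# file object is already positioned at the start of the vocab section.
def read_ggml_vocab(f, n_vocab):
    vocab = []
    for token_id in range(n_vocab):
        (length,) = struct.unpack("<I", f.read(4))
        text = f.read(length)  # 0-length for tokens 1 and 2 here
        (score,) = struct.unpack("<f", f.read(4))
        vocab.append((token_id, text, score))
    return vocab
```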
Yes, correct. The initial problem was the string values of tokens 0, 1, and 2. Use the original LLaMA mapping as the default: 0 → `<unk>`, 1 → `<s>`, 2 → `</s>`.
When you PR the changes, it would be good to also fix the default UNK mapping in llama.cpp, setting it to |
Oddly, WizardMath doesn't have
Here's the content of

Edit: I just noticed some of the file formats change to
Your link there does have the expected files, including
Yeah, somehow I messed up. I corrected the formats, thanks. It converted!
Even better, it stopped as expected, so converting with metadata definitely works.
#2725 should help here even when converting without metadata.
Hi <3 llama.cpp
@KerfuffleV2 shows us that models converted without metadata load differently.

Loading non-metadata:

Loading with one converted with external metadata:
I converted WizardMath-7B-V1.0 to GGUF and here are a couple of runs:
ex1:
ex2:
It appears that, due to the way the model is converted, it's unable to utilise the stop sequence, and thus doesn't return control to the User in this case.

Edit: Error message when trying to include metadata:
Repo & here's the content of ~/storage/shared/downloads/wizardmath:

![wizardmath directory contents](https://camo.githubusercontent.com/29595a32f1c9ffa8aba4a5a94e7d3cbb2c02e4481703d1088eee6cca1f0884c8/68747470733a2f2f757365722d696d616765732e67697468756275736572636f6e74656e742e636f6d2f3133333337363430322f3236323131343561362d39613762373263632d666637362d343633622d383831382d3337636430366163353631392e706e67)