Qwen 1.5 Beta 1.8B outputs incoherently #5459
+1. Both Qwen1.5-72B-Chat and Qwen-72B-Chat output incoherently. The old llama.cpp from around December 2023 worked normally. |
That's great info to know. Can you pinpoint the last version that worked? If we can identify which change caused the incoherence, it might get us closer to solving the problem. |
Same problem. |
There are some mistakes in the model config files. I used the Qwen1.5 GGUF from Hugging Face, which ran successfully. It may be related to this commit: https://huggingface.co/Qwen/Qwen1.5-72B-Chat/commit/bc11a298a0c6a5cd737064db62c6ad20ec6331be |
Hmm, I'm unsure that's the only issue. I chat fine-tuned and tried to quantize it since then. |
Mostly, but there might be some EOS-related settings that need adjusting in the config of the original model. |
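For anyone hitting this, a quick sanity check is to compare the EOS-related token ids in the original model's config files against what ended up in the GGUF metadata (the `tokenizer.ggml.eos_token_id` key, visible e.g. via the gguf-dump script from llama.cpp's gguf-py package, if available). A minimal sketch, assuming a locally downloaded Hugging Face snapshot (the path is illustrative):

```python
import json
from pathlib import Path

# Path to the locally downloaded Hugging Face model (illustrative).
model_dir = Path("Qwen1.5-1.8B-Chat")

# EOS/pad ids can live in several files; collect them all for comparison.
for name in ("config.json", "generation_config.json", "tokenizer_config.json"):
    path = model_dir / name
    if not path.exists():
        continue
    cfg = json.loads(path.read_text())
    found = {k: cfg[k] for k in ("eos_token_id", "eos_token", "pad_token_id") if k in cfg}
    print(f"{name}: {found}")

# These values should agree with the GGUF metadata (tokenizer.ggml.eos_token_id)
# of the converted file; a mismatch is a plausible cause of runaway or garbled output.
```

If the ids disagree (for example, the GGUF carrying the base model's EOS instead of the id the chat template actually emits to end a turn), generation can run past the intended stop token and look incoherent.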
So, is this problem solved? |
Not in the official repo |
I have the same problem. |
This issue was closed because it has been inactive for 14 days since being marked as stale. |
The latest llama.cpp outputs incoherently compared to the Transformers output. transformers/vllm work fine, but the llama.cpp GGUF does not.
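To make that comparison concrete, a hedged sketch of how one might produce a deterministic reference completion with transformers (the model id and prompt are illustrative) and then run the same prompt through the llama.cpp GGUF:

```python
# Minimal sketch, assuming the transformers library and the published
# Qwen/Qwen1.5-1.8B-Chat checkpoint are available locally or via the Hub.
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "Qwen/Qwen1.5-1.8B-Chat"
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(model_id, device_map="auto")

messages = [{"role": "user", "content": "Explain what a GGUF file is in one sentence."}]
input_ids = tokenizer.apply_chat_template(
    messages, add_generation_prompt=True, return_tensors="pt"
).to(model.device)

# Greedy decoding keeps the run deterministic, so any divergence from the
# llama.cpp output points at conversion/quantization rather than sampling.
output = model.generate(input_ids, max_new_tokens=128, do_sample=False)
print(tokenizer.decode(output[0][input_ids.shape[-1]:], skip_special_tokens=True))
```

Feeding the same chat-formatted prompt to the GGUF build and diffing the two completions makes it easier to tell a conversion regression apart from ordinary sampling variance.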