Gibberish output of Qwen2.5-3B-Instruct with Q2_K quantization #12378
simmonssong asked this question in Q&A (unanswered)
I tried converting Qwen2.5-3B-Instruct into Q2_K quantization on two different machines. The output of the compressed model is always nonsense:

“||9"363的76...5 31367244一246“),).请-264“3))-64))5761595431636843467435565846"):4843)"),\n5353"34“ ), 3\"6)) the"24\n\n964

This only seems to happen with Q2_K.

Platforms:
Windows 10 with llama.cpp build b4846.
Windows 11 with llama.cpp build b4520.
Original model:
https://huggingface.co/Qwen/Qwen2.5-3B-Instruct
Conversion script:
```
python convert_hf_to_gguf.py ***\Qwen2.5-3B-Instruct --outfile ***\Qwen2.5-3B-Instruct-FP16.gguf
```
Quantization script:
```
llama-quantize.exe ***\Qwen2.5-3B-Instruct-FP16.gguf ***\Qwen2.5-3B-Instruct-Q2_K.gguf Q2_K
```
Model testing script:
```
llama-cli.exe -m ***\Qwen2.5-3B-Instruct-Q2_K.gguf
```
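For context on why Q2_K in particular degrades so much, here is a toy sketch of per-block absmax quantization. It is illustrative only and is not llama.cpp's actual k-quant algorithm (Q2_K uses a more elaborate scheme with per-block scales and minimums), but it shows how much larger the weight-reconstruction error is at 2 bits than at 4 bits, which is especially damaging for a small model like a 3B:

```python
import numpy as np

def quantize_dequantize(x, bits, block=32):
    """Symmetric per-block absmax round-trip quantization.

    Illustrative only: llama.cpp's Q2_K/Q4_K k-quants use a more
    elaborate scheme, but the bit-width effect is the same in spirit.
    """
    levels = 2 ** (bits - 1) - 1  # 1 level at 2-bit, 7 levels at 4-bit
    q = np.empty_like(x)
    for i in range(0, x.size, block):
        b = x[i:i + block]
        scale = np.abs(b).max() / levels
        if scale == 0.0:
            scale = 1.0
        q[i:i + block] = np.round(b / scale).clip(-levels, levels) * scale
    return q

rng = np.random.default_rng(0)
w = rng.standard_normal(4096).astype(np.float32)  # stand-in for a weight tensor
err2 = float(np.sqrt(np.mean((w - quantize_dequantize(w, 2)) ** 2)))
err4 = float(np.sqrt(np.mean((w - quantize_dequantize(w, 4)) ** 2)))
print(f"2-bit RMS error: {err2:.3f}, 4-bit RMS error: {err4:.3f}")
```

With only three representable values per block at 2 bits, the RMS reconstruction error is several times that of 4-bit, which is consistent with Q2_K being usable on larger models but producing garbage on small ones.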