We read every piece of feedback, and take your input very seriously.
To see all available qualifiers, see our documentation.
There was an error while loading. Please reload this page.
https://github.com/ggerganov/llama.cpp/blob/8b679987cdce292ff36bd741f6715e4927e26f9b/llama.cpp#L1558
Is currently single threaded. Quantization is quite slow (vicuna 7B: 65156.31 ms, vicuna 13B: 129902.48 ms).