Commit 17d1c72

Update spqr.md
1 parent 96b2613 commit 17d1c72

File tree

1 file changed: +2 -2 lines changed


docs/source/en/quantization/spqr.md

Lines changed: 2 additions & 2 deletions
@@ -16,7 +16,7 @@ rendered properly in your Markdown viewer.
 
 # SpQR
 
-[SpQR](https://github.com/Vahe1994/SpQR) quantization algorithm involves a 16x16 tile, 3-bit configuration, and unstructured sparsity as detailed in [SpQR: A Sparse-Quantized Representation for Near-Lossless LLM Weight Compression](https://arxiv.org/abs/2306.03078).
+[SpQR](https://github.com/Vahe1994/SpQR) quantization algorithm involves a 16x16 tiled bi-level group 3-bit quantization structure, with sparse outliers as detailed in [SpQR: A Sparse-Quantized Representation for Near-Lossless LLM Weight Compression](https://arxiv.org/abs/2306.03078).
 
 To SpQR-quantize a model, refer to the [Vahe1994/SpQR](https://github.com/Vahe1994/SpQR) repository.
 
@@ -32,4 +32,4 @@ quantized_model = AutoModelForCausalLM.from_pretrained(
     device_map="auto"
 )
 tokenizer = AutoTokenizer.from_pretrained("elvircrn/Llama-2-7b-SPQR-3Bit-16x16-red_pajama-hf")
-```
+```
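For reference, a minimal sketch of how the snippet quoted in the second hunk would be used end to end. The model ID comes from the diff itself; the imports, the `torch_dtype=torch.half` argument, and the generation step are assumptions added here for illustration and are not part of the commit.

```python
# Sketch (not part of the commit): load and run the SpQR-quantized checkpoint
# referenced in the diff above.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "elvircrn/Llama-2-7b-SPQR-3Bit-16x16-red_pajama-hf"

# Load the pre-quantized SpQR model; device_map="auto" places it on available devices.
quantized_model = AutoModelForCausalLM.from_pretrained(
    model_id,
    torch_dtype=torch.half,  # assumption: half precision for the non-quantized parts
    device_map="auto",
)
tokenizer = AutoTokenizer.from_pretrained(model_id)

# Short generation to check that the quantized weights load and run.
inputs = tokenizer("SpQR compresses LLM weights by", return_tensors="pt").to(quantized_model.device)
outputs = quantized_model.generate(**inputs, max_new_tokens=32)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```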
