
Commit 94ab451

squash

Signed-off-by: Kyle Sayers <[email protected]>

1 parent 08c4c91

File tree

2 files changed: +32 −0 lines changed


README.md

Lines changed: 1 addition & 0 deletions
```diff
@@ -39,6 +39,7 @@ Applying quantization with `llmcompressor`:
 * [Activation quantization to `fp8`](examples/quantization_w8a8_fp8)
 * [Weight only quantization to `int4`](examples/quantization_w4a16)
 * [Quantizing MoE LLMs](examples/quantizing_moe)
+* [Quantizing Multimodal VLMs](examples/multimodal_vision)
 
 ### User Guides
 Deep dives into advanced usage of `llmcompressor`:
```

examples/multimodal_vision/README.md

Lines changed: 31 additions & 0 deletions
@@ -0,0 +1,31 @@
1+
# Quantizing Multimodal Vision-Language Models #
2+
This directory contains example scripts for quantizing a variety of vision-language models using the GPTQ W4A16 quantization scheme.
3+
4+
## Using your own models ##
5+
6+
```python3
7+
recipe = [
8+
GPTQModifier(
9+
targets="Linear",
10+
scheme="W4A16",
11+
sequential_targets=["MistralDecoderLayer"],
12+
ignore=["re:.*lm_head", "re:vision_tower.*", "re:multi_modal_projector.*"],
13+
),
14+
]
15+
```
### Sequential Targets ###
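The recipe above sets `sequential_targets=["MistralDecoderLayer"]`, which names the layer class whose instances are calibrated and quantized one at a time. As an illustration of that grouping (a stdlib-only sketch with hypothetical module names, not llmcompressor's internal API):

```python
import re

# Hypothetical fully qualified module names, as found in a Mistral-style VLM.
module_names = [
    "vision_tower.encoder.layers.0.self_attn.q_proj",
    "language_model.model.layers.0.self_attn.q_proj",
    "language_model.model.layers.0.mlp.down_proj",
    "language_model.model.layers.1.self_attn.q_proj",
]

def group_by_decoder_layer(names):
    """Group module names by the decoder layer that owns them, mirroring
    how sequential calibration walks one decoder layer at a time."""
    groups = {}
    for name in names:
        match = re.search(r"language_model\.model\.layers\.(\d+)\.", name)
        if match:
            groups.setdefault(int(match.group(1)), []).append(name)
    return groups

groups = group_by_decoder_layer(module_names)
```

Modules outside the named layer class (here, the vision tower) fall outside every sequential group.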
### Ignore ###
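The `ignore` list in the recipe above uses `re:`-prefixed patterns to exclude modules from quantization, here the LM head, the vision tower, and the multimodal projector. A minimal sketch of that matching behavior, assuming patterns are anchored at the start of the fully qualified module name (as with `re.match`):

```python
import re

ignore = ["re:.*lm_head", "re:vision_tower.*", "re:multi_modal_projector.*"]

def is_ignored(module_name, patterns):
    """Return True if any `re:`-prefixed pattern matches the module name
    from its start (an assumption mirroring `re.match` semantics)."""
    for pattern in patterns:
        if pattern.startswith("re:") and re.match(pattern[3:], module_name):
            return True
    return False

assert is_ignored("vision_tower.encoder.layers.0.mlp.fc1", ignore)
assert is_ignored("multi_modal_projector.linear_1", ignore)
assert is_ignored("language_model.lm_head", ignore)
assert not is_ignored("language_model.model.layers.0.self_attn.q_proj", ignore)
```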
### Tracing Errors ###

Because the architectures of vision-language models are often more complex than those of typical decoder-only text models, you may encounter `torch.fx.TraceError`s when attempting to quantize your model. For more information on `torch.fx.TraceError`s, why they occur, and how to resolve them, please see the [Model Tracing Guide](/src/llmcompressor/transformers/tracing/README.md).
### Adding Smoothquant Mappings ###
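A SmoothQuant modifier can be added to a recipe alongside GPTQ; its mappings pair the linear layers to be smoothed with the activation source that precedes them. A hypothetical sketch, assuming `SmoothQuantModifier` with a `mappings` argument and Mistral-style module names (verify both against your installed version and your model):

```python3
from llmcompressor.modifiers.smoothquant import SmoothQuantModifier

# Hypothetical mappings for a Mistral-style decoder: each entry pairs the
# linear layers to be smoothed with the layernorm that feeds them.
recipe = [
    SmoothQuantModifier(
        smoothing_strength=0.8,
        mappings=[
            [["re:.*q_proj", "re:.*k_proj", "re:.*v_proj"], "re:.*input_layernorm"],
            [["re:.*gate_proj", "re:.*up_proj"], "re:.*post_attention_layernorm"],
        ],
    ),
]
```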
### Adding Data Collator ###

* TODO: create a default "multimodal" collator
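Multimodal processors often return nested features (for example, pixel values alongside input ids) that a default text collator cannot stack. As a stdlib-only sketch of the collator shape, where the key names are assumptions and a real collator would wrap values in torch tensors:

```python
def data_collator(batch):
    """Collate a calibration batch of size one by passing each feature
    through unchanged (a real collator would wrap values in tensors)."""
    assert len(batch) == 1, "calibration batches are assumed to have size one"
    return {key: value for key, value in batch[0].items()}

# Hypothetical processor output for one calibration sample.
sample = [{"input_ids": [[1, 2, 3]], "pixel_values": [[[0.1, 0.2]]]}]
collated = data_collator(sample)
```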
## Customizing Dataset and Quantization Scheme ##

For a detailed walkthrough of customizing datasets and quantization for W4A16, see the [Quantization Guide](/examples/quantization_w4a16/README.md).
