
Commit 20bd63c

DarkLight1337 authored and rasmith committed
[Doc] Add note to gte-Qwen2 models (vllm-project#11808)
Signed-off-by: DarkLight1337 <[email protected]>
1 parent 40c1080 commit 20bd63c

File tree

1 file changed: +3 −0 lines changed


docs/source/models/supported_models.md

Lines changed: 3 additions & 0 deletions
````diff
@@ -430,6 +430,9 @@ You can set `--hf-overrides '{"is_causal": false}'` to change the attention mask
 
 On the other hand, its 1.5B variant (`Alibaba-NLP/gte-Qwen2-1.5B-instruct`) uses causal attention
 despite being described otherwise on its model card.
+
+Regardless of the variant, you need to enable `--trust-remote-code` for the correct tokenizer to be
+loaded. See [relevant issue on HF Transformers](https://github.com/huggingface/transformers/issues/34882).
 ```
 
 If your model is not in the above list, we will try to automatically convert the model using
````
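As a sketch of how the added guidance would be applied in practice (the `vllm serve` command and the `--task embed` flag are assumed from vLLM's CLI; exact flag names may vary by version):

```shell
# 1.5B variant: custom tokenizer code on the Hugging Face repo requires
# --trust-remote-code, per the note added in this commit.
vllm serve Alibaba-NLP/gte-Qwen2-1.5B-instruct \
    --task embed \
    --trust-remote-code

# 7B variant: additionally override the attention mask to bidirectional,
# as described in the surrounding documentation.
vllm serve Alibaba-NLP/gte-Qwen2-7B-instruct \
    --task embed \
    --trust-remote-code \
    --hf-overrides '{"is_causal": false}'
```

Without `--trust-remote-code`, the default tokenizer loaded from the repo is incorrect for these models (see the linked Transformers issue).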

0 commit comments
