
Commit 0d2caec

DarkLight1337 and Isotr0py authored and committed
[Doc] Add note to gte-Qwen2 models (vllm-project#11808)
Signed-off-by: DarkLight1337 <[email protected]>
Signed-off-by: Isotr0py <[email protected]>
1 parent 3734af0 commit 0d2caec

File tree

1 file changed (+3 −0 lines)

docs/source/models/supported_models.md

Lines changed: 3 additions & 0 deletions
````diff
@@ -430,6 +430,9 @@ You can set `--hf-overrides '{"is_causal": false}'` to change the attention mask
 
 On the other hand, its 1.5B variant (`Alibaba-NLP/gte-Qwen2-1.5B-instruct`) uses causal attention
 despite being described otherwise on its model card.
+
+Regardless of the variant, you need to enable `--trust-remote-code` for the correct tokenizer to be
+loaded. See [relevant issue on HF Transformers](https://github.com/huggingface/transformers/issues/34882).
 ```
 
 If your model is not in the above list, we will try to automatically convert the model using
````
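To illustrate the flag the added note documents, here is a minimal sketch of serving the 1.5B variant as an embedding model, assuming the `vllm serve` entrypoint and the `--task embed` option available in recent vLLM releases (model name taken from the diff; this is a usage sketch, not part of the commit):

```shell
# --trust-remote-code is required so the correct (remote) tokenizer is loaded,
# per the note added in this commit.
vllm serve Alibaba-NLP/gte-Qwen2-1.5B-instruct \
    --task embed \
    --trust-remote-code
```

For the 7B variant, the surrounding docs additionally suggest `--hf-overrides '{"is_causal": false}'` to change the attention mask; the `--trust-remote-code` requirement applies to both variants.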
