Commit 9243ba4

Update
Signed-off-by: DarkLight1337 <[email protected]>
1 parent 22677bc commit 9243ba4

4 files changed: +62 −46 lines changed


docs/source/contributing/model/basic.md

Lines changed: 5 additions & 11 deletions
@@ -6,18 +6,12 @@ This guide walks you through the steps to implement a basic vLLM model.

## 1. Bring your model code

-Start by forking our [GitHub repository](https://github.com/vllm-project/vllm) and then [build it from source](#build-from-source).
-This gives you the ability to modify the codebase and test your model.
-
-Clone the PyTorch model code from the HuggingFace Transformers repository and put it into the <gh-dir:vllm/model_executor/models> directory.
-For instance, vLLM's [OPT model](gh-file:vllm/model_executor/models/opt.py) was adapted from the HuggingFace's [modeling_opt.py](https://github.com/huggingface/transformers/blob/main/src/transformers/models/opt/modeling_opt.py) file.
+First, clone the PyTorch model code from the source repository.
+For instance, vLLM's [OPT model](gh-file:vllm/model_executor/models/opt.py) was adapted from
+HuggingFace's [modeling_opt.py](https://github.com/huggingface/transformers/blob/main/src/transformers/models/opt/modeling_opt.py) file.

```{warning}
-When copying the model code, make sure to review and adhere to the code's copyright and licensing terms.
-```
-
-```{tip}
-If you don't want to fork the repository and modify vLLM's codebase, please refer to [Out-of-Tree Model Integration](#new-model-oot).
+Make sure to review and adhere to the original code's copyright and licensing terms!
```

## 2. Make your code compatible with vLLM
@@ -105,4 +99,4 @@ This method should load the weights from the HuggingFace's checkpoint file and a

## 5. Register your model

-Finally, add your `*ForCausalLM` class to `_VLLM_MODELS` in <gh-file:vllm/model_executor/models/registry.py> so that it is available by default.
+See [this page](#new-model-registration) for instructions on how to register your new model to be used by vLLM.
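
For reference, the in-tree registration that this hunk now delegates to the registration page boils down to a single mapping entry in <gh-file:vllm/model_executor/models/registry.py>. The sketch below is illustrative only: the architecture name `YourModelForCausalLM` and module name `your_model` are placeholders, and the exact layout of the registry tables can vary between vLLM versions.

```python
# Illustrative sketch of a registry entry (not part of this commit).
# Keys are architecture names as they appear in the "architectures" field
# of a HuggingFace config.json; values point at (module, class) under
# vllm/model_executor/models. The real file may split this across
# several per-category dicts that are merged into _VLLM_MODELS.
_VLLM_MODELS = {
    "OPTForCausalLM": ("opt", "OPTForCausalLM"),
    "YourModelForCausalLM": ("your_model", "YourModelForCausalLM"),  # placeholder
}
```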

docs/source/contributing/model/index.md

Lines changed: 1 addition & 1 deletion
@@ -9,8 +9,8 @@ This section provides more information on how to integrate a [HuggingFace Transf
:maxdepth: 1

basic
+registration
multimodal
-oot
```

```{note}

docs/source/contributing/model/oot.md

Lines changed: 0 additions & 34 deletions
This file was deleted.
docs/source/contributing/model/registration.md

Lines changed: 56 additions & 0 deletions
@@ -0,0 +1,56 @@
+(new-model-registration)=
+
+# Model Registration
+
+vLLM relies on a model registry to determine how to run each model.
+A list of pre-registered architectures can be found on the [Supported Models](#supported-models) page.
+
+If your model is not on this list, you must register it to vLLM.
+This page provides detailed instructions on how to do so.
+
+## Built-in models
+
+To add a model directly to the vLLM library, start by forking our [GitHub repository](https://github.com/vllm-project/vllm) and then [build it from source](#build-from-source).
+This gives you the ability to modify the codebase and test your model.
+
+After you have implemented your model (see [tutorial](#new-model-basic)), put it into the <gh-dir:vllm/model_executor/models> directory.
+Then, add your model class to `_VLLM_MODELS` in <gh-file:vllm/model_executor/models/registry.py> so that it is automatically registered upon importing vLLM.
+You should also include an example HuggingFace repository for this model in <gh-file:tests/models/registry.py> to run the unit tests.
+Finally, update the [Supported Models](#supported-models) documentation page to promote your model!
+
+```{important}
+The list of models in each section should be maintained in alphabetical order.
+```
+
+## Out-of-tree models
+
+You can load an external model using a plugin without modifying the vLLM codebase.
+
+```{seealso}
+[vLLM's Plugin System](#plugin-system)
+```
+
+To register the model, use the following code:
+
+```python
+from vllm import ModelRegistry
+from your_code import YourModelForCausalLM
+ModelRegistry.register_model("YourModelForCausalLM", YourModelForCausalLM)
+```
+
+If your model imports modules that initialize CUDA, consider lazy-importing it to avoid errors like `RuntimeError: Cannot re-initialize CUDA in forked subprocess`:
+
+```python
+from vllm import ModelRegistry
+
+ModelRegistry.register_model("YourModelForCausalLM", "your_code:YourModelForCausalLM")
+```
+
+```{important}
+If your model is a multimodal model, ensure the model class implements the {class}`~vllm.model_executor.models.interfaces.SupportsMultiModal` interface.
+Read more about that [here](#enabling-multimodal-inputs).
+```
+
+```{note}
+Although you can directly put these code snippets in your script using `vllm.LLM`, the recommended way is to place these snippets in a vLLM plugin. This ensures compatibility with various vLLM features like distributed inference and the API server.
+```
