[Doc] Update OOT model docs #18742

Merged: 1 commit, May 27, 2025.
31 changes: 15 additions & 16 deletions docs/contributing/model/registration.md
@@ -23,33 +23,32 @@ Finally, update our [list of supported models][supported-models] to promote your

## Out-of-tree models

```diff
-You can load an external model using a plugin without modifying the vLLM codebase.
-
-!!! info
-    [vLLM's Plugin System][plugin-system]
+You can load an external model [using a plugin][plugin-system] without modifying the vLLM codebase.
```

To register the model, use the following code:

```diff
-from vllm import ModelRegistry
-from your_code import YourModelForCausalLM
-ModelRegistry.register_model("YourModelForCausalLM", YourModelForCausalLM)
+# The entrypoint of your plugin
+def register():
+    from vllm import ModelRegistry
+    from your_code import YourModelForCausalLM
+
+    ModelRegistry.register_model("YourModelForCausalLM", YourModelForCausalLM)
```

If your model imports modules that initialize CUDA, consider lazy-importing your model class to avoid errors like `RuntimeError: Cannot re-initialize CUDA in forked subprocess`:

```diff
-from vllm import ModelRegistry
-
-ModelRegistry.register_model(
-    "YourModelForCausalLM",
-    "your_code:YourModelForCausalLM"
-)
+# The entrypoint of your plugin
+def register():
+    from vllm import ModelRegistry
+
+    ModelRegistry.register_model(
+        "YourModelForCausalLM",
+        "your_code:YourModelForCausalLM"
+    )
```
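A quick way to confirm that `register()` took effect is to query the registry, using the same `get_supported_archs()` call that appears in the plugin docs below; a minimal sketch:

```python
# Minimal sketch: confirm the architecture was registered.
# Assumes register() from the snippet above has already run
# (vLLM invokes it automatically when the plugin is installed).
from vllm import ModelRegistry

assert "YourModelForCausalLM" in ModelRegistry.get_supported_archs()
```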

!!! warning
    If your model is a multimodal model, ensure the model class implements the [SupportsMultiModal][vllm.model_executor.models.interfaces.SupportsMultiModal] interface.
    Read more about that [here][supports-multimodal].

!!! note
    Although you can directly put these code snippets in your script using `vllm.LLM`, the recommended way is to place them in a vLLM plugin. This ensures compatibility with various vLLM features like distributed inference and the API server.
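The note above is worth illustrating: with the plugin installed, no registration code is needed in user scripts at all. A minimal usage sketch, assuming a checkpoint directory whose `config.json` names `YourModelForCausalLM` as its architecture (the path below is a placeholder):

```python
# Sketch: the plugin's register() runs automatically on vLLM startup,
# so the out-of-tree model loads like any built-in architecture.
from vllm import LLM

llm = LLM(model="path/to/your_model")  # placeholder checkpoint path
outputs = llm.generate("Hello, my name is")
print(outputs[0].outputs[0].text)
```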
Comment on lines -53 to -55

**@DarkLight1337** (Member, Author) commented on May 27, 2025:

> This doesn't work in V1 anymore as the workers are in separate processes. So I removed this option.
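The comment above is the key constraint: in V1, workers live in separate processes, so calling `ModelRegistry.register_model` only in the driver script never propagates to them. Plugins avoid this because every vLLM process loads them at startup; roughly (a sketch of the mechanism, details may differ between versions):

```python
# Sketch: each vLLM process (driver and workers) loads general plugins
# during initialization, so the plugin's register() runs everywhere.
from vllm.plugins import load_general_plugins

load_general_plugins()  # called internally by vLLM in every process
```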

6 changes: 4 additions & 2 deletions docs/design/plugin_system.md
@@ -30,8 +30,10 @@ def register():
```diff
     from vllm import ModelRegistry

     if "MyLlava" not in ModelRegistry.get_supported_archs():
-        ModelRegistry.register_model("MyLlava",
-                                     "vllm_add_dummy_model.my_llava:MyLlava")
+        ModelRegistry.register_model(
+            "MyLlava",
+            "vllm_add_dummy_model.my_llava:MyLlava",
+        )
```

For more information on adding entry points to your package, please check the [official documentation](https://setuptools.pypa.io/en/latest/userguide/entry_point.html).
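For concreteness, a minimal `setup.py` sketch that exposes the `register()` function above through the `vllm.general_plugins` entry point group (the package and entry point names follow the example; adapt them to your project):

```python
# Sketch: declare the plugin entrypoint so vLLM can discover it.
from setuptools import setup

setup(
    name="vllm_add_dummy_model",
    version="0.1",
    packages=["vllm_add_dummy_model"],
    entry_points={
        "vllm.general_plugins": [
            "register_dummy_model = vllm_add_dummy_model:register",
        ],
    },
)
```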