This repository was archived by the owner on Oct 11, 2024. It is now read-only.

Commit 90de296: Refactor Prometheus and Add Request Level Metrics (vllm-project#2316)

robertgshaw2-redhat authored and horheynm committed

1 parent ab469e5

File tree: 7 files changed, +1234 −102 lines changed

Lines changed: 54 additions & 0 deletions
# vLLM + Prometheus/Grafana

This is a simple example that shows how to connect vLLM metric logging to the Prometheus/Grafana stack. In this example, we launch Prometheus and Grafana via Docker. Other deployment methods are covered on the [Prometheus](https://prometheus.io/) and [Grafana](https://grafana.com/) websites.

Install:

- [`docker`](https://docs.docker.com/engine/install/)
- [`docker compose`](https://docs.docker.com/compose/install/linux/#install-using-the-repository)

### Launch

Prometheus metric logging is enabled by default in the OpenAI-compatible server. Launch it via the entrypoint:
```bash
python3 -m vllm.entrypoints.openai.api_server \
    --model mistralai/Mistral-7B-v0.1 \
    --max-model-len 2048 \
    --disable-log-requests
```
Launch the Prometheus and Grafana servers with `docker compose`:

```bash
docker compose up
```
Submit some sample requests to the server:

```bash
wget https://huggingface.co/datasets/anon8231489123/ShareGPT_Vicuna_unfiltered/resolve/main/ShareGPT_V3_unfiltered_cleaned_split.json

python3 ../../benchmarks/benchmark_serving.py \
    --model mistralai/Mistral-7B-v0.1 \
    --tokenizer mistralai/Mistral-7B-v0.1 \
    --endpoint /v1/completions \
    --dataset ShareGPT_V3_unfiltered_cleaned_split.json \
    --request-rate 3.0
```
Navigating to [`http://localhost:8000/metrics`](http://localhost:8000/metrics) will show the raw Prometheus metrics being exposed by vLLM.
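The endpoint serves the standard Prometheus text exposition format, so you can filter it down to vLLM's own series with ordinary shell tools. A small sketch; the `vllm:` metric-name prefix and the sample payload are illustrative assumptions, not output captured from a live server:

```shell
# Keep only vLLM's own sample lines (skips the "# HELP" / "# TYPE" comments).
filter_vllm() { grep '^vllm:'; }

# Against a live server:
#   curl -s http://localhost:8000/metrics | filter_vllm
# Offline demo on a hypothetical exposition-format payload:
filter_vllm <<'EOF'
# HELP vllm:num_requests_running Number of requests currently running on GPU.
# TYPE vllm:num_requests_running gauge
vllm:num_requests_running 3.0
EOF
# prints: vllm:num_requests_running 3.0
```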
### Grafana Dashboard

Navigate to [`http://localhost:3000`](http://localhost:3000). Log in with the default username (`admin`) and password (`admin`).
#### Add Prometheus Data Source

Navigate to [`http://localhost:3000/connections/datasources/new`](http://localhost:3000/connections/datasources/new) and select Prometheus.

On the Prometheus configuration page, set the `Prometheus Server URL` under `Connection`. In this setup, Grafana and Prometheus run in separate containers, but Docker Compose creates a DNS name for each container, so you can simply use `http://prometheus:9090`.

Click `Save & Test`. You should see a green check saying "Successfully queried the Prometheus API."
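If you prefer to script this step rather than use the UI, Grafana also exposes a data source HTTP API (`POST /api/datasources`). A sketch, assuming the default `admin`/`admin` credentials from above; the `name` field is an arbitrary choice:

```shell
# JSON body for Grafana's POST /api/datasources endpoint.
payload='{
  "name": "prometheus",
  "type": "prometheus",
  "url": "http://prometheus:9090",
  "access": "proxy"
}'

# Create the data source; prints a note instead of failing if Grafana is down.
curl -sf -u admin:admin \
  -H 'Content-Type: application/json' \
  -d "$payload" \
  http://localhost:3000/api/datasources \
  || echo "Grafana not reachable at localhost:3000"
```

Note that `access: "proxy"` tells Grafana's backend (not the browser) to query Prometheus, which is why the container DNS name works here.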
#### Import Dashboard

Navigate to [`http://localhost:3000/dashboard/import`](http://localhost:3000/dashboard/import), upload `grafana.json`, and select the `prometheus` data source. You should see a screen that looks like the following:

![Grafana Dashboard Image](https://i.imgur.com/R2vH9VW.png)
Lines changed: 19 additions & 0 deletions
# docker-compose.yaml
version: "3"

services:
  prometheus:
    image: prom/prometheus:latest
    extra_hosts:
      - "host.docker.internal:host-gateway" # allow a direct connection from the container to the host machine
    ports:
      - "9090:9090" # the default port used by Prometheus
    volumes:
      - ${PWD}/prometheus.yaml:/etc/prometheus/prometheus.yml # mount the Prometheus config file

  grafana:
    image: grafana/grafana:latest
    depends_on:
      - prometheus
    ports:
      - "3000:3000" # the default port used by Grafana
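The compose file above mounts a `prometheus.yaml` that is not shown in this diff hunk. A minimal sketch of what such a scrape config could look like, assuming vLLM serves on the host's port 8000 and is reached through the `host.docker.internal` alias defined above; the job name and scrape interval are illustrative:

```yaml
# prometheus.yaml (sketch, not from this commit)
global:
  scrape_interval: 5s

scrape_configs:
  - job_name: vllm
    static_configs:
      - targets:
          - "host.docker.internal:8000"
```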
