Commit f8490b4

jvstme authored and pranitnaik committed
[Docs]: Services without a gateway (dstackai#2011)
- Update all the spots where gateway was mentioned as being required for services
- Do not emphasize that gateways are needed for auto-scaling. This is a temporary restriction and users will see a clear error message if they attempt to use auto-scaling without a gateway. Emphasize that gateways provide a custom domain and HTTPS, which is their main value.
- In Protips, add a comparison of running web apps as Tasks vs Services without a gateway vs Services with a gateway
- Add a trailing slash in `/proxy/services/.../` to be consistent with CLI and avoid redirects
- Other edits here and there
1 parent 1da4902 commit f8490b4

20 files changed: +90 -78 lines changed

.gitignore

Lines changed: 1 addition & 0 deletions
Original file line number | Diff line number | Diff line change
@@ -4,6 +4,7 @@
44

55
dist/
66
venv/
7+
/site/
78
/.cache/
89
.pytest_cache/
910
.coverage

docs/blog/posts/amd-on-runpod.md

Lines changed: 2 additions & 2 deletions
@@ -101,8 +101,8 @@ Once the configuration is ready, run `dstack apply -f <configuration file>`, and
101101
cloud resources and run the configuration.
102102

103103
??? info "Control plane"
104-
If you specify `model` when running a service, `dstack` will automatically register the model on the gateway's global
105-
endpoint and allow you to use it for chat via the control plane UI.
104+
If you specify `model` when running a service, `dstack` will automatically register the model on
105+
an OpenAI-compatible endpoint and allow you to use it for chat via the control plane UI.
106106

107107
<img src="https://github.com/dstackai/static-assets/blob/main/static-assets/images/dstack-control-plane-model-llama31.png?raw=true" width="750px" />
108108

docs/blog/posts/dstack-sky.md

Lines changed: 1 addition & 1 deletion
@@ -77,7 +77,7 @@ With `dstack Sky` you can use all of `dstack`'s features, incl. [dev environment
7777
[tasks](../../docs/tasks.md), [services](../../docs/services.md), and
7878
[fleets](../../docs/concepts/fleets.md).
7979

80-
To use services, the open-source version requires setting up a gateway with your own domain.
80+
To publish services, the open-source version requires setting up a gateway with your own domain.
8181
`dstack Sky` comes with a pre-configured gateway.
8282

8383
<div class="termy">

docs/blog/posts/tpu-on-gcp.md

Lines changed: 2 additions & 2 deletions
@@ -120,8 +120,8 @@ and [vLLM :material-arrow-top-right-thin:{ .external }](https://github.com/vllm-
120120
</div>
121121

122122
??? info "Control plane"
123-
If you specify `model` when running a service, `dstack` will automatically register the model on the gateway's global
124-
endpoint and allow you to use it for chat via the control plane UI.
123+
If you specify `model` when running a service, `dstack` will automatically register the model on
124+
an OpenAI-compatible endpoint and allow you to use it for chat via the control plane UI.
125125

126126
<img src="https://github.com/dstackai/static-assets/blob/main/static-assets/images/dstack-control-plane-model-llama31.png?raw=true" width="750px" />
127127

docs/docs/concepts/fleets.md

Lines changed: 1 addition & 1 deletion
@@ -205,7 +205,7 @@ on the hosts specified in `ssh_config`.
205205

206206
### List fleets
207207

208-
The [`dstack fleet`](../reference/cli/index.md#dstack-gateway-list) command lists fleet instances and theri status:
208+
The [`dstack fleet`](../reference/cli/index.md#dstack-fleet-list) command lists fleet instances and their status:
209209

210210
<div class="termy">
211211

docs/docs/concepts/gateways.md

Lines changed: 7 additions & 6 deletions
@@ -1,10 +1,9 @@
11
# Gateways
22

3-
Gateways manage the ingress traffic of running services and provide them with an HTTPS endpoint mapped to your domain,
3+
Gateways manage the ingress traffic of running [services](../services.md)
4+
and provide them with an HTTPS endpoint mapped to your domain,
45
handling authentication, load distribution, and auto-scaling.
56

6-
To run a service, you need at least one gateway set up.
7-
87
> If you're using [dstack Sky :material-arrow-top-right-thin:{ .external }](https://sky.dstack.ai){:target="_blank"},
98
> the gateway is already set up for you.
109
@@ -43,20 +42,22 @@ To create or update the gateway, simply call the [`dstack apply`](../reference/c
4342
<div class="termy">
4443

4544
```shell
46-
$ dstack apply . -f examples/deployment/gateway.dstack.yml
45+
$ dstack apply -f gateway.dstack.yml
4746
The example-gateway doesn't exist. Create it? [y/n]: y
4847
4948
BACKEND REGION NAME HOSTNAME DOMAIN DEFAULT STATUS
5049
aws eu-west-1 example-gateway example.com ✓ submitted
51-
5250
```
5351

5452
</div>
5553

5654
## Update DNS records
5755

5856
Once the gateway is assigned a hostname, go to your domain's DNS settings
59-
and add an `A` DNS record for `*.<gateway domain>` (e.g., `*.example.com`) pointing to the gateway's hostname.
57+
and add a DNS record for `*.<gateway domain>`, e.g. `*.example.com`.
58+
The record should point to the gateway's hostname shown in `dstack`
59+
and should be of type `A` if the hostname is an IP address (most cases),
60+
or of type `CNAME` if the hostname is another domain (some private gateways and Kubernetes).
6061

6162
## Manage gateways
6263
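
The record-type rule added to `gateways.md` above can be sketched in Python. This is only an illustration: `dns_record_type` is a hypothetical helper, not part of `dstack`, and the `AAAA` branch for IPv6 addresses is an assumption that the documentation change itself does not mention.

```python
import ipaddress


def dns_record_type(hostname: str) -> str:
    """Pick the DNS record type for a gateway hostname (hypothetical helper)."""
    try:
        address = ipaddress.ip_address(hostname)
    except ValueError:
        # Not an IP literal, so the hostname is another domain name.
        return "CNAME"
    # IPv4 addresses use A records; the AAAA case is an assumption, not from the docs.
    return "A" if address.version == 4 else "AAAA"


# Example values below are placeholders, not real gateway hostnames.
print(dns_record_type("203.0.113.10"))
print(dns_record_type("gateway.internal.example.net"))
```

In both cases the record name would be the wildcard `*.<gateway domain>`; only the record type depends on the form of the hostname.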

docs/docs/guides/protips.md

Lines changed: 20 additions & 3 deletions
@@ -93,9 +93,26 @@ This allows you to access the remote `8501` port on `localhost:8501` while the C
9393

9494
This will forward the remote `8501` port to `localhost:3000`.
9595

96-
[Services](../services.md) require a gateway but they also provide additional features for
97-
production-grade service deployment not offered by tasks, such as HTTPS domains and auto-scaling.
98-
If you run a web app as a task and it works, go ahead and run it as a service.
96+
[Services](../services.md) provide additional features not offered by tasks,
97+
such as authorization, load balancing, auto-scaling, an OpenAI-compatible endpoint for models, etc.
98+
99+
Unlike tasks, services are accessible throughout their lifetime, not only when the CLI is attached.
100+
By default, they are published at `<dstack server URL>/proxy/services/<project name>/<run name>/`.
101+
Additionally, if your project has a [gateway](../concepts/gateways.md),
102+
services can be published at a custom domain with HTTPS instead.
103+
104+
So what should you choose for running a web app? Here are some suggestions:
105+
106+
- If you are running a simple app that you only need temporarily, consider **tasks**.
107+
- If your app needs to be available at all times or if it needs to benefit from advanced features
108+
such as authorization or load balancing, use **services**.
109+
- If the service will only be accessed by you and other `dstack` users and supports running
110+
behind a URL path prefix, **no gateway** is needed.
111+
- If the service requires public access, a custom domain, HTTPS, or increased network throughput,
112+
**create a gateway** first.
113+
114+
??? info "Auto-scaling and WebSockets"
115+
Services using WebSockets or auto-scaling currently require a gateway.
99116

100117
## Docker and Docker Compose
101118
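
The suggestions added to `protips.md` above can be read as a small decision procedure. The sketch below is one possible encoding of them, under stated assumptions: `suggest_run_type` and its parameter names are hypothetical, and the flags collapse the prose criteria into booleans for brevity.

```python
def suggest_run_type(
    *,
    needs_always_on: bool,
    needs_service_features: bool,
    needs_gateway_features: bool,
    supports_path_prefix: bool = True,
) -> str:
    """Map the Protips suggestions to a recommendation (hypothetical helper).

    needs_service_features: authorization, load balancing, and similar.
    needs_gateway_features: public access, a custom domain, HTTPS, increased
        network throughput, auto-scaling, or WebSockets.
    supports_path_prefix: whether the app can run behind a URL path prefix.
    """
    if needs_gateway_features:
        return "service with a gateway"
    if needs_always_on or needs_service_features:
        # Without a gateway, the service is published under /proxy/services/...,
        # so the app has to tolerate a URL path prefix.
        if supports_path_prefix:
            return "service without a gateway"
        return "service with a gateway"
    return "task"


print(suggest_run_type(
    needs_always_on=False,
    needs_service_features=False,
    needs_gateway_features=False,
))
```

The gateway check comes first because auto-scaling and WebSockets currently require a gateway regardless of the other criteria.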

docs/docs/guides/troubleshooting.md

Lines changed: 3 additions & 6 deletions
@@ -84,8 +84,9 @@ was using spot instances and was interrupted. To address this, you can either se
8484

8585
#### Gateway configuration
8686

87-
The most common reason a service fails to start is either because you haven’t [created a gateway](../concepts/gateways.md) or haven’t set up the
88-
correct DNS record pointing to the gateway's hostname.
87+
If all services fail to start with a specific gateway, make sure a
88+
[correct DNS record](../concepts/gateways.md#update-dns-records)
89+
pointing to the gateway's hostname is configured.
8990

9091
### Service endpoint doesn't work
9192

@@ -94,10 +95,6 @@ correct DNS record pointing to the gateway's hostname.
9495
If the service endpoint returns a 403 error, it is likely because the [`Authorization`](../services.md#access-the-endpoint)
9596
header with the correct `dstack` token was not provided.
9697

97-
#### SSH fleets
98-
99-
If you attempt to run a service on an SSH fleet, it won't work due to a [known issue :material-arrow-top-right-thin:{ .external }](https://github.com/dstackai/dstack/issues/1640){:target="_blank"} that is expected to be fixed soon.
100-
10198
[//]: # (#### Other)
10299
[//]: # (TODO: Explain how to get the gateway logs)
103100

docs/docs/index.md

Lines changed: 2 additions & 2 deletions
@@ -25,8 +25,8 @@ for AI workloads both in the cloud and on-prem, speeding up the development, tra
2525
* [Tasks](tasks.md) &mdash; for scheduling jobs, incl. distributed ones (or running web apps)
2626
* [Services](services.md) &mdash; for deploying models (or web apps)
2727
* [Fleets](concepts/fleets.md) &mdash; for managing cloud and on-prem clusters
28-
* [Volumes](concepts/volumes.md) &mdash; for managing instance and network volumes (to persist data)
29-
* [Gateway](concepts/fleets.md) &mdash; for handling auto-scaling and ingress traffic
28+
* [Volumes](concepts/volumes.md) &mdash; for managing network volumes (to persist data)
29+
* [Gateways](concepts/gateways.md) &mdash; for publishing services with a custom domain and HTTPS
3030

3131
Configuration can be defined as YAML files within your repo.
3232

docs/docs/quickstart.md

Lines changed: 14 additions & 21 deletions
@@ -109,7 +109,7 @@ Your folder can be a regular local folder or a Git repo.
109109
</div>
110110

111111
By default, tasks run on a single instance. To run a distributed task, specify
112-
[`nodes` and system environment variables](reference/dstack.yml/task.md#distributed-tasks),
112+
[`nodes`](reference/dstack.yml/task.md#distributed-tasks),
113113
and `dstack` will run it on a cluster.
114114

115115
##### Run the configuration
@@ -119,16 +119,14 @@ Your folder can be a regular local folder or a Git repo.
119119
<div class="termy">
120120

121121
```shell
122-
$ dstack apply -f streamlit.dstack.yml
122+
$ dstack apply -f serve-task.dstack.yml
123123

124124
# BACKEND REGION RESOURCES SPOT PRICE
125125
1 gcp us-west4 2xCPU, 8GB, 100GB (disk) yes $0.010052
126126
2 azure westeurope 2xCPU, 8GB, 100GB (disk) yes $0.0132
127127
3 gcp europe-central2 2xCPU, 8GB, 100GB (disk) yes $0.013248
128128
129129
Submit the run streamlit? [y/n]: y
130-
131-
Continue? [y/n]: y
132130

133131
Provisioning `streamlit`...
134132
---> 100%
@@ -169,7 +167,7 @@ Your folder can be a regular local folder or a Git repo.
169167
# Expose the vllm server port
170168
port: 8000
171169

172-
# Specify a name if it's an Open-AI compatible model
170+
# Specify a name if it's an OpenAI-compatible model
173171
model: meta-llama/Meta-Llama-3.1-8B-Instruct
174172

175173
# Required resources
@@ -186,36 +184,31 @@ Your folder can be a regular local folder or a Git repo.
186184
<div class="termy">
187185

188186
```shell
189-
$ dstack apply -f streamlit.dstack.yml
187+
$ dstack apply -f service.dstack.yml
190188

191-
# BACKEND REGION RESOURCES SPOT PRICE
192-
1 gcp us-west4 2xCPU, 8GB, 100GB (disk) yes $0.010052
193-
2 azure westeurope 2xCPU, 8GB, 100GB (disk) yes $0.0132
194-
3 gcp europe-central2 2xCPU, 8GB, 100GB (disk) yes $0.013248
189+
# BACKEND REGION INSTANCE RESOURCES SPOT PRICE
190+
1 aws us-west-2 g5.4xlarge 16xCPU, 64GB, 1xA10G (24GB) yes $0.22
191+
2 aws us-east-2 g6.xlarge 4xCPU, 16GB, 1xL4 (24GB) yes $0.27
192+
3 gcp us-west1 g2-standard-4 4xCPU, 16GB, 1xL4 (24GB) yes $0.27
195193
196-
Submit the run streamlit? [y/n]: y
197-
198-
Continue? [y/n]: y
194+
Submit the run llama31-service? [y/n]: y
199195

200-
Provisioning `streamlit`...
196+
Provisioning `llama31-service`...
201197
---> 100%
202198

203199
Service is published at:
204-
http://localhost:3000/proxy/services/main/llama31-service
200+
http://localhost:3000/proxy/services/main/llama31-service/
205201
```
206202

207203
</div>
208204

209205
If you specified `model`, the model will also be available via an OpenAI-compatible endpoint at
210206
`<dstack server URL>/proxy/models/<project name>`.
211207

212-
??? info "Gateway"
213-
By default, services run on a single instance. However, you can specify `replicas` and `target` to enable
214-
[auto-scaling](reference/dstack.yml/service.md#auto-scaling).
215-
216-
Note, to use auto-scaling, a custom domain, or HTTPS, set up a
208+
!!! info "Gateway"
209+
To publish a service with a custom domain and HTTPS, set up a
217210
[gateway](concepts/gateways.md) before running the service.
218-
A gateway pre-configured for you if you are using [dstack Sky :material-arrow-top-right-thin:{ .external }](https://sky.dstack.ai){:target="_blank"}.
211+
A gateway is pre-configured for you if you are using [dstack Sky :material-arrow-top-right-thin:{ .external }](https://sky.dstack.ai){:target="_blank"}.
219212

220213
`dstack apply` automatically provisions instances, uploads the code from the current repo (incl. your local uncommitted changes).
221214

docs/docs/reference/cli/index.md

Lines changed: 2 additions & 7 deletions
@@ -68,10 +68,6 @@ $ dstack delete --help
6868

6969
</div>
7070

71-
!!! info "NOTE:"
72-
The `dstack delete` command currently supports only `gateway` configurations.
73-
Support for other configuration types is coming soon.
74-
7571
### dstack ps
7672

7773
This command shows the status of runs.
@@ -189,8 +185,7 @@ $ dstack fleet delete --help
189185

190186
### dstack gateway
191187

192-
A gateway is required for running services. It handles ingress traffic, authorization, domain mapping, model mapping
193-
for the OpenAI-compatible endpoint, and so on.
188+
A gateway allows publishing services at a custom domain with HTTPS.
194189

195190
##### dstack gateway list
196191

@@ -427,7 +422,7 @@ $ dstack pool delete --help
427422

428423
??? info "Internal environment variables"
429424
* `DSTACK_SERVER_ROOT_LOG_LEVEL` – (Optional) Sets root logger log level. Defaults to `ERROR`.
430-
* `DSTACK_SERVER_LOG_FORMAT` – (Optional) Sets format of log output. Can be `rich`, `standard`, `json`.. Defaults to `rich`.
425+
* `DSTACK_SERVER_LOG_FORMAT` – (Optional) Sets format of log output. Can be `rich`, `standard`, `json`. Defaults to `rich`.
431426
* `DSTACK_SERVER_UVICORN_LOG_LEVEL` – (Optional) Sets uvicorn logger log level. Defaults to `ERROR`.
432427
* `DSTACK_PROFILE` – (Optional) Has the same effect as `--profile`. Defaults to `None`.
433428
* `DSTACK_PROJECT` – (Optional) Has the same effect as `--project`. Defaults to `None`.

docs/docs/reference/dstack.yml/gateway.md

Lines changed: 1 addition & 1 deletion
@@ -1,6 +1,6 @@
11
# gateway
22

3-
The `gateway` configuration type allows creating and updating [gateways](../../services.md).
3+
The `gateway` configuration type allows creating and updating [gateways](../../concepts/gateways.md).
44

55
> Configuration files must be inside the project repo, and their names must end with `.dstack.yml`
66
> (e.g. `.dstack.yml` or `gateway.dstack.yml` are both acceptable).

docs/docs/reference/dstack.yml/service.md

Lines changed: 12 additions & 8 deletions
@@ -108,13 +108,11 @@ If you want, you can specify your own Docker image via `image`.
108108
All backends except `runpod`, `vastai` and `kubernetes` also allow to use [Docker and Docker Compose](../../guides/protips.md#docker-and-docker-compose)
109109
inside `dstack` runs.
110110

111-
### Model gateway { #model-mapping }
112-
113-
By default, if you run a service, its endpoint is accessible at `https://<run name>.<gateway domain>`.
111+
### Models { #model-mapping }
114112

115113
If you are running a chat model with an OpenAI-compatible interface,
116-
you can optionally set the [`model`](#model) property to make the model accessible via
117-
the model gateway provided by `dstack`.
114+
set the [`model`](#model) property to make the model accessible via
115+
the OpenAI-compatible endpoint provided by `dstack`.
118116

119117
<div editor-title="service.dstack.yml">
120118

@@ -138,7 +136,7 @@ resources:
138136
# Change to what is required
139137
gpu: 24GB
140138
141-
# Make the model accessible at https://gateway.<gateway domain>
139+
# Register the model
142140
model: meta-llama/Meta-Llama-3.1-8B-Instruct
143141
144142
# Alternatively, use this syntax to set more model settings:
@@ -151,8 +149,9 @@ model: meta-llama/Meta-Llama-3.1-8B-Instruct
151149

152150
</div>
153151

154-
With such a configuration, once the service is up, you'll be able to access the model at
155-
`https://gateway.<gateway domain>` via the OpenAI-compatible interface.
152+
Once the service is up, the model will be available via the OpenAI-compatible endpoint
153+
at `<dstack server URL>/proxy/models/<project name>`
154+
or at `https://gateway.<gateway domain>` if your project has a gateway.
156155

157156
### Auto-scaling
158157

@@ -199,6 +198,11 @@ The [`replicas`](#replicas) property can be a number or a range.
199198

200199
Setting the minimum number of replicas to `0` allows the service to scale down to zero when there are no requests.
201200

201+
!!! info "Gateway"
202+
Services with a fixed number of replicas are supported both with and without a
203+
[gateway](../../concepts/gateways.md).
204+
Auto-scaling is currently only supported for services running with a gateway.
205+
202206
### Resources { #_resources }
203207

204208
If you specify memory size, you can either specify an explicit size (e.g. `24GB`) or a

docs/docs/services.md

Lines changed: 6 additions & 9 deletions
@@ -28,7 +28,7 @@ commands:
2828
# Expose the vllm server port
2929
port: 8000
3030

31-
# Specify a name if it's an Open-AI compatible model
31+
# Specify a name if it's an OpenAI-compatible model
3232
model: meta-llama/Meta-Llama-3.1-8B-Instruct
3333

3434
# Use either spot or on-demand instances
@@ -47,12 +47,9 @@ If you don't specify your Docker image, `dstack` uses the [base](https://hub.doc
4747
Note, the `model` property is optional and not needed when deploying a non-OpenAI-compatible model or a regular web app.
4848

4949
!!! info "Gateway"
50-
By default, services run on a single instance. However, you can specify `replicas` and `target` to enable
51-
[auto-scaling](reference/dstack.yml/service.md#auto-scaling).
52-
53-
Note, to use auto-scaling, a custom domain, or HTTPS, set up a
50+
To publish a service with a custom domain and HTTPS, set up a
5451
[gateway](concepts/gateways.md) before running the service.
55-
A gateway pre-configured for you if you are using [dstack Sky :material-arrow-top-right-thin:{ .external }](https://sky.dstack.ai){:target="_blank"}.
52+
A gateway is pre-configured for you if you are using [dstack Sky :material-arrow-top-right-thin:{ .external }](https://sky.dstack.ai){:target="_blank"}.
5653

5754
!!! info "Reference"
5855
See [.dstack.yml](reference/dstack.yml/service.md) for all the options supported by
@@ -80,7 +77,7 @@ Provisioning...
8077
---> 100%
8178
8279
Service is published at:
83-
http://localhost:3000/proxy/services/main/llama31-service
80+
http://localhost:3000/proxy/services/main/llama31-service/
8481
```
8582

8683
</div>
@@ -92,8 +89,8 @@ To avoid uploading large files, ensure they are listed in `.gitignore`.
9289

9390
### Service
9491

95-
If no gateway is created, the service’s endpoint will be accessible at `<dstack server URL>
96-
/proxy/services/<project name>/<run name>`.
92+
If no gateway is created, the service’s endpoint will be accessible at
93+
`<dstack server URL>/proxy/services/<project name>/<run name>/`.
9794

9895
By default, the service endpoint requires the `Authorization` header with `Bearer <dstack token>`.
9996
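
As a rough illustration of the `services.md` change above, the following Python sketch builds the gateway-less service URL with the trailing slash and attaches the `Authorization` header. The helper names are hypothetical, the server URL, project, and run name are taken from the example output in the diff, and the token is a placeholder. No request is actually sent.

```python
import urllib.request


def service_url(server_url: str, project: str, run_name: str) -> str:
    """Build the service URL used when no gateway exists (hypothetical helper)."""
    # The trailing slash matches the CLI output and avoids a redirect.
    return f"{server_url.rstrip('/')}/proxy/services/{project}/{run_name}/"


def build_request(url: str, token: str) -> urllib.request.Request:
    """Prepare a request carrying the dstack token as a Bearer credential."""
    return urllib.request.Request(url, headers={"Authorization": f"Bearer {token}"})


url = service_url("http://localhost:3000", "main", "llama31-service")
request = build_request(url, "<dstack token>")  # placeholder token
print(request.full_url)
```

Passing `request` to `urllib.request.urlopen` would perform the call against a running `dstack` server; the path to request after the trailing slash depends on the app being served.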

examples/deployment/nim/README.md

Lines changed: 3 additions & 2 deletions
@@ -68,8 +68,9 @@ Provisioning...
6868
```
6969
</div>
7070

71-
If no gateway is created, the service’s endpoint will be accessible at
72-
`<dstack server URL>/proxy/services/<project name>/<run name>`.
71+
Once the service is up, the model will be available via the OpenAI-compatible endpoint
72+
at `<dstack server URL>/proxy/models/<project name>`
73+
or at `https://gateway.<gateway domain>` if your project has a gateway.
7374

7475
<div class="termy">
7576

examples/deployment/tgi/README.md

Lines changed: 3 additions & 2 deletions
@@ -69,8 +69,9 @@ Provisioning...
6969
```
7070
</div>
7171

72-
If no gateway is created, the service’s endpoint will be accessible at
73-
`<dstack server URL>/proxy/services/<project name>/<run name>`.
72+
Once the service is up, the model will be available via the OpenAI-compatible endpoint
73+
at `<dstack server URL>/proxy/models/<project name>`
74+
or at `https://gateway.<gateway domain>` if your project has a gateway.
7475

7576
<div class="termy">
7677
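
The two locations of the OpenAI-compatible endpoint named in the README changes above can be sketched as a single lookup. This is a minimal sketch, assuming the gateway address is the one to use whenever the project has a gateway; `models_base_url` is a hypothetical helper and `example.com` is a placeholder domain.

```python
from typing import Optional


def models_base_url(
    server_url: str,
    project: str,
    gateway_domain: Optional[str] = None,
) -> str:
    """Return where the OpenAI-compatible endpoint is published (hypothetical helper)."""
    if gateway_domain:
        # With a gateway, models are served from the gateway.<gateway domain> host.
        return f"https://gateway.{gateway_domain}"
    # Without a gateway, the dstack server proxies the models for the project.
    return f"{server_url.rstrip('/')}/proxy/models/{project}"


print(models_base_url("http://localhost:3000", "main"))
print(models_base_url("http://localhost:3000", "main", gateway_domain="example.com"))
```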
