Commit 377d699

peterschmidt85 authored and pranitnaik committed

[Docs] Update services docs to reflect that gateway is now optional (dstackai#2005)

* [Docs] Updated services docs to reflect that gateway is now optional
* [Examples] Updated Llama 3.1 to replace tasks with services
* [Examples] Updated Llama 3.2 to replace tasks with services
* [Examples] Updated Llama 3.2 to replace tasks with services (bugfix)

1 parent: ae43c7c

15 files changed: +190 −315 lines

docs/docs/index.md

Lines changed: 4 additions & 2 deletions

````diff
@@ -22,9 +22,11 @@ for AI workloads both in the cloud and on-prem, speeding up the development, tra
 `dstack` supports the following configurations:
 
 * [Dev environments](dev-environments.md) — for interactive development using a desktop IDE
-* [Tasks](tasks.md) — for scheduling jobs (incl. distributed jobs) or running web apps
-* [Services](services.md) — for deployment of models and web apps (with auto-scaling and authorization)
+* [Tasks](tasks.md) — for scheduling jobs, incl. distributed ones (or running web apps)
+* [Services](services.md) — for deploying models (or web apps)
 * [Fleets](concepts/fleets.md) — for managing cloud and on-prem clusters
+* [Volumes](concepts/volumes.md) — for managing instance and network volumes (to persist data)
+* [Gateway](concepts/fleets.md) — for handling auto-scaling and ingress traffic
 
 Configuration can be defined as YAML files within your repo.
 
````
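The commit adds gateways to the configuration list as the component that handles auto-scaling and ingress traffic. For illustration, a minimal gateway configuration might look like the following sketch (not part of this commit; the name, backend, region, and domain values are hypothetical placeholders):

```yaml
type: gateway
# Hypothetical gateway name (placeholder)
name: example-gateway
# Backend and region to provision the gateway in (placeholders)
backend: aws
region: eu-west-1
# Wildcard domain to map to the gateway (placeholder)
domain: example.com
```

With such a gateway created, services become reachable at `https://<run name>.<gateway domain>` instead of the server's built-in `/proxy` endpoint.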

docs/docs/quickstart.md

Lines changed: 44 additions & 53 deletions

````diff
@@ -21,8 +21,7 @@ Your folder can be a regular local folder or a Git repo.
 
 === "Dev environment"
 
-    A dev environment lets you provision a remote machine with your code, dependencies, and resources, and access it
-    with your desktop IDE.
+    A dev environment lets you provision an instance and access it with your desktop IDE.
 
     ##### Define a configuration
 
@@ -32,18 +31,14 @@ Your folder can be a regular local folder or a Git repo.
 
     ```yaml
     type: dev-environment
-    # The name is optional, if not specified, generated randomly
     name: vscode
 
+    # If `image` is not specified, dstack uses its default image
     python: "3.11"
-    # Uncomment to use a custom Docker image
     #image: dstackai/base:py3.13-0.6-cuda-12.1
 
     ide: vscode
 
-    # Use either spot or on-demand instances
-    spot_policy: auto
-
     # Uncomment to request resources
     #resources:
     #  gpu: 24GB
@@ -78,24 +73,24 @@ Your folder can be a regular local folder or a Git repo.
 
     Open the link to access the dev environment using your desktop IDE.
 
+    Alternatively, you can access it via `ssh <run name>`.
+
 === "Task"
 
-    A task allows you to schedule a job or run a web app. It lets you configure
-    dependencies, resources, ports, the number of nodes (if you want to run the task on a cluster), etc.
+    A task allows you to schedule a job or run a web app. Tasks can be distributed and can forward ports.
 
     ##### Define a configuration
 
     Create the following configuration file inside the repo:
 
-    <div editor-title="streamlit.dstack.yml">
+    <div editor-title="examples/misc/streamlit/serve-task.dstack.yml">
 
     ```yaml
     type: task
-    # The name is optional, if not specified, generated randomly
     name: streamlit
 
+    # If `image` is not specified, dstack uses its default image
     python: "3.11"
-    # Uncomment to use a custom Docker image
     #image: dstackai/base:py3.13-0.6-cuda-12.1
 
     # Commands of the task
@@ -106,16 +101,17 @@ Your folder can be a regular local folder or a Git repo.
     ports:
       - 8501
 
-    # Use either spot or on-demand instances
-    spot_policy: auto
-
     # Uncomment to request resources
     #resources:
     #  gpu: 24GB
     ```
 
     </div>
 
+    By default, tasks run on a single instance. To run a distributed task, specify
+    [`nodes` and system environment variables](reference/dstack.yml/task.md#distributed-tasks),
+    and `dstack` will run it on a cluster.
+
     ##### Run the configuration
 
     Run the configuration via [`dstack apply`](reference/cli/index.md#dstack-apply):
@@ -144,50 +140,41 @@ Your folder can be a regular local folder or a Git repo.
 
     </div>
 
-    `dstack apply` automatically forwards the remote ports to `localhost` for convenient access.
+    If you specified `ports`, they will be automatically forwarded to `localhost` for convenient access.
 
 === "Service"
 
-    A service allows you to deploy a web app or a model as a scalable endpoint. It lets you configure
-    dependencies, resources, authorization, auto-scaling rules, etc.
-
-    ??? info "Prerequisites"
-        If you're using the open-source server, you must set up a [gateway](concepts/gateways.md) before you can run a service.
-
-        If you're using [dstack Sky :material-arrow-top-right-thin:{ .external }](https://sky.dstack.ai){:target="_blank"},
-        the gateway is already set up for you.
+    A service allows you to deploy a model or any web app as an endpoint.
 
     ##### Define a configuration
 
     Create the following configuration file inside the repo:
 
-    <div editor-title="streamlit-service.dstack.yml">
+    <div editor-title="examples/deployment/vllm/service.dstack.yml">
 
     ```yaml
    type: service
-    # The name is optional, if not specified, generated randomly
-    name: streamlit-service
+    name: llama31-service
 
+    # If `image` is not specified, dstack uses its default image
     python: "3.11"
-    # Uncomment to use a custom Docker image
     #image: dstackai/base:py3.13-0.6-cuda-12.1
 
-    # Commands of the service
+    # Required environment variables
+    env:
+      - HF_TOKEN
     commands:
-      - pip install streamlit
-      - streamlit hello
-    # Port of the service
-    port: 8501
-
-    # Comment to enable authorization
-    auth: False
+      - pip install vllm
+      - vllm serve meta-llama/Meta-Llama-3.1-8B-Instruct --max-model-len 4096
+    # Expose the vllm server port
+    port: 8000
 
-    # Use either spot or on-demand instances
-    spot_policy: auto
+    # Specify a name if it's an Open-AI compatible model
+    model: meta-llama/Meta-Llama-3.1-8B-Instruct
 
-    # Uncomment to request resources
-    #resources:
-    #  gpu: 24GB
+    # Required resources
+    resources:
+      gpu: 24GB
     ```
 
     </div>
@@ -213,28 +200,32 @@ Your folder can be a regular local folder or a Git repo.
     Provisioning `streamlit`...
     ---> 100%
 
-    Welcome to Streamlit. Check out our demo in your browser.
-
-    Local URL: https://streamlit-service.example.com
+    Service is published at:
+      http://localhost:3000/proxy/services/main/llama31-service
     ```
 
     </div>
 
-    Once the service is up, its endpoint is accessible at `https://<run name>.<gateway domain>`.
+    If you specified `model`, the model will also be available via an OpenAI-compatible endpoint at
+    `<dstack server URL>/proxy/models/<project name>`.
+
+    ??? info "Gateway"
+        By default, services run on a single instance. However, you can specify `replicas` and `target` to enable
+        [auto-scaling](reference/dstack.yml/service.md#auto-scaling).
 
-    > `dstack apply` automatically uploads the code from the current repo, including your local uncommitted changes.
+        Note, to use auto-scaling, a custom domain, or HTTPS, set up a
+        [gateway](concepts/gateways.md) before running the service.
+        A gateway is pre-configured for you if you are using [dstack Sky :material-arrow-top-right-thin:{ .external }](https://sky.dstack.ai){:target="_blank"}.
+
+    `dstack apply` automatically provisions instances, uploads the code from the current repo (incl. your local uncommitted changes).
 
 ## Troubleshooting
 
-Something not working? Make sure to check out the [troubleshooting](guides/troubleshooting.md) guide.
+Something not working? See the [troubleshooting](guides/troubleshooting.md) guide.
 
 ## What's next?
 
 1. Read about [dev environments](dev-environments.md), [tasks](tasks.md),
    [services](services.md), and [fleets](concepts/fleets.md)
-2. Browse [examples](https://dstack.ai/examples)
-3. Join the community via [Discord :material-arrow-top-right-thin:{ .external }](https://discord.gg/u8SmfwPpMd)
-
-!!! info "Examples"
-    To see how dev environments, tasks, services, and fleets can be used for
-    training and deploying AI models, check out the [examples](examples/index.md).
+2. Join [Discord :material-arrow-top-right-thin:{ .external }](https://discord.gg/u8SmfwPpMd)
+3. Browse [examples](https://dstack.ai/examples)
````
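The quickstart changes above note that specifying `nodes` turns a task into a distributed task run on a cluster. For illustration, such a configuration might look like this sketch (not part of this commit; the node count, command, and resources are hypothetical placeholders, and the `DSTACK_*` variables are the system environment variables the docs refer to):

```yaml
type: task
name: train-distrib
# Number of instances to provision as a cluster (placeholder)
nodes: 2
python: "3.11"
commands:
  # dstack sets DSTACK_NODES_NUM, DSTACK_NODE_RANK, and
  # DSTACK_MASTER_NODE_IP on each node of the cluster
  - torchrun
    --nnodes=$DSTACK_NODES_NUM
    --node-rank=$DSTACK_NODE_RANK
    --master-addr=$DSTACK_MASTER_NODE_IP
    --master-port=29500
    train.py
resources:
  gpu: 24GB
```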

docs/docs/services.md

Lines changed: 30 additions & 25 deletions

````diff
@@ -1,17 +1,8 @@
 # Services
 
-A service allows you to deploy a web app or a model as a scalable endpoint. It lets you configure
+A service allows you to deploy a model or a web app as an endpoint. It lets you configure
 dependencies, resources, authorization, auto-scaling rules, etc.
 
-Services are provisioned behind a [gateway](concepts/gateways.md) which provides an HTTPS endpoint mapped to your domain,
-handles authentication, distributes load, and performs auto-scaling.
-
-??? info "Gateways"
-    If you're using the open-source server, you must set up a [gateway](concepts/gateways.md) before you can run a service.
-
-    If you're using [dstack Sky :material-arrow-top-right-thin:{ .external }](https://sky.dstack.ai){:target="_blank"},
-    the gateway is already set up for you.
-
 ## Define a configuration
 
 First, create a YAML file in your project folder. Its name must end with `.dstack.yml` (e.g. `.dstack.yml` or
@@ -26,7 +17,7 @@ type: service
 name: llama31-service
 
 # If `image` is not specified, dstack uses its default image
-python: "3.10"
+python: "3.11"
 
 # Required environment variables
 env:
@@ -37,26 +28,31 @@ commands:
 # Expose the vllm server port
 port: 8000
 
+# Specify a name if it's an Open-AI compatible model
+model: meta-llama/Meta-Llama-3.1-8B-Instruct
+
 # Use either spot or on-demand instances
 spot_policy: auto
 
+# Required resources
 resources:
-  # Change to what is required
   gpu: 24GB
-
-# Comment out if you won't access the model via https://gateway.<gateway domain>
-model: meta-llama/Meta-Llama-3.1-8B-Instruct
 ```
 
 </div>
 
 If you don't specify your Docker image, `dstack` uses the [base](https://hub.docker.com/r/dstackai/base/tags) image
 (pre-configured with Python, Conda, and essential CUDA drivers).
 
-!!! info "Auto-scaling"
-    By default, the service is deployed to a single instance. However, you can specify the
-    [number of replicas and scaling policy](reference/dstack.yml/service.md#auto-scaling).
-    In this case, `dstack` auto-scales it based on the load.
+Note, the `model` property is optional and not needed when deploying a non-OpenAI-compatible model or a regular web app.
+
+!!! info "Gateway"
+    By default, services run on a single instance. However, you can specify `replicas` and `target` to enable
+    [auto-scaling](reference/dstack.yml/service.md#auto-scaling).
+
+    Note, to use auto-scaling, a custom domain, or HTTPS, set up a
+    [gateway](concepts/gateways.md) before running the service.
+    A gateway is pre-configured for you if you are using [dstack Sky :material-arrow-top-right-thin:{ .external }](https://sky.dstack.ai){:target="_blank"}.
 
 !!! info "Reference"
     See [.dstack.yml](reference/dstack.yml/service.md) for all the options supported by
@@ -83,7 +79,8 @@ Submit the run llama31-service? [y/n]: y
 Provisioning...
 ---> 100%
 
-Service is published at https://llama31-service.example.com
+Service is published at:
+  http://localhost:3000/proxy/services/main/llama31-service
 ```
 
 </div>
@@ -93,14 +90,17 @@ To avoid uploading large files, ensure they are listed in `.gitignore`.
 
 ## Access the endpoint
 
-One the service is up, its endpoint is accessible at `https://<run name>.<gateway domain>`.
+### Service
+
+If no gateway is created, the service's endpoint will be accessible at `<dstack server URL>
+/proxy/services/<project name>/<run name>`.
 
 By default, the service endpoint requires the `Authorization` header with `Bearer <dstack token>`.
 
 <div class="termy">
 
 ```shell
-$ curl https://llama31-service.example.com/v1/chat/completions \
+$ curl http://localhost:3000/proxy/services/main/llama31-service/v1/chat/completions \
   -H 'Content-Type: application/json' \
   -H 'Authorization: Bearer <dstack token>' \
   -d '{
@@ -119,10 +119,15 @@ $ curl https://llama31-service.example.com/v1/chat/completions \
 Authorization can be disabled by setting [`auth`](reference/dstack.yml/service.md#authorization) to `false` in the
 service configuration file.
 
-### Gateway endpoint
+> When a [gateway](concepts/gateways.md) is configured, the service endpoint will be accessible at `https://<run name>.<gateway domain>`.
+
+### Model
+
+If the service defines the `model` property, the model can be accessed with
+the OpenAI-compatible endpoint at `<dstack server URL>/proxy/models/<project name>`,
+or via the control plane UI's playground.
 
-In case the service has the [model mapping](reference/dstack.yml/service.md#model-mapping) configured, you will also be
-able to access the model at `https://gateway.<gateway domain>` via the OpenAI-compatible interface.
+> When a [gateway](concepts/gateways.md) is configured, the OpenAI-compatible endpoint is available at `https://gateway.<gateway domain>`.
 
 ## Manage runs
````

examples/llms/llama31/.dstack.yml

Lines changed: 0 additions & 20 deletions
This file was deleted.
