Commit 6cfc996

OLS-1622: Document token quota feature
1 parent 83ff03e commit 6cfc996

3 files changed, +83 -0 lines changed

configure/ols-configuring-openshift-lightspeed.adoc

+2
@@ -23,3 +23,5 @@ include::modules/ols-about-lightspeed-and-role-based-access-control.adoc[levelof
2323
include::modules/ols-granting-access-to-individual-users.adoc[leveloffset=+2]
2424
include::modules/ols-granting-access-to-user-group.adoc[leveloffset=+2]
2525
include::modules/ols-filtering-and-redacting-information.adoc[leveloffset=+1]
26+
include::modules/ols-tokens-and-token-quota-limits.adoc[leveloffset=+1]
27+
include::modules/ols-activating-token-quota-limits.adoc[leveloffset=+2]
@@ -0,0 +1,69 @@
1+
// Module included in the following assemblies:
2+
// * TBD
3+
4+
:_mod-docs-content-type: PROCEDURE
5+
[id="ols-activating-token-quota-limits_{context}"]
6+
= Activating token quota limits
7+
8+
Modify the `ConfigMap` file to activate token quota limits for the {ols-long} Service.
9+
10+
.Prerequisites
11+
12+
* You have installed the {ols-long} Operator.
13+
14+
* You have configured a large language model (LLM) provider.
15+
16+
* A PostgreSQL database is configured and the {ols-long} Service can access the database.
17+
18+
.Procedure
19+
20+
. Open the {ols-long} `ConfigMap` file by running the following command:
21+
+
22+
[source,terminal]
23+
----
24+
$ oc edit configmap <configmap_name>
25+
----
26+
27+
. Modify the `data` property of the `ConfigMap` file to include token quota limit information. The following example defines the configuration in a file using key-value pairs. The {ols-long} pod mounts the `ConfigMap` resource as a volume, enabling access to the file stored within it. The `OLSConfig` Custom Resource (CR) references the `ConfigMap` resource to obtain the quota limit information.
28+
+
29+
.Example {ols-long} `ConfigMap` file
30+
[source,yaml]
31+
----
32+
apiVersion: v1
33+
kind: ConfigMap
34+
metadata:
35+
  name: quota-limit
36+
  namespace: openshift-lightspeed
37+
data:
38+
  quota_handlers.conf:
39+
    storage:
40+
      host: <IP_address> <1>
41+
      port: "5432"
42+
      dbname: <database_name>
43+
      user: <user_name>
44+
      password_path: <file_containing_database_password>
45+
      ssl_mode: disable
46+
    limiters:
47+
      - name: user_monthly_limits
48+
        type: user_limiter
49+
        initial_quota: 100000 <2>
50+
        quota_increase: 10
51+
        period: 30 days
52+
      - name: cluster_monthly_limits
53+
        type: cluster_limiter
54+
        quota_increase: 1000000 <3>
55+
        period: 30 days
56+
    scheduler:
57+
      period: 300 <4>
58+
----
59+
<1> Specifies the IP address for the PostgreSQL database. The database must use port `5432`.
60+
<2> Specifies a token quota limit of 100,000 for each user over a period of 30 days.
61+
<3> Increases the token quota limit for the cluster by 1,000,000 over a period of 30 days.
62+
<4> Defines the number of seconds that the scheduler waits and then checks if the period interval is over. When the period interval is over, the scheduler stores the timestamp and resets or increases the quota limit.
63+
64+
. To make the token quota limits take effect, apply the `ConfigMap` file by running the following command:
65+
+
66+
[source,terminal]
67+
----
68+
$ oc apply -f <configmap_filename>
69+
----
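
The scheduler behaviour described in callout 4 can be pictured with a small Python sketch. It is illustrative only and is not the Service's implementation: the `Limiter` class and `scheduler_tick` function are hypothetical names, the quota is held in memory rather than in PostgreSQL, and two behaviours are assumptions made for the sketch (a limiter with no stored timestamp is treated as due, and `initial_quota` takes precedence over `quota_increase` when both are set).

```python
from dataclasses import dataclass
from datetime import datetime, timedelta
from typing import Optional


@dataclass
class Limiter:
    # Hypothetical in-memory stand-in for one entry under `limiters`.
    name: str
    period: timedelta
    available: int = 0
    initial_quota: Optional[int] = None   # if set, the quota is reset to this value
    quota_increase: Optional[int] = None  # otherwise the quota grows by this value
    last_update: Optional[datetime] = None


def scheduler_tick(limiter: Limiter, now: datetime) -> bool:
    """One scheduler wake-up. Returns True if the quota was changed."""
    # Assumption: a limiter that was never updated is treated as due.
    due = limiter.last_update is None or now - limiter.last_update >= limiter.period
    if not due:
        return False
    if limiter.initial_quota is not None:
        limiter.available = limiter.initial_quota    # reset (assumed precedence)
    elif limiter.quota_increase is not None:
        limiter.available += limiter.quota_increase  # increase
    limiter.last_update = now                        # store the timestamp
    return True


start = datetime(2025, 1, 1)
cluster = Limiter("cluster_monthly_limits", timedelta(days=30), quota_increase=1_000_000)
scheduler_period = timedelta(seconds=300)  # matches `scheduler.period: 300`

changed_first = scheduler_tick(cluster, start)                        # first check: quota granted
changed_early = scheduler_tick(cluster, start + scheduler_period)     # 300 seconds later: no change
changed_later = scheduler_tick(cluster, start + timedelta(days=30))   # period over: increased again
```

In this sketch the scheduler wakes up every 300 seconds, but the quota changes only when the 30-day limiter period has elapsed since the stored timestamp.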
@@ -0,0 +1,12 @@
1+
// Module included in the following assemblies:
2+
// * TBD
3+
4+
:_mod-docs-content-type: CONCEPT
5+
[id="ols-tokens-and-token-quota-limits_{context}"]
6+
= Tokens and token quota limits
7+
8+
Tokens are small chunks of text, which can be as small as one character or as large as one word. Tokens are the units of measurement used to quantify the amount of text that the {ols-long} Service sends to, or receives from, a large language model (LLM). Every interaction with the Service and the LLM is counted in tokens.
9+
10+
Token quota limits define the number of tokens that can be used in a certain timeframe. Implementing token quota limits helps control costs, encourage more efficient use of queries, and regulate system demands. In a multi-user configuration, token quota limits help provide equal access to all users, ensuring that everyone has an opportunity to submit queries.
11+
12+
You can define token quota limits for {ocp-short-name} clusters or {ocp-short-name} user accounts.
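
As a rough illustration of per-user token accounting, the following Python sketch counts the tokens sent to and received from the LLM for each query against a fixed per-user limit. The `UserQuotas` class, the `QuotaExceeded` exception, and the user names are hypothetical and exist only for this example. The rule that a query is refused once a user has no tokens left is an assumption of the sketch, not a statement about how the Service enforces limits.

```python
from collections import defaultdict


class QuotaExceeded(Exception):
    """Raised when a user has no tokens left in the current period."""


class UserQuotas:
    """Hypothetical per-user token counter; not the Service's implementation."""

    def __init__(self, limit_per_user: int) -> None:
        self.limit_per_user = limit_per_user
        self.used = defaultdict(int)  # user name -> tokens used this period

    def remaining(self, user: str) -> int:
        return self.limit_per_user - self.used[user]

    def record(self, user: str, sent: int, received: int) -> None:
        """Count the tokens sent to and received from the LLM for one query."""
        if self.remaining(user) <= 0:
            raise QuotaExceeded(f"{user} has used the whole quota")
        self.used[user] += sent + received


quotas = UserQuotas(limit_per_user=100_000)
quotas.record("alice", sent=60_000, received=40_000)  # alice uses her whole quota
quotas.record("bob", sent=1_200, received=800)        # bob's quota is unaffected

try:
    quotas.record("alice", sent=10, received=10)
    alice_blocked = False
except QuotaExceeded:
    alice_blocked = True
```

Because each user has a separate counter, one user exhausting a quota does not reduce the tokens available to another user, which is the equal-access property described above.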
