Commit a78c768

Move pkg/ext-proc/metrics/README.md -> site-src/guides/metrics.md (#373)
* Move pkg/epp/metrics/README.md -> site-src/guides/metrics.md
* add docs link for metrics.md
* update formatting
1 parent 2577f63 commit a78c768

2 files changed, +14 -17 lines changed

mkdocs.yml

+1
@@ -57,6 +57,7 @@ nav:
 - User Guides:
 - Getting started: guides/index.md
 - Adapter Rollout: guides/adapter-rollout.md
+- Metrics: guides/metrics.md
 - Implementer's Guide: guides/implementers.md
 - Reference:
 - API Reference: reference/spec.md

pkg/epp/metrics/README.md -> site-src/guides/metrics.md

+13 -17
@@ -1,10 +1,6 @@
-# Documentation
+# Metrics

-This documentation is the current state of exposed metrics.
-
-## Table of Contents
-* [Exposed Metrics](#exposed-metrics)
-* [Scrape Metrics](#scrape-metrics)
+This guide describes the current state of exposed metrics and how to scrape them.

 ## Requirements

@@ -38,17 +34,17 @@ spec:

 ## Exposed metrics

-| Metric name | Metric Type | Description | Labels | Status |
-| ------------|--------------| ----------- | ------ | ------ |
-| inference_model_request_total | Counter | The counter of requests broken out for each model. | `model_name`=&lt;model-name&gt; <br> `target_model_name`=&lt;target-model-name&gt; | ALPHA |
-| inference_model_request_error_total | Counter | The counter of requests errors broken out for each model. | `model_name`=&lt;model-name&gt; <br> `target_model_name`=&lt;target-model-name&gt; | ALPHA |
-| inference_model_request_duration_seconds | Distribution | Distribution of response latency. | `model_name`=&lt;model-name&gt; <br> `target_model_name`=&lt;target-model-name&gt; | ALPHA |
-| inference_model_request_sizes | Distribution | Distribution of request size in bytes. | `model_name`=&lt;model-name&gt; <br> `target_model_name`=&lt;target-model-name&gt; | ALPHA |
-| inference_model_response_sizes | Distribution | Distribution of response size in bytes. | `model_name`=&lt;model-name&gt; <br> `target_model_name`=&lt;target-model-name&gt; | ALPHA |
-| inference_model_input_tokens | Distribution | Distribution of input token count. | `model_name`=&lt;model-name&gt; <br> `target_model_name`=&lt;target-model-name&gt; | ALPHA |
-| inference_model_output_tokens | Distribution | Distribution of output token count. | `model_name`=&lt;model-name&gt; <br> `target_model_name`=&lt;target-model-name&gt; | ALPHA |
-| inference_pool_average_kv_cache_utilization | Gauge | The average kv cache utilization for an inference server pool. | `name`=&lt;inference-pool-name&gt; | ALPHA |
-| inference_pool_average_queue_size | Gauge | The average number of requests pending in the model server queue. | `name`=&lt;inference-pool-name&gt; | ALPHA |
+| **Metric name** | **Metric Type** | <div style="width:200px">**Description**</div> | <div style="width:250px">**Labels**</div> | **Status** |
+|:---------------------------------------------|:-----------------|:------------------------------------------------------------------|:-----------------------------------------------------------------------------------|:------------|
+| inference_model_request_total | Counter | The counter of requests broken out for each model. | `model_name`=&lt;model-name&gt; <br> `target_model_name`=&lt;target-model-name&gt; | ALPHA |
+| inference_model_request_error_total | Counter | The counter of requests errors broken out for each model. | `model_name`=&lt;model-name&gt; <br> `target_model_name`=&lt;target-model-name&gt; | ALPHA |
+| inference_model_request_duration_seconds | Distribution | Distribution of response latency. | `model_name`=&lt;model-name&gt; <br> `target_model_name`=&lt;target-model-name&gt; | ALPHA |
+| inference_model_request_sizes | Distribution | Distribution of request size in bytes. | `model_name`=&lt;model-name&gt; <br> `target_model_name`=&lt;target-model-name&gt; | ALPHA |
+| inference_model_response_sizes | Distribution | Distribution of response size in bytes. | `model_name`=&lt;model-name&gt; <br> `target_model_name`=&lt;target-model-name&gt; | ALPHA |
+| inference_model_input_tokens | Distribution | Distribution of input token count. | `model_name`=&lt;model-name&gt; <br> `target_model_name`=&lt;target-model-name&gt; | ALPHA |
+| inference_model_output_tokens | Distribution | Distribution of output token count. | `model_name`=&lt;model-name&gt; <br> `target_model_name`=&lt;target-model-name&gt; | ALPHA |
+| inference_pool_average_kv_cache_utilization | Gauge | The average kv cache utilization for an inference server pool. | `name`=&lt;inference-pool-name&gt; | ALPHA |
+| inference_pool_average_queue_size | Gauge | The average number of requests pending in the model server queue. | `name`=&lt;inference-pool-name&gt; | ALPHA |

 ## Scrape Metrics

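As a reviewer's aid, the two request counters listed in the exposed-metrics table of this diff can be combined into a per-model error ratio. The following is a minimal sketch in Python, assuming the endpoint serves the standard Prometheus text exposition format. The `SAMPLE` payload, the model names in it, and the helper names `parse_samples` and `error_ratio_by_model` are hypothetical and not part of this commit; the actual scrape output may differ.

```python
import re
from collections import defaultdict

# Hypothetical scrape output (made up for illustration, not real data).
SAMPLE = """\
# TYPE inference_model_request_total counter
inference_model_request_total{model_name="chat",target_model_name="chat-v1"} 8
inference_model_request_total{model_name="chat",target_model_name="chat-v2"} 2
inference_model_request_error_total{model_name="chat",target_model_name="chat-v2"} 1
"""

# One sample line: metric name, optional {labels}, then the value.
SAMPLE_RE = re.compile(
    r'^(?P<name>[A-Za-z_:][A-Za-z0-9_:]*)(?:\{(?P<labels>.*)\})?\s+(?P<value>\S+)'
)
# One label pair inside the braces: key="value".
LABEL_RE = re.compile(r'(\w+)="((?:[^"\\]|\\.)*)"')


def parse_samples(text):
    """Yield (metric_name, labels, value) for each sample line, skipping comments."""
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue
        match = SAMPLE_RE.match(line)
        if match is None:
            continue
        labels = dict(LABEL_RE.findall(match.group("labels") or ""))
        yield match.group("name"), labels, float(match.group("value"))


def error_ratio_by_model(text):
    """Return {model_name: errors / requests}, summed over target models."""
    totals = defaultdict(float)
    errors = defaultdict(float)
    for name, labels, value in parse_samples(text):
        model = labels.get("model_name")
        if model is None:
            continue
        if name == "inference_model_request_total":
            totals[model] += value
        elif name == "inference_model_request_error_total":
            errors[model] += value
    # Models with zero requests are omitted to avoid dividing by zero.
    return {model: errors[model] / total for model, total in totals.items() if total > 0}


if __name__ == "__main__":
    print(error_ratio_by_model(SAMPLE))  # {'chat': 0.1}
```

The sketch sums over `target_model_name` so that the ratio is reported per `model_name`; grouping by both labels instead would only require changing the dictionary key.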