Skip to content

Move pkg/ext-proc/metrics/README.md -> site-src/guides/metrics.md #373

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged
merged 3 commits into from
Feb 20, 2025
Merged
Show file tree
Hide file tree
Changes from all commits
Commits
File filter

Filter by extension

Filter by extension

Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
1 change: 1 addition & 0 deletions mkdocs.yml
Original file line number Diff line number Diff line change
Expand Up @@ -57,6 +57,7 @@ nav:
- User Guides:
- Getting started: guides/index.md
- Adapter Rollout: guides/adapter-rollout.md
- Metrics: guides/metrics.md
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I can't comment on the file itself, but here are a few observations from the preview (https://deploy-preview-373--gateway-api-inference-extension.netlify.app/guides):

  • The "table of contents" at the top is not needed since it shows up on the right side already.
  • When I click on the metrics guide, the left bar completely disappears. Interestingly, when I click on metrics, it sends me to https://deploy-preview-373--gateway-api-inference-extension.netlify.app/guides/metrics.md, but for the other guides it sends me to the url without the .md at the end.
  • The formatting of the metrics table perhaps can be improved? add gray fill to the headers perhaps? widen the "Labels" cell? See https://deploy-preview-373--gateway-api-inference-extension.netlify.app/guides/metrics.md/#exposed-metrics;

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Updated the structure and details.

For the table, I used bold text to highlight the headers instead of colors as when my cursor moves to any row of this table, that row will also be gray.

- Implementer's Guide: guides/implementers.md
- Reference:
- API Reference: reference/spec.md
Expand Down
30 changes: 13 additions & 17 deletions pkg/epp/metrics/README.md → site-src/guides/metrics.md
Original file line number Diff line number Diff line change
@@ -1,10 +1,6 @@
# Documentation
# Metrics

This documentation is the current state of exposed metrics.

## Table of Contents
* [Exposed Metrics](#exposed-metrics)
* [Scrape Metrics](#scrape-metrics)
This guide describes the current state of exposed metrics and how to scrape them.

## Requirements

Expand Down Expand Up @@ -38,17 +34,17 @@ spec:

## Exposed metrics

| Metric name | Metric Type | Description | Labels | Status |
| ------------|--------------| ----------- | ------ | ------ |
| inference_model_request_total | Counter | The counter of requests broken out for each model. | `model_name`=&lt;model-name&gt; <br> `target_model_name`=&lt;target-model-name&gt; | ALPHA |
| inference_model_request_error_total | Counter | The counter of requests errors broken out for each model. | `model_name`=&lt;model-name&gt; <br> `target_model_name`=&lt;target-model-name&gt; | ALPHA |
| inference_model_request_duration_seconds | Distribution | Distribution of response latency. | `model_name`=&lt;model-name&gt; <br> `target_model_name`=&lt;target-model-name&gt; | ALPHA |
| inference_model_request_sizes | Distribution | Distribution of request size in bytes. | `model_name`=&lt;model-name&gt; <br> `target_model_name`=&lt;target-model-name&gt; | ALPHA |
| inference_model_response_sizes | Distribution | Distribution of response size in bytes. | `model_name`=&lt;model-name&gt; <br> `target_model_name`=&lt;target-model-name&gt; | ALPHA |
| inference_model_input_tokens | Distribution | Distribution of input token count. | `model_name`=&lt;model-name&gt; <br> `target_model_name`=&lt;target-model-name&gt; | ALPHA |
| inference_model_output_tokens | Distribution | Distribution of output token count. | `model_name`=&lt;model-name&gt; <br> `target_model_name`=&lt;target-model-name&gt; | ALPHA |
| inference_pool_average_kv_cache_utilization | Gauge | The average kv cache utilization for an inference server pool. | `name`=&lt;inference-pool-name&gt; | ALPHA |
| inference_pool_average_queue_size | Gauge | The average number of requests pending in the model server queue. | `name`=&lt;inference-pool-name&gt; | ALPHA |
| **Metric name** | **Metric Type** | <div style="width:200px">**Description**</div> | <div style="width:250px">**Labels**</div> | **Status** |
|:---------------------------------------------|:-----------------|:------------------------------------------------------------------|:-----------------------------------------------------------------------------------|:------------|
| inference_model_request_total | Counter | The counter of requests broken out for each model. | `model_name`=&lt;model-name&gt; <br> `target_model_name`=&lt;target-model-name&gt; | ALPHA |
| inference_model_request_error_total | Counter | The counter of requests errors broken out for each model. | `model_name`=&lt;model-name&gt; <br> `target_model_name`=&lt;target-model-name&gt; | ALPHA |
| inference_model_request_duration_seconds | Distribution | Distribution of response latency. | `model_name`=&lt;model-name&gt; <br> `target_model_name`=&lt;target-model-name&gt; | ALPHA |
| inference_model_request_sizes | Distribution | Distribution of request size in bytes. | `model_name`=&lt;model-name&gt; <br> `target_model_name`=&lt;target-model-name&gt; | ALPHA |
| inference_model_response_sizes | Distribution | Distribution of response size in bytes. | `model_name`=&lt;model-name&gt; <br> `target_model_name`=&lt;target-model-name&gt; | ALPHA |
| inference_model_input_tokens | Distribution | Distribution of input token count. | `model_name`=&lt;model-name&gt; <br> `target_model_name`=&lt;target-model-name&gt; | ALPHA |
| inference_model_output_tokens | Distribution | Distribution of output token count. | `model_name`=&lt;model-name&gt; <br> `target_model_name`=&lt;target-model-name&gt; | ALPHA |
| inference_pool_average_kv_cache_utilization | Gauge | The average kv cache utilization for an inference server pool. | `name`=&lt;inference-pool-name&gt; | ALPHA |
| inference_pool_average_queue_size | Gauge | The average number of requests pending in the model server queue. | `name`=&lt;inference-pool-name&gt; | ALPHA |

## Scrape Metrics

Expand Down