Gateway API Inference Extension

The Gateway API Inference Extension came out of wg-serving and is sponsored by SIG Network. This repo contains the load-balancing algorithm, ext-proc code, CRDs, and controllers that make up the extension.

This extension is intended to provide value to multiplexed LLM services on a shared pool of compute. See the proposal for more info.
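As a rough illustration of the multiplexing idea, the extension defines CRDs that group model-server pods into a shared pool and map model names onto it. The sketch below is illustrative only: the resource kinds, API group, and field names are assumptions based on an early alpha API and may not match the current CRDs; consult the proposal and reference docs for the authoritative schema.

```yaml
# Hypothetical sketch: an InferencePool groups model-server pods behind one
# routable backend, and an InferenceModel maps a model name onto that pool.
apiVersion: inference.networking.x-k8s.io/v1alpha1
kind: InferencePool
metadata:
  name: llm-pool          # illustrative name
spec:
  selector:
    app: vllm-server      # illustrative label selector for model-server pods
  targetPortNumber: 8000  # illustrative serving port
---
apiVersion: inference.networking.x-k8s.io/v1alpha1
kind: InferenceModel
metadata:
  name: chat-model        # illustrative name
spec:
  modelName: llama-3-8b   # model identifier requests are routed by
  poolRef:
    name: llm-pool        # binds this model to the shared pool above
```

With resources along these lines, multiple models can share the same pool of accelerators, and the ext-proc extension picks an endpoint per request using the load-balancing algorithm in this repo.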

Status

This project is currently in development.

Getting Started

Follow this README to get the inference-extension up and running on your cluster!

End-to-End Tests

Follow this README to learn more about running the inference-extension end-to-end test suite on your cluster.

Website

Detailed documentation is available on our website: https://gateway-api-inference-extension.sigs.k8s.io/

Contributing

Our community meets weekly on Thursdays at 10AM PDT (Zoom, Meeting Notes).

We currently use the #wg-serving Slack channel for communications.

Contributions are welcome; follow the dev guide to start contributing!

Code of conduct

Participation in the Kubernetes community is governed by the Kubernetes Code of Conduct.