Adds e2e test readme

danehans · danehans · commit 464d8a3fc917 · 2025-01-28T17:28:04.000Z
Signed-off-by: Daneyon Hansen <daneyon.hansen@solo.io>
diff --git a/Makefile b/Makefile
@@ -105,9 +105,8 @@ vet: ## Run go vet against code.
 test: manifests generate fmt vet envtest ## Run tests.
 	KUBEBUILDER_ASSETS="$(shell $(ENVTEST) use $(ENVTEST_K8S_VERSION) --bin-dir $(LOCALBIN) -p path)" go test $$(go list ./... | grep -v /e2e) -coverprofile cover.out
 
-# Utilize Kind or modify the e2e tests to load the image locally, enabling compatibility with other vendors.
-.PHONY: test-e2e  # Run the e2e tests against a Kind k8s instance that is spun up.
-test-e2e:
+.PHONY: test-e2e
+test-e2e: ## Run end-to-end tests against an existing Kubernetes cluster with at least 3 available GPUs.
 	go test ./test/e2e/ -v -ginkgo.v
 
 .PHONY: lint
diff --git a/README.md b/README.md
@@ -12,6 +12,10 @@ This project is currently in development.
 
 Follow this [README](./pkg/README.md) to get the inference-extension up and running on your cluster!
 
+## End-to-End Tests
+
+Follow this [README](./test/e2e/README.md) to learn more about running the inference-extension end-to-end test suite on your cluster.
+
 ## Website
 
 Detailed documentation is available on our website: https://gateway-api-inference-extension.sigs.k8s.io/
diff --git a/pkg/README.md b/pkg/README.md
@@ -4,7 +4,7 @@ This quickstart guide is intended for engineers familiar with k8s and model serv
 
 ### Requirements
  - Envoy Gateway [v1.2.1](https://gateway.envoyproxy.io/docs/install/install-yaml/#install-with-yaml) or higher
- - A cluster that meets the following requirements:
+ - A cluster with:
    - Support for Services of type `LoadBalancer`. (This can be validated by ensuring your Envoy Gateway is up and running). For example, with Kind,
      you can follow [these steps](https://kind.sigs.k8s.io/docs/user/loadbalancer).
    - 3 GPUs to run the sample model server. Adjust the number of replicas in `./manifests/vllm/deployment.yaml` as needed.
diff --git a/test/e2e/README.md b/test/e2e/README.md
@@ -0,0 +1,38 @@
+# End-to-End Tests
+
+This document provides instructions on how to run the end-to-end tests.
+
+## Overview
+
+The end-to-end tests validate Gateway API Inference Extension functionality. They run against an existing Kubernetes cluster and use the Ginkgo testing framework to verify that the extension behaves as expected.
+
+## Prerequisites
+
+- [Go](https://golang.org/doc/install) installed on your machine.
+- [Make](https://www.gnu.org/software/make/manual/make.html) installed to run the end-to-end test target.
+- A Hugging Face Hub token with access to the [meta-llama/Llama-2-7b-hf](https://huggingface.co/meta-llama/Llama-2-7b-hf) model.
+
+## Running the End-to-End Tests
+
+Follow these steps to run the end-to-end tests:
+
+1. **Clone the Repository**: Clone the `gateway-api-inference-extension` repository:
+
+   ```sh
+   git clone https://github.com/kubernetes-sigs/gateway-api-inference-extension.git && cd gateway-api-inference-extension
+   ```
+
+1. **Export Your Hugging Face Hub Token**: The token is required to run the test model server:
+
+   ```sh
+   export HF_TOKEN=<MY_HF_TOKEN>
+   ```
+
+1. **Run the Tests**: Run the `test-e2e` target:
+
+   ```sh
+   make test-e2e
+   ```
+
+   The test suite prints details for each step. Note that the `vllm-llama2-7b-pool` model server deployment
+   may take several minutes to report an `Available=True` status due to the time required for bootstrapping.
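
The prerequisites listed in `test/e2e/README.md` could be checked before invoking the target. The following is a minimal sketch, assuming a POSIX shell; the `preflight` function is hypothetical and not part of this commit or the repository. It only confirms that the required tools are on `PATH` and that `HF_TOKEN` is non-empty; it does not verify that the token has access to the model or that the cluster has enough GPUs.

```sh
#!/bin/sh
# Hypothetical helper (not part of the repository): checks the prerequisites
# from test/e2e/README.md before running `make test-e2e`.
# Usage: preflight [tool ...]   (defaults to checking `go` and `make`)
preflight() {
    # Fall back to the tools named in the README when no arguments are given.
    [ "$#" -gt 0 ] || set -- go make
    rc=0
    for tool in "$@"; do
        if ! command -v "$tool" >/dev/null 2>&1; then
            echo "missing prerequisite: $tool" >&2
            rc=1
        fi
    done
    # The test model server needs a Hugging Face Hub token.
    if [ -z "${HF_TOKEN:-}" ]; then
        echo "HF_TOKEN is not set" >&2
        rc=1
    fi
    return "$rc"
}
```

With the function sourced into the current shell, `preflight && make test-e2e` runs the suite only when the checks pass.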