From 15af84dcd693ce770de25d8e20b4fd9452a92097 Mon Sep 17 00:00:00 2001
From: Simon Kurtz <84809797+simonkurtz-MSFT@users.noreply.github.com>
Date: Mon, 3 Jun 2024 14:29:07 -0400
Subject: [PATCH 1/2] Update productionizing.md

Add link to PR for openai-priority-loadbalancer

---
 docs/productionizing.md | 3 ++-
 1 file changed, 2 insertions(+), 1 deletion(-)

diff --git a/docs/productionizing.md b/docs/productionizing.md
index 233dce4216..c255004bfd 100644
--- a/docs/productionizing.md
+++ b/docs/productionizing.md
@@ -24,9 +24,10 @@ If the maximum TPM isn't enough for your expected load, you have a few options:

 * Use a backoff mechanism to retry the request. This is helpful if you're running into a short-term quota due to bursts of activity but aren't over long-term quota. The [tenacity](https://tenacity.readthedocs.io/en/latest/) library is a good option for this, and this [pull request](https://github.com/Azure-Samples/azure-search-openai-demo/pull/500) shows how to apply it to this app.

-* If you are consistently going over the TPM, then consider implementing a load balancer between OpenAI instances. Most developers implement that using Azure API Management or container-based load balancers. For seamless integration instructions with this sample, please check:
+* If you are consistently going over the TPM, then consider implementing a load balancer between OpenAI instances. Most developers implement that using Azure API Management or container-based load balancers. A native Python approach that integrates with the OpenAI Python API Library is also possible. For seamless integration instructions with this sample, please check:
   * [Scale Azure OpenAI for Python with Azure API Management](https://learn.microsoft.com/azure/developer/python/get-started-app-chat-scaling-with-azure-api-management)
   * [Scale Azure OpenAI for Python chat using RAG with Azure Container Apps](https://learn.microsoft.com/azure/developer/python/get-started-app-chat-scaling-with-azure-container-apps)
+  * [Scale Azure OpenAI for Python with the Python openai-priority-loadbalancer](https://github.com/Azure-Samples/azure-search-openai-demo/pull/1626)

 ### Azure Storage


From abb1218c1d2e6549214359bd3dcee162563c7687 Mon Sep 17 00:00:00 2001
From: Pamela Fox
Date: Mon, 3 Jun 2024 11:34:18 -0700
Subject: [PATCH 2/2] Apply suggestions from code review

---
 docs/productionizing.md | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/docs/productionizing.md b/docs/productionizing.md
index c255004bfd..9f74a36e4a 100644
--- a/docs/productionizing.md
+++ b/docs/productionizing.md
@@ -24,10 +24,10 @@ If the maximum TPM isn't enough for your expected load, you have a few options:

 * Use a backoff mechanism to retry the request. This is helpful if you're running into a short-term quota due to bursts of activity but aren't over long-term quota. The [tenacity](https://tenacity.readthedocs.io/en/latest/) library is a good option for this, and this [pull request](https://github.com/Azure-Samples/azure-search-openai-demo/pull/500) shows how to apply it to this app.

-* If you are consistently going over the TPM, then consider implementing a load balancer between OpenAI instances. Most developers implement that using Azure API Management or container-based load balancers. A native Python approach that integrates with the OpenAI Python API Library is also possible. For seamless integration instructions with this sample, please check:
+* If you are consistently going over the TPM, then consider implementing a load balancer between OpenAI instances. Most developers implement that using Azure API Management or container-based load balancers. A native Python approach that integrates with the OpenAI Python API Library is also possible. For integration instructions with this sample, please check:
   * [Scale Azure OpenAI for Python with Azure API Management](https://learn.microsoft.com/azure/developer/python/get-started-app-chat-scaling-with-azure-api-management)
   * [Scale Azure OpenAI for Python chat using RAG with Azure Container Apps](https://learn.microsoft.com/azure/developer/python/get-started-app-chat-scaling-with-azure-container-apps)
-  * [Scale Azure OpenAI for Python with the Python openai-priority-loadbalancer](https://github.com/Azure-Samples/azure-search-openai-demo/pull/1626)
+  * [Pull request: Scale Azure OpenAI for Python with the Python openai-priority-loadbalancer](https://github.com/Azure-Samples/azure-search-openai-demo/pull/1626)

 ### Azure Storage
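The doc text edited by this patch series recommends tenacity-based backoff for short-term TPM limits. For reference, the sketch below shows one way such a retry wrapper can look with the OpenAI Python library; it is a minimal illustration, not the actual change from pull request #500, and the environment variable names, API version, and retry parameters are placeholder assumptions.

```python
# Minimal sketch (not the code from PR #500): retry an Azure OpenAI chat call
# with jittered exponential backoff when a short-term rate limit (429) is hit.
import os

from openai import AzureOpenAI, RateLimitError
from tenacity import retry, retry_if_exception_type, stop_after_attempt, wait_random_exponential

# Placeholder configuration; substitute your own endpoint, key, and deployment.
client = AzureOpenAI(
    azure_endpoint=os.environ["AZURE_OPENAI_ENDPOINT"],
    api_key=os.environ["AZURE_OPENAI_API_KEY"],
    api_version="2024-02-01",
)


@retry(
    retry=retry_if_exception_type(RateLimitError),  # only retry rate-limit errors
    wait=wait_random_exponential(min=1, max=60),    # jittered exponential backoff
    stop=stop_after_attempt(6),                     # give up after six attempts
)
def chat_with_backoff(messages):
    return client.chat.completions.create(
        model=os.environ["AZURE_OPENAI_DEPLOYMENT"],  # deployment name, not model family
        messages=messages,
    )


if __name__ == "__main__":
    response = chat_with_backoff([{"role": "user", "content": "Hello!"}])
    print(response.choices[0].message.content)
```

A similar decorator can wrap the app's own client calls; pull request #500 linked above shows how the approach was applied to this repository.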