Commit 65514ea

[serve] Log rejected requests at router side (#51346)
## Why are these changes needed?

Router-side logs (made less alarming, and explicit that the request will be retried):

```
INFO 2025-03-13 13:42:35,298 serve 40047 -- Replica(id='7mqhdb0d', deployment='Model', app='default') rejected request because it is at max capacity of 1 ongoing requests. Retrying request 4a843e03-e1c7-47a2-be9d-6c0224108f42.
INFO 2025-03-13 13:42:35,298 serve 40047 -- Replica(id='7mqhdb0d', deployment='Model', app='default') rejected request because it is at max capacity of 1 ongoing requests. Retrying request 57d94c8a-13b4-4ea2-a628-75d566ef29e5.
INFO 2025-03-13 13:42:35,301 serve 40047 -- Replica(id='7mqhdb0d', deployment='Model', app='default') rejected request because it is at max capacity of 1 ongoing requests. Retrying request 4a843e03-e1c7-47a2-be9d-6c0224108f42.
```

Replica-side logs about rejected requests are now DEBUG logs only. This makes the logs less alarming for users who are not familiar with the request lifecycle. Before this change, a user reading the replica-side warnings could get confused and think requests had been dropped.

https://anyscale1.atlassian.net/browse/SERVE-659

---------

Signed-off-by: Cindy Zhang <[email protected]>
1 parent adfb750 commit 65514ea

3 files changed: +13 -1 lines changed
Diff for: python/ray/serve/_private/replica.py (+1 -1)

    @@ -624,7 +624,7 @@ async def handle_request_with_rejection(
             limit = self._deployment_config.max_ongoing_requests
             num_ongoing_requests = self.get_num_ongoing_requests()
             if num_ongoing_requests >= limit:
    -            logger.warning(
    +            logger.debug(
                     f"Replica at capacity of max_ongoing_requests={limit}, "
                     f"rejecting request {request_metadata.request_id}.",
                     extra={"log_to_stderr": False},
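
To illustrate why demoting the replica-side message to DEBUG hides it by default, here is a minimal sketch that uses only the standard `logging` module. `ListHandler`, `reject_if_full`, and the `replica_demo` logger name are hypothetical stand-ins for illustration, not Ray Serve APIs, and the sketch assumes the logger's default level is INFO.

```python
import logging


class ListHandler(logging.Handler):
    """Collects emitted records in memory so the effect is easy to inspect."""

    def __init__(self):
        super().__init__()
        self.messages = []

    def emit(self, record):
        self.messages.append(f"{record.levelname} {record.getMessage()}")


def reject_if_full(logger, num_ongoing, limit, request_id):
    """Hypothetical stand-in for the replica-side capacity check."""
    if num_ongoing >= limit:
        # Logged at DEBUG, mirroring the change in this commit.
        logger.debug(
            f"Replica at capacity of max_ongoing_requests={limit}, "
            f"rejecting request {request_id}."
        )
        return True
    return False


handler = ListHandler()
demo_logger = logging.getLogger("replica_demo")
demo_logger.addHandler(handler)
demo_logger.propagate = False

# At INFO, the request is still rejected but nothing is logged.
demo_logger.setLevel(logging.INFO)
rejected_at_info = reject_if_full(demo_logger, 1, 1, "req-1")
messages_at_info = list(handler.messages)

# At DEBUG, the rejection message becomes visible again.
demo_logger.setLevel(logging.DEBUG)
rejected_at_debug = reject_if_full(demo_logger, 1, 1, "req-2")
messages_at_debug = list(handler.messages)
```

The rejection behavior is the same at both levels; only the visibility of the message changes.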

Diff for: python/ray/serve/_private/router.py (+8)

    @@ -541,6 +541,14 @@ async def schedule_and_send_request(
                     self._replica_scheduler.on_new_queue_len_info(r.replica_id, queue_info)
                     if queue_info.accepted:
                         return result, r.replica_id
    +                else:
    +                    logger.info(
    +                        f"{r.replica_id} rejected request because it is at max "
    +                        f"capacity of {r.max_ongoing_requests} ongoing request"
    +                        f"{'s' if r.max_ongoing_requests > 1 else ''}. "
    +                        f"Retrying request {pr.metadata.request_id}.",
    +                        extra={"log_to_stderr": False},
    +                    )
                 except asyncio.CancelledError:
                     # NOTE(edoakes): this is not strictly necessary because there are
                     # currently no `await` statements between getting the ref and returning,
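
The router-side behavior can be sketched roughly as follows. This is a simplified illustration, not the actual router: `FakeReplica`, `rejection_message`, and `schedule` are hypothetical names, and the real code retries through its replica scheduler rather than walking a fixed list. Only the message format and pluralization follow the diff above.

```python
import logging
from dataclasses import dataclass
from typing import List, Optional

logger = logging.getLogger("router_demo")


@dataclass
class FakeReplica:
    """Hypothetical replica that rejects once it holds `max_ongoing_requests`."""

    replica_id: str
    max_ongoing_requests: int
    num_ongoing: int = 0

    def try_accept(self) -> bool:
        if self.num_ongoing >= self.max_ongoing_requests:
            return False
        self.num_ongoing += 1
        return True


def rejection_message(replica: FakeReplica, request_id: str) -> str:
    """Build the log line, pluralizing "request" as in the diff."""
    limit = replica.max_ongoing_requests
    return (
        f"{replica.replica_id} rejected request because it is at max "
        f"capacity of {limit} ongoing request{'s' if limit > 1 else ''}. "
        f"Retrying request {request_id}."
    )


def schedule(replicas: List[FakeReplica], request_id: str) -> Optional[str]:
    """Try each replica in turn; log at INFO and move on when one rejects."""
    for replica in replicas:
        if replica.try_accept():
            return replica.replica_id
        logger.info(rejection_message(replica, request_id))
    return None


full = FakeReplica("replica-a", max_ongoing_requests=1, num_ongoing=1)
free = FakeReplica("replica-b", max_ongoing_requests=2)
chosen = schedule([full, free], "req-1")
```

Here `replica-a` is already at capacity, so the request is logged as rejected and then accepted by `replica-b`. The message states that the request is being retried, which is the point of logging on the router side.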

Diff for: python/ray/serve/tests/unit/test_router.py (+4)

    @@ -90,6 +90,10 @@ def __init__(
         def replica_id(self) -> ReplicaID:
             return self._replica_id
     
    +    @property
    +    def max_ongoing_requests(self) -> int:
    +        return 5
    +
         @property
         def is_cross_language(self) -> bool:
             return self._is_cross_language
