determine whether default_timeout=None is reasonable #1897

vchudnov-g · 2024-01-03T23:43:50Z

In the course of investigating a client issue, I discovered that transports/base.py wraps some methods with default_timeout=None, as reflected in the golden files here.

Is this reasonable? If the user does not explicitly specify a timeout when invoking the method, the default_timeout value is passed all the way through to the Python requests library and, as currently set, RPC calls never time out. Should we change the default?
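To make the propagation concrete, here is a minimal sketch using only the standard library. It is not the actual generated code or the real wrapping helper: `wrap_method` below is a simplified, hypothetical stand-in, and `fake_transport_call` is an invented placeholder for the HTTP call. It only shows how a default of None reaches the transport unchanged when the caller passes no timeout.

```python
import functools


def wrap_method(func, default_timeout=None):
    """Simplified, hypothetical stand-in for the wrapping done in transports/base.py."""

    @functools.wraps(func)
    def wrapped(*args, timeout=None, **kwargs):
        # Use the caller's timeout if given; otherwise fall back to the default.
        effective = default_timeout if timeout is None else timeout
        return func(*args, timeout=effective, **kwargs)

    return wrapped


def fake_transport_call(request, timeout):
    # Placeholder for the real HTTP call: just report the timeout it would use.
    return {"request": request, "timeout": timeout}


get_thing = wrap_method(fake_transport_call, default_timeout=None)

no_timeout = get_thing("req")              # caller gives no timeout
explicit = get_thing("req", timeout=30.0)  # caller overrides per call

print(no_timeout["timeout"])  # None: the transport would wait indefinitely
print(explicit["timeout"])    # 30.0
```

With a default of None, nothing between the caller and the transport supplies a finite value, which is the behavior being questioned here.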

In the customer issue I'm currently investigating (Googlers: b/318069394), the server appears to be taking too long to respond for some other, still unclear, reason. The client's request method only yields control (by raising an exception) when the connection is finally dropped, not earlier.

[We should also check that retries are enabled correctly with reasonable values.]

vchudnov-g · 2024-01-03T23:43:57Z

Marking this as P2 because there does appear to be a workaround: explicitly pass the kwarg timeout=N, where N is the number of seconds, when invoking the GAPIC method for the RPC. That value appears to be propagated to the underlying requests library (checked by code inspection, not by running it).
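As a rough illustration of why a finite timeout hands control back early, the sketch below uses only the standard library's `socket` module, not requests or a GAPIC client. The helper name `read_times_out` is made up for this example. A socket whose timeout is None blocks until the peer sends data or the connection drops, while a finite timeout raises once the deadline passes.

```python
import socket


def read_times_out(timeout_seconds):
    """Return True if reading from a silent peer hits the timeout (hypothetical helper)."""
    client, silent_peer = socket.socketpair()
    try:
        # A finite value bounds every blocking call on this socket.
        client.settimeout(timeout_seconds)
        try:
            client.recv(1)  # the peer never sends anything
        except socket.timeout:
            return True
        return False
    finally:
        client.close()
        silent_peer.close()


# With a finite timeout the caller regains control after about 0.2 seconds.
timed_out = read_times_out(0.2)
print(timed_out)

# With timeout None the socket is in blocking mode; recv() would wait
# until data arrives or the connection is dropped, so it is not called here.
blocking, other_end = socket.socketpair()
blocking.settimeout(None)
print(blocking.gettimeout())
blocking.close()
other_end.close()
```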

tswast · 2024-01-19T16:45:33Z

We need to be careful with this, as retrying on a client-side timeout is a kind of "request hedging" as far as the backend is concerned. In BigQuery, we ran into trouble (googleapis/python-bigquery#970): even though our default client-side timeout was a full minute or more beyond the documented server-side request timeout, some operations that used to be possible became impossible (I believe due to differences between when the server-side timeout clock starts and when the client-side request timeout starts).

If we were to do this, it'd be better to do proper hedging where we start another request in parallel but keep the original request open for a bit to avoid that kind of problem.
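One possible shape for that kind of hedging, sketched with `concurrent.futures` from the standard library. Everything here is illustrative: `hedged_call`, `hedge_delay`, and `fake_rpc` are hypothetical names, not part of any existing client library, and error handling is deliberately simplified (if the first request to finish raised, the exception propagates even if the other might have succeeded).

```python
import concurrent.futures
import time


def hedged_call(func, hedge_delay, max_attempts=2):
    """Start func; if it has not finished after hedge_delay seconds, start
    another attempt in parallel while keeping the earlier ones open.
    Returns the result of whichever attempt finishes first."""
    executor = concurrent.futures.ThreadPoolExecutor(max_workers=max_attempts)
    try:
        futures = [executor.submit(func, 0)]
        for attempt in range(1, max_attempts):
            done, _ = concurrent.futures.wait(
                futures,
                timeout=hedge_delay,
                return_when=concurrent.futures.FIRST_COMPLETED,
            )
            if done:
                break
            # The original request stays open; a hedge is added alongside it.
            futures.append(executor.submit(func, attempt))
        done, _ = concurrent.futures.wait(
            futures, return_when=concurrent.futures.FIRST_COMPLETED
        )
        return next(iter(done)).result()
    finally:
        # Do not block on the slower attempts; they finish in the background.
        executor.shutdown(wait=False)


def fake_rpc(attempt):
    # Hypothetical RPC: the first attempt is slow, the hedge is fast.
    time.sleep(0.5 if attempt == 0 else 0.05)
    return f"response from attempt {attempt}"


winner = hedged_call(fake_rpc, hedge_delay=0.1)
print(winner)  # response from attempt 1
```

Unlike retrying after a client-side timeout, the original request is never abandoned, so a response that arrives late from the first attempt can still be used.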

vchudnov-g · 2024-05-03T22:40:45Z

This is still on our backlog to investigate.

vchudnov-g added the type: bug and priority: p2 labels on Jan 3, 2024

vchudnov-g mentioned this issue on Jan 12, 2024: "GCE GAPIC is not applying timeout or retry from config file" #1905 (Open)
