Network retries or time outs #3304

Closed
kraghu opened this issue Aug 2, 2017 · 5 comments

@kraghu

kraghu commented Aug 2, 2017

Please answer these questions before submitting your issue.

What version of gRPC are you using?

1.4.0 on Android

What JVM are you using (java -version)?

1.8

What did you do?

I am trying to add a retry policy. After a network failure, gRPC doesn't seem to reconnect immediately, even once the network is back.

If possible, provide a recipe for reproducing the error.

  • Disable the network and keep the device in airplane mode
  • Try to run the app
  • Re-enable the network

You will see that gRPC doesn't reconnect no matter how many times you retry. It takes 25-30 seconds before it retries again.

What did you expect to see?

gRPC should connect immediately once the network is available. I would also like to know how long I should wait before retrying.

What did you see instead?

It never connects back.

Is there a way to configure the retries, or to find out when the next connection attempt will happen?

@ericgribkoff
Contributor

Thanks for filing this. This is a known issue, although it doesn't appear to have a previously tracked issue on GitHub: while the network is down, gRPC's default name resolver attempts to re-resolve the hostname at 60-second intervals. If the network is restored immediately after a failed name resolution attempt, it can take up to 60 seconds for gRPC to resolve the hostname again and for the channel to become usable.

The proper fix for this is to implement an Android-specific name resolver that avoids the 60-second timer and receives notice from the OS when the network is back up. This work is in-progress: I'll update this thread once a PR is ready (should be sometime early next week).

> You will see that gRPC doesn't reconnect no matter how many times you retry. It takes 25-30 seconds before it retries again.

The UNAVAILABLE responses you see here are actually cached by the gRPC channel, so the number of times you retry doesn't affect when the channel recovers. One workaround that might help in the meantime is enabling wait-for-ready on the call options: this causes the RPC to wait for the network to become available rather than failing immediately. It doesn't avoid the wait of up to 60 seconds, but it does avoid having to keep retrying in a loop until the timer goes off.
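To illustrate the difference between the default fail-fast behaviour and wait-for-ready, here is a minimal sketch. It is a toy model written against the Java standard library only, not the gRPC API: `ToyChannel`, `callFailFast`, and `callWaitForReady` are hypothetical names invented for this example, and the 200 ms delay merely stands in for the network coming back. In grpc-java itself the option is enabled with `withWaitForReady()` on a stub or on `CallOptions`, usually together with a deadline such as `withDeadlineAfter(...)` so the call cannot wait forever.

```java
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.TimeoutException;

/** Toy model (NOT the gRPC API) of fail-fast vs. wait-for-ready call semantics. */
public class WaitForReadyDemo {

    /** Hypothetical stand-in for a channel; `ready` completes when the network is back. */
    static final class ToyChannel {
        final CompletableFuture<Void> ready = new CompletableFuture<>();

        /** Fail-fast: rejects immediately while the channel is not ready. */
        String callFailFast() {
            return ready.isDone() ? "OK" : "UNAVAILABLE";
        }

        /** Wait-for-ready: blocks until the channel is ready or the deadline expires. */
        String callWaitForReady(long timeout, TimeUnit unit) {
            try {
                ready.get(timeout, unit);
                return "OK";
            } catch (TimeoutException e) {
                return "DEADLINE_EXCEEDED";
            } catch (ExecutionException e) {
                return "UNAVAILABLE";
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt(); // preserve the interrupt flag
                return "CANCELLED";
            }
        }
    }

    public static void main(String[] args) {
        ToyChannel channel = new ToyChannel();

        // Network is down: a fail-fast call is rejected right away.
        System.out.println(channel.callFailFast()); // prints "UNAVAILABLE"

        // Simulate the network coming back after 200 ms.
        new Thread(() -> {
            try {
                Thread.sleep(200);
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            }
            channel.ready.complete(null);
        }).start();

        // A wait-for-ready call parks until the channel recovers, bounded by a deadline.
        System.out.println(channel.callWaitForReady(5, TimeUnit.SECONDS)); // prints "OK"
    }
}
```

The deadline is the important design choice here: with wait-for-ready and no deadline, a call made while the network is down would block for as long as the outage lasts.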

@kraghu
Author

kraghu commented Aug 2, 2017

Thanks @ericgribkoff for the quick response. I will wait for this PR.

@ericgribkoff ericgribkoff assigned ericgribkoff and unassigned kraghu Aug 3, 2017
@ericgribkoff
Contributor

Oops, sorry about the incorrect assignment of this issue. I've reassigned it to myself and will update when it's ready.

@ejona86
Member

ejona86 commented Aug 3, 2017

Related: #2169

@ejona86
Member

ejona86 commented Nov 8, 2017

Closing, since this is being handled as part of #2169 (which added an API in 1.8 that should address this) and #3685 (which reduces the impact when not using the new API).

@ejona86 ejona86 closed this as completed Nov 8, 2017
@lock lock bot locked as resolved and limited conversation to collaborators Sep 29, 2018