-
Notifications
You must be signed in to change notification settings - Fork 1.9k
Reconnect Strategy doesn't work if connection is lost #2948
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Comments
@edrose I dont see the issue here. Even though |
True, internally there is nothing wrong with the logic - it does exactly what it's told to by the reconnection strategy. However as a user of the library I need to be able to detect when the library has given up retrying, otherwise I just have a dead client. Listening for The context is that we have a webapp and Redis running in a Kubernetes cluster on AWS spot instances. Every so often we get spot interruptions that cause our Redis instances to move to different nodes. When this occurs they go down, then come back up with a new IP address. Since we have replication this can occur with no downtime, but it does cause error events to be emitted whenever it happens, even if that doesn't cause an interruption in service. To detect and recover from a persistent error I need to be able to tell the difference between "an error occurred and I'm going to try to fix it" and "an error occurred and I can't do anything more". |
@edrose AFAIK the only thing that controls when the library will choose to stop retrying is the reconnectStrategy, which is controlled by the user. So, you have to return import { GenericContainer } from 'testcontainers';
import { createClient } from 'redis';
const container = await new GenericContainer('redis')
.withExposedPorts(6379)
.start();
const makeCappedRetryStrategy = (maxRetries: number, cleanup: () => void) => {
return (retries: number, cause: Error) => {
console.log(`retrying... ${retries}`);
if (retries >= maxRetries) {
cleanup();
return false;
}
return 1000;
}
}
const client = createClient({
url: `redis://localhost:${container.getMappedPort(6379)}`,
socket: {
reconnectStrategy: makeCappedRetryStrategy(3, () => {
console.log('cleanup function called');
})
}
});
client.on('error', () => {});
await client.connect();
await container.exec('redis-cli shutdown'); In principal, Im not against firing a 'terminated' event, but fundamentally, the behavior is controlled by the reconnectStrategy. |
@edrose what do you think? |
I don't think that code works when I think the pattern of using the side-effects of a callback function is also just asking for bugs to occur, and I would never do extra work in the callback unless the documentation explicitly states that it's acceptable. As a user of the library I don't know what state the Redis client is in when it calls the callback, so I wouldn't assume that it's safe to modify the state of the client whilst in that callback context. As a bare minimum I would be using |
I see your point. I think its sensible to add the event. @bobymicroby do you see any issues for us emitting a 'terminated' event on client when reconnectStrategy returns false ( which is when we kill the socket and client is unusable from that point on ) ? |
The code should work when duplicate is used as every client gets its own "retry" function instance. If it doesn't, it's a bug we missed, but from what I see there shouldn't be a problem. The retryStrategy function that the user supplies has no side effects and is pure. It receives the strategy(retries, cause) - retries so far and the latest error - so you are free to make informed decisions and to decide if you want to keep retrying or not based on that. |
What we can do ( just thinking out loud, not sure if it makes sense ) is wrap the last error in RetryStrategyExhausted(originalError) before emitting it here: https://github.com/bobymicroby/node-redis/blob/065eb5e9145db9152923a44b63a356c55f76cf3b/packages/client/lib/client/socket.ts#L192. Unfortunately this can break clients which expect the raw error and will need more consideration. |
Description
I'm having some issues with the reconnect strategy in this library. It seems to work as per the documentation on the initial connection to Redis, however the behaviour changes if the connection to Redis is lost at a later stage.
The documentation states that there are three different types that can be returned:
number
can be returned to indicate that the client should retry after the specified delayfalse
can be returned to prevent reconnection and hold the client in some kind of broken state for testingBehaviour on Initial Connection
RedisSocket.connect()
function is calledRedisSocket.#connect()
which deals with retriesfalse
(which I assume is a noop in javascript) it bubbles up to the initial call toRedisSocket.connect()
and the promise called by the library user rejects with the errorThis is the behaviour that is documented.
Behaviour when connection is lost
"error"
event fires, which calls intoRedisSocket.#onSocketError()
RedisSocket.#onSocketError()
re-emits the error event and also emits"reconnecting"
before callingthis.#connect()
on line 269this.#connect().catch(() => {}));
Any error in connecting is completely ignored. This means that is has the same effect as if the reconnect strategy function returned
false
which (quoting the documentation here) "should only be used for testing purposes". This silently leaves the client in a broken state.What I expect to happen
At the very least expect the
"error"
event to be emitted with the error returned by the reconnect strategy. Even better would be for a"terminated"
or similar signal to be emitted to tell the library that the retry strategy failed and allow it to do it's own recovery.Node.js Version
18.20.6
Redis Server Version
7.3.241
Node Redis Version
5.0.1
Platform
Linux
Logs
The text was updated successfully, but these errors were encountered: