Caching of cluster slots doesn't self-heal once it doesn't know the whole cluster slot mapping, even though cluster self healed #3620

Closed
TimLovellSmith opened this issue Apr 27, 2025 · 2 comments
@TimLovellSmith

With redis-py version 5.2.1, the error

SlotNotCoveredError('Slot \"4890\" not covered by the cluster. \"require_full_coverage=False\"

is raised, apparently from this line:

raise SlotNotCoveredError(f'Slot "{slot}" is not covered by the cluster.')

Unfortunately, a client instance that raises this error will not self-heal in one important case where it needs to.

That case: the cluster is actually healthy and serving the slot (verified against the cluster slot mapping; note also that operations succeed on some client machines but not on others), but the client happens to be missing that slot from its local slots cache.

From a hasty review of the code, redis-py appears to assume that whenever a slot is missing from its cache, the slot mapping gets back into the cache in one of two ways: by reconnecting and rebuilding the cluster mapping, or by receiving a MOVED error.

The problem is that the MOVED error never arrives. The cluster is already healthy and serving the slot, and the client never even sends the request, so the request cannot land on the wrong node and draw a MOVED response from the server.
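The failure mode described above can be illustrated with a small toy model. This is only a sketch, not redis-py code: `ToyClusterClient`, `SlotNotCovered`, and `server_slots` are hypothetical names, and MOVED handling is reduced to a dictionary update.

```python
# Toy model of the failure mode. NOT redis-py code; every name is hypothetical.

class SlotNotCovered(Exception):
    """Stand-in for the client-side 'slot not covered' error."""


class ToyClusterClient:
    def __init__(self, slots_cache):
        # slot number -> node name, as last learned by this client
        self.slots_cache = dict(slots_cache)
        # counts only the requests that actually reach a server
        self.requests_sent = 0

    def send(self, slot, server_slots):
        node = self.slots_cache.get(slot)
        if node is None:
            # The lookup fails locally: nothing is sent, so no MOVED can come back.
            raise SlotNotCovered(f'Slot "{slot}" is not covered by the cluster.')
        self.requests_sent += 1
        owner = server_slots[slot]
        if owner != node:
            # Only a request that reaches the wrong node triggers the MOVED fix-up.
            self.slots_cache[slot] = owner
            return f"MOVED -> {owner}"
        return "OK"


# The cluster is healthy and serves both slots.
server_slots = {4889: "node-a", 4890: "node-b"}

# Case 1: the slot is missing from the cache. Every attempt fails locally.
missing = ToyClusterClient({4889: "node-a"})
failures = 0
for _ in range(3):
    try:
        missing.send(4890, server_slots)
    except SlotNotCovered:
        failures += 1

# Case 2: the slot is present but stale. One request is enough to heal it.
stale = ToyClusterClient({4889: "node-a", 4890: "node-a"})
first_reply = stale.send(4890, server_slots)
second_reply = stale.send(4890, server_slots)
```

In this model the stale entry heals after a single round trip, while the missing entry stays missing no matter how often the caller retries, because no request ever leaves the client.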

What would be needed instead is something like:
- deliberately route requests to the wrong node when the mapping for a slot is unknown, in order to receive the MOVED error and update the cluster mapping
or
- have an event handler that refreshes the cluster mapping whenever it is not completely known
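The second option could look roughly like the sketch below, which refreshes the mapping once when a lookup misses and then retries. It is a toy model under stated assumptions, not the fix that landed in redis-py: `ToyClient`, `SlotNotCovered`, `fetch_slots`, and `node_for_with_refresh` are all hypothetical names, and `fetch_slots` stands in for whatever call asks the cluster for its current slot mapping.

```python
# Toy model of "refresh the mapping on a cache miss".
# NOT redis-py code; every name is hypothetical.

class SlotNotCovered(Exception):
    """Stand-in for the client-side 'slot not covered' error."""


class ToyClient:
    def __init__(self, slots_cache, fetch_slots):
        self.slots_cache = dict(slots_cache)  # slot -> node, possibly incomplete
        self.fetch_slots = fetch_slots        # callable returning the full mapping
        self.refreshes = 0

    def node_for(self, slot):
        try:
            return self.slots_cache[slot]
        except KeyError:
            raise SlotNotCovered(
                f'Slot "{slot}" is not covered by the cluster.'
            ) from None

    def refresh(self):
        # Replace the local cache with the cluster's current view.
        self.slots_cache = dict(self.fetch_slots())
        self.refreshes += 1

    def node_for_with_refresh(self, slot):
        try:
            return self.node_for(slot)
        except SlotNotCovered:
            # The cache may simply be incomplete: refresh once, then retry.
            self.refresh()
            return self.node_for(slot)  # still raises if truly uncovered


# The cluster is healthy; the client's cache is missing slot 4890.
server_slots = {4889: "node-a", 4890: "node-b"}
client = ToyClient({4889: "node-a"}, fetch_slots=lambda: server_slots)

healed = client.node_for_with_refresh(4890)   # miss, refresh, retry succeeds
cached = client.node_for_with_refresh(4889)   # hit, no refresh needed
refreshes_after_heal = client.refreshes

# A slot that no node serves still fails, after exactly one more refresh.
try:
    client.node_for_with_refresh(9999)
    still_uncovered = False
except SlotNotCovered:
    still_uncovered = True
```

A real client would probably want to rate-limit the refresh so that a genuinely uncovered slot does not cause a topology fetch on every command. In redis-py the pieces to wrap would presumably be `SlotNotCoveredError` and the node manager's initialization routine, but that correspondence is an assumption to check against the installed version.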

TimLovellSmith changed the title from "Caching of cluster slots doesn't self-heal once it doesn't know the whole cluster slot mapping, even though cluster might self heal" to "Caching of cluster slots doesn't self-heal once it doesn't know the whole cluster slot mapping, even though cluster self healed" on Apr 27, 2025
petyaslavova self-assigned this on Apr 28, 2025
@petyaslavova
Collaborator

Hi @TimLovellSmith, I ran into the same issue last week and am currently working on a fix.

@petyaslavova
Collaborator

Closed with PR #3621
