Multi-key operations in cluster #838


Closed
danieljohannsen opened this issue May 27, 2018 · 16 comments

@danieljohannsen

In a Redis cluster, I get the following error message when retrieving a RedisValue array by passing a RedisKey array to StringGet():

Multi-key operations must involve a single slot; keys can use 'hash tags' to help this, i.e. '{/users/12345}/account' and '{/users/12345}/contacts' will always be in the same slot

at StackExchange.Redis.ServerSelectionStrategy.Select(Message message)
at StackExchange.Redis.ConnectionMultiplexer.TryPushMessageToBridge[T](Message message, ResultProcessor`1 processor, ResultBox`1 resultBox, ServerEndPoint& server)
at StackExchange.Redis.ConnectionMultiplexer.ExecuteSyncImpl[T](Message message, ResultProcessor`1 processor, ServerEndPoint server)
at StackExchange.Redis.RedisBase.ExecuteSync[T](Message message, ResultProcessor`1 processor, ServerEndPoint server)

What can I do to solve this?

@mgravell
Collaborator

mgravell commented May 27, 2018

Can I assume here that you aren't currently using "hash tags", and that these keys are just arbitrary keys?

Assuming that: then currently SE.Redis acts naively here; it assumes any single operation is routable to a single server. Theoretically we could make SE.Redis do the work of noting which keys belong to which servers and doing a scatter/gather between the various nodes (essentially returning a Task.WhenAll over all the strands), but that isn't something that exists today.

Options right now:

  1. use "hash tags" if the data is strongly related (this keeps related data in a single slot)
  2. use individual StringGet
  3. we implement a new API that does a scatter/gather (would need to wait for this)
  4. your code could manually group keys by slot to perform multiple variadic gets

Note that 2 doesn't necessarily have performance impact, as it can essentially be pipelined:

RedisKey[] keys = ...
RedisValue[] results = await Task.WhenAll(keys.Select(key => db.StringGetAsync(key)));

Note that the above does not pay round-trip latency per operation.

(edit: clarification; swapped Array.ConvertAll for Select)
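A sketch of what option 4 involves without any library support - plain C# implementing the hash-slot rule from the Redis cluster spec (CRC16/XMODEM over the key, or over the non-empty {hash tag} if present, mod 16384), then grouping keys by slot so each group can safely go to a single variadic StringGet. Everything below is illustrative app code, not SE.Redis API:

```csharp
using System;
using System.Linq;
using System.Text;

// CRC16/XMODEM (poly 0x1021, init 0) - the variant the Redis cluster
// spec defines for key-to-slot mapping.
static ushort Crc16(byte[] data)
{
    ushort crc = 0;
    foreach (byte b in data)
    {
        crc ^= (ushort)(b << 8);
        for (int i = 0; i < 8; i++)
            crc = (crc & 0x8000) != 0 ? (ushort)((crc << 1) ^ 0x1021) : (ushort)(crc << 1);
    }
    return crc;
}

static int GetSlot(string key)
{
    // If the key contains a non-empty {hash tag}, only the tag is hashed,
    // which is how related keys are forced into the same slot.
    int open = key.IndexOf('{');
    if (open >= 0)
    {
        int close = key.IndexOf('}', open + 1);
        if (close > open + 1) key = key.Substring(open + 1, close - open - 1);
    }
    return Crc16(Encoding.UTF8.GetBytes(key)) % 16384;
}

string[] keys = { "{user:1}:name", "{user:1}:email", "order:7" };
foreach (var group in keys.GroupBy(GetSlot))
{
    // Each group maps to one slot, so it could be passed as a single
    // variadic db.StringGet(...) call without the multi-slot error.
    Console.WriteLine($"slot {group.Key}: {string.Join(", ", group)}");
}
```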

@danieljohannsen
Author

danieljohannsen commented May 27, 2018

Thanks, this is very useful. Options 1 and 4 are not possible in the current use case. Option 3 would be great to have, but option 2, pipelined, also looks like a good solution for now. Thank you!

@dmwharris

dmwharris commented May 31, 2018

Just wanted to clarify what multi-key operations are. Does a single-key operation mean the operation returns a single key/key-value pair in one call, and multi-key that it returns multiple keys/key-value pairs in a single call? If so, of the following operations in my code, are any supported in default cluster mode without resorting to any of the options above?

SetMembers
SetCombine(SetOperation.Intersect...
HashGet
SetScan(scankey, scanpattern,...
SetCombine(SetOperation.Difference...
HashGetAll

@dmwharris

One other question. If there are current limitations on clusters to handle multi-key operations, then horizontal scaling (adding servers to a cluster) may be less desirable than going to a memory appliance hosting a single Redis instance. Do you have any info or case studies on massive memory installations using single Redis instances?

@NickCraver
Collaborator

@dmwharris Can you define "massive"? What sort of scale are we talking about here?

@mgravell
Collaborator

mgravell commented May 31, 2018 via email

SetMembers, HashGetAll, HashGet, and SetScan are all single-key operations and should be fine. Just count how many RedisKey parameters a method has: if it is more than one (including via arrays), then it is multi-key, and it will need to be routable to a single server, typically via "hash tags".

@dmwharris

dmwharris commented May 31, 2018

Sure, sorry, massive as in 500GB to multi-terabyte memory like
IBM Power795 (16TB memory)
HP ProLiant DL380p Gen8 Server (12 TB memory)

I'll take this to the Redis server team.

@dmwharris

So Intersection and Difference are multi-key operations and are not supported by the Redis server in default cluster mode, even with hash tags? For these operations, using one of the 4 options above may provide a solution?

@NickCraver
Collaborator

@dmwharris If you're using hash tags they should work; it's not that multi-key operations aren't supported, it's that multi-node operations aren't supported. If all the keys are on the same node, all of this should behave properly.
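To make that concrete, a hedged sketch (hypothetical key names; `db` is an IDatabase from a cluster connection, so this needs a live cluster to actually run): because both keys share the same hash tag, they land in the same slot, and the multi-key SetCombine is routable to one node.

```csharp
// Both keys contain the {user:42} hash tag, so Redis hashes only
// "user:42" and both sets map to the same slot - making the
// multi-key Difference legal in cluster mode.
db.SetAdd("{user:42}:follows", "alice");
db.SetAdd("{user:42}:muted", "bob");
RedisValue[] diff = db.SetCombine(SetOperation.Difference,
    "{user:42}:follows", "{user:42}:muted");
```

Without the shared tag (e.g. `user:42:follows` and `user:42:muted`), the same call would raise the multi-slot error from the top of this thread.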

Can Redis do instances that large? Yep. I haven't personally run at that scale though - we have some in the ~300GB range that run without problems, but you are talking about a single process and single replication (e.g. adding a replica copies all the data), so some operations scale up quite a bit time-wise as well. Other auxiliary things like the replication backlog (a circular change buffer) need to be increased so replicas can initialize and catch up.

It's hard to answer whether that's a good approach. How stable is the hardware? How much load? Do you need to scale out? Does rack space matter? How about patching and failovers? I'd say concerns not really about Redis, but about depending on a single server, factor in more at those kinds of scale than the limits of the process (unless you're consistently doing 200k+ ops/node, or large blocking transfers are a factor in themselves).

If cluster is more appealing (it usually is from an infrastructure and redundancy standpoint, if server count isn't a bigger issue), then the multi-key operations are solvable with the tags. Sorry that's not really an answer, but "it depends" with no explanation is crap... those are the factors I think "it depends" on. Happy to chat further on it.

@mgravell
Collaborator

mgravell commented Jun 1, 2018 via email

@sergii-s

sergii-s commented Jun 19, 2018

Hello @mgravell, I have one question about your point above:

> Options right now:
>
> 1. use "hash tags" if the data is strongly related (this keeps related data in a single slot)
> 2. use individual StringGet
> 3. we implement a new API that does a scatter/gather (would need to wait for this)
> 4. your code could manually group keys by slot to perform multiple variadic gets
>
> Note that 2 doesn't necessarily have performance impact, as it can essentially be pipelined

Unfortunately option 1 is not possible for me.
I'm currently testing options "2" and "4", but I get significant performance degradation when requesting 100+ keys in parallel, due to TPL saturation. Grouping by slot improves performance a little, but it's not enough.

The solution, I suppose, is to implement an explicit pipeline operation that regroups all the parallel requests into one. That way there would be only one TaskCompletionSource on the TPL, unlike the current implicit pipelining.

Is that what you meant by option 3? If not, do you think this solution is justifiable and could eventually be implemented?
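One existing knob worth noting here: SE.Redis already exposes IBatch via CreateBatch(), which queues operations client-side and releases them to the connection as a single contiguous unit. A sketch, assuming `db` is an IDatabase and `keys` a RedisKey[] (this still creates one Task per key, so it reduces interleaving rather than eliminating per-operation Task overhead):

```csharp
// Queue all the gets into a single batch; nothing is sent to the
// server until Execute() is called, at which point the whole batch
// goes out back-to-back on the connection.
IBatch batch = db.CreateBatch();
Task<RedisValue>[] pending = keys.Select(k => batch.StringGetAsync(k)).ToArray();
batch.Execute();
RedisValue[] results = await Task.WhenAll(pending);
```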

@mgravell
Collaborator

mgravell commented Jun 19, 2018 via email

mgravell added a commit that referenced this issue Jul 9, 2018
…the muxer's state; also avoid hash-slot overhead when not cluster
@mgravell
Collaborator

mgravell commented Jul 9, 2018

I've added a GetHashSlot method into the 2.0 code to make this possible for app-code without having to implement the algorithm, and tests that illustrate usage. After consideration, I do not propose to change the variadic multi-string get - IMO it should be obvious and explicit to callers that something unusual is happening, which this would mask. There is non-trivial cost associated with this kind of scatter/gather operation over multiple nodes. We might reconsider that as a separate issue in due course.
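Assuming GetHashSlot hangs off the multiplexer (the method name is confirmed above; its exact receiver is an assumption here), option 4 might then look roughly like this, with `muxer`, `db`, and `keys` supplied by the caller:

```csharp
// Group keys by their cluster hash slot, then issue one variadic
// StringGetAsync per slot - each call is now routable to a single
// node, and the calls themselves pipeline concurrently.
var tasks = keys
    .GroupBy(k => muxer.GetHashSlot(k))   // assumed receiver for GetHashSlot
    .Select(g => db.StringGetAsync(g.ToArray()))
    .ToArray();
RedisValue[][] chunks = await Task.WhenAll(tasks);
```

Note the results come back grouped by slot, not in the original key order, so callers that care about ordering need to re-associate values with their keys.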

@ilia-cy

ilia-cy commented Jun 6, 2022

@mgravell
How would you suggest dealing with this error during Lua script executions (in a cluster, of course)?

We have the following script:

local current
current = redis.call("incr", KEYS[1])
if current == 1 then
    redis.call("expire", KEYS[1], KEYS[2])
    return 1
else
    return current
end

We run it by calling _db.ScriptEvaluateAsync(Script, new RedisKey[] { key, ttl });

What can be done in order to avoid getting the multi key operation errors?

@mgravell
Collaborator

mgravell commented Jun 6, 2022

The second parameter is presumably the timeout value? In that case, it isn't a key - it is a value; so instead of KEYS[2], specify ARGV[1], and pass the value in the values array:

_db.ScriptEvaluateAsync(Script, new RedisKey[] { key }, new RedisValue[] { ttl });
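For completeness, the script above rewritten to match that call - a sketch of the same increment-with-expiry pattern, with the TTL arriving as a value rather than a key:

```lua
-- The TTL now arrives via ARGV[1], so the script declares exactly
-- one key and routes cleanly to a single slot in cluster mode.
local current = redis.call("incr", KEYS[1])
if current == 1 then
    redis.call("expire", KEYS[1], ARGV[1])
    return 1
else
    return current
end
```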
