Multi-key operations in cluster #838
Can I assume here that you aren't currently using "hash tags", and that these keys are just arbitrary keys? Assuming that: then currently SE.Redis acts naively here; it assumes any single operation is routable to a single server. Theoretically we could make SE.Redis do the work of noting which keys belong to which servers and doing a scatter/gather between the various nodes. Options right now:

1. use "hash tags" if the data is strongly related (this keeps related data in a single slot)
2. use individual StringGet
3. we implement a new API that does a scatter/gather (would need to wait for this)
4. your code could manually group by slots to perform multiple variadic gets
Note that 2 doesn't necessarily have a performance impact, as the individual gets can essentially be pipelined; done that way, the above does not pay latency per operation.
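A minimal sketch of that pipelined shape, assuming an existing `IDatabase` (`db`) and an array of keys; the gets are issued without awaiting each one, so they are written to the connection back-to-back and awaited once as a group:

```csharp
using System.Linq;
using System.Threading.Tasks;
using StackExchange.Redis;

public static class PipelinedGet
{
    public static async Task<RedisValue[]> GetAllAsync(IDatabase db, RedisKey[] keys)
    {
        // issue all the gets without awaiting; they are pipelined onto the
        // connection, each routed to whichever node owns that key's slot
        Task<RedisValue>[] pending = keys.Select(k => db.StringGetAsync(k)).ToArray();

        // one await for the whole batch
        return await Task.WhenAll(pending);
    }
}
```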
Thanks, this is very useful. Options 1 and 4 are not possible in the current use case. Option 3 would be great to have, but option 2 pipelined also looks like a good solution for now. Thank you!
Just wanted to clarify what multi-key operations are. Does single-key operation mean the operation returns a single key/key-value pair in one call, and multi-key that it returns multiple keys/key-value pairs in a single call? If so, of the following operations in my code, are any of these supported in default cluster mode without resorting to any of the options above? SetMembers, HashGetAll, HashGet, SetScan
One other question. If there are current limitations on clusters handling multi-key operations, then horizontal scaling (adding servers to a cluster) may be less desirable than going to a memory appliance hosting a single Redis instance. Do you have any info or case studies on massive memory installations using single Redis instances?
@dmwharris Can you define "massive"? What sort of scale are we talking about here?
SetMembers, HashGetAll, HashGet and SetScan are all single-key operations and should be fine. Just count how many RedisKey parameters a method has: if it is more than one (including via arrays) then it is multi-key, and it will need to be routable to a single server, typically via "hash tags".
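For illustration, a minimal sketch of the hash-tag approach; the key names (`{user:12345}:contacts` and `{user:12345}:blocked`) are made up here, but because cluster hashes only the text inside the braces, both keys land in the same slot and the multi-key intersect is routable to one node:

```csharp
using StackExchange.Redis;

class HashTagExample
{
    static void Run(IDatabase db)
    {
        // the {user:12345} tag is what gets hashed, so both keys share a slot
        RedisKey contacts = "{user:12345}:contacts";
        RedisKey blocked  = "{user:12345}:blocked";

        db.SetAdd(contacts, new RedisValue[] { "alice", "bob" });
        db.SetAdd(blocked,  new RedisValue[] { "bob" });

        // multi-key operation (SINTER): fine in cluster because it is single-slot
        RedisValue[] blockedContacts = db.SetCombine(SetOperation.Intersect, contacts, blocked);
    }
}
```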
Regarding the second question (about scale) - frankly, that's a redis server question and would best be posted to the redis server folks. The client library merely exposes the functionality (and limitations) available at the server.
Sure, sorry; massive as in 500GB to multi-terabyte memory. I'll take this to the Redis server team.
So Intersection and Difference are multi-key operations and are not supported by the Redis server in default cluster mode, even with hash tags? For these operations, using one of the 4 options above may provide a solution?
@dmwharris If using hashtags they should work; it's not that multi-key operations aren't supported, it's that multi-node operations aren't supported. If you're on the same node, all of this should behave properly.

Can redis do instances that large? Yep. I haven't personally run at that scale though - we have some in the ~300GB range and run without problems, but you are talking about a single process, single replication (e.g. adding a replica copies all the data) and such, so some operations scale up time-wise quite a bit as well. Other auxiliary things like the circular change buffer need to be increased so replicas can initiate and catch up.

It's hard to answer if that's a good approach. How stable is the hardware? How much load? Do you need to scale out? Does rack space matter? How about patching and failovers? I'd say things *not really about redis* but a single-server dependency factor in more at those kinds of scale than the limits of the process (unless 200k+ ops/node consistently, or large blocking transfers, is a factor in itself).

If cluster is more appealing (it usually is from an infrastructure and redundancy standpoint if server count isn't a bigger issue), then the multi-key operations are solvable with the tags. Sorry that's not really an answer, but "it depends" with no explanation is crap...those are the factors I think "it depends" on. Happy to chat further on it.
"it's multi-node operations aren't supported."
Just for completeness: the library actually tried to enforce that multi
**slot** operations aren't supported. The cluster keyspace is divided into
16k slots, with each node servicing some subset of slots. However, crossing
two slots is always a semantic error, even if they currently resolve to the
same node, because one of the key design points of cluster is that slots
can migrate between nodes. This means that a multi-key operation could turn
out to be single node today, but multi node tomorrow, causing unexpected
failures. To guard against this, the library enforces single slot
operations.
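If the keys can't share a hash tag, option 4 from the list above can respect that single-slot rule by grouping keys per slot before each variadic get. A minimal sketch, assuming the multiplexer exposes `HashSlot(RedisKey)`; if your version doesn't, the slot can be computed client-side (CRC16 of the hash-tag portion of the key, mod 16384):

```csharp
using System.Collections.Generic;
using System.Linq;
using System.Threading.Tasks;
using StackExchange.Redis;

public static class SlotGroupedGet
{
    public static async Task<RedisValue[]> GetAllAsync(ConnectionMultiplexer muxer, RedisKey[] keys)
    {
        IDatabase db = muxer.GetDatabase();

        // one variadic StringGet (MGET) per slot; every key in a group is single-slot by construction
        var perSlot = keys
            .GroupBy(k => muxer.HashSlot(k))
            .Select(g =>
            {
                RedisKey[] slotKeys = g.ToArray();
                return new { Keys = slotKeys, Pending = db.StringGetAsync(slotKeys) };
            })
            .ToArray();

        await Task.WhenAll(perSlot.Select(x => x.Pending));

        // stitch the per-slot results back into the caller's original key order
        var byKey = new Dictionary<RedisKey, RedisValue>();
        foreach (var group in perSlot)
        {
            RedisValue[] values = group.Pending.Result;
            for (int i = 0; i < group.Keys.Length; i++)
            {
                byKey[group.Keys[i]] = values[i];
            }
        }
        return keys.Select(k => byKey[k]).ToArray();
    }
}
```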
Hello @mgravell , I have one question about your point above (the "Options right now" list and the note that option 2 can essentially be pipelined). Unfortunately option 1 is not possible for me. I'm currently testing options 2 and 4, but I get a significant performance degradation when requesting 100+ keys in parallel, due to TPL saturation; grouping by slot improves performance a little bit, but it's not enough. The solution I suppose is to implement an explicit Pipeline operation to group all parallel requests into one; that way we would have only one TaskCompletionSource on the TPL, unlike the current implicit pipelining. Is that what you meant by option 3? If not, do you think this solution is justifiable and could eventually be implemented?
Ah, the TPL; always a giver. Yes, that's broadly what I meant by 3. There's already a method that gets the *slot* by key, but that isn't really granular enough. Let me have a look - it probably isn't too ugly to add.
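As an aside, the library's existing `IBatch` API already gives explicit control over when a group of commands is flushed to the connection. It doesn't reduce the number of Tasks the TPL sees - each operation still gets its own - but it is the closest existing shape to an explicit pipeline. A minimal sketch:

```csharp
using System.Linq;
using System.Threading.Tasks;
using StackExchange.Redis;

public static class BatchedGet
{
    public static Task<RedisValue[]> GetAllAsync(IDatabase db, RedisKey[] keys)
    {
        IBatch batch = db.CreateBatch();

        // nothing is written to the connection until Execute() is called
        Task<RedisValue>[] pending = keys.Select(k => batch.StringGetAsync(k)).ToArray();
        batch.Execute();   // flush the whole group at once

        return Task.WhenAll(pending);
    }
}
```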
I've added a … (commit: "…the muxer's state; also avoid hash-slot overhead when not cluster")
@mgravell We have the following script:

We run it by calling … What can be done in order to avoid getting the multi-key operation errors?
The second parameter is presumably the timeout value? In that case, it isn't a key - it is a value; so instead of passing it as a second key, pass it in the values array: `_db.ScriptEvaluateAsync(Script, new RedisKey[] { key }, new RedisValue[] { ttl });`
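Since the script itself isn't shown above, here is a hypothetical stand-in just to illustrate the shape being suggested: the key is the only entry in KEYS (so the call stays routable by a single key), and the ttl travels in ARGV as a value.

```csharp
using System.Threading.Tasks;
using StackExchange.Redis;

public static class ScriptExample
{
    // hypothetical Lua body: touches only KEYS[1], reads the ttl from ARGV[1]
    private const string Script = @"
        if redis.call('EXISTS', KEYS[1]) == 1 then
            return redis.call('EXPIRE', KEYS[1], ARGV[1])
        end
        return 0";

    public static Task<RedisResult> TouchAsync(IDatabase db, RedisKey key, int ttlSeconds)
    {
        // the key is the only routable element; the ttl rides along as a value
        return db.ScriptEvaluateAsync(Script,
            new RedisKey[] { key },
            new RedisValue[] { ttlSeconds });
    }
}
```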
In a Redis cluster, I get the following error message when retrieving a RedisValue array by passing a RedisKey array to StringGet():
    at StackExchange.Redis.ServerSelectionStrategy.Select(Message message)
    at StackExchange.Redis.ConnectionMultiplexer.TryPushMessageToBridge[T](Message message, ResultProcessor`1 processor, ResultBox`1 resultBox, ServerEndPoint& server)
    at StackExchange.Redis.ConnectionMultiplexer.ExecuteSyncImpl[T](Message message, ResultProcessor`1 processor, ServerEndPoint server)
    at StackExchange.Redis.RedisBase.ExecuteSync[T](Message message, ResultProcessor`1 processor, ServerEndPoint server)
What can I do to solve this?