Validate key slots used with RedisCluster::Client#with #314
Conversation
I tried really hard to avoid doing this, but unfortunately I don't think we can. In order for pinning support to be safe, the underlying RedisCluster instances we hand out from the `#with` method need to know that they're "locked" to a particular slot, so that they can reject cross-slot accesses that might become invalid when the cluster scales out or in. I had thought it would be possible to do this with middleware, but the middleware can't easily be accessed from `#with` to set & unset the pinning state.
This is required because the next commit is going to need to access this object from inside a RedisClient middleware.
This is so we can pass it into the individual cluster connections, which obviously need to exist before COMMANDS is run. The cluster connections themselves are going to need this in order to implement slot number validation - because they need to know how to extract the keys.
This is the final phase of pushing down the Command object. Now, it's available on the Node::Config object, and thus from the connection objects.
This is achieved by having the #with method "lock" the yielded connection to a particular slot, and having middleware which implements that locking.
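As a rough illustration of this "flag on the client" approach, here is a self-contained sketch. All names here (`SlotLockedClient`, `locked_to_key_slot` as shown, the injected `slot_for` lambda) are hypothetical stand-ins, not the PR's actual code: the client records a pinned slot for the duration of a block, and every call checks its key against that slot.

```ruby
# Hypothetical sketch: a client that can be temporarily "locked" to one slot.
class SlotLockedClient
  CrossSlotError = Class.new(StandardError)

  def initialize(slot_for)
    # slot_for stands in for RedisClient::Cluster::KeySlotConverter.convert
    @slot_for = slot_for
    @pinned_slot = nil
  end

  # Pins the client to the slot of `key` for the duration of the block,
  # mirroring the role of the flag that #with would set and unset.
  def locked_to_key_slot(key)
    @pinned_slot = @slot_for.call(key)
    yield self
  ensure
    @pinned_slot = nil
  end

  def call(*command)
    key = command[1] # naive: assumes the key is the second element
    if @pinned_slot && key && @slot_for.call(key) != @pinned_slot
      raise CrossSlotError, "#{key} is not in pinned slot #{@pinned_slot}"
    end

    "OK #{command.join(' ')}" # a real client would send this to Redis
  end
end
```

The `ensure` clause is what makes the flag safe to set and unset around the yielded block, which is exactly the communication between `#with` and the validation logic that plain middleware couldn't provide.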
(force-pushed from e887cb0 to b0a0d8a)
Poke @supercaracal, wondering if you're able to have a look this week? 🙏 I anticipate this is going to need some feedback and reworking :)

Sorry for my delayed response. I'm going to review this pull request within a week.
```ruby
attr_reader :cluster_commands
```
What's the difference between `cluster_commands` and `commands`?
Sorry, I misread it. Forget about that.
```ruby
class SingleNodeRedisClient < ::RedisClient
  include Pinning::ClientMixin
```
I think the complexity you mentioned stems from this mixin-and-middleware pattern:

- The Config instance holds the command information.
- Our cluster client uses our own client implementation.
- The Pinning module holds the locking state.

I think it would be better to use a simple wrapper object for the slot validation:
```ruby
class RedisClient
  class Cluster
    class PinningClient
      def initialize(client, key, command, command_builder)
        @client = client
        @slot = ::RedisClient::Cluster::KeySlotConverter.convert(key)
        @command = command
        @command_builder = command_builder
      end

      def call(*args, **kwargs, &block)
        command = @command_builder.generate(args, kwargs)
        call_v(command, &block)
      end

      def call_v(command, &block)
        validate_slot!(command)
        @client.call_v(command, &block)
      end

      # etc...

      private

      def validate_slot!(command)
        # raise an error if command is invalid
      end
    end
  end
end
```
```ruby
class RedisClient
  class Cluster
    def with(key: nil, hashtag: nil, write: true, retry_count: 0)
      key = process_with_arguments(key, hashtag)
      node_key = @router.find_node_key_by_key(key, primary: write)
      node = @router.find_node(node_key)
      @router.try_delegate(node, :with, retry_count: retry_count) do |conn|
        yield PinningClient.new(conn, key, @router.command, @command_builder)
      end
    end
  end
end
```
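For illustration, the `validate_slot!` stub might be fleshed out along these lines. This is a self-contained sketch: `slot_for` and `extract_keys` are injected stand-ins (assumptions for demonstration, not the library's real API) for `RedisClient::Cluster::KeySlotConverter.convert` and `Command#extract_all_keys`.

```ruby
# Sketch of the slot-validation check a PinningClient-style wrapper performs.
class CrossSlotError < StandardError; end

class PinningClientSketch
  def initialize(client, key, extract_keys, slot_for)
    @client = client
    @extract_keys = extract_keys # stand-in for Command#extract_all_keys
    @slot_for = slot_for         # stand-in for KeySlotConverter.convert
    @slot = slot_for.call(key)   # the slot this wrapper is pinned to
  end

  def call_v(command, &block)
    validate_slot!(command)
    @client.call_v(command, &block)
  end

  private

  # Reject any command whose keys hash to a slot other than the pinned one.
  def validate_slot!(command)
    @extract_keys.call(command).each do |key|
      next if @slot_for.call(key) == @slot

      raise CrossSlotError, "#{key} does not belong to pinned slot #{@slot}"
    end
  end
end
```

Because the check happens per command in `call_v`, a cross-slot access fails fast on the client instead of surfacing later as a MOVED/CROSSSLOT error after a resharding.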
We can do this, but it's a little awkward because of the need to re-wrap the client if `pipelined` or `multi` is called. It ends up looking like this: https://github.com/redis-rb/redis-cluster-client/pull/298/files#diff-426e610bbc89f748ae5f9d43790b07d10fed426fa431d6bea29b67f649a7eb72R1 which I remember we were not very happy with.
Most of the complexity here is coming from the fact that we have to change the lifecycle of the `RedisClient::Cluster::Command` object so that it's available for the pinning middleware. If we made our custom `RedisClient` subclass support some kind of temporary "validation procs" itself, then the `RedisCluster::Client` instance can construct the validation proc only when `#with` is called and `@commands` is already known.

I think that will be a lot simpler - let me try that first.
I think we can omit `#pipelined` and `#multi` from the `#with` method. No user would use them in such an odd way, and we can never cover every use case completely. Our users should be responsible for using the method correctly.
We definitely need `#multi` at least - the whole point is to enable single-shard transactions!

I managed to write a much better implementation of the delegator in #298 today though - expect to see another PR tomorrow!
I thought the `#multi` method was held by the cluster client:

```ruby
@cluster_client = build_cluster_client
@cluster_client.with(...) do |raw_or_pinning_cli|
  raw_or_pinning_cli.call('CMD')

  @cluster_client.multi(...) do |raw_or_pinning_cli|
    # do something
  end
end
```

Anyway, I think we shouldn't overengineer this.
No, my plan was to delete the `#multi` implementation in `RedisClient::Cluster` (or, at least, re-implement it in terms of `RedisClient::Cluster#with`).

My understanding was that the original idea of adding this kind of "pinning" was to avoid needing to implement transaction support in `RedisClient::Cluster` at all - instead, users can issue transactions directly against the correct node selected by `#with`.
I'm unsure about this decision. In a Redis cluster, I think users should keep their use of the transaction feature simple, because of the CAP theorem. I've thought of the `#with` method as just an optional way to use the transaction feature, with some complicated caveats, in cluster mode - not as the main interface for transactions. That's a bit different from redis-client. Please give me time to think about it.
Remember, the original problem I was solving was that `RedisClient::Cluster#multi` currently calls the transaction block twice - once itself, to figure out what node to send the transaction to, and then `RedisClient#multi` calls it again to actually build up the transaction to execute. This interface is also different from what's in `RedisClient`!
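To make the double-yield behaviour concrete, here's a toy model (all names here are hypothetical; this is not the library's code): the cluster side runs the block once against a throwaway recorder just to learn which key (and hence node) is involved, then runs it again for real.

```ruby
# Toy model of the calls-block-twice behaviour: the block runs once against
# a throwaway recorder to discover the key/node, then again to build the
# real transaction.
class Recorder
  attr_reader :commands

  def initialize
    @commands = []
  end

  def call(*command)
    @commands << command
  end
end

class ToyClusterMulti
  def multi
    probe = Recorder.new
    yield probe # first run: only to discover which node the keys map to
    node_key = probe.commands.first[1]

    real = Recorder.new
    yield real # second run: the transaction that actually gets sent
    { node_key: node_key, commands: real.commands }
  end
end
```

Any side effects in the caller's block therefore happen twice, which is exactly the bug being described.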
From my pretty extensive survey of the options here over the last few months, there's a pretty fundamental tradeoff to make here:

- If you want to implement transaction support inside redis-cluster-client in terms of `RedisClient#multi`, you must know the node ahead of time. You can't lazily call `RedisClient#multi` once you see the first command in the transaction, because you've already yielded control back to the caller, and can't give the caller the real `RedisClient::Multi` object without yielding again (which was the original bug I was trying to solve!).
- If you absolutely want `RedisClient::Cluster#multi` to work without knowing the node ahead of time (and thus make it fully compatible), the only choice is to implement a kind of 'lazy transaction' system like I did in "Support watch & unwatch properly" #295. That is, you wait for the first command in the transaction, then issue MULTI on the appropriate node (and the redis-cluster-client must also take responsibility for calling EXEC/DISCARD/UNWATCH when appropriate). This also works when issuing `client.call('MULTI') ... client.call('EXEC')`, which is valid with `RedisClient` too.
My understanding is that we settled on the `#with` interface as the best way to implement "knowing the node ahead of time", because you can also make the same code still work with RedisClient (since `RedisClient#with` does nothing) - like I wrote in the docs here:

redis-cluster-client/README.md, lines 253 to 255 in 86c0e01:

> Because `RedisClient` from the redis-client gem implements `#with` as simply `yield self` and ignores all of its arguments, it's possible to write code which is compatible with both redis-client and redis-cluster-client; the `#with` call will pin the connection to a slot when using clustering, or be a no-op when not.
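As a minimal, runnable illustration of that README point (the stub below is mine, not the gem's code): plain `RedisClient#with` amounts to `yield self` with its arguments ignored, so the same block works unchanged whether or not pinning is in play.

```ruby
# A stand-in for plain RedisClient's behaviour: #with is effectively
# `yield self`, ignoring its arguments, whereas the cluster client's #with
# would use key/hashtag to pick and pin a node connection.
class PlainClientStub
  def with(**_options)
    yield self # non-cluster client: no pinning, arguments ignored
  end

  def call(*command)
    command.join(' ') # echo the command; a real client would send it to Redis
  end
end

client = PlainClientStub.new

# The exact same block could run against RedisClient::Cluster#with, where it
# would be pinned to the slot of the {user123} hashtag.
result = client.with(hashtag: 'user123') do |conn|
  conn.call('GET', '{user123}:profile')
end
```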
If we want to keep `RedisClient::Cluster#multi` working how it is now, for backwards compatibility, we can keep its current implementation. However, going forward I think we should recommend that people use `RedisClient::Cluster#with`, since it doesn't have the calls-block-twice behaviour.

For compatibility with `RedisClient`, an approach like #295 would be better - it would let us behave exactly like a drop-in replacement for `RedisClient`.
One more thing: I think it would be better to be able to enable or disable the client-side validation feature via options.

```ruby
class RedisClient
  class Cluster
    def with(key: nil, hashtag: nil, write: true, retry_count: 0)
      key = process_with_arguments(key, hashtag)
      node_key = @router.find_node_key_by_key(key, primary: write)
      node = @router.find_node(node_key)
      @router.try_delegate(node, :with, retry_count: retry_count) do |conn|
        if @config.client_side_validation_enabled?
          yield PinningClient.new(conn, key, @router.command, @command_builder)
        else
          yield conn
        end
      end
    end
  end
end
```
Thanks for your feedback @supercaracal - I opened #317 instead, which makes the changes you suggested here (it's so different I figured it was better to open a separate PR rather than clutter this one).
The overall objective of this PR is to unskip the test which asserts that `TransactionConsistencyError` is raised when using a key belonging to the wrong slot on a connection returned from `#with` - i.e. it makes that test pass.

In order to get there, though, there is a bit of a journey. I've broken it up into five separate commits, which I can make separate PRs for - I did want to show you the whole picture before we drill down into the detail, however.
Firstly, and sadly, I discovered that we do actually need to make a custom subclass of `RedisClient`, and pass that as `client_implementation:` in our custom config class. I had previously thought I could implement the slot validation just with middleware, but we need a way to communicate between `#with`, which only has access to the client object, and the middleware. By implementing a custom subclass, we can define a method `Client#locked_to_key_slot`, which essentially turns the middleware on and off by setting a flag on the client object which the middleware can see. I do realise that we'd discussed doing this without the client subclass, but unfortunately I could not find another way.
The next phase of this PR involves getting access to the `RedisCluster::Client::Command` object from inside the middleware. There's a bit of circularity here, because obviously we need clients to run COMMAND on, but we need the Command object inside the client middleware (to be able to call `#extract_all_keys` on passed-in commands). I resolved this by turning `Command` into a long-lived object which gets passed into the `Node::Config` redis-client config objects.

Once all that's done, we're finally in a position to actually implement the "lock to a single slot". That turns out to be pretty simple, in a single file `pinning.rb`: we just check whether the connection is currently supposed to be pinned, and if so compare the pinned slot against `Command#extract_all_keys(command)`.

Let's debate the general direction of this PR, and perhaps I can cut it up into smaller pieces for more detailed review once we're OK with that?