Validate key slots used with RedisCluster::Client#with #314


Closed

Conversation

KJTsanaktsidis
Contributor

The overall objective of this PR is to unskip the test which asserts that TransactionConsistencyError is raised when a key belonging to the wrong slot is used on a connection returned from #with; i.e. it makes this test pass:

        assert_raises(::RedisClient::Cluster::Transaction::ConsistencyError) do
          @client.with(hashtag: 'slot1') do |conn|
            conn.call('GET', '{slot2}')
          end
        end

Getting there involves a bit of a journey, though. I've broken it up into five separate commits, which I can turn into separate PRs; I did want to show you the whole picture before we drill down into the details, however.

Firstly, and sadly, I discovered that we do actually need to make a custom subclass of RedisClient, and pass that as client_implementation: in our custom config class. I had previously thought I could implement the slot validation just with middleware, but we need a way to communicate between #with, which just has access to the client object, and the middleware. By implementing a custom subclass, we can define a method Client#locked_to_key_slot, which essentially turns on and off the middleware by setting a flag on the client object which the middleware can see.

I do realise that we'd discussed doing this without the client subclass, but unfortunately I could not find another way.
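For illustration, here is a minimal stand-alone sketch of that flag pattern; the class and method names (PinnableClient, SlotValidationMiddleware, the slot_of proc) are hypothetical stand-ins, not the PR's actual code:

```ruby
# Hypothetical sketch: a client subclass exposes a flag that #with-style code
# can set for the duration of a block, and middleware consults it per command.
class PinnableClient
  attr_reader :locked_key_slot

  # Turn slot validation on for the duration of the block, then clear it.
  def locked_to_key_slot(slot)
    @locked_key_slot = slot
    yield self
  ensure
    @locked_key_slot = nil
  end
end

class SlotValidationMiddleware
  def initialize(client, slot_of)
    @client = client
    @slot_of = slot_of # stand-in for Command#extract_all_keys + KeySlotConverter
  end

  # Raises when the client is pinned and the command's key maps to another slot.
  def validate!(command)
    pinned = @client.locked_key_slot
    return if pinned.nil?

    key = command[1]
    actual = @slot_of.call(key)
    raise "ConsistencyError: #{key} is in slot #{actual}, not #{pinned}" if actual != pinned
  end
end
```

The point of the pattern is that the middleware never needs a reference from #with; it only reads a flag on the client object both sides can already see.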

The next phase of this PR involves getting access to the RedisCluster::Client::Command object from inside the middleware. There's a bit of circularity here: obviously we need clients to run COMMAND on, but we also need the Command object inside the client middleware (to be able to call #extract_all_keys on passed-in commands). I resolved this by turning Command into a long-lived object which gets passed into the per-node RedisClient config objects (Node::Config).

Once all that's done, we're finally in a position to actually implement the "lock to a single slot" behaviour. That turns out to be pretty simple, living in a single file, pinning.rb: we check whether the connection is currently supposed to be pinned and, if so, compare the pinned slot against Command#extract_all_keys(command).
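A self-contained sketch of that check follows; the slot function and key extraction here are toy stand-ins (the real code uses CRC16 via KeySlotConverter and COMMAND metadata via Command#extract_all_keys):

```ruby
# Minimal sketch of the pinning check: every key a command touches must hash
# to the pinned slot, otherwise raise a consistency error.
class ConsistencyError < StandardError; end

# Stand-in for Command#extract_all_keys: assumes the keys are every argument
# after the command name (true for MGET/DEL, but not for all commands; the
# real Command object consults COMMAND metadata to know key positions).
def extract_all_keys(command)
  command[1..]
end

def validate_pinned_slot!(command, pinned_slot, slot_of)
  extract_all_keys(command).each do |key|
    slot = slot_of.call(key)
    next if slot == pinned_slot

    raise ConsistencyError,
          "#{key} maps to slot #{slot}, but connection is pinned to #{pinned_slot}"
  end
end
```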

Let's debate the general direction of this PR first; once we're OK with that, I can cut it up into smaller pieces for more detailed review, perhaps?

KJ Tsanaktsidis added 5 commits January 23, 2024 17:21
I tried really hard to avoid doing this, but unfortunately I don't think
we can. In order for pinning support to be safe, the underlying
RedisCluster instances we hand out from the `#with` method need to
know that they're "locked" to a particular slot, so that they can reject
cross-slot accesses that might become invalid on a cluster scale out or
in.

I had thought it would be possible to do this with middleware, but, the
middleware can't easily be accessed from `#with` to set & unset the
pinning state.
This is required because the next commit is going to need to access this
object from inside a RedisClient middleware.
This is so we can pass it into the individual cluster connections, which
obviously need to exist before COMMAND is run.

The cluster connections themselves are going to need this in order to
implement slot number validation - because they need to know how to
extract the keys.
This is the final phase of pushing down the Command object. Now, it's
available on the Node::Config object, and thus from the connection
objects.
This is achieved by having the #with method "lock" the yielded
connection to a particular slot, and having middleware which implements
that locking.
@KJTsanaktsidis
Contributor Author

Poke @supercaracal wondering if you're able to have a look this week? 🙏 I anticipate this is going to need some feedback and reworking :)

@supercaracal
Member

Sorry for my delayed response. I'm going to review this pull request within a week.

end

attr_reader :cluster_commands
Member

What's the difference between cluster_commands and commands?

Member

Sorry, I misread it. Forget about that.

@@ -78,14 +79,25 @@ def []=(index, element)
end
end

class SingleNodeRedisClient < ::RedisClient
include Pinning::ClientMixin
@supercaracal (Member) Feb 5, 2024

I think the complexity you mentioned derives from this mixin-and-middleware pattern:

  • Config instance holds the command information.
  • Our cluster client uses our own client implementation.
  • Pinning module holds the state of the locking.

I think it would be better to use a simple wrapper object for the slot validation.

class RedisClient
  class Cluster
    class PinningClient
      def initialize(client, key, command, command_builder)
        @client = client
        @slot = ::RedisClient::Cluster::KeySlotConverter.convert(key)
        @command = command
        @command_builder = command_builder
      end

      def call(*args, **kwargs, &block)
        command = @command_builder.generate(args, kwargs)
        call_v(command, &block)
      end

      def call_v(command, &block)
        validate_slot!(command)
        @client.call_v(command, &block)
      end

      # etc...

      private

      def validate_slot!(command)
        # raise an error if command is invalid
      end
    end
  end
end
class RedisClient
  class Cluster
    def with(key: nil, hashtag: nil, write: true, retry_count: 0)
      key = process_with_arguments(key, hashtag)
      node_key = @router.find_node_key_by_key(key, primary: write)
      node = @router.find_node(node_key)
      @router.try_delegate(node, :with, retry_count: retry_count) do |conn|
        yield PinningClient.new(conn, key, @router.command, @command_builder)
      end
    end
  end
end

Contributor Author

We can do this, but it's a little awkward because of the need to re-wrap the client if pipelined or multi is called. It ends up looking like this: https://github.com/redis-rb/redis-cluster-client/pull/298/files#diff-426e610bbc89f748ae5f9d43790b07d10fed426fa431d6bea29b67f649a7eb72R1 which I remember we were not very happy with.

Most of the complexity here is coming from the fact that we have to change the lifecycle of the RedisClient::Cluster::Command object so that it's available for the pinning middleware. If we made our custom RedisClient subclass support some kind of temporary "validation procs" itself, then the RedisCluster::Client instance can construct the validation proc only when #with is called and @commands is already known.

I think that will be a lot simpler - let me try that first.

Member

I think we can omit #pipelined and #multi from the #with method. Nobody is likely to use it in that odd way, and we can never cover every use case completely. Our users should be responsible for using the method correctly.

Contributor Author

We definitely need #multi at least - the whole point is to enable single shard transactions!

I managed to write a much better implementation of the delegator in #298 today though - expect to see another PR tomorrow!

Member

I thought the #multi method was held by the cluster client.

@cluster_client = build_cluster_client

@cluster_client.with(...) do |raw_or_pinning_cli|
  raw_or_pinning_cli.call('CMD')

  @cluster_client.multi(...) do |raw_or_pinning_cli|
    # do something
  end
end

Anyway, I think we shouldn't over-engineer this.

Contributor Author

No, my plan was to delete the #multi implementation in RedisClient::Cluster (or, at least, re-implement it in terms of RedisClient::Cluster#with).

My understanding was that the original idea of adding this kind of “pinning” was to avoid needing to implement transaction support in RedisClient::Cluster at all - instead users can issue transactions directly against the correct node selected by #with

Member

I'm unsure about your decision. In a Redis cluster, I think users should keep their use of the transaction feature simple, because of the CAP theorem. I've thought of the #with method as just an optional way to use the transaction feature, with some complicated caveats, in cluster mode; it's not the main interface for transactions. That's a bit different from redis-client. Please give me time to think about it.

Contributor Author

Remember, the original problem I was solving was that RedisClient::Cluster#multi currently calls the transaction block twice - once itself to figure out what node to send the transaction to, and then RedisClient#multi calls it again to actually build up the transaction to execute. This interface is also different from what's in RedisClient!

From my pretty extensive survey of the options here over the last few months, there's a pretty fundamental tradeoff to make here:

  • If you want to implement transaction support inside redis-cluster-client in terms of RedisClient#multi, you must know the node ahead of time. You can't lazily call RedisClient#multi once you see the first command in the transaction, because you've already yielded control back to the caller, and can't give the caller the real RedisClient::Multi object without yielding again (which was the original bug I was trying to solve!).
  • If you absolutely want RedisClient::Cluster#multi to work without knowing the node ahead of time (and thus, make it fully compatible), the only choice is to implement a kind of 'lazy transaction' system like I did in Support watch & unwatch properly #295. That is, you wait for the first command in the transaction, then issue MULTI on the appropriate node (and, thus, the redis-cluster-client must also take responsibility for calling EXEC/DISCARD/UNWATCH when appropriate too). This will work when issuing client.call('MULTI') ... client.call('EXEC'), which is valid too with RedisClient.
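As a rough illustration of the second option, a lazy transaction defers node selection until the first command reveals a key; FakeNode and LazyTransaction below are hypothetical stand-ins, not the actual code from #295:

```ruby
# Hypothetical sketch of a "lazy MULTI": no node is chosen up front; the first
# command's key picks the node, MULTI is issued there, and later commands
# (and the final EXEC) follow on the same node.
class FakeNode
  attr_reader :log

  def initialize
    @log = []
  end

  def call(*command)
    @log << command
  end
end

class LazyTransaction
  def initialize(node_for_key)
    @node_for_key = node_for_key # proc: key -> node (stand-in for the router)
    @node = nil
  end

  def call(*command)
    if @node.nil?
      @node = @node_for_key.call(command[1]) # first key seen decides the node
      @node.call('MULTI')
    end
    @node.call(*command)
  end

  def exec
    @node&.call('EXEC')
  end
end
```

This shape also explains why the library must own EXEC/DISCARD/UNWATCH in that design: only it knows which node the transaction ended up on.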

My understanding is that we settled on the #with interface as the best way to implement "knowing the node ahead of time", because you can also make the same code still work with RedisClient (since RedisClient#with does nothing - like I wrote in the docs here:

Because `RedisClient` from the redis-client gem implements `#with` as simply `yield self` and ignores all of its
arguments, it's possible to write code which is compatible with both redis-client and redis-cluster-client; the `#with`
call will pin the connection to a slot when using clustering, or be a no-op when not.
).
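A toy example of why that stays portable; FakeRedisClient here is a hypothetical stand-in for plain RedisClient, whose #with likewise ignores its arguments and yields itself:

```ruby
# A non-cluster client can implement #with as "ignore arguments, yield self",
# so the same calling code works against both client types.
class FakeRedisClient
  def with(*_args, **_kwargs)
    yield self
  end

  def call(*command)
    command.join(' ') # echo the command back, for demonstration only
  end
end

def fetch_pair(client)
  # With redis-cluster-client, hashtag: pins the connection to one slot;
  # with a plain client, #with is a no-op and yields the client itself.
  client.with(hashtag: 'user1') do |conn|
    [conn.call('GET', '{user1}:a'), conn.call('GET', '{user1}:b')]
  end
end
```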

If we want to make RedisClient::Cluster#multi keep working how it is now, for backwards compatibility, we can keep its current implementation. However, going forward I think we should recommend people use RedisClient::Cluster#with, since it doesn't have the calls-block-twice behaviour.

For compatibility with RedisClient, an approach like #295 would be better - it would let us behave exactly like a drop-in replacement for RedisClient.

@supercaracal
Member

supercaracal commented Feb 5, 2024

One more thing: I think it would be better to be able to enable or disable the client-side validation feature via an option.

class RedisClient
  class Cluster
    def with(key: nil, hashtag: nil, write: true, retry_count: 0)
      key = process_with_arguments(key, hashtag)
      node_key = @router.find_node_key_by_key(key, primary: write)
      node = @router.find_node(node_key)
      @router.try_delegate(node, :with, retry_count: retry_count) do |conn|
        if @config.client_side_validation_enabled?
          yield PinningClient.new(conn, key, @router.command, @command_builder)
        else
          yield conn
        end
      end
    end
  end
end

@KJTsanaktsidis
Contributor Author

Thanks for your feedback @supercaracal - I opened #317 instead which makes the changes you suggested here (it's so different I figured it was better to open a separate PR rather than clutter this one).

@KJTsanaktsidis KJTsanaktsidis deleted the ktsanaktsidis/validate_keys branch February 19, 2024 01:13