Skip to content

DataShard: prioritize cancellation processing for reads #14936

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Closed
snaury opened this issue Feb 23, 2025 · 0 comments · Fixed by #15398
Closed

DataShard: prioritize cancellation processing for reads #14936

snaury opened this issue Feb 23, 2025 · 0 comments · Fixed by #15398
Assignees

Comments

@snaury
Copy link
Member

snaury commented Feb 23, 2025

Turns out it is possible to overload datashards with reads, in a subtle way where inflight will essentially be reported as zero, while datashard's mailbox has so many events that there's no chance for each read to be processed before the timeout at the client. When mailbox is overloaded in such a way TEvReadCancel is never processed before the request completes, which means cancellation doesn't really work.

This happens because simple reads are processed in their event handler, and even the pipeline transaction is often executed directly in the event handler. Since any cancellation cannot arrive before the request, such simple requests are essentially uncancellable.

We need to make sure we never execute read requests directly, and we need to make sure we scan the actor mailbox before actually running every request, because the request may actually be cancelled.

Instead of executing the read via pipeline transaction directly, we need to always put new reads in a queue, and handle them one by one by scheduling at new event when executing the last request, similarly to how progress transaction executes immediate transactions. This way we will always scan the mailbox before each execution attempt, and it will give TEvReadCancel a chance to remove reads from the queue. We will also be able to observe queue size for reads, provided this mailbox processing is fast enough. I think we would probably want to move read keys into an event payload, so they are not parsed from protobuf when the request first arrives (this parsing may be more expensive than the read itself).

This shouldn't affect throughput, since there will be at most one extra event per read, and as long as datashard doesn't fully consume a thread it shouldn't affect latency much either (as long as preliminary request processing is very cheap, which means we should probably move as much validation as possible outside of the enqueue), since adding events to a running mailbox is very cheap. We could even resend the same event handle to avoid extra allocation and adding a flag marking "ready for processing". In case TEvReadCancel arrives first the enqueued event will then be marked as cancelled and ignored.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging a pull request may close this issue.

1 participant