ref(crons): Guard clock ticks against desynced-partitions #54489

evanpurkhiser · 2023-08-09T21:28:21Z

This fixes an issue with the original implementation of GH-54204 when processing messages in a non-monotonic order.

Typically Kafka messages will arrive in order, like so:

  12:59:58
  12:59:59
  01:00:00
  01:00:01
  01:00:01
  01:00:02

However, because of how messages are distributed across the Kafka partitions, we may end up with a scenario that looks like this:

  partitions

  #1        #2        #3
  12:59:58  01:00:00  01:00:01
  12:59:59  01:00:01  01:00:02

With one consumer reading from each partition sequentially, we would read these out as:

  12:59:58
  01:00:00
  01:00:01
  12:59:59 <-- problematic skip backwards in time
  01:00:01
  01:00:02
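The read order above can be reproduced with a small simulation. This is only a sketch: it assumes the consumer takes one message from each partition in turn, and it replaces the wall-clock times with second offsets relative to the 01:00:00 boundary so they compare cleanly. The names `partitions`, `interleave`, and `backwards_skips` are made up for illustration.

```python
from itertools import chain, zip_longest

# Seconds relative to the 01:00:00 minute boundary, mirroring the
# three partitions in the example (-2 is 12:59:58, 0 is 01:00:00, ...).
partitions = [
    [-2, -1],  # partition #1
    [0, 1],    # partition #2
    [1, 2],    # partition #3
]


def interleave(parts):
    """Read one message from each partition in turn (round-robin)."""
    rows = zip_longest(*parts)
    return [ts for ts in chain.from_iterable(rows) if ts is not None]


read_order = interleave(partitions)

# Every adjacent pair where the clock moves backwards.
backwards_skips = [
    (prev, cur) for prev, cur in zip(read_order, read_order[1:]) if cur < prev
]

print(read_order)
print(backwards_skips)
```

Each partition is ordered on its own, but the merged stream is not, which is why the consumer cannot assume timestamps only move forward.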

Prior to this change, when we processed the task_trigger clock tick for the timestamp 12:59:59 after 01:00:01, our GETSET would update the key with an OLDER timestamp.

When the next tick happens at 01:00:01, we would then incorrectly tick for the 01:00:00 minute boundary again.

This change corrects this by first looking at the existing last timestamp value stored in Redis. If that value is newer than the reference timestamp we're about to tick for, we do nothing and do not store the older reference timestamp.
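A rough sketch of the difference, not the code from this PR: `FakeRedis` is an in-memory stand-in that stores plain ints (a real Redis client would return bytes or strings that need `int()`), and `LAST_TS_KEY`, `tick_unguarded`, `tick_guarded`, and `run` are hypothetical names. Treating an equal stored value as "nothing to do" is also an assumption here.

```python
from datetime import datetime, timedelta, timezone


class FakeRedis:
    """In-memory stand-in for the GET and GETSET commands."""

    def __init__(self):
        self._data = {}

    def get(self, key):
        return self._data.get(key)

    def getset(self, key, value):
        previous = self._data.get(key)
        self._data[key] = value
        return previous


LAST_TS_KEY = "last-trigger-ts"  # hypothetical key name
BASE = datetime(2023, 8, 9, 1, 0, 0, tzinfo=timezone.utc)
READ_ORDER = [-2, 0, 1, -1, 1, 2]  # second offsets, in the desynced read order


def minute_boundary(ts):
    return int(ts.replace(second=0, microsecond=0).timestamp())


def tick_unguarded(client, ts):
    """GETSET only: an older message overwrites the stored timestamp."""
    reference_ts = minute_boundary(ts)
    last_ts = client.getset(LAST_TS_KEY, reference_ts)
    return last_ts != reference_ts


def tick_guarded(client, ts):
    """Check the stored value first; skip same-or-older reference timestamps."""
    reference_ts = minute_boundary(ts)
    existing_last_ts = client.get(LAST_TS_KEY)
    if existing_last_ts is not None and existing_last_ts >= reference_ts:
        return False  # stored value is not older: do not tick, do not store
    last_ts = client.getset(LAST_TS_KEY, reference_ts)
    return last_ts != reference_ts


def run(tick_fn):
    """Feed the desynced messages through tick_fn; return boundaries that fired."""
    client = FakeRedis()
    fired = []
    for offset in READ_ORDER:
        ts = BASE + timedelta(seconds=offset)
        if tick_fn(client, ts):
            fired.append(minute_boundary(ts))
    return fired


print(run(tick_unguarded))
print(run(tick_guarded))
```

With these inputs the unguarded version fires four ticks, including the 01:00 boundary twice, while the guarded version fires once per minute boundary.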

evanpurkhiser · 2023-08-09T21:30:14Z

src/sentry/monitors/consumers/monitor_consumer.py

@@ -180,19 +180,33 @@ def _try_handle_high_volume_task_trigger(ts: datetime):
    reference_datetime = ts.replace(second=0, microsecond=0)
    reference_ts = int(reference_datetime.timestamp())

+    existing_last_ts = redis_client.get(HIGH_VOLUME_LAST_TRIGGER_TS_KEY)


I don't particularly LOVE this variable name, any thoughts?

precheck_last_ts? Idk if that's much better

Feels better 👍

wedamija · 2023-08-09T21:31:55Z

src/sentry/monitors/consumers/monitor_consumer.py

@@ -180,19 +180,33 @@ def _try_handle_high_volume_task_trigger(ts: datetime):
    reference_datetime = ts.replace(second=0, microsecond=0)
    reference_ts = int(reference_datetime.timestamp())

+    existing_last_ts = redis_client.get(HIGH_VOLUME_LAST_TRIGGER_TS_KEY)


precheck_last_ts? Idk if that's much better


evanpurkhiser requested a review from a team as a code owner August 9, 2023 21:28

evanpurkhiser requested a review from wedamija August 9, 2023 21:28

github-actions bot added the Scope: Backend label Aug 9, 2023

evanpurkhiser commented Aug 9, 2023


vercel bot deployed to Preview August 9, 2023 21:30 View deployment

wedamija approved these changes Aug 9, 2023


evanpurkhiser force-pushed the evanpurkhiser/ref-crons-guard-clock-ticks-against-desynced-partitions branch from 0601c4e to bd9187d Compare August 9, 2023 21:38

vercel bot deployed to Preview August 9, 2023 21:41 View deployment

evanpurkhiser enabled auto-merge (squash) August 9, 2023 21:57

evanpurkhiser merged commit e0dec76 into master Aug 9, 2023

evanpurkhiser deleted the evanpurkhiser/ref-crons-guard-clock-ticks-against-desynced-partitions branch August 9, 2023 22:07

github-actions bot locked and limited conversation to collaborators Aug 25, 2023
