Skip to content

fix: Atomic increment is now performed within one enqueued operation #330

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged
merged 2 commits into from
Apr 23, 2024

Conversation

TizianoCoroneo
Copy link
Contributor

The current implementation of the Atomic property wrapper is sometimes used in a way that results in race conditions.

When used like this:

@Atomic var counter: Int = 0
counter += 1

It doesn't result in correct atomic operations. This is also applicable to operations that read the Atomic value, followed by a write based on the read value.

Screenshot 2024-04-22 at 12 58 28 PM

This is because the line counter += 1 is expanded as:

let oldValue = queue.sync { counter }
queue.sync { counter = oldValue + 1 }

And there is nothing preventing a thread to suspend in the middle of these two operations, so that multiple calls to a function containing an increment could be linearized as:

Thread 1: var x = read counter: 0
Thread 2: var y = read counter: 0
Thread 2: counter = y + 1 = 1
Thread 1: counter = x + 1 = 1
// Two threads trying to increment, but we lost one.

This PR fixes a couple of such cases, by adding a new method to the Atomic property wrapper that allows for more complex operations. Using the new method instead of directly performing the increment fixes the issue:

Screenshot 2024-04-22 at 12 58 53 PM

This PR also adds a unit test just for the Atomic wrapper that demonstrates the issue.

There are another couple cases that I left as they are since I'm not so sure how to fix them:

  • The QueueTimer suspend and resume functions first read the state, then write it if needed, then suspend the timer. This could result in the suspend function being called multiple times, same for the resume. I think the best solution would be to protect state and timer using the same queue.
  • In Analytics.swift:62 we check isActiveWriteKey then in line 67 we add the active key. This is also a data race, that could result in the active key added to activeWriteKeys multiple times. Maybe activeWriteKeys should be a Set to avoid duplicates?
  • In the ConnectionMonitor, the connectionStatus is updated every five minutes. Even if the usage of Atomic is flawed, I left it as is because of the large amount of time between calls.

In general, I suggest to get rid of the @Atomic wrapper altogether, and just use DispatchQueues directly. It will likely result in less queues being managed by the system, and less tricky mistakes like this one. Even better if you move to actor 😄

Copy link
Contributor

@bsneed bsneed left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks @TizianoCoroneo ! I'll look at what to do re Atomic in the bigger picture.

@bsneed bsneed merged commit 89014de into segmentio:main Apr 23, 2024
1 check passed
@bsneed
Copy link
Contributor

bsneed commented May 13, 2024

please see #343 at your convenience. thanks again for noticing this @TizianoCoroneo !

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants