Skip to content

Stats based on LongAdder can return unexpected values #52411

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Closed
ywelsch opened this issue Feb 17, 2020 · 2 comments · Fixed by #52581
Closed

Stats based on LongAdder can return unexpected values #52411

ywelsch opened this issue Feb 17, 2020 · 2 comments · Fixed by #52581
Assignees
Labels
>bug :Data Management/Stats Statistics tracking and retrieval APIs

Comments

@ywelsch
Copy link
Contributor

ywelsch commented Feb 17, 2020

We expect certain stats values to be non-negative when serializing them over the wire. While investigating a test failure (#52406), I noticed that, due to concurrency, counts from LongAdder can be inexact, and even be negative even though in a properly serialized execution no such negative value should be observable.

Here's a test illustrating the issue:

    public void testConcurrentLongAdderAccess() throws InterruptedException {
        AtomicBoolean stopped = new AtomicBoolean();
        LongAdder longAdder = new LongAdder();

        Thread[] writers = new Thread[10];
        for (int i = 0; i < writers.length; i++) {
            writers[i] = new Thread(() -> {
                while(stopped.get() == false) {
                    longAdder.increment();
                    longAdder.decrement();
                }
            });
            writers[i].start();
        }

        Thread[] readers = new Thread[10];
        for (int i = 0; i < writers.length; i++) {
            readers[i] = new Thread(() -> {
                while(stopped.get() == false) {
                    long val = longAdder.longValue();
                    assert val >= 0L : "expected positive but got " + val;
                }
            });
            readers[i].start();
        }

        Thread.sleep(10000);

        stopped.set(true);

        for (int i = 0; i < writers.length; i++) {
            writers[i].join();
        }

        for (int i = 0; i < readers.length; i++) {
            readers[i].join();
        }
    }
@ywelsch ywelsch added >bug :Data Management/Stats Statistics tracking and retrieval APIs labels Feb 17, 2020
@elasticmachine
Copy link
Collaborator

Pinging @elastic/es-core-features (:Core/Features/Stats)

@jasontedor
Copy link
Member

It appears that is documented. The Javadocs for LongAdder#longValue point to the Javadocs for LongAdder#sum:

    /**
     * Equivalent to {@link #sum}.
     *
     * @return the sum
     */
    public long longValue();

And here's the Javadocs for LongAdder#sum:

    /**
     * Returns the current sum.  The returned value is <em>NOT</em> an
     * atomic snapshot; invocation in the absence of concurrent
     * updates returns an accurate result, but concurrent updates that
     * occur while the sum is being calculated might not be
     * incorporated.
     *
     * @return the sum
     */
    public long sum();

dnhatn added a commit that referenced this issue Apr 13, 2021
A CounterMetric is used to track the number of completed and outstanding 
items, for example, the number of executed refreshes, the currently used
memory by indexing, the current pending search requests. In all cases,
the current count of CounterMetric is always non-negative.

However, as this metric is implemented using a LongAdder, the returned
count is NOT an atomic snapshot; invocation in the absence of concurrent
updates returns an accurate result, but concurrent updates that occur
while the sum is being calculated might not be incorporated.

We can replace LongAdder with AtomicLong, but this commit chooses to 
continue using LongAdder but returns 0 when the sum value is negative.

Relates #52411
Closes #70968
dnhatn added a commit to dnhatn/elasticsearch that referenced this issue Apr 14, 2021
A CounterMetric is used to track the number of completed and outstanding 
items, for example, the number of executed refreshes, the currently used
memory by indexing, the current pending search requests. In all cases,
the current count of CounterMetric is always non-negative.

However, as this metric is implemented using a LongAdder, the returned
count is NOT an atomic snapshot; invocation in the absence of concurrent
updates returns an accurate result, but concurrent updates that occur
while the sum is being calculated might not be incorporated.

We can replace LongAdder with AtomicLong, but this commit chooses to 
continue using LongAdder but returns 0 when the sum value is negative.

Relates elastic#52411
Closes elastic#70968
dnhatn added a commit to dnhatn/elasticsearch that referenced this issue Apr 14, 2021
A CounterMetric is used to track the number of completed and outstanding 
items, for example, the number of executed refreshes, the currently used
memory by indexing, the current pending search requests. In all cases,
the current count of CounterMetric is always non-negative.

However, as this metric is implemented using a LongAdder, the returned
count is NOT an atomic snapshot; invocation in the absence of concurrent
updates returns an accurate result, but concurrent updates that occur
while the sum is being calculated might not be incorporated.

We can replace LongAdder with AtomicLong, but this commit chooses to 
continue using LongAdder but returns 0 when the sum value is negative.

Relates elastic#52411
Closes elastic#70968
dnhatn added a commit that referenced this issue Apr 14, 2021
A CounterMetric is used to track the number of completed and outstanding
items, for example, the number of executed refreshes, the currently used
memory by indexing, the current pending search requests. In all cases,
the current count of CounterMetric is always non-negative.

However, as this metric is implemented using a LongAdder, the returned
count is NOT an atomic snapshot; invocation in the absence of concurrent
updates returns an accurate result, but concurrent updates that occur
while the sum is being calculated might not be incorporated.

We can replace LongAdder with AtomicLong, but this commit chooses to
continue using LongAdder but returns 0 when the sum value is negative.

Relates #52411
Closes #70968
dnhatn added a commit that referenced this issue Apr 14, 2021
A CounterMetric is used to track the number of completed and outstanding 
items, for example, the number of executed refreshes, the currently used
memory by indexing, the current pending search requests. In all cases,
the current count of CounterMetric is always non-negative.

However, as this metric is implemented using a LongAdder, the returned
count is NOT an atomic snapshot; invocation in the absence of concurrent
updates returns an accurate result, but concurrent updates that occur
while the sum is being calculated might not be incorporated.

We can replace LongAdder with AtomicLong, but this commit chooses to 
continue using LongAdder but returns 0 when the sum value is negative.

Relates #52411
Closes #70968
dnhatn added a commit that referenced this issue Apr 14, 2021
A CounterMetric is used to track the number of completed and outstanding
items, for example, the number of executed refreshes, the currently used
memory by indexing, the current pending search requests. In all cases,
the current count of CounterMetric is always non-negative.

However, as this metric is implemented using a LongAdder, the returned
count is NOT an atomic snapshot; invocation in the absence of concurrent
updates returns an accurate result, but concurrent updates that occur
while the sum is being calculated might not be incorporated.

We can replace LongAdder with AtomicLong, but this commit chooses to
continue using LongAdder but returns 0 when the sum value is negative.

Relates #52411
Closes #70968
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
>bug :Data Management/Stats Statistics tracking and retrieval APIs
Projects
None yet
Development

Successfully merging a pull request may close this issue.

4 participants