Skip to content

bugfix: native histogram: data race in addExemplar #1609

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Closed
wants to merge 4 commits into from
Closed
Show file tree
Hide file tree
Changes from all commits
Commits
File filter

Filter by extension

Filter by extension

Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
49 changes: 30 additions & 19 deletions prometheus/histogram.go
Original file line number Diff line number Diff line change
Expand Up @@ -1673,6 +1673,7 @@ func makeNativeExemplars(ttl time.Duration, maxCount int) nativeExemplars {

if maxCount < 0 {
maxCount = 0
ttl = -1
}

return nativeExemplars{
Expand All @@ -1682,7 +1683,7 @@ func makeNativeExemplars(ttl time.Duration, maxCount int) nativeExemplars {
}

func (n *nativeExemplars) addExemplar(e *dto.Exemplar) {
if cap(n.exemplars) == 0 {
if n.ttl == -1 {
return
}

Expand All @@ -1705,17 +1706,23 @@ func (n *nativeExemplars) addExemplar(e *dto.Exemplar) {
return
}

if len(n.exemplars) == 1 {
// When the number of exemplars is 1, then
// replace the existing exemplar with the new exemplar.
n.exemplars[0] = e
return
}
// From this point on, the number of exemplars is greater than 1.

// When the number of exemplars exceeds the limit, remove one exemplar.
var (
rIdx int // The index where to remove the old exemplar.

ot = time.Now() // Oldest timestamp seen.
otIdx = -1 // Index of the exemplar with the oldest timestamp.

md = -1.0 // Logarithm of the delta of the closest pair of exemplars.
mdIdx = -1 // Index of the older exemplar within the closest pair.
cLog float64 // Logarithm of the current exemplar.
pLog float64 // Logarithm of the previous exemplar.
md = -1.0 // Logarithm of the delta of the closest pair of exemplars.
rIdx = -1 // Index of the older exemplar within the closest pair and where we need to insert the new exemplar.
cLog float64 // Logarithm of the current exemplar.
pLog float64 // Logarithm of the previous exemplar.
Comment on lines +1722 to +1725
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hmmm, totally out of scope for this PR, but more descriptive variable names would remove the necessity for commenting on their meaning 😕

I'm not sure if I'm missing something, but is there a reason to name these variables rIdx, cLog, pLog instead of removeIndex, currentLog or previousLog?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Urgh, reading again replaceIndex is probably a better name for rIdx. Does it make nIdx obsolete? (Do I understand correctly that nIdx is newIndex? 😅

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Historically, we had very different "tastes" playing out between various Prometheus developers, all the way from very long and descriptive names to "use one letter names whenever possible" (slightly exaggerated).

My observation is that Go developers tend towards shorter variable names (have a look at the standard library, for example). But we never explicitly stated a preference for the Prometheus project. In lack of a specific guideline, I would go with generally good practices of programming like "the wider the scope of a variable, the more descriptive its name".

In this case, the variables are local to a (fairly long) function. So I wouldn't be too demanding about the descriptiveness of their names. Furthermore, even previousLog wouldn't really be self explanatory. (Previous log line?) And a name like logarithmOfPreviousExemplar would be clearly too clunky. Given that the comment is needed anyway, I actually like the names as they are. Still very short, but better than just c, p, and r (which others might even prefer).

)

for i, exemplar := range n.exemplars {
Expand All @@ -1726,7 +1733,7 @@ func (n *nativeExemplars) addExemplar(e *dto.Exemplar) {
}

// Find the index at which to insert new the exemplar.
if *e.Value <= *exemplar.Value && nIdx == -1 {
if nIdx == -1 && *e.Value <= *exemplar.Value {
nIdx = i
}

Expand All @@ -1738,11 +1745,13 @@ func (n *nativeExemplars) addExemplar(e *dto.Exemplar) {
}
diff := math.Abs(cLog - pLog)
if md == -1 || diff < md {
// The closest exemplar pair is this: |exemplar.[i] - n.exemplars[i-1].Value| is minimal.
// Choose the exemplar with the older timestamp for replacement.
md = diff
if n.exemplars[i].Timestamp.AsTime().Before(n.exemplars[i-1].Timestamp.AsTime()) {
mdIdx = i
rIdx = i
} else {
mdIdx = i - 1
rIdx = i - 1
}
}

Expand All @@ -1753,8 +1762,11 @@ func (n *nativeExemplars) addExemplar(e *dto.Exemplar) {
if nIdx == -1 {
nIdx = len(n.exemplars)
}
// Here, we have the following relationships:
// n.exemplars[nIdx-1].Value < e.Value <= n.exemplars[nIdx].Value

if otIdx != -1 && e.Timestamp.AsTime().Sub(ot) > n.ttl {
// If the oldest exemplar has expired, then replace it with the new exemplar.
rIdx = otIdx
} else {
// In the previous for loop, when calculating the closest pair of exemplars,
Expand All @@ -1764,23 +1776,22 @@ func (n *nativeExemplars) addExemplar(e *dto.Exemplar) {
if nIdx > 0 {
diff := math.Abs(elog - math.Log(n.exemplars[nIdx-1].GetValue()))
if diff < md {
// The closest exemplar pair is this: |e.Value - n.exemplars[nIdx-1].Value| is minimal.
// Assume that the exemplar we are inserting has a newer timestamp. This is not always
// true, due to concurrency, but it's a good enough approximation.
md = diff
mdIdx = nIdx
if n.exemplars[nIdx-1].Timestamp.AsTime().Before(e.Timestamp.AsTime()) {
mdIdx = nIdx - 1
}
rIdx = nIdx - 1
}
}
if nIdx < len(n.exemplars) {
diff := math.Abs(math.Log(n.exemplars[nIdx].GetValue()) - elog)
if diff < md {
mdIdx = nIdx
if n.exemplars[nIdx].Timestamp.AsTime().Before(e.Timestamp.AsTime()) {
mdIdx = nIdx
}
// The closest exemplar pair is this: |n.exemplars[nIdx].Value - e.Value| is minimal.
// Assume that the exemplar we are inserting has a newer timestamp. This is not always
// true, due to concurrency, but it's a good enough approximation.
rIdx = nIdx
}
}
rIdx = mdIdx
}

// Adjust the slice according to rIdx and nIdx.
Expand Down
8 changes: 6 additions & 2 deletions prometheus/histogram_test.go
Original file line number Diff line number Diff line change
Expand Up @@ -1049,10 +1049,14 @@ func TestNativeHistogramConcurrency(t *testing.T) {

go func(vals []float64) {
start.Wait()
for _, v := range vals {
for i, v := range vals {
// An observation every 1 to 10 seconds.
atomic.AddInt64(&ts, rand.Int63n(10)+1)
his.Observe(v)
if i%2 == 0 {
his.Observe(v)
} else {
his.(ExemplarObserver).ObserveWithExemplar(v, Labels{"foo": "bar"})
}
}
end.Done()
}(vals)
Expand Down
Loading