Skip to content

[Tests] Make testEngineGCDeletesSetting deterministic #38942

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged
merged 7 commits into from
Feb 21, 2019
Merged
Show file tree
Hide file tree
Changes from 3 commits
Commits
File filter

Filter by extension

Filter by extension

Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
21 changes: 19 additions & 2 deletions server/src/main/java/org/elasticsearch/threadpool/ThreadPool.java
Original file line number Diff line number Diff line change
Expand Up @@ -550,22 +550,39 @@ static class CachedTimeThread extends Thread {
this.absoluteMillis = System.currentTimeMillis();
this.counter = new TimeCounter();
setDaemon(true);
if (interval <= 0) {
this.running = false;
}
}

/**
* Return the current time used for relative calculations. This is
* {@link System#nanoTime()} truncated to milliseconds.
* <p>
* If {@link ThreadPool#ESTIMATED_TIME_INTERVAL_SETTING} is set to 0
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Should we make ThreadPool#ESTIMATED_TIME_INTERVAL_SETTING a setting that has 0 as an inclusive lower bound? I am fine with that in a follow up.

* then the cache is disabled and the method calls {@link System#nanoTime()}
* whenever called. Typically used for testing.
*/
long relativeTimeInMillis() {
return relativeMillis;
if (running) {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'd slightly prefer

Suggested change
if (running) {
if (0 < interval) {

since this is what the Javadoc says.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I chose to use the boolean running (properly set in the constructor) to avoid the comparison on every call, and it's used in the same way to avoid having a "running" thread that updates those values.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't clearly understand why avoiding the comparison is a good thing. Is it a performance question? Are you sure that a comparison between a constant and a final field is worse than a second volatile read?

Copy link
Contributor Author

@matriv matriv Feb 18, 2019

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

My thinking was yes the performance, but didn't think about the volatile.
But also to have the logic that when interval == 0 -> running = false so that we avoid having a running thread and we have a common check in all 3 places (the methods and the run()).

If you think that the interval check is better, I'll happily change. And if we do that, should the code in the run() change like: while(running && 0 < interval) ? So that the running flag is just an external means of controlling the thread and not mixed up with the interval value?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yep, sounds like a good idea, thanks.

return relativeMillis;
}
return TimeValue.nsecToMSec(System.nanoTime());
}

/**
* Return the current epoch time, used to find absolute time. This is
* a cached version of {@link System#currentTimeMillis()}.
* <p>
* If {@link ThreadPool#ESTIMATED_TIME_INTERVAL_SETTING} is set to 0
* then the cache is disabled and the method calls {@link System#currentTimeMillis()}
* whenever called. Typically used for testing.
*/
long absoluteTimeInMillis() {
return absoluteMillis;
if (running) {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Similarly, I'd prefer

Suggested change
if (running) {
if (0 < interval) {

return absoluteMillis;
}
return System.currentTimeMillis();
}

@Override
Expand Down
Original file line number Diff line number Diff line change
Expand Up @@ -32,6 +32,7 @@
import org.elasticsearch.indices.IndicesService;
import org.elasticsearch.plugins.Plugin;
import org.elasticsearch.test.ESIntegTestCase;
import org.elasticsearch.threadpool.ThreadPool;

import java.util.Arrays;
import java.util.Collection;
Expand Down Expand Up @@ -126,6 +127,16 @@ public List<Setting<?>> getSettings() {
}
}

/**
* Needed by {@link UpdateSettingsIT#testEngineGCDeletesSetting()}
*/
@Override
protected Settings nodeSettings(int nodeOrdinal) {
return Settings.builder().put(super.nodeSettings(nodeOrdinal))
.put("thread_pool.estimated_time_interval", 0)
.build();
}

public void testUpdateDependentClusterSettings() {
IllegalArgumentException iae = expectThrows(IllegalArgumentException.class, () ->
client().admin().cluster().prepareUpdateSettings().setPersistentSettings(Settings.builder()
Expand Down Expand Up @@ -435,23 +446,31 @@ public void testOpenCloseUpdateSettings() throws Exception {
assertThat(getSettingsResponse.getSetting("test", "index.final"), nullValue());
}

public void testEngineGCDeletesSetting() throws InterruptedException {
public void testEngineGCDeletesSetting() {
createIndex("test");
client().prepareIndex("test", "type", "1").setSource("f", 1).get();
DeleteResponse response = client().prepareDelete("test", "type", "1").get();
long seqNo = response.getSeqNo();
long primaryTerm = response.getPrimaryTerm();
// delete is still in cache this should work
client().prepareIndex("test", "type", "1").setSource("f", 2).setIfSeqNo(seqNo).setIfPrimaryTerm(primaryTerm).get();
client().admin().indices().prepareUpdateSettings("test").setSettings(Settings.builder().put("index.gc_deletes", 0)).get();
assertAcked(client().admin().indices().prepareUpdateSettings("test").setSettings(Settings.builder().put("index.gc_deletes", 0)));

response = client().prepareDelete("test", "type", "1").get();
seqNo = response.getSeqNo();
Thread.sleep(300); // wait for cache time to change TODO: this needs to be solved better. To be discussed.

// Make sure the time has advanced for InternalEngine#resolveDocVersion()
for (ThreadPool tPool : internalCluster().getInstances(ThreadPool.class)) {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can we please use names consistent with the style in the codebase? For example, this would be threadPool.

long time1 = tPool.relativeTimeInMillis();
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can we use a clearer name than time1?

long time2 = tPool.relativeTimeInMillis();
while (time1 == time2) {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think I'd have written this loop as the following, since there's no need for it to be a tight loop:

assertBusy(() -> assertThat(tPool.relativeTimeInMillis(), greaterThan(time1)));

It is, admittedly, somewhat of a question of taste.

time2 = tPool.relativeTimeInMillis();
}
}

// delete is should not be in cache
assertThrows(client().prepareIndex("test", "type", "1").setSource("f", 3).setIfSeqNo(seqNo).setIfPrimaryTerm(primaryTerm),
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Wouldn’t it be enough to assert busy and remove the sleep?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I guess so, this solution just makes more visible of what's happening and less "brute force".

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The goal of the test is to make sure that once time moves the delete is forgotten. If we busy spin on the indexing request (instead of on time - which is what I think Jason refers to with sleep), we will have different semantics as some indexing ops may succeed, changing the dynamics of the test (it now will check that a CASed index operation fails if it's base is an index op, rather than a delete op).

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

ah, yeah. Didn't even think that the assertBusy would potentially execute the prepareIndex request multiple times.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@bleskes Sorry to be unclear. I meant busy spin waiting for the cached time to advance. So instead of sleeping for it to happen, assert that it has happened, busily since it happens in the background.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

VersionConflictEngineException.class);

}

public void testUpdateSettingsWithBlocks() {
Expand Down