-
Notifications
You must be signed in to change notification settings - Fork 25.2k
Low Level REST Client: Support host selection #29211
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
Adds `host_selector` support to `do` statements in the yaml tests. Using it looks a little like: ``` --- "some example": - skip: features: host_selector - do: host_selector: version: " - 7.0.0" # same syntax as skip apiname: something: true ``` This is implemented by using `ElasticsearchHostsSniffer` in the test framework to load sniff metadata about all hosts and stick it on a `RestClient`. The `do` section contains a `HostSelector` that it hands off to the `RestClient`. The idea is to use this in mixed version tests to target a specific version of Elasticsearch.
Pinging @elastic/es-core-infra |
I'm not sure how strong our commitment to keeping the Low Level REST Client backwards compatible across releases is. Personally I feel like it is new enough that some bumps along the road for things like this are normal, but I understand that they are far from pleasant. |
private final CloseableHttpAsyncClient client; | ||
// We don't rely on default headers supported by HttpAsyncClient as those cannot be replaced. | ||
// These are package private for tests. | ||
final List<Header> defaultHeaders; | ||
private final long maxRetryTimeoutMillis; | ||
private final String pathPrefix; | ||
private final AtomicInteger lastHostIndex = new AtomicInteger(0); | ||
private volatile HostTuple<Set<HttpHost>> hostTuple; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I moved this so the mutable stuff was below the immutable stuff.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I feel like it makes it easier to read.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
++
this.blacklist.clear(); | ||
} | ||
|
||
/** | ||
* Sends a request to the Elasticsearch cluster that the client points to and waits for the corresponding response |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
All of these javadocs are now on the RestClientActions
interface.
* @throws ClientProtocolException in case of an http protocol error | ||
* @throws ResponseException in case Elasticsearch responded with a status code that indicated an error | ||
*/ | ||
// TODO this exists entirely to so we don't have to change much in the high level rest client tests. We'll remove in a followup. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I want to remove this in the followup that adapts the high level REST client to this change.
And I'm happy to be convinced to do more with regards to backwards compatibility. |
@@ -53,6 +53,11 @@ private DeadHostState() { | |||
this.failedAttempts = previousDeadHostState.failedAttempts + 1; | |||
} | |||
|
|||
DeadHostState(int failedAttempts, long deadUntilNanos) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm not sure about this - it helps with testing but I expect it conflicts with something @javanna is doing now.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
this is not a big deal, I will split my work in smaller PRs. the one around max retry timeout will need rewriting at this point but that's ok.
Specifically, I think this'll be useful for #21510. |
I need to push docs for this before merging but I expect we'll spend a while talking about this before merging. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I went through this and I like it. I left some comments but this is going in the right direction. Thanks @nik9000 !
/** | ||
* The node runs ingest pipelines. | ||
*/ | ||
public boolean ingest() { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
nit: maybe use the is
prefix?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
has
prefix instead ok?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I don't mind. maybe also call master masterEligible instead just to prevent misunderstandings?
} | ||
|
||
/** | ||
* Role information about and Elasticsearch process. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
s/and/an ?
/** | ||
* Version of Elasticsearch that the host is running. | ||
*/ | ||
private final String version; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
wondering if strings are enough for version comparisons etc. but surely using our own Version class is not a good idea here.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think the high level rest client will have access to our Version
class and can compare. That is kind of what I do in the yaml tests further down. I don't think it is worth building our own Version class, yeah.
I think a string is the right thing, at least for now. It all depends on how much we can break this in the future as things get better.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
unfortunately the Version class is the one that currently brings lucene in :) but yea I tend to agree that for now we are good. I was contemplating making it a number for better comparisons, but that may prove even more confusing to users?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yeah, I think it'd be more confusing....
public static final HostMetadataResolver EMPTY_RESOLVER = new HostMetadataResolver() { | ||
@Override | ||
public HostMetadata resolveMetadata(HttpHost host) { | ||
return null; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
shall we rather have an EMPTY placeholder here and avoid null values? Or use Optional ?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Optional
is a Java 8 thing.
I'm not sure what the right semantics for EMPTY
would be. Passing null
requires the implementer to think about what they want to do if they don't have any metadata which I kind of like. I think Optional
would be more clear, but we don't have it.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
oh I thought it was introduced much earlier. Maybe in this case null is ok here
* have the {@code master} role. These nodes cannot become a "master" | ||
* node for the Elasticsearch cluster. | ||
*/ | ||
HostSelector NOT_MASTER = new HostSelector() { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
maybe we should instead have one to skip master only nodes? master nodes are not to be avoided unless you have dedicated masters and dedicated data nodes.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Makes sense to me. I'll replace this with that. I wanted an example that folks would find useful in real code too. And NOT_MASTER_ONLY
would be better. It should only filter out nodes that are master nodes but are not data nodes, right? What about ingest? I think it should ignore that role to be honest.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I would filter out nodes that are only master, so if they happen to be master eligible and also ingest, I would keep them in the loop. But this is an edge case, the important bit is definitely to take out what's master but not data like you said.
@Override | ||
public String toString() { | ||
return "ANY"; | ||
} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
do we need toString for the Host selector impls?
/** | ||
* Stateless view into a {@link RestClient} with customized parameters. | ||
*/ | ||
class RestClientView extends AbstractRestClientActions { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I would love to find a better name for this. I don't think view describes it well, but I haven't come up with anything better yet :)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm not really happy with any of the names.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
maybe RestClientWrapper but its not much better than RestClientView :)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Great to see an immutable view added! I wanted to ask for one ;)
We plan to allow Hibernate Search users to get a reference to the RestClient so that people can use the occasional "low-level" operation but I wasn't happy to expose some of the mutators.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'd talked about this with @javanna a while ago and we're kind of abandoning this approach because we think it is too confusing. We won't have a view, I think.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Maybe the fact we have a good use case will help make a case for it :)
BTW I would just need the interface: I believe re-defining the interface just to remove some methods in Hibernate is not ideal as it would imply a high coupling, while I'd like to expose your interface so that people using the latest Elasticsearch would have any new features immediately, w/o us having to update our facade first. (Normally we'd have a facade to better maintain the API but this is for advanced users, which will prefer to not have anything in the way)
|
||
public void testHostSelector() throws IOException { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
is this snippet used although it has no tags for inclusion in the docs?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I need to rework it and add it to the docs, I think.
* <li>It only runs once | ||
* </ul> | ||
*/ | ||
protected void sniffHostMetadata(RestClient client) throws IOException { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
why is the method here and not where it's used?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'll move it. I started out thinking it was something that'd have to be done at construction time but it turned out not to be the case. Then I figured it was a thing that other ESRestTestCase subclasses might use. But at this point it doens't make a lot of sense. So I'll move it.
ElasticsearchHostsSniffer sniffer = new ElasticsearchHostsSniffer( | ||
adminClient, ElasticsearchHostsSniffer.DEFAULT_SNIFF_REQUEST_TIMEOUT, scheme); | ||
Map<HttpHost, HostMetadata> meta = sniffer.sniffHosts().hostMetadata(); | ||
client.setHosts(clusterHosts, meta::get); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
this is an interesting use of the sniffer classes! hopefully the configured hosts match the ones found in the cluster state :)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
They, uh, have so far!
Now that I think about it I think I should try and be a little more careful here and assert that all the hosts come back. The test would fail in a weird way if I don't.
filteredHosts.remove(entry.getKey()); | ||
final HostTuple<Set<HttpHost>> hostTuple = this.hostTuple; | ||
result = nextHostsOneTime(hostTuple, blacklist, lastHostIndex, System.nanoTime(), hostSelector); | ||
if (result.hosts == null) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
You mean default result.hosts
to emptyList
and then check for .isEmpty
instead. Hmmmm. I wonder if that'd be easier to read to be honest? I feel like if I was reading it I might think "isEmpty
will catch more than just the default case." Maybe! I'll have a think on it.
}); | ||
HttpHost deadHost = sortedHosts.get(0).getKey(); | ||
attempts++; | ||
} while (attempts < MAX_NEXT_HOSTS_ATTEMPTS); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Hmmm. We can fail immediately here if the selector rejected all of the dead hosts, but we can't know that we didn't get the concurrent modification of the hosts that shifted a host that the selector wouldn't have rejected out of the dead hosts list. Basically I'm saying I'm not sure we can be sure which of the two things is true because the second reason might be made worse by the first reason.
If dead hosts were stored in another way we could figure out if they changed and retry if they did, but they aren't and I think maybe that should be another PR if we want to change it.
HttpHost deadHost = sortedHosts.get(0).getKey(); | ||
attempts++; | ||
} while (attempts < MAX_NEXT_HOSTS_ATTEMPTS); | ||
throw new IOException("No hosts available for request. Last failure was " + result.describeFailure()); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yeah, it doesn't give any context because sometimes we're reviving a dead node and sometimes we're picking from a list of living ones.
It feels right to me for HostSelector
s to make very simple choices so it is easier to reason about what they do.
I understand wanting to make the "what if we reject all the hosts" behavior pluggable, but I think it is pretty complicated because you don't know if you rejected all of the hosts or just all of the living hosts.
What if we make our own subclass of IOException
and folks can catch it and retry without the selector if they really want to? That feels like an exceptional case so I don't feel too bad about it. I'm not sure though.
throw new IOException("No hosts available for request. Last failure was " + result.describeFailure()); | ||
} | ||
|
||
static class NextHostsResult { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
That is an interesting thing! That'd be nice!
static class NextHostsResult { | ||
/** | ||
* Number of hosts filtered from the list because they are | ||
* dead. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I suppose that is more correct. Less dramatic, but more correct. I'll change it if we keep the class.
|
||
public void testHostSelector() throws IOException { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I need to rework it and add it to the docs, I think.
return null; | ||
} else { | ||
logger.trace("adding node [" + nodeId + "]"); | ||
assert sawRoles : "didn't see roles for [" + nodeId + "]"; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Damn. I meant to add a test that parses the roles from an old call. I'll do that and fix this up.
hosts.add(publishedHost); | ||
HostMetadata meta = new HostMetadata(version, new Roles(master, data, ingest)); | ||
for (HttpHost bound: boundHosts) { | ||
hostMetadata.put(bound, meta); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Because we don't know how the client is configured. In particular, at least on my machine, our REST tests are configured against an ipv4 address but the published host is an ipv6 address.
You don't need both if you are using the sniffer to sniff the hosts, but you need it if you are using the sniffer to sniff metadata for some hosts you already have.
/** | ||
* Result of sniffing hosts. | ||
*/ | ||
public final class SnifferResult { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yeah, if I can make the Node
concept make sense I'll do that!
ElasticsearchHostsSniffer sniffer = new ElasticsearchHostsSniffer( | ||
adminClient, ElasticsearchHostsSniffer.DEFAULT_SNIFF_REQUEST_TIMEOUT, scheme); | ||
Map<HttpHost, HostMetadata> meta = sniffer.sniffHosts().hostMetadata(); | ||
client.setHosts(clusterHosts, meta::get); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
They, uh, have so far!
Now that I think about it I think I should try and be a little more careful here and assert that all the hosts come back. The test would fail in a weird way if I don't.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@javanna I've reworked a whole bunch of things:
Node
instead ofHostMetadata
NodeSelector
(instead ofHostSelector
) takes and returns aList<Node>
. The lists control the ordering.- I fixed the tests in all the ways you asked.
So I think this is worth another look. I know of a few things it needs before it is done though:
- Docs
- Better names for things.
- Tests that run against real
/_nodes
API responses. - I need to revert my NOCOMMIT.
/** | ||
* Version of Elasticsearch that the host is running. | ||
*/ | ||
private final String version; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yeah, I think it'd be more confusing....
/** | ||
* Roles that the Elasticsearch process on the host has. | ||
*/ | ||
private final Roles roles; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yeah. I was trying to keep this PR from getting huge. I didn't accomplish that, but I tried.
return inner.select(host, meta) && hostSelector.select(host, meta); | ||
} | ||
}; | ||
return new RestClientView(delegate, combo); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think you might do this if you have an application that uses a RestClientActions
instance for "normal" operations that has NodeSelector.NOT_MASTER_ONLY
and sometimes you want to send, say, nodes on a certain rack. If you do that then you'd call withNodeSelector
really early to add the NOT_MASTER_ONLY
selector and then call withNodeSelector
many times, much later, to add the rack awareness. But only for those requests that care about it.
Or maybe you want to combine two selectors but don't want to worry about implementing AND logic.
for (int i = 0; i < numHttpServers; i++) { | ||
HttpServer httpServer = createHttpServer(); | ||
httpServers[i] = httpServer; | ||
httpHosts[i] = new HttpHost(httpServer.getAddress().getHostString(), httpServer.getAddress().getPort()); | ||
} | ||
httpServers[0].createContext(pathPrefix + "/firstOnly", new ResponseHandler(200)); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I suppose it is good enough. I wanted an extra layer but I don't really need it to be honest.
* <li>It only runs once | ||
* </ul> | ||
*/ | ||
protected void sniffHostMetadata(RestClient client) throws IOException { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'll move it. I started out thinking it was something that'd have to be done at construction time but it turned out not to be the case. Then I figured it was a thing that other ESRestTestCase subclasses might use. But at this point it doens't make a lot of sense. So I'll move it.
* Abstract implementation of {@link RestClientActions} shared by | ||
* {@link RestClient} and {@link RestClientView}. | ||
*/ | ||
abstract class AbstractRestClientActions implements RestClientActions { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I use this so I only have to implement all the performRequest
flavors. Without it I'd basically copy and paste the guts of it into both RestClientView
and RestClient
. Or I could declare 8 more methods in RestClient
that take the NodeSelector
and then delegate directly to them. I don't think either of those is particularly clean though.
I get wanting to return the abstract class and node having an interface at all, but I feel like the interface makes mocking simpler and it will allow us to change more things later without them being breaking changes.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@nik9000 I did another pass, I have yet to look at the sniffing changes and tests, but I can only do it later today or tomorrow, so I am sending out now the comments that I have till now.
/** | ||
* Make a copy of this {@link Node} but replacing its | ||
* {@link #getHost() host}. Use this when the sniffing implementation | ||
* returns returns a {@link #getHost() host} that is not useful to the |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
double "returns"
/** | ||
* The node runs ingest pipelines. | ||
*/ | ||
public boolean hasIngest() { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
man now that I see it I do prefer is over has :)
* that the {@link RestClient} would prefer to use them | ||
* @return a subset of the provided list of {@linkplain Node}s that the | ||
* selector approves of, in the order that the selector would prefer | ||
* to use them. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
can we clarify in the docs when this gets called so it is clear that we call it for each single request, and the list holds all the nodes that will be tried for that request, in case there are failures that need to be retried?
} | ||
|
||
/** | ||
* Returns a new {@link RestClientBuilder} to help with {@link RestClient} creation. | ||
* Creates a new builder instance and sets the hosts that the client will send requests to. | ||
* Creates a new builder instance and sets the nodes that the client will send requests to. | ||
* @see Node#Node(HttpHost) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
maybe the other variant that accepts Nodes should have the @see Node#Node(HttpHost)
but this one shouldn't?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
++. Was a copy and paste mistake.
performRequestAsyncNoCatch(method, endpoint, params, entity, httpAsyncResponseConsumerFactory, | ||
listener, headers); | ||
return listener.get(); | ||
public Node[] getNodes() { // TODO is it ok to expose this? It feels excessive but we do use it in tests. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
make it package private?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I see that we need it from another package, I think it's ok.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It is used in sniffHostMetadata
so we only sniff one time. That is off in the test framework.
return singletonList(selectedDeadNodes.get(0).getHost()); | ||
} | ||
} | ||
throw new IOException("NodeSelector [" + nodeSelector + "] rejcted all nodes, " |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
if you throw above, maybe throw inside, only in case the selector returns empty here too, that might make the code a bit more readable.
subset.add(node); | ||
} | ||
} | ||
return subset; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Maybe we should try and make this a bit smarter? In the extreme case that there are no alive non-master only nodes around, do we want to make the client resurrect one of the dead nodes (I'd say we should rather throw exception when we return empty from here)? Maybe we should rather use a master-only node in this specific case? While I write I am actually not sure about this. Maybe instead of overloading the master eligible nodes we should just bail :) But document this behaviour clearly?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I worry that "smarter" become "hard to understand" in this case. I think it makes sense to have the built in rule be hard and fast and if folks want to be more flexible in their application they can be.
One issue is that it is kind of hard to tell if you are being called with a list of live nodes or blacklisted ones. We could pass a boolean to communicate that, but that feels pretty complicated.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
you are right, this is good enough, and I agree at this point that we should try and revive a node if the selector returns empty.
* The maximum number of attempts that {@link #nextNode(NodeSelector)} makes | ||
* before giving up and failing the request. | ||
*/ | ||
private static final int MAX_NEXT_NODES_ATTEMPTS = 10; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is no longer used, could be removed
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
++
List<Node> livingNodes = new ArrayList<>(nodeTuple.nodes.size() - blacklist.size()); | ||
List<DeadNodeAndRevival> deadNodes = new ArrayList<>(blacklist.size()); | ||
for (Node node : nodeTuple.nodes) { | ||
DeadHostState deadness = blacklist.get(node.getHost()); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I was hesitant in the beginning due to the need for the additional class, but this looks great, it simplifies also concurrency as we read only once per node from the blacklist, so from then on we don't even have to retry as we kind of act based on the previous view of the blacklist. Not only does it allow us to remove the max_attempt, but also to remove retries due to the concurrent changes as a whole!
throw new IllegalArgumentException("host cannot be null"); | ||
} | ||
this.host = host; | ||
this.boundHosts = boundHosts; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I am not sure that we still need this given that users can provide metadata together with the corresponding host. Can we remove this?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We still use this in the yaml testing framework to look up metadata.
And add tests that parse the `/_nodes/http` API responses from older versions so we verify that we don't break it.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Leaving a bunch of new comments, review round completed. This looks great, comments are all minor except a couple of them, especially the one on including node attributes when sniffing metadata, sniffing metadata from 2.x and retry logic in case the selector returns no nodes.
I will look closer at docs and naming as part of the next review round. As for testing against real responses I am fine if it is unit tests that include responses from different ES versions. We may want to add proper integration tests also for the sniffer as it is not well tested but that can come as a follow-up I think.
*/ | ||
public Response performRequest(String method, String endpoint, Header... headers) throws IOException { | ||
return performRequest(method, endpoint, Collections.<String, String>emptyMap(), null, headers); | ||
public void setHosts(HttpHost... hosts) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think that we could remove this if we wish to. I feel like the method that everybody is using is RestClient#builder(HttpHost...)
and is nice to keep also as a shortcut, while setHosts
is used by the sniffer and more specialized. Yet it is no problem to keep either I think.
static List<HttpHost> selectHosts(NodeTuple<List<Node>> nodeTuple, | ||
Map<HttpHost, DeadHostState> blacklist, AtomicInteger lastNodeIndex, | ||
long now, NodeSelector nodeSelector) throws IOException { | ||
class DeadNodeAndRevival { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think that once you merge master in this will hold Node
and DeadHostState
, you could rename it to DeadNode
or DeadNodeState
?
if (false == livingNodes.isEmpty()) { | ||
/* | ||
* Normal state: there is at least one living node. Rotate the | ||
* list so subsequent requests to will see prefer the nodes in |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
something is wrong in this sentence :)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
++. I accidentally left an extra extra word.
/** | ||
* Stateless view into a {@link RestClient} with customized parameters. | ||
*/ | ||
class RestClientView extends AbstractRestClientActions { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
maybe RestClientWrapper but its not much better than RestClientView :)
* if we don't know what roles the node has. | ||
*/ | ||
private final Roles roles; | ||
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
we need node attributes in here too
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I kind of want to do it in a followup if that is ok with you.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
ok as long as we don't forget :)
* returns returns a {@link #getHost() host} that is not useful to the | ||
* client. | ||
*/ | ||
public Node withHost(HttpHost host) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
call it replaceHost?
performRequestAsyncNoCatch(method, endpoint, params, entity, httpAsyncResponseConsumerFactory, | ||
listener, headers); | ||
return listener.get(); | ||
public Node[] getNodes() { // TODO is it ok to expose this? It feels excessive but we do use it in tests. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I see that we need it from another package, I think it's ok.
listener, headers); | ||
return listener.get(); | ||
public Node[] getNodes() { // TODO is it ok to expose this? It feels excessive but we do use it in tests. | ||
return nodeTuple.nodes.toArray(new Node[0]); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
nodes are immutable, so there can't be changes made to the returned array which can affect the running rest client right? Can you use the known size here rather than 0
?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think I can get away with just returning the List
which is indeed immutable. I'll try that.
} | ||
this.hostTuple = new HostTuple<>(Collections.unmodifiableSet(httpHosts), authCache); | ||
this.blacklist.clear(); | ||
public static RestClientBuilder builder(Node... nodes) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
could we clarify in the docs why one would use this over the other one and what a Node is for compared to a host?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Sure!
* first and then running the "left" selector on the results of the "right" | ||
* selector. | ||
*/ | ||
class Compose implements NodeSelector { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I am pretty sure that we don't need this, especially given that the only time we use it in our tests is to compose (any , selector) . It is cool but I think it adds complexity, needs testing, while a single selector can do all it wants alone, no need to chain them.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We also use it in RestClientView.withNodeSelector
. I expect we want to keep withNodeSelector
on RestClientView
because we'll end up with other withXXX
style methods which will all return RestClientView
. So folks will do client.withTime(123).withNodeSelector(bar)
.
We could implement RestClientView.withNodeSelector
to check if the NodeSelector
in the view is ANY
and throw an IllegalStateException
if it isn't but that feels weird.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I would keep things simple and use SetOnce
to make sure that users can only set one node selector, if not set we use ANY
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
oh man SetOnce is a lucene thing though :x
one more thing, on the usability aspect and naming with the new |
I'm really not sure what a good name is. I want to communicate:
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I did another pass, looked deeper at javadocs, found a few small things. I think it's time to merge master in.
import org.apache.http.HttpHost; | ||
|
||
/** | ||
* Metadata about an {@link HttpHost} running Elasticsearch. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Can you add some more info here maybe on the type of metadata?
} | ||
/** | ||
* The node runs ingest pipelines. | ||
*/ |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
When we add docs to getters should we add "Returns whether" or something like that?
*/ | ||
public interface NodeSelector { | ||
/** | ||
* Select the {@link Node}s to which to send requests. This may be called |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I would try and be even more explicit on what this method does (feel free to make adjustments though), something like: "Receives a sorted list of nodes, in the order that the client would try them for a single request, and returns a sorted list of nodes selected out of the provided ones. Provided nodes may be alive ones, in which case returning an empty list means that the client will try and revive one dead node taken from the blacklist. When reviving nodes, this method will be called too with a list of sorted nodes as argument, where the first one is the one that the client would revive. The first node returned by the selector is the one that will be revived."
* return a list of nodes sorted by its preference for which node is used. | ||
* If it is operating on "living" nodes that it returns function as | ||
* fallbacks in case of request failures. If it is operating on dead nodes | ||
* then the dead node that it returns is attempted but no others. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
only the first node returned is attempted but no others
* first and then running the "left" selector on the results of the "right" | ||
* selector. | ||
*/ | ||
class Compose implements NodeSelector { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
oh man SetOnce is a lucene thing though :x
return new NodeTuple<>(hosts.iterator(), nodeTuple.authCache); | ||
} | ||
|
||
static List<Node> selectHosts(NodeTuple<List<Node>> nodeTuple, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
maybe this could be folded back into nextNode now?
* If no living and no dead nodes match the selector we retry a few | ||
* times to handle concurrent modifications of the list of dead nodes. | ||
* We never block the thread or {@link Thread#sleep} or anything like | ||
* that. If the retries fail this throws a {@link IOException}. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
remove this part which doesn't apply anymore
if (false == livingNodes.isEmpty()) { | ||
/* | ||
* Normal state: there is at least one living node. Rotate the | ||
* list so subsequent requests to will prefer the nodes in a |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
subsequent requests to ?
*/ | ||
Collections.rotate(livingNodes, lastNodeIndex.getAndIncrement()); | ||
List<Node> selectedLivingNodes = nodeSelector.select(livingNodes); | ||
if (false == selectedLivingNodes.isEmpty()) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
nice
* NodeSelector so it can have its say in which nodes are ok and their | ||
* ordering. If the selector is ok with any of the nodes then use just | ||
* the first one in the list because we only want to revive a single | ||
* node. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
++
* Actions that can be taken on a {@link RestClient} or the stateless views | ||
* returned by {@link RestClientActions#withNodeSelector withNodeSelector}. | ||
*/ | ||
public interface RestClientActions { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
we talked about this new way of sending requests with @nik9000 and we said that we will look into changing the existing performRequest methods to accept a Request
object, so that we can trim down the different performRequest methods that we have to two: sync and async. I feel like this is going to be more familiar to most of our users and something that we should have done from day one.
I've spent the morning rebuilding this on top of the |
Adds support users of the Low Level REST Client to implement the new
HostSelector
interface to select the hosts to which to send requests.For example, this
HostSelector
selects only hosts which cannot bemaster nodes:
Consumers on Java 8 should be able to use lambdas like this:
At this point the
HostMetadata
containsversion
androles
and iscanonically populated by
ElasticsearchHostsSniffer
which parses thisinformation from the
/_nodes
API while it is sniffing for hosts.Configuring the
Sniffer
will automatically set this metadata.Instead of creating yet another variant of
performRequest
andperformRequestAsync
this adds awithHostSelector
method toRestClient
that creates a "stateless view" into theRestClient
. It is"stateless" in that it backs of its operations to the
RestClient
thatcreated it so it doesn't need any sort of try-with-resources tracking,
nor does it implement
Closeable
.This same pattern could be used in the future to reduce the number of
variant of
performRequest
andperformRequestAsync
by moving some ofthe less commonly used parameters with other
withXXX
style methods.I've done no work to get the
HighLevelRestClient
working with these"stateless views" so that will have to wait for a followup.
I've marked this as "breaking-java" because it breaks custom
implementations of
HostsSniffer
. It also might cause trouble for folksthat mock
RestClient
. The new interfaceRestClientActions
is muchmore amenable to mocking.
Because we expect to find it useful, this also implements
host_selector
support to
do
statements in the yaml tests. Using it looks a littlelike:
The
do
section parses theversion
string into a host selector thatuses the same version comparison logic as the
skip
section. When thedo
section is executed it passed the off to theRestClient
, usingthe
ElasticsearchHostsSniffer
to sniff the required metadata.The idea is to use this in mixed version tests to target a specific
version of Elasticsearch so we can be sure about the deprecation
logging though we don't currently have any examples that need it. We do,
however, have at least one open pull request that requires something
like this to properly test it.
Closes #21888 (kind of, it isn't in the high level client, but we'll do that in a followup)