Skip to content

Low Level REST Client: Support host selection #29211

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Closed
wants to merge 28 commits into from

Conversation

nik9000
Copy link
Member

@nik9000 nik9000 commented Mar 22, 2018

Adds support users of the Low Level REST Client to implement the new
HostSelector interface to select the hosts to which to send requests.
For example, this HostSelector selects only hosts which cannot be
master nodes:

    /**
     * Selector that matches any node that has metadata and doesn't
     * have the {@code master} role. These nodes cannot become a "master"
     * node for the Elasticsearch cluster.
     */
    HostSelector NOT_MASTER = new HostSelector() {
        @Override
        public boolean select(HttpHost host, HostMetadata meta) {
            return meta != null && false == meta.roles().master();
        }

        @Override
        public String toString() {
            return "NOT_MASTER";
        }
    };

Consumers on Java 8 should be able to use lambdas like this:

HostSelector notIngest = (host, meta) -> meta != null
        && false == meta.roles().ingest();

At this point the HostMetadata contains version and roles and is
canonically populated by ElasticsearchHostsSniffer which parses this
information from the /_nodes API while it is sniffing for hosts.
Configuring the Sniffer will automatically set this metadata.

Instead of creating yet another variant of performRequest and
performRequestAsync this adds a withHostSelector method to
RestClient that creates a "stateless view" into the RestClient. It is
"stateless" in that it backs of its operations to the RestClient that
created it so it doesn't need any sort of try-with-resources tracking,
nor does it implement Closeable.

This same pattern could be used in the future to reduce the number of
variant of performRequest and performRequestAsync by moving some of
the less commonly used parameters with other withXXX style methods.

I've done no work to get the HighLevelRestClient working with these
"stateless views" so that will have to wait for a followup.

I've marked this as "breaking-java" because it breaks custom
implementations of HostsSniffer. It also might cause trouble for folks
that mock RestClient. The new interface RestClientActions is much
more amenable to mocking.

Because we expect to find it useful, this also implements host_selector
support to do statements in the yaml tests. Using it looks a little
like:

---
"example test":
  - skip:
      features: host_selector
  - do:
      host_selector:
        version: " - 7.0.0" # same syntax as skip
      apiname:
        something: true

The do section parses the version string into a host selector that
uses the same version comparison logic as the skip section. When the
do section is executed it passed the off to the RestClient, using
the ElasticsearchHostsSniffer to sniff the required metadata.

The idea is to use this in mixed version tests to target a specific
version of Elasticsearch so we can be sure about the deprecation
logging though we don't currently have any examples that need it. We do,
however, have at least one open pull request that requires something
like this to properly test it.

Closes #21888 (kind of, it isn't in the high level client, but we'll do that in a followup)

nik9000 added 12 commits March 20, 2018 15:27
Adds selector support to tests, plumbs selectors some more.
Adds `host_selector` support to `do` statements in the yaml tests. Using
it looks a little like:

```
---
"some example":
  - skip:
      features: host_selector
  - do:
      host_selector:
        version: " - 7.0.0" # same syntax as skip
      apiname:
        something: true
```

This is implemented by using `ElasticsearchHostsSniffer` in the test
framework to load sniff metadata about all hosts and stick it on a
`RestClient`. The `do` section contains a `HostSelector` that it hands
off to the `RestClient`.

The idea is to use this in mixed version tests to target a specific
version of Elasticsearch.
@elasticmachine
Copy link
Collaborator

Pinging @elastic/es-core-infra

@nik9000
Copy link
Member Author

nik9000 commented Mar 22, 2018

I've marked this as "breaking-java" because it breaks custom
implementations of HostsSniffer. It also might cause trouble for folks
that mock RestClient. The new interface RestClientActions is much
more amenable to mocking.

I'm not sure how strong our commitment to keeping the Low Level REST Client backwards compatible across releases is. Personally I feel like it is new enough that some bumps along the road for things like this are normal, but I understand that they are far from pleasant.

private final CloseableHttpAsyncClient client;
// We don't rely on default headers supported by HttpAsyncClient as those cannot be replaced.
// These are package private for tests.
final List<Header> defaultHeaders;
private final long maxRetryTimeoutMillis;
private final String pathPrefix;
private final AtomicInteger lastHostIndex = new AtomicInteger(0);
private volatile HostTuple<Set<HttpHost>> hostTuple;
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I moved this so the mutable stuff was below the immutable stuff.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I feel like it makes it easier to read.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

++

this.blacklist.clear();
}

/**
* Sends a request to the Elasticsearch cluster that the client points to and waits for the corresponding response
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

All of these javadocs are now on the RestClientActions interface.

* @throws ClientProtocolException in case of an http protocol error
* @throws ResponseException in case Elasticsearch responded with a status code that indicated an error
*/
// TODO this exists entirely to so we don't have to change much in the high level rest client tests. We'll remove in a followup.
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I want to remove this in the followup that adapts the high level REST client to this change.

@nik9000 nik9000 added the v6.3.0 label Mar 22, 2018
@nik9000
Copy link
Member Author

nik9000 commented Mar 22, 2018

but I understand that they are far from pleasant.

And I'm happy to be convinced to do more with regards to backwards compatibility.

@@ -53,6 +53,11 @@ private DeadHostState() {
this.failedAttempts = previousDeadHostState.failedAttempts + 1;
}

DeadHostState(int failedAttempts, long deadUntilNanos) {
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm not sure about this - it helps with testing but I expect it conflicts with something @javanna is doing now.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

this is not a big deal, I will split my work in smaller PRs. the one around max retry timeout will need rewriting at this point but that's ok.

@nik9000
Copy link
Member Author

nik9000 commented Mar 22, 2018

Because we expect to find it useful, this also implements host_selector
support to do statements in the yaml tests.

Specifically, I think this'll be useful for #21510.

@nik9000
Copy link
Member Author

nik9000 commented Mar 23, 2018

I need to push docs for this before merging but I expect we'll spend a while talking about this before merging.

Copy link
Member

@javanna javanna left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I went through this and I like it. I left some comments but this is going in the right direction. Thanks @nik9000 !

/**
* The node runs ingest pipelines.
*/
public boolean ingest() {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nit: maybe use the is prefix?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

has prefix instead ok?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't mind. maybe also call master masterEligible instead just to prevent misunderstandings?

}

/**
* Role information about and Elasticsearch process.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

s/and/an ?

/**
* Version of Elasticsearch that the host is running.
*/
private final String version;
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

wondering if strings are enough for version comparisons etc. but surely using our own Version class is not a good idea here.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think the high level rest client will have access to our Version class and can compare. That is kind of what I do in the yaml tests further down. I don't think it is worth building our own Version class, yeah.

I think a string is the right thing, at least for now. It all depends on how much we can break this in the future as things get better.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

unfortunately the Version class is the one that currently brings lucene in :) but yea I tend to agree that for now we are good. I was contemplating making it a number for better comparisons, but that may prove even more confusing to users?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah, I think it'd be more confusing....

public static final HostMetadataResolver EMPTY_RESOLVER = new HostMetadataResolver() {
@Override
public HostMetadata resolveMetadata(HttpHost host) {
return null;
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

shall we rather have an EMPTY placeholder here and avoid null values? Or use Optional ?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Optional is a Java 8 thing.

I'm not sure what the right semantics for EMPTY would be. Passing null requires the implementer to think about what they want to do if they don't have any metadata which I kind of like. I think Optional would be more clear, but we don't have it.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

oh I thought it was introduced much earlier. Maybe in this case null is ok here

* have the {@code master} role. These nodes cannot become a "master"
* node for the Elasticsearch cluster.
*/
HostSelector NOT_MASTER = new HostSelector() {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

maybe we should instead have one to skip master only nodes? master nodes are not to be avoided unless you have dedicated masters and dedicated data nodes.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Makes sense to me. I'll replace this with that. I wanted an example that folks would find useful in real code too. And NOT_MASTER_ONLY would be better. It should only filter out nodes that are master nodes but are not data nodes, right? What about ingest? I think it should ignore that role to be honest.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I would filter out nodes that are only master, so if they happen to be master eligible and also ingest, I would keep them in the loop. But this is an edge case, the important bit is definitely to take out what's master but not data like you said.

@Override
public String toString() {
return "ANY";
}
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

do we need toString for the Host selector impls?

/**
* Stateless view into a {@link RestClient} with customized parameters.
*/
class RestClientView extends AbstractRestClientActions {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I would love to find a better name for this. I don't think view describes it well, but I haven't come up with anything better yet :)

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm not really happy with any of the names.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

maybe RestClientWrapper but its not much better than RestClientView :)

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Great to see an immutable view added! I wanted to ask for one ;)

We plan to allow Hibernate Search users to get a reference to the RestClient so that people can use the occasional "low-level" operation but I wasn't happy to expose some of the mutators.

https://hibernate.atlassian.net/browse/HSEARCH-3125?focusedCommentId=101943&page=com.atlassian.jira.plugin.system.issuetabpanels%3Acomment-tabpanel#comment-101943

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'd talked about this with @javanna a while ago and we're kind of abandoning this approach because we think it is too confusing. We won't have a view, I think.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Maybe the fact we have a good use case will help make a case for it :)

BTW I would just need the interface: I believe re-defining the interface just to remove some methods in Hibernate is not ideal as it would imply a high coupling, while I'd like to expose your interface so that people using the latest Elasticsearch would have any new features immediately, w/o us having to update our facade first. (Normally we'd have a facade to better maintain the API but this is for advanced users, which will prefer to not have anything in the way)


public void testHostSelector() throws IOException {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

is this snippet used although it has no tags for inclusion in the docs?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I need to rework it and add it to the docs, I think.

* <li>It only runs once
* </ul>
*/
protected void sniffHostMetadata(RestClient client) throws IOException {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

why is the method here and not where it's used?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'll move it. I started out thinking it was something that'd have to be done at construction time but it turned out not to be the case. Then I figured it was a thing that other ESRestTestCase subclasses might use. But at this point it doens't make a lot of sense. So I'll move it.

ElasticsearchHostsSniffer sniffer = new ElasticsearchHostsSniffer(
adminClient, ElasticsearchHostsSniffer.DEFAULT_SNIFF_REQUEST_TIMEOUT, scheme);
Map<HttpHost, HostMetadata> meta = sniffer.sniffHosts().hostMetadata();
client.setHosts(clusterHosts, meta::get);
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

this is an interesting use of the sniffer classes! hopefully the configured hosts match the ones found in the cluster state :)

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

They, uh, have so far!

Now that I think about it I think I should try and be a little more careful here and assert that all the hosts come back. The test would fail in a weird way if I don't.

filteredHosts.remove(entry.getKey());
final HostTuple<Set<HttpHost>> hostTuple = this.hostTuple;
result = nextHostsOneTime(hostTuple, blacklist, lastHostIndex, System.nanoTime(), hostSelector);
if (result.hosts == null) {
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You mean default result.hosts to emptyList and then check for .isEmpty instead. Hmmmm. I wonder if that'd be easier to read to be honest? I feel like if I was reading it I might think "isEmpty will catch more than just the default case." Maybe! I'll have a think on it.

});
HttpHost deadHost = sortedHosts.get(0).getKey();
attempts++;
} while (attempts < MAX_NEXT_HOSTS_ATTEMPTS);
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hmmm. We can fail immediately here if the selector rejected all of the dead hosts, but we can't know that we didn't get the concurrent modification of the hosts that shifted a host that the selector wouldn't have rejected out of the dead hosts list. Basically I'm saying I'm not sure we can be sure which of the two things is true because the second reason might be made worse by the first reason.

If dead hosts were stored in another way we could figure out if they changed and retry if they did, but they aren't and I think maybe that should be another PR if we want to change it.

HttpHost deadHost = sortedHosts.get(0).getKey();
attempts++;
} while (attempts < MAX_NEXT_HOSTS_ATTEMPTS);
throw new IOException("No hosts available for request. Last failure was " + result.describeFailure());
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah, it doesn't give any context because sometimes we're reviving a dead node and sometimes we're picking from a list of living ones.

It feels right to me for HostSelectors to make very simple choices so it is easier to reason about what they do.

I understand wanting to make the "what if we reject all the hosts" behavior pluggable, but I think it is pretty complicated because you don't know if you rejected all of the hosts or just all of the living hosts.

What if we make our own subclass of IOException and folks can catch it and retry without the selector if they really want to? That feels like an exceptional case so I don't feel too bad about it. I'm not sure though.

throw new IOException("No hosts available for request. Last failure was " + result.describeFailure());
}

static class NextHostsResult {
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

That is an interesting thing! That'd be nice!

static class NextHostsResult {
/**
* Number of hosts filtered from the list because they are
* dead.
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I suppose that is more correct. Less dramatic, but more correct. I'll change it if we keep the class.


public void testHostSelector() throws IOException {
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I need to rework it and add it to the docs, I think.

return null;
} else {
logger.trace("adding node [" + nodeId + "]");
assert sawRoles : "didn't see roles for [" + nodeId + "]";
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Damn. I meant to add a test that parses the roles from an old call. I'll do that and fix this up.

hosts.add(publishedHost);
HostMetadata meta = new HostMetadata(version, new Roles(master, data, ingest));
for (HttpHost bound: boundHosts) {
hostMetadata.put(bound, meta);
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Because we don't know how the client is configured. In particular, at least on my machine, our REST tests are configured against an ipv4 address but the published host is an ipv6 address.

You don't need both if you are using the sniffer to sniff the hosts, but you need it if you are using the sniffer to sniff metadata for some hosts you already have.

/**
* Result of sniffing hosts.
*/
public final class SnifferResult {
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah, if I can make the Node concept make sense I'll do that!

ElasticsearchHostsSniffer sniffer = new ElasticsearchHostsSniffer(
adminClient, ElasticsearchHostsSniffer.DEFAULT_SNIFF_REQUEST_TIMEOUT, scheme);
Map<HttpHost, HostMetadata> meta = sniffer.sniffHosts().hostMetadata();
client.setHosts(clusterHosts, meta::get);
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

They, uh, have so far!

Now that I think about it I think I should try and be a little more careful here and assert that all the hosts come back. The test would fail in a weird way if I don't.

Copy link
Member Author

@nik9000 nik9000 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@javanna I've reworked a whole bunch of things:

  1. Node instead of HostMetadata
  2. NodeSelector (instead of HostSelector) takes and returns a List<Node>. The lists control the ordering.
  3. I fixed the tests in all the ways you asked.

So I think this is worth another look. I know of a few things it needs before it is done though:

  1. Docs
  2. Better names for things.
  3. Tests that run against real /_nodes API responses.
  4. I need to revert my NOCOMMIT.

/**
* Version of Elasticsearch that the host is running.
*/
private final String version;
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah, I think it'd be more confusing....

/**
* Roles that the Elasticsearch process on the host has.
*/
private final Roles roles;
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah. I was trying to keep this PR from getting huge. I didn't accomplish that, but I tried.

return inner.select(host, meta) && hostSelector.select(host, meta);
}
};
return new RestClientView(delegate, combo);
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think you might do this if you have an application that uses a RestClientActions instance for "normal" operations that has NodeSelector.NOT_MASTER_ONLY and sometimes you want to send, say, nodes on a certain rack. If you do that then you'd call withNodeSelector really early to add the NOT_MASTER_ONLY selector and then call withNodeSelector many times, much later, to add the rack awareness. But only for those requests that care about it.

Or maybe you want to combine two selectors but don't want to worry about implementing AND logic.

for (int i = 0; i < numHttpServers; i++) {
HttpServer httpServer = createHttpServer();
httpServers[i] = httpServer;
httpHosts[i] = new HttpHost(httpServer.getAddress().getHostString(), httpServer.getAddress().getPort());
}
httpServers[0].createContext(pathPrefix + "/firstOnly", new ResponseHandler(200));
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I suppose it is good enough. I wanted an extra layer but I don't really need it to be honest.

* <li>It only runs once
* </ul>
*/
protected void sniffHostMetadata(RestClient client) throws IOException {
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'll move it. I started out thinking it was something that'd have to be done at construction time but it turned out not to be the case. Then I figured it was a thing that other ESRestTestCase subclasses might use. But at this point it doens't make a lot of sense. So I'll move it.

* Abstract implementation of {@link RestClientActions} shared by
* {@link RestClient} and {@link RestClientView}.
*/
abstract class AbstractRestClientActions implements RestClientActions {
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I use this so I only have to implement all the performRequest flavors. Without it I'd basically copy and paste the guts of it into both RestClientView and RestClient. Or I could declare 8 more methods in RestClient that take the NodeSelector and then delegate directly to them. I don't think either of those is particularly clean though.

I get wanting to return the abstract class and node having an interface at all, but I feel like the interface makes mocking simpler and it will allow us to change more things later without them being breaking changes.

Copy link
Member

@javanna javanna left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@nik9000 I did another pass, I have yet to look at the sniffing changes and tests, but I can only do it later today or tomorrow, so I am sending out now the comments that I have till now.

/**
* Make a copy of this {@link Node} but replacing its
* {@link #getHost() host}. Use this when the sniffing implementation
* returns returns a {@link #getHost() host} that is not useful to the
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

double "returns"

/**
* The node runs ingest pipelines.
*/
public boolean hasIngest() {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

man now that I see it I do prefer is over has :)

* that the {@link RestClient} would prefer to use them
* @return a subset of the provided list of {@linkplain Node}s that the
* selector approves of, in the order that the selector would prefer
* to use them.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

can we clarify in the docs when this gets called so it is clear that we call it for each single request, and the list holds all the nodes that will be tried for that request, in case there are failures that need to be retried?

}

/**
* Returns a new {@link RestClientBuilder} to help with {@link RestClient} creation.
* Creates a new builder instance and sets the hosts that the client will send requests to.
* Creates a new builder instance and sets the nodes that the client will send requests to.
* @see Node#Node(HttpHost)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

maybe the other variant that accepts Nodes should have the @see Node#Node(HttpHost) but this one shouldn't?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

++. Was a copy and paste mistake.

performRequestAsyncNoCatch(method, endpoint, params, entity, httpAsyncResponseConsumerFactory,
listener, headers);
return listener.get();
public Node[] getNodes() { // TODO is it ok to expose this? It feels excessive but we do use it in tests.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

make it package private?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I see that we need it from another package, I think it's ok.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It is used in sniffHostMetadata so we only sniff one time. That is off in the test framework.

return singletonList(selectedDeadNodes.get(0).getHost());
}
}
throw new IOException("NodeSelector [" + nodeSelector + "] rejcted all nodes, "
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

if you throw above, maybe throw inside, only in case the selector returns empty here too, that might make the code a bit more readable.

subset.add(node);
}
}
return subset;
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Maybe we should try and make this a bit smarter? In the extreme case that there are no alive non-master only nodes around, do we want to make the client resurrect one of the dead nodes (I'd say we should rather throw exception when we return empty from here)? Maybe we should rather use a master-only node in this specific case? While I write I am actually not sure about this. Maybe instead of overloading the master eligible nodes we should just bail :) But document this behaviour clearly?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I worry that "smarter" become "hard to understand" in this case. I think it makes sense to have the built in rule be hard and fast and if folks want to be more flexible in their application they can be.

One issue is that it is kind of hard to tell if you are being called with a list of live nodes or blacklisted ones. We could pass a boolean to communicate that, but that feels pretty complicated.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

you are right, this is good enough, and I agree at this point that we should try and revive a node if the selector returns empty.

* The maximum number of attempts that {@link #nextNode(NodeSelector)} makes
* before giving up and failing the request.
*/
private static final int MAX_NEXT_NODES_ATTEMPTS = 10;
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is no longer used, could be removed

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

++

List<Node> livingNodes = new ArrayList<>(nodeTuple.nodes.size() - blacklist.size());
List<DeadNodeAndRevival> deadNodes = new ArrayList<>(blacklist.size());
for (Node node : nodeTuple.nodes) {
DeadHostState deadness = blacklist.get(node.getHost());
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I was hesitant in the beginning due to the need for the additional class, but this looks great, it simplifies also concurrency as we read only once per node from the blacklist, so from then on we don't even have to retry as we kind of act based on the previous view of the blacklist. Not only does it allow us to remove the max_attempt, but also to remove retries due to the concurrent changes as a whole!

throw new IllegalArgumentException("host cannot be null");
}
this.host = host;
this.boundHosts = boundHosts;
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I am not sure that we still need this given that users can provide metadata together with the corresponding host. Can we remove this?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We still use this in the yaml testing framework to look up metadata.

And add tests that parse the `/_nodes/http` API responses from older
versions so we verify that we don't break it.
Copy link
Member

@javanna javanna left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Leaving a bunch of new comments, review round completed. This looks great, comments are all minor except a couple of them, especially the one on including node attributes when sniffing metadata, sniffing metadata from 2.x and retry logic in case the selector returns no nodes.

I will look closer at docs and naming as part of the next review round. As for testing against real responses I am fine if it is unit tests that include responses from different ES versions. We may want to add proper integration tests also for the sniffer as it is not well tested but that can come as a follow-up I think.

*/
public Response performRequest(String method, String endpoint, Header... headers) throws IOException {
return performRequest(method, endpoint, Collections.<String, String>emptyMap(), null, headers);
public void setHosts(HttpHost... hosts) {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think that we could remove this if we wish to. I feel like the method that everybody is using is RestClient#builder(HttpHost...) and is nice to keep also as a shortcut, while setHosts is used by the sniffer and more specialized. Yet it is no problem to keep either I think.

static List<HttpHost> selectHosts(NodeTuple<List<Node>> nodeTuple,
Map<HttpHost, DeadHostState> blacklist, AtomicInteger lastNodeIndex,
long now, NodeSelector nodeSelector) throws IOException {
class DeadNodeAndRevival {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think that once you merge master in this will hold Node and DeadHostState, you could rename it to DeadNode or DeadNodeState?

if (false == livingNodes.isEmpty()) {
/*
* Normal state: there is at least one living node. Rotate the
* list so subsequent requests to will see prefer the nodes in
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

something is wrong in this sentence :)

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

++. I accidentally left an extra extra word.

/**
* Stateless view into a {@link RestClient} with customized parameters.
*/
class RestClientView extends AbstractRestClientActions {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

maybe RestClientWrapper but its not much better than RestClientView :)

* if we don't know what roles the node has.
*/
private final Roles roles;

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

we need node attributes in here too

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I kind of want to do it in a followup if that is ok with you.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

ok as long as we don't forget :)

* returns returns a {@link #getHost() host} that is not useful to the
* client.
*/
public Node withHost(HttpHost host) {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

call it replaceHost?

performRequestAsyncNoCatch(method, endpoint, params, entity, httpAsyncResponseConsumerFactory,
listener, headers);
return listener.get();
public Node[] getNodes() { // TODO is it ok to expose this? It feels excessive but we do use it in tests.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I see that we need it from another package, I think it's ok.

listener, headers);
return listener.get();
public Node[] getNodes() { // TODO is it ok to expose this? It feels excessive but we do use it in tests.
return nodeTuple.nodes.toArray(new Node[0]);
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nodes are immutable, so there can't be changes made to the returned array which can affect the running rest client right? Can you use the known size here rather than 0?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think I can get away with just returning the List which is indeed immutable. I'll try that.

}
this.hostTuple = new HostTuple<>(Collections.unmodifiableSet(httpHosts), authCache);
this.blacklist.clear();
public static RestClientBuilder builder(Node... nodes) {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

could we clarify in the docs why one would use this over the other one and what a Node is for compared to a host?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sure!

* first and then running the "left" selector on the results of the "right"
* selector.
*/
class Compose implements NodeSelector {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I am pretty sure that we don't need this, especially given that the only time we use it in our tests is to compose (any , selector) . It is cool but I think it adds complexity, needs testing, while a single selector can do all it wants alone, no need to chain them.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We also use it in RestClientView.withNodeSelector. I expect we want to keep withNodeSelector on RestClientView because we'll end up with other withXXX style methods which will all return RestClientView. So folks will do client.withTime(123).withNodeSelector(bar).

We could implement RestClientView.withNodeSelector to check if the NodeSelector in the view is ANY and throw an IllegalStateException if it isn't but that feels weird.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I would keep things simple and use SetOnce to make sure that users can only set one node selector, if not set we use ANY

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

oh man SetOnce is a lucene thing though :x

@javanna
Copy link
Member

javanna commented Mar 28, 2018

one more thing, on the usability aspect and naming with the new RestClientActions interface, maybe @tlrx could have a look too, I am curious to hear what he thinks.

@nik9000
Copy link
Member Author

nik9000 commented Mar 28, 2018

RestClientActions

I'm really not sure what a good name is. I want to communicate:

  1. You can throw it away without worrying about closing it.
  2. You can make all the performRequest calls on it but can't do things like setHosts. I mean, we could make setHosts work by pushing it back to the real RestClient but that feels weird to me.
  3. It really exists just to fix parameters for some number of requests.
  4. You can keep this object around for a long time if you want or you can make a new one every time without much fuss.

Copy link
Member

@javanna javanna left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I did another pass, looked deeper at javadocs, found a few small things. I think it's time to merge master in.

import org.apache.http.HttpHost;

/**
* Metadata about an {@link HttpHost} running Elasticsearch.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can you add some more info here maybe on the type of metadata?

}
/**
* The node runs ingest pipelines.
*/
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

When we add docs to getters should we add "Returns whether" or something like that?

*/
public interface NodeSelector {
/**
* Select the {@link Node}s to which to send requests. This may be called
Copy link
Member

@javanna javanna Mar 29, 2018

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I would try and be even more explicit on what this method does (feel free to make adjustments though), something like: "Receives a sorted list of nodes, in the order that the client would try them for a single request, and returns a sorted list of nodes selected out of the provided ones. Provided nodes may be alive ones, in which case returning an empty list means that the client will try and revive one dead node taken from the blacklist. When reviving nodes, this method will be called too with a list of sorted nodes as argument, where the first one is the one that the client would revive. The first node returned by the selector is the one that will be revived."

* return a list of nodes sorted by its preference for which node is used.
* If it is operating on "living" nodes that it returns function as
* fallbacks in case of request failures. If it is operating on dead nodes
* then the dead node that it returns is attempted but no others.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

only the first node returned is attempted but no others

* first and then running the "left" selector on the results of the "right"
* selector.
*/
class Compose implements NodeSelector {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

oh man SetOnce is a lucene thing though :x

return new NodeTuple<>(hosts.iterator(), nodeTuple.authCache);
}

static List<Node> selectHosts(NodeTuple<List<Node>> nodeTuple,
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

maybe this could be folded back into nextNode now?

* If no living and no dead nodes match the selector we retry a few
* times to handle concurrent modifications of the list of dead nodes.
* We never block the thread or {@link Thread#sleep} or anything like
* that. If the retries fail this throws a {@link IOException}.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

remove this part which doesn't apply anymore

if (false == livingNodes.isEmpty()) {
/*
* Normal state: there is at least one living node. Rotate the
* list so subsequent requests to will prefer the nodes in a
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

subsequent requests to ?

*/
Collections.rotate(livingNodes, lastNodeIndex.getAndIncrement());
List<Node> selectedLivingNodes = nodeSelector.select(livingNodes);
if (false == selectedLivingNodes.isEmpty()) {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nice

* NodeSelector so it can have its say in which nodes are ok and their
* ordering. If the selector is ok with any of the nodes then use just
* the first one in the list because we only want to revive a single
* node.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

++

* Actions that can be taken on a {@link RestClient} or the stateless views
* returned by {@link RestClientActions#withNodeSelector withNodeSelector}.
*/
public interface RestClientActions {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

we talked about this new way of sending requests with @nik9000 and we said that we will look into changing the existing performRequest methods to accept a Request object, so that we can trim down the different performRequest methods that we have to two: sync and async. I feel like this is going to be more familiar to most of our users and something that we should have done from day one.

@nik9000
Copy link
Member Author

nik9000 commented May 10, 2018

I've spent the morning rebuilding this on top of the Request object work (#29623) and I'll open a new version of this soon.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

5 participants