forked from elastic/elasticsearch
Merge Remote Tracking Branch #1
Merged
This change reintroduces the system index APIs for Kibana without the changes made for marking what system indices could be accessed using these APIs. In essence, this is a partial revert of #53912. The changes for marking what system indices should be allowed access will be handled in a separate change. The APIs introduced here are wrapped versions of the existing REST endpoints. A new setting is also introduced since the Kibana system indices' names are allowed to be changed by a user in case multiple instances of Kibana use the same instance of Elasticsearch. Relates #52385
Co-authored-by: Lisa Cawley <[email protected]>
Today the shards of searchable snapshots are allocated with a naive `ExistingShardsAllocator` which selects the first valid node for each shard. Thanks to #54729 we can now allow these shards to fall through to the balanced shards allocator so that they are allocated in a more balanced fashion. Relates #50999
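To illustrate the difference between the two strategies, here is a minimal sketch. It is not the Elasticsearch allocator API: `AllocationSketch`, `firstValid` and `leastLoaded` are hypothetical names, and "valid" is simplified to "every node is valid". Under those assumptions, always picking the first valid node piles every shard onto one node, whereas a least-loaded choice spreads them out.

```java
// Hypothetical sketch; these names are not part of Elasticsearch.
final class AllocationSketch {

    /** Naive strategy: every shard goes to the first valid node (index 0 here). */
    static int[] firstValid(int shards, int nodes) {
        int[] perNode = new int[nodes];
        for (int s = 0; s < shards; s++) {
            perNode[0]++;
        }
        return perNode;
    }

    /** Balanced strategy: each shard goes to the node currently holding the fewest shards. */
    static int[] leastLoaded(int shards, int nodes) {
        int[] perNode = new int[nodes];
        for (int s = 0; s < shards; s++) {
            int best = 0;
            for (int n = 1; n < nodes; n++) {
                if (perNode[n] < perNode[best]) {
                    best = n;
                }
            }
            perNode[best]++;
        }
        return perNode;
    }
}
```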
If we run into an INIT state snapshot and the current master didn't create it, it will be removed anyway, so there is no need to have it block another snapshot from starting. This has practical relevance because on master fail-over after snapshot INIT but before start, the create snapshot request will be retried by the client (as it's a transport master node action) and needlessly fail with an unexpected exception (the snapshot clearly didn't exist, so it's confusing to the user). This allowed making two disruption type tests stricter.
) Implement DATETIME_FORMAT(<date/datetime/time>, ) function which allows for formatting a timestamp to the specified format. The allowed patterns are those of java.time.format.DateTimeFormatter. Related to #53714
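As a rough illustration of what those patterns look like, the sketch below formats a timestamp directly with `java.time.format.DateTimeFormatter`. The class and method names (`DateTimeFormatSketch`, `format`) are hypothetical and this is not the SQL function's implementation; it only shows the pattern syntax the function is described as accepting.

```java
import java.time.LocalDateTime;
import java.time.format.DateTimeFormatter;
import java.util.Locale;

// Hypothetical helper; only the java.time calls are real APIs.
final class DateTimeFormatSketch {

    /** Formats the timestamp with a DateTimeFormatter pattern such as "dd/MM/yyyy HH:mm:ss". */
    static String format(LocalDateTime timestamp, String pattern) {
        // Locale.ROOT keeps the output independent of the machine's default locale.
        return DateTimeFormatter.ofPattern(pattern, Locale.ROOT).format(timestamp);
    }
}
```

For example, `format(LocalDateTime.of(2020, 4, 10, 13, 5, 9), "dd/MM/yyyy HH:mm:ss")` yields `10/04/2020 13:05:09`.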
When remote run configuration is executed in IntelliJ Community edition, it automatically adds the RunnedSettings section. This section doesn't seem to have any negative effect on the Ultimate Edition.
* Drop BASE TABLE type in favour of just TABLE
This commit drops the table type 'BASE TABLE' and replaces all occurrences with just 'TABLE', since this type is more widely used and friendlier to the client applications that query for certain table types in their discovery mode. The 'TABLE' type is also explicitly mentioned by the JDBC and ODBC standards and although other data source-specific types are permitted, older apps will not work well with them.
* Refactor table type constants out of IndexType
Move SQL_TABLE/_ALIAS out of IndexType, so that they can also be used in that Enum definition.
Co-authored-by: Elastic Machine <[email protected]>
We used to allow non-data/non-master nodes to not have a persistent data path via the undocumented node.local_storage setting. We recently removed this setting, but left behind was a guard around a check that the data paths support atomic moves. This commit unguards this check, so that all nodes are required to have persistent storage that supports atomic move operations.
Today we construct the node environment relatively early in the node construction process, before we have even constructed the final environment, which means before the final settings are available. Rather, we should defer constructing the node environment until the final environment is available. This commit does that. This helps delay node environment construction until after the node roles are properly determined, which is important since the node environment does some checks on the basis of whether or not the node is neither a data nor a master node (such nodes should not have index metadata nor shard data on disk). Note that a consequence of this is that the initial log line that displays the node name, node ID, and cluster name does not appear until later in startup (after we have loaded plugins). This seems okay.
This commit bumps the minimum JDK required for compilation to JDK 14.
…54931) The PercolatorQuerySearchIT tests do not support SMILE since it cannot create valid UTF-8 which the Percolator queries want.
which is to what it is set in the 7.x branch.
This commit updates the CI defaults so that JAVA14_HOME is set.
We occasionally add a global template for our YAML tests, and this can cause warnings for these template tests. This commit adds these warnings so they don't cause test failures. Resolves #54822 Co-authored-by: Elastic Machine <[email protected]>
This changes the priority of the cluster state update that stops ILM altogether to `IMMEDIATE`. We've chosen to change this as it can be useful to temporarily stop ILM if a cluster is overwhelmed, but a `NORMAL` priority can see the "stop ILM update" not make it up the tasks queue. On the same note, we're keeping the `start ILM` cluster update priority to `NORMAL` on purpose such that we only start `ILM` if the cluster can handle it.
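The effect of the priority change can be sketched with a plain priority queue. This is a simplified model, not the cluster service implementation: `PrioritySketch`, its two-value `Priority` enum and `Task` are hypothetical, and the assumption is that tasks are ordered by priority first and submission order second. Under that assumption an `IMMEDIATE` stop request jumps ahead of `NORMAL` tasks that were queued earlier.

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.List;
import java.util.PriorityQueue;

// Hypothetical model of a prioritized task queue.
final class PrioritySketch {

    /** Declaration order defines the ordering: earlier constants run first. */
    enum Priority { IMMEDIATE, NORMAL }

    static final class Task {
        final String name;
        final Priority priority;
        final long seq; // submission order, used to break ties within a priority

        Task(String name, Priority priority, long seq) {
            this.name = name;
            this.priority = priority;
            this.seq = seq;
        }
    }

    /** Returns task names in the order a priority-then-FIFO queue would run them. */
    static List<String> drainOrder(List<Task> submitted) {
        PriorityQueue<Task> queue = new PriorityQueue<>(
            Comparator.comparing((Task t) -> t.priority).thenComparingLong(t -> t.seq));
        queue.addAll(submitted);
        List<String> order = new ArrayList<>();
        while (!queue.isEmpty()) {
            order.add(queue.poll().name);
        }
        return order;
    }
}
```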
This commit adds a timeout when moving ILM back on to a failed step. In case the master is struggling with processing the cluster update requests, these will expire (as we'll send them again anyway on the next ILM loop run).
* ILM more descriptive source messages for cluster updates
* Use the configured ILM step master timeout setting
* Add Snapshot Resiliency Test for Master Failover during Delete
We only have very indirect coverage of master failovers during snapshot delete at the moment. This commit adds a direct test of this scenario and also an assertion that makes sure we are not leaking any snapshot completion listeners in the snapshots service in this scenario. This gives us better coverage of scenarios like #54256 and makes the diff to the upcoming more consistent snapshot delete implementation in #54705 smaller.
If we run into `length == 0` we trip an assertion in `randomIntBetween(0, length -1)`.
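A minimal sketch of the failure mode and one possible guard follows. The helper here is a stand-in written for illustration, not the test framework's own `randomIntBetween`; `RandomIndexSketch` and `randomIndexOrMinusOne` are hypothetical names, and returning `-1` for an empty input is just one way a caller might handle the case.

```java
import java.util.Random;

// Hypothetical stand-in for a test helper; not the framework's implementation.
final class RandomIndexSketch {

    /** Returns a value in [min, max]; an empty range (min > max) is rejected. */
    static int randomIntBetween(Random random, int min, int max) {
        if (min > max) {
            throw new IllegalArgumentException("min [" + min + "] > max [" + max + "]");
        }
        return min + random.nextInt(max - min + 1);
    }

    /** Picks a random valid index, or -1 when length == 0 and there is nothing to pick. */
    static int randomIndexOrMinusOne(Random random, int length) {
        if (length == 0) {
            return -1; // avoids calling randomIntBetween(0, -1)
        }
        return randomIntBetween(random, 0, length - 1);
    }
}
```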
When a new index is rolled over, we check to see whether there are any duplicate alias configurations in the index template configuration. Additionally, when a new index is created from a bulk action, we check the templates to see if there are any ingest pipelines that need to be applied to the index that will be newly created. Both of these actions previously checked only the v1 templates for their settings; they now also check the v2 index templates, with the v2 index templates taking precedence, similar to the way they do when creating an index. Relates to #53101
This change converts the module and plugin parameters for testClusters to be lazy, meaning that the values are not resolved until they are actually used. This removes the requirement to use project.afterEvaluate to be able to resolve the bundle artifact. Note: this does not completely remove the need for afterEvaluate, since it is still needed for the custom resource extension.
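The idea of a lazily resolved parameter can be sketched in plain Java with a memoizing `Supplier`. This is an illustration only: `LazySketch` is a hypothetical class, and the actual build code may rely on Gradle's own lazy types rather than anything like this.

```java
import java.util.function.Supplier;

// Hypothetical memoizing wrapper; the value is computed on first use only.
final class LazySketch<T> implements Supplier<T> {
    private final Supplier<T> delegate;
    private T value;
    private boolean resolved;

    LazySketch(Supplier<T> delegate) {
        this.delegate = delegate;
    }

    @Override
    public synchronized T get() {
        if (!resolved) {
            value = delegate.get(); // deferred until someone actually needs the value
            resolved = true;
        }
        return value;
    }
}
```

Registering such a value at configuration time costs nothing; the delegate only runs when `get()` is first called, which is why the caller no longer has to wait for a later lifecycle hook before handing the parameter over.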
ForbiddenApis task via the precommit task currently makes an assumption that only the test and main source sets are present for any given project. This commit removes that assumption and allows for any project source set's compileClasspath class path to be added to the forbiddenApis classpath configuration.
These work *much* better. Co-authored-by: Christoph Büscher <[email protected]>
This change makes sure that all internal client requests spawned by the data frame analytics persistent task executor, and that use the end user security credentials, have the parent task id assigned. The objective here is to permit auditing (as well as tracking for debugging purposes) of all the end-user requests executed on their behalf by persistent tasks. Because the data frame analytics task already implements graceful shutdown of child tasks, this change does not interfere with it by opting out of the persistent task cancellation of child tasks. Relates #54943 #52314
Adds support for filters to T-Test aggregation. The filters can be used to select populations based on some criteria and use values from the same or different fields. Closes #53692
We found some problems during testing.

Data: 200 million docs, 1 shard, 0 replicas

hits | avg | sum | value_count |
----------- | ------- | ------- | ----------- |
20,000 | .038s | .033s | .063s |
200,000 | .127s | .125s | .334s |
2,000,000 | .789s | .729s | 3.176s |
20,000,000 | 4.200s | 3.239s | 22.787s |
200,000,000 | 21.000s | 22.000s | 154.917s |

The performance of `avg`, `sum` and other aggregations is very close, but the performance of `value_count` has always been poor, not even in the same order of magnitude. Since `value_count` and `sum` are similar operations, we would expect them to take about the same time. We therefore looked into the `value_count` aggregation.

Counting in Elasticsearch works by traversing the field of each document. If the field is an ordinary value, the count is increased by 1. If it is an array, the count is increased by n. The problem lies in traversing each document and extracting the field, which turns the on-disk value into a Java object. We summarize the current problems as:

- Number-to-string conversion overhead, and GC problems caused by a large number of strings
- After a number is converted to a string, sorting and other unnecessary operations are performed

Here is the proof of the type conversion overhead.

```
// Java long to string source code, getChars is very time-consuming.
public static String toString(long i) {
    int size = stringSize(i);
    if (COMPACT_STRINGS) {
        byte[] buf = new byte[size];
        getChars(i, size, buf);
        return new String(buf, LATIN1);
    } else {
        byte[] buf = new byte[size * 2];
        StringUTF16.getChars(i, size, buf);
        return new String(buf, UTF16);
    }
}
```

test type | average | min | max | sum
------------ | ------- | ---- | ----------- | -------
double->long | 32.2ns | 28ns | 0.024ms | 3.22s
long->double | 31.9ns | 28ns | 0.036ms | 3.19s
long->String | 163.8ns | 93ns | 1921 ms | 16.3s

#36752

The program heat map shows that the time spent in toString is particularly significant.

## optimization

Our optimization is actually very simple: handle the different types separately instead of converting everything to strings and processing them uniformly. We added type identification in ValueCountAggregator and special handling for number and geo_point types that skips their type conversion. Because far fewer strings are created, the improvement is very obvious.

## result

hits | avg | sum | value_count double (before) | value_count double (after) | value_count keyword (before) | value_count keyword (after) | value_count geo_point (before) | value_count geo_point (after) |
----------- | ------- | ------- | ----------- | ----------- | ----------- | ----------- | ----------- | ----------- |
20,000 | .038s | .033s | .063s | .026s | .030s | .030s | .038s | .015s |
200,000 | .127s | .125s | .334s | .078s | .116s | .099s | .278s | .031s |
2,000,000 | .789s | .729s | 3.176s | .439s | .348s | .386s | 3.365s | .178s |
20,000,000 | 4.200s | 3.239s | 22.787s | 2.700s | 2.500s | 2.600s | 25.192s | 1.278s |
200,000,000 | 21.000s | 22.000s | 154.917s | 18.990s | 19.000s | 20.000s | 168.971s | 9.093s |

- The results are more in line with common sense. `value_count` now takes about the same time as `avg`, `sum`, etc., or even less. Previously, `value_count` took much longer than `avg` and `sum`, not even in the same order of magnitude when the amount of data was large.
- When calculating numeric types such as `double` and `long`, the performance is improved by about 8 to 9 times; when calculating the `geo_point` type, the performance is improved by 18 to 20 times.
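The shape of the optimization described above can be sketched as follows. This is not the `ValueCountAggregator` code: `ValueCountSketch` and its two methods are hypothetical, and documents are modelled as plain `long[]` arrays. The first method mimics the old path by creating a `String` per value before counting it; the second counts numeric values directly, which produces the same count without any per-value allocation.

```java
// Hypothetical sketch; documents are modelled as arrays of long values.
final class ValueCountSketch {

    /** Old-style path: every number is converted to a String before it is counted. */
    static long countViaStrings(long[][] docs) {
        long count = 0;
        for (long[] values : docs) {
            for (long v : values) {
                String asString = Long.toString(v); // one allocation per value
                if (!asString.isEmpty()) {
                    count++;
                }
            }
        }
        return count;
    }

    /** Type-aware path: numeric values are counted without any conversion. */
    static long countNumeric(long[][] docs) {
        long count = 0;
        for (long[] values : docs) {
            count += values.length; // a single value adds 1, an array of n values adds n
        }
        return count;
    }
}
```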
* EQL: Add string() function
* EQL: Reorder queryfolder_tests
* EQL: Add test queries
* EQL: Fix InternalEqlScriptUtils.string and test case
* EQL: Fix testStringFunctionWithText error message
* EQL: Flatten ToStringFunctionPipe.equals
* EQL: Reorder painless whitelist
* EQL: Address feedback and remove string(null) handling
* EQL: Move string(pid) test over
* EQL: Rename source -> value
…5069) Mute test in versions that do not support password protected keystores. This didn't fail in the PR check since we run MixedClusterClientYamlTestSuiteIT against 4 nodes (2 old and 2 new ) and this happened to hit a node that was on master rather than one that was on 7.8.0-SNAPSHOT
Moves `indices` content from the [Modules][0] section to the [Configuring Elasticsearch][1] section. Also removes the [Indices][2] landing page and adds a related redirect.

[0]: https://www.elastic.co/guide/en/elasticsearch/reference/master/modules.html
[1]: https://www.elastic.co/guide/en/elasticsearch/reference/master/settings.html
[2]: https://www.elastic.co/guide/en/elasticsearch/reference/master/modules-indices.html
Currently the remote cluster sniff connection process can succeed even if no connections are opened. This commit fixes this by failing the connection process if no connections are successfully opened.
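The corrected behaviour can be sketched as below. This is a simplified model rather than the remote cluster connection code: `SniffSketch`, `connect` and the `tryOpen` predicate are hypothetical, and a connection attempt is reduced to a boolean result.

```java
import java.util.List;
import java.util.function.Predicate;

// Hypothetical model of a sniff round over a list of candidate nodes.
final class SniffSketch {

    /**
     * Tries to open a connection to each node and returns how many succeeded.
     * Fails the whole process if not a single connection could be opened.
     */
    static int connect(List<String> nodes, Predicate<String> tryOpen) {
        int opened = 0;
        for (String node : nodes) {
            if (tryOpen.test(node)) {
                opened++;
            }
        }
        if (opened == 0) {
            throw new IllegalStateException("unable to open any connections to remote cluster");
        }
        return opened;
    }
}
```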
…nse (#54619) We currently create the .async-search index if necessary before performing any action (index, update or delete). Truth is that this is needed only before storing the initial response. The other operations are either update or delete, which will anyways not find the document to update/delete even if the index gets created when missing. This also caused `testCancellation` failures as we were trying to delete the document twice from the .async-search index, once from `TransportDeleteAsyncSearchAction` and once as a consequence of the search task being completed. The latter may be called after the test is completed, but before the cluster is shut down and causing problems to the after test checks, for instance if it happens after all the indices have been cleaned up. It is totally fine to try to delete a response that is no longer found, but not quite so if such call will also trigger an index creation. With this commit we remove all the calls to createIndexIfNecessary from the update/delete operation, and we leave one call only from storeInitialResponse which is where the index is expected to be created. Closes #54180
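A minimal in-memory sketch of the resulting behaviour: only storing the initial response creates the index, while update and delete simply report "not found" when the index or document is missing. `AsyncStoreSketch` and its methods are hypothetical and stand in for the real index service; the index is modelled as a map that is `null` until created.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical in-memory stand-in for the .async-search index.
final class AsyncStoreSketch {
    private Map<String, String> index; // null means the index does not exist yet

    boolean indexExists() {
        return index != null;
    }

    /** The only operation that creates the index when it is missing. */
    void storeInitialResponse(String id, String response) {
        if (index == null) {
            index = new HashMap<>();
        }
        index.put(id, response);
    }

    /** Never creates the index; returns false if there is nothing to update. */
    boolean update(String id, String response) {
        if (index == null || !index.containsKey(id)) {
            return false;
        }
        index.put(id, response);
        return true;
    }

    /** Never creates the index; returns false if there is nothing to delete. */
    boolean delete(String id) {
        return index != null && index.remove(id) != null;
    }
}
```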
We added a fancy method to provide random realistic test data to the reduction tests in #54910. This uses that to remove some of the more esoteric machinations in the agg tests. This will marginally increase the coverage of the serialization tests and, more importantly, remove some mysterious value generation code that only really made sense for random reduction tests but was used all over the place. It doesn't, on the other hand, make the tests shorter. Just *hopefully* more clear. I only cleaned up a few tests this way. If we like this it'd probably be worth grabbing others.
The isAuthAllowed() method for license checking is used by code that wants to ensure security is both enabled and available. The enabled state is dynamic and provided by isSecurityEnabled(). But since security is available with all license types, a check on the license level is not necessary. Thus, this change replaces isAuthAllowed() with calling isSecurityEnabled().
Some of these characters are special to Asciidoctor and they ruin the rendering on this page. Instead, we use a macro to pass through these characters without Asciidoctor applying any substitutions to them. This commit then addresses some rendering issues in the thread pool docs. Co-authored-by: James Rodewig <[email protected]>
Changes boilerplate sentence of "If using a field as the argument, this parameter only supports..." to "...this parameter supports only...". The latter is a bit more clear and readable.
* Update policy-definitions.asciidoc The docs show how to create an ILM policy, but not how to add the attr `node.attr.data: warm` (or hot). This PR adds to the hot,warm example.
The usage of the local parameter for GetFieldMappingRequest has been removed from the underlying transport action since v2.0. This PR deprecates the parameter in the REST layer. It will be removed in the next major version.
Co-authored-by: James Rodewig <[email protected]>
Enables EQL in release builds for testing Fixes #55112
Preparation for backport of #55066
We needlessly send documents to be persisted. If there are no stats added, then we should not attempt to persist them. Also, this PR fixes the race condition that caused issue: #54786
Adds analytics plugin usage stats to _xpack/usage. Closes #54847
No description provided.