Commit ea2dbd9

Author: Christoph Büscher
Add field type for version strings (#59773)
This PR adds a new 'version' field type that allows indexing string values representing software versions similar to the ones defined in the Semantic Versioning definition (semver.org). The field behaves very similarly to a 'keyword' field but allows efficient sorting and range queries that take into account the special ordering needed for version strings. For example, the main version parts are sorted numerically (e.g. 2.0.0 < 11.0.0), which isn't possible with 'keyword' fields today.

Valid version values are similar to the Semantic Versioning definition, with the notable exception that in addition to the "main" version consisting of major.minor.patch, we allow fewer or more than three numeric identifiers, e.g. "1.2" or "1.4.6.123.12" are treated as valid too.

Relates to #48878
1 parent cad2560 commit ea2dbd9
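
To make the ordering difference described in the commit message concrete, here is a minimal sketch that sorts the same values by plain string order and by numeric main-version parts. It is not the code added by this commit: the class `MainVersionOrder`, the method `compareMainParts`, and the tie-break for versions with a different number of parts are assumptions made for illustration, and pre-release suffixes are ignored.

```java
import java.util.Arrays;

final class MainVersionOrder {

    /** Compares dot-separated numeric parts left to right, e.g. "2.0.0" vs "11.0.0". */
    static int compareMainParts(String a, String b) {
        String[] left = a.split("\\.");
        String[] right = b.split("\\.");
        int shared = Math.min(left.length, right.length);
        for (int i = 0; i < shared; i++) {
            // Parse each part as a number so that 2 < 11, unlike the string order "11" < "2".
            int cmp = Long.compare(Long.parseLong(left[i]), Long.parseLong(right[i]));
            if (cmp != 0) {
                return cmp;
            }
        }
        // Assumption for this sketch: on an equal prefix, fewer parts sort first ("1.2" before "1.2.0").
        return Integer.compare(left.length, right.length);
    }

    public static void main(String[] args) {
        String[] versions = { "11.0.0", "2.0.0", "1.4.6.123.12", "1.2" };

        String[] stringOrder = versions.clone();
        Arrays.sort(stringOrder); // plain character-by-character string order
        System.out.println("string order:  " + Arrays.toString(stringOrder));

        String[] versionOrder = versions.clone();
        Arrays.sort(versionOrder, MainVersionOrder::compareMainParts);
        System.out.println("version order: " + Arrays.toString(versionOrder));
    }
}
```

In plain string order "11.0.0" precedes "2.0.0" because the character '1' is smaller than '2'; comparing the parsed numbers reverses that pair.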

20 files changed, +2094 -3 lines changed

docs/reference/mapping/types.asciidoc

+4
@@ -51,6 +51,8 @@ Dates:: Date types, including <<date,`date`>> and
 <<range,Range>>:: Range types, such as `long_range`, `double_range`,
 `date_range`, and `ip_range`.
 <<ip,`ip`>>:: IPv4 and IPv6 addresses.
+<<version,Version>>:: Software versions. Supports https://semver.org/[Semantic Versioning]
+precedence rules.
 {plugins}/mapper-murmur3.html[`murmur3`]:: Compute and stores hashes of
 values.
 
@@ -148,6 +150,8 @@ include::types/geo-shape.asciidoc[]
 
 include::types/ip.asciidoc[]
 
+include::types/version.asciidoc[]
+
 include::types/parent-join.asciidoc[]
 
 include::types/keyword.asciidoc[]

docs/reference/mapping/types/keyword.asciidoc

+1
@@ -129,3 +129,4 @@ The following parameters are accepted by `keyword` fields:
 include::constant-keyword.asciidoc[]
 
 include::wildcard.asciidoc[]
+
@@ -0,0 +1,70 @@
+[role="xpack"]
+[testenv="basic"]
+[[version]]
+=== Version field type
+++++
+<titleabbrev>Version</titleabbrev>
+++++
+
+The `version` field type is a specialization of the `keyword` field for
+handling software version values and to support specialized precedence
+rules for them. Precedence is defined following the rules outlined by
+https://semver.org/[Semantic Versioning], which for example means that
+major, minor and patch version parts are sorted numerically (i.e.
+"2.1.0" < "2.4.1" < "2.11.2") and pre-release versions are sorted before
+release versions (i.e. "1.0.0-alpha" < "1.0.0").
+
+You index a `version` field as follows
+
+[source,console]
+--------------------------------------------------
+PUT my-index-000001
+{
+  "mappings": {
+    "properties": {
+      "my_version": {
+        "type": "version"
+      }
+    }
+  }
+}
+
+--------------------------------------------------
+
+The field offers the same search capabilities as a regular keyword field. It
+can e.g. be searched for exact matches using `match` or `term` queries and
+supports prefix and wildcard searches. The main benefit is that `range` queries
+will honor Semver ordering, so a `range` query between "1.0.0" and "1.5.0"
+will include versions of "1.2.3" but not "1.11.2" for example. Note that this
+would be different when using a regular `keyword` field for indexing where ordering
+is alphabetical.
+
+Software versions are expected to follow the
+https://semver.org/[Semantic Versioning rules] schema and precedence rules with
+the notable exception that more or less than three main version identifiers are
+allowed (i.e. "1.2" or "1.2.3.4" qualify as valid versions while they wouldn't under
+strict Semver rules). Version strings that are not valid under the Semver definition
+(e.g. "1.2.alpha.4") can still be indexed and retrieved as exact matches, however they
+will all appear _after_ any valid version with regular alphabetical ordering. The empty
+String "" is considered invalid and sorted after all valid versions, but before other
+invalid ones.
+
+[discrete]
+[[version-params]]
+==== Parameters for version fields
+
+The following parameters are accepted by `version` fields:
+
+[horizontal]
+
+<<mapping-field-meta,`meta`>>::
+
+Metadata about the field.
+
+[discrete]
+==== Limitations
+
+This field type isn't optimized for heavy wildcard, regex or fuzzy searches. While those
+type of queries work in this field, you should consider using a regular `keyword` field if
+you strongly rely on these kind of queries.
+
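
The precedence rules documented in the new `version.asciidoc` page can be sketched as a comparator. This is only an illustration under stated assumptions, not the plugin's implementation: the class `VersionOrderSketch`, the validity pattern `VALID`, and the choice to compare pre-release labels as plain strings are all hypothetical simplifications. The full Semantic Versioning specification compares pre-release identifiers part by part.

```java
import java.util.Arrays;
import java.util.regex.Pattern;

final class VersionOrderSketch {

    // Hypothetical validity check: numeric dot-separated main parts plus an optional "-prerelease" label.
    private static final Pattern VALID = Pattern.compile("\\d+(\\.\\d+)*(-[0-9A-Za-z.-]+)?");

    static boolean isValid(String version) {
        return VALID.matcher(version).matches();
    }

    static int compare(String a, String b) {
        boolean validA = isValid(a);
        boolean validB = isValid(b);
        if (validA != validB) {
            return validA ? -1 : 1; // every valid version sorts before every invalid one
        }
        if (!validA) {
            // Both invalid: plain string order. "" is the smallest string, so it precedes other invalid values.
            return a.compareTo(b);
        }
        String[] partsA = a.split("-", 2);
        String[] partsB = b.split("-", 2);
        int cmp = compareMainParts(partsA[0], partsB[0]);
        if (cmp != 0) {
            return cmp;
        }
        boolean preA = partsA.length == 2;
        boolean preB = partsB.length == 2;
        if (preA != preB) {
            return preA ? -1 : 1; // a pre-release sorts before the corresponding release
        }
        // Simplification: pre-release labels are compared as plain strings.
        return preA ? partsA[1].compareTo(partsB[1]) : 0;
    }

    private static int compareMainParts(String a, String b) {
        String[] left = a.split("\\.");
        String[] right = b.split("\\.");
        for (int i = 0; i < Math.min(left.length, right.length); i++) {
            int cmp = Long.compare(Long.parseLong(left[i]), Long.parseLong(right[i]));
            if (cmp != 0) {
                return cmp;
            }
        }
        // Assumption: on an equal prefix, the version with fewer parts sorts first.
        return Integer.compare(left.length, right.length);
    }

    public static void main(String[] args) {
        String[] versions = { "v3.1.0", "1.el6", "", "2.0.0-beta", "1.1.0", "1.0.0", "1.0.0-alpha", "2.11.2", "2.4.1" };
        Arrays.sort(versions, VersionOrderSketch::compare);
        System.out.println(Arrays.toString(versions));
    }
}
```

Under these assumptions the sketch reproduces the documented examples: "1.2.3" falls between "1.0.0" and "1.5.0" while "1.11.2" does not, and the malformed values "1.el6" and "v3.1.0" sort after every valid version.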

server/src/main/java/org/elasticsearch/index/mapper/TermBasedFieldType.java

+3-2
@@ -34,9 +34,10 @@
 
 /** Base {@link MappedFieldType} implementation for a field that is indexed
  * with the inverted index. */
-abstract class TermBasedFieldType extends SimpleMappedFieldType {
+public abstract class TermBasedFieldType extends SimpleMappedFieldType {
 
-    TermBasedFieldType(String name, boolean isSearchable, boolean hasDocValues, TextSearchInfo textSearchInfo, Map<String, String> meta) {
+    public TermBasedFieldType(String name, boolean isSearchable, boolean hasDocValues, TextSearchInfo textSearchInfo,
+            Map<String, String> meta) {
         super(name, isSearchable, hasDocValues, textSearchInfo, meta);
     }
 

x-pack/plugin/analytics/src/main/java/org/elasticsearch/xpack/analytics/stringstats/StringStatsAggregator.java

+1-1
@@ -101,7 +101,7 @@ public void collect(int doc, long bucket) throws IOException {
                 for (int i = 0; i < valuesCount; i++) {
                     BytesRef value = values.nextValue();
                     if (value.length > 0) {
-                        String valueStr = value.utf8ToString();
+                        String valueStr = (String) format.format(value);
                         int length = valueStr.length();
                         totalLength.increment(bucket, length);
 

@@ -0,0 +1,107 @@
+# Integration tests for the version field
+#
+---
+setup:
+
+  - skip:
+      features: headers
+      version: " - 7.99.99"
+      reason: "version field is added to 8.0 first"
+
+  - do:
+      indices.create:
+        index: test_index
+        body:
+          mappings:
+            properties:
+              version:
+                type: version
+
+  - do:
+      bulk:
+        refresh: true
+        body:
+          - '{ "index" : { "_index" : "test_index", "_id" : "1" } }'
+          - '{"version": "1.1.0" }'
+          - '{ "index" : { "_index" : "test_index", "_id" : "2" } }'
+          - '{"version": "2.0.0-beta" }'
+          - '{ "index" : { "_index" : "test_index", "_id" : "3" } }'
+          - '{"version": "3.1.0" }'
+
+---
+"Store malformed":
+  - do:
+      indices.create:
+        index: test_malformed
+        body:
+          mappings:
+            properties:
+              version:
+                type: version
+
+  - do:
+      bulk:
+        refresh: true
+        body:
+          - '{ "index" : { "_index" : "test_malformed", "_id" : "1" } }'
+          - '{"version": "1.1.0" }'
+          - '{ "index" : { "_index" : "test_malformed", "_id" : "2" } }'
+          - '{"version": "2.0.0-beta" }'
+          - '{ "index" : { "_index" : "test_malformed", "_id" : "3" } }'
+          - '{"version": "v3.1.0" }'
+          - '{ "index" : { "_index" : "test_malformed", "_id" : "4" } }'
+          - '{"version": "1.el6" }'
+
+  - do:
+      search:
+        index: test_malformed
+        body:
+          query: { "match" : { "version" : "1.el6" } }
+
+  - do:
+      search:
+        index: test_malformed
+        body:
+          query: { "match_all" : { } }
+          sort:
+            version: asc
+
+  - match: { hits.total.value: 4 }
+  - match: { hits.hits.0._source.version: "1.1.0" }
+  - match: { hits.hits.1._source.version: "2.0.0-beta" }
+  - match: { hits.hits.2._source.version: "1.el6" }
+  - match: { hits.hits.3._source.version: "v3.1.0" }
+
+---
+"Basic ranges":
+  - do:
+      search:
+        index: test_index
+        body:
+          query: { "range" : { "version" : { "gt" : "1.1.0", "lt" : "9999" } } }
+
+  - match: { hits.total.value: 2 }
+
+  - do:
+      search:
+        index: test_index
+        body:
+          query: { "range" : { "version" : { "gte" : "1.1.0", "lt" : "9999" } } }
+
+  - match: { hits.total.value: 3 }
+
+  - do:
+      search:
+        index: test_index
+        body:
+          query: { "range" : { "version" : { "gte" : "2.0.0", "lt" : "9999" } } }
+
+  - match: { hits.total.value: 1 }
+
+  - do:
+      search:
+        index: test_index
+        body:
+          query: { "range" : { "version" : { "gte" : "2.0.0-alpha", "lt" : "9999" } } }
+
+  - match: { hits.total.value: 2 }
@@ -0,0 +1,54 @@
+# Integration tests for the version field
+#
+---
+setup:
+
+  - skip:
+      features: headers
+      version: " - 7.99.99"
+      reason: "version field is added to 8.0 first"
+
+  - do:
+      indices.create:
+        index: test_index
+        body:
+          mappings:
+            properties:
+              version:
+                type: version
+
+  - do:
+      bulk:
+        refresh: true
+        body:
+          - '{ "index" : { "_index" : "test_index", "_id" : "1" } }'
+          - '{"version": "1.1.12" }'
+          - '{ "index" : { "_index" : "test_index", "_id" : "2" } }'
+          - '{"version": "2.0.0-beta" }'
+          - '{ "index" : { "_index" : "test_index", "_id" : "3" } }'
+          - '{"version": "3.1.0" }'
+
+---
+"Filter script":
+  - do:
+      search:
+        index: test_index
+        body:
+          query: { "script" : { "script" : { "source": "doc['version'].value.length() > 5"} } }
+
+  - match: { hits.total.value: 2 }
+  - match: { hits.hits.0._source.version: "1.1.12" }
+  - match: { hits.hits.1._source.version: "2.0.0-beta" }
+
+---
+"Sort script":
+  - do:
+      search:
+        index: test_index
+        body:
+          sort: { "_script" : { "type" : "number", "script" : { "source": "doc['version'].value.length()" } } }
+
+  - match: { hits.total.value: 3 }
+  - match: { hits.hits.0._source.version: "3.1.0" }
+  - match: { hits.hits.1._source.version: "1.1.12" }
+  - match: { hits.hits.2._source.version: "2.0.0-beta" }
+20
@@ -0,0 +1,20 @@
+evaluationDependsOn(xpackModule('core'))
+
+apply plugin: 'elasticsearch.esplugin'
+apply plugin: 'elasticsearch.internal-cluster-test'
+
+esplugin {
+  name 'versionfield'
+  description 'A plugin for a field type to store sofware versions'
+  classname 'org.elasticsearch.xpack.versionfield.VersionFieldPlugin'
+  extendedPlugins = ['x-pack-core', 'lang-painless']
+}
+archivesBaseName = 'x-pack-versionfield'
+
+dependencies {
+  compileOnly project(path: xpackModule('core'), configuration: 'default')
+  compileOnly project(':modules:lang-painless:spi')
+  compileOnly project(':modules:lang-painless')
+  testImplementation project(path: xpackModule('core'), configuration: 'testArtifacts')
+  testImplementation project(path: xpackModule('analytics'), configuration: 'default')
+}
@@ -0,0 +1,76 @@
+/*
+ * Copyright Elasticsearch B.V. and/or licensed to Elasticsearch B.V. under one
+ * or more contributor license agreements. Licensed under the Elastic License;
+ * you may not use this file except in compliance with the Elastic License.
+ */
+
+package org.elasticsearch.xpack.versionfield;
+
+import org.elasticsearch.action.search.SearchResponse;
+import org.elasticsearch.common.xcontent.XContentFactory;
+import org.elasticsearch.plugins.Plugin;
+import org.elasticsearch.search.aggregations.AggregationBuilders;
+import org.elasticsearch.search.aggregations.bucket.terms.Terms;
+import org.elasticsearch.search.aggregations.bucket.terms.Terms.Bucket;
+import org.elasticsearch.test.ESIntegTestCase;
+import org.elasticsearch.xpack.core.LocalStateCompositeXPackPlugin;
+
+import java.util.Collection;
+import java.util.List;
+
+import static org.elasticsearch.common.xcontent.XContentFactory.jsonBuilder;
+
+public class VersionFieldIT extends ESIntegTestCase {
+
+    @Override
+    protected Collection<Class<? extends Plugin>> nodePlugins() {
+        return List.of(VersionFieldPlugin.class, LocalStateCompositeXPackPlugin.class);
+    }
+
+    public void testTermsAggregation() throws Exception {
+        String indexName = "test";
+        createIndex(indexName);
+
+        client().admin()
+            .indices()
+            .preparePutMapping(indexName)
+            .setSource(
+                XContentFactory.jsonBuilder()
+                    .startObject()
+                    .startObject("_doc")
+                    .startObject("properties")
+                    .startObject("version")
+                    .field("type", "version")
+                    .endObject()
+                    .endObject()
+                    .endObject()
+                    .endObject()
+            )
+            .get();
+        ensureGreen();
+
+        client().prepareIndex(indexName).setId("1").setSource(jsonBuilder().startObject().field("version", "1.0").endObject()).get();
+        client().prepareIndex(indexName).setId("2").setSource(jsonBuilder().startObject().field("version", "1.3.0").endObject()).get();
+        client().prepareIndex(indexName)
+            .setId("3")
+            .setSource(jsonBuilder().startObject().field("version", "2.1.0-alpha").endObject())
+            .get();
+        client().prepareIndex(indexName).setId("4").setSource(jsonBuilder().startObject().field("version", "2.1.0").endObject()).get();
+        client().prepareIndex(indexName).setId("5").setSource(jsonBuilder().startObject().field("version", "3.11.5").endObject()).get();
+        refresh();
+
+        // terms aggs
+        SearchResponse response = client().prepareSearch(indexName)
+            .addAggregation(AggregationBuilders.terms("myterms").field("version"))
+            .get();
+        Terms terms = response.getAggregations().get("myterms");
+        List<? extends Bucket> buckets = terms.getBuckets();
+
+        assertEquals(5, buckets.size());
+        assertEquals("1.0", buckets.get(0).getKey());
+        assertEquals("1.3.0", buckets.get(1).getKey());
+        assertEquals("2.1.0-alpha", buckets.get(2).getKey());
+        assertEquals("2.1.0", buckets.get(3).getKey());
+        assertEquals("3.11.5", buckets.get(4).getKey());
+    }
+}
