Fix the timestamp field of a data stream to @timestamp #59076

Merged
merged 16 commits into from
Jul 8, 2020

Conversation

martijnvg
Member

@martijnvg martijnvg commented Jul 6, 2020

The commit makes the following changes:

  • The timestamp field of a data stream definition in a composable
    index template can only be set to @timestamp.
  • Removed the custom data stream timestamp field validation in favor of the validation in TimestampFieldMapper;
    the only remaining check is that the _timestamp field mapping has been defined on a backing index of a data stream.
  • Moved the code that injects the _timestamp meta field mapping from the MetadataCreateIndexService#applyCreateIndexRequestWithV2Template(...) method
    to the MetadataIndexTemplateService#collectMappings(...) method.
  • Fixed a bug (Component templates are considered invalid without timestamp_field #58956) that caused timestamp field validation to be performed
    for each template instead of for the final mappings that are created.

Relates to #58642
Relates to #53100
Closes #58956
Closes #58583
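
To illustrate the last bullet, here is a minimal sketch of why validating each template separately rejects valid combinations, while validating the merged result does not. The class, the method names, and the flat "properties" maps are hypothetical simplifications and are not taken from the Elasticsearch code base.

```java
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical stand-in for the template validation described above; not Elasticsearch code.
final class MergedMappingValidationSketch {

    static final String TIMESTAMP_FIELD = "@timestamp";

    // Merge the field definitions of every template in order; later templates win on conflicts.
    static Map<String, Object> mergeProperties(List<Map<String, Object>> templates) {
        Map<String, Object> merged = new HashMap<>();
        for (Map<String, Object> properties : templates) {
            merged.putAll(properties);
        }
        return merged;
    }

    // Validate once, against the final result, rather than once per template.
    static void validateFinalMappings(List<Map<String, Object>> templates) {
        Map<String, Object> merged = mergeProperties(templates);
        if (merged.containsKey(TIMESTAMP_FIELD) == false) {
            throw new IllegalArgumentException("final mappings do not define [" + TIMESTAMP_FIELD + "]");
        }
    }

    public static void main(String[] args) {
        Map<String, Object> messageOnly = Map.of("message", Map.of("type", "text"));
        Map<String, Object> withTimestamp = Map.of(TIMESTAMP_FIELD, Map.of("type", "date"));
        // The first template has no timestamp field on its own, yet the combination is valid.
        validateFinalMappings(List.of(messageOnly, withTimestamp));
        System.out.println("combined mappings are valid");
    }
}
```

A per-template check would have rejected the first template above, which is the behavior reported in #58956.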

@martijnvg
Member Author

@elasticmachine run elasticsearch-ci/2

@martijnvg martijnvg marked this pull request as ready for review July 6, 2020 15:27
@elasticmachine
Collaborator

Pinging @elastic/es-core-features (:Core/Features/Data streams)

@elasticmachine elasticmachine added the Team:Data Management Meta label for data/management team label Jul 6, 2020
Member

@dakrone dakrone left a comment

LGTM, thanks Martijn! I left some really minor comments

@@ -256,6 +257,10 @@ public String toString() {
private final String timestampField;

public DataStreamTemplate(String timestampField) {
if ("@timestamp".equals(timestampField) == false) {
Member

Can we make this a public static string so it can be referenced elsewhere?

* @return a mapping snippet for a backing index with `_timestamp` meta field mapper properly configured.
*/
public Map<String, Object> getDataSteamMappingSnippet() {
return Map.of("_doc", Map.of(TimestampFieldMapper.NAME, Map.of("path", timestampField)));
Member

Super minor:

Suggested change
- return Map.of("_doc", Map.of(TimestampFieldMapper.NAME, Map.of("path", timestampField)));
+ return Map.of(MapperService.SINGLE_MAPPING_NAME, Map.of(TimestampFieldMapper.NAME, Map.of("path", timestampField)));
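
For readers without the Elasticsearch sources at hand, the shape of the returned snippet can be sketched in isolation. The two constants below are local stand-ins for MapperService.SINGLE_MAPPING_NAME and TimestampFieldMapper.NAME; their values are assumed from the "_doc" literal and the `_timestamp` meta field named in this PR.

```java
import java.util.Map;

// Hypothetical, self-contained illustration of the mapping snippet; not the Elasticsearch class.
final class MappingSnippetSketch {

    // Assumed values; see the note above.
    static final String SINGLE_MAPPING_NAME = "_doc";
    static final String TIMESTAMP_MAPPER_NAME = "_timestamp";

    // Builds {"_doc": {"_timestamp": {"path": <timestampField>}}} as nested immutable maps.
    static Map<String, Object> dataStreamMappingSnippet(String timestampField) {
        return Map.of(SINGLE_MAPPING_NAME, Map.of(TIMESTAMP_MAPPER_NAME, Map.of("path", timestampField)));
    }

    public static void main(String[] args) {
        System.out.println(dataStreamMappingSnippet("@timestamp"));
    }
}
```

Using a named constant instead of the literal keeps the snippet in step with the mapping type name if that name is ever changed in one place.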


public TimestampField(String name) {
assert "@timestamp".equals(name) : "unexpected timestamp field [" + name + "]";
Member

I think we should transform this into an actual error; that way, if we end up using it for the HLRC, it doesn't fail to throw because assertions are disabled.

Member Author

I think HLRC uses its own DataStream pojo. I will change this from an assertion to an actual error.
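
The difference between the two approaches discussed in this thread can be sketched as follows. Java evaluates `assert` statements only when the JVM is started with assertions enabled (for example with `-ea`), whereas an explicit exception is always thrown. The class, method, and constant names are hypothetical and do not mirror the actual TimestampField class.

```java
// Hypothetical sketch contrasting an assertion with an explicit check; not Elasticsearch code.
final class TimestampFieldCheckSketch {

    static final String FIXED_TIMESTAMP_FIELD = "@timestamp";

    // Enforced only when the JVM runs with assertions enabled (-ea); otherwise the check is skipped.
    static String checkWithAssert(String name) {
        assert FIXED_TIMESTAMP_FIELD.equals(name) : "unexpected timestamp field [" + name + "]";
        return name;
    }

    // Always enforced, regardless of JVM flags.
    static String checkWithException(String name) {
        if (FIXED_TIMESTAMP_FIELD.equals(name) == false) {
            throw new IllegalArgumentException("unexpected timestamp field [" + name + "]");
        }
        return name;
    }

    public static void main(String[] args) {
        try {
            checkWithException("time");
        } catch (IllegalArgumentException e) {
            System.out.println("rejected: " + e.getMessage());
        }
    }
}
```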


@SuppressWarnings("unchecked")
private static final ConstructingObjectParser<TimestampField, Void> PARSER = new ConstructingObjectParser<>(
"timestamp_field",
- args -> new TimestampField((String) args[0], (Map<String, Object>) args[1])
+ args -> new TimestampField((String) args[0])
Member

This makes me wonder, this is fine as-is, but should we even bother having this class or specifying the name of the field if it's going to be hardcoded for the foreseeable future?

Member Author

I prefer to keep this class. I think that the timestamp field will not stay hard-coded for long.

@@ -902,6 +909,23 @@ public static String findV2Template(Metadata metadata, String indexName, boolean
Optional.ofNullable(template.template())
.map(Template::mappings)
.ifPresent(mappings::add);

// Only include _timestamp mapping snippet if creating backing index.
if (indexName.startsWith(DataStream.BACKING_INDEX_PREFIX)) {
Member Author

@dakrone I had to add this if statement; otherwise the meta field mapper would be applied on every create index API call. In a docs test, the new logs-- composable index template was triggered by a regular create index API call, and the test then failed because the document being indexed had no timestamp field.

I think applying the meta field automatically makes sense only for backing indices, and not when a user creates a new index via the create index API and a composable index template matches?

See commits: a23d2e4, which then was superseded by: c787332

Member Author

Note that currently in master, the _timestamp field is only applied if the create index request originates from the create data stream API or a data stream rollover (signalled via the CreateIndexClusterStateUpdateRequest#dataStreamName field). This logic has now moved to the MetadataIndexTemplateService.collectMappings(...) method, which doesn't have that context.
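
The prefix-based guard discussed in this thread can be sketched in isolation as follows. The method signature is a simplification, not the real collectMappings signature, and the value ".ds-" for DataStream.BACKING_INDEX_PREFIX is an assumption that should be checked against the source.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;

// Hypothetical, simplified version of the guard in collectMappings; not Elasticsearch code.
final class CollectMappingsSketch {

    // Assumed value of DataStream.BACKING_INDEX_PREFIX.
    static final String BACKING_INDEX_PREFIX = ".ds-";

    // Returns the template mappings, plus the _timestamp snippet only for backing indices.
    static List<Map<String, Object>> collectMappings(List<Map<String, Object>> templateMappings, String indexName) {
        List<Map<String, Object>> mappings = new ArrayList<>(templateMappings);
        if (indexName.startsWith(BACKING_INDEX_PREFIX)) {
            mappings.add(Map.of("_doc", Map.of("_timestamp", Map.of("path", "@timestamp"))));
        }
        return mappings;
    }

    public static void main(String[] args) {
        System.out.println(collectMappings(List.of(), ".ds-logs-000001").size()); // backing index
        System.out.println(collectMappings(List.of(), "logs-regular").size());    // regular index
    }
}
```

Because the guard relies only on the index name, it works without the request context that was available in MetadataCreateIndexService, at the cost of tying the behavior to a naming convention.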

Contributor

Would it be cleaner to pull this check out of collectMappings and into the caller? That way we could avoid the workaround here: https://github.com/elastic/elasticsearch/pull/59076/files#diff-115cd2f0fff9dd97acc95e1d29012a15R1075.

@martijnvg martijnvg merged commit cb6b05d into elastic:master Jul 8, 2020
martijnvg added a commit that referenced this pull request Jul 8, 2020
(a change got in after I merged in master into this pr, but didn't notice it until I merged this pr)
martijnvg added a commit to martijnvg/elasticsearch that referenced this pull request Jul 8, 2020
Backport of elastic#59076 to 7.x branch.

The commit makes the following changes:
* The timestamp field of a data stream definition in a composable
  index template can only be set to '@timestamp'.
* Removed the custom data stream timestamp field validation in favor of the validation in `TimestampFieldMapper`;
  the only remaining check is that the _timestamp field mapping has been defined on a backing index of a data stream.
* Moved the code that injects the _timestamp meta field mapping from the `MetadataCreateIndexService#applyCreateIndexRequestWithV2Template(...)` method
  to the `MetadataIndexTemplateService#collectMappings(...)` method.
* Fixed a bug (elastic#58956) that caused timestamp field validation to be performed
  for each template instead of for the final mappings that are created.
* Only apply the _timestamp meta field if the index is created as part of a data stream or a data stream rollover.
  This fixes a docs test in which a regular index creation matches (logs-*) a template with a data stream definition.

Relates to elastic#58642
Relates to elastic#53100
Closes elastic#58956
Closes elastic#58583
martijnvg added a commit to martijnvg/elasticsearch that referenced this pull request Jul 8, 2020
(a change got in after I merged in master into this pr, but didn't notice it until I merged this pr)
martijnvg added a commit that referenced this pull request Jul 8, 2020
Backport of #59076 to 7.x branch.

The commit makes the following changes:
* The timestamp field of a data stream definition in a composable
  index template can only be set to '@timestamp'.
* Removed the custom data stream timestamp field validation in favor of the validation in `TimestampFieldMapper`;
  the only remaining check is that the _timestamp field mapping has been defined on a backing index of a data stream.
* Moved the code that injects the _timestamp meta field mapping from the `MetadataCreateIndexService#applyCreateIndexRequestWithV2Template(...)` method
  to the `MetadataIndexTemplateService#collectMappings(...)` method.
* Fixed a bug (#58956) that caused timestamp field validation to be performed
  for each template instead of for the final mappings that are created.
* Only apply the _timestamp meta field if the index is created as part of a data stream or a data stream rollover.
  This fixes a docs test in which a regular index creation matches (logs-*) a template with a data stream definition.

Relates to #58642
Relates to #53100
Closes #58956
Closes #58583
martijnvg added a commit to martijnvg/elasticsearch that referenced this pull request Jul 8, 2020
Contributor

@jtibshirani jtibshirani left a comment

I left a few late comments!

/**
* @return a mapping snippet for a backing index with `_timestamp` meta field mapper properly configured.
*/
public Map<String, Object> getDataSteamMappingSnippet() {
Contributor

Noticed this typo as I was trying to pull up the method: DataSteam -> DataStream.

@@ -196,6 +196,10 @@ public void validate(DocumentFieldMappers lookup) {
}
}

public String getPath() {
Contributor

Does TimestampFieldMapper need a path parameter now? We're not sure how we will eventually implement custom timestamp names, I can see an implementation where we don't use path here at all. So maybe we could simplify and just have an enabled boolean flag instead.

Depending on the implementation, we can always add path later and just default to @timestamp if it's not specified.
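
One way to picture this proposal is the sketch below: the mapper configuration carries an `enabled` flag, and `path` is optional and falls back to @timestamp. The option names come from the comment above; the class and its parsing logic are hypothetical and do not describe what TimestampFieldMapper actually does.

```java
import java.util.Map;

// Hypothetical configuration holder for the proposal above; not the real TimestampFieldMapper.
final class TimestampMapperConfigSketch {

    static final String DEFAULT_PATH = "@timestamp";

    final boolean enabled;
    final String path;

    private TimestampMapperConfigSketch(boolean enabled, String path) {
        this.enabled = enabled;
        this.path = path;
    }

    // "enabled" must be explicitly true; a missing "path" defaults to @timestamp.
    static TimestampMapperConfigSketch parse(Map<String, Object> config) {
        boolean enabled = Boolean.TRUE.equals(config.get("enabled"));
        Object path = config.get("path");
        return new TimestampMapperConfigSketch(enabled, path == null ? DEFAULT_PATH : path.toString());
    }

    public static void main(String[] args) {
        TimestampMapperConfigSketch config = parse(Map.of("enabled", true));
        System.out.println(config.enabled + " " + config.path);
    }
}
```

With this shape, a `path` option could be introduced later without changing existing mappings that specify only `enabled`.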

} else {
mappingPath = "properties." + fieldPath.replace(".", ".properties.");
if (timestampFieldName.equals(fieldMapper.getPath()) == false) {
throw new IllegalArgumentException("[_timestamp] meta field doesn't point to data stream timestamp field [" +
Contributor

I think this error will be hard for end users to understand, since they shouldn't be aware of the _timestamp field. Maybe we won't even need this check, once we remove the ability to set timestamp_field, as discussed in #59317 (comment)?

Member Author

This is more of a sanity check. Kind of ensuring that we did apply the _data_stream_timestamp meta field mapper. If a user would run into this error then this would be a bug. Can tweak this check.

martijnvg added a commit to martijnvg/elasticsearch that referenced this pull request Jul 14, 2020
…led' option and

adjusted exception messages.

Relates to elastic#59076
martijnvg added a commit that referenced this pull request Jul 16, 2020
…led' option (#59503)

and adjusted exception messages.

Relates to #59076
martijnvg added a commit to martijnvg/elasticsearch that referenced this pull request Jul 16, 2020
…led' option

Backport elastic#59503 to 7.x

and adjusted exception messages.

Relates to elastic#59076
martijnvg added a commit to martijnvg/elasticsearch that referenced this pull request Jul 16, 2020
…led' option

Backport elastic#59727 to 7.x

and adjusted exception messages.

Relates to elastic#59076
martijnvg added a commit that referenced this pull request Jul 16, 2020
…led' option (#59727)

Backport #59503 to 7.x

and adjusted exception messages.

Relates to #59076
martijnvg added a commit that referenced this pull request Jul 16, 2020
…led' option (#59728)

Backport #59727 to 7.x

and adjusted exception messages.

Relates to #59076
Labels
:Data Management/Data streams Data streams and their lifecycles >non-issue Team:Data Management Meta label for data/management team v7.9.0 v8.0.0-alpha1
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Component templates are considered invalid without timestamp_field Make timestamp field mapping part of data stream definition
5 participants