Skip to content

Fix ingest timezone parsing #63876

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged
merged 10 commits into from
Oct 26, 2021
Merged

Conversation

pgomulka
Copy link
Contributor

@pgomulka pgomulka commented Oct 19, 2020

this commit aligns timezone parsing logic with DateFormat.Iso8601. Timezone can be provided on a pattern and as ingest parameter. When no timezone is provided on a pattern it is defaulted to UTC or an ingest parameter (if provided).

When timezone is not present in a pattern, but an ingest timezone parameter is provided - a date should be parsed as if it was in that timezone.
Example: pattern "uuuu-MM-dd'T'HH:mm", text "2020-01-01T01:00", timezone: -01:00 should return 2020-01-01T01:00:00-01:00
If the pattern has a timezone and a timezone parameter is provided - a date should parsed with a timezone from input, and then "recalculated" to an ingest parameter
Example "uuuu-MM-dd'T'HH:mm XXX", text "2020-01-01T01:00 -02:00", timezone -01:00 should return 2020-01-01T02:00:00-01:00
relates #38407
relates #51215
closes #63458

When timezone paramter is specified it should recalculate the timezone
provided in a date format
@pgomulka pgomulka added WIP :Core/Infra/Core Core issues without another label :Data Management/Ingest Node Execution or management of Ingest Pipelines including GeoIP v8.0.0 v7.11.0 labels Oct 19, 2020
@pgomulka pgomulka self-assigned this Oct 19, 2020
@pgomulka
Copy link
Contributor Author

@elasticmachine update branch

@pgomulka pgomulka changed the title DRAFT Fix ingest timezone parsing Fix ingest timezone parsing Nov 24, 2020
@pgomulka pgomulka removed the WIP label Nov 24, 2020
@pgomulka pgomulka marked this pull request as ready for review November 24, 2020 11:22
@elasticmachine elasticmachine added Team:Data Management Meta label for data/management team Team:Core/Infra Meta label for core/infra team labels Nov 24, 2020
@elasticmachine
Copy link
Collaborator

Pinging @elastic/es-core-features (Team:Core/Features)

@elasticmachine
Copy link
Collaborator

Pinging @elastic/es-core-infra (Team:Core/Infra)

@pgomulka pgomulka requested a review from rjernst November 24, 2020 11:31
Copy link
Member

@rjernst rjernst left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This looks fine, but it seems there is an unrelated change in here?

final DateFormatter formatter = dateFormatter;
return text -> {
TemporalAccessor accessor = formatter.parse(text);
// if there is no year nor year-of-era, we fall back to the current one and
// fill the rest of the date up with the parsed date
if (accessor.isSupported(ChronoField.YEAR) == false
&& accessor.isSupported(ChronoField.YEAR_OF_ERA) == false
&& accessor.isSupported(WeekFields.of(locale).weekOfWeekBasedYear()) == false) {
&& accessor.isSupported(WeekFields.of(locale).weekBasedYear()) == false) {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is this change unrelated to the timezone issue?

Copy link
Contributor Author

@pgomulka pgomulka Dec 1, 2020

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

not related to the timezone issue, but also mentioned by a user #63458 (comment)

I realised I have not added a test case for it

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think it would be better to isolate such a separate fix, so that the commit message is associated with what is being fixed, rather than a seemingly unrelated change.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

this is a good idea, better to isolate this change.
A separate PR #65717

@pgomulka
Copy link
Contributor Author

pgomulka commented Dec 1, 2020

@elasticmachine update branch

@pgomulka pgomulka requested a review from rjernst December 29, 2020 08:29
Copy link
Member

@rjernst rjernst left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM

@pgomulka
Copy link
Contributor Author

@elasticmachine update branch

@pgomulka pgomulka added the >bug label Oct 26, 2021
@pgomulka pgomulka merged commit 226f116 into elastic:master Oct 26, 2021
pgomulka added a commit to pgomulka/elasticsearch that referenced this pull request Oct 26, 2021
this commit aligns timezone parsing logic with DateFormat.Iso8601. Timezone can be provided on a pattern and as ingest parameter. When no timezone is provided on a pattern it is defaulted to UTC or an ingest parameter (if provided).

When timezone is not present in a pattern, but an ingest timezone parameter is provided - a date should be parsed as if it was in that timezone.
Example: pattern "uuuu-MM-dd'T'HH:mm", text "2020-01-01T01:00", timezone: -01:00 should return 2020-01-01T01:00:00-01:00
If the pattern has a timezone and a timezone parameter is provided - a date should parsed with a timezone from input, and then "recalculated" to an ingest parameter
Example "uuuu-MM-dd'T'HH:mm XXX", text "2020-01-01T01:00 -02:00", timezone -01:00 should return 2020-01-01T02:00:00-01:00
relates elastic#38407
relates elastic#51215
closes elastic#63458
pgomulka added a commit that referenced this pull request Oct 27, 2021
this commit aligns timezone parsing logic with DateFormat.Iso8601. Timezone can be provided on a pattern and as ingest parameter. When no timezone is provided on a pattern it is defaulted to UTC or an ingest parameter (if provided).

When timezone is not present in a pattern, but an ingest timezone parameter is provided - a date should be parsed as if it was in that timezone.
Example: pattern "uuuu-MM-dd'T'HH:mm", text "2020-01-01T01:00", timezone: -01:00 should return 2020-01-01T01:00:00-01:00
If the pattern has a timezone and a timezone parameter is provided - a date should parsed with a timezone from input, and then "recalculated" to an ingest parameter
Example "uuuu-MM-dd'T'HH:mm XXX", text "2020-01-01T01:00 -02:00", timezone -01:00 should return 2020-01-01T02:00:00-01:00
relates #38407
relates #51215
closes #63458

backports #63876
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
>bug :Core/Infra/Core Core issues without another label :Data Management/Ingest Node Execution or management of Ingest Pipelines including GeoIP Team:Core/Infra Meta label for core/infra team Team:Data Management Meta label for data/management team v7.16.0 v8.0.0-beta1
Projects
None yet
Development

Successfully merging this pull request may close these issues.

[Ingest Pipeline] Date processor ignores timezone offset XXX
7 participants