ingest date processor parsing #51108


Closed
fabide opened this issue Jan 16, 2020 · 5 comments · Fixed by #51215
Labels
>bug :Core/Infra/Core Core issues without another label :Data Management/Ingest Node Execution or management of Ingest Pipelines including GeoIP

Comments

fabide commented Jan 16, 2020

Elasticsearch version: 7.4.2

JVM version : openjdk version "1.8.0_232"

OS version : 4.9.0-11-amd64 #1 SMP Debian 4.9.189-3+deb9u2 (2019-11-11) x86_64 GNU/Linux

Description of the problem including expected versus actual behavior:
I am trying to parse a timestamp in ISO8601 format. I expect @timestamp to be 2020-01-15T14:28:09.452+01:00, but the result shows 2020-01-15T15:28:09.452+01:00.

Steps to reproduce:

curl -XPOST 'http://localhost:9200/_ingest/pipeline/_simulate' \
-H 'Content-Type: application/json' -d'{
  "pipeline": {
    "processors": [
      {
        "date": {
          "field": "timestamp",
          "timezone": "+0100",
          "formats": [ "ISO8601"]
        }
      }
    ]
  },
  "docs": [
    {
      "_source": {
        "timestamp": "2020-01-15T14:28:09,452"
      }
    }
  ]
}'

Result:

{
    "docs": [
        {
            "doc": {
                "_index": "_index",
                "_type": "_doc",
                "_id": "_id",
                "_source": {
                    "@timestamp": "2020-01-15T15:28:09.452+01:00",
                    "timestamp": "2020-01-15T14:28:09,452"
                },
                "_ingest": {
                    "timestamp": "2020-01-16T15:06:47.477281Z"
                }
            }
        }
    ]
}
@matriv matriv added the :Data Management/Ingest Node Execution or management of Ingest Pipelines including GeoIP label Jan 16, 2020
elasticmachine (Collaborator) commented

Pinging @elastic/es-core-features (:Core/Features/Ingest)

matriv (Contributor) commented Jan 16, 2020

I reproduced the behaviour in master.
To double check, I indexed an actual document using the pipeline, and the result is:

"hits": [
  {
      "_index": "myindex",
      "_id": "1",
      "_score": 1.0,
      "_source": {
          "@timestamp": "2020-01-15T15:28:09.452+01:00",
          "timestamp": "2020-01-15T14:28:09,452"
      }
  }
]

Using stored fields I also get:

"hits": [
  {
      "_index": "myindex",
      "_id": "1",
      "_score": 1.0,
      "fields": {
          "timestamp": [
              "2020-01-15T14:28:09.452Z"
          ]
      }
  }
]

To me this also looks like incorrect behaviour: defining the timezone shouldn't convert the value in the _source field. It should be saved as 2020-01-15T14:28:09.452+01:00, which in UTC is 2020-01-15T13:28:09.452Z.
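
The arithmetic in this comment can be checked with a minimal sketch. It uses Python's standard library, not Elasticsearch's own parsing code, and it assumes the comma decimal separator is normalized to a dot before parsing; all variable names are illustrative. The last value shows that the reported output matches what you get when the zone-less input is read as UTC and then shifted to +01:00. That is one reading consistent with the output, not a confirmed diagnosis.

```python
from datetime import datetime, timedelta, timezone

raw = "2020-01-15T14:28:09,452"      # input from the report, no zone information
tz = timezone(timedelta(hours=1))    # stands in for the processor's "timezone": "+0100"

# Assumption: normalize the comma decimal separator so %f can parse the millis.
local = datetime.strptime(raw.replace(",", "."), "%Y-%m-%dT%H:%M:%S.%f")

# Expected: attach the zone without shifting the wall-clock time.
expected = local.replace(tzinfo=tz)
expected_utc = expected.astimezone(timezone.utc)

# Observed: equivalent to treating the input as UTC, then converting to +01:00.
observed = local.replace(tzinfo=timezone.utc).astimezone(tz)

print(expected.isoformat(timespec="milliseconds"))      # 2020-01-15T14:28:09.452+01:00
print(expected_utc.isoformat(timespec="milliseconds"))  # 2020-01-15T13:28:09.452+00:00
print(observed.isoformat(timespec="milliseconds"))      # 2020-01-15T15:28:09.452+01:00
```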

matriv (Contributor) commented Jan 17, 2020

IMHO, as a user, if I have a date with a timezone, e.g. 2020-01-15T15:28:09.452+01:00, and a timezone parameter of -05:00, I would expect a conversion that gives 2020-01-15T09:28:09.452-05:00.
Similarly, for the date 2020-01-15T15:28:09.452Z and a timezone of +03:00, I would expect 2020-01-15T18:28:09.452+03:00.
But if the date has no timezone part, e.g. 2020-01-15T15:28:09.452, I would expect the timezone to simply be attached without conversion: 2020-01-15T15:28:09.452+03:00.
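
The three expectations above amount to one rule: convert when the input carries an offset, attach when it does not. A minimal sketch of that rule follows, again in plain Python rather than the ingest processor's implementation. `apply_timezone` and `fmt` are hypothetical helpers introduced here for illustration only.

```python
from datetime import datetime, timedelta, timezone


def apply_timezone(text: str, target: timezone) -> datetime:
    """Hypothetical helper: convert if the input has an offset, else attach `target`."""
    # datetime.fromisoformat rejects a trailing "Z" before Python 3.11, so map it to +00:00.
    parsed = datetime.fromisoformat(text.replace("Z", "+00:00"))
    if parsed.tzinfo is None:
        # No zone in the input: keep the wall-clock time and attach the target zone.
        return parsed.replace(tzinfo=target)
    # Zone present in the input: same instant, expressed in the target zone.
    return parsed.astimezone(target)


def fmt(dt: datetime) -> str:
    return dt.isoformat(timespec="milliseconds")


minus5 = timezone(timedelta(hours=-5))
plus3 = timezone(timedelta(hours=3))

print(fmt(apply_timezone("2020-01-15T15:28:09.452+01:00", minus5)))  # 2020-01-15T09:28:09.452-05:00
print(fmt(apply_timezone("2020-01-15T15:28:09.452Z", plus3)))        # 2020-01-15T18:28:09.452+03:00
print(fmt(apply_timezone("2020-01-15T15:28:09.452", plus3)))         # 2020-01-15T15:28:09.452+03:00
```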

pgomulka (Contributor) commented

From what I see, in 6.x it was working as you described, @matriv.
I will mark this as a bug and work on it.

@pgomulka pgomulka added :Core/Infra/Core Core issues without another label >bug labels Jan 17, 2020
elasticmachine (Collaborator) commented

Pinging @elastic/es-core-infra (:Core/Infra/Core)

@pgomulka pgomulka self-assigned this Jan 17, 2020
pgomulka added a commit that referenced this issue Feb 3, 2020
When a timezone is not provided in the input, ingest logic should consider the time to be in the timezone provided as a parameter.
When a timezone is provided in the input, ingest should recalculate the time to the timezone provided as a parameter.

closes #51108
pgomulka added a commit to pgomulka/elasticsearch that referenced this issue Feb 3, 2020

closes elastic#51108
pgomulka added a commit that referenced this issue Feb 3, 2020

closes #51108
backport(#51215)
4 participants