Provide more information about cause of deprecation log messages #26836

cbuescher · 2017-09-29T13:55:13Z

Currently, the deprecation log can be quite useful to spot features and queries that are deprecated and will be removed in a future major version, but especially with deprecated query syntax it can be hard to figure out the query what caused the warning to be triggered.
We also already add warning headers to the request, so clients will also be informed about the deprecation, but may choose to ignore this. For cluster administrators it can be hard to figure out where those queries come from.

For example, when you only see

[WARN ][o.e.d.c.ParseField       ] Deprecated field [type] used, replaced by [match_phrase and match_phrase_prefix query]

in the log, this neither tells you about the query type that has changed (this is from 5.x where the match query type parameter has been deprecated in favour of the match_phrase and match_phrase_prefix query. Here, improving the message would already help, but this still leaves the origin of the query unknown.

This relates to #21772, but its not only important to add replacement information but also potential cues for the origin of the offending query or request.

The text was updated successfully, but these errors were encountered:

markharwood · 2017-10-06T14:00:14Z

Currently the logging for deprecation happens "down low" in the call stack and is missing some of the context that would be required for this more detailed logging.
Rather than pushing added query context down into the call stack the suggestion is we move the actual logging up higher in the call stack "on the way out" where the code has access to more context (who the user is, what query was being executed etc). Deprecation failures are already added to the thread context and hence can be detected and logged at higher sections in the call stack.

As well as allowing admins to know which clients are using deprecated features that might break in the next software upgrade, it would be useful if admins could mark certain fields as deprecated in the mapping in order to understand which clients were making use of fields the administrator would be keen to remove in the next revision of an index. This is perhaps for another issue but would leverage the same "up high" logging of query/client context and their use of deprecated features.

colings86 · 2017-10-06T15:04:07Z

We should be careful about logging the request body here, for large requests (such as bulk) it could be very slow.

MartinsDBA · 2017-10-11T10:47:15Z

Will add also here what I said in Elastic support case:
Proposal would be to add addition config parameter to elasticsearch.yml file.
It could launch advanced logging to another file beside existing one.
But this time log not only existing message, but also query, or at least first X characters.
It could be enough to track down who is using deprecated features and queries

pgomulka · 2019-09-13T08:27:08Z

we added x-opaque-id to JSON deprecation logs, which are enabled by default.
This however still does not indicates where in code this deprecation happened.
So for instance if X-Opaque-Id is provided as a header in a request

curl --request PUT \
  --url 'http://localhost:9200/inde2x?include_type_name=true' \
  --header 'authorization: Basic ZWxhc3RpYzpwYXNzd29yZA==' \
  --header 'content-type: application/json' \
  --header 'x-opaque-id: myuseerid' \
  --data '{
}
'

then the deprecation.json log would contain

{
  "type": "deprecation",
  "timestamp": "2019-09-13T10:25:32,807+02:00",
  "level": "WARN",
  "component": "o.e.d.r.a.a.i.RestCreateIndexAction",
  "cluster.name": "distribution_run",
  "node.name": "node-0",
  "message": "[types removal] Using include_type_name in create index requests is deprecated. The parameter will be removed in the next major version.",
  "cluster.uuid": "SUwTW-ujQ2WioDjab92DBw",
  "node.id": "_rVKHYAtQdmzimgf7XdxAg",
  "x-opaque-id": "myuseerid"
}

WDYT?

rjernst · 2019-10-10T21:08:11Z

@pgomulka Wouldn't xopaqueid cause each deprecation log to be unique, negating the optimization we have to emit deprecations once for a given call site? This is particularly important in tight loop calls to deprecations, eg internally within scripting.

Perhaps #46106 would better satisfy this problem as we could then store every unique deprecation, but have identifiers for each unique call site so the categories of deprecations can be found from the index.

pgomulka · 2019-10-11T07:53:16Z

@rjernst yes - a key for deprecation logging caching is x-opaque-id+key . That means that multiple calls to the same deprecation logging code by different users will result with multiple log lines. This was done intentionally as otherwise there is no way to tell who triggered deprecation logs, and there would be little purpose in having x-opaque-id in deprecation logs (it would only tell who triggered the deprecation code first).
I might be missing the details on the problem with scripting and logging. Can you provide details? If no x-opaque-id is provided, it behaves same as previously (key in a cache is the same)

But I also agree that more information is needed in deprecation logs. Most notably query context or where the deprecation comes from in code (stacktrace?)
And also agree that indexing deprecation logs would be very useful and we should look into this.

In regards to identifiers per call site, this is something that would be useful in other areas in logging too. I spoke with @cachedout and it would help to for instance group logging when snapshotting so that it could help to measure how long this took. Markers would be handy for this, and for JSON logs they could be extracted to a separate field (they are part of the message now).

rjernst · 2019-10-11T23:37:20Z

I might be missing the details on the problem with scripting and logging. Can you provide details?

If a method is called within a script that operates per document (eg script_score query), and that method is deprecated, we emit a deprecation warning. Currently that warning will be emited once, rather than per document/per request. With x-opaque-id, it turns it into a unique deprecation key per request, so the deprecation log can quickly be filled. The general idea of keeping track of the last N deprecations was to avoid a gigantic deprecation log.

However, I think #46106 I mentioned before will generally solve this issue, although we still need to be careful in that case about only making one deprecation log entry for the request, rather than per document.

pgomulka · 2020-12-07T14:11:12Z

@cbuescher do you think this is still an issue given we have x-opaque-id in deprecation logs?

pgomulka · 2020-12-18T12:45:01Z

@rjernst

If a method is called within a script that operates per document (eg script_score query), and that method is deprecated, we emit a deprecation warning.

Shouldn't the script_score query still have an information about the x-opaque-id of the user who triggered it? That x-opaque-id would be the same for all documents. I agree that deprecation log should not absolutely not be emitted per document - hopefully this can't happen at the moment, or am I missing something?

But I think it would be beneficial to see these deprecation per request - helps identifying which app needs to be upgraded.

rjernst · 2020-12-18T18:45:31Z

That x-opaque-id would be the same for all documents

You are right. I'm not sure exactly what I was thinking from my previous comment, but it must have been thinking the x-opaque-id would be different per document.

deprecation log should not absolutely not be emitted per document - hopefully this can't happen at the moment

The key is what determines whether this happens. The deprecation logger keeps track of the key for the last N deprecations. So if we had a unique key per document, we would thrash that tracking, and emit every document. It shouldn't be happening here, though, as you pointed out.

pgomulka · 2020-12-28T11:35:30Z

In that case I think we can safely close the issue.

cbuescher added :Core/Infra/Logging Log management and logging utilities discuss >enhancement labels Sep 29, 2017

markharwood added help wanted adoptme and removed discuss labels Oct 6, 2017

jaymode mentioned this issue Dec 6, 2019

Write deprecation logs to an index #46106

Closed

rjernst added the Team:Core/Infra Meta label for core/infra team label May 4, 2020

rjernst added the needs:triage Requires assignment of a team area label label Dec 3, 2020

pgomulka closed this as completed Dec 28, 2020

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Provide more information about cause of deprecation log messages #26836

Provide more information about cause of deprecation log messages #26836

cbuescher commented Sep 29, 2017 •

edited by javanna

Loading

markharwood commented Oct 6, 2017

colings86 commented Oct 6, 2017

MartinsDBA commented Oct 11, 2017

pgomulka commented Sep 13, 2019 •

edited

Loading

rjernst commented Oct 10, 2019

pgomulka commented Oct 11, 2019

rjernst commented Oct 11, 2019

pgomulka commented Dec 7, 2020

pgomulka commented Dec 18, 2020

rjernst commented Dec 18, 2020

pgomulka commented Dec 28, 2020

Provide more information about cause of deprecation log messages #26836

Provide more information about cause of deprecation log messages #26836

Comments

cbuescher commented Sep 29, 2017 • edited by javanna Loading

markharwood commented Oct 6, 2017

colings86 commented Oct 6, 2017

MartinsDBA commented Oct 11, 2017

pgomulka commented Sep 13, 2019 • edited Loading

rjernst commented Oct 10, 2019

pgomulka commented Oct 11, 2019

rjernst commented Oct 11, 2019

pgomulka commented Dec 7, 2020

pgomulka commented Dec 18, 2020

rjernst commented Dec 18, 2020

pgomulka commented Dec 28, 2020

cbuescher commented Sep 29, 2017 •

edited by javanna

Loading

pgomulka commented Sep 13, 2019 •

edited

Loading