Skip to content

Provide more information about cause of deprecation log messages #26836

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Closed
cbuescher opened this issue Sep 29, 2017 · 11 comments
Closed

Provide more information about cause of deprecation log messages #26836

cbuescher opened this issue Sep 29, 2017 · 11 comments
Labels
:Core/Infra/Logging Log management and logging utilities >enhancement help wanted adoptme needs:triage Requires assignment of a team area label Team:Core/Infra Meta label for core/infra team

Comments

@cbuescher
Copy link
Member

cbuescher commented Sep 29, 2017

Currently, the deprecation log can be quite useful to spot features and queries that are deprecated and will be removed in a future major version, but especially with deprecated query syntax it can be hard to figure out the query what caused the warning to be triggered.
We also already add warning headers to the request, so clients will also be informed about the deprecation, but may choose to ignore this. For cluster administrators it can be hard to figure out where those queries come from.

For example, when you only see

[WARN ][o.e.d.c.ParseField       ] Deprecated field [type] used, replaced by [match_phrase and match_phrase_prefix query]

in the log, this neither tells you about the query type that has changed (this is from 5.x where the match query type parameter has been deprecated in favour of the match_phrase and match_phrase_prefix query. Here, improving the message would already help, but this still leaves the origin of the query unknown.

This relates to #21772, but its not only important to add replacement information but also potential cues for the origin of the offending query or request.

@cbuescher cbuescher added :Core/Infra/Logging Log management and logging utilities discuss >enhancement labels Sep 29, 2017
@markharwood
Copy link
Contributor

Currently the logging for deprecation happens "down low" in the call stack and is missing some of the context that would be required for this more detailed logging.
Rather than pushing added query context down into the call stack the suggestion is we move the actual logging up higher in the call stack "on the way out" where the code has access to more context (who the user is, what query was being executed etc). Deprecation failures are already added to the thread context and hence can be detected and logged at higher sections in the call stack.

As well as allowing admins to know which clients are using deprecated features that might break in the next software upgrade, it would be useful if admins could mark certain fields as deprecated in the mapping in order to understand which clients were making use of fields the administrator would be keen to remove in the next revision of an index. This is perhaps for another issue but would leverage the same "up high" logging of query/client context and their use of deprecated features.

@markharwood markharwood added help wanted adoptme and removed discuss labels Oct 6, 2017
@colings86
Copy link
Contributor

We should be careful about logging the request body here, for large requests (such as bulk) it could be very slow.

@MartinsDBA
Copy link

Will add also here what I said in Elastic support case:
Proposal would be to add addition config parameter to elasticsearch.yml file.
It could launch advanced logging to another file beside existing one.
But this time log not only existing message, but also query, or at least first X characters.
It could be enough to track down who is using deprecated features and queries

@pgomulka
Copy link
Contributor

pgomulka commented Sep 13, 2019

we added x-opaque-id to JSON deprecation logs, which are enabled by default.
This however still does not indicates where in code this deprecation happened.
So for instance if X-Opaque-Id is provided as a header in a request

curl --request PUT \
  --url 'http://localhost:9200/inde2x?include_type_name=true' \
  --header 'authorization: Basic ZWxhc3RpYzpwYXNzd29yZA==' \
  --header 'content-type: application/json' \
  --header 'x-opaque-id: myuseerid' \
  --data '{
}
'

then the deprecation.json log would contain

{
  "type": "deprecation",
  "timestamp": "2019-09-13T10:25:32,807+02:00",
  "level": "WARN",
  "component": "o.e.d.r.a.a.i.RestCreateIndexAction",
  "cluster.name": "distribution_run",
  "node.name": "node-0",
  "message": "[types removal] Using include_type_name in create index requests is deprecated. The parameter will be removed in the next major version.",
  "cluster.uuid": "SUwTW-ujQ2WioDjab92DBw",
  "node.id": "_rVKHYAtQdmzimgf7XdxAg",
  "x-opaque-id": "myuseerid"
}

WDYT?

@rjernst
Copy link
Member

rjernst commented Oct 10, 2019

@pgomulka Wouldn't xopaqueid cause each deprecation log to be unique, negating the optimization we have to emit deprecations once for a given call site? This is particularly important in tight loop calls to deprecations, eg internally within scripting.

Perhaps #46106 would better satisfy this problem as we could then store every unique deprecation, but have identifiers for each unique call site so the categories of deprecations can be found from the index.

@pgomulka
Copy link
Contributor

@rjernst yes - a key for deprecation logging caching is x-opaque-id+key . That means that multiple calls to the same deprecation logging code by different users will result with multiple log lines. This was done intentionally as otherwise there is no way to tell who triggered deprecation logs, and there would be little purpose in having x-opaque-id in deprecation logs (it would only tell who triggered the deprecation code first).
I might be missing the details on the problem with scripting and logging. Can you provide details? If no x-opaque-id is provided, it behaves same as previously (key in a cache is the same)

But I also agree that more information is needed in deprecation logs. Most notably query context or where the deprecation comes from in code (stacktrace?)
And also agree that indexing deprecation logs would be very useful and we should look into this.

In regards to identifiers per call site, this is something that would be useful in other areas in logging too. I spoke with @cachedout and it would help to for instance group logging when snapshotting so that it could help to measure how long this took. Markers would be handy for this, and for JSON logs they could be extracted to a separate field (they are part of the message now).

@rjernst
Copy link
Member

rjernst commented Oct 11, 2019

I might be missing the details on the problem with scripting and logging. Can you provide details?

If a method is called within a script that operates per document (eg script_score query), and that method is deprecated, we emit a deprecation warning. Currently that warning will be emited once, rather than per document/per request. With x-opaque-id, it turns it into a unique deprecation key per request, so the deprecation log can quickly be filled. The general idea of keeping track of the last N deprecations was to avoid a gigantic deprecation log.

However, I think #46106 I mentioned before will generally solve this issue, although we still need to be careful in that case about only making one deprecation log entry for the request, rather than per document.

@rjernst rjernst added the Team:Core/Infra Meta label for core/infra team label May 4, 2020
@rjernst rjernst added the needs:triage Requires assignment of a team area label label Dec 3, 2020
@pgomulka
Copy link
Contributor

pgomulka commented Dec 7, 2020

@cbuescher do you think this is still an issue given we have x-opaque-id in deprecation logs?

@pgomulka
Copy link
Contributor

@rjernst

If a method is called within a script that operates per document (eg script_score query), and that method is deprecated, we emit a deprecation warning.

Shouldn't the script_score query still have an information about the x-opaque-id of the user who triggered it? That x-opaque-id would be the same for all documents. I agree that deprecation log should not absolutely not be emitted per document - hopefully this can't happen at the moment, or am I missing something?

But I think it would be beneficial to see these deprecation per request - helps identifying which app needs to be upgraded.

@rjernst
Copy link
Member

rjernst commented Dec 18, 2020

That x-opaque-id would be the same for all documents

You are right. I'm not sure exactly what I was thinking from my previous comment, but it must have been thinking the x-opaque-id would be different per document.

deprecation log should not absolutely not be emitted per document - hopefully this can't happen at the moment

The key is what determines whether this happens. The deprecation logger keeps track of the key for the last N deprecations. So if we had a unique key per document, we would thrash that tracking, and emit every document. It shouldn't be happening here, though, as you pointed out.

@pgomulka
Copy link
Contributor

In that case I think we can safely close the issue.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
:Core/Infra/Logging Log management and logging utilities >enhancement help wanted adoptme needs:triage Requires assignment of a team area label Team:Core/Infra Meta label for core/infra team
Projects
None yet
Development

No branches or pull requests

6 participants