Fix UnknownRemoteOperation for Database instrumentation. #50

Merged
merged 36 commits into from
Feb 14, 2024

Conversation

zzhlogin
Contributor

@zzhlogin zzhlogin commented Feb 9, 2024

Description of changes:
When "db.operation" is not provided in the span attributes, "aws.remote.operation" is set to UnknownRemoteOperation.
This PR updates the _set_remote_service_and_operation function to parse the "db.statement" attribute when "db.operation" is missing and to extract a valid "aws.remote.operation" from it. For security reasons, only a fixed set of valid database-related "aws.remote.operation" values is allowed:

  1. Add a JSON file, configuration/dialect_keywords.json, containing all valid keywords. The order of keywords in this file matters: longer keywords are placed toward the front of the list so that they are matched first.
  2. Take only the first 27 characters of "db.statement", so that very large statements are not scanned in full, and match the keywords against that prefix with a regular expression. The expression matches only at the beginning of the string: if the start of the statement does not conform to it, the match fails.
  3. Add a unit test for _set_remote_service_and_operation performance, covering the cases where _DB_STATEMENT is or is not present and is or is not valid.
  4. Add unit tests covering different "db.statement" cases:
    a. Only one valid keyword matches.
    b. More than one valid keyword matches; the longest match should be picked.
    c. More than one valid keyword matches, but the other keywords are not at the start of SpanAttributes.DB_STATEMENT; only the match at the start should be picked.
    d. No valid match.
    e. A valid keyword is present, but not at the start of SpanAttributes.DB_STATEMENT.
    f. Valid keywords are present; match the longest word.
    g. Valid keywords are present; match the first word.
    h. Valid keywords are present; match with upper case.
  5. Add unit tests for the keyword order in the JSON file:
    a. Confirm the keywords are sorted in descending order of character length.
    b. Confirm the maximum keyword length is not longer than MAX_KEYWORD_LENGTH.
  6. Exclude configuration/dialect_keywords.json from the codespell check, because the keywords are fixed for the query and codespell does not recognize comments in JSON.
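
The matching approach in items 1, 2 and 5 can be sketched roughly as follows. This is a minimal illustration, not the code in this PR: the keyword list, the helper names extract_remote_operation and keywords_are_well_ordered, the trailing word-boundary check, and the assumption that the 27-character cut equals MAX_KEYWORD_LENGTH are all hypothetical.

```python
import re
from typing import List, Optional

# Assumption: the 27-character cut is the same value as MAX_KEYWORD_LENGTH.
MAX_KEYWORD_LENGTH = 27
UNKNOWN_REMOTE_OPERATION = "UnknownRemoteOperation"

# Hypothetical stand-in for the contents of configuration/dialect_keywords.json.
# Longer keywords come first so the regex alternation tries them first.
KEYWORDS: List[str] = ["CREATE TABLE", "SELECT", "INSERT", "CREATE", "DROP"]

# Anchored at the start of the string; the trailing \b (an assumption of this
# sketch) stops "SELECT" from matching inside a longer word such as "SELECTED".
_PATTERN = re.compile(
    r"^(?:" + "|".join(re.escape(k) for k in KEYWORDS) + r")\b",
    re.IGNORECASE,
)


def extract_remote_operation(db_statement: Optional[str]) -> str:
    """Return the leading keyword of a statement, or the unknown marker."""
    if not db_statement:
        return UNKNOWN_REMOTE_OPERATION
    # Only inspect a short prefix so very large statements are cheap to handle.
    prefix = db_statement[:MAX_KEYWORD_LENGTH]
    match = _PATTERN.match(prefix)
    if match is None:
        return UNKNOWN_REMOTE_OPERATION
    return match.group(0).upper()


def keywords_are_well_ordered(keywords: List[str]) -> bool:
    """Check descending length order and the maximum keyword length."""
    lengths = [len(k) for k in keywords]
    return (
        lengths == sorted(lengths, reverse=True)
        and max(lengths, default=0) <= MAX_KEYWORD_LENGTH
    )


if __name__ == "__main__":
    print(extract_remote_operation("create table t (id INT)"))
    print(extract_remote_operation("/* hint */ SELECT 1"))
    print(keywords_are_well_ordered(KEYWORDS))
```

Because Python's regex alternation takes the first alternative that matches, placing longer keywords first is what makes "CREATE TABLE" win over "CREATE"; that is the reason the order in the JSON file matters and is covered by its own test.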

Testing
Tested by deploying the code changes to EKS and nodes, where we can see the actual operation instead of UnknownRemoteOperation:

Screenshot 2024-02-13 at 6 21 34 PM

By submitting this pull request, I confirm that you can use, modify, copy, and redistribute this contribution, under the terms of your choice.

@zzhlogin zzhlogin requested a review from a team February 9, 2024 01:27
@zzhlogin zzhlogin merged commit 77ac67b into main Feb 14, 2024
@zzhlogin zzhlogin deleted the db_remote_operator branch February 14, 2024 23:57