-
-
Notifications
You must be signed in to change notification settings - Fork 4.3k
ref(slack): Add APM to Slack and use use ApiClient #18346
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
31aca4a
to
43b3fc4
Compare
5ec1159
to
e8d8ffa
Compare
self.logger.info( | ||
"rule.fail.slack_post", | ||
extra={ | ||
"error": response.get("error"), | ||
"error": six.text_type(e), | ||
"project_id": event.project_id, | ||
"event_id": event.event_id, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@scefali still going to add the channel in here
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@MeredithAnya Well, we want to add the channel on SlackClient
as logging_context
. This log message is going to be somewhat redundant since we'll already log the error at the client level.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
looking at it more, adding the channel to the logging_context doesn't seem as helpful to me
self.logger.info(u"%s.http_response" % (self.integration_type), extra=extra)
If an alert rule for slack is failing, I rather look up the logs using rule.fail.slack_post
instead of slack.http_response
and having to figure out which logs have to do with the alert rule.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@MeredithAnya could we do it for both? Sometimes we might look at the http_response
I think.
datadog_prefix = "integrations.slack" | ||
|
||
def __init__(self): | ||
super(SlackClient, self).__init__() |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Do you need this if its not doing anything?
# TODO(meredith): Slack actually supports json now for the chat.postMessage so we | ||
# can update that so we don't have to pass json=False here | ||
response = self._request(method, path, headers=headers, data=data, params=params, json=json) | ||
if not response.json.get("ok"): |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Will response.json work even when you get an HTML page back from slack?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
hmm no, but I think the _request
would raise the ApiError
in that case
logger.error("slack.event.unfurl-error", extra={"response": response}) | ||
client = SlackClient() | ||
try: | ||
client.post("/chat.unfurl", data=payload) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@MeredithAnya Are we sure that Slack will never send us anything but a 200
status code? Wondering if we could get an ApiError
from a 4xx
or 500
error.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I mean I haven't tested every scenario but I'm sure they send responses other than 200's. So yeah I think the ApiError
could be from a variety of things, is the problem here that we should be re-raising?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@MeredithAnya Yea, we should re-raise it. I noticed that we eat the error in the PD integration but I also know that when an API error happens, it will cause a Sentry error lol: https://sentry.io/organizations/sentry/issues/1551869421/?referrer=jira_integration
I am definitely not a fan that in many places where we get an error, we just log it and don't re-raise it. I realize that is how it was originally and this PR shouldn't change it (except in |
resp = client.get("/team.info", params=payload) | ||
except ApiError as e: | ||
logger.error("slack.team-info.response-error", extra={"error": six.text_type(e)}) | ||
raise IntegrationError("Could not retrieve Slack team information.") |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@MeredithAnya Is there a reason you don't want to just re-raise the original exception?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
So we can catch it here:
except IntegrationError as e: |
c2cafed
to
4eb80ab
Compare
Context:
Slack works differently than other integrations in terms of letting us know when something went wrong. In many cases, we still get a
200
level status code back, but it's on us to check the response paramok
to see whether that isTrue
orFalse
. Because of this we haven't use a shared ApiClient. Because of that, APM for sentry was not added to the Slack integration in this PR #17792. The current PR covers:ApiClient
for the Slack integrationtrack_response_data
to make sure we can record whenok: False
happens and the error along with itApiClient for Slack:
Normally in the
_request
method for the shared integrations client, we raise an ApiError. That won't happen in the shared client when we get 200s, so instead I've raised in within the Slack client'srequest
method if the response includesok: False
.We've been sending the response as extra data when capture the errors in sentry but I've changed this to just report the error message since it's not much different:
before
after
APM for Slack:
I've added some extra tags along with the other data:
ok
: either True or Falseslack_error
: ifok
is False, then we can see what the error was.todos: