ref(slack): Add APM to Slack and use ApiClient #18346

Merged: 8 commits, Apr 24, 2020

Conversation

@MeredithAnya MeredithAnya commented Apr 17, 2020

Context:

Slack differs from other integrations in how it tells us that something went wrong. In many cases we still get a 200-level status code back, and it's on us to check the response param ok to see whether it is True or False. Because of this, we haven't used the shared ApiClient, and as a result APM for Sentry was not added to the Slack integration in PR #17792. The current PR covers:

  • Using the ApiClient for the Slack integration
  • Modifying track_response_data so we can record when ok: False happens, along with the error

ApiClient for Slack:

Normally, in the _request method for the shared integrations client, we raise an ApiError. That won't happen in the shared client when we get 200s, so instead I raise it within the Slack client's request method if the response includes ok: False.
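
For illustration, a minimal sketch of that check. FakeResponse, FakeBaseClient, SlackClientSketch and this ApiError are hypothetical stand-ins, not the actual Sentry classes; only the ok and error fields come from the description above.

```python
class ApiError(Exception):
    """Stand-in for the shared integrations ApiError (hypothetical)."""


class FakeResponse:
    """Hypothetical response wrapper exposing parsed JSON and a status code."""

    def __init__(self, json, status_code=200):
        self.json = json
        self.status_code = status_code


class FakeBaseClient:
    """Hypothetical shared client: raises only on non-2xx status codes."""

    def __init__(self, transport):
        # transport is an assumed callable(method, path) -> FakeResponse
        self._transport = transport

    def _request(self, method, path):
        response = self._transport(method, path)
        if not 200 <= response.status_code < 300:
            raise ApiError("HTTP %d" % response.status_code)
        return response


class SlackClientSketch(FakeBaseClient):
    def request(self, method, path):
        response = self._request(method, path)
        # Slack can answer 200 and still report a failure in the body,
        # so the ok flag has to be checked explicitly.
        if not response.json.get("ok"):
            raise ApiError(response.json.get("error", "unknown_error"))
        return response
```

With this shape, callers handle a 200 response carrying ok: False the same way they handle a 4xx or 5xx from any other integration.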

We've been sending the response as extra data when capturing the errors in Sentry, but I've changed this to report just the error message, since it's not much different:

before: [screenshot]

after: [screenshot]


APM for Slack:
I've added some extra tags along with the other data:

  • ok: either True or False
  • slack_error: if ok is False, then we can see what the error was.
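
A rough sketch of how those two tags could be derived from a response body. build_span_tags and the http.status_code tag name are assumptions made for this example; the real change lives in track_response_data, whose signature is not shown in this thread.

```python
def build_span_tags(status_code, body):
    """Return APM tags for a Slack response (hypothetical helper)."""
    ok = bool(body.get("ok"))
    tags = {"http.status_code": status_code, "ok": ok}
    if not ok:
        # Only failed calls carry a Slack error string worth tagging.
        tags["slack_error"] = body.get("error")
    return tags
```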

[screenshot]

todos:

  • update link identity to use slack client
  • update event_action to use slack client
  • update getting team.info to use slack client

  self.logger.info(
      "rule.fail.slack_post",
      extra={
-         "error": response.get("error"),
+         "error": six.text_type(e),
          "project_id": event.project_id,
          "event_id": event.event_id,

Member Author

@scefali still going to add the channel in here

Contributor

@MeredithAnya Well, we want to add the channel on SlackClient as logging_context. This log message is going to be somewhat redundant since we'll already log the error at the client level.

Member Author

looking at it more, adding the channel to the logging_context doesn't seem as helpful to me

self.logger.info(u"%s.http_response" % (self.integration_type), extra=extra)

If an alert rule for Slack is failing, I'd rather look up the logs using rule.fail.slack_post than slack.http_response, where I'd have to figure out which logs have to do with the alert rule.

Contributor

@MeredithAnya could we do it for both? Sometimes we might look at the http_response I think.
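
A sketch of what logging at both levels might look like, using only the standard logging module. The logger name, the function names, and the channel key are assumptions for this example; only the two event names come from the thread.

```python
import logging

# Hypothetical logger name, chosen for this example only.
logger = logging.getLogger("sentry.integrations.slack.sketch")


def log_http_response(logging_context, status_code):
    """Client-level log: every Slack response, enriched with shared context."""
    extra = {"status_code": status_code}
    extra.update(logging_context)
    logger.info("slack.http_response", extra=extra)


def log_rule_failure(error, project_id, event_id, channel):
    """Rule-level log: only alert-rule failures, searchable by event name."""
    logger.info(
        "rule.fail.slack_post",
        extra={
            "error": error,
            "project_id": project_id,
            "event_id": event_id,
            "channel": channel,
        },
    )
```

The client-level entry answers "what did Slack return?", while the rule-level entry answers "which alert rule failed?", so keeping both costs one extra log line per failure.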

    datadog_prefix = "integrations.slack"

    def __init__(self):
        super(SlackClient, self).__init__()

Member

Do you need this if it's not doing anything?

# TODO(meredith): Slack actually supports json now for the chat.postMessage so we
# can update that so we don't have to pass json=False here
response = self._request(method, path, headers=headers, data=data, params=params, json=json)
if not response.json.get("ok"):

Member

Will response.json work even when you get an HTML page back from slack?

Member Author

hmm no, but I think the _request would raise the ApiError in that case
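
Whether _request actually raises for an HTML body is not confirmed in this thread. As a sketch of one defensive option, the parse step can be made explicit so that a non-JSON body surfaces as the same error type. parse_slack_body and this ApiError are hypothetical names for this example.

```python
import json


class ApiError(Exception):
    """Stand-in for the shared ApiError (hypothetical)."""


def parse_slack_body(text):
    """Parse a response body, failing loudly if it is not a JSON object."""
    try:
        body = json.loads(text)
    except ValueError:
        # json.JSONDecodeError subclasses ValueError, so an HTML error
        # page ends up here instead of reaching the ok check.
        raise ApiError("non-JSON response from Slack")
    if not isinstance(body, dict):
        raise ApiError("unexpected JSON payload")
    if not body.get("ok"):
        raise ApiError(body.get("error", "unknown_error"))
    return body
```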

logger.error("slack.event.unfurl-error", extra={"response": response})
client = SlackClient()
try:
    client.post("/chat.unfurl", data=payload)

Contributor

@MeredithAnya Are we sure that Slack will never send us anything but a 200 status code? Wondering if we could get an ApiError from a 4xx or 500 error.

Member Author

I mean, I haven't tested every scenario, but I'm sure they send responses other than 200s. So yeah, I think the ApiError could come from a variety of things. Is the problem here that we should be re-raising?

Contributor

@MeredithAnya Yea, we should re-raise it. I noticed that we eat the error in the PD integration but I also know that when an API error happens, it will cause a Sentry error lol: https://sentry.io/organizations/sentry/issues/1551869421/?referrer=jira_integration

scefali commented Apr 23, 2020

I am definitely not a fan of how, in many places where we get an error, we just log it and don't re-raise it. I realize that is how it was originally and this PR shouldn't change it (except in get_team_info, as I know that breaks). Maybe in a future PR we can fix it for good.
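
For reference, a minimal sketch of the log-and-re-raise pattern being asked for. post_unfurl, the post callable, and this ApiError are hypothetical stand-ins; the event name is the one used in the diff above.

```python
import logging

logger = logging.getLogger("slack.sketch")  # hypothetical logger name


class ApiError(Exception):
    """Stand-in for the shared ApiError (hypothetical)."""


def post_unfurl(post, payload):
    """Call Slack via the given post callable; log failures, then re-raise."""
    try:
        return post("/chat.unfurl", data=payload)
    except ApiError as e:
        logger.error("slack.event.unfurl-error", extra={"error": str(e)})
        # A bare raise propagates the original exception and traceback
        # instead of silently swallowing the failure.
        raise
```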

    resp = client.get("/team.info", params=payload)
except ApiError as e:
    logger.error("slack.team-info.response-error", extra={"error": six.text_type(e)})
    raise IntegrationError("Could not retrieve Slack team information.")

Contributor

@MeredithAnya Is there a reason you don't want to just re-raise the original exception?

Member Author

So we can catch it here:

except IntegrationError as e:
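
A small sketch of why the translation helps: the low-level ApiError is converted once, and the calling code only needs to know about IntegrationError. The function names, the fetch callable, and both exception classes here are hypothetical stand-ins. The raise ... from e form is Python 3 syntax and is used here only to keep the original cause attached; the code in this PR does not use it.

```python
class ApiError(Exception):
    """Stand-in for the shared ApiError (hypothetical)."""


class IntegrationError(Exception):
    """Stand-in for the user-facing integration error (hypothetical)."""


def get_team_info(fetch):
    """Fetch team info, translating transport errors for the caller."""
    try:
        return fetch("/team.info")
    except ApiError as e:
        raise IntegrationError("Could not retrieve Slack team information.") from e


def build_integration(fetch):
    """Caller that only handles IntegrationError, as in the reply above."""
    try:
        return {"team": get_team_info(fetch)}
    except IntegrationError as e:
        return {"error": str(e)}
```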

@MeredithAnya MeredithAnya merged commit e817470 into master Apr 24, 2020
@MeredithAnya MeredithAnya deleted the slack/API-937 branch April 24, 2020 21:41
@github-actions github-actions bot locked and limited conversation to collaborators Dec 19, 2020
3 participants