-
Notifications
You must be signed in to change notification settings - Fork 98
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
feat: alertmanager conditional log gathering #545
feat: alertmanager conditional log gathering #545
Conversation
Skipping CI for Draft Pull Request. |
[APPROVALNOTIFIER] This PR is APPROVED This pull-request has been approved by: rluders The full list of commands accepted by this bot can be found here. The pull request process is described here
Needs approval from an approver in each of these files:
Approvers can indicate their approval by writing |
130db6b
to
574c25c
Compare
6ef73be
to
b4f7903
Compare
@@ -79,6 +81,7 @@ var defaultGatheringRules = []GatheringRule{ | |||
}, | |||
}, | |||
}, | |||
// GatherAPIRequestCounts |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
BTW, I added these comments to help to identify the conditions for each gather. It was a little bit hard for me to read it.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Makes sense and I agree that these lists tend to be difficult to read, especially as they grow longer.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Aside from a couple of small details, I don't see any particular issue with this PR. Please let me know if you want to make any of the suggested changes. Otherwise, I could probably approve it as is.
{ | ||
Conditions: []ConditionWithParams{ | ||
{ | ||
Type: AlertIsFiring, | ||
Alert: &AlertConditionParams{ | ||
Name: "AlertmanagerClusterFailedToSendAlerts", | ||
}, | ||
}, | ||
}, | ||
GatheringFunctions: GatheringFunctions{ | ||
GatherAlertmanagerLogs: GatherAlertmanagerLogsParams{ | ||
AlertName: "AlertmanagerClusterFailedToSendAlerts", | ||
TailLines: 50, | ||
}, | ||
}, | ||
}, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It's unfortunate that the conditional gatherer doesn't pass the alert name to the gatherer function is some way because it leads to this ugly code duplication that would be very easy to accidentally mess up (forgetting to set both alert name strings when copy-pasting). Not an issue with this PR, just a general note.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yeh, I was thinking exactly the same when I was implementing it. Maybe it would be a good idea to create a task to look for a better approach. What do you think?
/retest |
I experimented with this one and reviewed couple of times. Looks good! Thank you! |
/retest-required Please review the full test history for this PR and help us cut down flakes. |
1 similar comment
/retest-required Please review the full test history for this PR and help us cut down flakes. |
/label qe-approved |
/label px-approved |
/label docs-approved |
/retest-required Please review the full test history for this PR and help us cut down flakes. |
13 similar comments
/retest-required Please review the full test history for this PR and help us cut down flakes. |
/retest-required Please review the full test history for this PR and help us cut down flakes. |
/retest-required Please review the full test history for this PR and help us cut down flakes. |
/retest-required Please review the full test history for this PR and help us cut down flakes. |
/retest-required Please review the full test history for this PR and help us cut down flakes. |
/retest-required Please review the full test history for this PR and help us cut down flakes. |
/retest-required Please review the full test history for this PR and help us cut down flakes. |
/retest-required Please review the full test history for this PR and help us cut down flakes. |
/retest-required Please review the full test history for this PR and help us cut down flakes. |
/retest-required Please review the full test history for this PR and help us cut down flakes. |
/retest-required Please review the full test history for this PR and help us cut down flakes. |
/retest-required Please review the full test history for this PR and help us cut down flakes. |
/retest-required Please review the full test history for this PR and help us cut down flakes. |
/retest |
/retest-required Please review the full test history for this PR and help us cut down flakes. |
2 similar comments
/retest-required Please review the full test history for this PR and help us cut down flakes. |
/retest-required Please review the full test history for this PR and help us cut down flakes. |
This PR implements a new conditional gathering to collect Alertmanager logs when
AlertmanagerClusterFailedToSendAlerts
orAlertmanagerFailedToSendAlerts
events are fired.How to test it
Administration > Cluster Settings > Global Configuration > Alertmanager
Critical
andDefault
to useWebhook
, and add any invalid address to theURL
It may take some time until the event gets fired, you can follow it up from
Monitoring > Alert
Categories
Sample Archive
docs/insights-archive-sample/conditional/namespaces/openshift-monitoring/pods/alertmanager-main-0/containers/logs/alertmanager-alertmanagerfailedtosendalerts.log
Documentation
docs/gathered-data.md
Unit Tests
pkg/gatherers/conditional/gather_alertmanager_logs_test.go
Privacy
Yes. There are no sensitive data in the newly collected information.
Changelog
Breaking Changes
No
References
https://issues.redhat.com/browse/CCXDEV-6036