-
Notifications
You must be signed in to change notification settings - Fork 4.7k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
diagnostics: missing logging project shouldn't be fatal error #18714
Merged
openshift-merge-robot
merged 1 commit into
openshift:master
from
wozniakjan:diagnostics/logging/18705/missing_project_fatal
Mar 6, 2018
Merged
Changes from all commits
Commits
File filter
Filter by extension
Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
There are no files selected for viewing
This file contains bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Add this suggestion to a batch that can be applied as a single commit.
This suggestion is invalid because no changes were made to the code.
Suggestions cannot be applied while the pull request is closed.
Suggestions cannot be applied while viewing a subset of changes.
Only one suggestion per line can be applied in a batch.
Add this suggestion to a batch that can be applied as a single commit.
Applying suggestions on deleted lines is not supported.
You must change the existing code in this line in order to create a valid suggestion.
Outdated suggestions cannot be applied.
This suggestion has been applied or marked resolved.
Suggestions cannot be applied from pending reviews.
Suggestions cannot be applied on multi-line comments.
Suggestions cannot be applied while the pull request is queued to merge.
Suggestion cannot be applied right now. Please check back later.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Removing this error seems wrong. If we fail to fetch the default logging project, we will incorrectly print 'Found default logging project...'? Returning error in this case is reasonable.
Other problem we need to deal with is when 'oadm diagnostics all' is called, one diag failure should not block execution of other diag.
I think the problem is in pkg/oc/admin/diagnostics/diagnostics.go
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
You're right, I didn't inspect the logic closely enough here. It deals correctly with the projects not being there, but not any other error that could happen. Although I'd say the correct response here would be to log the error and return
nil
(so that other diagnostics could still run).I think you're probably right that this needs adjustment. The current "abort everything"
Complete()
behavior comes from it being run during the "diagnostics build" phase where anything that goes wrong is considered a critical error. I would like to think through how to clearly separate indicators of what exactly is wrong (the diagnostic runner, the individual diagnostic, its requirements, the user flags, the environment, the thing being diagnosed), all while making the default behavior as helpful as possible. It probably actually simplifies things that you can only run all or one now.There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The way I understand
Complete()
is, just to finish the setup in best effort kind of way. The diagnostics are run inCheck()
and they will give the user the desired information about the state of the cluster. Keeping the error there will result in aborting all diagnostics if fetching one of default logging projects fails for some reason, removing it as suggested here will result in informing the user about the error inCheck()
but allowing other diagnostics to run as well.@sosiouxme should I try to change as @pravisankar suggests here or is it being done as part of #18709? It doesn't feel correct to refactor diagnostics in this PR and I would rather fix the broken behavior while still complying with the current implementation as best as we possibly can. And once the diagnostics implementation is changed, we can refactor this as well.
Unfortunately, we don't have another lead for the default logging project than to try a couple of namespaces,
logging
and newlyopenshift-logging
. Another approach to logging diagnostics could be to run theCheck()
for each and if one is with no issues, consider diagnostics to be without issues, otherwise display issues for both.There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@wozniakjan your understanding seems correct to me. We just didn't want the
Complete()
logic to fall through to log "'Found default logging project...'" when really no project was found. To proceed with what you have now, justComplete()
with an empty project and put this "failed fetching the logging project" error inCanRun()
so the skip logic is engaged and other diagnostics proceed.The decision on how to handle
Complete()
failures better is IMHO outside the scope of this PR. BTW #18709 doesn't address this at all, it's mainly a cosmetic reorg of code.There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Got it, thanks! And about #18709, I wanted to clarify on what were the intentions there, in the name it said 'diagnostic reorg' so just wanted to make sure the PRs won't overlap in an undesired way
Updated the PR to return
nil
if fetching one of the default logging projects errors and also log the error so it is not completely lost (even though given theCanRun()
skips because no logging project has been selected, it won't get to the user unless the diagnostics algorithm changes)There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Considering the current limitations, that is a decent approach. Thanks!