I wonder whether it is really a good idea to continue the loop in this case. Are there definitely no side effects from having stale directories at the point we run subsequent forecasts?
If there are definitely no side effects from having failed to clean up then I guess this is ok. Alternatively, since you delete all pending forecasts at the end of this function, another option might be to set the m_Shutdown flag here, break out of the loop and check that the worker hasn't shut down in the code which schedules forecasts.
I don't think continuing the loop is that bad. Remember that a new process is only started for forecasting if the job is not currently open. So exiting the loop would mean it was not possible to do a forecast for a real-time job without closing and reopening the job, and that could be a major pain for production use cases.
Maybe I'm missing something though. If shutting down the loop is determined to be the best way forward then it needs to be set to true with the mutex locked in this method and checked with the mutex locked in the push method. Otherwise there would be potential for a race if a forecast is being pushed at the same time as cleanup of another forecast is failing.
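To illustrate the race concern: a minimal sketch of the pattern being described, where both the side that sets the shutdown flag and the side that pushes forecasts take the same mutex. The names (CForecastQueue, m_Mutex, m_Shutdown, m_Queue) are assumptions for illustration, not the actual ml-cpp members.

```cpp
#include <deque>
#include <mutex>
#include <string>

// Hypothetical sketch, not the real ml-cpp code: the shutdown flag is
// only ever set and read with the mutex held, so a forecast cannot be
// pushed concurrently with a failing cleanup flipping the flag.
class CForecastQueue {
public:
    // Called from the worker when cleanup fails badly enough to stop.
    void shutdown() {
        std::lock_guard<std::mutex> lock{m_Mutex};
        m_Shutdown = true;
    }

    // Called from the scheduling side; refuses new work once shut down.
    bool push(const std::string& forecast) {
        std::lock_guard<std::mutex> lock{m_Mutex};
        if (m_Shutdown) {
            return false;
        }
        m_Queue.push_back(forecast);
        return true;
    }

    bool isShutdown() const {
        std::lock_guard<std::mutex> lock{m_Mutex};
        return m_Shutdown;
    }

private:
    mutable std::mutex m_Mutex;
    bool m_Shutdown{false};
    std::deque<std::string> m_Queue;
};
```

Without the lock on both sides, a push could slip in between the cleanup failure and the flag becoming visible to the scheduler, which is exactly the race described above.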
Ok, that's a good point. I'd thought the thread was spawned/joined each time a forecast was requested, but it does indeed look like the forecast thread is started only once when the CAnomalyJob object is created. In that case ever exiting this loop, except on process shutdown, seems undesirable. I just wanted to raise this, since it is quite a significant change in behaviour, and wanted to be sure we'd thought about possible side effects. A couple of things that occurred to me, but that I hadn't checked, are: 1) what do we do about filling up the disk on the write side? 2) can there be any side effects from having stale directories for subsequent forecasts, i.e. do names definitely not get recycled?
It was never intended to shut down the loop; this is a bug. The code documentation says as much, and the log level is warning.
The name of a temporary file is also not hardcoded but random: https://github.com/elastic/ml-cpp/blob/master/lib/model/CForecastModelPersist.cc#L34
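To show why stale directories shouldn't collide with later forecasts, here is a minimal sketch of the general technique of randomised temporary names. This is an illustration only, using std::filesystem and a 64-bit random suffix; the actual CForecastModelPersist implementation differs in detail.

```cpp
#include <cstdint>
#include <filesystem>
#include <random>
#include <string>

// Illustration only (not the ml-cpp code): build a temporary directory
// name with a random 64-bit suffix, so successive forecasts practically
// never reuse a name left behind by a failed cleanup.
std::string randomTempDir(const std::string& prefix) {
    static std::mt19937_64 generator{std::random_device{}()};
    std::uniform_int_distribution<std::uint64_t> dist;
    return (std::filesystem::temp_directory_path() /
            (prefix + "_" + std::to_string(dist(generator))))
        .string();
}
```

With 2^64 possible suffixes, the chance of a new forecast landing on a stale directory's name is negligible, which is the point being made about names not getting recycled.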
I see that the writing of forecast documents also checks that the disk isn't full. I just wanted to double-check.