You signed in with another tab or window. Reload to refresh your session.You signed out in another tab or window. Reload to refresh your session.You switched accounts on another tab or window. Reload to refresh your session.Dismiss alert
Copy file name to clipboardExpand all lines: docs/api-reference/time-series-root-cause-localization.md
+4-4Lines changed: 4 additions & 4 deletions
Original file line number
Diff line number
Diff line change
@@ -1,7 +1,7 @@
1
-
At Mircosoft, we develop a decision tree based root cause localization method which helps to find out the root causes for an anomaly incident at a specific timestamp incrementally.
1
+
At Microsoft, we have developed a decision tree based root cause localization method which helps to find out the root causes for an anomaly incident at a specific timestamp incrementally.
2
2
3
3
## Multi-Dimensional Root Cause Localization
4
-
It's a common case that one measure is collected with many dimensions (*e.g.*, Province, ISP) whose values are categorical(*e.g.*, Beijing or Shanghai for dimension Province). When a measure's value deviates from its expected value, this measure encounters anomalies. In such case, operators would like to localize the root cause dimension combinations rapidly and accurately. Multi-dimensional root cause localization is critical to troubleshoot and mitigate such case.
4
+
It's a common case that one measure is collected with many dimensions (*e.g.*, Province, ISP) whose values are categorical(*e.g.*, Beijing or Shanghai for dimension Province). When a measure's value deviates from its expected value, this measure encounters anomalies. In such case, users would like to localize the root cause dimension combinations rapidly and accurately. Multi-dimensional root cause localization is critical to troubleshoot and mitigate such case.
5
5
6
6
## Algorithm
7
7
@@ -13,7 +13,7 @@ The decision tree based root cause localization method is unsupervised, which me
13
13
14
14
### Decision Tree
15
15
16
-
[Decision tree](https://en.wikipedia.org/wiki/Decision_tree) algorithm chooses the highest information gain to split or construct a decision tree. We use it to choose the dimension which contributes the most to the anomaly. Following are some concepts used in decision tree.
16
+
The [Decision tree](https://en.wikipedia.org/wiki/Decision_tree) algorithm chooses the highest information gain to split or construct a decision tree. We use it to choose the dimension which contributes the most to the anomaly. Below are some concepts used in decision trees.
Where $Ent(D^v)$ is the entropy of set points in D for which dimension $a$ is equal to $v$, $|D|$ is the total number of points in dataset $D$. $|D^V|$ is the total number of points in dataset $D$ for which dimension $a$ is equal to $v$.
32
32
33
-
For all aggregated dimensions, we calculate the information for each dimension. The greater the reduction in this uncertainty, the more information is gained about D from dimension $a$.
33
+
For all aggregated dimensions, we calculate the information for each dimension. The greater the reduction in this uncertainty, the more information is gained about $D$ from dimension $a$.
Copy file name to clipboardExpand all lines: src/Microsoft.ML.TimeSeries/ExtensionsCatalog.cs
+9-5Lines changed: 9 additions & 5 deletions
Original file line number
Diff line number
Diff line change
@@ -134,7 +134,7 @@ public static SsaSpikeEstimator DetectSpikeBySsa(this TransformsCatalog catalog,
134
134
/// <param name="windowSize">The size of the sliding window for computing spectral residual.</param>
135
135
/// <param name="backAddWindowSize">The number of points to add back of training window. No more than <paramref name="windowSize"/>, usually keep default value.</param>
136
136
/// <param name="lookaheadWindowSize">The number of pervious points used in prediction. No more than <paramref name="windowSize"/>, usually keep default value.</param>
137
-
/// <param name="averageingWindowSize">The size of sliding window to generate a saliency map for the series. No more than <paramref name="windowSize"/>, usually keep default value.</param>
137
+
/// <param name="averagingWindowSize">The size of sliding window to generate a saliency map for the series. No more than <paramref name="windowSize"/>, usually keep default value.</param>
138
138
/// <param name="judgementWindowSize">The size of sliding window to calculate the anomaly score for each data point. No more than <paramref name="windowSize"/>.</param>
139
139
/// <param name="threshold">The threshold to determine anomaly, score larger than the threshold is considered as anomaly. Should be in (0,1)</param>
140
140
/// <example>
@@ -145,8 +145,8 @@ public static SsaSpikeEstimator DetectSpikeBySsa(this TransformsCatalog catalog,
/// Create <see cref="SrCnnEntireAnomalyDetector"/>, which detects timeseries anomalies for entire input using SRCNN algorithm.
@@ -156,7 +156,7 @@ public static SrCnnAnomalyEstimator DetectAnomalyBySrCnn(this TransformsCatalog
156
156
/// <param name="outputColumnName">Name of the column resulting from data processing of <paramref name="inputColumnName"/>.
157
157
/// The column data is a vector of <see cref="System.Double"/>. The length of this vector varies depending on <paramref name="detectMode"/>.</param>
158
158
/// <param name="inputColumnName">Name of column to process. The column data must be <see cref="System.Double"/>.</param>
159
-
/// <param name="threshold">The threshold to determine anomaly, score larger than the threshold is considered as anomaly. Must be in [0,1]. Default value is 0.3.</param>
159
+
/// <param name="threshold">The threshold to determine an anomaly. An anomaly is detected when the calculated SR raw score for a given point is more than the set threshold. This threshold must fall between [0,1], and its default value is 0.3.</param>
160
160
/// <param name="batchSize">Divide the input data into batches to fit srcnn model.
161
161
/// When set to -1, use the whole input to fit model instead of batch by batch, when set to a positive integer, use this number as batch size.
162
162
/// Must be -1 or a positive integer no less than 12. Default value is 1024.</param>
@@ -165,6 +165,7 @@ public static SrCnnAnomalyEstimator DetectAnomalyBySrCnn(this TransformsCatalog
165
165
/// When set to AnomalyOnly, the output vector would be a 3-element Double vector of (IsAnomaly, RawScore, Mag).
166
166
/// When set to AnomalyAndExpectedValue, the output vector would be a 4-element Double vector of (IsAnomaly, RawScore, Mag, ExpectedValue).
167
167
/// When set to AnomalyAndMargin, the output vector would be a 7-element Double vector of (IsAnomaly, AnomalyScore, Mag, ExpectedValue, BoundaryUnit, UpperBoundary, LowerBoundary).
168
+
/// The RawScore is output by SR to determine whether a point is an anomaly or not, under AnomalyAndMargin mode, when a point is an anomaly, an AnomalyScore will be calculated according to sensitivity setting.
168
169
/// Default value is AnomalyOnly.</param>
169
170
/// <example>
170
171
/// <format type="text/markdown">
@@ -182,7 +183,10 @@ public static IDataView DetectEntireAnomalyBySrCnn(this AnomalyDetectionCatalog
/// <param name="src">Root cause's input. The data is an instance of <see cref="Microsoft.ML.TimeSeries.RootCauseLocalizationInput"/>.</param>
185
-
/// <param name="beta">Beta is a weight parameter for user to choose. It is used when score is calculated for each root cause item. The range of beta should be in [0,1]. For a larger beta, root cause point which has a large difference between value and expected value will get a high score. On the contrary, for a small beta, root cause items which has a high relative change will get a high score.</param>
186
+
/// <param name="beta">Beta is a weight parameter for user to choose.
187
+
/// It is used when score is calculated for each root cause item. The range of beta should be in [0,1].
188
+
/// For a larger beta, root cause items which have a large difference between value and expected value will get a high score.
189
+
/// For a small beta, root cause items which have a high relative change will get a low score.</param>
0 commit comments