Commit 6323f07

autoscaler: update storageautoscaler design doc
Currently the controller depended on reading the CR status; update the controller so that it no longer depends on the CR status.

Signed-off-by: parth-gr <[email protected]>
1 parent 11ec13f commit 6323f07

1 file changed, +43 -13 lines changed

docs/design/storage-auto-scale.md

@@ -29,7 +29,7 @@ Implement auto storage scale CR per device class of a specific Storagecluster.
   maxOsdSize: 8Ti
   timeoutSeconds: 1800
 status:
-  phase: (NotStarted | InProgress | Succeeded | Failed)
+  phase: ("" | NotStarted | InProgress | Succeeded | Failed)
   error:
     message: ""
     timestamp: <time>
@@ -59,7 +59,7 @@ Go-routine Scraper:
 Query: (ceph_osd_metadata * on (ceph_daemon, namespace, managedBy) group_right(device_class,hostname) (ceph_osd_stat_bytes_used / ceph_osd_stat_bytes))
 
 2) With every run, calculate the highest osd filled percentage for each device class.
-
+
 OsdPercentage:
 
 ssd 69%
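
Review note on step 2 of the hunk above: the per-device-class maximum could be computed roughly as below. This is a minimal Python sketch for illustration, not the operator's code. The function name `highest_fill_by_device_class` and the input shape (one `(device_class, used_ratio)` pair per OSD, as the query above would yield) are assumptions.

```python
def highest_fill_by_device_class(samples):
    """Return {device_class: highest filled percentage}.

    `samples` is an iterable of (device_class, used_ratio) pairs, where
    used_ratio is bytes_used / bytes_total for one OSD (hypothetical shape).
    """
    highest = {}
    for device_class, used_ratio in samples:
        pct = used_ratio * 100.0
        # Keep only the fullest OSD seen so far for this device class.
        if device_class not in highest or pct > highest[device_class]:
            highest[device_class] = pct
    return highest
```

For example, samples of `("ssd", 0.5)`, `("ssd", 0.25)` and `("hdd", 0.75)` map `ssd` to 50.0 and `hdd` to 75.0, which is the kind of `OsdPercentage` table the doc shows.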
@@ -72,36 +72,54 @@ Controller:
 
 1) Create a named controller that watches for [channel generic event](https://book-v1.book.kubebuilder.io/beyond_basics/controller_watches) per device class.
 
-2) If the expansion is not in progress set the status `phase` to `NotStarted`
+2) Set the status `phase` to `NotStarted` if no expansion has been triggered (status.phase == "").
+
+3) Check if the expansion is in progress:
+
+    1) Query actualOsdCount and actualOsdSize from Prometheus.
+
+    2) Calculate the desiredOsdCount and desiredOsdSize from the Storagecluster.
+
+    3) If (actualOsdSize < desiredOsdSize || actualOsdCount < desiredOsdCount), the expansion is in progress.
 
-3) If an expansion is in progress(expectedOsdSize!=startOsdSize || expectedOsdCount!=startOsdCount), check the progress and then requeue every 1 minute until the expansion is completed successfully (jump to step 11).
+4) If the expansion is in progress, check the progress and then requeue every 1 minute until the expansion is completed successfully (jump to step 13).
 
-4) If the LSO storageclass is detected in the storageClassDeviceSet, raise a warning and do not reconcile further.
+5) If no expansion is in progress:
 
-5) If the highest osd percentage reported in the sync map is more than osdScalingThresholdPercent (70%), the osd is approaching nearfull and scaling is needed.
+    1) Check whether status.phase == "InProgress"; if yes, move the status phase to `Succeeded`.
 
-6) If scaling is needed, calculate the `expectedOsdSize` and `expectedOsdCount`.
+    2) Proceed with further steps.
+
+6) If the LSO storageclass is detected in the storageClassDeviceSet, raise a warning and do not reconcile further.
+
+7) If the highest osd percentage reported in the sync map is more than osdScalingThresholdPercent (70%), the osd is approaching nearfull and scaling is needed.
+
+8) If scaling is needed, calculate the `expectedOsdSize` and `expectedOsdCount`.
 
     1) If the osd size is less than maxOsdSize (default: 8TiB), do vertical scaling by doubling the size of each osd for that device class.
 
     2) If the osd size is equal to maxOsdSize (default: 8TiB), do horizontal scaling by adding 1 osd of maxOsdSize (default: 8TiB) on each `storageDeviceSet`.
 
-7) Calculate the `expectedStorageCapacity` based on expected size and count.
+9) Calculate the `expectedStorageCapacity` based on expected size and count.
 
-8) Check if the `storageCapacityLimit` > `expectedStorageCapacity`.
+10) Check if the `storageCapacityLimit` > `expectedStorageCapacity`.
 
     1) If yes, update `phase` to `InProgress` and `lastExpansionStartTime` to `current-time` on the `StorageAutoScaling` CR which needs scaling.
 
     2) If no, don't reconcile further and set `storageCapacityLimitReached` to true in the status.
 
-9) Update the status, set the `expectedOsdSize` and `expectedOsdCount` to reflect the new expected value and also set `startOsdSize` and `startOsdCount` with current storagecluster values.
+11) Update the status, set the `expectedOsdSize` and `expectedOsdCount` to reflect the new expected value and also set `startOsdSize` and `startOsdCount` with current storagecluster values.
 
-10) Scale by patching the Storagecluster, with all the needed device set updates applied at the same time.
+12) Scale by patching the Storagecluster, with all the needed device set updates applied at the same time.
 
-11) Verify and Alert:
+13) Verify and Alert:
 
     1) Verify in the Storagecluster whether the new osds are added or scaled in size, for all the device sets.
 
+        1) For vertical scaling, query the osd size from Prometheus and match it with storagecluster.spec..size.
+
+        2) For horizontal scaling, query the osd count from Prometheus and match it with storagecluster.spec..count.
+
     2) If the scaling is successful, update the status of the `StorageAutoScaling` CR with `lastExpansionCompletionTime` and `phase`, and also the osd count and size.
 
     3) If the auto scale is not completed, requeue every 1 min, and change the phase to `Failed` if the scaling has not `Succeeded` within the timeoutSeconds (default: 1800) interval.
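
Review note on the hunk above: the in-progress check and the scaling decision (threshold check, vertical or horizontal scaling, then the capacity limit check) can be sketched as pure functions. This is a hedged Python illustration, not the controller's implementation. The names `OsdState`, `expansion_in_progress`, `next_expected` and `plan_scale` are hypothetical. It also assumes that `count` is the per-device-set osd count, that doubling is capped at maxOsdSize, and that expected capacity is simply size times count; the design doc does not state any of these.

```python
from dataclasses import dataclass

TIB = 1024 ** 4  # bytes in one TiB


@dataclass(frozen=True)
class OsdState:
    size_bytes: int  # size of each OSD
    count: int       # OSD count (assumed: per device set)


def expansion_in_progress(actual, desired):
    # actual comes from Prometheus, desired from the Storagecluster spec.
    return actual.size_bytes < desired.size_bytes or actual.count < desired.count


def next_expected(current, max_osd_size):
    if current.size_bytes < max_osd_size:
        # Vertical scaling: double the size (capping at max is an assumption).
        return OsdState(min(current.size_bytes * 2, max_osd_size), current.count)
    # Horizontal scaling: add one OSD of max size.
    return OsdState(max_osd_size, current.count + 1)


def plan_scale(highest_pct, threshold_pct, current, max_osd_size, capacity_limit):
    """Return (decision, expected) with decision in none/scale/limit-reached."""
    if highest_pct <= threshold_pct:
        return ("none", None)
    expected = next_expected(current, max_osd_size)
    # Assumed formula: expected capacity = expected size * expected count.
    expected_capacity = expected.size_bytes * expected.count
    if capacity_limit > expected_capacity:
        return ("scale", expected)
    return ("limit-reached", expected)
```

Keeping the decision free of API calls means it can be checked without a cluster. The Prometheus query and the Storagecluster patch would wrap around it.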
@@ -113,19 +131,31 @@ Controller:
 Based on the above algorithm, there are two conditions in which in-progress is set:
 
 1) If scaling is just started:
+
     1) Set `phase` to `InProgress`.
-    2) Verify if the scaling is successful.
+
+    2) Verify if the scaling is successful.
+
     3) If the scaling is successful, set the `phase` to `Succeeded`.
+
     4) Alert the user if the phase changes to `Succeeded`; alerting will be implemented with ocs-metrics-exporter.
+
     5) If the scaling is not yet completed, requeue every 1 min, which leads to the 2nd case.
 
 2) If the scaling has already started and the request is requeued:
+
     1) The requeue happens every 1 min.
+
     2) At the start of reconcile, check that `startOsdSize` and `expectedOsdSize` are not equal, and likewise for the osd count.
+
     3) As another validation, compare the storagecluster spec with the Prometheus response.
+
     4) Requeue for as long as the scaling is in progress.
+
     5) If the scaling is in progress for more than the timeoutSeconds (default: 1800) interval, set the phase to `Failed`.
+
     6) Alert the user if the phase changes to `Failed`; alerting will be implemented with ocs-metrics-exporter.
+
     7) If there is a failure alert, provide a mitigation guide for the user.
 
 ## Failure case
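
Review note on the phase handling described in the hunks above: the transitions between `""`, `NotStarted`, `InProgress`, `Succeeded` and `Failed` can be summarized as one small function. This Python sketch is illustrative only. The function name `next_phase`, the use of plain seconds for timestamps, and the choice not to requeue after `Failed` are assumptions rather than statements from the design doc.

```python
REQUEUE_SECONDS = 60  # "requeue every 1 min"


def next_phase(phase, in_progress, started_at, now, timeout_seconds=1800):
    """Return (new_phase, requeue_after_seconds or None).

    in_progress is the result of comparing the Prometheus values with the
    Storagecluster spec; started_at and now are timestamps in seconds.
    """
    if in_progress:
        if now - started_at > timeout_seconds:
            # Still expanding after the timeout interval.
            return ("Failed", None)
        return ("InProgress", REQUEUE_SECONDS)
    if phase == "InProgress":
        # No expansion is running any more, so the last one has finished.
        return ("Succeeded", None)
    if phase == "":
        # No expansion has ever been triggered for this CR.
        return ("NotStarted", None)
    return (phase, None)
```

Because the result depends only on the observed cluster state and the stored phase, the check can be repeated on every requeue.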
