
Commit 06ec85a

committed
autoscaler: update storageautoscaler design doc
Currently the controller was dependent on reading the CR status; update the controller so it does not depend on the CR status.

Signed-off-by: parth-gr <[email protected]>
1 parent 11ec13f commit 06ec85a

File tree

1 file changed (+44 −10 lines changed)

docs/design/storage-auto-scale.md

+44 −10
@@ -29,7 +29,7 @@ Implement auto storage scale CR per device class of a specific Storagecluster.
   maxOsdSize: 8Ti
   timeoutSeconds: 1800
 status:
-  phase: (NotStarted | InProgress | Succeeded | Failed)
+  phase: ("" | NotStarted | InProgress | Succeeded | Failed)
   error:
     message: ""
     timestamp: <time>
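For illustration only, a minimal Go sketch of how the status above could be modelled, with the empty string as the zero-value phase that the updated controller keys off. The type and field names are assumptions, not the actual ocs-operator types.

```go
// StorageAutoScalerPhase is the phase field of the CR status. The zero
// value "" means no expansion has been triggered yet.
type StorageAutoScalerPhase string

const (
	PhaseNotStarted StorageAutoScalerPhase = "NotStarted"
	PhaseInProgress StorageAutoScalerPhase = "InProgress"
	PhaseSucceeded  StorageAutoScalerPhase = "Succeeded"
	PhaseFailed     StorageAutoScalerPhase = "Failed"
)

// TimestampedError mirrors the error block of the status.
type TimestampedError struct {
	Message   string `json:"message,omitempty"`
	Timestamp string `json:"timestamp,omitempty"`
}

// StorageAutoScalerStatus mirrors the status section shown in the YAML above.
type StorageAutoScalerStatus struct {
	Phase StorageAutoScalerPhase `json:"phase,omitempty"`
	Error *TimestampedError      `json:"error,omitempty"`
}
```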
@@ -58,23 +58,41 @@ Go-routine Scraper:
 1) Add a single go-routine to watch periodically (10 mins) all the OSD sizes using Prometheus metrics in the storagecluster namespace.
    Query: (ceph_osd_metadata * on (ceph_daemon, namespace, managedBy) group_right(device_class,hostname) (ceph_osd_stat_bytes_used / ceph_osd_stat_bytes))

-2) With every run calculate the highest osd filled percentage for each device class.
-
-   OsdPercentage:
+2) With every run calculate the highest osd filled percentage, osdCount and osdSize for each device class.
+
+   OsdInfo:

-     ssd 69%
-     nvme 71%
+     ssd{
+       osdPercentage: 69%
+       osdCount: 3
+       osdSize: 100Gi
+     }
+     nvme{
+       osdPercentage: 71%
+       osdCount: 3
+       osdSize: 200Gi
+     }
      ...

-3) Create a sync map with `OsdPercentage`, send an event to the go channel.
+3) Create a sync map with `OsdInfo`, send an event to the go channel.

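A rough Go sketch of the scraper go-routine in steps 1–3 above: poll on a ticker, build an `OsdInfo` per device class, store it in a shared sync.Map and notify the controller over a channel. The Prometheus query is abstracted behind a placeholder helper and all identifiers are illustrative assumptions; the real design wires the channel to a controller-runtime generic-event source, as referenced in the Controller section below.

```go
package storageautoscaler

import (
	"context"
	"sync"
	"time"
)

// OsdInfo is the per-device-class summary the scraper publishes, matching
// the example above: highest filled OSD percentage, OSD count and OSD size.
type OsdInfo struct {
	HighestOsdPercentage float64
	OsdCount             int
	OsdSizeBytes         int64
}

// queryOsdInfo stands in for the Prometheus query shown in step 1
// (ceph_osd_stat_bytes_used / ceph_osd_stat_bytes joined with
// ceph_osd_metadata); it returns one OsdInfo per device class.
func queryOsdInfo(ctx context.Context) (map[string]OsdInfo, error) {
	// Assumption: real code would call the Prometheus HTTP API here.
	return map[string]OsdInfo{}, nil
}

// RunScraper polls every interval (10 minutes in the design), stores the
// latest OsdInfo per device class in the shared sync.Map, and sends one
// event per device class on the channel to wake the controller.
func RunScraper(ctx context.Context, osdInfoMap *sync.Map, events chan<- string, interval time.Duration) {
	ticker := time.NewTicker(interval)
	defer ticker.Stop()
	for {
		select {
		case <-ctx.Done():
			return
		case <-ticker.C:
			infos, err := queryOsdInfo(ctx)
			if err != nil {
				continue // transient scrape failure; retry on the next tick
			}
			for deviceClass, info := range infos {
				osdInfoMap.Store(deviceClass, info)
				events <- deviceClass
			}
		}
	}
}
```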
 Controller:

 1) Create a named controller that watches for [channel generic event](https://book-v1.book.kubebuilder.io/beyond_basics/controller_watches) per device class.

-2) If the expansion is not in progress set the status `phase` to `NotStarted`
+2) Set the status `phase` to `NotStarted` if no expansion has been triggered (status.phase == "").
+
+3) Check if the expansion is in progress:
+
+   1) Load the actualOsdCount and actualOsdSize from the syncMap.
+
+   2) Load the desiredOsdCount and desiredOsdSize from the Storagecluster.
+
+   3) If (actualOsdSize != desiredOsdSize && actualOsdCount != desiredOsdCount), an expansion is in progress.

-3) If an expansion is in progress (expectedOsdSize != startOsdSize || expectedOsdCount != startOsdCount), check the progress and then requeue every 1 minute until the expansion is completed successfully (jump to step 11).
+4) Check the progress and then requeue every 1 minute until the expansion is completed successfully (jump to step 11).
+
+5) If no expansion is in progress, proceed with the further steps.

 4) If the LSO storageclass is detected in the storageClassDeviceSet, raise a warning and do not reconcile further.

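A sketch of controller steps 2–3 above, continuing the scraper sketch (it reuses `OsdInfo` and the status types defined earlier; the names are assumptions): the reconciler initialises the phase only when it is still empty, and decides whether an expansion is in progress by comparing the scraped values against the StorageCluster spec rather than trusting the previous CR status.

```go
// (continues the sketch above; additionally needs "fmt")

// setInitialPhase implements step 2: only set NotStarted when no expansion
// has been triggered yet (status.phase == "").
func setInitialPhase(status *StorageAutoScalerStatus) {
	if status.Phase == "" {
		status.Phase = PhaseNotStarted
	}
}

// expansionInProgress implements step 3: load the actual values from the
// scraper's sync map and compare them with the desired values read from the
// StorageCluster. The condition is kept exactly as written in the doc.
func expansionInProgress(osdInfoMap *sync.Map, deviceClass string, desiredCount int, desiredSizeBytes int64) (bool, error) {
	v, ok := osdInfoMap.Load(deviceClass)
	if !ok {
		return false, fmt.Errorf("no scraped data yet for device class %q", deviceClass)
	}
	actual := v.(OsdInfo)
	return actual.OsdSizeBytes != desiredSizeBytes && actual.OsdCount != desiredCount, nil
}
```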
@@ -102,6 +120,10 @@ Controller:

 1) Verify in the Storagecluster whether new osds are added or scaled in size, for all the device sets.

+   1) For vertical scaling, query the osd size from Prometheus and match it with storagecluster.spec..size.
+
+   2) For horizontal scaling, query the osd count from Prometheus and match it with storagecluster.spec..count.
+
 2) If the scaling is successful, update the status of the `StorageAutoScaling` CR with `lastExpansionCompletionTime` and `phase`, and also the osd count and size.

 3) If the auto scale is not completed, it will requeue every 1 min and change the phase to `failed` if the scaling has not `Succeeded` within the timeoutSeconds (default: 1800) interval.
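A small helper sketch for the verification step added above, comparing the scraped values with the StorageCluster spec. The spec field paths are elided in the doc, so the parameters here are assumptions.

```go
// scalingComplete reports whether the requested expansion is visible in the
// metrics Prometheus returns for a device class:
//   - vertical scaling:   scraped OSD size matches storagecluster.spec..size
//   - horizontal scaling: scraped OSD count matches storagecluster.spec..count
func scalingComplete(actual OsdInfo, specCount int, specSizeBytes int64) bool {
	verticalDone := actual.OsdSizeBytes == specSizeBytes
	horizontalDone := actual.OsdCount == specCount
	return verticalDone && horizontalDone
}
```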
@@ -113,19 +135,31 @@ Controller:
 Based on the above algorithm there would be two conditions where in-progress is set; elaborating those conditions:

 1) If scaling has just started:
+
    1) Set `phase` to `InProgress`.
-   2) Verify is the scaling is successful.
+
+   2) Verify if the scaling is successful.
+
    3) If the scaling is successful, set the `phase` to `Succeeded`.
+
    4) Alert the user if the phase changes to `Succeeded`; alerting will be implemented with ocs-metrics-exporter.
+
    5) If the scaling is not yet completed, requeue every 1 min; we then have the 2nd case.

 2) If the scaling has already started and it is a requeue:
+
    1) Now the requeue will happen every 1 min.
+
    2) At the start of reconcile, check that `startOsdSize` and `expectedOsdSize` are not equal, and similarly for the osd count.
+
    3) Another validation is to compare the storagecluster spec with the Prometheus response.
+
    4) Requeue till the scaling is in progress.
+
    5) If the scaling is in progress for more than the timeoutSeconds (default: 1800) interval, set the phase to `failed`.
+
    6) Alert the user if the phase changes to `Failed`; alerting will be implemented with ocs-metrics-exporter.
+
    7) If there was a failure alert, provide a mitigation guide for the user.
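Finally, a speculative sketch of how the requeue and timeout behaviour in the two conditions above could look in the reconciler, reusing the helpers from the previous sketches (`ctrl` is `sigs.k8s.io/controller-runtime`, `time` is the standard library); this is not the actual implementation.

```go
// checkExpansionProgress covers both conditions above for one device class:
// mark the expansion InProgress, set Succeeded once the spec is reflected in
// the metrics, fail it after timeoutSeconds, otherwise requeue in 1 minute.
// Alerts on Succeeded/Failed are expected to come from ocs-metrics-exporter.
func checkExpansionProgress(status *StorageAutoScalerStatus, actual OsdInfo, specCount int, specSizeBytes int64, startTime time.Time, timeout time.Duration) (ctrl.Result, error) {
	status.Phase = PhaseInProgress

	if scalingComplete(actual, specCount, specSizeBytes) {
		status.Phase = PhaseSucceeded
		// lastExpansionCompletionTime and the observed osd count/size would
		// also be recorded on the CR status here.
		return ctrl.Result{}, nil
	}

	if time.Since(startTime) > timeout {
		// Scaling did not reach Succeeded within timeoutSeconds (default 1800).
		status.Phase = PhaseFailed
		return ctrl.Result{}, nil
	}

	// Still in progress: requeue and check again in 1 minute.
	return ctrl.Result{RequeueAfter: time.Minute}, nil
}
```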
## Failure case
