
Commit 06ec85a

committed
autoscaler: update storageautoscaler design doc
Currently the controller was dependent on reading the CR status; update the controller so it does not depend on the CR status.

Signed-off-by: parth-gr <[email protected]>
1 parent 11ec13f commit 06ec85a

File tree

1 file changed (+44 −10 lines changed)

docs/design/storage-auto-scale.md

+44 −10
@@ -29,7 +29,7 @@ Implement auto storage scale CR per device class of a specific Storagecluster.
   maxOsdSize: 8Ti
   timeoutSeconds: 1800
 status:
-  phase: (NotStarted | InProgress | Succeeded | Failed)
+  phase: ("" | NotStarted | InProgress | Succeeded | Failed)
   error:
     message: ""
     timestamp: <time>
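For illustration only, a minimal Go sketch of how the status above could be modelled, with the empty string as the zero-value phase that the updated controller keys off. The type and field names are assumptions, not the actual ocs-operator types.

```go
// StorageAutoScalerPhase is the phase field of the CR status. The zero
// value "" means no expansion has been triggered yet.
type StorageAutoScalerPhase string

const (
	PhaseNotStarted StorageAutoScalerPhase = "NotStarted"
	PhaseInProgress StorageAutoScalerPhase = "InProgress"
	PhaseSucceeded  StorageAutoScalerPhase = "Succeeded"
	PhaseFailed     StorageAutoScalerPhase = "Failed"
)

// TimestampedError mirrors the error block of the status.
type TimestampedError struct {
	Message   string `json:"message,omitempty"`
	Timestamp string `json:"timestamp,omitempty"`
}

// StorageAutoScalerStatus mirrors the status section shown in the YAML above.
type StorageAutoScalerStatus struct {
	Phase StorageAutoScalerPhase `json:"phase,omitempty"`
	Error *TimestampedError      `json:"error,omitempty"`
}
```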
@@ -58,23 +58,41 @@ Go-routine Scraper:
 1) Add a single go-routine to watch periodically (10 mins) all the OSD sizes using Prometheus metrics in the storagecluster namespace.
    Query: (ceph_osd_metadata * on (ceph_daemon, namespace, managedBy) group_right(device_class,hostname) (ceph_osd_stat_bytes_used / ceph_osd_stat_bytes))

-2) With every run calculate the highest osd filled percentage for each device class.
-
-   OsdPercentage:
+2) With every run calculate the highest osd filled percentage, osdCount and osdSize for each device class.
+
+   OsdInfo:

-     ssd 69%
-     nvme 71%
+     ssd{
+       osdPercentage: 69%
+       osdCount: 3
+       osdSize: 100Gi
+     }
+     nvme{
+       osdPercentage: 71%
+       osdCount: 3
+       osdSize: 200Gi
+     }
      ...

-3) Create a sync map with `OsdPercentage`, send an event to the go channel.
+3) Create a sync map with `OsdInfo`, send an event to the go channel.

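A rough Go sketch of the scraper go-routine in steps 1–3 above: poll on a ticker, build an `OsdInfo` per device class, store it in a shared sync.Map and notify the controller over a channel. The Prometheus query is abstracted behind a placeholder helper and all identifiers are illustrative assumptions; the real design wires the channel to a controller-runtime generic-event source, as referenced in the Controller section below.

```go
package storageautoscaler

import (
	"context"
	"sync"
	"time"
)

// OsdInfo is the per-device-class summary the scraper publishes, matching
// the example above: highest filled OSD percentage, OSD count and OSD size.
type OsdInfo struct {
	HighestOsdPercentage float64
	OsdCount             int
	OsdSizeBytes         int64
}

// queryOsdInfo stands in for the Prometheus query shown in step 1
// (ceph_osd_stat_bytes_used / ceph_osd_stat_bytes joined with
// ceph_osd_metadata); it returns one OsdInfo per device class.
func queryOsdInfo(ctx context.Context) (map[string]OsdInfo, error) {
	// Assumption: real code would call the Prometheus HTTP API here.
	return map[string]OsdInfo{}, nil
}

// RunScraper polls every interval (10 minutes in the design), stores the
// latest OsdInfo per device class in the shared sync.Map, and sends one
// event per device class on the channel to wake the controller.
func RunScraper(ctx context.Context, osdInfoMap *sync.Map, events chan<- string, interval time.Duration) {
	ticker := time.NewTicker(interval)
	defer ticker.Stop()
	for {
		select {
		case <-ctx.Done():
			return
		case <-ticker.C:
			infos, err := queryOsdInfo(ctx)
			if err != nil {
				continue // transient scrape failure; retry on the next tick
			}
			for deviceClass, info := range infos {
				osdInfoMap.Store(deviceClass, info)
				events <- deviceClass
			}
		}
	}
}
```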
 Controller:

 1) Create a named controller that watches for [channel generic event](https://book-v1.book.kubebuilder.io/beyond_basics/controller_watches) per device class.

-2) If the expansion is not in progress set the status `phase` to `NotStarted`
+2) Set the status `phase` to `NotStarted` if no expansion has been triggered (status.phase == "").
+
+3) Check if the expansion is in progress:
+
+   1) Load the actualOsdCount and actualOsdSize from the syncMap.
+
+   2) Load the desiredOsdCount and desiredOsdSize from the Storagecluster.
+
+   3) If (actualOsdSize != desiredOsdSize && actualOsdCount != desiredOsdCount), an expansion is in progress.

-3) If an expansion is in progress (expectedOsdSize != startOsdSize || expectedOsdCount != startOsdCount), check the progress and then requeue every 1 minute until the expansion is completed successfully (jump to step 11).
+4) Check the progress and then requeue every 1 minute until the expansion is completed successfully (jump to step 11).
+
+5) If no expansion is in progress, proceed with the further steps.

 4) If the LSO storageclass is detected in the storageClassDeviceSet, raise a warning and do not reconcile further.

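A sketch of controller steps 2–3 above, continuing the scraper sketch (it reuses `OsdInfo` and the status types defined earlier; the names are assumptions): the reconciler initialises the phase only when it is still empty, and decides whether an expansion is in progress by comparing the scraped values against the StorageCluster spec rather than trusting the previous CR status.

```go
// (continues the sketch above; additionally needs "fmt")

// setInitialPhase implements step 2: only set NotStarted when no expansion
// has been triggered yet (status.phase == "").
func setInitialPhase(status *StorageAutoScalerStatus) {
	if status.Phase == "" {
		status.Phase = PhaseNotStarted
	}
}

// expansionInProgress implements step 3: load the actual values from the
// scraper's sync map and compare them with the desired values read from the
// StorageCluster. The condition is kept exactly as written in the doc.
func expansionInProgress(osdInfoMap *sync.Map, deviceClass string, desiredCount int, desiredSizeBytes int64) (bool, error) {
	v, ok := osdInfoMap.Load(deviceClass)
	if !ok {
		return false, fmt.Errorf("no scraped data yet for device class %q", deviceClass)
	}
	actual := v.(OsdInfo)
	return actual.OsdSizeBytes != desiredSizeBytes && actual.OsdCount != desiredCount, nil
}
```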
@@ -102,6 +120,10 @@ Controller:

 1) Verify in the Storagecluster whether new osds are added or scaled in size, for all the device sets.

+   1) For vertical scaling, query the osd size from Prometheus and match it with storagecluster.spec..size.
+
+   2) For horizontal scaling, query the osd count from Prometheus and match it with storagecluster.spec..count.
+
 2) If the scaling is successful, update the status of the `StorageAutoScaling` CR with `lastExpansionCompletionTime` and `phase`, and also the osd count and size.

 3) If the auto scale is not completed, it will requeue every 1 min and change the phase to `failed` if the scaling has not `Succeeded` within the timeoutSeconds (default: 1800) interval.
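A small helper sketch for the verification step added above, comparing the scraped values with the StorageCluster spec. The spec field paths are elided in the doc, so the parameters here are assumptions.

```go
// scalingComplete reports whether the requested expansion is visible in the
// metrics Prometheus returns for a device class:
//   - vertical scaling:   scraped OSD size matches storagecluster.spec..size
//   - horizontal scaling: scraped OSD count matches storagecluster.spec..count
func scalingComplete(actual OsdInfo, specCount int, specSizeBytes int64) bool {
	verticalDone := actual.OsdSizeBytes == specSizeBytes
	horizontalDone := actual.OsdCount == specCount
	return verticalDone && horizontalDone
}
```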
@@ -113,19 +135,31 @@ Controller:
 Based on the above algorithm there would be two conditions where in-progress is set; elaborating those conditions:

 1) If scaling has just started:
+
    1) Set `phase` to `InProgress`.
-   2) Verify is the scaling is successful.
+
+   2) Verify if the scaling is successful.
+
    3) If the scaling is successful, set the `phase` to `Succeeded`.
+
    4) Alert the user if the phase changes to `Succeeded`; alerting will be implemented with ocs-metrics-exporter.
+
    5) If the scaling is not yet completed, requeue every 1 min; we then have the 2nd case.

 2) If the scaling has already started and it is a requeue:
+
    1) Now the requeue will happen every 1 min.
+
    2) At the start of reconcile, check that `startOsdSize` and `expectedOsdSize` are not equal, and similarly for the osd count.
+
    3) Another validation is to compare the storagecluster spec with the Prometheus response.
+
    4) Requeue till the scaling is in progress.
+
    5) If the scaling is in progress for more than the timeoutSeconds (default: 1800) interval, set the phase to `failed`.
+
    6) Alert the user if the phase changes to `Failed`; alerting will be implemented with ocs-metrics-exporter.
+
    7) If there was a failure alert, provide a mitigation guide for the user.
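Finally, a speculative sketch of how the requeue and timeout behaviour in the two conditions above could look in the reconciler, reusing the helpers from the previous sketches (`ctrl` is `sigs.k8s.io/controller-runtime`, `time` is the standard library); this is not the actual implementation.

```go
// checkExpansionProgress covers both conditions above for one device class:
// mark the expansion InProgress, set Succeeded once the spec is reflected in
// the metrics, fail it after timeoutSeconds, otherwise requeue in 1 minute.
// Alerts on Succeeded/Failed are expected to come from ocs-metrics-exporter.
func checkExpansionProgress(status *StorageAutoScalerStatus, actual OsdInfo, specCount int, specSizeBytes int64, startTime time.Time, timeout time.Duration) (ctrl.Result, error) {
	status.Phase = PhaseInProgress

	if scalingComplete(actual, specCount, specSizeBytes) {
		status.Phase = PhaseSucceeded
		// lastExpansionCompletionTime and the observed osd count/size would
		// also be recorded on the CR status here.
		return ctrl.Result{}, nil
	}

	if time.Since(startTime) > timeout {
		// Scaling did not reach Succeeded within timeoutSeconds (default 1800).
		status.Phase = PhaseFailed
		return ctrl.Result{}, nil
	}

	// Still in progress: requeue and check again in 1 minute.
	return ctrl.Result{RequeueAfter: time.Minute}, nil
}
```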
## Failure case
