@@ -680,6 +680,25 @@ Example:
- the apiserver repair loops will generate periodic events informing the user that the Service with the IP allocated
is not within the configured range

+ One of the biggest problems when running with skewed apiservers is that each of them uses an independent
+ allocator and relies on the repair loops to reconcile Services and ClusterIPs. As a result, two Services created
+ through different skewed apiservers and requesting the same ClusterIP can both succeed if the allocations happen
+ before the repair loops run, with the catastrophic result of two Services sharing the same ClusterIP.
+
+ To avoid this race, the new IP allocator will implement a dual-write strategy: it creates an IPAddress object and
+ also allocates the IP in the corresponding bitmap allocator. This behavior will be controlled by a new feature gate,
+ `DisableAllocatorDualWrite`, which will be disabled by default until `MultiCIDRServiceAllocator` is GA.
+ In the version after `MultiCIDRServiceAllocator` goes GA, all apiservers will be using the new IP allocator, so
+ it will be safe to disable the dual-write mode.
+
+
+ | Version | MultiCIDRServiceAllocator | DisableAllocatorDualWrite |
+ |---------|---------------------------|---------------------------|
+ | 1.31 | Beta off | Alpha off |
+ | 1.32 | GA on (locked) | Beta off |
+ | 1.33 | GA on (there are no bitmaps running) | GA on (also delete old bitmap) |
+ | 1.34 | remove feature gate | remove feature gate |
+
## Production Readiness Review Questionnaire

### Feature Enablement and Rollback