[FLINK-30571] Estimate scalability coefficient from past scaling history using linear regression #966
What is the purpose of the change
Currently, target parallelism computation assumes perfect linear scaling. However, real-time workloads often exhibit nonlinear scalability due to factors like network overhead and coordination costs.
This change introduces an observed scalability coefficient, estimated using weighted linear regression on past (parallelism, processing rate) data, to improve the accuracy of scaling decisions.
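To illustrate why the linearity assumption matters, here is a toy calculation under the model described in this PR. The helper class and method below are illustrative only, not the operator's actual API; they compare the parallelism required to hit a target rate under perfect linear scaling versus an observed coefficient below 1.

```java
public class TargetParallelismExample {

    /**
     * Hypothetical helper: the parallelism needed to reach targetRate under
     * the model R = alpha * beta * P, where beta is the baseline processing
     * rate per parallelism unit. alpha = 1 recovers perfect linear scaling.
     */
    static int targetParallelism(double targetRate, double beta, double alpha) {
        return (int) Math.ceil(targetRate / (alpha * beta));
    }

    public static void main(String[] args) {
        double beta = 100.0;        // baseline: 100 records/s per subtask
        double targetRate = 1000.0; // desired processing rate

        // Perfect linear scaling suggests 10 subtasks suffice...
        System.out.println(targetParallelism(targetRate, beta, 1.0)); // 10

        // ...but with an observed scalability coefficient of 0.8,
        // each added subtask contributes less, so more are needed.
        System.out.println(targetParallelism(targetRate, beta, 0.8)); // 13
    }
}
```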
Brief change log
Implemented a dynamic scaling coefficient to compute target parallelism based on observed scalability. The system estimates the scalability coefficient using a weighted least squares linear regression approach, leveraging historical (parallelism, processing rate) data.
The regression model minimizes the weighted sum of squared errors, where weights are assigned based on both parallelism and recency to prioritize recent observations. The baseline processing rate is computed using the smallest observed parallelism in the history. Model details:
The Linear Model
We define a linear relationship between parallelism (P) and processing rate (R):

R̂ = (β α) P

where:
- P is the parallelism,
- R̂ is the predicted processing rate at parallelism P,
- β is the baseline processing rate per parallelism unit, computed from the smallest observed parallelism in the history,
- α is the scalability coefficient to be estimated (α = 1 means perfect linear scaling, α < 1 means sub-linear scaling).

Weighted Squared Error
The loss function to minimize is the weighted sum of squared errors (SSE):

E(α) = Σ_i w_i (R_i - R̂_i)²

Substituting R̂_i = (β α) P_i:

E(α) = Σ_i w_i (R_i - β α P_i)²

where w_i is the weight for each data point, assigned based on both parallelism and recency.

Minimizing the Error
Expanding (R_i - β α P_i)²:

R_i² - 2 β α P_i R_i + β² α² P_i²

Multiplying by w_i and summing over all data points:

E(α) = Σ_i w_i R_i² - 2 β α Σ_i w_i P_i R_i + β² α² Σ_i w_i P_i²

Solving for α
To minimize E(α), we set the derivative with respect to α to zero:

dE/dα = -2 β Σ_i w_i P_i R_i + 2 β² α Σ_i w_i P_i² = 0

and solving for α gives:

α = ( Σ_i w_i P_i R_i ) / ( β Σ_i w_i P_i² )
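The closed-form estimate above can be sketched in a few lines. This is a minimal, self-contained illustration, not the operator's implementation: the class name, the sample history, and the recency weights are all made up for the example.

```java
public class ScalingCoefficientSketch {

    /**
     * Weighted least-squares estimate of the scalability coefficient alpha
     * for the model R = alpha * beta * P, using the closed form
     * alpha = sum(w_i * P_i * R_i) / (beta * sum(w_i * P_i^2)).
     */
    static double estimateAlpha(
            double[] parallelism, double[] rate, double[] weight, double beta) {
        double num = 0.0;
        double den = 0.0;
        for (int i = 0; i < parallelism.length; i++) {
            num += weight[i] * parallelism[i] * rate[i];
            den += weight[i] * parallelism[i] * parallelism[i];
        }
        return num / (beta * den);
    }

    public static void main(String[] args) {
        // History of (parallelism, observed processing rate) samples.
        double[] p = {2, 4, 8};
        double[] r = {190, 360, 640}; // sub-linear: doubling P does not double R
        // Illustrative recency weights: newer observations count more.
        double[] w = {0.5, 0.75, 1.0};
        // Baseline rate per parallelism unit, from the smallest observed parallelism.
        double beta = r[0] / p[0];
        double alpha = estimateAlpha(p, r, w, beta);
        System.out.println(alpha); // a value below 1.0 indicates sub-linear scaling
    }
}
```

A coefficient of 1.0 falls out naturally when the history really is linear, so the linear-scaling behavior of the current autoscaler is the special case α = 1.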
Verifying this change
New unit tests were added to cover this change.
Does this pull request potentially affect one of the following parts:
Dependencies (does it add or upgrade a dependency): no
The public API, i.e., any changes to the CustomResourceDescriptors: no
Core observer or reconciler logic that is regularly executed: no