[FLINK-30571] Estimate scalability coefficient from past scaling history using linear regression #966

Open
wants to merge 1 commit into
base: main

pchoudhury22

What is the purpose of the change

Currently, target parallelism computation assumes perfect linear scaling. However, real-time workloads often exhibit nonlinear scalability due to factors like network overhead and coordination costs.

This change introduces an observed scalability coefficient, estimated using weighted linear regression on past (parallelism, processing rate) data, to improve the accuracy of scaling decisions.

Brief change log

Implemented a dynamic scaling coefficient to compute target parallelism based on observed scalability. The system estimates the scalability coefficient using a weighted least squares linear regression approach, leveraging historical (parallelism, processing rate) data.
The regression model minimizes the weighted sum of squared errors, where weights are assigned based on both parallelism and recency to prioritize recent observations. The baseline processing rate is computed using the smallest observed parallelism in the history. Model details:

The Linear Model

We define a linear relationship between parallelism (P) and processing rate (R):

$$R_i = β * P_i * α$$

where:

  • R_i = actual processing rate for the i-th data point
  • P_i = parallelism for the i-th data point
  • β = base factor (constant scale factor)
  • α = scaling coefficient to optimize
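As a minimal sketch of the model above (not code from this PR), the predicted rate is just the product of the three terms. The function name `predicted_rate` and the sample numbers are hypothetical and chosen only for illustration.

```python
def predicted_rate(parallelism: int, beta: float, alpha: float) -> float:
    """Predicted processing rate R = beta * alpha * P under the linear model."""
    return beta * alpha * parallelism


# Hypothetical numbers: a base factor of 100 records/s per unit of
# parallelism and a scaling coefficient of 0.8 (sub-linear scaling).
example_rate = predicted_rate(parallelism=4, beta=100.0, alpha=0.8)
```

With α = 1 the model reduces to the perfect linear scaling that the current computation assumes; α < 1 models a vertex that gains less than proportionally from extra parallelism.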

Weighted Squared Error

The loss function to minimize is the weighted sum of squared errors (SSE):

$$Loss = Σ w_i * (R_i - R̂_i)^2$$

Substituting ( R̂_i = (β α) P_i ):

$$Loss = Σ w_i * (R_i - β α P_i)^2$$

where w_i is the weight for each data point.
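The loss above can be sketched directly as a sum over the history. This is an illustrative helper, not the PR's implementation; the name `weighted_sse` and the list-based inputs are assumptions, and the weights are passed in as-is because the exact weighting formula is not spelled out in this description.

```python
from typing import Sequence


def weighted_sse(
    alpha: float,
    beta: float,
    parallelisms: Sequence[float],
    rates: Sequence[float],
    weights: Sequence[float],
) -> float:
    """Weighted sum of squared errors: sum_i w_i * (R_i - beta * alpha * P_i)^2."""
    return sum(
        w * (r - beta * alpha * p) ** 2
        for w, p, r in zip(weights, parallelisms, rates)
    )
```

For data that scales perfectly linearly (for example rates of 100 and 200 at parallelism 1 and 2 with β = 100), the loss at α = 1 is zero, and it grows as α moves away from that value.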

Minimizing the Error

Expanding ( (R_i - β α P_i)^2 ):

$$(R_i - β α P_i)^2 = R_i^2 - 2β α P_i R_i + (β α P_i)^2$$

Multiplying by w_i and summing over all data points:

$$Loss = Σ w_i * (R_i^2 - 2β α P_i R_i + β^2 α^2 P_i^2)$$

Solving for α

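To minimize the loss with respect to α, take the derivative and set it to zero:

$$dLoss/dα = Σ w_i * (-2β P_i R_i + 2β^2 α P_i^2) = 0$$

Dividing by 2β and solving for α (β is a constant, so it moves outside the sum) gives:

$$α = (Σ w_i P_i R_i) / (β * Σ w_i P_i^2)$$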
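The closed-form solution can be sketched in a few lines. This is a hedged illustration rather than the code in this PR: the function name `estimate_alpha`, the sample history, the uniform weights, and the choice of β as the per-unit rate at the smallest observed parallelism are all assumptions made for the example.

```python
from typing import Sequence


def estimate_alpha(
    parallelisms: Sequence[float],
    rates: Sequence[float],
    weights: Sequence[float],
    beta: float,
) -> float:
    """Closed-form weighted least squares estimate of alpha for R = beta * alpha * P."""
    numerator = sum(w * p * r for w, p, r in zip(weights, parallelisms, rates))
    denominator = beta * sum(w * p * p for w, p in zip(weights, parallelisms))
    if denominator == 0:
        raise ValueError("need a non-zero beta and at least one weighted data point")
    return numerator / denominator


# Hypothetical scaling history as (parallelism, processing rate) pairs.
parallelisms = [2, 4, 8]
rates = [200.0, 360.0, 640.0]

# Assumption: beta is the rate per unit of parallelism at the smallest
# observed parallelism. The PR only says the baseline uses that data point.
i_min = min(range(len(parallelisms)), key=lambda i: parallelisms[i])
beta = rates[i_min] / parallelisms[i_min]

# Assumption: uniform weights. The PR weights by parallelism and recency,
# but the exact formula is not given in this description.
weights = [1.0, 1.0, 1.0]

alpha = estimate_alpha(parallelisms, rates, weights, beta)
```

In this made-up history the rate grows more slowly than parallelism, so the estimate comes out below 1. Feeding perfectly linear data (rates of 200, 400, 800) into the same function yields α = 1.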

Verifying this change

New unit tests added to cover this

Does this pull request potentially affect one of the following parts:

Dependencies (does it add or upgrade a dependency): no
The public API, i.e., are there any changes to the CustomResourceDescriptors: no
Core observer or reconciler logic that is regularly executed: no

@pchoudhury22
Author

Hi @gyfora , Please help review the PR! Thanks!
