-
Notifications
You must be signed in to change notification settings - Fork 571
feat(DestinationRules): Adding aggression and min_weight_percent to DestinationRules API #3216
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
feat(DestinationRules): Adding aggression and min_weight_percent to DestinationRules API #3216
Conversation
😊 Welcome @frgaudet! This is either your first contribution to the Istio api repo, or it's been You can learn more about the Istio working groups, Code of Conduct, and contribution guidelines Thanks for contributing! Courtesy of your friendly welcome wagon. |
Hi @frgaudet. Thanks for your PR. I'm waiting for a istio member to verify that this patch is reasonable to test. If it is, they should reply with Once the patch is verified, the new status will be reflected by the I understand the commands that are listed here. Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes-sigs/prow repository. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Can we wrap all slow star config together
// By tuning aggression parameter, one could achieve polynomial or exponential speed for traffic increase. | ||
message aggression { | ||
uint32 default_value = 5; | ||
string runtime_key = 6; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
what do they mean and can you provide an demo
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I do not think we should expose a runtime key here. Also for how many services do you have to configure this? And does it differ from service to service?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I do not think we should expose a runtime key here. Also for how many services do you have to configure this? And does it differ from service to service?
As far as I remember runtime_key parameter is mandatory if we want to use an aggression parameter (which is the one we really need).
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
what do they mean and can you provide an demo
We're using Java microservices and pods need a warmup phase in order to have full performance.
In practice the goal is to avoid giving 100% of the traffic to a new READY pod. Leveraging a slow start allow us to give first a certain % of the traffic then ramp-up progressively to 100%.
For this first attempt we tried to use the LoadBalancerSettings feature from Istio. This allow us to specify the duration of the warmup. However in this config we can’t configure 2 important options because they are not exposed by Istio API :
min_weight_percent
: specifies the initial percent of origin load, if not present, it is default to 10%.
aggression
: will defined the evolution of the % of traffic sent to the pods from min_weight_percent to 100%, by default the the ramp-up curve is linear, but by customising it we can achieve exponential type of curve.
The result (sorry I don't have a picture to illustrate that) is that 10% of traffic still too much : our latency increase a lot and impact our users.
To check if the 2 parameters mentioned above impact our traffic, we used an EnvoyFilter
that we applied to 3 clients of our app.
Deploying this config from only a portion of our traffic (roughly 75%) with a slow_start_window
of 3 minutes and a min_weight_percent
of 1% we have been able to observe an impact were we can see the progressive ramp-up of the traffic.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I do not think runtime is must https://www.envoyproxy.io/docs/envoy/latest/api-v3/config/core/v3/base.proto#envoy-v3-api-msg-config-core-v3-runtimedouble
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
it is actually mandatory if we want to use the agression
parameter. If I try this EnvoyFilter :
apiVersion: networking.istio.io/v1alpha3
kind: EnvoyFilter
metadata:
name: h2-control
spec:
configPatches:
- applyTo: CLUSTER
match:
cluster:
name: "outbound|8080||http-echo.infra.svc.cluster.local"
patch:
operation: MERGE
value:
name: "outbound|8080||http-echo.infra.svc.cluster.local"
lbPolicy: LEAST_REQUEST
leastRequestLbConfig:
slowStartConfig:
min_weight_percent: { value: 99 }
slow_start_window: "12s"
aggression: { default_value: 2 }
workloadSelector:
labels:
app: landing-f.gaudet
Then I have this warning in the logs :
landing-f.gaudet istio-proxy {"level":"warning","time":"2024-06-04T04:41:49.406026Z","scope":"envoy config","msg":"gRPC config for type.googleapis.com/envoy.config.cluster.v3.Cluster rejected: Proto constraint validation failed (ClusterValidationError.LeastRequestLbConfig: embedded message failed validation | caused by LeastRequestLbConfigValidationError.SlowStartConfig: embedded message failed validation | caused by SlowStartConfigValidationError.Aggression: embedded message failed validation | caused by RuntimeDoubleValidationError.RuntimeKey: value length must be at least 1 characters):
and the config is not applied. However, if I setup the runtime key
leastRequestLbConfig:
slowStartConfig:
min_weight_percent: { value: 99 }
slow_start_window: "12s"
aggression: { default_value: 2, runtime_key: "(" }
Then the config is successfully applied :
istioctl pc cluster landing-f.gaudet.infra --fqdn http-echo.infra.svc.cluster.local -ojson | jq ".[].leastRequestLbConfig"
{
"slowStartConfig": {
"slowStartWindow": "12s",
"aggression": {
"defaultValue": 2,
"runtimeKey": "("
},
"minWeightPercent": {
"value": 99
}
}
}
Just to be sure to understand your request : you mean wrap all slowStart fields into a new message struct ? What is the best practice you would recommend dealing with such proto change ?
|
1ac690d
to
78a2ea5
Compare
Yes @frgaudet i mean something like this |
Do you need separate value of aggression for each service? |
Potentially yes, depending on the Java code, the warmup could be tweaked differently |
@hzxuzhonghu @ramaraochavali do you need something else ? |
@howardjohn do you think this could be added in the next release to come ? |
21db8c8
to
9442f42
Compare
Signed-off-by: Frédéric Gaudet <[email protected]>
Signed-off-by: Frédéric Gaudet <[email protected]>
Signed-off-by: Frédéric Gaudet <[email protected]>
Co-authored-by: John Howard <[email protected]>
Signed-off-by: Frédéric Gaudet <[email protected]>
1201016
to
0246298
Compare
Signed-off-by: Frédéric Gaudet <[email protected]>
Signed-off-by: Frédéric Gaudet <[email protected]>
Signed-off-by: Frédéric Gaudet <[email protected]>
To keep this API as simple as possible, I defined
wdyt ? Thanks for your reviews :) |
/ok-to-test |
Signed-off-by: Frédéric Gaudet <[email protected]>
/retest |
Signed-off-by: Frédéric Gaudet <[email protected]>
/retest |
Signed-off-by: Frédéric Gaudet <[email protected]>
/retest |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks!
@ramaraochavali or @hzxuzhonghu can you please have a look, your approval is needed to move forward, thanks ! |
Adding envoy slowStartMode aggression and min_weight_percent parameters to DestinationRules API
Fixes #3215
Next PR to come on the cluster_traffic_policy side
First time I contribute here, hope this is good :)