Support scaling HPA to/from zero pods for object/external metrics #74526
Conversation
Hi @DXist. Thanks for your PR. I'm waiting for a kubernetes member to verify that this patch is reasonable to test. If it is, they should reply with /ok-to-test. Once the patch is verified, the new status will be reflected by the ok-to-test label. I understand the commands that are listed here. Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository.
/ok-to-test
@DXist I am also working on this issue and I have discussed the proposal below in the SIG weekly meeting. As per that discussion, @mwielgus will discuss this with the networking team; based on that we can modify the PR, and I'll be happy to coordinate with you on this. https://docs.google.com/document/d/1p_Xlk8e5V32WOBVeRbJqbxhEKcJy_mDu9JgVp5ZNDxs/
This PR handles a more general case and is not specific to HTTP workloads. If an HTTP load balancer/Ingress could buffer requests and export the number of queued requests for a given service as a custom metric, then HPA could use this metric for scaling. I see that there could be some signaling about metric changes, or a push-based model of metric delivery to HPA, to speed up the scaling process.
Couple of high-level items:
- How was it tested?
- Why didn't you handle: https://github.com/kubernetes/kubernetes/blob/master/pkg/controller/podautoscaler/horizontal.go#L567
- Did you consider explicitly validating that there is Object/External metric configured?
- Maybe we should have a comment update in the API.
There are unit tests that cover scaling up from zero pods and scaling down to zero for custom object and external metrics. For end-to-end testing I
Only one metric value (worker utilization) was available when there were no pods. GitHub collapsed the diff for this module. I've added an extra condition to enable scaling when
No, I didn't. It seems to be a good protection from HPA misconfiguration.
A comment update should be near MinReplicas?
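For context, here is a rough sketch of the scale-from-zero behaviour being discussed (illustrative only, not the actual code in this PR): with zero running pods, only Object/External metrics can still report values, any positive value should trigger a scale-up, and the usual tolerance check only applies once there is a non-zero replica count to compare against (see also the note about --horizontal-pod-autoscaler-tolerance in the PR description below). The function name and parameters are made up for illustration.

```go
package example

import "math"

// desiredReplicas is a rough sketch (not the code in this PR) of the behaviour
// discussed above: with zero running pods, any positive Object/External metric
// value should trigger a scale-up sized directly from the target, while the
// tolerance check only applies when there is an existing replica count to
// compare against.
func desiredReplicas(currentReplicas int32, metricValue, targetPerReplica, tolerance float64) int32 {
	if currentReplicas == 0 {
		if metricValue <= 0 {
			return 0 // queue is empty, stay scaled to zero
		}
		// Scale straight to whatever the target implies, bypassing tolerance.
		return int32(math.Ceil(metricValue / targetPerReplica))
	}
	usageRatio := metricValue / (targetPerReplica * float64(currentReplicas))
	if math.Abs(1.0-usageRatio) <= tolerance {
		return currentReplicas // change is within tolerance, keep the current size
	}
	return int32(math.Ceil(usageRatio * float64(currentReplicas)))
}
```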
Added metrics validation for
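To illustrate the validation question raised above (whether an Object or External metric must be configured when minReplicas is 0), here is a hypothetical sketch; it is not necessarily the exact rule this PR adds, and the function name is made up.

```go
package example

import (
	"fmt"

	autoscalingv2 "k8s.io/api/autoscaling/v2beta2"
)

// checkScaleToZeroMetrics is a hypothetical illustration: an HPA with
// minReplicas set to 0 only makes sense if at least one Object or External
// metric is configured, because those are the only metric sources that still
// produce values when no pods exist.
func checkScaleToZeroMetrics(spec autoscalingv2.HorizontalPodAutoscalerSpec) error {
	if spec.MinReplicas == nil || *spec.MinReplicas != 0 {
		return nil // the rule only applies to scale-to-zero configurations
	}
	for _, m := range spec.Metrics {
		if m.Type == autoscalingv2.ObjectMetricSourceType || m.Type == autoscalingv2.ExternalMetricSourceType {
			return nil
		}
	}
	return fmt.Errorf("minReplicas may be 0 only when at least one Object or External metric is configured")
}
```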
This PR may require API review. If so, when the changes are ready, complete the pre-review checklist and request an API review. Status of requested reviews is tracked in the API Review project.
API LGTM. @mwielgus ping me when LGTM and I will approve.
/lgtm
/retest
Just stopping by after six months to see what, if anything, can be done to move this otherwise perfectly reasonable and working feature from Alpha to GA. This would be great to have on hosted K8S environments like GKE or AKS or EKS or... basically anywhere.
Was just reading the release notes for 1.19,
Sounds like a good topic for the SIG Autoscaling meeting :)
@lavalamp were there results from the meeting?
Apologies, that was super unclear of me-- I realize now it reads like I was going to take it to the SIG meeting, but I was actually only trying to suggest that it was the logical next thing to do. I don't personally have time to contribute to this other than the occasional review.
We are also using this feature heavily and I would be happy to contribute time to drive this towards GA. While I've contributed some changes to k8s, I've never been part of an API graduation, so I would need some pointers on how to do this if help is needed.
API changes need an associated doc describing the change and graduation requirements, called a KEP (Kubernetes Enhancement Proposal). The key things we'd be looking for are evidence of sufficient test coverage, any performance implications of the change, and upgrade/downgrade/skew compatibility implications. It looks like #69687 (comment) might have been the original proposal associated with the alpha change, but before graduating to beta and being enabled by default, that should be ported to a KEP and the test/performance/upgrade questions should be answered. A template is available at https://github.com/kubernetes/enhancements/tree/master/keps/NNNN-kep-template and would be placed in https://github.com/kubernetes/enhancements/tree/master/keps/sig-autoscaling
@johanneswuerbach @DXist is this something one of you can lead? It would be my first contribution, and it's probably not a great thing to take a first bite out of. If not, I can attempt a first pass at the proposal.
@jeffreybrowning go ahead. I've changed projects and work in another context now.
I planned to get something drafted on Friday, but I've also never worked on a KEP before. If you have more time, feel free to pick it up @jeffreybrowning :-)
I am embroiled in an update to another package right now -- there would likely be a sizeable delay until I get to this.
I started the official proposal here: kubernetes/enhancements#2021. Looking forward to any of your comments or feedback on the KEP itself here: kubernetes/enhancements#2022.
What type of PR is this?
/kind feature
What this PR does / why we need it:
This PR targets worker deployments that consume queues, where scaling is based on an object or external metric that depends on queue size. When workers are idle, it is possible to scale the corresponding deployment to zero replicas and save resources.
This technique is especially useful when workers request GPU resources and the number of different idle worker types exceeds the number of available GPUs.
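To make the use case concrete, here is a rough sketch of an HPA object that scales a worker Deployment on an external queue-length metric with minReplicas set to 0, expressed with the autoscaling/v2beta2 Go types and requiring the HPAScaleToZero feature gate. The names, metric, and target value are placeholders for illustration, not anything prescribed by this PR.

```go
package example

import (
	autoscalingv2 "k8s.io/api/autoscaling/v2beta2"
	"k8s.io/apimachinery/pkg/api/resource"
	metav1 "k8s.io/apimachinery/pkg/apis/meta/v1"
)

// exampleHPA builds an HPA that scales a hypothetical "gpu-worker" Deployment
// on a made-up external queue-length metric, and is allowed to go all the way
// down to zero replicas when the queue is empty.
func exampleHPA() *autoscalingv2.HorizontalPodAutoscaler {
	minReplicas := int32(0)
	return &autoscalingv2.HorizontalPodAutoscaler{
		ObjectMeta: metav1.ObjectMeta{Name: "gpu-worker"},
		Spec: autoscalingv2.HorizontalPodAutoscalerSpec{
			ScaleTargetRef: autoscalingv2.CrossVersionObjectReference{
				APIVersion: "apps/v1",
				Kind:       "Deployment",
				Name:       "gpu-worker",
			},
			MinReplicas: &minReplicas, // scale to zero while the queue is empty
			MaxReplicas: 8,
			Metrics: []autoscalingv2.MetricSpec{{
				Type: autoscalingv2.ExternalMetricSourceType,
				External: &autoscalingv2.ExternalMetricSource{
					Metric: autoscalingv2.MetricIdentifier{
						Name: "queue_messages_ready", // placeholder metric name
					},
					Target: autoscalingv2.MetricTarget{
						Type:         autoscalingv2.AverageValueMetricType,
						AverageValue: resource.NewQuantity(10, resource.DecimalSI),
					},
				},
			}},
		},
	}
}
```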
Which issue(s) this PR fixes:
Fixes #69687
Special notes for your reviewer:
The PR is based on changes made in #61423
There is no conservative scale-down behaviour as introduced in #61423 ("In Horizontal Pod Autoscaler Controller when scaling on multiple metrics, handle invalid metrics").
Scaling down works even if we have just one metric value.
The --horizontal-pod-autoscaler-tolerance flag is ignored when scaling up from zero pods.
Does this PR introduce a user-facing change?: