Horizontal Pod Autoscaling vs Vertical Pod Autoscaling
In our last blog post on autoscaling, Horizontal Pod Autoscaling in Kubernetes with Prometheus, we looked at horizontal autoscaling of Kubernetes pods and how we can allow HPAs to ingest metrics from Prometheus.
In times where additional capacity is needed, horizontal scaling gives us additional copies of the same computational unit. Instead of allowing a single unit to handle more requests, the load is reduced per unit as requests are distributed across a larger set.
When one first thinks about what vertical autoscaling might mean, one would assume that a vertical pod autoscaler would be analogous to vertical scaling of a host machine or VM: in other words, increasing the amount of resource on that machine. If a VM has 4GB of memory and is using 3.8GB, then make an additional 2GB available to that machine.
This might make sense in, for example, the vSphere world where we can set resource pools for VMs - but in the Kubernetes world, this doesn’t quite make sense.
After all, Kubernetes is only a scheduler that sits on top of a host. The Kubelet determines and reports the amount of resource a host has installed, and using these values and the reported resource required by running workloads, the scheduler determines if a workload can fit onto a node.
What form could vertical pod autoscaling take in theory?
Well, since the scheduler can only work effectively if there are reported requests and limits for workloads, setting requests that are true to real-life application usage allows the scheduler to guarantee that amount of resource, out of the pool available in the cluster, to that application on a specific node. This may prevent workloads from scheduling if no cluster node has a gap big enough to fit the application, but again, it guarantees that we can run existing workloads on the cluster without overwhelming the available resources and bringing nodes down.
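To make the "does it fit?" idea concrete, here is a minimal Python sketch of that check. It is not the real scheduler, which weighs many more factors; the Resources type, the fits function and the node sizes are all made up for illustration.

```python
from dataclasses import dataclass


@dataclass
class Resources:
    cpu_millicores: int
    memory_bytes: int


def fits(allocatable: Resources, running_requests: list, pod: Resources) -> bool:
    """Return True if the pod's requests fit in what is left on the node.

    Only requests are counted, not live usage: this is why requests that
    match real usage matter so much.
    """
    used_cpu = sum(r.cpu_millicores for r in running_requests)
    used_mem = sum(r.memory_bytes for r in running_requests)
    return (
        used_cpu + pod.cpu_millicores <= allocatable.cpu_millicores
        and used_mem + pod.memory_bytes <= allocatable.memory_bytes
    )


MI = 1024 ** 2
# Hypothetical node: 2 CPU cores and 4Gi of memory available to pods.
node = Resources(cpu_millicores=2000, memory_bytes=4096 * MI)
running = [Resources(500, 1024 * MI), Resources(1000, 2048 * MI)]

print(fits(node, running, Resources(400, 512 * MI)))  # True
print(fits(node, running, Resources(600, 512 * MI)))  # False: 2100m > 2000m
```

If requests are set far below real usage, a check like this happily packs more pods onto the node than it can actually serve, which is the problem VPA sets out to address.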
While it’s possible to have VPA and HPA target the same workload, VPA works exclusively with CPU and memory resources, so you shouldn’t use those same metrics for horizontal scaling. Since the VPA and HPA controllers are not aware of each other (at the moment), both controllers may try to apply incompatible changes to the workload. VPA can be complementary to horizontal autoscaling, but you must assign non-computational metrics to HPA.
With that out of the way, let’s take a dive into how the VPA works.
VPA Modes
VPA has several operational modes, depending on how aggressively you would like pods to be updated with new request values:
- Auto: Will assign request values both at pod startup and while the pod is live, using the specified update mechanism. At the moment, this is equivalent to “recreate”, as there isn’t currently an “in-place” mechanism for updating request values on live pods.
- Recreate: Will assign request values at pod startup and, if the recommended values differ significantly from the current request values, evict the pod so that a new one is created.
- Initial: Will only assign request values when the pod is initially created.
- Off: The VPA will continue to generate recommended request values for pods but will defer the application of these values to the cluster operator.
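As a quick summary of the four modes, the sketch below encodes which of the two behaviours (injecting requests at pod creation, and evicting live pods) each mode allows. The enum and function names are our own shorthand for this post, not identifiers from the VPA codebase.

```python
from enum import Enum


class UpdateMode(Enum):
    AUTO = "Auto"
    RECREATE = "Recreate"
    INITIAL = "Initial"
    OFF = "Off"


def sets_requests_at_admission(mode: UpdateMode) -> bool:
    """Requests are injected when a pod is created in every mode except Off."""
    return mode in {UpdateMode.AUTO, UpdateMode.RECREATE, UpdateMode.INITIAL}


def may_evict_running_pods(mode: UpdateMode) -> bool:
    """Only Auto and Recreate allow live pods to be evicted and replaced."""
    return mode in {UpdateMode.AUTO, UpdateMode.RECREATE}


for mode in UpdateMode:
    print(mode.value, sets_requests_at_admission(mode), may_evict_running_pods(mode))
```

In every mode, including Off, the recommendations themselves are still generated; the mode only controls what is done with them.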
VPA Components
Unlike the HPA controller, the components which realise vertical pod autoscaling aren’t installed in Kube by default, so we will need to install the components which make up the VPA architecture. There are three controllers that implement vertical pod autoscaling:
Recommender
The initial task required for autoscaling is to ingest metrics and determine the current usage of the workload. Based on current and past metrics for resources, the recommender will determine a “recommended” set of CPU and memory values for each container.
By default, the source of these metrics is metrics-server. As metrics-server is designed to store metrics in-memory, it only provides metrics for the last 10 minutes. To give the recommender a longer history of the service it is monitoring, you can also plug in historical metrics from a time-series database like Prometheus.
Updater
As determining what the “correct” requests should be is delegated to the recommender, the updater controller compares the current and recommended requests of each deployment that has an associated vertical pod autoscaler object. If the update mode of the deployment’s corresponding VPA object is set to Auto or Recreate, the updater controller facilitates the creation of new pods that contain the new recommended requests. It doesn’t directly update pods with new recommended values; instead, it instructs the Kube API that a particular pod should be evicted from the cluster. It relies on other controllers in the Kube master plane to take care of creating the new pod and making sure the pod has the new desired request values.
Admission Controller
The VPA also includes a mutating admission webhook (it is worth brushing up on what admission controllers are and their purpose if you haven’t already). If a VPA object’s mode is set to Auto, Recreate or Initial, this webhook will inject the current request values generated by the Recommender at the time a pod is admitted to the cluster.
VPA Object
Let’s take a look at a VPA object:
apiVersion: autoscaling.k8s.io/v1beta2
kind: VerticalPodAutoscaler
metadata:
  annotations:
  name: test-vpa
  namespace: dev
spec:
  resourcePolicy:
    containerPolicies:
    - containerName: '*'
      maxAllowed:
        memory: 1Gi
      minAllowed:
        memory: 500Mi
  targetRef:
    apiVersion: extensions/v1beta1
    kind: Deployment
    name: test
  updatePolicy:
    updateMode: Recreate
There are a couple of interesting things here.
Using resourcePolicy, we can put some boundaries on how widely the recommender can vary the CPU and memory resources for the pod by assigning minimum and maximum allowable values. How these resources should be bounded depends on how many containers are running in a single pod.
If a pod only has a single container, then setting a wildcard value should be fine:
containerName: '*'
If you want to apply this to pods with sidecars, then, of course, you will need to set boundary values on a per-container basis. It doesn’t make sense to apply a wildcard value, as this will apply the same memory values to every single container, and I’m sure you don’t need 500-1000MB for every single container in your pod!
If the recommender controller is able to ingest metrics, at roughly five-minute intervals it will generate recommendations and write them into the status block of the VPA object. Four different bounds are generated:
- lowerBound: the minimum recommended CPU and memory requests for a container. It is not recommended to use this as a baseline for requests.
- target: the baseline recommended CPU and memory requests for that container.
- upperBound: the maximum recommended CPU and memory requests.
- uncappedTarget: the recommended CPU and memory requests, but without taking the restrictions defined in ContainerResourcePolicy into consideration.
For example:
status:
  conditions:
  - lastTransitionTime: 2019-06-12T14:29:00Z
    status: "True"
    type: RecommendationProvided
  recommendation:
    containerRecommendations:
    - containerName: test
      lowerBound:
        cpu: 25m
        memory: 262144k
      target:
        cpu: 25m
        memory: 262144k
      uncappedTarget:
        cpu: 25m
        memory: 262144k
      upperBound:
        cpu: 644m
        memory: 1Gi
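The relationship between uncappedTarget and target can be pictured as a simple clamp against the minAllowed and maxAllowed values in the resource policy. The Python below is only a sketch of that idea, using hypothetical memory values in Mi; it is not the recommender’s actual code.

```python
from typing import Optional


def cap_recommendation(
    uncapped: int,
    min_allowed: Optional[int] = None,
    max_allowed: Optional[int] = None,
) -> int:
    """Clamp an uncapped recommendation into [min_allowed, max_allowed].

    A bound of None means the resource policy sets no limit on that side.
    """
    value = uncapped
    if min_allowed is not None:
        value = max(value, min_allowed)
    if max_allowed is not None:
        value = min(value, max_allowed)
    return value


# Hypothetical memory figures in Mi, with minAllowed=500 and maxAllowed=1024.
print(cap_recommendation(100, min_allowed=500, max_allowed=1024))   # 500
print(cap_recommendation(1500, min_allowed=500, max_allowed=1024))  # 1024
print(cap_recommendation(700, min_allowed=500, max_allowed=1024))   # 700
```

Comparing target with uncappedTarget is therefore a handy way to spot when your policy bounds, rather than observed usage, are deciding the request.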
Installing
In order to use VPA, you should be using a version of Kube which supports mutating admission webhooks on the API server. This means a cluster that is at least version 1.9. In addition, mutating webhooks need to be enabled on the API server by including MutatingAdmissionWebhook as a value when defining the --admission-control flag. The order in which admission controllers are listed in this flag matters, so check the Using Admission Controllers documentation.
There are also a couple of VPA-specific requirements:
While VPA version 0.3 requires Kube version 1.9 and over, VPA version 0.4 and 0.5 require a cluster that is version 1.11 and over.
For the recommender to be able to ingest pod metrics, metrics-server also needs to be running on the cluster. As noted earlier, metrics-server only holds recent metrics, so you can plug in a time-series database like Prometheus if you want the recommender to work from a longer history.
Metrics-server isn’t typically installed on the cluster by default, but the easiest way to install it is using Helm:
helm install --namespace kube-system \
--name metrics-server stable/metrics-server
Alternatively, Minikube includes it as a bundled add-on:
minikube addons enable metrics-server
You can confirm metrics-server is operating as expected when you are able to view current resource consumption using kubectl top:
kubectl top nodes
Which outputs:
NAME CPU(cores) CPU% MEMORY(bytes) MEMORY%
master0 411m 13% 980Mi 17%
You can then install the VPA controllers. The official VPA repo has a bash script you can run to install the Kube resources for each controller. Alternatively, you can clone the Helm chart that we’ve made available as part of this blog post:
git clone [email protected]:livewyer-ops/verticalpodautoscaler.git
The only thing you may need to configure in the Helm chart is the location of your Prometheus instance. This is applied using the prometheus.url value in values.yaml:
prometheus:
  url: http://prometheus.monitoring.svc
Now you can install the Helm chart:
helm install --namespace kube-system \
--name vpa .
When you install the helm chart, you will see pods for the three VPA controllers:
kubectl get pods -n kube-system
Which outputs:
NAME READY STATUS RESTARTS AGE
autoscale-vpa-admissioncontroller-74d489d767-hnp9c 1/1 Running 0 26m
autoscale-vpa-recommender-5944df6c7f-4zht4 1/1 Running 0 26m
autoscale-vpa-updater-cd668b489-jqc6b 1/1 Running 0 26m
metrics-server-77fddcc57b-c2mzc 1/1 Running 3 7d21h
There is a secret called vpa-tls-certs, which is mounted into the admission controller and contains a cert bundle.
In the install script, this bundle is generated using a shell script, but in the Helm chart we have the luxury of using the Sprig library, and so this process is scripted using template functions:
{{- $altNames := list ( printf "%s.%s" (include "vpa.name" .) .Release.Namespace ) ( printf "%s.%s.svc" (include "vpa.name" .) .Release.Namespace ) -}}
{{- $ca := genCA "vpa-ca" 3650 -}}
{{- $server := genSignedCert ( include "vpa.name" . ) nil $altNames 3650 $ca -}}
The ca object generated by genCA will contain a CA certificate and key, so we can embed these in place in the Helm template:
caCert.pem: {{ b64enc $ca.Cert }}
caKey.pem: {{ b64enc $ca.Key }}
The server object will contain a certificate and key signed by the CA we just created and referenced, and so these we can also embed in the Helm template:
serverCert.pem: {{ b64enc $server.Cert }}
serverKey.pem: {{ b64enc $server.Key }}
We can also confirm that the admission controller is registered as a mutating webhook by checking its log:
I0617 13:36:44.847929 7 v1beta1_fetcher.go:84] Initial VPA v1beta1 synced successfully
I0617 13:36:44.851884 7 config.go:62] client-ca-file=-----BEGIN CERTIFICATE-----
[...]
-----END CERTIFICATE-----
I0617 13:36:54.877379 7 config.go:131] Self registration as MutatingWebhook succeeded.
How do I use Vertical Pod Autoscaling?
Once the three controllers are deployed, we are ready to start using VPA.
For this demonstration, we will use the modified php-apache container from the Horizontal Pod Autoscaler Walkthrough. If you’re not familiar with that example, this apache image is modified to create some additional computational load when its index.php is accessed.
First of all, deploy the php-apache image and a service for the php-apache pods. A VPA object is also deployed:
apiVersion: autoscaling.k8s.io/v1beta2
kind: VerticalPodAutoscaler
metadata:
  name: php-apache-vpa
  namespace: dev
spec:
  targetRef:
    apiVersion: apps/v1
    kind: Deployment
    name: php-apache
---
apiVersion: v1
kind: Service
metadata:
  labels:
    run: php-apache
  name: php-apache
  namespace: dev
spec:
  ports:
  - port: 80
    protocol: TCP
    targetPort: 80
  selector:
    run: php-apache
  type: ClusterIP
---
apiVersion: apps/v1
kind: Deployment
metadata:
  generation: 1
  labels:
    run: php-apache
  name: php-apache
  namespace: dev
spec:
  progressDeadlineSeconds: 600
  replicas: 2
  revisionHistoryLimit: 2
  selector:
    matchLabels:
      run: php-apache
  strategy:
    rollingUpdate:
      maxSurge: 25%
      maxUnavailable: 25%
    type: RollingUpdate
  template:
    metadata:
      labels:
        run: php-apache
    spec:
      containers:
      - image: gcr.io/google_containers/hpa-example
        imagePullPolicy: Always
        name: php-apache
        resources:
          requests:
            cpu: 1m
This should generate a deployment, service, and VPA object for you:
verticalpodautoscaler.autoscaling.k8s.io/php-apache-vpa created
service/php-apache created
deployment.apps/php-apache created
kubectl get pods -n dev
NAME READY STATUS RESTARTS AGE
php-apache-59759c4b98-rczhn 1/1 Running 0 37s
php-apache-59759c4b98-swvcv 1/1 Running 0 37s
In order to use VPA, it seems to be a requirement that a targeted workload run at least two replicas. In my testing, the updater was unable to evict in cases where there was only a single replica deployed, as indicated by this log:
I0617 13:39:48.819919 6 pods_eviction_restriction.go:209] too few replicas for ReplicaSet dev/php-apache-7bfdf49c69. Found 1 live pods
In order to demonstrate the usage of the VPA and make it more likely that pod eviction would take place, the pod for this test deployment is assigned a default CPU request value of 1 millicore:
kubectl get pod -n dev \
-o=custom-columns=NAME:.metadata.name,PHASE:.status.phase,CPU-REQUEST:.spec.containers\[0\].resources.requests.cpu
NAME PHASE CPU-REQUEST
php-apache-7bfdf49c69-gf2jd Running 1m
php-apache-7bfdf49c69-pjb98 Running 1m
Once the workload is deployed on the cluster, the recommender will detect the new VPA object and fetch the metrics available for the pods and containers it targets via the metrics API. When this is completed, the recommender will update the status block of the VPA object with its recommendations:
kubectl describe vpa php-apache-vpa -n dev
Name: php-apache-vpa
Namespace: dev
Labels: <none>
Annotations: kubectl.kubernetes.io/last-applied-configuration={"apiVersion":"autoscaling.k8s.io/v1beta2","kind":"VerticalPodAutoscaler","metadata":{"annotations":{},"name":"php-apache-vpa","namespace":"dev"},"spec...
API Version: autoscaling.k8s.io/v1beta2
Kind: VerticalPodAutoscaler
Metadata:
Creation Timestamp: 2019-06-17T11:29:25Z
Generation: 4
Resource Version: 288396
Self Link: /apis/autoscaling.k8s.io/v1beta2/namespaces/dev/verticalpodautoscalers/php-apache-vpa
UID: 2954b910-90f3-11e9-aae4-080027655ff0
Spec:
Resource Policy:
Container Policies:
Container Name: *
Max Allowed:
Memory: 1Gi
Target Ref:
API Version: extensions/v1beta1
Kind: Deployment
Name: php-apache
Update Policy:
Update Mode: Recreate
Status:
Conditions:
Last Transition Time: 2019-06-17T11:30:04Z
Status: True
Type: RecommendationProvided
Recommendation:
Container Recommendations:
Container Name: php-apache
Lower Bound:
Cpu: 25m
Memory: 262144k
Target:
Cpu: 25m
Memory: 262144k
Uncapped Target:
Cpu: 25m
Memory: 262144k
Upper Bound:
Cpu: 5291m
Memory: 1Gi
Because we have sent no load to the php-apache replicas yet, the metrics reported for these pods are minimal, and so the recommended resources will be the minimum that can be set. For CPU this seems to be 25m, and for memory 262144k, which works out to 250Mi.
Now let’s generate some load on php-apache. Open a new terminal window and run a Busybox container in the same namespace:
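The memory figures in these recommendations mix suffixes (262144k, 1Gi) with plain byte counts, so it can help to convert them to a common unit. Below is a small, simplified Python helper for doing that. It only handles whole-number quantities with the suffixes listed; the to_bytes name is ours, and real Kubernetes quantities support more forms than this.

```python
# Decimal suffixes are powers of 1000; binary suffixes are powers of 1024.
DECIMAL = {"k": 10**3, "M": 10**6, "G": 10**9}
BINARY = {"Ki": 2**10, "Mi": 2**20, "Gi": 2**30}


def to_bytes(quantity: str) -> int:
    """Convert a whole-number memory quantity such as '262144k' to bytes."""
    # Check the two-letter binary suffixes first, then the decimal ones.
    for suffix, factor in list(BINARY.items()) + list(DECIMAL.items()):
        if quantity.endswith(suffix):
            return int(quantity[: -len(suffix)]) * factor
    return int(quantity)  # no suffix: already a byte count


print(to_bytes("262144k"))          # 262144000
print(to_bytes("262144k") / 2**20)  # 250.0
print(to_bytes("1Gi"))              # 1073741824
```

Note the lowercase k: it means 1000, not 1024, which is why 262144k is 250Mi rather than 256Mi.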
kubectl run -i --tty load-generator -n dev --image=busybox:1.27 /bin/sh
When you have an interactive shell with Busybox, run a looped wget that is targeted towards our php-apache pods. If successful, you should see OK! flooding the output log:
while true; do wget -q -O- http://php-apache.dev.svc.cluster.local; done
Wait a few minutes for the recommender to run again, and perform another kubectl describe to check the current recommendation. After a minute or so of load, the CPU target has increased to ~500m:
kubectl describe vpa php-apache-vpa -n dev
Name: php-apache-vpa
Namespace: dev
Labels: <none>
Annotations: kubectl.kubernetes.io/last-applied-configuration={"apiVersion":"autoscaling.k8s.io/v1beta2","kind":"VerticalPodAutoscaler","metadata":{"annotations":{},"name":"php-apache-vpa","namespace":"dev"},"spec...
API Version: autoscaling.k8s.io/v1beta2
Kind: VerticalPodAutoscaler
Metadata:
Creation Timestamp: 2019-06-17T15:55:53Z
Generation: 4
Resource Version: 329553
Self Link: /apis/autoscaling.k8s.io/v1beta2/namespaces/dev/verticalpodautoscalers/php-apache-vpa
UID: 62a76e03-9118-11e9-aae4-080027655ff0
Spec:
Target Ref:
API Version: apps/v1
Kind: Deployment
Name: php-apache
Status:
Conditions:
Last Transition Time: 2019-06-17T15:56:08Z
Status: True
Type: RecommendationProvided
Recommendation:
Container Recommendations:
Container Name: php-apache
Lower Bound:
Cpu: 25m
Memory: 262144k
Target:
Cpu: 587m
Memory: 262144k
Uncapped Target:
Cpu: 587m
Memory: 262144k
Upper Bound:
Cpu: 17662m
Memory: 664103245
Events: <none>
We can see from the logs for the recommender that when a recommendation is generated, this result is written to the VPA in the form of a patch request on the VPA object on the cluster:
I0617 15:45:08.894873 1 metrics_client.go:69] 30 podMetrics retrieved for all namespaces
I0617 15:45:08.897170 1 cluster_feeder.go:376] ClusterSpec fed with #60 ContainerUsageSamples for #30 containers
I0617 15:45:08.897298 1 recommender.go:183] ClusterState is tracking 30 PodStates and 1 VPAs
I0617 15:45:08.898760 1 request.go:897] Request Body: [{"op":"add","path":"/status","value":{"recommendation":{"containerRecommendations":[{"containerName":"php-apache","target":{"cpu":"627m","memory":"262144k"},"lowerBound":{"cpu":"187m","memory":"262144k"},"upperBound":{"cpu":"46399m","memory":"993517772"},"uncappedTarget":{"cpu":"627m","memory":"262144k"}}]},"conditions":[{"type":"RecommendationProvided","status":"True","lastTransitionTime":"2019-06-17T15:37:08Z"}]}}]
I0617 15:45:08.924479 1 round_trippers.go:405] PATCH https://10.96.0.1:443/apis/autoscaling.k8s.io/v1beta2/namespaces/dev/verticalpodautoscalers/php-apache-vpa 200 OK in 24 milliseconds
After a couple of minutes, you should see the pods get recreated. Check the CPU requests on these new pods; they should match the target recommendation:
kubectl get pod -n dev -o=custom-columns=NAME:.metadata.name,PHASE:.status.phase,CPU-REQUEST:.spec.containers\[0\].resources.requests.cpu
NAME PHASE CPU-REQUEST
load-generator-66fb94857f-d4q2b Running <none>
php-apache-7bfdf49c69-p5klm Running 587m
php-apache-7bfdf49c69-xp56w Running 587m
If we check the logs for the updater, we can see that it has noticed there are new VPA-targeted pods, and that these pods have been evicted from the cluster:
I0617 13:36:48.818914 6 api.go:99] Initial VPA synced successfully
I0617 13:36:48.819475 6 reflector.go:131] Starting reflector *v1.Pod (1h0m0s) from k8s.io/autoscaler/vertical-pod-autoscaler/pkg/updater/logic/updater.go:196
I0617 13:36:48.819532 6 reflector.go:169] Listing and watching *v1.Pod from k8s.io/autoscaler/vertical-pod-autoscaler/pkg/updater/logic/updater.go:196
I0617 13:40:48.819590 6 update_priority_calculator.go:118] pod accepted for update php-apache-7bfdf49c69-p86nb with priority 2.62144001099e+11
I0617 13:40:48.819673 6 update_priority_calculator.go:118] pod accepted for update php-apache-7bfdf49c69-8b9m9 with priority 2.62144001099e+11
I0617 13:40:48.819692 6 updater.go:147] evicting pod php-apache-7bfdf49c69-p86nb
I0617 13:40:48.844718 6 updater.go:205] Event(v1.ObjectReference{Kind:"Pod", Namespace:"dev", Name:"php-apache-7bfdf49c69-p86nb", UID:"bbb1f53c-9103-11e9-aae4-080027655ff0", APIVersion:"v1", ResourceVersion:"306480", FieldPath:""}): type: 'Normal' reason: 'EvictedByVPA' Pod was evicted by VPA Updater to apply resource recommendation.
I0617 13:41:48.819643 6 update_priority_calculator.go:118] pod accepted for update php-apache-7bfdf49c69-8b9m9 with priority 2.62144001099e+11
I0617 13:41:48.819682 6 update_priority_calculator.go:118] pod accepted for update php-apache-7bfdf49c69-j9zl4 with priority 2.62144001099e+11
I0617 13:41:48.819689 6 updater.go:147] evicting pod php-apache-7bfdf49c69-8b9m9
I0617 13:41:48.833147 6 updater.go:205] Event(v1.ObjectReference{Kind:"Pod", Namespace:"dev", Name:"php-apache-7bfdf49c69-8b9m9", UID:"618670f6-9105-11e9-aae4-080027655ff0", APIVersion:"v1", ResourceVersion:"308483", FieldPath:""}): type: 'Normal' reason: 'EvictedByVPA' Pod was evicted by VPA Updater to apply resource recommendation.
I0617 13:42:07.321571 6 reflector.go:357] k8s.io/autoscaler/vertical-pod-autoscaler/pkg/target/v1beta1_fetcher.go:80: Watch close - *v1beta1.VerticalPodAutoscaler total 4 items received
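The updater’s behaviour in these logs can be approximated with the sketch below. Two things here are assumptions on our part rather than something the logs prove: that a pod becomes an eviction candidate when its current request falls outside the lowerBound to upperBound range, and that the minimum replica count is 2 (inferred from the "too few replicas" message seen earlier with a single replica). The should_evict function is hypothetical, not the updater’s real code.

```python
def should_evict(
    current_request: int,
    lower_bound: int,
    upper_bound: int,
    live_pods: int,
    min_replicas: int = 2,  # assumption, inferred from the single-replica test
) -> bool:
    """Rough approximation of the updater's eviction decision (CPU in millicores)."""
    # Assumption: only requests outside the recommended range trigger eviction.
    out_of_range = current_request < lower_bound or current_request > upper_bound
    # Never evict if that would leave too few live replicas.
    enough_replicas = live_pods >= min_replicas
    return out_of_range and enough_replicas


# CPU request of 1m against the idle recommendation (25m to 5291m):
print(should_evict(1, 25, 5291, live_pods=2))      # True
print(should_evict(1, 25, 5291, live_pods=1))      # False: too few replicas
# A pod already running with the 587m target is left alone:
print(should_evict(587, 25, 17662, live_pods=2))   # False
```

This is also why the test deployment starts with a 1m CPU request: it guarantees the pods begin well outside any recommendation the recommender is likely to produce.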
Once the pods have been evicted and the replica set that manages these php-apache pods notices that they are missing, it will send a request to the kube-api to create two replacement pods. These requests will also be noticed by the mutating webhook admission controller. Because these two pods are managed by a VPA with recommendations set, and the VPA is set to allow the admission controller to mutate these pods, the admission controller will inject the recommended resources into the pod spec:
I0617 15:36:20.118442 6 server.go:62] Admitting pod {php-apache-59759c4b98-% php-apache-59759c4b98- dev 0 0001-01-01 00:00:00 +0000 UTC <nil> <nil> map[pod-template-hash:59759c4b98 run:php-apache] map[] [{apps/v1 ReplicaSet php-apache-59759c4b98 a76f33fd-9115-11e9-aae4-080027655ff0 0xc0004d8127 0xc0004d8128}] nil [] }
I0617 15:36:20.118961 6 recommendation_provider.go:108] updating requirements for pod php-apache-59759c4b98-%.
I0617 15:36:20.119104 6 recommendation_provider.go:97] Let's choose from 1 configs for pod dev/php-apache-59759c4b98-%
I0617 15:36:20.119156 6 recommendation_provider.go:68] no matching recommendation found for container php-apache
I0617 15:36:20.119224 6 server.go:259] Sending patches: [{add /spec/containers/0/resources {map[] map[]}} {add /spec/containers/0/resources/requests map[]} {add /metadata/annotations map[vpaUpdates:Pod resources updated by php-apache-vpa: container 0: ]}]
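The "Sending patches" line above is a list of JSON Patch style "add" operations. As an illustration of what a fully populated set of patches could look like, here is a Python sketch that builds one from a recommendation. The build_patches function is hypothetical and simplified: it assumes the pod has no resources block and no existing annotations, as in the log above, and it is not the webhook’s real implementation.

```python
import json


def build_patches(container_index: int, recommended: dict, vpa_name: str) -> list:
    """Build JSON Patch operations that inject recommended requests into a pod."""
    base = f"/spec/containers/{container_index}/resources"
    updated = ", ".join(f"{resource} request" for resource in recommended)
    note = f"Pod resources updated by {vpa_name}: container {container_index}: {updated}"
    return [
        # Create the empty resources block first, then fill in the requests.
        {"op": "add", "path": base, "value": {}},
        {"op": "add", "path": f"{base}/requests", "value": dict(recommended)},
        # Assumes no existing annotations; "add" here would replace the whole map.
        {"op": "add", "path": "/metadata/annotations", "value": {"vpaUpdates": note}},
    ]


patches = build_patches(0, {"cpu": "627m", "memory": "262144k"}, "php-apache-vpa")
print(json.dumps(patches, indent=2))
```

In the log above the request map is empty because no matching recommendation was found at that moment; once a recommendation exists, the requests value carries the CPU and memory targets.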
Admittedly I did have some trouble getting the admission controller to work. I initially suspected that the configmap containing the cert bundle had an invalid configuration, but when I checked the API server log, I noticed that the webhook wasn’t being called - I had a vpa-webhook service in the same namespace as the admission controller pod, but the selectors were misconfigured, so there was no endpoint:
W0617 14:06:49.864288 1 dispatcher.go:70] Failed calling webhook, failing open vpa.k8s.io: failed calling webhook "vpa.k8s.io": Post https://vpa-webhook.kube-system.svc:443/?timeout=30s: dial tcp 10.102.151.220:443: connect: connection refused
E0617 14:06:49.864322 1 dispatcher.go:71] failed calling webhook "vpa.k8s.io": Post https://vpa-webhook.kube-system.svc:443/?timeout=30s: dial tcp 10.102.151.220:443: connect: connection refused
Now check the pods altered by the admission controller to see what modifications have taken place. In addition to CPU and memory requests, an annotation has been added to indicate that this pod has been altered by VPA. Note as well that there are no resource limits set.
kubectl get pod php-apache-59759c4b98-7z86g -n dev -o yaml --export
apiVersion: v1
kind: Pod
metadata:
  annotations:
    vpaUpdates: 'Pod resources updated by php-apache-vpa: container 0: cpu request,
      memory request'
  creationTimestamp: null
  generateName: php-apache-59759c4b98-
  labels:
    pod-template-hash: 59759c4b98
    run: php-apache
  ownerReferences:
  - apiVersion: apps/v1
    blockOwnerDeletion: true
    controller: true
    kind: ReplicaSet
    name: php-apache-59759c4b98
    uid: a76f33fd-9115-11e9-aae4-080027655ff0
  selfLink: /api/v1/namespaces/dev/pods/php-apache-59759c4b98-7z86g
spec:
  containers:
  - image: gcr.io/google_containers/hpa-example
    imagePullPolicy: Always
    name: php-apache
    resources:
      requests:
        cpu: 627m
        memory: 262144k
Interestingly, these changes are not propagated to the deployment that manages these pods:
apiVersion: extensions/v1beta1
kind: Deployment
metadata:
  annotations:
    deployment.kubernetes.io/revision: "1"
  creationTimestamp: null
  generation: 1
  labels:
    run: php-apache
  name: php-apache
  selfLink: /apis/extensions/v1beta1/namespaces/dev/deployments/php-apache
spec:
  progressDeadlineSeconds: 600
  replicas: 2
  revisionHistoryLimit: 2
  selector:
    matchLabels:
      run: php-apache
  strategy:
    rollingUpdate:
      maxSurge: 25%
      maxUnavailable: 25%
    type: RollingUpdate
  template:
    metadata:
      creationTimestamp: null
      labels:
        run: php-apache
    spec:
      containers:
      - image: gcr.io/google_containers/hpa-example
        imagePullPolicy: Always
        name: php-apache
        resources: {}
        terminationMessagePath: /dev/termination-log
        terminationMessagePolicy: File
      dnsPolicy: ClusterFirst
      restartPolicy: Always
      schedulerName: default-scheduler
      securityContext: {}
      terminationGracePeriodSeconds: 30
So these request injections from the admission controller aren’t meant to replace the default requests that you set in your deployment. If you redeploy an application to the cluster multiple times a day, because of the way changes are applied to resources in Kube, it makes sense to not set any requests in the deployment and defer these to the VPA. This way, your current VPA-controlled requests will be preserved when the live deployment on the cluster is patched with changes using kubectl apply.
VPA Completion Reward
And that brings us to the end of our experimentation with VPAs. If you’ve made it this far we think you deserve a pint!
But if you’re thirsty for more Kubernetes scaling content, check out our Using KEDA Autoscaling with Prometheus and Redis blog.