Lesson 3.5: Horizontal Pod Autoscaler
Autoscaling in Kubernetes dynamically adjusts the resources allocated to workloads based on demand. It ensures that applications have the resources they need to handle traffic spikes while reducing waste during periods of low demand. Kubernetes provides several autoscaling mechanisms; this lesson covers the Horizontal Pod Autoscaler.
The Horizontal Pod Autoscaler (HPA) automatically scales the number of Pod replicas in a Deployment, ReplicaSet, or StatefulSet based on observed CPU or memory utilization, or custom metrics.
How HPA Works:
- HPA periodically checks (every 15 seconds by default) the resource usage (e.g., CPU, memory) or custom metrics of the Pods.
- If average usage across the Pods rises above the configured target, HPA increases the number of Pod replicas, up to the configured maximum.
- If average usage falls below the target, HPA decreases the number of Pod replicas, down to the configured minimum.
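The scaling decision described above can be sketched with the replica formula given in the Kubernetes documentation: desiredReplicas = ceil(currentReplicas × currentMetricValue / desiredMetricValue). The helper below is a simplified illustration, not the controller's actual code: the function name `desired_replicas` and the sample numbers are hypothetical, and it leaves out the tolerance band and stabilization window that the real controller applies.

```shell
# desired_replicas CURRENT_REPLICAS CURRENT_UTIL_PCT TARGET_UTIL_PCT MIN MAX
# Hypothetical helper: applies the documented HPA formula with integer math,
# then clamps the result to the min/max replica range.
desired_replicas() {
  # Integer ceiling division: ceil(a / b) == (a + b - 1) / b
  want=$(( ($1 * $2 + $3 - 1) / $3 ))
  if [ "$want" -lt "$4" ]; then want=$4; fi
  if [ "$want" -gt "$5" ]; then want=$5; fi
  echo "$want"
}

desired_replicas 1 22 10 1 10   # prints 3: 1 replica at 22% against a 10% target
desired_replicas 3 2 10 1 10    # prints 1: load has dropped well below the target
```

Because the result is rounded up, even a small overshoot of the target can add a replica once it is outside the controller's tolerance.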
Key Features:
- Metrics: HPA can scale based on CPU, memory, or custom metrics (e.g., requests per second).
- Target Utilization: You define a target utilization percentage (e.g., 80% CPU usage), measured relative to the resource requests of the Pods' containers.
- Min/Max Replicas: You specify the minimum and maximum number of replicas to control the scaling range.
Use Cases:
- Scaling stateless applications (e.g., web servers, APIs) to handle varying traffic loads.
- Ensuring high availability and performance during traffic spikes.
[root@master ~]# cd hpa/
[root@master hpa]# ls
loaddocker  php-apache.yml
[root@master hpa]# cat php-apache.yml
apiVersion: apps/v1
kind: Deployment
metadata:
  name: php-apache
spec:
  selector:
    matchLabels:
      run: php-apache
  template:
    metadata:
      labels:
        run: php-apache
    spec:
      containers:
      - name: php-apache
        image: treehouses/php-apache:202109232218
        ports:
        - containerPort: 80
        resources:
          limits:
            cpu: 500m
          requests:
            cpu: 200m
---
apiVersion: v1
kind: Service
metadata:
  name: php-apache
  labels:
    run: php-apache
spec:
  ports:
  - port: 80
  selector:
    run: php-apache
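The `requests: cpu: 200m` value in the manifest above matters for autoscaling, because HPA expresses CPU utilization as a percentage of the requested CPU rather than of the limit or of the node's capacity. A rough sketch of that arithmetic follows; the function name `cpu_utilization_pct` and the usage figures are hypothetical, and integer division here only approximates how the controller rounds.

```shell
# cpu_utilization_pct USAGE_MILLICORES REQUEST_MILLICORES
# Hypothetical helper: utilization as a percentage of the CPU request.
cpu_utilization_pct() {
  echo $(( $1 * 100 / $2 ))
}

cpu_utilization_pct 20 200   # prints 10: 20m of a 200m request
cpu_utilization_pct 44 200   # prints 22: 44m of a 200m request
```

With a 200m request, a 10% target therefore corresponds to an average of roughly 20 millicores per Pod.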
[root@master hpa]# kubectl get deployments
NAME         READY   UP-TO-DATE   AVAILABLE   AGE
php-apache   1/1     1            1           19s
[root@master hpa]# kubectl get pods
NAME                         READY   STATUS    RESTARTS   AGE
php-apache-559849875-9b7jw   1/1     Running   0          33s
[root@master hpa]# kubectl autoscale deployment php-apache --cpu-percent=10 --min=1 --max=10
horizontalpodautoscaler.autoscaling/php-apache autoscaled
[root@master hpa]# kubectl get hpa
NAME         REFERENCE               TARGETS         MINPODS   MAXPODS   REPLICAS   AGE
php-apache   Deployment/php-apache   <unknown>/10%   1         10        0          4s
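The `kubectl autoscale` command above can also be expressed declaratively, which is easier to keep in version control. The sketch below writes what should be a roughly equivalent `autoscaling/v2` manifest; the file name `php-apache-hpa.yml` is an assumption, and the apply step is left commented out because it needs a running cluster.

```shell
# Write an HPA manifest mirroring --cpu-percent=10 --min=1 --max=10.
# The file name is hypothetical; choose whatever fits your repository.
cat > php-apache-hpa.yml <<'EOF'
apiVersion: autoscaling/v2
kind: HorizontalPodAutoscaler
metadata:
  name: php-apache
spec:
  scaleTargetRef:
    apiVersion: apps/v1
    kind: Deployment
    name: php-apache
  minReplicas: 1
  maxReplicas: 10
  metrics:
  - type: Resource
    resource:
      name: cpu
      target:
        type: Utilization
        averageUtilization: 10
EOF

# Apply it against a cluster (commented out so the snippet runs anywhere):
# kubectl apply -f php-apache-hpa.yml
```

Use either the imperative command or the manifest, not both, since each creates an HPA named `php-apache`.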
Generating Load:
[root@master hpa]# kubectl run -i --tty load-generator --rm --image=busybox:1.28 --restart=Never -- /bin/sh -c "while sleep 0.001; do wget -q -O- http://php-apache; done"
When average CPU utilization exceeds the 10% target, HPA scales up the Deployment. In the output below, utilization climbs to about 22% and the replica count rises from 1 to 3.
[root@master hpa]# kubectl get hpa -w
NAME         REFERENCE               TARGETS   MINPODS   MAXPODS   REPLICAS   AGE
php-apache   Deployment/php-apache   0%/10%    1         10        1          8m59s
php-apache   Deployment/php-apache   3%/10%    1         10        1          9m8s
php-apache   Deployment/php-apache   22%/10%   1         10        1          9m23s
php-apache   Deployment/php-apache   21%/10%   1         10        3          9m39s
[root@master hpa]# kubectl get pods
NAME                         READY   STATUS    RESTARTS   AGE
load-generator               1/1     Running   0          52s
php-apache-559849875-9b7jw   1/1     Running   0          13m
php-apache-559849875-dqf4n   1/1     Running   0          18s
php-apache-559849875-kgs5t   1/1     Running   0          19s