To truly scale your Kubernetes cluster with the pods and containers needed to run your Laravel PHP application, you will soon realize you need both Horizontal Pod Autoscaling (HPA) and Cluster Autoscaling (CA).
When you dig a little deeper into Kubernetes and cloud providers like DigitalOcean, as in our use case, you will see that the two work well in harmony and are in fact both necessary to make scaling work.
Cluster Autoscaler
DigitalOcean offers Kubernetes Cluster Autoscaling. You can configure the cluster to resize automatically within a minimum and maximum number of nodes.
The Kubernetes documentation, which DigitalOcean refers to, mentions the following about the Cluster Autoscaler:
Cluster Autoscaler increases the size of the cluster when:
- there are pods that failed to schedule on any of the current nodes due to insufficient resources.
- adding a node similar to the nodes currently present in the cluster would help.
So cluster autoscaling is triggered when there are not enough resources to schedule more pods and adding a node similar to the present ones would help. You therefore need to schedule pods to trigger the autoscaling. And… this is exactly where HPA (Horizontal Pod Autoscaling) comes in.
How does Horizontal Pod Autoscaler work with Cluster Autoscaler?
In the Kubernetes autoscaling FAQ you can also read the following about the use of HPA together with CA:
Horizontal Pod Autoscaler changes the deployment’s or replicaset’s number of replicas based on the current CPU load. If the load increases, HPA will create new replicas, for which there may or may not be enough space in the cluster.
If there are not enough resources, CA will try to bring up some nodes, so that the HPA-created pods have a place to run. If the load decreases, HPA will stop some of the replicas. As a result, some nodes may become underutilized or completely empty, and then CA will terminate such unneeded nodes.
So you can, for example, set up your HPA to add more pods when a maximum RAM usage is hit. When the nodes then run out of resources for the scheduled pods, the Cluster Autoscaler will add more nodes.
DigitalOcean Control Panel Cluster Autoscaler
The DigitalOcean control panel allows you to enable autoscaling and set the minimum and maximum number of nodes for a node pool. You can use their command line tool doctl as well, but in most cases the GUI works just fine.
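For example, enabling autoscaling on an existing node pool with doctl looks roughly like this (the cluster and pool names are placeholders, and flags may differ between doctl versions):
doctl kubernetes cluster node-pool update mycluster mypool \
  --auto-scale --min-nodes 1 --max-nodes 5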
Horizontal Pod Autoscaler
To set up an HPA you can either run a command to create one or use a file. There are many different types of horizontal pod autoscalers to set up, each differing in the resources they focus on.
Command
An example command to add one for the deployment called php is:
kubectl autoscale deployment php --cpu-percent=50 --min=1 --max=10
horizontalpodautoscaler.autoscaling/php autoscaled
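You can then inspect the resulting autoscaler, its target and its replica counts with:
kubectl get hpa php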
Deployment Configuration & Metrics Gathering
You will also need to configure your deployment with resource requests and limits. Here is a Kubernetes example setting resources for two containers in one Pod:
apiVersion: v1
kind: Pod
metadata:
  name: frontend
spec:
  containers:
  - name: app
    image: images.my-company.example/app:v4
    resources:
      requests:
        memory: "64Mi"
        cpu: "250m"
      limits:
        memory: "128Mi"
        cpu: "500m"
  - name: log-aggregator
    image: images.my-company.example/log-aggregator:v6
    resources:
      requests:
        memory: "64Mi"
        cpu: "250m"
      limits:
        memory: "128Mi"
        cpu: "500m"
You also have to set up metrics gathering to know when pods reach their limits and new pods should be added. This can be done with the Kubernetes Metrics Server.
We will however focus on setting things up on DigitalOcean, so we will not cover a general setup here.
DigitalOcean Kubernetes Autoscaling
DigitalOcean has an HPA with deployment setup example here. It explains how to do a basic setup with a deployment and a horizontal pod autoscaler.
Deployment Configuration
The part of its deployment configuration that deals with pod resource usage is the following, using a CPU limit together with CPU and memory requests:
. . .
  template:
    metadata:
      creationTimestamp: null
      labels:
        app: web
    spec:
      containers:
      - image: nginx:latest
        imagePullPolicy: Always
        name: nginx
        resources:
          limits:
            cpu: 300m
          requests:
            cpu: 100m
            memory: 250Mi
        terminationMessagePath: /dev/termination-log
        terminationMessagePolicy: File
      dnsPolicy: ClusterFirst
      restartPolicy: Always
      schedulerName: default-scheduler
      securityContext: {}
      terminationGracePeriodSeconds: 30
status:
  availableReplicas: 1
. . .
This is a part you add to the deployment itself. In the case of a Laravel PHP application you could use this setup for the Nginx deployment and/or the PHP-FPM deployment. Either that, or have PHP and Nginx as containers in one deployment and scale that deployment, as sketched below.
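As a rough sketch of that second option (the image names and resource values here are assumptions; in practice the PHP container would run your own application image), a combined deployment could look like this:
apiVersion: apps/v1
kind: Deployment
metadata:
  name: laravel
spec:
  replicas: 1
  selector:
    matchLabels:
      app: laravel
  template:
    metadata:
      labels:
        app: laravel
    spec:
      containers:
      # serves HTTP and proxies PHP requests to the FPM container
      - name: nginx
        image: nginx:latest
        resources:
          requests:
            cpu: 100m
            memory: 128Mi
          limits:
            cpu: 300m
            memory: 256Mi
      # runs the Laravel application via PHP-FPM
      - name: php
        image: php:7.4-fpm
        resources:
          requests:
            cpu: 250m
            memory: 256Mi
          limits:
            cpu: 500m
            memory: 512Mi
Scaling this one deployment then scales Nginx and PHP-FPM together, keeping their ratio fixed; separate deployments give you independent scaling at the cost of more moving parts.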
Metrics Server
Now, to gather the metrics needed to decide when the HPA should step in, they set up a metrics server. Kubernetes has a Metrics Server for exactly this purpose. And as DigitalOcean mentions, you can install it with Helm:
helm install stable/metrics-server --name metrics-server
This chart however no longer seems to be maintained. You can also use kubectl instead:
kubectl apply -f https://github.com/kubernetes-sigs/metrics-server/releases/latest/download/components.yaml
NB: The Metrics Server requirements mention that the container runtime must support container metrics RPCs (or have cAdvisor support).
You do need to edit the setup so that internal communication happens via internal IP addresses instead of hostnames, which is done with an extra flag. So run
kubectl edit deployment metrics-server
and add the flag:
--kubelet-preferred-address-types=InternalIP
You also need to add the flag
--metric-resolution
to change the default rate at which the Metrics Server scrapes metrics. In the end it looks like this:
. . .
  template:
    metadata:
      creationTimestamp: null
      labels:
        app: metrics-server
        release: metrics-server
    spec:
      affinity: {}
      containers:
      - command:
        - /metrics-server
        - --cert-dir=/tmp
        - --logtostderr
        - --secure-port=8443
        - --metric-resolution=60s
        - --kubelet-preferred-address-types=InternalIP
        image: gcr.io/google_containers/metrics-server-amd64:v0.3.4
        imagePullPolicy: IfNotPresent
        livenessProbe:
          failureThreshold: 3
          httpGet:
            path: /healthz
. . .
Once this all starts working you can check metrics with the top command:
kubectl top pod
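Node-level usage can be checked the same way, which is handy for judging how close the cluster is to needing an extra node:
kubectl top node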
Horizontal Pod Autoscaler
They add the actual HPA, which takes care of the pod autoscaling, with this command:
kubectl autoscale deployment web --max=4 --cpu-percent=80
It is a really basic one though, so we would not recommend using it just like that. It focuses on CPU only and adds few details to the HPA setup. You could focus on traffic or RAM instead, for example. We also prefer to set things up in files and version control our Kubernetes deployment, so let's look at what such a file could look like.
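As a minimal sketch, assuming the php deployment from the earlier kubectl autoscale example and an assumed target of 70% average memory utilization (tune the numbers to your own workload), an autoscaling/v2beta2 manifest could look like this:
apiVersion: autoscaling/v2beta2
kind: HorizontalPodAutoscaler
metadata:
  name: php
spec:
  scaleTargetRef:
    # the deployment this HPA scales
    apiVersion: apps/v1
    kind: Deployment
    name: php
  minReplicas: 1
  maxReplicas: 10
  metrics:
  - type: Resource
    resource:
      name: memory
      target:
        # assumed threshold: scale out above 70% average memory utilization
        type: Utilization
        averageUtilization: 70
Saved as, say, php-hpa.yaml you can apply it with kubectl apply -f php-hpa.yaml, and the autoscaler lives in version control together with the rest of your manifests.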
Pod Disruption Budgets
To prevent pods from being deleted too abruptly you can also add Pod Disruption Budgets. Or as DigitalOcean states it:
A PodDisruptionBudget (PDB) specifies the minimum number of replicas that an application can tolerate having during a voluntary disruption, relative to how many it is intended to have…. We recommend you set a PDB for your workloads to ensure graceful scaledown.
And that sounds like sound advice, does it not? We do want things to scale down properly and not too hastily, for sure.
Here is the Kubernetes example requiring a minimum number of available pods for the app zookeeper:
apiVersion: policy/v1beta1
kind: PodDisruptionBudget
metadata:
  name: zk-pdb
spec:
  minAvailable: 2
  selector:
    matchLabels:
      app: zookeeper
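For the Laravel setup you could add a similar budget. As a sketch (the name and label are assumptions matching the php deployment from the HPA example above, and minAvailable only makes sense once you run more than one replica):
apiVersion: policy/v1beta1
kind: PodDisruptionBudget
metadata:
  name: php-pdb
spec:
  # keep at least one php pod running during voluntary disruptions
  minAvailable: 1
  selector:
    matchLabels:
      app: php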