Kubernetes HPA and CA Autoscaling Laravel on DigitalOcean

article still under development

To truly scale a Kubernetes cluster with the pods / containers needed to run your Laravel PHP application, you will soon realize you need both Horizontal Pod Autoscaling (HPA) and Cluster Autoscaling (CA).

When you dig a little deeper into Kubernetes and a cloud provider like DigitalOcean, as in our use case, you will see that these two work well in harmony and are in fact both necessary to make scaling work.

Cluster Autoscaler

DigitalOcean offers Kubernetes Cluster Autoscaling. You can set the cluster to resize automatically and set the minimum and maximum number of nodes.

The Kubernetes documentation, which DigitalOcean refers to, mentions the following about the Cluster Autoscaler:

Cluster Autoscaler increases the size of the cluster when:

  • there are pods that failed to schedule on any of the current nodes due to insufficient resources.
  • adding a node similar to the nodes currently present in the cluster would help.

So autoscaling is triggered when there are not enough resources to schedule more pods and adding a node similar to the ones already present would help. You do need to schedule pods to trigger autoscaling. And… this can be done with HPA (Horizontal Pod Autoscaling).

How does Horizontal Pod Autoscaler work with Cluster Autoscaler?

In the Kubernetes autoscaler FAQ you can also read the following about using HPA together with CA:

Horizontal Pod Autoscaler changes the deployment’s or replicaset’s number of replicas based on the current CPU load. If the load increases, HPA will create new replicas, for which there may or may not be enough space in the cluster.

If there are not enough resources, CA will try to bring up some nodes, so that the HPA-created pods have a place to run. If the load decreases, HPA will stop some of the replicas. As a result, some nodes may become underutilized or completely empty, and then CA will terminate such unneeded nodes.

So you can, for example, set up your HPA to add more pods when a memory threshold is hit. When the nodes then run out of room for those pods, based on their resource requests, the Cluster Autoscaler adds more nodes.

DigitalOcean Control Panel Cluster Autoscaler

The DigitalOcean control panel allows you to enable autoscaling and set the minimum and maximum number of nodes per node pool. You can use their command line tool doctl as well, but in most cases their GUI works just fine.
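If you prefer the command line, you can switch an existing node pool to autoscaling with doctl. A minimal sketch, assuming a cluster named my-cluster and a node pool named my-pool (replace both with your own names):

doctl kubernetes cluster node-pool update my-cluster my-pool \
  --auto-scale --min-nodes 1 --max-nodes 3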

Horizontal Pod Autoscaler

To set up an HPA you can either run a single command or apply a manifest file. There are many different kinds of horizontal pod autoscalers to set up, each configured differently depending on the resources they focus on.

Command

An example command to add one for a deployment called php is:

kubectl autoscale deployment php --cpu-percent=50 --min=1 --max=10
horizontalpodautoscaler.autoscaling/php autoscaled
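You can then check the autoscaler and the current metric values it sees with:

kubectl get hpa php

The targets column will show unknown values until a metrics source such as the Metrics Server (covered below) is running.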

Deployment Configuration & Metrics Gathering

You will also need to configure resource requests and limits in your deployment. Here is a Kubernetes example setting resources for two containers in one Pod:

apiVersion: v1
kind: Pod
metadata:
  name: frontend
spec:
  containers:
  - name: app
    image: images.my-company.example/app:v4
    resources:
      requests:
        memory: "64Mi"
        cpu: "250m"
      limits:
        memory: "128Mi"
        cpu: "500m"
  - name: log-aggregator
    image: images.my-company.example/log-aggregator:v6
    resources:
      requests:
        memory: "64Mi"
        cpu: "250m"
      limits:
        memory: "128Mi"
        cpu: "500m"

You also have to set up metrics gathering to know when pods reach their limits and new pods should be added. This can be done with the Kubernetes Metrics Server.

We will however focus on setting things up on DigitalOcean, so we will not cover a general setup here.

DigitalOcean Kubernetes Autoscaling

DigitalOcean has an HPA with deployment setup example here. It explains how to do a basic setup with a deployment and a horizontal pod autoscaler.

Deployment Configuration

The relevant part of its deployment configuration for pod resource usage is the following, which sets a CPU limit as well as CPU and memory requests:

. . .
  template:
    metadata:
      creationTimestamp: null
      labels:
        app: web
    spec:
      containers:
      - image: nginx:latest
        imagePullPolicy: Always
        name: nginx
        resources:
          limits:
            cpu: 300m
          requests:
            cpu: 100m
            memory: 250Mi
        terminationMessagePath: /dev/termination-log
        terminationMessagePolicy: File
      dnsPolicy: ClusterFirst
      restartPolicy: Always
      schedulerName: default-scheduler
      securityContext: {}
      terminationGracePeriodSeconds: 30
status:
  availableReplicas: 1
. . .

This is a part you add to the deployment itself. In the case of a Laravel PHP application you could use this setup for the Nginx and/or PHP-FPM deployments. Either that, or run PHP and Nginx as containers in one deployment and scale that deployment.
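A minimal sketch of that last option, assuming a hypothetical Laravel image (registry.example.com/laravel-app) that runs PHP-FPM on port 9000 and an Nginx config that proxies to it over localhost; the images, labels and resource numbers are assumptions you would adjust to your own application:

apiVersion: apps/v1
kind: Deployment
metadata:
  name: php
spec:
  replicas: 1
  selector:
    matchLabels:
      app: php
  template:
    metadata:
      labels:
        app: php
    spec:
      containers:
      # Nginx serves HTTP and proxies PHP requests to the FPM container
      - name: nginx
        image: nginx:1.19-alpine
        ports:
        - containerPort: 80
        resources:
          requests:
            cpu: 100m
            memory: 64Mi
          limits:
            cpu: 300m
            memory: 128Mi
      # Hypothetical Laravel image running PHP-FPM on port 9000
      - name: php-fpm
        image: registry.example.com/laravel-app:latest
        resources:
          requests:
            cpu: 250m
            memory: 128Mi
          limits:
            cpu: 500m
            memory: 256Mi

Because both containers share the Pod's network namespace, Nginx can reach PHP-FPM at 127.0.0.1:9000, and the HPA then scales the pair as a single unit.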

Metrics Server

Now, to gather the metrics needed to decide when the HPA should step in, they set up a metrics server. Kubernetes has the Metrics Server for exactly this purpose. And as DO mentions, you can install it with Helm:

helm install stable/metrics-server --name metrics-server

This chart no longer seems to be maintained, however. You can also install it with kubectl:

kubectl apply -f https://github.com/kubernetes-sigs/metrics-server/releases/latest/download/components.yaml
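The manifest installs the Metrics Server into the kube-system namespace, so you can verify it came up with:

kubectl get deployment metrics-server -n kube-system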

NB: The Metrics Server requirements mention that the container runtime must implement the container metrics RPCs (or have cAdvisor support).

You do need to edit the setup so the Metrics Server reaches the nodes via their internal IP address instead of their hostname, using an extra flag. So run:

kubectl edit deployment metrics-server

and add the flag:

--kubelet-preferred-address-types=InternalIP

You also need to add the flag

--metric-resolution

to change the default rate at which the Metrics Server scrapes metrics. In the end the deployment looks like this:

. . .
  template:
    metadata:
      creationTimestamp: null
      labels:
        app: metrics-server
        release: metrics-server
    spec:
      affinity: {}
      containers:
      - command:
        - /metrics-server
        - --cert-dir=/tmp
        - --logtostderr
        - --secure-port=8443
        - --metric-resolution=60s
        - --kubelet-preferred-address-types=InternalIP
        image: gcr.io/google_containers/metrics-server-amd64:v0.3.4
        imagePullPolicy: IfNotPresent
        livenessProbe:
          failureThreshold: 3
          httpGet:
            path: /healthz
. . .

Once this all starts working you can check metrics with the top command:

kubectl top pod
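The same works at node level, which is useful for seeing how close a node pool is to the point where the Cluster Autoscaler has to step in:

kubectl top node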

Horizontal Pod Autoscaler

They then add the actual HPA, which handles the pod autoscaling, with the command:

kubectl autoscale deployment web --max=4 --cpu-percent=80

It is a really basic one though, so we would not recommend using it just like that. It focuses on CPU only and adds few details to the HPA setup; you could focus on traffic or RAM instead, for example. We also prefer to set things up in files and version control our Kubernetes deployment, so here is a sketch of what such a file could look like.

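A minimal sketch of such a file, assuming the deployment is called php and scaling on memory utilization rather than CPU; the 70% target and the replica counts are assumptions to tune for your own application, and the autoscaling/v2beta2 API version matches clusters from this era (newer clusters use autoscaling/v2):

apiVersion: autoscaling/v2beta2
kind: HorizontalPodAutoscaler
metadata:
  name: php
spec:
  scaleTargetRef:
    apiVersion: apps/v1
    kind: Deployment
    name: php
  minReplicas: 1
  maxReplicas: 10
  metrics:
  # Scale out when average memory usage across the pods exceeds 70% of the requested memory
  - type: Resource
    resource:
      name: memory
      target:
        type: Utilization
        averageUtilization: 70

Utilization here is measured against the memory requests set on the deployment, so those requests need to be in place. Apply the file with kubectl apply -f (the filename is up to you) and keep it in version control together with the rest of your manifests.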

Pod Disruption Budgets

To prevent pods from being removed too abruptly, you can also add Pod Disruption Budgets. Or as DO states it:

PodDisruptionBudget (PDB) specifies the minimum number of replicas that an application can tolerate having during a voluntary disruption, relative to how many it is intended to have…. We recommend you set a PDB for your workloads to ensure graceful scaledown.

And that sounds like sound advice, does it not? We do want things to scale down properly and not too hastily for sure.

Here is the Kubernetes example that keeps a minimum number of available pods for the app zookeeper:

apiVersion: policy/v1beta1
kind: PodDisruptionBudget
metadata:
  name: zk-pdb
spec:
  minAvailable: 2
  selector:
    matchLabels:
      app: zookeeper
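
For the Laravel setup above, a similar budget could keep at least one php pod running during voluntary disruptions such as node drains; the name and the app: php label are assumptions matching the earlier deployment sketch:

apiVersion: policy/v1beta1
kind: PodDisruptionBudget
metadata:
  name: php-pdb
spec:
  minAvailable: 1
  selector:
    matchLabels:
      app: php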