Learn How to Scale a Kubernetes Cluster


Application scaling lets Kubernetes allocate resources to workloads dynamically, but it has limits: eventually all node resources are consumed and running additional workloads requires more nodes. When a workload can no longer be scheduled on the existing nodes, we need to scale the cluster itself.

To demonstrate how to scale a cluster, we will begin by setting up a cluster on Google Compute Engine (GCE). This tutorial assumes you have already set up Google Cloud Platform. If you have not, please refer to the ‘What is Kubernetes’ tutorial for instructions on how to set up the gcloud SDK.

Navigate to the console at https://cloud.google.com/container-engine/ and log in. Create a new project titled ‘learning cluster scaling’.
Before proceeding, ensure billing is active so that you are able to access Google Cloud services. On your console, click Products & Services, then Billing, to check your billing status.
Another prerequisite is enabling the Google Container Engine API. Click Products & Services, then API Manager. To enable the API, select a project and click Enable.
Similarly, enable the Google Compute Engine API: click Products & Services, then API Manager, then Library; search for Google Compute Engine API, select it, and enable it.
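If you prefer the command line, the same prerequisite APIs can also be enabled with gcloud; the service names below follow current Google Cloud naming and are assumptions, so check them against the API Library if they fail:

```shell
# Enable the Container Engine and Compute Engine APIs from the CLI
# (service names assumed from current Google Cloud naming)
gcloud services enable container.googleapis.com
gcloud services enable compute.googleapis.com
```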
Although it is possible to operate Kubernetes and Google Cloud remotely from your own machine, you can also use Google Cloud Shell, a free command-line interface accessed through a browser. Under the hood it is a Docker container derived from Debian, with docker, gcloud, and kubectl among other tools pre-installed. To activate Google Cloud Shell, select a project, then click the Activate Google Cloud Shell button next to the search bar. Provisioning and connecting will take a few seconds.
The shell is then used to execute commands; run gcloud help to confirm it is working correctly.
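As a further sanity check in Cloud Shell, the commands below confirm the SDK configuration and the bundled tools; the exact output will vary with your project:

```shell
# Show the active gcloud configuration, including the selected project
gcloud config list
# Confirm kubectl is installed (client version only; no cluster needed yet)
kubectl version --client
```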
There are two ways to create a cluster: through the Cloud Platform console, or with the gcloud command-line interface. In this article, we will demonstrate both approaches.
We will first demonstrate how a cluster is created using the Google Cloud Platform console. A cluster is made up of a master and a set of nodes: Google hosts the master API server, while the nodes are ordinary GCE virtual machines. Let us begin by creating a cluster visually.
On your console, click Products & Services, then Compute Engine, then Container Clusters, and click Create a container cluster.
You will be prompted to provide a cluster name, a description, a zone, a machine type, and a size, which specifies the number of nodes.
A three-node cluster with Kubernetes installed will then be created.
The other approach is the gcloud command-line interface. To create a similar cluster from the CLI, we execute the command below; the cluster name can be changed as needed.

gcloud container clusters create cluster-scaling-cli \
  --disk-size 100 \
  --zone us-central1-c \
  --enable-cloud-logging \
  --enable-cloud-monitoring \
  --machine-type n1-standard-2 \
  --num-nodes 2

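To confirm the cluster came up, we can list the project's clusters and point kubectl at the new one; the name and zone below match the create command above:

```shell
# List all container clusters in the current project
gcloud container clusters list
# Fetch credentials so kubectl targets the new cluster
gcloud container clusters get-credentials cluster-scaling-cli --zone us-central1-c
```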
The cluster we have created above is basic and lacks high-availability features. To add high availability, we can spread nodes across additional zones using the --additional-zones flag. The modified command is shown below.

gcloud container clusters create kubernetes-lab1 \
  --disk-size 100 \
  --zone europe-west1-d \
  --additional-zones europe-west1-b,europe-west1-c \
  --enable-cloud-logging \
  --enable-cloud-monitoring \
  --machine-type n1-standard-2 \
  --num-nodes 2

After creating a cluster, you can request information about it using the command below.

kubectl cluster-info
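cluster-info reports the master and add-on service endpoints; to see the individual nodes, a follow-up such as the one below can be used:

```shell
# List the cluster's nodes with their status and versions
kubectl get nodes
```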

In the previous sections, we focused on setting up clusters on GCE. In the following sections, we will focus on scaling Kubernetes clusters. Scaling the cluster can be taken care of automatically, which is referred to as auto-scaling. To benefit from auto-scaling, some prerequisites must be met: the active project needs the Google Cloud Monitoring, Google Cloud Logging, and Stackdriver APIs enabled. To enable each of these APIs, click Products & Services, API Manager, then Library, then search for each API and enable it.
When specifying a pod, you have the option of declaring the CPU and RAM resources its containers will use. Declaring container resources lets the scheduler place pods on nodes more intelligently, and when container resources are limited, competition for resources is handled better. When resources are inadequate or a pod's scheduling criteria cannot be met, the pod has to wait until existing pods terminate or more nodes are added. When pods cannot be scheduled, the autoscaler determines whether adding nodes would help; if so, the cluster size is increased to accommodate the pending pods. When some nodes are not needed for more than 10 minutes (a threshold that is subject to change), the cluster size is decreased.
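As a sketch of declaring container resources, the manifest below sets requests and limits that the scheduler and autoscaler act on; the pod name, image, and quantities are illustrative:

```shell
# Create a pod whose container declares CPU/RAM requests and limits
# (pod name, image, and quantities are illustrative)
kubectl apply -f - <<EOF
apiVersion: v1
kind: Pod
metadata:
  name: resource-demo
spec:
  containers:
  - name: app
    image: nginx
    resources:
      requests:
        cpu: 250m
        memory: 128Mi
      limits:
        cpu: 500m
        memory: 256Mi
EOF
```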
In GCE, the autoscaler is configured at the instance-group level. The two ways of enabling the GCE autoscaler are at cluster creation or via the kube-up.sh script. The autoscaler requires three environment variables: KUBE_ENABLE_CLUSTER_AUTOSCALER, which enables auto-scaling when set to true; KUBE_AUTOSCALER_MIN_NODES, which specifies the minimum number of cluster nodes; and KUBE_AUTOSCALER_MAX_NODES, which specifies the maximum number of cluster nodes.
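For the kube-up.sh route, the three variables are exported before running the script; the minimum and maximum values here are only examples:

```shell
# Enable the cluster autoscaler when bringing the cluster up with kube-up.sh
export KUBE_ENABLE_CLUSTER_AUTOSCALER=true
export KUBE_AUTOSCALER_MIN_NODES=2   # example minimum
export KUBE_AUTOSCALER_MAX_NODES=5   # example maximum
./cluster/kube-up.sh
```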
To enable auto-scaling on the learn-cluster-scaling cluster we created earlier, we use the command below.

gcloud alpha container clusters update learn-cluster-scaling \
  --enable-autoscaling --min-nodes=1 --max-nodes=10 \
  --zone=us-central1-a --node-pool=default-pool

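To verify that autoscaling took effect, the cluster can be described (substitute your own cluster name and zone); the autoscaling block of the output shows the configured bounds:

```shell
# Inspect the cluster and check the node pool's autoscaling settings
gcloud container clusters describe learn-cluster-scaling --zone us-central1-a
```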
When using auto-scaling, there are several points to keep in mind.

The autoscaler works on the assumption that pods can be restarted, which causes some interruption; if your workload cannot tolerate interruption, it is better not to use auto-scaling.
Auto-scaling is also not recommended for clusters of more than 100 nodes running many pods.

In this post, we discussed the process of setting up clusters on GCE and how auto-scaling of clusters is implemented. Finally, we noted important issues to keep in mind when enabling auto-scaling.


