06. How to configure HI GIO Kubernetes cluster autoscale
Overview
Step-by-step guide on how to configure HI GIO Kubernetes cluster autoscale
Install tanzu-cli
Create cluster-autoscaler deployment from tanzu package using tanzu-cli
Enable cluster autoscale for your cluster
Test cluster autoscale
Delete cluster-autoscaler deployment and clean up test resource
Procedure
Prerequisites:
An Ubuntu bastion that can connect to your Kubernetes cluster
Permission to access your Kubernetes cluster
Step 1: Install tanzu-cli
#Install tanzu-cli on Ubuntu
sudo apt update
sudo apt install -y ca-certificates curl gpg
sudo mkdir -p /etc/apt/keyrings
curl -fsSL https://storage.googleapis.com/tanzu-cli-installer-packages/keys/TANZU-PACKAGING-GPG-RSA-KEY.gpg | sudo gpg --dearmor -o /etc/apt/keyrings/tanzu-archive-keyring.gpg
echo "deb [signed-by=/etc/apt/keyrings/tanzu-archive-keyring.gpg] https://storage.googleapis.com/tanzu-cli-installer-packages/apt tanzu-cli-jessie main" | sudo tee /etc/apt/sources.list.d/tanzu.list
sudo apt update
sudo apt install -y tanzu-cli
#Verify tanzu-cli installation
tanzu version

To install tanzu-cli in other environments, please refer to the official Tanzu CLI documentation.

Step 2: Create cluster-autoscaler deployment from tanzu package using tanzu-cli
Switch to your Kubernetes cluster context
kubectl config use-context <your context name>

List the available cluster-autoscaler versions in the tanzu package repository and note the version name
tanzu package available list cluster-autoscaler.tanzu.vmware.com

Create a kubeconfig secret named cluster-autoscaler-mgmt-config-secret in the kube-system namespace
kubectl create secret generic cluster-autoscaler-mgmt-config-secret \
--from-file=value=<path to your kubeconfig file> \
-n kube-system

Please do not change the secret name (cluster-autoscaler-mgmt-config-secret) or the namespace (kube-system).
Create a cluster-autoscaler-values.yaml file
arguments:
  ignoreDaemonsetsUtilization: true
  maxNodeProvisionTime: 15m
  maxNodesTotal: 0 #Leave this value as 0. We will define the max and min number of nodes later.
  metricsPort: 8085
  scaleDownDelayAfterAdd: 10m
  scaleDownDelayAfterDelete: 10s
  scaleDownDelayAfterFailure: 3m
  scaleDownUnneededTime: 10m
clusterConfig:
  clusterName: "demo-autoscale-tkg" #adjust here
  clusterNamespace: "demo-autoscale-tkg-ns" #adjust here
paused: false
Required values:
clusterName: your cluster name
clusterNamespace: your cluster namespace
Install cluster-autoscaler
#Use the version listed above that matches your Kubernetes version
#tkg-system is the default namespace for tanzu packages; please do not change it
tanzu package install cluster-autoscaler \
  --package cluster-autoscaler.tanzu.vmware.com \
  --version <version available> \
  --values-file 'cluster-autoscaler-values.yaml' \
  --namespace tkg-system

The cluster-autoscaler will be deployed into the kube-system namespace.
Run the command below to verify cluster-autoscaler deployment:
kubectl get deployments.apps -n kube-system cluster-autoscaler
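You can also check the autoscaler's logs to confirm it started cleanly (a sketch using standard kubectl commands; the deployment name matches the one created above):

```shell
#Confirm the rollout finished
kubectl rollout status deployment/cluster-autoscaler -n kube-system

#Inspect recent autoscaler logs for startup errors
kubectl logs -n kube-system deployment/cluster-autoscaler --tail=20
```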

Configure the minimum and maximum number of nodes in your cluster
Get the machinedeployments name and namespace
kubectl get machinedeployments.cluster.x-k8s.io -A

Set the cluster-api-autoscaler-node-group-min-size and cluster-api-autoscaler-node-group-max-size annotations
kubectl annotate machinedeployment <machinedeployment name> cluster.x-k8s.io/cluster-api-autoscaler-node-group-min-size=<number min> -n <machinedeployment namespace>
kubectl annotate machinedeployment <machinedeployment name> cluster.x-k8s.io/cluster-api-autoscaler-node-group-max-size=<number max> -n <machinedeployment namespace>
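To confirm the annotations were applied, one option (a sketch; substitute your machinedeployment name and namespace) is to read them back with kubectl:

```shell
#Print the annotations on the machinedeployment; the two
#cluster-api-autoscaler-node-group-* keys should appear with your values
kubectl get machinedeployments.cluster.x-k8s.io <machinedeployment name> \
  -n <machinedeployment namespace> -o jsonpath='{.metadata.annotations}'
```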
Enable cluster autoscale for your cluster
This step requires provider permissions, so please contact the cloud provider to perform it.
Step 3: Test cluster autoscale
Get the current number of nodes
kubectl get nodes
There is currently only one worker node.
Create a test-autoscale.yaml file
apiVersion: apps/v1
kind: Deployment
metadata:
  name: nginx
  namespace: default
spec:
  replicas: 2
  selector:
    matchLabels:
      app: nginx
  template:
    metadata:
      labels:
        app: nginx
    spec:
      containers:
      - name: nginx
        image: nginx
        ports:
        - containerPort: 80
      topologySpreadConstraints: #Spreads pods across different nodes (ensures no node has more pods than others)
      - maxSkew: 1
        topologyKey: kubernetes.io/hostname
        whenUnsatisfiable: DoNotSchedule
        labelSelector:
          matchLabels:
            app: nginx
Apply the test-autoscale.yaml file to deploy 2 replicas of the nginx pod in the default namespace (this will trigger the creation of a new worker node)
kubectl apply -f test-autoscale.yaml

Get nginx deployment
kubectl get pods

kubectl describe pod nginx-589656b9b5-mcm5j | grep -A 10 Events

You can see there is a new nginx pod with a status of Pending, and its events show FailedScheduling and TriggeredScaleUp:
Warning FailedScheduling 2m53s default-scheduler 0/2 nodes are available: 1 node(s) didn't match pod topology spread constraints, 1 node(s) had untolerated taint {node-role.kubernetes.io/control-plane: }. preemption: 0/2 nodes are available: 1 No preemption victims found for incoming pod, 1 Preemption is not helpful for scheduling.
Normal TriggeredScaleUp 2m43s cluster-autoscaler pod triggered scale-up: [{MachineDeployment/demo-autoscale-tkg-ns/demo-autoscale-tkg-worker-node-pool-1 1->2 (max: 5)}]
Wait for a new node to be provisioned; you will then see that a new worker node has been added and the new nginx pod's status is Running.
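While waiting, you can follow the scale-up as it happens (a sketch using kubectl's standard watch flag):

```shell
#Watch nodes until the new worker node appears and becomes Ready
kubectl get nodes -w

#In another terminal, watch the Pending nginx pod transition to Running
kubectl get pods -w
```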
Clean up test resource
kubectl delete -f test-autoscale.yaml

After deleting the nginx test deployment, the cluster waits a few minutes before deleting the unneeded node (see the scaleDownUnneededTime value in the cluster-autoscaler-values.yaml file).
Delete cluster-autoscaler deployment (Optional)
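If you no longer need autoscaling, the package and the kubeconfig secret created earlier can be removed. A sketch of the cleanup, assuming the names and namespaces used in the steps above:

```shell
#Uninstall the cluster-autoscaler package (installed in the tkg-system namespace)
tanzu package installed delete cluster-autoscaler --namespace tkg-system

#Remove the kubeconfig secret created in Step 2
kubectl delete secret cluster-autoscaler-mgmt-config-secret -n kube-system
```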
