06. How to configure HI GIO Kubernetes cluster autoscale

Overview

Step-by-step guide on how to configure HI GIO Kubernetes cluster autoscale

  • Install tanzu-cli

  • Create cluster-autoscaler deployment from tanzu package using tanzu-cli

  • Enable cluster autoscale for your cluster

  • Test cluster autoscale

  • Delete cluster-autoscaler deployment and clean up test resource

Procedure

Prerequisites:

  • An Ubuntu bastion host that can connect to your Kubernetes cluster

  • Permission to access your Kubernetes cluster

Step 1: Install tanzu-cli

#Install tanzu-cli on Ubuntu
sudo apt update
sudo apt install -y ca-certificates curl gpg
sudo mkdir -p /etc/apt/keyrings
curl -fsSL https://storage.googleapis.com/tanzu-cli-installer-packages/keys/TANZU-PACKAGING-GPG-RSA-KEY.gpg | sudo gpg --dearmor -o /etc/apt/keyrings/tanzu-archive-keyring.gpg
echo "deb [signed-by=/etc/apt/keyrings/tanzu-archive-keyring.gpg] https://storage.googleapis.com/tanzu-cli-installer-packages/apt tanzu-cli-jessie main" | sudo tee /etc/apt/sources.list.d/tanzu.list
sudo apt update
sudo apt install -y tanzu-cli
#Verify tanzu-cli installation
tanzu version

(Optional) If you want to configure tanzu shell completion, run the command below and follow the instructions in its output

tanzu completion --help
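
For example, on bash a typical setup looks like the sketch below. The file location is only an example, not something tanzu-cli requires; follow whatever the command above prints for your shell.

#Generate the bash completion script and source it from your shell profile
#(~/.config/tanzu/completion/ is an example location, not required by tanzu-cli)
mkdir -p ~/.config/tanzu/completion
tanzu completion bash > ~/.config/tanzu/completion/tanzu.bash
echo "source ~/.config/tanzu/completion/tanzu.bash" >> ~/.bashrc
source ~/.bashrc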

Step 2: Create cluster-autoscaler deployment from tanzu package using tanzu-cli

  • Switch to your Kubernetes context

kubectl config use-context <your context name>
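
If you are not sure of the exact context name, you can list the contexts available in your kubeconfig first:

#List all contexts in the kubeconfig; the current context is marked with '*'
kubectl config get-contexts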
  • List the available cluster-autoscaler versions in the tanzu package repository and note the version name

tanzu package available list cluster-autoscaler.tanzu.vmware.com
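
(Optional) Before writing the values file in the next step, you can inspect the full values schema for the version you picked; a sketch, assuming the standard flags of the tanzu package plugin:

#Show the configurable values for a specific package version
tanzu package available get cluster-autoscaler.tanzu.vmware.com/<version available> --values-schema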
  • Create a kubeconfig secret named cluster-autoscaler-mgmt-config-secret in the cluster's kube-system namespace

kubectl create secret generic cluster-autoscaler-mgmt-config-secret \
--from-file=value=<path to your kubeconfig file> \
-n kube-system
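
(Optional) Verify the secret was created before moving on:

#Confirm the kubeconfig secret exists in the kube-system namespace
kubectl get secret cluster-autoscaler-mgmt-config-secret -n kube-system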
  • Create a cluster-autoscaler-values.yaml file

arguments:
  ignoreDaemonsetsUtilization: true
  maxNodeProvisionTime: 15m
  maxNodesTotal: 0 #Leave this value as 0. We will define the max and min number of nodes later.
  metricsPort: 8085
  scaleDownDelayAfterAdd: 10m
  scaleDownDelayAfterDelete: 10s
  scaleDownDelayAfterFailure: 3m
  scaleDownUnneededTime: 10m
clusterConfig:
  clusterName: "demo-autoscale-tkg" #adjust here
  clusterNamespace: "demo-autoscale-tkg-ns" #adjust here
paused: false
  • Install cluster-autoscaler

#Adjust the version to one listed above that matches your Kubernetes version
tanzu package install cluster-autoscaler \
--package cluster-autoscaler.tanzu.vmware.com \
--version <version available> \
--values-file 'cluster-autoscaler-values.yaml' \
--namespace tkg-system #please do not change, this is the default namespace for tanzu packages
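
(Optional) Confirm the package reconciled successfully before continuing; a quick check, assuming the package was installed into tkg-system as above:

#Check the install status reported by the tanzu package plugin
tanzu package installed get cluster-autoscaler --namespace tkg-system
#Locate the cluster-autoscaler pod created by the package
kubectl get pods -A | grep cluster-autoscaler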
  • Configure the minimum and maximum number of nodes in your cluster

    • Get the MachineDeployment name and namespace

kubectl get machinedeployments.cluster.x-k8s.io -A
    • Set the cluster-api-autoscaler-node-group-min-size and cluster-api-autoscaler-node-group-max-size annotations

kubectl annotate machinedeployment <machinedeployment name> cluster.x-k8s.io/cluster-api-autoscaler-node-group-min-size=<number min> -n <machinedeployment namespace>
kubectl annotate machinedeployment <machinedeployment name> cluster.x-k8s.io/cluster-api-autoscaler-node-group-max-size=<number max> -n <machinedeployment namespace>
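
For example, with the demo cluster used later in this guide, allowing the worker node pool to scale between 1 and 5 nodes would look like this (the minimum of 1 is only an example; adjust names and numbers to your environment):

kubectl annotate machinedeployment demo-autoscale-tkg-worker-node-pool-1 cluster.x-k8s.io/cluster-api-autoscaler-node-group-min-size=1 -n demo-autoscale-tkg-ns
kubectl annotate machinedeployment demo-autoscale-tkg-worker-node-pool-1 cluster.x-k8s.io/cluster-api-autoscaler-node-group-max-size=5 -n demo-autoscale-tkg-ns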
  • Once both annotations are set, cluster autoscale is enabled for your cluster: the cluster-autoscaler keeps the worker node count between the minimum and maximum you defined
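
(Optional) You can confirm the annotations were applied to the MachineDeployment:

#Print the MachineDeployment annotations and check the min/max size values
kubectl get machinedeployment <machinedeployment name> -n <machinedeployment namespace> -o jsonpath='{.metadata.annotations}'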

Step 3: Test cluster autoscale

  • Get the current number of nodes

kubectl get nodes

  • Create a test-autoscale.yaml file

apiVersion: apps/v1
kind: Deployment
metadata:
  name: nginx
  namespace: default
spec:
  replicas: 2
  selector:
    matchLabels:
      app: nginx
  template:
    metadata:
      labels:
        app: nginx
    spec:
      containers:
      - name: nginx
        image: nginx
        ports:
        - containerPort: 80
      topologySpreadConstraints: #Spread pods across different nodes (with maxSkew 1, the nginx pod count per node can differ by at most 1)
      - maxSkew: 1 
        topologyKey: kubernetes.io/hostname
        whenUnsatisfiable: DoNotSchedule
        labelSelector:
          matchLabels:
            app: nginx
  • Apply the test-autoscale.yaml file to deploy 2 replicas of the nginx pod in the default namespace (this will trigger the creation of a new worker node)

kubectl apply -f test-autoscale.yaml
  • Check the nginx pods and their scheduling events

kubectl get pods
kubectl describe pod nginx-589656b9b5-mcm5j | grep -A 10 Events #replace with your nginx pod name
#Sample events output:
Warning  FailedScheduling  2m53s  default-scheduler   0/2 nodes are available: 1 node(s) didn't match pod topology spread constraints, 1 node(s) had untolerated taint {node-role.kubernetes.io/control-plane: }. preemption: 0/2 nodes are available: 1 No preemption victims found for incoming pod, 1 Preemption is not helpful for scheduling.
Normal   TriggeredScaleUp  2m43s  cluster-autoscaler  pod triggered scale-up: [{MachineDeployment/demo-autoscale-tkg-ns/demo-autoscale-tkg-worker-node-pool-1 1->2 (max: 5)}]
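
You can also follow the cluster-autoscaler's own logs while the scale-up happens; a sketch, assuming the deployment created by the tanzu package has "cluster-autoscaler" in its name (the exact name and namespace may differ in your environment):

#Locate the cluster-autoscaler deployment created by the tanzu package
kubectl get deployments -A | grep cluster-autoscaler
#Tail its logs to watch scale-up and scale-down decisions (use the name and namespace from the output above)
kubectl logs -n <namespace> deployment/<cluster-autoscaler deployment name> -f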
  • Wait for the new node to be provisioned. Once it joins the cluster, the new worker node appears and the new nginx pod moves to Running status; you can watch this with the commands below
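
A simple way to watch this is to keep an eye on the node list until the new worker joins, then check where the nginx pods were scheduled:

#Watch the node list until the new worker node joins and becomes Ready
kubectl get nodes -w
#Check which node each nginx pod landed on
kubectl get pods -o wide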

  • Clean up test resource

kubectl delete -f test-autoscale.yaml
  • Delete cluster-autoscaler deployment (Optional)

If you no longer want your cluster to autoscale, you can delete the cluster-autoscaler deployment using tanzu-cli:

tanzu package installed delete cluster-autoscaler -n tkg-system -y
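
(Optional) To fully clean up, you can also remove the kubeconfig secret and the autoscaler annotations created earlier; a sketch using the same placeholders as above:

#Remove the kubeconfig secret created in Step 2
kubectl delete secret cluster-autoscaler-mgmt-config-secret -n kube-system
#Remove the min/max annotations from the MachineDeployment (a trailing '-' removes an annotation)
kubectl annotate machinedeployment <machinedeployment name> cluster.x-k8s.io/cluster-api-autoscaler-node-group-min-size- -n <machinedeployment namespace>
kubectl annotate machinedeployment <machinedeployment name> cluster.x-k8s.io/cluster-api-autoscaler-node-group-max-size- -n <machinedeployment namespace>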
