Upgrading EKS

1. Pre-Upgrade Considerations

Backup the Cluster

  • Create a backup of critical resources like deployments, services, ConfigMaps, and secrets.
  • Use kubectl to export resource manifests, or a dedicated backup tool such as Velero, to back up all resources.
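
A minimal sketch of both approaches, assuming Velero is already installed and configured in the cluster (the backup directory and backup name below are placeholders):

```shell
# Lightweight backup: export manifests for key resource types to a local file.
BACKUP_DIR="./eks-backup"   # assumption: any writable local path

mkdir -p "$BACKUP_DIR"
kubectl get deployments,services,configmaps,secrets \
  --all-namespaces -o yaml > "$BACKUP_DIR/resources.yaml"

# With Velero installed, take a full cluster backup instead and wait for it:
velero backup create "pre-upgrade-$(date +%Y%m%d)" --wait
```

The kubectl export is quick but captures only the listed resource types; Velero also backs up persistent volumes and is the safer choice before a cluster upgrade.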

Check Kubernetes Version Compatibility

  • Review the Kubernetes version you're upgrading to and make sure all your deployed workloads and third-party tools (e.g., Helm charts, ingress controllers) are compatible with the new version.
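
As a sketch of how to check this from the CLI (cluster name and target version are placeholder assumptions):

```shell
CLUSTER_NAME="my-cluster"   # assumption: your cluster name
TARGET_VERSION="1.29"       # assumption: the version you are upgrading to

# Current control plane version:
aws eks describe-cluster --name "$CLUSTER_NAME" \
  --query 'cluster.version' --output text

# Which versions of a core add-on (here vpc-cni) support the target version:
aws eks describe-addon-versions --kubernetes-version "$TARGET_VERSION" \
  --addon-name vpc-cni \
  --query 'addons[].addonVersions[].addonVersion' --output text
```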

Review AWS EKS Release Notes

  • Check the EKS documentation for version-specific release notes, deprecations, or breaking changes.

Test in a Non-Production Environment

  • Before upgrading in production, test the upgrade process in a staging or development environment to identify potential issues.

2. Upgrade the EKS Control Plane

Step 1: Initiate the Control Plane Upgrade

During a control plane upgrade, AWS replaces the control plane components, such as the API server and controller manager, in a rolling fashion.
  • You may see brief interruptions to the Kubernetes API (for example, dropped kubectl connections), but the API server is not down for the duration of the upgrade.
  • If a control plane component is briefly unavailable, actions that depend on it (such as the controller manager replacing a crashed pod) may be delayed until it recovers.
  • The worker nodes continue to function normally throughout, so existing workloads keep serving traffic and application uptime is preserved.
  • Upgrade the control plane using the AWS CLI:
    • aws eks update-cluster-version --name <cluster_name> --kubernetes-version <new_version>
  • AWS Console: You can also initiate the upgrade via the AWS Management Console by navigating to your EKS cluster and clicking on the "Update" button.

Step 2: Monitor the Upgrade Process

  • Monitor the progress of the control plane upgrade:
    • aws eks describe-update --name <cluster_name> --update-id <update_id>
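
The two steps above can be combined into one sketch that initiates the upgrade, captures the update ID, and polls until the update leaves the InProgress state (cluster name and version are placeholder assumptions):

```shell
CLUSTER_NAME="my-cluster"   # assumption: your cluster name
NEW_VERSION="1.29"          # assumption: your target version

# Initiate the control plane upgrade and capture the update ID.
UPDATE_ID=$(aws eks update-cluster-version \
  --name "$CLUSTER_NAME" --kubernetes-version "$NEW_VERSION" \
  --query 'update.id' --output text)

# Poll the update status every 30 seconds until it is no longer InProgress.
while true; do
  STATUS=$(aws eks describe-update --name "$CLUSTER_NAME" \
    --update-id "$UPDATE_ID" --query 'update.status' --output text)
  echo "Control plane update status: $STATUS"
  [ "$STATUS" != "InProgress" ] && break
  sleep 30
done
```

The terminal status should be Successful; Failed or Cancelled means the control plane was left on its previous version and you should inspect the update's errors before retrying.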

3. Upgrade the Worker Nodes

Once the control plane has been upgraded, the worker nodes must be upgraded as well (that is, the Kubernetes components running on them, such as the kubelet). Because the worker nodes serve live traffic, there are several strategies for upgrading them, such as rolling updates of managed node groups or replacing nodes one at a time.
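
For self-managed nodes, one common strategy is to cordon and drain each node before replacing it, so workloads move off gracefully. A minimal sketch (the node name is a placeholder assumption):

```shell
NODE="ip-10-0-1-23.ec2.internal"   # assumption: an example node name

# Stop new pods from being scheduled onto the node.
kubectl cordon "$NODE"

# Evict existing pods, respecting PodDisruptionBudgets.
kubectl drain "$NODE" --ignore-daemonsets --delete-emptydir-data

# ...replace or upgrade the node, then allow scheduling again:
kubectl uncordon "$NODE"
```

Managed node groups perform this cordon/drain cycle automatically during a rolling update, so the manual steps are mainly relevant for self-managed EC2 nodes.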

Step 1: Identify Your Node Group

  • If you’re using managed node groups, identify the node groups you want to upgrade:
    • aws eks list-nodegroups --cluster-name <cluster_name>

Step 2: Upgrade Managed Node Group

  • Use the following command to upgrade managed node groups:
    • aws eks update-nodegroup-version --cluster-name <cluster_name> --nodegroup-name <nodegroup_name> --kubernetes-version <new_version>
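
Steps 1 and 2 can be combined into a sketch that upgrades every managed node group in turn, waiting for each to return to ACTIVE before moving on (names and version are placeholder assumptions):

```shell
CLUSTER_NAME="my-cluster"   # assumption: your cluster name
NEW_VERSION="1.29"          # assumption: your target version

for NG in $(aws eks list-nodegroups --cluster-name "$CLUSTER_NAME" \
              --query 'nodegroups[]' --output text); do
  echo "Upgrading node group: $NG"
  aws eks update-nodegroup-version \
    --cluster-name "$CLUSTER_NAME" \
    --nodegroup-name "$NG" \
    --kubernetes-version "$NEW_VERSION"

  # Block until the node group finishes its rolling update.
  aws eks wait nodegroup-active \
    --cluster-name "$CLUSTER_NAME" --nodegroup-name "$NG"
done
```

Upgrading node groups one at a time keeps more capacity available than upgrading them all in parallel, at the cost of a longer overall maintenance window.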
4. Post-Upgrade Validation

Step 1: Check the Kubernetes API

  • Ensure the Kubernetes API server is responsive and operating normally:
    • kubectl get nodes

Step 2: Validate Deployed Applications

  • Ensure that all deployments and services are running as expected:
    • kubectl get pods --all-namespaces

Step 3: Validate Node Version

  • Check that all nodes are running the new Kubernetes version:
    • kubectl get nodes -o wide
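
The version check can be scripted so that any straggler node is flagged automatically (the target version is a placeholder assumption):

```shell
NEW_VERSION="1.29"   # assumption: your target version

# Print each node's name and kubelet version.
kubectl get nodes -o jsonpath='{range .items[*]}{.metadata.name}{"\t"}{.status.nodeInfo.kubeletVersion}{"\n"}{end}'

# Warn if any node still reports an older kubelet.
if kubectl get nodes -o jsonpath='{.items[*].status.nodeInfo.kubeletVersion}' \
     | tr ' ' '\n' | grep -v "v${NEW_VERSION}"; then
  echo "WARNING: some nodes are not on v${NEW_VERSION}" >&2
fi
```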

Step 4: Run Functional Tests

  • Perform smoke tests and run your CI/CD pipeline to validate that the cluster and applications are functioning correctly.
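
A minimal smoke test might hit a known health endpoint through a port-forward. The Service name, namespace, and /healthz path below are assumptions; substitute your application's own endpoints:

```shell
# Assumption: the app exposes /healthz behind Service "web" in namespace "prod".
kubectl -n prod port-forward svc/web 8080:80 >/dev/null 2>&1 &
PF_PID=$!
sleep 2

if curl -fsS http://localhost:8080/healthz >/dev/null; then
  echo "smoke test passed"
else
  echo "smoke test FAILED" >&2
fi

kill "$PF_PID" 2>/dev/null
```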

Step 5: Monitor Logs and Metrics

  • Use CloudWatch, Prometheus, or any other monitoring tools to monitor cluster performance and application health post-upgrade.

5. Rollback Plan (If Necessary)

  • If the upgrade introduces issues, be prepared to recover. Note that AWS does not support downgrading an EKS control plane once it has been upgraded, so plan recovery paths that do not depend on a rollback:
    • Recreating the cluster on the previous Kubernetes version if the control plane itself is the problem.
    • Restoring worker capacity by launching node groups pinned to the previous AMI/Kubernetes version.
    • Restoring application resources from the backup taken earlier (e.g., with Velero).

6. Best Practices

  • Always upgrade in a staging environment first to catch potential issues.
  • Upgrade during maintenance windows to minimize business impact.
  • Use blue-green deployments for node upgrades to reduce downtime.
  • Ensure you have adequate monitoring and alerting in place to detect issues early during the upgrade process.