Amazon EKS Upgrade Journey From 1.22 to 1.23

Marcin Cuber
7 min read · Aug 12, 2022

We are now welcoming “The Next Frontier”. Process and considerations while upgrading EKS control-plane to version 1.23.

Overview

The upstream project theme for this release is “The Next Frontier.” Part of the reason for the name is that the Kubernetes release lead is a big Star Trek fan: Star Trek V: The Final Frontier starts and ends in Yosemite National Park, “a place where I grew up going to”. The release itself brings new and graduated enhancements that point to what’s next for Kubernetes. The theme also nods to the Release Team: each Release Team brings new contributors, and for those folks the release is their first contribution in their open-source frontier.

The Kubernetes 1.23 release has numerous important changes, including the Pod Security admission controller moving to beta, updates to the Container Storage Interface (CSI) migration for working with AWS Elastic Block Store (Amazon EBS) volumes, and deprecations and removals of certain beta application programming interfaces (APIs) and features, which you can read about in the features and removals section. I believe the CSI driver updates will require most administrators to take action to ensure a smooth upgrade process. With Pod Security Admission (PSA) in beta, the Pod Security Standards (PSS) can be enforced per namespace, and Horizontal Pod Autoscaler (HPA) v2 reaches GA.

The AWS release of EKS 1.23 still uses the deprecated dockerd runtime by default; the switch to the new runtime will happen in EKS 1.24. Make sure to switch to containerd as soon as possible, because you need to test your workloads before the next upgrade. The latest Amazon Linux 2 EKS optimized AMIs come with containerd support built in. The runtime can be selected for EKS nodes by passing the following flag to the node bootstrap script: --container-runtime containerd. This option is passed to your node through EC2 user data.
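As a rough sketch of how that flag can reach a node, assuming self-managed nodes created from a launch template: the resource name workers, the variable worker_ami_id and the instance type are hypothetical and not taken from my setup, while /etc/eks/bootstrap.sh is the bootstrap script shipped on the Amazon Linux 2 EKS optimized AMI. The cluster reference points at the aws_eks_cluster.cluster resource used in this article.

```hcl
# Hypothetical variable: ID of an Amazon Linux 2 EKS optimized AMI.
variable "worker_ami_id" {
  type = string
}

resource "aws_launch_template" "workers" {
  name_prefix   = "eks-workers-"
  image_id      = var.worker_ami_id
  instance_type = "m5.large" # assumption, pick what fits your workloads

  # User data runs at first boot. The bootstrap script joins the node to
  # the cluster and configures kubelet to use containerd instead of dockerd.
  user_data = base64encode(<<-EOT
    #!/bin/bash
    set -o errexit
    /etc/eks/bootstrap.sh ${aws_eks_cluster.cluster.name} --container-runtime containerd
  EOT
  )
}
```

Rolling this out to a small node group first is probably the safest way to test workloads against containerd before EKS 1.24 makes it the only option.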

Previous Stories and Upgrades

If you are looking at

  • upgrading EKS from 1.21 to 1.22 then check the previous story
  • upgrading EKS from 1.20 to 1.21 check out this story
  • upgrading EKS from 1.19 to 1.20 check out this story
  • upgrading EKS from 1.18 to 1.19 check out this story
  • upgrading EKS from 1.17 to 1.18 check out this story

Kubernetes 1.23 features and removals

Most Important - CSI Driver

The Kubernetes in-tree to Container Storage Interface (CSI) volume migration feature is enabled. This enables the replacement of the existing Kubernetes in-tree storage plugin for Amazon EBS with the corresponding Amazon EBS CSI driver. The feature lets Kubernetes delegate all storage management operations from the in-tree plugin to the CSI driver. If your workloads use existing StorageClass, PersistentVolume, and PersistentVolumeClaim objects, there likely won't be any noticeable change.

If you use Amazon EBS volumes in your cluster, install the Amazon EBS CSI driver in your cluster before you update your cluster to version 1.23. It doesn’t matter whether you install it using EKS addons system or another way. If you don't install the driver before updating your cluster, interruptions to your workloads might occur. For instructions on how to install the Amazon EBS CSI driver on your cluster, see Amazon EBS CSI driver.
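Once the driver is installed, new volumes can also target it directly. The following is a minimal sketch, assuming the hashicorp/kubernetes Terraform provider is already configured against the cluster; the StorageClass name and the parameter values are illustrative choices, not something the migration requires.

```hcl
resource "kubernetes_storage_class_v1" "gp3" {
  metadata {
    name = "gp3" # hypothetical name
  }

  # Provisioner name registered by the Amazon EBS CSI driver.
  storage_provisioner    = "ebs.csi.aws.com"
  reclaim_policy         = "Delete"
  volume_binding_mode    = "WaitForFirstConsumer"
  allow_volume_expansion = true

  # Parameters understood by the EBS CSI driver.
  parameters = {
    type      = "gp3"
    encrypted = "true"
  }
}
```

Existing StorageClass objects that still name the in-tree kubernetes.io/aws-ebs provisioner do not need to be rewritten; with the migration feature on, their operations are handed over to the CSI driver.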

Other

  • As mentioned already, Amazon EKS will end support for dockershim starting in Amazon EKS version 1.24. Starting with that version, Amazon EKS official AMIs will have containerd as the only runtime. In short, move to the new runtime as soon as possible.
  • Kubernetes graduated IPv4/IPv6 dual-stack networking for pods, services, and nodes to general availability. However, Amazon EKS and the Amazon VPC CNI plugin for Kubernetes don't support dual-stack networking. Your clusters can assign IPv4 or IPv6 addresses to pods and services, but can't assign both address types.
  • Kubernetes graduated the Pod Security Admission feature to a beta state. The feature is enabled by default. For more information, see Pod Security Admission in the Kubernetes documentation. Pod Security Admission replaces the Pod Security Policy (PSP) admission controller, which is deprecated and scheduled for removal in Kubernetes version 1.25. The Pod Security Admission controller enforces Pod Security Standards on pods in a namespace based on specific namespace labels that set the enforcement level.
  • The kube-proxy image deployed with clusters is now the minimal base image maintained by Amazon EKS Distro. The image contains minimal packages and doesn’t have shells or package managers.
  • Kubernetes graduated ephemeral containers to beta. Ephemeral containers are temporary containers that run within an existing pod. You can use them to observe the state of pods and containers for troubleshooting and debugging purposes. This is especially useful for interactive troubleshooting when kubectl exec is insufficient because either a container has crashed or a container image doesn’t include debugging utilities. Distroless images are a typical example of images that ship without debugging utilities. For more information, see Debugging with an ephemeral debug container in the Kubernetes documentation.
  • Kubernetes graduated the HorizontalPodAutoscaler autoscaling/v2 API to general availability. The HorizontalPodAutoscaler autoscaling/v2beta2 API is deprecated and will stop being served in Kubernetes 1.26. We now have a stable API version with which we can scale our workloads on both CPU and memory.
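Two of the items above map directly onto configuration: Pod Security Admission is driven by namespace labels, and the HPA v2 API accepts several resource metrics at once. Here is a hedged sketch using the hashicorp/kubernetes Terraform provider (assumed to be configured already); the namespace apps, the Deployment web, and all thresholds are hypothetical.

```hcl
# Namespace opting in to Pod Security Admission through labels.
resource "kubernetes_namespace" "apps" {
  metadata {
    name = "apps" # hypothetical namespace

    labels = {
      # Reject pods that violate the "baseline" Pod Security Standard.
      "pod-security.kubernetes.io/enforce" = "baseline"
      # Only warn and audit against the stricter "restricted" standard.
      "pod-security.kubernetes.io/warn"  = "restricted"
      "pod-security.kubernetes.io/audit" = "restricted"
    }
  }
}

# HorizontalPodAutoscaler on the stable autoscaling/v2 API.
resource "kubernetes_horizontal_pod_autoscaler_v2" "web" {
  metadata {
    name      = "web"
    namespace = kubernetes_namespace.apps.metadata[0].name
  }

  spec {
    min_replicas = 2
    max_replicas = 10

    scale_target_ref {
      api_version = "apps/v1"
      kind        = "Deployment"
      name        = "web" # hypothetical Deployment
    }

    # Scale on average CPU utilization across pods.
    metric {
      type = "Resource"
      resource {
        name = "cpu"
        target {
          type                = "Utilization"
          average_utilization = 70
        }
      }
    }

    # Scale on average memory utilization as well.
    metric {
      type = "Resource"
      resource {
        name = "memory"
        target {
          type                = "Utilization"
          average_utilization = 80
        }
      }
    }
  }
}
```

Starting with warn and audit labels before tightening enforce is a low-risk way to see which workloads would be rejected.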

For the complete set of changes in Kubernetes 1.23, see https://github.com/kubernetes/kubernetes/blob/master/CHANGELOG/CHANGELOG-1.23.md#changelog-since-v1220.

Upgrade your EKS with terraform

This time the upgrade of the control plane took around 9 minutes and didn’t cause any issues. AWS is doing a great job at reducing the time it takes to upgrade the EKS control plane.

I immediately upgraded the worker nodes, which took around 10–20 minutes to join the upgraded EKS cluster. This time depends on how many worker nodes you have and how many pods need to be drained from old nodes.

I personally use Terraform to deploy and upgrade my EKS clusters. Here is an example of the EKS cluster resource.

resource "aws_eks_cluster" "cluster" {
  enabled_cluster_log_types = ["audit"]
  name                      = local.name_prefix
  role_arn                  = aws_iam_role.cluster.arn
  version                   = "1.23"

  vpc_config {
    subnet_ids              = flatten([module.vpc.public_subnets, module.vpc.private_subnets])
    security_group_ids      = []
    endpoint_private_access = true
    endpoint_public_access  = true
  }

  # Envelope encryption of Kubernetes secrets with a KMS key
  encryption_config {
    resources = ["secrets"]
    provider {
      key_arn = module.kms-eks.key_arn
    }
  }

  tags = var.tags
}

For worker nodes I have used official AMI with id: ami-06ea9792058bd9291. I didn’t notice any issues after rotating all nodes.
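AMI IDs are region-specific, so the ID above will not resolve everywhere. As a sketch of an alternative, the recommended Amazon Linux 2 EKS optimized AMI can be looked up from the public SSM parameter that AWS publishes per Kubernetes version; the data source and output names here are hypothetical.

```hcl
# Recommended EKS optimized Amazon Linux 2 AMI for the cluster's
# Kubernetes version, resolved in the provider's current region.
data "aws_ssm_parameter" "eks_ami" {
  name = "/aws/service/eks/optimized-ami/${aws_eks_cluster.cluster.version}/amazon-linux-2/recommended/image_id"
}

output "eks_worker_ami_id" {
  # The data source marks the value as sensitive; an AMI ID is not a secret.
  value = nonsensitive(data.aws_ssm_parameter.eks_ami.value)
}
```

The trade-off is that the AMI can change between applies, so pin the ID if you need node rotations to happen only when you decide.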

The templates I use for creating EKS clusters with Terraform can be found in my GitHub repository at https://github.com/marcincuber/eks/tree/master/terraform-aws

Upgrading Managed EKS Add-ons

In this case the change is trivial and works fine: simply update the version of the add-on. As of this release I use the kube-proxy, CoreDNS and aws-ebs-csi-driver add-ons.

Terraform resources for add-ons

# Note: newer AWS provider releases replace resolve_conflicts with
# resolve_conflicts_on_create and resolve_conflicts_on_update.
resource "aws_eks_addon" "kube_proxy" {
  cluster_name      = aws_eks_cluster.cluster.name
  addon_name        = "kube-proxy"
  addon_version     = "v1.23.7-eksbuild.1"
  resolve_conflicts = "OVERWRITE"
}

resource "aws_eks_addon" "core_dns" {
  cluster_name      = aws_eks_cluster.cluster.name
  addon_name        = "coredns"
  addon_version     = "v1.8.7-eksbuild.2"
  resolve_conflicts = "OVERWRITE"
}

resource "aws_eks_addon" "aws_ebs_csi_driver" {
  cluster_name      = aws_eks_cluster.cluster.name
  addon_name        = "aws-ebs-csi-driver"
  addon_version     = "v1.10.0-eksbuild.1"
  resolve_conflicts = "OVERWRITE"
}
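If you would rather not hardcode add-on versions, the AWS provider can look up the latest version compatible with the cluster. This is only a sketch: managing VPC CNI as an add-on and the resource names below are assumptions for illustration, not part of my setup.

```hcl
# Latest add-on version that AWS publishes for the cluster's Kubernetes version.
data "aws_eks_addon_version" "vpc_cni" {
  addon_name         = "vpc-cni"
  kubernetes_version = aws_eks_cluster.cluster.version
  most_recent        = true
}

resource "aws_eks_addon" "vpc_cni" {
  cluster_name      = aws_eks_cluster.cluster.name
  addon_name        = "vpc-cni"
  addon_version     = data.aws_eks_addon_version.vpc_cni.version
  resolve_conflicts = "OVERWRITE"
}
```

With most_recent set, a plain terraform apply can pick up a new add-on version without any code change, which is convenient but less predictable than pinning.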

After upgrading EKS control-plane

Remember to upgrade core deployments and daemon sets that are recommended for EKS 1.23.

  1. CoreDNS: 1.8.7
  2. kube-proxy: 1.23.7-minimal-eksbuild.1 (note the change to the minimal image variant, which is only mentioned in the official documentation)
  3. VPC CNI: 1.11.2-eksbuild.1
  4. aws-ebs-csi-driver: v1.10.0-eksbuild.1

The above is just a recommendation from AWS. You should look at upgrading all your components to match the 1.23 Kubernetes version. They could include:

  1. calico-node
  2. cluster-autoscaler or Karpenter
  3. kube-state-metrics
  4. metrics-server
  5. csi-secrets-store
  6. calico-typha and calico-typha-horizontal-autoscaler
  7. reloader

Looking ahead

The upstream v1.25 release, planned for later this year, will stop serving beta versions of several Kubernetes APIs whose stable versions have been available for some time. The same v1.25 release will remove PodSecurityPolicy, which is deprecated and won’t graduate to a stable state. See PodSecurityPolicy Deprecation: Past, Present, and Future for more information.

The official list of API removals planned for Kubernetes 1.25 is:

  • The beta CronJob API (batch/v1beta1)
  • The beta EndpointSlice API (discovery.k8s.io/v1beta1)
  • The beta PodDisruptionBudget API (policy/v1beta1)
  • The beta PodSecurityPolicy API (policy/v1beta1)
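Preparing for these removals mostly means moving manifests to the stable API versions ahead of time. As one hedged example, a PodDisruptionBudget on policy/v1 can be declared through the hashicorp/kubernetes Terraform provider (assumed to be configured); the names, namespace and labels are hypothetical.

```hcl
# PodDisruptionBudget served from the stable policy/v1 API
# instead of the policy/v1beta1 version removed in 1.25.
resource "kubernetes_pod_disruption_budget_v1" "app" {
  metadata {
    name      = "app"
    namespace = "default"
  }

  spec {
    # Allow at most one matching pod to be voluntarily evicted at a time.
    max_unavailable = "1"

    selector {
      match_labels = {
        app = "app"
      }
    }
  }
}
```

The same pattern applies to CronJob (batch/v1) and EndpointSlice (discovery.k8s.io/v1), and the stable versions are already served on 1.23, so the switch can be made before the next upgrades.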

Summary and Conclusions

Mega surprise with this release: the cluster upgrade took only ~9 minutes. This is a huge reduction compared to previous upgrades, at least 30 minutes less than the EKS 1.21 upgrade!

I have to say that this was a nice, pleasant and relatively fast upgrade. Yet again, no significant issues. Hope you will have the same easy job to perform. All workloads worked just fine. I didn’t have to modify anything really.

If you are interested in the entire terraform setup for EKS, you can find it on my GitHub -> https://github.com/marcincuber/eks/tree/master/terraform-aws

Hope this article nicely aggregates all the important information around upgrading EKS to version 1.23 and it will help people speed up their task.

Long story short, you hate and/or you love Kubernetes but you still use it ;).

Enjoy Kubernetes!!!

Sponsor Me

Like with any other story on Medium written by me, I performed the tasks documented. This is my own research and issues I have encountered.

Thanks for reading everybody. Marcin Cuber

Marcin Cuber

Technical Lead/Principal Devops Engineer and AWS Community Builder