Amazon EKS Upgrade Journey From 1.23 to 1.24

Marcin Cuber
7 min read · Nov 17, 2022

We are now welcoming “Stargazer”. Process and considerations while upgrading the EKS control plane to version 1.24.

Overview

EKS finally announced support for Kubernetes version 1.24. I am mega excited for everybody to experience the power of the “Stargazer” release. The Kubernetes team chose the name “Stargazer” for this release to honor the work done by hundreds of contributors across the globe: “Every single contributor is a star in our sky…”

The long-awaited move to supporting only the containerd runtime is finally here. You can no longer use the dockerd runtime with EKS 1.24, as the worker node AMI doesn’t contain it. This is probably the most significant change in this release. Starting with version 1.24 of Kubernetes, the Amazon Machine Images (AMIs) provided by Amazon EKS only support the containerd runtime. The EKS optimized AMIs for version 1.24 no longer support passing the bootstrap script flags enable-docker-bridge, docker-config-json, and container-runtime.

Important: Before upgrading your worker nodes to Kubernetes 1.24, you must remove all references to these flags.
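A minimal sketch of a cleaned-up node bootstrap in Terraform, assuming you run worker nodes from your own launch template. The resource name workers, the variable worker_ami_id and the node label are hypothetical; the cluster reference points at the aws_eks_cluster resource shown later in this article.

```hcl
# Hypothetical variable: ID of an EKS optimized 1.24 AMI for your region.
variable "worker_ami_id" {
  type = string
}

# Hypothetical launch template for worker nodes.
resource "aws_launch_template" "workers" {
  name_prefix = "eks-workers-"
  image_id    = var.worker_ami_id

  # On 1.24 AMIs the bootstrap call must not pass --enable-docker-bridge,
  # --docker-config-json or --container-runtime.
  user_data = base64encode(<<-EOT
    #!/bin/bash
    set -o errexit
    /etc/eks/bootstrap.sh ${aws_eks_cluster.cluster.name} \
      --kubelet-extra-args '--node-labels=role=worker'
  EOT
  )
}
```

If your existing user data passes any of the three removed flags, deleting them before rotating nodes should be enough; the remaining bootstrap arguments can stay as they are.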

The Open Container Initiative (OCI) images generated by Docker build tools will continue to run in your Amazon EKS clusters as before. As an end-user of Kubernetes, you will not experience significant changes.

For more information, see Kubernetes is Moving on From Dockershim: Commitments and Next Steps on the Kubernetes Blog.

At the time of writing this story, EKS supports version 1.24.7 of Kubernetes.

Previous Stories and Upgrades

If you are looking at

  • upgrading EKS from 1.22 to 1.23 check out this story
  • upgrading EKS from 1.21 to 1.22 check out this story
  • upgrading EKS from 1.20 to 1.21 check out this story
  • upgrading EKS from 1.19 to 1.20 check out this story
  • upgrading EKS from 1.18 to 1.19 check out this story
  • upgrading EKS from 1.17 to 1.18 check out this story

Kubernetes 1.24 features and removals

Admission controllers enabled

CertificateApproval, CertificateSigning, CertificateSubjectRestriction, DefaultIngressClass, DefaultStorageClass, DefaultTolerationSeconds, ExtendedResourceToleration, LimitRanger, MutatingAdmissionWebhook, NamespaceLifecycle, NodeRestriction, PersistentVolumeClaimResize, Priority, PodSecurityPolicy, ResourceQuota, RuntimeClass, ServiceAccount, StorageObjectInUseProtection, TaintNodesByCondition, and ValidatingAdmissionWebhook.

Important changes

Starting with Kubernetes 1.24, new beta APIs are no longer enabled in clusters by default. Existing beta APIs and new versions of existing beta APIs continue to be enabled. Amazon EKS will have exactly the same behaviour. For more information, see KEP-3136: Beta APIs Are Off by Default on GitHub.

In Kubernetes 1.23 and earlier, kubelet serving certificates with unverifiable IP and DNS Subject Alternative Names (SANs) were still issued automatically, with the unverifiable SANs omitted from the provisioned certificate. Starting from version 1.24, kubelet serving certificates aren't issued if any SAN can't be verified. This prevents the kubectl exec and kubectl logs commands from working. For more information, see Certificate signing considerations for Kubernetes 1.24 and later clusters.

New Features

Topology Aware Hints. It is standard practice, or at least it should be, to deploy Kubernetes workloads to nodes running across different availability zones (AZs) for resiliency and fault isolation. While this architecture provides great benefits, in many scenarios it also results in cross-AZ data transfer charges. With EKS you can now use Topology Aware Hints to keep Kubernetes service traffic within the same availability zone. Topology Aware Hints are a flexible mechanism for giving hints to components such as kube-proxy, which use them to influence how traffic is routed within the cluster.

Pod Security Policy (PSP) was deprecated in Kubernetes version 1.21 and will be removed in version 1.25. PSPs are being replaced by Pod Security Admission (PSA), a built-in admission controller that implements the security controls outlined in the Pod Security Standards (PSS). PSA and PSS both reached beta feature status in Kubernetes version 1.23 and are now enabled in EKS. When implementing PSA and PSS, please review this blog post.

Kyverno and OPA/Gatekeeper from the Kubernetes ecosystem are great alternatives to PSA. I personally use Kyverno; it is a separate open-source policy operator which in my opinion is the best one out there.

Simplified scaling for EKS Managed Node Groups. The upstream Cluster Autoscaler project now simplifies scaling of EKS managed node groups, and you can finally scale a group to and from zero nodes. A long-awaited feature is finally here.
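A minimal sketch of a managed node group that is allowed to sit at zero nodes, under a few assumptions: the group name batch, the variable node_role_arn and the maximum of 5 nodes are hypothetical, and the cluster and VPC references point at the resources used in the Terraform example later in this article. As far as I understand, the Cluster Autoscaler also needs to be on a 1.24 release and allowed to describe the node group for scale-from-zero to work.

```hcl
# Hypothetical variable: ARN of the IAM role assumed by the worker nodes.
variable "node_role_arn" {
  type = string
}

resource "aws_eks_node_group" "batch" {
  cluster_name    = aws_eks_cluster.cluster.name
  node_group_name = "batch"
  node_role_arn   = var.node_role_arn
  subnet_ids      = module.vpc.private_subnets

  scaling_config {
    min_size     = 0 # the group may be scaled down to no nodes at all
    desired_size = 0
    max_size     = 5
  }

  # Let the autoscaler own the desired size so Terraform does not reset it.
  lifecycle {
    ignore_changes = [scaling_config[0].desired_size]
  }
}
```

Ignoring desired_size is a design choice: without it, every terraform apply would try to push the group back to the size written in code.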

Change to certificates controller. As noted under important changes above, in Kubernetes 1.23 and earlier kubelet serving certificates were issued even when some IP and DNS Subject Alternative Names (SANs) could not be verified. Beginning with version 1.24, no kubelet serving certificate is issued if any SAN cannot be verified, which prevents the kubectl exec and kubectl logs commands from working. Please follow the steps outlined in the EKS user guide to determine whether you are impacted by this issue, and to find the recommended workaround and long-term resolution.

Kubelet Credential Provider Graduates to Beta. If you are using Amazon EKS Anywhere, Kubernetes 1.24 includes kubelet support for image credential providers. You can configure --image-credential-provider-config and --image-credential-provider-bin-dir flags for the kubelet to request credentials for a container registry dynamically, as opposed to storing static credentials on disk. Please refer to this link for a sample kubelet credential provider config file to use Amazon ECR plugins.

Upgrade your EKS with Terraform

This time the upgrade of the control plane took around 10 minutes and didn’t cause any issues. AWS are doing a great job at reducing the time it takes to upgrade the EKS control plane.

I immediately upgraded the worker nodes, which took around 10–20 minutes to join the upgraded EKS cluster. This time depends on how many worker nodes you have and how many pods need to be drained from the old nodes.

In general, the full upgrade process (control plane + worker nodes) took around 22 minutes. Really good time I would say.

I personally use Terraform to deploy and upgrade my EKS clusters. Here is an example of the EKS cluster resource.

resource "aws_eks_cluster" "cluster" {
  enabled_cluster_log_types = ["audit"]
  name                      = local.name_prefix
  role_arn                  = aws_iam_role.cluster.arn
  version                   = "1.24" # bump this value to upgrade the control plane

  vpc_config {
    subnet_ids              = flatten([module.vpc.public_subnets, module.vpc.private_subnets])
    security_group_ids      = []
    endpoint_private_access = true
    endpoint_public_access  = true
  }

  # Envelope-encrypt Kubernetes secrets with a KMS key
  encryption_config {
    resources = ["secrets"]
    provider {
      key_arn = module.kms-eks.key_arn
    }
  }

  tags = var.tags
}

For worker nodes I used the official AMI with ID ami-0c5a6a57a98e2c797. I didn’t notice any issues after rotating all nodes.

Templates I use for creating EKS clusters using Terraform can be found in my GitHub repository at https://github.com/marcincuber/eks/tree/master/terraform-aws
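AMI IDs are region-specific, so a hard-coded ID will not carry over to other regions. A minimal sketch of looking up the recommended EKS optimized Amazon Linux 2 AMI for 1.24 through the public SSM parameter that AWS publishes; the names eks_ami and eks_worker_ami_id are hypothetical, and the ID returned depends on your region and on when you run it, so it may differ from the one above.

```hcl
data "aws_ssm_parameter" "eks_ami" {
  name = "/aws/service/eks/optimized-ami/1.24/amazon-linux-2/recommended/image_id"
}

locals {
  # The SSM data source marks its value as sensitive; an AMI ID is not secret.
  eks_worker_ami_id = nonsensitive(data.aws_ssm_parameter.eks_ami.value)
}

output "eks_worker_ami_id" {
  value = local.eks_worker_ami_id
}
```

The trade-off is that the lookup follows whatever AWS currently recommends, so nodes may be replaced on a later apply; pin the ID if you want node rotations to be explicit.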

Upgrading Managed EKS Add-ons

In this case the change is trivial and works fine: simply update the version of the add-on. As of this release, I use kube-proxy, CoreDNS and ebs-csi-driver as managed add-ons.

Terraform resources for add-ons

# The cluster resource above is declared without count, so it is referenced without an index.
resource "aws_eks_addon" "kube_proxy" {
  cluster_name      = aws_eks_cluster.cluster.name
  addon_name        = "kube-proxy"
  addon_version     = "1.24.7-eksbuild.2"
  resolve_conflicts = "OVERWRITE"
}

resource "aws_eks_addon" "core_dns" {
  cluster_name      = aws_eks_cluster.cluster.name
  addon_name        = "coredns"
  addon_version     = "v1.8.7-eksbuild.3"
  resolve_conflicts = "OVERWRITE"
}

resource "aws_eks_addon" "aws_ebs_csi_driver" {
  cluster_name      = aws_eks_cluster.cluster.name
  addon_name        = "aws-ebs-csi-driver"
  addon_version     = "v1.13.0-eksbuild.1"
  resolve_conflicts = "OVERWRITE"
}
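Instead of hard-coding add-on versions, they can also be looked up. A minimal sketch using the aws_eks_addon_version data source, assuming an AWS provider release recent enough to include it; the output name is hypothetical, and the version returned is whatever EKS reports as the default at the time, which may not match the versions pinned above.

```hcl
data "aws_eks_addon_version" "kube_proxy" {
  addon_name         = "kube-proxy"
  kubernetes_version = aws_eks_cluster.cluster.version
  most_recent        = false # false returns the default version, true the newest one
}

output "kube_proxy_default_version" {
  value = data.aws_eks_addon_version.kube_proxy.version
}
```

The value of data.aws_eks_addon_version.kube_proxy.version could then be passed to addon_version on the matching aws_eks_addon resource, at the cost of add-on upgrades happening implicitly when the control plane version changes.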

After upgrading EKS control-plane

Remember to upgrade core deployments and daemon sets that are recommended for EKS 1.24.

  1. CoreDNS: 1.8.7
  2. Kube-proxy: 1.24.7-minimal-eksbuild.2 (note the change to the minimal version, which is only mentioned in the official documentation)
  3. VPC CNI: 1.11.4-eksbuild.1 (version 1.12 is available but 1.11.4 is the recommended one)
  4. aws-ebs-csi-driver: v1.13.0-eksbuild.1

The above is just a recommendation from AWS. You should look at upgrading all your components to match the 1.24 Kubernetes version. They could include:

  1. calico-node
  2. cluster-autoscaler or Karpenter
  3. kube-state-metrics
  4. metrics-server
  5. csi-secrets-store
  6. calico-typha and calico-typha-horizontal-autoscaler
  7. reloader

Looking ahead

Kubernetes v1.25, which EKS does not support yet at the time of writing, stops serving beta versions of several Kubernetes APIs whose stable versions have been available for some time. The same v1.25 release removes PodSecurityPolicy, which is deprecated and won’t graduate to a stable state. See PodSecurityPolicy Deprecation: Past, Present, and Future for more information.

The official list of API removals planned for Kubernetes 1.25 is:

  • The beta CronJob API (batch/v1beta1)
  • The beta EndpointSlice API (discovery.k8s.io/v1beta1)
  • The beta PodDisruptionBudget API (policy/v1beta1)
  • The beta PodSecurityPolicy API (policy/v1beta1)

Summary and Conclusions

Mega surprise with this release: the upgrade of the cluster only took ~9 minutes. This is a huge reduction in time over previous upgrades, at least 30 minutes less than the EKS 1.21 upgrade!

I have to say that this was a nice, pleasant and relatively fast upgrade. Yet again, no significant issues. Hope you will have the same easy job to perform. All workloads worked just fine. I didn’t have to modify anything really.

If you are interested in the entire terraform setup for EKS, you can find it on my GitHub -> https://github.com/marcincuber/eks/tree/master/terraform-aws

Hope this article nicely aggregates all the important information around upgrading EKS to version 1.24 and it will help people speed up their task.

Long story short, you hate and/or you love Kubernetes but you still use it ;).

Enjoy Kubernetes!!!

Sponsor Me

Like with any other story on Medium written by me, I performed the tasks documented. This is my own research and issues I have encountered.

Thanks for reading everybody. Marcin Cuber

Marcin Cuber

Technical Lead/Principal Devops Engineer and AWS Community Builder