Amazon EKS Upgrade Journey From 1.17 to 1.18

Marcin Cuber
4 min readOct 14, 2020

Process and considerations while upgrading EKS control-plane to version 1.18

Overview

AWS recently released support for Amazon Elastic Kubernetes Service (EKS) version 1.18. This release introduces several new features and deprecates relatively few options. In this post I will go through the services that you must check, and upgrade if necessary, before even thinking about upgrading EKS.

If you are looking to upgrade EKS from 1.15 to 1.16, check out my previous story. For upgrades from 1.16 to 1.17, check out this story.

Kubernetes 1.18 features

  • Topology Manager has reached beta status. This feature allows the CPU and Device Manager to coordinate resource allocation decisions, optimising for low latency with machine learning and analytics workloads. For more information, see Control Topology Management Policies on a node in the Kubernetes documentation.
  • Server-side Apply is updated with a new beta version. This feature tracks and manages changes to fields of all new Kubernetes objects, allowing you to know what changed your resources and when. For more information, see What is Server-side Apply? in the Kubernetes documentation.
  • A new pathType field and a new IngressClass resource has been added to the Ingress specification. These features make it simpler to customise Ingress configuration, and are supported by the ALB Ingress Controller. For more information, see Improvements to the Ingress API in Kubernetes 1.18 in the Kubernetes documentation.
  • Configurable horizontal pod autoscaling behaviour. For more information, see Support for configurable scaling behaviour in the Kubernetes documentation.
  • Pod Topology Spread has reached beta status. You can use topology spread constraints to control how pods are spread across your cluster among failure-domains such as Regions, zones, nodes, and other user-defined topology domains. This can help to achieve high availability as well as efficient resource utilisation. For more information, see Pod Topology Spread Constraints in the Kubernetes documentation.
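To make the Ingress changes concrete, here is a minimal sketch of an IngressClass paired with an Ingress that uses the new pathType field. Resource names and the ALB controller string are illustrative; note that in 1.18 both objects are still served under networking.k8s.io/v1beta1.

```yaml
# Illustrative only: IngressClass plus an Ingress using pathType (Kubernetes 1.18)
apiVersion: networking.k8s.io/v1beta1
kind: IngressClass
metadata:
  name: alb
spec:
  controller: ingress.k8s.aws/alb   # assumption: ALB ingress controller value
---
apiVersion: networking.k8s.io/v1beta1
kind: Ingress
metadata:
  name: example-app
spec:
  ingressClassName: alb             # replaces the kubernetes.io/ingress.class annotation
  rules:
    - http:
        paths:
          - path: /api
            pathType: Prefix        # new in 1.18: Exact, Prefix or ImplementationSpecific
            backend:
              serviceName: example-app
              servicePort: 80
```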

Deprecated APIs

The following deprecated APIs can no longer be served in EKS 1.18:

  • All resources under apps/v1beta1 and apps/v1beta2 — must use apps/v1
  • Resources such as daemonsets, deployments and replicasets under extensions/v1beta1 — must use apps/v1 instead
  • networkpolicies resources under extensions/v1beta1 — must use networking.k8s.io/v1 instead
  • podsecuritypolicies resources under extensions/v1beta1 — must use policy/v1beta1 instead
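For example, a Deployment that previously lived under extensions/v1beta1 needs both the new apiVersion and an explicit spec.selector, which the old API versions used to default from the pod template labels. A sketch with illustrative names:

```yaml
apiVersion: apps/v1        # was: extensions/v1beta1 (or apps/v1beta1, apps/v1beta2)
kind: Deployment
metadata:
  name: example-app
spec:
  replicas: 2
  selector:                # mandatory in apps/v1; the old APIs defaulted it
    matchLabels:
      app: example-app
  template:
    metadata:
      labels:
        app: example-app   # must match spec.selector
    spec:
      containers:
        - name: example-app
          image: nginx:1.19
```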

Upgrade with Terraform

This time the control-plane upgrade took around 35 minutes and didn’t cause any issues. I noticed that the control plane wasn’t available immediately afterwards, so the upgraded worker nodes took around 2 minutes to join the upgraded EKS cluster.

I personally use Terraform to deploy and upgrade my EKS clusters. Here is an example of the EKS cluster resource.

resource "aws_eks_cluster" "cluster" {
  enabled_cluster_log_types = ["audit"]
  name                      = local.name_prefix
  role_arn                  = aws_iam_role.cluster.arn
  version                   = "1.18"

  vpc_config {
    subnet_ids              = flatten([module.vpc.public_subnets, module.vpc.private_subnets])
    security_group_ids      = []
    endpoint_private_access = "true"
    endpoint_public_access  = "true"
  }

  encryption_config {
    resources = ["secrets"]

    provider {
      key_arn = module.kms-eks.key_arn
    }
  }

  tags = var.tags
}

The template I use for creating EKS clusters with Terraform can be found in my GitHub repository: https://github.com/marcincuber/eks/tree/master/terraform-aws

After upgrading EKS control-plane

Remember to upgrade core deployments and daemon sets that are recommended for EKS 1.18.

The above is just a recommendation from AWS. You should look at upgrading all your components to match the 1.18 version. These could include:

  1. calico-node
  2. cluster-autoscaler
  3. kube-state-metrics
  4. calico-typha and calico-typha-horizontal-autoscaler
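As a rule of thumb, cluster-autoscaler’s minor version should track the cluster’s minor version. A small sketch of deriving a matching image tag — the registry path and the patch level v1.18.3 are assumptions, so check the cluster-autoscaler releases page for the latest 1.18.x build:

```shell
# Sketch: derive a cluster-autoscaler image tag matching the cluster's minor version.
# Patch level and registry path are assumptions -- verify against the releases page.
CLUSTER_MINOR="1.18"
CA_TAG="v${CLUSTER_MINOR}.3"
echo "eu.gcr.io/k8s-artifacts-prod/autoscaling/cluster-autoscaler:${CA_TAG}"
```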

CoreDNS 1.7.0 known issue

Problem:

It is advised to use CoreDNS 1.7.0. When doing so, you are likely to get CrashLoopBackOff errors due to:

plugin/kubernetes: /etc/coredns/Corefile:6 - Error during parsing: unknown property 'upstream'

Solution:

The upstream option is no longer supported in CoreDNS v1.7.0. This means you can update your CoreDNS ConfigMap to the following, which will solve the issue:

apiVersion: v1
kind: ConfigMap
metadata:
  name: coredns
  namespace: kube-system
data:
  Corefile: |
    .:53 {
        errors
        health
        kubernetes cluster.local in-addr.arpa ip6.arpa {
          pods insecure
          fallthrough in-addr.arpa ip6.arpa
        }
        prometheus :9153
        forward . /etc/resolv.conf
        cache 30
        loop
        reload
        loadbalance
    }
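If you would rather patch the live ConfigMap than replace it wholesale, the whole fix amounts to deleting the upstream line. Here is a sketch of that transformation on a Corefile snippet:

```shell
# Sketch: the CoreDNS 1.7.0 fix is just dropping the deprecated 'upstream' line.
corefile='kubernetes cluster.local in-addr.arpa ip6.arpa {
  pods insecure
  upstream
  fallthrough in-addr.arpa ip6.arpa
}'
cleaned=$(echo "$corefile" | sed '/^[[:space:]]*upstream[[:space:]]*$/d')
echo "$cleaned"
```

In practice you would pipe `kubectl -n kube-system get configmap coredns -o yaml` through the same sed expression, apply the result, and then restart the coredns deployment so the pods pick up the change.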

Kube-proxy

AWS decided to change the naming convention again. So if you previously used an image such as the following:

kubectl get daemonset kube-proxy --namespace kube-system -o=jsonpath='{$.spec.template.spec.containers[:1].image}'

602401143452.dkr.ecr.eu-west-1.amazonaws.com/eks/kube-proxy:v1.17.7

You now need to set the image in your YAML to the following:

602401143452.dkr.ecr.eu-west-1.amazonaws.com/eks/kube-proxy:v1.18.8-eksbuild.1

The images now carry the additional suffix `eksbuild.1` in the tag value.
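The new image URI can be assembled from the region and kube-proxy version; here is a small sketch following the eu-west-1 example above (602401143452 is the EKS add-on image account for most commercial regions; other regions and partitions use different account IDs):

```shell
# Sketch: build the new-style kube-proxy image URI for EKS 1.18.
REGION="eu-west-1"
KUBE_PROXY_VERSION="v1.18.8"
IMAGE="602401143452.dkr.ecr.${REGION}.amazonaws.com/eks/kube-proxy:${KUBE_PROXY_VERSION}-eksbuild.1"
echo "${IMAGE}"
```

With the URI in hand, `kubectl set image daemonset/kube-proxy -n kube-system kube-proxy=$IMAGE` updates the DaemonSet in place.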

The second part of the kube-proxy changes is to ensure that nodeSelectorTerms are set correctly. Here is the updated and working version:

nodeSelectorTerms:
  - matchExpressions:
      - key: kubernetes.io/os
        operator: In
        values:
          - linux
      - key: kubernetes.io/arch
        operator: In
        values:
          - amd64
          - arm64
      - key: "beta.kubernetes.io/arch"
        operator: In
        values:
          - amd64
          - arm64
      - key: eks.amazonaws.com/compute-type
        operator: NotIn
        values:
          - fargate
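For context, the fragment above sits inside the kube-proxy DaemonSet’s pod template, nested under node affinity; a sketch of the surrounding keys:

```yaml
# Where nodeSelectorTerms lives inside the kube-proxy DaemonSet manifest
spec:
  template:
    spec:
      affinity:
        nodeAffinity:
          requiredDuringSchedulingIgnoredDuringExecution:
            nodeSelectorTerms:
              - matchExpressions:
                  - key: kubernetes.io/os
                    operator: In
                    values:
                      - linux
```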

Summary

I have to say that this was a pleasant and fast upgrade. I didn’t come across any issues.

If you are interested in the entire terraform setup for EKS, you can find it on my GitHub -> https://github.com/marcincuber/eks/tree/master/terraform-aws

I hope this article aggregates all the important information around upgrading EKS and helps speed up your upgrade.

Enjoy Kubernetes!!!


Marcin Cuber

Technical Lead/Principal DevOps Engineer and AWS Community Builder