Kubernetes GitLab Runners on Amazon EKS

Marcin Cuber
9 min read · Jun 24, 2020


Find out how to configure GitLab Runners efficiently and trouble-free on Amazon EKS, following a GitOps strategy.

Overview

Custom GitLab Runners in AWS are probably the best feature of GitLab, especially when you use the managed cloud GitLab server (gitlab.com). I don’t believe such functionality is offered by any other well-known provider such as CircleCI, TravisCI, TeamCity or the disgusting Jenkins. GitLab Runner is the agent that runs your jobs and sends the results back to GitLab. It is used in conjunction with GitLab CI/CD, the open-source continuous integration service included with GitLab that coordinates the jobs.

I am using the Kubernetes platform to spin up all my GitLab Runners, which is proving to be a very efficient and fast approach. Previously, the runners were running on AWS EC2 instances, which was very challenging to configure and meant it took a long time to get new runners when needed.

In this article, I will provide details and templates showing how my GitLab Runners are implemented. I will also outline the setup of the Amazon EKS cluster and the GitOps strategy, using Flux, that helps hugely with upgrades of the runners.

Note that all configuration details for the GitLab Runners and EKS can be found in the repositories linked throughout this article.

Cluster configuration details

I am currently running the latest available version of Amazon EKS, 1.16. The full configuration of the cluster can be found in my GitHub EKS repository.

My cluster is configured with the following add-ons to make the GitLab Runners work nicely:

  • External Secrets — used to pull the GitLab token from the AWS SSM Parameter Store and populate it as a Kubernetes Secret
  • Reloader — extremely useful, as it watches for changes in ConfigMaps and Secrets and performs rolling upgrades on the Pods that use them

All my Kubernetes add-ons are deployed in a GitOps way using Flux. Flux is a tool that automatically ensures that the state of a cluster matches the config in Git. It uses an operator in the cluster to trigger deployments inside Kubernetes, which means you don’t need a separate CD tool. It monitors all relevant image repositories, detects new images, triggers deployments and updates the desired running configuration based on that (and a configurable policy). If you want to learn more about it, head to https://github.com/fluxcd/flux.
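The runner templates later in this article are plain manifests that Flux simply applies from Git. If you also want Flux v1 to roll out new image tags on its own, that is done with workload annotations. As a hedged sketch (the glob pattern below is only an example policy, not necessarily what I run), the runner Deployment metadata could opt in like this:

# Optional sketch (Flux v1): annotations added to the runner Deployment metadata
# so Flux bumps the image automatically when a new matching tag is published.
metadata:
  annotations:
    fluxcd.io/automated: "true"
    # <container name> -> tag filter; this glob is illustrative only
    fluxcd.io/tag.gitlab-runner-gitlab-runner: glob:alpine-v13.*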

Flux is a fantastic tool and, in my opinion, much better than Helm, which I avoid at all costs. Helm is simply another wrapper around a wrapper, which to me doesn’t make any sense. In fact, I am hugely surprised that people waste their time fighting Helm issues instead of writing clean, fully working YAML templates for their applications. If you follow the GitOps approach you can also get nice release versions of your code in GitLab or GitHub. In case you want to find out more about the Helm madness, please head to https://helm.sh/

Current limitations of GitLab Runners in Kubernetes

GitLab Runners don’t work very well with the HPA (Horizontal Pod Autoscaler). The metrics server and/or Prometheus custom metrics don’t provide good enough metrics to make it work either. I confirmed this with GitLab support. So, even though the GitLab docs and code suggest that it is possible (https://gitlab.com/gitlab-org/charts/gitlab-runner/blob/master/values.yaml#L382), in practice it doesn’t work, so simply don’t waste your time.

My GitOps configuration allows you to easily roll out version upgrades of the runners, easily change the number of runner replicas required per GitLab group, and make many other simple changes that need only a minimal modification to a YAML template; Flux handles the deployment.
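As a concrete (and purely hypothetical) illustration, scaling the runners for a given GitLab group or pointing them at different CI tags means editing a handful of fields in the Deployment shown in the next section and committing the change; Flux applies the commit and rolls the pods:

# Sketch only: the fields you would typically edit in Git, not a complete manifest.
# Names and values are illustrative.
spec:
  replicas: 5                      # scale the number of runner pods for this group
  template:
    spec:
      containers:
      - name: gitlab-runner-gitlab-runner
        env:
        - name: RUNNER_TAG_LIST
          value: "group-b"         # jobs select this runner via CI tags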

Gitlab Runner Configs

We have reached the essence of this article. Here I am going to share the templates that I use to configure the Deployment, ConfigMap, Namespace, gitlab-token ExternalSecret and RBAC.

Note that all templates are available in my GitHub repository: https://github.com/marcincuber/kubernetes-gitlab-runner

Deployment

apiVersion: apps/v1
kind: Deployment
metadata:
  labels:
    app: gitlab-runner-gitlab-runner
  name: gitlab-runner-gitlab-runner
  namespace: gitlab
  annotations:
    reloader.stakater.com/auto: "true"
spec:
  progressDeadlineSeconds: 600
  replicas: 10
  selector:
    matchLabels:
      app: gitlab-runner-gitlab-runner
  strategy:
    rollingUpdate:
      maxSurge: 50%
      maxUnavailable: 50%
    type: RollingUpdate
  template:
    metadata:
      labels:
        app: gitlab-runner-gitlab-runner
    spec:
      containers:
      - command:
        - /bin/bash
        - /scripts/entrypoint
        env:
        - name: CI_SERVER_URL
          value: https://gitlab.com/
        - name: CLONE_URL
          value: https://gitlab.com/
        - name: RUNNER_REQUEST_CONCURRENCY
          value: "1"
        - name: RUNNER_EXECUTOR
          value: kubernetes
        - name: REGISTER_LOCKED
          value: "true"
        - name: RUNNER_TAG_LIST
          value: "test"
        - name: RUNNER_OUTPUT_LIMIT
          value: "4096"
        - name: KUBERNETES_IMAGE
          value: ubuntu:20.04
        - name: KUBERNETES_PRIVILEGED
          value: "true"
        - name: KUBERNETES_NAMESPACE
          value: "gitlab"
        - name: KUBERNETES_POLL_TIMEOUT
          value: "180"
        - name: KUBERNETES_POLL_INTERVAL
          value: "3"
        - name: KUBERNETES_CPU_LIMIT
          value: "2"
        - name: KUBERNETES_MEMORY_LIMIT
          value: "3.5Gi"
        - name: KUBERNETES_CPU_REQUEST
          value: "1.5"
        - name: KUBERNETES_MEMORY_REQUEST
          value: "3.5Gi"
        - name: KUBERNETES_SERVICE_ACCOUNT
        - name: KUBERNETES_SERVICE_CPU_LIMIT
        - name: KUBERNETES_SERVICE_MEMORY_LIMIT
        - name: KUBERNETES_SERVICE_CPU_REQUEST
        - name: KUBERNETES_SERVICE_MEMORY_REQUEST
        - name: KUBERNETES_HELPER_CPU_LIMIT
        - name: KUBERNETES_HELPER_MEMORY_LIMIT
        - name: KUBERNETES_HELPER_CPU_REQUEST
          value: "100m"
        - name: KUBERNETES_HELPER_MEMORY_REQUEST
          value: "128Mi"
        - name: KUBERNETES_HELPER_IMAGE
        - name: KUBERNETES_PULL_POLICY
          value: "if-not-present"
        - name: CACHE_TYPE
          value: s3
        - name: CACHE_PATH
          value: gitlab-caches
        - name: CACHE_SHARED
          value: "true"
        - name: CACHE_S3_SERVER_ADDRESS
          value: s3.amazonaws.com
        - name: CACHE_S3_BUCKET_NAME
          value: kubernetes-gitlab-cache
        - name: CACHE_S3_BUCKET_LOCATION
          value: eu-west-1
        - name: RUNNER_CACHE_DIR
          value: /opt/gitlab-runner/cache/
        - name: RUNNER_BUILDS_DIR
          value: /opt/gitlab-runner/builds/
        image: gitlab/gitlab-runner:alpine-v13.1.0
        imagePullPolicy: IfNotPresent
        lifecycle:
          preStop:
            exec:
              command:
              - /entrypoint
              - unregister
              - --all-runners
        livenessProbe:
          exec:
            command:
            - /bin/bash
            - /scripts/check-live
        name: gitlab-runner-gitlab-runner
        ports:
        - containerPort: 9252
          name: metrics
          protocol: TCP
        readinessProbe:
          exec:
            command:
            - /usr/bin/pgrep
            - gitlab.*runner
        volumeMounts:
        - mountPath: /secrets
          name: runner-secrets
        - mountPath: /home/gitlab-runner/.gitlab-runner
          name: etc-gitlab-runner
        - mountPath: /scripts
          name: scripts
        resources:
          requests:
            cpu: "50m"
            memory: "200Mi"
      dnsPolicy: ClusterFirst
      initContainers:
      - command:
        - sh
        - /config/configure
        env:
        - name: CI_SERVER_URL
          value: https://gitlab.com/
        - name: CLONE_URL
          value: https://gitlab.com/
        - name: RUNNER_REQUEST_CONCURRENCY
          value: "1"
        - name: RUNNER_EXECUTOR
          value: kubernetes
        - name: REGISTER_LOCKED
          value: "true"
        - name: RUNNER_TAG_LIST
          value: "test"
        - name: RUNNER_OUTPUT_LIMIT
          value: "4096"
        - name: KUBERNETES_IMAGE
          value: ubuntu:20.04
        - name: KUBERNETES_PRIVILEGED
          value: "true"
        - name: KUBERNETES_NAMESPACE
          value: "gitlab"
        - name: KUBERNETES_POLL_TIMEOUT
          value: "180"
        - name: KUBERNETES_POLL_INTERVAL
          value: "3"
        - name: KUBERNETES_CPU_LIMIT
          value: "2"
        - name: KUBERNETES_MEMORY_LIMIT
          value: "3.5Gi"
        - name: KUBERNETES_CPU_REQUEST
          value: "1.5"
        - name: KUBERNETES_MEMORY_REQUEST
          value: "3.5Gi"
        - name: KUBERNETES_SERVICE_ACCOUNT
        - name: KUBERNETES_SERVICE_CPU_LIMIT
        - name: KUBERNETES_SERVICE_MEMORY_LIMIT
        - name: KUBERNETES_SERVICE_CPU_REQUEST
        - name: KUBERNETES_SERVICE_MEMORY_REQUEST
        - name: KUBERNETES_HELPER_CPU_LIMIT
        - name: KUBERNETES_HELPER_MEMORY_LIMIT
        - name: KUBERNETES_HELPER_CPU_REQUEST
          value: "100m"
        - name: KUBERNETES_HELPER_MEMORY_REQUEST
          value: "128Mi"
        - name: KUBERNETES_HELPER_IMAGE
        - name: KUBERNETES_PULL_POLICY
          value: "if-not-present"
        - name: CACHE_TYPE
          value: s3
        - name: CACHE_PATH
          value: gitlab-caches
        - name: CACHE_SHARED
          value: "true"
        - name: CACHE_S3_SERVER_ADDRESS
          value: s3.amazonaws.com
        - name: CACHE_S3_BUCKET_NAME
          value: kubernetes-gitlab-cache
        - name: CACHE_S3_BUCKET_LOCATION
          value: eu-west-1
        - name: RUNNER_CACHE_DIR
          value: /opt/gitlab-runner/cache/
        - name: RUNNER_BUILDS_DIR
          value: /opt/gitlab-runner/builds/
        image: gitlab/gitlab-runner:alpine-v13.1.0
        imagePullPolicy: IfNotPresent
        name: configure
        volumeMounts:
        - mountPath: /secrets
          name: runner-secrets
        - mountPath: /config
          name: scripts
          readOnly: true
        - mountPath: /init-secrets
          name: init-runner-secrets
          readOnly: true
        resources:
          requests:
            cpu: "50m"
            memory: "200Mi"
      restartPolicy: Always
      schedulerName: default-scheduler
      securityContext:
        fsGroup: 65533
        runAsUser: 100
      serviceAccount: gitlab-runner-gitlab-runner
      serviceAccountName: gitlab-runner-gitlab-runner
      terminationGracePeriodSeconds: 300
      volumes:
      - emptyDir:
          medium: Memory
        name: runner-secrets
      - emptyDir:
          medium: Memory
        name: etc-gitlab-runner
      - name: init-runner-secrets
        projected:
          defaultMode: 420
          sources:
          - secret:
              items:
              - key: runner-registration-token
                path: runner-registration-token
              name: gitlab-runner-token
      - configMap:
          defaultMode: 420
          name: gitlab-runner-gitlab-runner
        name: scripts

There are a few values that you will want to reconfigure yourself, most notably the S3 cache bucket (CACHE_S3_BUCKET_NAME), which will have a different name in your account and potentially a different region (CACHE_S3_BUCKET_LOCATION).

There are various other environment variables that you should review as well. Settings such as KUBERNETES_PRIVILEGED are particularly important. I strongly recommend having a dedicated cluster to run all your GitLab Runners, since in most cases they will need to be privileged. Additionally, all Kubernetes clusters should be treated like cattle.
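To illustrate why privileged mode usually matters: Docker-in-Docker builds are a common workload for these runners. A minimal example job (the image versions and names below are just examples) that targets the test tag from the Deployment above could look like this:

# Example .gitlab-ci.yml job that needs a privileged runner (Docker-in-Docker).
# The "test" tag matches RUNNER_TAG_LIST in the Deployment above.
build-image:
  stage: build
  tags:
    - test
  image: docker:19.03
  services:
    - docker:19.03-dind
  variables:
    # Depending on your runner version you may need tcp://localhost:2375 instead.
    DOCKER_HOST: tcp://docker:2375
    DOCKER_TLS_CERTDIR: ""
  script:
    - docker build -t my-app:latest .   # hypothetical image name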

Lastly, I want to point out that if you are following a GitOps model of deployments, upgrades to the runners are very simple: you only need to update the image field with the new tag of the Docker image. Upgrades usually take a few seconds, depending on the rollingUpdate strategy.
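For example, a version bump is a one-line change in Git (the newer tag below is hypothetical at the time of writing):

# Before
image: gitlab/gitlab-runner:alpine-v13.1.0
# After - Flux notices the commit and performs the rolling update
image: gitlab/gitlab-runner:alpine-v13.2.0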

ConfigMap

apiVersion: v1
kind: ConfigMap
metadata:
  labels:
    app: gitlab-runner-gitlab-runner
  name: gitlab-runner-gitlab-runner
  namespace: gitlab
data:
  check-live: |
    #!/bin/bash
    if /usr/bin/pgrep -f .*register-the-runner; then
      exit 0
    elif /usr/bin/pgrep gitlab.*runner; then
      exit 0
    else
      exit 1
    fi
  config.toml: |
    concurrent = 1
    check_interval = 10
    log_level = "info"
  configure: |
    set -e
    cp /init-secrets/* /secrets
  entrypoint: |
    #!/bin/bash
    set -e
    mkdir -p /home/gitlab-runner/.gitlab-runner/
    cp /scripts/config.toml /home/gitlab-runner/.gitlab-runner/
    # Register the runner
    if [[ -f /secrets/accesskey && -f /secrets/secretkey ]]; then
      export CACHE_S3_ACCESS_KEY=$(cat /secrets/accesskey)
      export CACHE_S3_SECRET_KEY=$(cat /secrets/secretkey)
    fi
    if [[ -f /secrets/gcs-applicaton-credentials-file ]]; then
      export GOOGLE_APPLICATION_CREDENTIALS="/secrets/gcs-applicaton-credentials-file"
    else
      if [[ -f /secrets/gcs-access-id && -f /secrets/gcs-private-key ]]; then
        export CACHE_GCS_ACCESS_ID=$(cat /secrets/gcs-access-id)
        # echo -e used to make private key multiline (in google json auth key private key is oneline with \n)
        export CACHE_GCS_PRIVATE_KEY=$(echo -e $(cat /secrets/gcs-private-key))
      fi
    fi
    if [[ -f /secrets/runner-registration-token ]]; then
      export REGISTRATION_TOKEN=$(cat /secrets/runner-registration-token)
    fi
    if [[ -f /secrets/runner-token ]]; then
      export CI_SERVER_TOKEN=$(cat /secrets/runner-token)
    fi
    if ! sh /scripts/register-the-runner; then
      exit 1
    fi
    # Start the runner
    exec /entrypoint run --user=gitlab-runner \
      --working-directory=/home/gitlab-runner
  register-the-runner: |
    #!/bin/bash
    MAX_REGISTER_ATTEMPTS=30

    for i in $(seq 1 "${MAX_REGISTER_ATTEMPTS}"); do
      echo "Registration attempt ${i} of ${MAX_REGISTER_ATTEMPTS}"
      /entrypoint register \
        --non-interactive

      retval=$?

      if [ ${retval} = 0 ]; then
        break
      elif [ ${i} = ${MAX_REGISTER_ATTEMPTS} ]; then
        exit 1
      fi

      sleep 5
    done

    exit 0

In the ConfigMap template you don’t really need to customise anything. The only things you may want to look at are the namespace (if you decide to change it) and perhaps the concurrency. A single GitLab Runner can run more than one concurrent worker/executor; I set it to exactly one, which allows a single runner to manage exactly one job at a time.
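If you prefer each runner pod to pick up several jobs at once, a hedged sketch of the change (the value 4 is just an example) is to raise concurrent in the embedded config.toml:

  # Sketch: let each runner pod run up to 4 jobs concurrently.
  # Each CI job still gets its own job pod with the KUBERNETES_* requests/limits,
  # so higher concurrency multiplies the load a single runner can create.
  config.toml: |
    concurrent = 4
    check_interval = 10
    log_level = "info"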

External Secret- Gitlab token

apiVersion: kubernetes-client.io/v1
kind: ExternalSecret
metadata:
  name: gitlab-runner-token
  namespace: gitlab
spec:
  backendType: systemManager
  data:
  - key: /k8s-gitlab-runner/token/test
    name: runner-registration-token

To make use of this, you will need to have the external-secrets add-on installed. I highly recommend it, especially when you are running things in AWS. The key field is the important part: it points at the name of the AWS SSM parameter that stores the GitLab runner registration token. Such a token can be found in your GitLab group or project settings under CI/CD -> Runners. More about SSM parameters can be read in the AWS docs.
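For clarity, once the external-secrets controller reconciles this resource, it produces an ordinary Kubernetes Secret of the same name in the gitlab namespace, which the init-runner-secrets projected volume in the Deployment then mounts. A sketch of the result (the value is just a placeholder):

# What the controller produces (sketch; the data value is a base64-encoded placeholder).
apiVersion: v1
kind: Secret
metadata:
  name: gitlab-runner-token
  namespace: gitlab
type: Opaque
data:
  runner-registration-token: UExBQ0VIT0xERVI=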

Namespace and RBAC

apiVersion: v1
kind: Namespace
metadata:
  name: gitlab
  labels:
    app: gitlab-runner-gitlab-runner
---
apiVersion: rbac.authorization.k8s.io/v1
kind: Role
metadata:
  labels:
    app: gitlab-runner-gitlab-runner
  name: gitlab-runner-gitlab-runner
  namespace: gitlab
rules:
- apiGroups:
  - ""
  resources:
  - '*'
  verbs:
  - '*'
---
apiVersion: rbac.authorization.k8s.io/v1
kind: RoleBinding
metadata:
  labels:
    app: gitlab-runner-gitlab-runner
  name: gitlab-runner-gitlab-runner
  namespace: gitlab
roleRef:
  apiGroup: rbac.authorization.k8s.io
  kind: Role
  name: gitlab-runner-gitlab-runner
subjects:
- kind: ServiceAccount
  name: gitlab-runner-gitlab-runner
  namespace: gitlab
---
apiVersion: v1
kind: ServiceAccount
metadata:
  labels:
    app: gitlab-runner-gitlab-runner
  name: gitlab-runner-gitlab-runner
  namespace: gitlab
automountServiceAccountToken: true

Just basic configs for the namespace and role. The Role has all permissions assigned; however, it is bound to its own namespace, so it doesn’t cause any issues.

Conclusion

I have presented you with the latest configuration I am using for GitLab Runners. They use an up-to-date Docker image, version v13.1.0. I have been running this setup for at least 4 months; rolling updates are extremely easy and I have had zero issues.

Something I have not mentioned is that my EKS cluster fully utilises spot instances, which makes this setup very cheap. As this is all GitLab related, you can always opt to use the cloud-managed GitLab Runners provided by GitLab. For security reasons and easy integration with AWS IAM, it was a must for me to run my own runners in AWS.
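The actual cluster definition lives in the EKS repository mentioned earlier; purely as an illustration of the spot-instance idea (not my real configuration), an eksctl-style node group running entirely on spot capacity could look like this:

# Illustration only: an eksctl node group backed 100% by spot instances.
apiVersion: eksctl.io/v1alpha5
kind: ClusterConfig
metadata:
  name: gitlab-runners          # hypothetical cluster name
  region: eu-west-1
nodeGroups:
- name: runners-spot
  minSize: 2
  maxSize: 20
  instancesDistribution:
    instanceTypes: ["m5.xlarge", "m5a.xlarge", "m5d.xlarge"]
    onDemandBaseCapacity: 0
    onDemandPercentageAboveBaseCapacity: 0   # no on-demand above base, i.e. all spot
    spotInstancePools: 3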

One important feature that GitLab Runners, or in fact GitLab itself, doesn’t provide is managed macOS runners. Hopefully this feature is in the works and at some point we will get a fully equipped platform with everything we need :).

Thanks for reading, hope this story helped. Stay tuned.

If you have any questions please leave comments or private notes. Otherwise you can find me on https://www.linkedin.com/in/marcincuber/
