The following is part of a series of posts called "Building a complete Kubernetes backed infrastructure".
This series of posts describes my approach to standing up a scalable infrastructure for hosting both internal and external software in a company setting.
This series focuses on AWS specifically as I have found EKS to be the most complicated Kubernetes provider to set up, but the principles and workflow should be easily applied to any other provider or on-prem situation.
If you’ve used Kubernetes for at least a little while, you’ll be aware of the issues caused by `PersistentVolume`s that only support `ReadWriteOnce` claims and not `ReadWriteMany`. With a `ReadWriteOnce` claim, only one pod can be assigned a given volume claim and have it mounted at any given time.
This might not sound like an issue if you only plan to run a single replica of a service and never scale horizontally, which is often the case for personal projects, but even then there are a number of hidden problems.
One of the biggest problems with `ReadWriteOnce` volume claims appears when you are using a `Deployment` and want to do rollouts. A rollout will typically create the new `Pod`, check that it has started correctly, then switch traffic over to the new `Pod` before deleting the old one. The issue is that the new `Pod` will not be able to start, as it cannot mount the `PersistentVolumeClaim`, which is still bound to the existing `Pod`. In short, `ReadWriteOnce` stops you from doing `Deployment`-based rollouts and introduces service downtime, as the existing `Pod` must be removed before the new one can start.
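Until a `ReadWriteMany`-capable volume is in place, the usual workaround is to set the `Deployment` strategy to `Recreate`, which deletes the old `Pod` before starting the new one, trading the failed rollout for a short window of downtime. A minimal sketch (the names and image below are placeholders, not from this series):

```yaml
apiVersion: apps/v1
kind: Deployment
metadata:
  name: my-service            # placeholder name
spec:
  replicas: 1
  strategy:
    type: Recreate            # old Pod is removed before the new one starts,
                              # freeing the ReadWriteOnce volume for re-mounting
  selector:
    matchLabels:
      app: my-service
  template:
    metadata:
      labels:
        app: my-service
    spec:
      containers:
        - name: my-service
          image: my-service:latest   # placeholder image
```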
To get around this, instead of using the default `gp2` storage class which is made available when creating the cluster, you can create a new storage class backed by AWS EFS. EFS is basically a managed NFS offering with a different name.
Using EFS/NFS allows multiple pods to access a given `PersistentVolumeClaim` at the same time, avoiding the issue outlined previously.
To set this up, I used the kubernetes-sigs/aws-efs-csi-driver repo. I’ve found that there is a lot of conflicting information regarding getting EFS working in EKS, but this was the best source to follow. The driver is installed via the following:
```shell
kubectl apply -k "github.com/kubernetes-sigs/aws-efs-csi-driver/deploy/kubernetes/overlays/stable/?ref=release-1.0"
```
You’ll also need to create a new EFS file system. It is pretty easy to create one in the AWS console, and you can just accept the defaults when creating it. You also need to create a new security group to allow the EC2 instances within your Kubernetes cluster to access EFS. To do this, go to the EC2 dashboard in the AWS console and navigate to the Security Groups section. From here, create a new security group. Add a name and description, and ensure that you select the VPC of your cluster. Add an inbound rule with the type NFS and set the source. In my case I added `172.31.0.0/16`, allowing anything from the shared VPC (you can check this within your VPC; it is just what is listed as the IPv4 CIDR).
Adding this security group to the EFS file system you just created is done via the Network tab in the EFS view. ‘Manage’ the mount targets which were automatically created and change the security group to the one you just created. Without this, the EFS mounts will fail with some very unhelpful error messages, so if things don’t work, this is the first thing to check.
Once complete, take note of the file system ID for the new drive, which is of the form `fs-XXXXXXX`. This is required in the Kubernetes configuration.
Next, it is a case of creating a new storage class for EFS, which can be done by applying a configuration such as the following:
```yaml
kind: StorageClass
apiVersion: storage.k8s.io/v1
metadata:
  name: aws-efs
provisioner: efs.csi.aws.com
```
After the storage class is available, the next step is to create a `PersistentVolume` using the file system ID of the EFS volume that was just created, like the below:
```yaml
apiVersion: v1
kind: PersistentVolume
metadata:
  name: VOLUME_NAME
spec:
  capacity:
    storage: 1Gi
  volumeMode: Filesystem
  accessModes:
    - ReadWriteMany
  persistentVolumeReclaimPolicy: Retain
  storageClassName: aws-efs
  claimRef:
    name: VOLUME_PVC
  csi:
    driver: efs.csi.aws.com
    volumeHandle: FILE_SYSTEM_ID
```
After applying the above you should now have a provisioned `PersistentVolume`. To claim that volume, you need to create a `PersistentVolumeClaim` like the below.
```yaml
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: VOLUME_PVC
spec:
  accessModes:
    - ReadWriteMany
  storageClassName: aws-efs
  resources:
    requests:
      storage: 1Gi
```
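To see the shared mount in action, a `Deployment` can reference the claim in its pod template; because the claim is `ReadWriteMany`, all replicas can mount it simultaneously and rolling updates work again. A sketch (the deployment name, labels, and image are placeholders):

```yaml
apiVersion: apps/v1
kind: Deployment
metadata:
  name: efs-demo              # placeholder name
spec:
  replicas: 2                 # both Pods mount the same EFS-backed claim
  selector:
    matchLabels:
      app: efs-demo
  template:
    metadata:
      labels:
        app: efs-demo
    spec:
      containers:
        - name: app
          image: nginx        # placeholder image
          volumeMounts:
            - name: shared-data
              mountPath: /data
      volumes:
        - name: shared-data
          persistentVolumeClaim:
            claimName: VOLUME_PVC   # the claim created above
```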
Any number of pods can now use this `PersistentVolumeClaim` and mount it as they want. Apart from the benefit explained above of being able to do `Deployment` rollouts, there is another great benefit: different services can share a common file system. This is obviously an anti-pattern if not done correctly, as services should be built around a well-designed and defined API, but if done with limited scope and proper guardrails it can be beneficial.
One such example is a Hadoop/MapReduce-style pattern, where different services take directories as input and write output into other directories for other services to pick up. On a previous project this worked fantastically. We had a good number of integrations with external parties that would extract data and store it within a directory in EFS; a worker service would pick up these files and transform them into a number of outputs, which in turn were processed by dedicated loading services. This workflow made things much more scalable than building a full ETL system for each integration.
The EFS driver also has the ability to mount a specific EFS directory. This can be leveraged so that you only need one EFS instance, with each service having its own directory on it. This is done via the `volumeHandle` property and will look something like `fs-xxxxxxxx:/service-name`. But beware: if that directory doesn’t exist on the EFS when you try to mount it, the mount will fail. You will need to create these directories on the EFS volume manually prior to using them.
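Putting that together, a per-service `PersistentVolume` pointing at a subdirectory looks roughly like the following (`service-name` is a placeholder, and the directory must already exist on the file system before the mount will succeed):

```yaml
apiVersion: v1
kind: PersistentVolume
metadata:
  name: service-name-pv       # placeholder name
spec:
  capacity:
    storage: 1Gi
  volumeMode: Filesystem
  accessModes:
    - ReadWriteMany
  persistentVolumeReclaimPolicy: Retain
  storageClassName: aws-efs
  csi:
    driver: efs.csi.aws.com
    volumeHandle: FILE_SYSTEM_ID:/service-name   # file system ID plus the pre-created directory
```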