Sharing files between pods seems like an easy task at first sight, but it is quite the opposite. That is why I decided to write this article and share how we tackled this issue at Abyssale.

Storage solutions on Amazon:

On AWS you can find three types of storage that you can use for your production pods:

  • S3: Simple Storage Service
  • EBS: Elastic Block Store
  • EFS: Elastic File System

I will not cover Glacier and Snowball, which are respectively a cold-storage service and a data transfer/storage appliance.

Simple Storage Service (S3):

S3 is an object storage service. It can be used as shared storage via s3fs, which lets you mount a bucket as a file system through FUSE (Filesystem in Userspace). I have not tried it, but performance when using S3 as a classic file system seems quite low, and the read/write limitations are significant. On the other hand, it is the cheapest solution.

Elastic Block Store (EBS):

EBS is persistent block storage with a defined size (which can however be changed later). It is fully configurable and is probably the most performant of Amazon's options. The major drawback is that it can only be mounted on a single EC2 instance at a time, which also means it cannot be shared between pods running on different nodes.

Elastic File System (EFS):

EFS is shared file storage with dynamic elasticity and scalable performance. It is essentially a managed network file system (NFS).

Which means:

  • Throughput over 10 GB/s and up to 500,000 IOPS.
  • The file system scales up and down as you add or remove files. According to the documentation there is no capacity limit, only a maximum size of 47.9 TiB for a single file. Most users should be fine with these limits.
  • Up to 400 mount targets per Virtual Private Cloud (VPC).

Our needs:

Two pods must access the same storage: one pod is constantly writing new files while the other one reads them. At Abyssale we need the highest performance possible, because our product's output is served directly to our users.

At first glance the EFS solution seemed to fit our needs, but the implementation turned out to be daunting. This is the reason why I wrote the following tutorial.

Set up the stack:

1. Requirements:

To follow this tutorial, a few prerequisites must already be in place:

  • An EKS cluster already up and running.
  • AWS EFS available in your region.
  • EKS and EFS in the same region.

Check the region table for additional information.

The second point is very important to check: we had to migrate our production infrastructure from Paris to Ireland because EFS was not available there at the time.

2. Create the Elastic File System:

First create a "Security Group":

  • The VPC has to be the same as the one used by your EKS cluster. You can find this information on the EKS cluster page.
  • Fill in the Inbound and Outbound rules: allow inbound NFS traffic (TCP port 2049) from the security group of your EKS worker nodes, and keep the default outbound rule that allows all traffic (see the sketch below).
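
If you prefer infrastructure as code, here is a minimal CloudFormation sketch of an equivalent security group. The VPC ID and the worker-node security group ID are placeholders you must replace with your own values:

```yaml
Resources:
  EfsSecurityGroup:
    Type: AWS::EC2::SecurityGroup
    Properties:
      GroupDescription: Allow NFS traffic from EKS worker nodes to EFS
      VpcId: vpc-xxxxxxxx                     # placeholder: same VPC as your EKS cluster
      SecurityGroupIngress:
        - IpProtocol: tcp
          FromPort: 2049                      # NFS, the port used by EFS mount targets
          ToPort: 2049
          SourceSecurityGroupId: sg-xxxxxxxx  # placeholder: your EKS worker nodes' security group
      SecurityGroupEgress:
        - IpProtocol: "-1"                    # allow all outbound traffic
          CidrIp: 0.0.0.0/0
```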

Then go to the EFS creation page:

  • Again, choose the same VPC.
  • Add the Security Group we previously created.

Then click Next Step, and finally Create File System.

3. Deploy the efs-provisioner:

The efs-provisioner allows you to mount EFS storage as PersistentVolumes in Kubernetes. (Don't forget to replace the namespace with your own.)

Deploy the efs-provisioner:

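Here is a minimal sketch of that deployment, reconstructed from the upstream efs-provisioner example manifest (kubernetes-incubator/external-storage). The MY_* values are the placeholders described below, and my-namespace is a stand-in for your own namespace:

```yaml
apiVersion: v1
kind: ConfigMap
metadata:
  name: efs-provisioner
  namespace: my-namespace            # replace with your own namespace
data:
  file.system.id: MY_FILE_SYSTEM_ID
  aws.region: MY_AWS_REGION
  provisioner.name: MY_PROVISIONER_NAME
---
apiVersion: apps/v1
kind: Deployment
metadata:
  name: efs-provisioner
  namespace: my-namespace            # replace with your own namespace
spec:
  replicas: 1
  selector:
    matchLabels:
      app: efs-provisioner
  strategy:
    type: Recreate
  template:
    metadata:
      labels:
        app: efs-provisioner
    spec:
      serviceAccountName: efs-provisioner   # the ServiceAccount created in the next step
      containers:
        - name: efs-provisioner
          image: quay.io/external_storage/efs-provisioner:latest
          env:
            - name: FILE_SYSTEM_ID
              valueFrom:
                configMapKeyRef:
                  name: efs-provisioner
                  key: file.system.id
            - name: AWS_REGION
              valueFrom:
                configMapKeyRef:
                  name: efs-provisioner
                  key: aws.region
            - name: PROVISIONER_NAME
              valueFrom:
                configMapKeyRef:
                  name: efs-provisioner
                  key: provisioner.name
          volumeMounts:
            - name: pv-volume
              mountPath: /persistentvolumes
      volumes:
        - name: pv-volume
          nfs:
            server: MY_EFS_DNS_NAME        # e.g. fs-xxx.efs.eu-west-1.amazonaws.com
            path: /
```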

Replace these values with your own:

  • MY_FILE_SYSTEM_ID: you can find it on the EFS creation page. e.g.: fs-xxx
  • MY_AWS_REGION: the region where you created your EFS. e.g.: eu-west-1
  • MY_PROVISIONER_NAME: choose a name; it will be reused later.
  • MY_EFS_DNS_NAME: you can find it on the EFS creation page. e.g.: fs-xxx.efs.eu-west-1.amazonaws.com

A ServiceAccount provides an identity for the processes that run in a pod. Here we give the pod we previously created an identity named efs-provisioner; it will be used later by RBAC to bind roles to the pod.

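A minimal sketch of that ServiceAccount (my-namespace is a placeholder for your own namespace):

```yaml
apiVersion: v1
kind: ServiceAccount
metadata:
  name: efs-provisioner
  namespace: my-namespace    # replace with your own namespace
```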


Role-based access control (RBAC) is enabled by default on EKS, so we must explicitly authorize the efs-provisioner to access the resources it needs.

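Below is a sketch of those RBAC resources, based on the upstream efs-provisioner example; my-namespace is again a placeholder. Note the endpoints rule, which the provisioner uses for leader election and which older documentation often omitted:

```yaml
kind: ClusterRole
apiVersion: rbac.authorization.k8s.io/v1
metadata:
  name: efs-provisioner-runner
rules:
  - apiGroups: [""]
    resources: ["persistentvolumes"]
    verbs: ["get", "list", "watch", "create", "delete"]
  - apiGroups: [""]
    resources: ["persistentvolumeclaims"]
    verbs: ["get", "list", "watch", "update"]
  - apiGroups: ["storage.k8s.io"]
    resources: ["storageclasses"]
    verbs: ["get", "list", "watch"]
  - apiGroups: [""]
    resources: ["events"]
    verbs: ["create", "update", "patch"]
  - apiGroups: [""]
    resources: ["endpoints"]            # used for leader election
    verbs: ["get", "list", "watch", "create", "update", "patch"]
---
kind: ClusterRoleBinding
apiVersion: rbac.authorization.k8s.io/v1
metadata:
  name: run-efs-provisioner
subjects:
  - kind: ServiceAccount
    name: efs-provisioner
    namespace: my-namespace             # replace with your own namespace
roleRef:
  kind: ClusterRole
  name: efs-provisioner-runner
  apiGroup: rbac.authorization.k8s.io
```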


A StorageClass is used to link the efs-provisioner with the PersistentVolumeClaim; a PersistentVolumeClaim is a request for storage.

We bind the PersistentVolumeClaim to the StorageClass (aws-efs). Since EFS has no storage limitation, the requested size doesn't matter: any small value will do.

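A minimal sketch of both resources; the claim name (efs) will be reused in the deployments of the next step:

```yaml
kind: StorageClass
apiVersion: storage.k8s.io/v1
metadata:
  name: aws-efs
provisioner: MY_PROVISIONER_NAME       # must match the name given to the efs-provisioner
---
kind: PersistentVolumeClaim
apiVersion: v1
metadata:
  name: efs
  namespace: my-namespace              # replace with your own namespace
spec:
  storageClassName: aws-efs
  accessModes:
    - ReadWriteMany                    # lets several pods mount the volume simultaneously
  resources:
    requests:
      storage: 1Mi                     # the value doesn't matter with EFS
```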

Change the following value:

  • MY_PROVISIONER_NAME: the name you gave in the efs-provisioner configuration.

4. Use the persistent volume in your pods:

Now you can use the persistent volume in several pods. We reference the claimName we previously created in the PersistentVolumeClaim (efs). You can reuse the same PersistentVolumeClaim in several Deployments, as shown in the example below.

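Here is a minimal example; the deployment name, image (nginx) and mount path are hypothetical placeholders to adapt to your own application:

```yaml
apiVersion: apps/v1
kind: Deployment
metadata:
  name: writer-app                     # hypothetical name
  namespace: my-namespace              # replace with your own namespace
spec:
  replicas: 1
  selector:
    matchLabels:
      app: writer-app
  template:
    metadata:
      labels:
        app: writer-app
    spec:
      containers:
        - name: writer-app
          image: nginx                 # hypothetical image
          volumeMounts:
            - name: efs-storage
              mountPath: /data         # hypothetical mount path
      volumes:
        - name: efs-storage
          persistentVolumeClaim:
            claimName: efs             # the PersistentVolumeClaim created above
```

Any other Deployment that mounts the same claim (claimName: efs) will see the same files; this is how our writer pod and our reader pod share the same storage.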

Conclusion and feedback:

I struggled to deploy the whole solution because of the following points:

  • At first I misconfigured the Security Group.
  • The RBAC documentation for the efs-provisioner was incomplete; I had to dig through the project's GitHub issues to find the answer.

We are really happy with this solution: the performance meets our expectations. The elasticity of the storage is another good aspect. Last but not least, you can also configure backups.

If you have any questions or need some help setting up your EFS solution, feel free to reach out to me on Twitter @LemRemy.

Also if you enjoyed this story, please share it to help others find it!