The Ultimate Guide to Kubernetes Volumes: Types, Use Cases, and Best Practices

In Kubernetes, containers often have short lifespans and can be created or terminated frequently. However, when a container is deleted, any data stored within it is also lost. This poses challenges when persistent storage is required for the application.

To solve this issue, Kubernetes provides Volumes - a mechanism to enable data persistence beyond the lifecycle of individual containers.

What Are Kubernetes Volumes?

In Kubernetes, Volumes are shared storage areas inside a Pod that multiple containers can use. These volumes are set up at the Pod level and attached to specific locations in each container, making it easy for containers to share data.

Volumes help keep data safe when a container stops or restarts. Unlike a container's own writable layer, the data in a volume is not deleted when a container is removed. How long the data lives beyond that depends on the volume type: some volumes last only as long as the Pod, while others outlive it.

Types of Volumes

Kubernetes supports various types of Volumes, including:

EmptyDir: a temporary Volume that is created when a Pod is assigned to a Node and is deleted when the Pod is removed.
HostPath: a Volume that mounts a file or directory from the host Node’s filesystem into the Pod.
PersistentVolume (PV): a piece of storage in the cluster that has been provisioned by an administrator or dynamically provisioned using StorageClasses. It is a resource in the cluster, just like a Node, and exists independently of any Pod.
PersistentVolumeClaim (PVC): a request for storage by a user. It specifies details such as the size and access mode needed. Once a PVC is created, Kubernetes binds it to a suitable PersistentVolume to meet the storage requirements.
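
As a rough illustration of the dynamic provisioning mentioned above, the sketch below pairs a StorageClass with a PVC that refers to it. It assumes the AWS EBS CSI driver is installed in the cluster. The names ebs-gp3-example and dynamic-claim are made up for this example.

```yaml
# Sketch only: assumes the AWS EBS CSI driver is installed in the cluster.
apiVersion: storage.k8s.io/v1
kind: StorageClass
metadata:
  name: ebs-gp3-example            # hypothetical name
provisioner: ebs.csi.aws.com       # provisioner name of the AWS EBS CSI driver
parameters:
  type: gp3                        # EBS volume type to create
reclaimPolicy: Delete              # delete the EBS volume when the PVC is deleted
volumeBindingMode: WaitForFirstConsumer   # create the volume once a Pod needs it
---
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: dynamic-claim              # hypothetical name
spec:
  storageClassName: ebs-gp3-example
  accessModes:
    - ReadWriteOnce
  resources:
    requests:
      storage: 5Gi
```

With a setup like this, nobody has to create the PV by hand: a matching PV is created for the claim when a Pod first uses it.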

Introduction to EmptyDir

EmptyDir is a basic type of Volume in Kubernetes that provides an empty directory for temporary storage.

It is created when a Pod is scheduled to a node and can be accessed by all containers in the Pod.

EmptyDir volumes are useful for scenarios where a container needs to write temporary data during the lifetime of a Pod. The data survives container restarts, but it is deleted when the Pod is removed from the node.

EmptyDir volumes can also be used for sharing data between containers in a Pod. This is useful when multiple containers need to access a shared piece of data or communicate with each other. Each container mounts the volume at its own mount path (the paths can be the same or different), and any changes made by one container are visible to all other containers that share the volume.
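
For scratch data, an emptyDir volume can optionally be kept in memory and capped in size. The Pod below is a minimal sketch of that idea. The names scratch-demo, worker and scratch, the /scratch path and the 64Mi limit are all assumptions made for illustration.

```yaml
# Sketch only: all names and the size limit are illustrative.
apiVersion: v1
kind: Pod
metadata:
  name: scratch-demo
spec:
  containers:
  - name: worker
    image: ubuntu
    command: ["/bin/bash", "-c", "sleep infinity"]
    volumeMounts:
    - mountPath: /scratch          # where the container sees the volume
      name: scratch
  volumes:
  - name: scratch
    emptyDir:
      medium: Memory               # back the volume with tmpfs (RAM) instead of node disk
      sizeLimit: 64Mi              # cap how much data the volume may hold
```

Files written to a memory-backed emptyDir count against the container's memory usage, so a small limit is usually a sensible default.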

How to Use EmptyDir in Kubernetes?

In order to use an EmptyDir volume, you need to define it in the Pod specification. Here is an example YAML definition of a Deployment whose Pod template contains an EmptyDir volume:

apiVersion: apps/v1
kind: Deployment
metadata:
  name: flm
spec:
  replicas: 2
  selector:
    matchLabels:
      app: swiggy
  template:
    metadata:
      labels:
        app: swiggy
    spec:
      containers:
      - name: cont-1
        image: ubuntu
        command: ["/bin/bash", "-c", "while true; do echo Welcome to DevOps class; sleep 10; done"]
        volumeMounts:
        - mountPath: "/tmp/jenkins"
          name: devops
      - name: cont-2
        image: ubuntu
        command: ["/bin/bash", "-c", "while true; do echo Welcome to DevOps class; sleep 10; done"]
        volumeMounts:
        - mountPath: "/tmp/docker"
          name: devops
      volumes:
      - name: devops
        emptyDir: {}

The above file creates two Pods, each with two containers. Both containers mount the same volume, so a file created in one container is available in the other as well: cont-1 sees the volume at /tmp/jenkins and cont-2 sees it at /tmp/docker.

kubectl exec -it pod-name -c cont-1 -- /bin/bash

Use the above command to open a shell in cont-1 and create a file under /tmp/jenkins. The same file will be available in cont-2 under /tmp/docker.

Introduction to HostPath:

This volume type works differently from the previous volume type, EmptyDir.
An EmptyDir volume is managed by Kubernetes: its data sits in a Pod-specific directory on the node and is removed together with the Pod.
A hostPath volume instead mounts a file or directory that you choose on the host machine, so the data can be accessed from the host as well as from the containers.
hostPath does not copy the data: the Pod and the host see the same files, so changes made from the host machine are reflected in the Pod's volume (if attached), and vice versa.

How to Use HostPath in Kubernetes?

apiVersion: apps/v1
kind: Deployment
metadata:
  name: flm
spec:
  replicas: 2
  selector:
    matchLabels:
      app: swiggy
  template:
    metadata:
      labels:
        app: swiggy
    spec:
      containers:
      - name: cont-1
        image: ubuntu
        command: ["/bin/bash", "-c", "while true; do echo Welcome to DevOps class; sleep 10; done"]
        volumeMounts:
        - mountPath: "/tmp/jenkins"
          name: devops
      - name: cont-2
        image: ubuntu
        command: ["/bin/bash", "-c", "while true; do echo Welcome to DevOps class; sleep 10; done"]
        volumeMounts:
        - mountPath: "/tmp/docker"
          name: devops
      volumes:
      - name: devops
        hostPath:
          path: /tmp/mydata

The above file creates two Pods. Create a file in a container by using the command

kubectl exec -it pod-name -c cont-1 -- /bin/bash

The file gets created. Now even if you delete the Pod, the Deployment creates a replacement Pod, and the data will be present in the new Pod as well.

This is because we are using HostPath, which keeps the data in a directory on our host machine (/tmp/mydata here).

Note that this only holds when the new Pod is scheduled on the same node, since each node has its own /tmp/mydata directory.

The problem is that if we delete the node, we lose the data.
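
One way to make the behaviour described above more predictable is to pin the Pod to a node and declare a hostPath type. The Pod below is a minimal sketch of that. The Pod name hostpath-demo and the node name worker-1 are made up for illustration, so replace worker-1 with a real node name from your cluster.

```yaml
# Sketch only: hostpath-demo and worker-1 are hypothetical names.
apiVersion: v1
kind: Pod
metadata:
  name: hostpath-demo
spec:
  nodeName: worker-1               # run on this node so the same host directory is always used
  containers:
  - name: cont-1
    image: ubuntu
    command: ["/bin/bash", "-c", "sleep infinity"]
    volumeMounts:
    - mountPath: /tmp/jenkins
      name: devops
  volumes:
  - name: devops
    hostPath:
      path: /tmp/mydata
      type: DirectoryOrCreate      # create /tmp/mydata on the host if it does not exist
```

Pinning a Pod to a node gives up scheduling flexibility, which is one reason hostPath is usually kept for single-node or learning setups.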

Persistent Volume:

Persistent means the data outlives the Pods that use it.
Persistent Volume addresses the limitations of the EmptyDir and hostPath volume types.
Persistent Volume usually does not store the data on the local server. It stores the data in the cloud or on network storage, where the data stays available even if a node is lost.
With EmptyDir, the data is deleted when the Pod is deleted, and with hostPath it stays behind on a single node. With the help of Persistent Volume, the data survives the deletion of Pods and can be used by other Pods, including Pods on other worker nodes if the storage backend allows it.
PVs are independent of the Pod lifecycle, which means they can exist even if no Pod is using them.
With the help of Persistent Volume, the data will be stored in a central location such as EBS, Azure Disks, etc.
A Persistent Volume is a cluster-wide resource: it does not belong to any namespace or node. Which nodes can actually mount it depends on the storage backend.
In K8S, a PV is a piece of storage in the cluster that has been provisioned by an administrator or dynamically through a StorageClass.
If you want to use a Persistent Volume, you have to claim it with a PersistentVolumeClaim (PVC) defined in a manifest YAML file.
When a PVC is created, K8S searches for a suitable PV to satisfy the request.
If a PV is found that matches the request, the PV is bound to the PVC and the Pod can use the storage.
If no suitable PV is found, the PVC remains unbound (Pending).

Persistent Volume Claim:

To get a Persistent Volume, you have to claim it with the help of a PVC.
When you create a PVC, Kubernetes finds a suitable PV and binds the two together.
After the PVC is bound, a Pod can mount it as a volume.
Once the user has finished the work and deletes the PVC, the PV is released, and what happens to it next depends on its reclaim policy (Retain or Delete; the older Recycle policy is deprecated).
If a Pod terminates due to some issue, the PVC stays bound to the PV, so the new Pod that is created shortly afterwards mounts the same PV.
The PVC specifies the amount of storage and the access mode it needs, and the cluster either binds an existing PV that matches the request or provisions one dynamically. If nothing matches, the PVC stays in the Pending state.

FACTS ABOUT EBS:

Now, as you know, the Persistent Volume will be in the cloud. We are using AWS for our K8S learning, so there are some facts and conditions to know about EBS. Let's discuss them:

EBS volumes keep the data independently of the Pods, which an EmptyDir volume does not. If the Pods get deleted, the data will still exist in the EBS volume.
The nodes on which the Pods run must be on AWS (EC2 instances).
Both the EBS volume and the EC2 instance must be in the same region and availability zone.
A standard EBS volume can be attached to only one EC2 instance at a time.

Create an EBS volume by clicking on ‘Create volume’.

Choose a size for the EBS volume, select the Availability Zone where your EC2 instance is running, and click on Create volume.

Now, copy the volume ID and paste it into the PV YAML file.

Persistent Volume Manifest file:

apiVersion: v1
kind: PersistentVolume 
metadata:
  name: myebsvol
spec:
  capacity:
    storage: 5Gi 
  accessModes:
    - ReadWriteOnce
  persistentVolumeReclaimPolicy: Retain  # Recycle is only supported for NFS and hostPath volumes
  awsElasticBlockStore:
    volumeID: vol-0a0232b56c59cc682
    fsType: ext4

Persistent Volume Claim Manifest file:

apiVersion: v1
kind: PersistentVolumeClaim 
metadata:
  name: pvc-1
spec:
  accessModes:
    - ReadWriteOnce
  resources:
    requests:
      storage: 4Gi

Deployment Manifest file:

apiVersion: apps/v1
kind: Deployment
metadata:
  name: flm
spec:
  replicas: 2
  selector:
    matchLabels:
      app: swiggy
  template:
    metadata:
      labels:
        app: swiggy
    spec:
      containers:
      - name: cont-1
        image: ubuntu
        command: ["/bin/bash", "-c", "while true; do echo Welcome to DevOps class; sleep 10; done"]
        volumeMounts:
        - mountPath: "/tmp/jenkins"
          name: devops
      - name: cont-2
        image: ubuntu
        command: ["/bin/bash", "-c", "while true; do echo Welcome to DevOps class; sleep 10; done"]
        volumeMounts:
        - mountPath: "/tmp/docker"
          name: devops
      volumes:
      - name: devops
        persistentVolumeClaim:
          claimName: pvc-1

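Now, after applying the manifests, let's check the PV and PVC using the below commands. The PVC should show the status Bound. If the cluster has a default StorageClass, the claim may get a dynamically provisioned volume instead of myebsvol; setting storageClassName: "" on the PVC avoids that.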

To check PV:

kubectl get pv

To check PVC:

kubectl get pvc

We deleted the Pod, and because of the replicas setting a new Pod was created quickly. We then logged in to the newly created Pod and checked for the file that we created in the previous step. As you can see, the file is present, which is expected. Keep in mind that the volume is ReadWriteOnce, so both replicas have to run on the same node to mount it.

Access Modes

Access modes determine how a Persistent Volume (PV) can be mounted: by how many nodes at once, and whether for reading or writing. A PVC requests an access mode and can only bind to a PV that supports it. The access modes are:

ReadWriteOnce: The volume can be mounted as read-write by a single node. Multiple Pods can still use it if they run on that same node. This is the most common access mode, and it's appropriate for block storage such as EBS.
ReadOnlyMany: The volume can be mounted as read-only by many nodes. This access mode is useful for cases where many Pods need to read the same data, such as when serving a read-only database.
ReadWriteMany: The volume can be mounted as read-write by many nodes simultaneously. This mode is appropriate for use cases where many Pods need to read and write to the same data, such as a shared file system like NFS.
ReadWriteOncePod: The volume can be mounted as read-write by a single Pod in the whole cluster. This mode is useful when exactly one Pod must have access to the storage. It is supported only for CSI volumes.
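
To show where an access mode is requested, here is a minimal sketch of a PVC asking for ReadWriteMany. The claim name shared-claim and the StorageClass name shared-files are made up for this example. The claim can only be satisfied if the cluster has a storage backend that supports shared access, such as NFS.

```yaml
# Sketch only: shared-claim and shared-files are hypothetical names.
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: shared-claim
spec:
  storageClassName: shared-files   # must point at a backend that supports ReadWriteMany
  accessModes:
    - ReadWriteMany                # many nodes may mount the volume read-write
  resources:
    requests:
      storage: 1Gi
```

An EBS-backed PV would not satisfy this claim, because EBS volumes are normally limited to ReadWriteOnce.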

Conclusion

Kubernetes volumes provide a flexible and scalable way to manage storage for containerized applications. By understanding the various types of volumes, their specific use cases, and best practices, you can design efficient, resilient, and secure storage solutions for your workloads. With the examples and strategies discussed in this guide, you’re well-equipped to make informed decisions about storage in your Kubernetes environment.
