K8s Storage

Container File Systems

The container file system is ephemeral. Files on the container’s file system exist only as long as the container exists. If a container is deleted or re-created in K8s, data stored on the container file system is lost.

Volumes

Many applications need a more persistent method of data storage.
Volumes allow you to store data outside the container file system while allowing the container to access the data at runtime.
This can allow data to persist beyond the life of the container.

Persistent Volumes

[Diagram: Container, Pod, External Storage Volumes, Persistent Volumes]
Volumes offer a simple way to provide external storage to containers within the Pod/container spec. Persistent Volumes are a slightly more advanced form of Volume. They allow you to treat storage as an abstract resource and consume it using your Pods.
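
As a rough sketch of what "treating storage as an abstract resource" can look like, the manifests below define a Persistent Volume, a PersistentVolumeClaim that requests storage from it, and a Pod that mounts the claim. All names (example-pv, example-pvc, pv-pod), the storage class label manual, the sizes, and the node path /mnt/example-data are assumptions made for illustration and do not come from this document.

```yaml
# Illustrative only: names, sizes, and paths are hypothetical.
apiVersion: v1
kind: PersistentVolume
metadata:
  name: example-pv
spec:
  storageClassName: manual      # assumed label, used to match the claim below
  capacity:
    storage: 1Gi
  accessModes:
  - ReadWriteOnce
  hostPath:
    path: /mnt/example-data     # assumed directory on the node
---
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: example-pvc
spec:
  storageClassName: manual      # must match the Persistent Volume above
  accessModes:
  - ReadWriteOnce
  resources:
    requests:
      storage: 500Mi            # any amount up to the volume's capacity
---
apiVersion: v1
kind: Pod
metadata:
  name: pv-pod
spec:
  containers:
  - name: busybox
    image: busybox
    command: ['sh', '-c', 'echo Success! > /output/success.txt; sleep 3600']
    volumeMounts:
    - name: pv-storage
      mountPath: /output        # location inside the container
  volumes:
  - name: pv-storage
    persistentVolumeClaim:
      claimName: example-pvc    # the Pod refers to the claim, not to the volume
```

The Pod names only the claim, so the backing storage could be swapped for NFS or cloud storage without changing the Pod spec.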

Volume Types

Volumes and Persistent Volumes each have a volume type. The volume type determines how the storage is actually handled. Various volume types support storage methods such as:
  • NFS
  • Cloud storage mechanisms (AWS, Azure, GCP)
  • ConfigMaps and Secrets
  • A simple directory on the K8s node
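
To illustrate how the volume type is chosen, here is a minimal sketch of a Pod with two volumes of different types. The key under each entry in volumes (configMap, nfs) selects the type. The Pod name, the ConfigMap name app-config, and the NFS server and export path are hypothetical placeholders, and the ConfigMap is assumed to exist already.

```yaml
# Illustrative only: names and addresses are hypothetical.
apiVersion: v1
kind: Pod
metadata:
  name: volume-types-pod
spec:
  containers:
  - name: busybox
    image: busybox
    command: ['sh', '-c', 'ls /config /shared; sleep 3600']
    volumeMounts:
    - name: config-volume
      mountPath: /config
    - name: nfs-volume
      mountPath: /shared
  volumes:
  - name: config-volume
    configMap:                  # volume type: ConfigMap
      name: app-config          # assumed: a ConfigMap in the same namespace
  - name: nfs-volume
    nfs:                        # volume type: NFS
      server: nfs.example.com   # assumed NFS server address
      path: /exports/shared     # assumed exported directory
```

The container side looks the same for both: each volumeMounts entry refers to a volume by name and does not need to know its type.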

hostPath

  • This type of volume mounts a file or directory from the host node's filesystem into your pod.
  • The hostPath directory refers to a directory created on the Node where the pod is running.
  • Use it with caution because when pods are scheduled on multiple nodes, each node gets its own hostPath storage volume. These may not be in sync with each other, and different pods might be using different data.
  • For example, if a pod with a hostPath volume is scheduled on Worker node 2, then "host" means Worker node 2, and every hostPath location in the manifest file refers to the filesystem of Worker node 2 only.
  • When a node becomes unstable, the pods might fail to access the hostPath directory and eventually get terminated.
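
One way to reduce the risk described above is to pin the pod to a single node so that it always sees the same hostPath directory. The sketch below assumes a node whose hostname label is worker-node-2 and uses a hypothetical pod name. The kubernetes.io/hostname label and the hostPath type field are standard Kubernetes features, but whether pinning suits a given workload is a judgment call.

```yaml
# Illustrative only: the pod name and node name are hypothetical.
apiVersion: v1
kind: Pod
metadata:
  name: hostpath-pinned-pod
spec:
  nodeSelector:
    kubernetes.io/hostname: worker-node-2   # assumed node; list real names with kubectl get nodes
  containers:
  - name: busybox
    image: busybox
    command: ['sh', '-c', 'ls /output; sleep 3600']
    volumeMounts:
    - name: my-volume
      mountPath: /output        # location inside the container
  volumes:
  - name: my-volume
    hostPath:
      path: /var/data           # location on the node
      type: DirectoryOrCreate   # create the directory on the node if it is missing
```

Pinning trades scheduling flexibility for consistent data: if the chosen node is unavailable, the pod cannot be scheduled anywhere else.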

emptyDir

  • The emptyDir volume is first created when a Pod is assigned to a Node.
  • It is initially empty and has the same lifetime as the Pod.
  • By default, emptyDir volumes are stored on whatever medium is backing the node, which might be disk, SSD, or network storage. Setting emptyDir.medium to Memory makes Kubernetes use a RAM-backed tmpfs instead.
  • Containers in the Pod can all read and write the same files in the emptyDir volume.
  • This volume can be mounted at the same or different paths in each Container.
  • When a Pod is removed from a node for any reason, the data in the emptyDir is deleted permanently.
  • The data in an emptyDir is not deleted when a container in the Pod crashes or restarts, because the Pod itself stays on the node.
  • It is mainly used to store cache or temporary data to be processed.
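
As a sketch of the cache use case, the Pod below keeps its emptyDir in RAM and caps its size. The medium and sizeLimit fields are part of the emptyDir spec. The pod name, volume name, mount path, and the 64Mi limit are arbitrary values picked for illustration.

```yaml
# Illustrative only: names and the size limit are hypothetical.
apiVersion: v1
kind: Pod
metadata:
  name: cache-pod
spec:
  containers:
  - name: busybox
    image: busybox
    command: ['sh', '-c', 'echo cached > /cache/data.txt; sleep 3600']
    volumeMounts:
    - name: cache-volume
      mountPath: /cache
  volumes:
  - name: cache-volume
    emptyDir:
      medium: Memory    # back the volume with tmpfs (RAM) instead of node storage
      sizeLimit: 64Mi   # cap how much the volume may hold
```

Omitting medium (or writing emptyDir: {}) falls back to the node's default storage.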

  1. Create a Pod that uses a hostPath volume to store data on the host.
Create a YAML file named volume-pod.yml with the following content:
apiVersion: v1
kind: Pod
metadata:
  name: volume-pod
spec:
  restartPolicy: Never
  containers:
  - name: busybox
    image: busybox
    command: ['sh', '-c', 'echo Success! > /output/success.txt']
    volumeMounts:
    - name: my-volume
      mountPath: /output  # location inside the container
  volumes:
  - name: my-volume
    hostPath:
      path: /var/data
Run the following command to create the Pod:
kubectl create -f volume-pod.yml
To check which worker node the pod is running on, run:
kubectl get pod volume-pod -o wide
Log in to that host and verify the contents of the output file:
cat /var/data/success.txt
  2. Create a multi-container Pod with an emptyDir volume shared between containers.
Create a YAML file named shared-volume-pod.yml with the following content:
apiVersion: v1
kind: Pod
metadata:
  name: shared-volume-pod
spec:
  containers:
  - name: busybox1
    image: busybox
    command: ['sh', '-c', 'while true; do echo Success! > /output/output.txt; sleep 5; done']
    volumeMounts:
    - name: my-volume
      mountPath: /output
  - name: busybox2
    image: busybox
    command: ['sh', '-c', 'while true; do cat /input/output.txt; sleep 5; done']
    volumeMounts:
    - name: my-volume
      mountPath: /input
  volumes:
  - name: my-volume
    emptyDir: {}
Run the following command to create the multi-container Pod:
kubectl create -f shared-volume-pod.yml
  3. Check the container log for busybox2. You should see the data that was generated by busybox1.
Run the following command to view the logs of the busybox2 container:
kubectl logs shared-volume-pod -c busybox2
The output should show the Success! line that busybox1 writes to the shared volume and busybox2 reads back from it.