Kubernetes Storage – vZilla https://vzilla.co.uk – One step into Kubernetes and Cloud Native at a time, not forgetting the world before.

How to – Amazon EBS CSI Driver https://vzilla.co.uk/vzilla-blog/how-to-amazon-ebs-csi-driver Tue, 06 Apr 2021

In a previous post, we covered where CSI came from, where it is going, and the benefit of having an industry-standard interface: storage vendors can develop a plugin once and have it work across a number of container orchestration systems.

The reason for this post is to highlight how to install the driver and enable volume snapshots. The driver itself is still in the beta phase and volume snapshots are in the alpha phase; the beta driver is well tested and supported in Amazon EKS for production use, but alpha features are not supported within Amazon EKS clusters. The fact that we must deploy it into our new Amazon EKS clusters means the CSI driver for Amazon EBS volumes is not the default option today, but it will become the default in the future.

The driver implements the CSI interface for consuming Amazon EBS volumes.

Before we start, the first thing we need is an EKS cluster. To get one you can follow either this post, which walks through creating an EKS cluster, or this one, which walks through creating an AWS Bottlerocket EKS cluster. If you want the official documentation from Amazon, you can also find that here.

OIDC Provider for your cluster

For my use case with the CSI driver I needed to use IAM roles for service accounts, and for that an IAM OIDC provider must exist for your cluster. First, run the following command against your EKS cluster to check whether you already have an IAM OIDC provider:

#Determine if you have an existing IAM OIDC provider for your cluster


aws eks describe-cluster --name bottlerocket --query "cluster.identity.oidc.issuer" --output text


Now we can run the following command to list any existing OIDC providers; take the ID returned above and pipe a grep for it into the command below.

#List the IAM OIDC providers - if nothing is returned you need to create one


aws iam list-open-id-connect-providers


If the above command did not return anything then we need to create an IAM OIDC provider, which we can do with the following command.

#Create an IAM OIDC identity provider for your cluster


eksctl utils associate-iam-oidc-provider --cluster bottlerocket --approve

Repeat the aws iam list-open-id-connect-providers command and it should now return a provider.

IAM Policy Creation

The IAM policy that we now need to create is what will be used by the CSI driver's service account, which is what talks to the AWS APIs.

Download the example IAM policy; if this is a test cluster you can use it as-is. You can see the actions it allows in the JSON file itself.

#Download IAM Policy - https://github.com/kubernetes-sigs/aws-ebs-csi-driver/blob/master/docs/example-iam-policy.json


curl -o example-iam-policy.json https://raw.githubusercontent.com/kubernetes-sigs/aws-ebs-csi-driver/v0.9.0/docs/example-iam-policy.json


For test purposes, I am also going to keep the same names as the documentation walkthrough. After the command, I will show how this looks within the AWS Management Console.

#Create policy


aws iam create-policy --policy-name AmazonEKS_EBS_CSI_Driver_Policy --policy-document file://example-iam-policy.json

Viewing the policy within the AWS Management Console confirms that it matches the JSON file we uploaded.


Next, we need to create the IAM role.

#Create an IAM Role

aws eks describe-cluster --name bottlerocket --query "cluster.identity.oidc.issuer" --output text

aws iam create-role --role-name AmazonEKS_EBS_CSI_DriverRole --assume-role-policy-document "file://D:\Personal OneDrive\OneDrive\Veeam Live Documentation\Blog\AWS EKS Setup\trust-policy.json"


The first command gathers the OIDC issuer, which goes into the trust-policy.json file; you also need to replace the account ID in the Federated line with your own AWS account ID. Further information can be found in the official AWS documentation.
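For reference, my trust-policy.json looked roughly like the following. This is a sketch: the account ID, region and OIDC ID below are placeholders that you must replace with the values returned by the describe-cluster command above, and the service account name assumes the default ebs-csi-controller-sa used later in this post.

```shell
# Placeholder values -- substitute your own account ID, region and the
# OIDC ID from "aws eks describe-cluster ... cluster.identity.oidc.issuer"
ACCOUNT_ID="111122223333"
REGION="eu-west-2"
OIDC_ID="EXAMPLED539D4633E53DE1B716D3041E"

# Write the trust policy, expanding the placeholders above
cat > trust-policy.json <<EOF
{
  "Version": "2012-10-17",
  "Statement": [
    {
      "Effect": "Allow",
      "Principal": {
        "Federated": "arn:aws:iam::${ACCOUNT_ID}:oidc-provider/oidc.eks.${REGION}.amazonaws.com/id/${OIDC_ID}"
      },
      "Action": "sts:AssumeRole",
      "Condition": {
        "StringEquals": {
          "oidc.eks.${REGION}.amazonaws.com/id/${OIDC_ID}:sub": "system:serviceaccount:kube-system:ebs-csi-controller-sa"
        }
      }
    }
  ]
}
EOF

# Sanity-check that the generated file is valid JSON before using it
python3 -m json.tool trust-policy.json > /dev/null && echo "trust-policy.json is valid JSON"
```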


Next, we need to attach the policy to the role, which can be done with the following command; take a copy of the role ARN output from the create-role command above.

#Attach policy to IAM Role


aws iam attach-role-policy --policy-arn arn:aws:iam::197325178561:policy/AmazonEKS_EBS_CSI_Driver_Policy --role-name AmazonEKS_EBS_CSI_DriverRole


Installing the CSI Driver

There are quite a few different ways to install the CSI driver, but Helm is the easy option.

#Install EBS CSI Driver - https://github.com/kubernetes-sigs/aws-ebs-csi-driver#deploy-driver


helm repo add aws-ebs-csi-driver https://kubernetes-sigs.github.io/aws-ebs-csi-driver


helm repo update


helm upgrade --install aws-ebs-csi-driver --namespace kube-system --set enableVolumeScheduling=true --set enableVolumeResizing=true --set enableVolumeSnapshot=true aws-ebs-csi-driver/aws-ebs-csi-driver

Now annotate the controller service account so the controller pods know which IAM role to use when talking to AWS to create EBS volumes and attach them to nodes, then restart the pods:

kubectl annotate serviceaccount ebs-csi-controller-sa -n kube-system eks.amazonaws.com/role-arn=arn:aws:iam::197325178561:role/AmazonEKS_EBS_CSI_DriverRole

kubectl delete pods -n kube-system -l=app=ebs-csi-controller

Regardless of how you deployed the driver, run the following command to confirm it is running. You will see the CSI controller and CSI node pods; the number of node pods should equal the number of worker nodes in your cluster.

#Verify driver is running (ebs-csi-controller pods should be running)


kubectl get pods -n kube-system


Now that we have everything running that should be running, we will create a storage class.

#Create a StorageClass


kubectl apply -f https://raw.githubusercontent.com/kubernetes-sigs/aws-ebs-csi-driver/master/examples/kubernetes/snapshot/specs/classes/storageclass.yaml


kubectl apply -f "D:\Personal OneDrive\OneDrive\Veeam Live Documentation\Blog\AWS EKS Setup\storageclass.yaml"
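For reference, the storageclass.yaml applied above is roughly the following sketch. The class name ebs-sc and the WaitForFirstConsumer binding mode match what the later examples assume; check the linked repo for the authoritative version.

```yaml
apiVersion: storage.k8s.io/v1
kind: StorageClass
metadata:
  name: ebs-sc
# The EBS CSI driver installed above, rather than the in-tree kubernetes.io/aws-ebs
provisioner: ebs.csi.aws.com
# Delay volume creation until a pod using the claim is scheduled,
# so the EBS volume lands in the same availability zone as the node
volumeBindingMode: WaitForFirstConsumer
```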

CSI Volume Snapshots

Before we continue to check and configure volume snapshots, confirm that the ebs-snapshot-controller-0 pod is running in your kube-system namespace.


You then need to install the following CRDs, which can be found at this location if you wish to view them before applying.

kubectl apply -f https://raw.githubusercontent.com/kubernetes-csi/external-snapshotter/master/client/config/crd/snapshot.storage.k8s.io_volumesnapshotclasses.yaml

kubectl apply -f https://raw.githubusercontent.com/kubernetes-csi/external-snapshotter/master/client/config/crd/snapshot.storage.k8s.io_volumesnapshotcontents.yaml

kubectl apply -f https://raw.githubusercontent.com/kubernetes-csi/external-snapshotter/master/client/config/crd/snapshot.storage.k8s.io_volumesnapshots.yaml


Finally, we need to create a volume snapshot class; much like a storage class, this lets operators describe the storage used when provisioning a snapshot.

#Create volume snapshot class using the link https://github.com/kubernetes-sigs/aws-ebs-csi-driver/tree/master/examples/kubernetes/snapshot


kubectl apply -f https://raw.githubusercontent.com/kubernetes-sigs/aws-ebs-csi-driver/master/examples/kubernetes/snapshot/specs/classes/snapshotclass.yaml


kubectl apply -f "D:\Personal OneDrive\OneDrive\Veeam Live Documentation\Blog\AWS EKS Setup\snapshotclass.yaml"
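The snapshotclass.yaml applied above is roughly the following sketch. The class name csi-aws-vsc matches what I use later with Kubestr; note the apiVersion is an assumption that depends on which version of the external-snapshotter CRDs you installed, so check the linked repo for the authoritative version.

```yaml
apiVersion: snapshot.storage.k8s.io/v1beta1
kind: VolumeSnapshotClass
metadata:
  name: csi-aws-vsc
# Snapshots are taken by the EBS CSI driver installed earlier
driver: ebs.csi.aws.com
# Delete the underlying EBS snapshot when the VolumeSnapshot object is deleted
deletionPolicy: Delete
```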

Those steps should get you up and running with the CSI driver in your AWS EKS cluster. There are a few steps I still need to clarify for myself, especially around the snapshots: my goal was to use Kasten K10 to create snapshots of my applications and export them to S3, which is why I am unsure whether the snapshot steps above are strictly required for that.

If you have any feedback, either comment down below or find me on Twitter. I am OK with being wrong, as this is a learning curve for a lot of people.

Understanding the Kubernetes storage journey https://vzilla.co.uk/vzilla-blog/understanding-the-kubernetes-storage-journey Sun, 04 Apr 2021

Some may say that Kubernetes is built only for stateless workloads, but one thing we have seen over the last 18-24 months is an increase in stateful workloads: think of your databases, messaging queues and batch-processing functions, all requiring some state to be consistent and work. Some people also believe that this state should live outside the cluster but be consumed by the stateless workloads run within the Kubernetes cluster.

The people have spoken


In this post, we are going to briefly talk about the storage options available in Kubernetes and then spend some time on the Container Storage Interface (CSI), which has enabled storage vendors and cloud providers to fast-track development of cloud-native storage solutions for those stateful workloads.

Before CSI

Let’s rewind a little. Before CSI there was the concept of in-tree volume plugins, meaning the code was part of the Kubernetes core. New in-tree provisioners from various storage offerings could only be released when the main Kubernetes code shipped, and the same applied to bug fixes, which also had to wait. That slowed adoption for all the storage and cloud vendors wanting to bring their offerings to the table.

From the Kubernetes side, there were also potential risks if the third-party code caused reliability or security issues. Then there is testing: how would the code maintainers test and ensure everything was good without, in some cases, physical access to the storage systems?

CSI massively helps resolve most of these issues, and we are going to get into this shortly.

Kubernetes Storage Today

Today we have a blend of the in-tree providers and the new CSI drivers; we are in the transition period before everything spins over to CSI and in-tree support is removed completely. Within the hyperscalers in particular (AWS, Azure and GCP), the default storage options still use the in-tree providers, with alpha and beta CSI functionality available to test. I have more specific content on this coming in later posts.

As mentioned, with in-tree you do not need to install any additional components, whereas with CSI you do. In-tree is the easy button, but easy is not always the best option.

Before you can consume underlying infrastructure resources through CSI, the drivers must be installed in your cluster; I am not sure whether this will change moving forward, to be honest. The table below shows the current and targeted time frames for specific CSI driver migrations; some are here now for us to test and try, and some are targeted for later releases.
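The practical difference shows up in the storage class: the same kind of EBS-backed class can point at either plugin, and only the provisioner field changes. The class names below are my own examples; the provisioner values are the real in-tree and CSI identifiers for AWS EBS.

```yaml
# In-tree: the provisioner is compiled into Kubernetes itself,
# nothing extra to install
apiVersion: storage.k8s.io/v1
kind: StorageClass
metadata:
  name: gp2-in-tree
provisioner: kubernetes.io/aws-ebs
---
# CSI: the provisioner is the separately installed EBS CSI driver
apiVersion: storage.k8s.io/v1
kind: StorageClass
metadata:
  name: gp2-csi
provisioner: ebs.csi.aws.com
```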


Source – https://kubernetes.io/blog/2019/12/09/kubernetes-1-17-feature-csi-migration-beta/

What is the CSI

CSI is a way for third-party storage providers to provide storage operations for container orchestration systems (Kubernetes, Docker Swarm, Apache Mesos etc); it is an open and independent interface specification. As mentioned before, it enables those providers to develop their plugins and ship code without waiting for Kubernetes releases. Overall, it is a great effort from community members from Kubernetes, Docker and Mesosphere, and the interface standardises the model for integrating storage systems.

This also means developers and operators only have to worry about one storage configuration, which stays in line with the premise of portability in Kubernetes and other container orchestrators.

CSI Driver Responsibility

Going a little deeper into the responsibilities here (I may come back to this in a follow-up post, as I find the standardised process intriguing), there are four things to consider about what is happening under the hood with a CSI driver.

CSI Driver – must be installed on each node that will leverage the storage. I have only seen the CSI pods running within kube-system, so my assumption at this stage is that it needs to run there, as a privileged pod. There are three services worth mentioning:

Identity Service – must run on any node that will use the CSI driver; it informs the node about the instance and driver capabilities, such as whether snapshots or storage-topology-aware pod scheduling are supported.

Controller Service – Makes the decisions but does not need to run on a worker node.

Node Service – like the Identity Service, it must run on every node that will use the driver.

Example workflow

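In words: a pod references a claim, the claim references a storage class backed by the CSI driver, the controller service provisions the volume, and the node service attaches and mounts it on the scheduled node. A minimal sketch of the two objects involved — the names here are my own, and the class name ebs-sc assumes a CSI-backed storage class like the one from the AWS post:

```yaml
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: data-claim
spec:
  accessModes: ["ReadWriteOnce"]
  # A storage class whose provisioner is a CSI driver
  storageClassName: ebs-sc
  resources:
    requests:
      storage: 4Gi
---
apiVersion: v1
kind: Pod
metadata:
  name: app
spec:
  containers:
    - name: app
      image: busybox
      command: ["sleep", "3600"]
      volumeMounts:
        - name: data
          mountPath: /data
  volumes:
    - name: data
      persistentVolumeClaim:
        claimName: data-claim
```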

This was more of a theory post for me to get my head around storage in Kubernetes. It was of particular interest because of a newly released open-source project called Kubestr: this handy little tool gives you the ability to identify storage, both in-tree provisioners and CSI, validate that your CSI driver is configured correctly, and run a Flexible I/O (fio) test against your storage, giving you a nice way to automate benchmarking of your storage systems. In the next posts we will walk through getting the CSI driver configured in the public cloud, likely starting with AWS and Microsoft Azure, which both have pre-release versions available today.

Any feedback or if I have missed something, drop it in the comments down below.

Introducing Kubestr – A handy tool for Kubernetes Storage https://vzilla.co.uk/vzilla-blog/introducing-kubestr-a-handy-tool-for-kubernetes-storage Tue, 30 Mar 2021

My big project over the last month has not only been getting up to speed with Kubernetes, but also a parallel effort around Kubernetes storage and an open-source project that is released today. In this post we are going to touch on how to get going with Kubestr, a handy set of tools to help you identify, validate, and evaluate your Kubernetes storage.

The Challenge

The challenge with Kubernetes storage is that it is not all that easy, and some of the tasks Kubestr helps with are very manual. For example, adoption of CSI drivers and the choice of storage available within our Kubernetes clusters are growing fast; this tool assists in validating that a CSI driver is configured correctly for snapshots, which in turn means we can use data protection methods within our cluster. Another hard task is benchmarking storage: it could be done before Kubestr, but it was a potential pain and it took time. Kubestr gives us the easy button to evaluate.

All of this while there are so many storage options out there, we want to make sure we are using the right storage for the right task. You can always pay for the most expensive disk, especially in the public cloud, but let's make sure you actually need it so you don't overspend. And instead of spending your time building benchmarking tooling manually, Kubestr saves you that time while giving you better understanding and visibility into your storage options.

You can find out more on the Kasten by Veeam blog, which explains in more detail the challenges and the reasons Kubestr was born.

Getting Started with Kubestr

We all use different operating systems to manage our Kubernetes clusters; first and foremost, Kubestr is available for Windows, macOS and Linux, and you can find links to these releases as well as the source code here.

Once you have it installed, the first command I suggest is the following (I am running Windows). It shows the simplicity of the command surface as well as the additional commands available.

.\kubestr.exe --help


Identify your Kubernetes Storage options

The first thing this handy little tool can help with is giving you visibility into the Kubernetes storage options available to you. I am running it below against an Amazon EKS cluster using the Bottlerocket OS on the nodes, with the AWS EBS CSI driver and snapshot capabilities installed (these are not deployed by default). My cluster is new and configured correctly, but this tool will also highlight when things are not: maybe you have the storage class but no volume snapshot class, or maybe you have multiple storage options attached and some are unused, in which case it highlights that you could save by removing them.

.\kubestr.exe


Validate your Storage

Now that we have our storage classes and our volume snapshot class, we can run a check against the CSI driver to confirm it is configured correctly. If we run the same help command with the csicheck command, we get the following options.


If we run it against our Kubernetes cluster, storage class and volume snapshot class, we see a process that runs through creating an application, taking a snapshot, restoring the snapshot and confirming that the configuration is complete.


.\kubestr.exe csicheck -s ebs-sc -v csi-aws-vsc


Evaluate your Storage

Obviously, most people will have access to more than one Kubernetes cluster; to run against additional clusters you simply change the kubectl config context to the cluster you want to test. In this section we look at the options for evaluating your Kubernetes storage. This has a very similar flow to the csicheck covered above, except there is no restore; instead we get performance results from Flexible I/O (fio).


Let’s start with the help command to see our options.

.\kubestr.exe fio --help


Now we can run a test against our storage class with the default configuration listed above.

.\kubestr.exe fio -s ebs-sc


We can also cater to specific workloads by using different volume sizes for the tests.

.\kubestr.exe fio -s ebs-sc -z 400Gi


Then we can output the results to JSON. This is where the community can help: by extracting that JSON we can build better reporting on the results, so the community can understand storage options without having to run these tests manually on their own clusters.

.\kubestr.exe fio -s ebs-sc -z 400Gi -o json


.\kubestr.exe fio -s ebs-sc -z 400Gi -o json > results.json

I won’t post the whole JSON but you get the idea.


Finally, we can also bring our own fio configurations; you can find these open-source example files here.

#BYOFIO - demonstrates how to read backwards in a file

.\kubestr.exe fio -s ebs-sc -f "D:\Personal OneDrive\OneDrive\Veeam Live Documentation\Blog\Kubestr\fio\examples\backwards-read.fio"


#BYOFIO - fio-seq-RW job - takes a long time!


.\kubestr.exe fio -s ebs-sc -f "D:\Personal OneDrive\OneDrive\Veeam Live Documentation\Blog\Kubestr\fio\examples\fio-seq-RW.fio"
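If you want to write your own job file rather than use the shipped examples, a minimal fio job looks like this. This is a sketch of the standard fio job-file format: the job name is my own, and you should tune the block size, queue depth, size and runtime to the workload you actually care about.

```ini
[global]
; bypass the page cache so we measure the volume, not memory
direct=1
ioengine=libaio
runtime=60
time_based

[random-read-4k]
; small-block random reads, a common IOPS-style test
rw=randread
bs=4k
iodepth=32
size=1G
```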

I have just uploaded a quick lightning talk I gave at KubeCon EU 2021 on this handy little tool.

My next ask is simple: please go and give it a go, and then give us some feedback.

