vZilla

Deploying the Veeam Software Appliance on KubeVirt

michaelcade — Tue, 23 Sep 2025 07:28:14 +0000

Virtual machines on Kubernetes are a thing, so I thought it would be a good idea to run through how to get the Veeam Software Appliance up and running on KubeVirt. This will also translate to enterprise variants that use KubeVirt to enable virtualization on top of Kubernetes, such as Red Hat OpenShift Virtualization and SUSE Harvester.

I wrote about the Veeam Software Appliance and ran through the steps to get the system up and running as the brains of your backup environment along with some of the benefits it brings.

The Veeam Software Appliance – The Linux Experience

For those familiar with the process I took in the above link in vSphere, be warned: defining virtual machines in Kubernetes is all about the YAML.

What you will need

A Kubernetes cluster. I will be using Talos Linux as my Kubernetes distribution in the home lab, but any bare metal Kubernetes cluster should work, provided it meets the system requirements.

The nodes in your cluster should have enough resources to meet the requirements of the Veeam Software Appliance: 8 vCPU, 16+ GB RAM, and two disks (≈240 GB+ each).

You should have a StorageClass with enough capacity to store the above disk requirements.

KubeVirt + Containerized Data Importer (CDI) installed and working. CDI is used to create DataVolumes and import/upload images. (I am running version 1.6.0 of KubeVirt.)

Client side you will need kubectl and virtctl. kubectl is how we will interact with the cluster; virtctl helps us upload images, start virtual machines and open a VNC/console session to the machine.

High Level Steps

Upload the Veeam Software Appliance ISO into the cluster (DataVolume)
Create blank DataVolumes (PVCs) that will become the appliance target disks (these need to be 2 x 250 GB because we need 248 GB)
Create a Virtual Machine that attaches the ISO and the blank PVCs as disks (ensuring the correct boot order)
Create a Service so that the system can be accessed from outside the cluster
Start the VM and work through the initial configuration wizard of the appliance

Upload the ISO into the cluster

We now need to get the ISO into our cluster as a DataVolume to be used with the virtual machine creation in the next step. First we need to create a namespace where we will create the virtual machine and everything associated with it.


kubectl create ns veeam

You could use port-forward but for a large ISO like this it might take a considerable amount of time. I opted to use a local web server on my system to share the ISO. I used a simple python web server with the following command.


python -m http.server 8080

We will then create a DataVolume for our ISO and we will specify the URL to get our ISO. Be sure to change your path and storageclass.


# veeam-iso-dv.yaml
apiVersion: cdi.kubevirt.io/v1beta1
kind: DataVolume
metadata:
  name: veeam-iso-dv
  namespace: veeam
spec:
  pvc:
    accessModes: ["ReadWriteOnce"]
    resources:
      requests:
        storage: 13Gi
    storageClassName: ceph-block
  source:
    http:
      url: http://192.168.169.5:8080/VeeamSoftwareAppliance_13.0.0.4967_20250822.iso

We will then apply this with


kubectl apply -f veeam-iso-dv.yaml

The import process will then start and you can use the following command to see the import progress.


kubectl -n veeam get dv veeam-iso-dv -w

You can keep an eye on this, and don’t panic if it gets to 99.99% and sits there for the same amount of time again with no updates. You can come out of this watch command and check the pod logs: you will have an importer pod in your veeam namespace that you can run this command against.


kubectl logs importer-prime-a1ec931f-5335-4f59-aa97-6f165f1c38eb -n veeam

When complete you can run the following command and hopefully you will see something like the screenshot below


kubectl -n veeam get dv veeam-iso-dv

Note: when I first tried to use port-forward with this ISO, the estimate was 7+ hours, so this way is much more efficient.

Create blank target disks (DataVolumes)

You might at this stage be realising the difference between KubeVirt and a hypervisor like vSphere…
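The blank target disks from the high level steps can be described in the same way as the ISO DataVolume, just with a blank source instead of an HTTP one. As a minimal sketch only (this is not necessarily what is in the GitHub repository), the Python below prints two blank DataVolumes as JSON, which kubectl accepts just like YAML. The disk names veeam-disk-1 and veeam-disk-2 are hypothetical, the 250Gi size is my reading of the 2 x 250 GB requirement, and the namespace and ceph-block storage class are carried over from the ISO example, so amend them for your cluster.

```python
import json

NAMESPACE = "veeam"            # same namespace as the ISO DataVolume
STORAGE_CLASS = "ceph-block"   # assumption: reuse the storage class from the ISO example
DISK_SIZE = "250Gi"            # assumption: 2 x 250 GB from the high level steps
DISK_NAMES = ["veeam-disk-1", "veeam-disk-2"]  # hypothetical names, pick your own


def blank_datavolume(name: str) -> dict:
    """Build a CDI DataVolume manifest with an empty (blank) source."""
    return {
        "apiVersion": "cdi.kubevirt.io/v1beta1",
        "kind": "DataVolume",
        "metadata": {"name": name, "namespace": NAMESPACE},
        "spec": {
            "pvc": {
                "accessModes": ["ReadWriteOnce"],
                "resources": {"requests": {"storage": DISK_SIZE}},
                "storageClassName": STORAGE_CLASS,
            },
            # A blank source asks CDI for an empty disk instead of importing an image.
            "source": {"blank": {}},
        },
    }


manifests = [blank_datavolume(name) for name in DISK_NAMES]

# Wrap both objects in a v1 List so a single kubectl apply creates them together.
manifest_list = {"apiVersion": "v1", "kind": "List", "items": manifests}

if __name__ == "__main__":
    print(json.dumps(manifest_list, indent=2))
```

Saved as blank_disks.py (a hypothetical filename), the output could be piped straight into the cluster with python blank_disks.py | kubectl apply -f - and then watched with the same kubectl get dv command used for the ISO.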

We have the ISO uploaded to the cluster and we are now ready to create our Virtual Machine.

There is an option in the following YAML to use CloudInitNoCloud to inject an SSH user for access after installation. I have not added this: although you can SSH into the Veeam Software Appliance, from a security standpoint it is not enabled and requires elevated security officer approval. Maybe more on this in another post.

We are now going to define our Virtual Machine in code. Again if you are copying and pasting then please amend resources, storageclass and namespace if different to what I have used above.

The appliance requires UEFI rather than BIOS, so we have made that change to the VM configuration below.

You can find the VM code, veeam-vm.yaml, in this GitHub repository:

https://github.com/MichaelCade/vsa_kubevirt

Now we can create our VM using the command below


kubectl apply -f veeam-vm.yaml

At this stage our virtual machine is powered off but to confirm the status we can check using


kubectl get virtualmachine -n veeam

Start the VM

Four YAML files later, we are now ready to start the machine.

When ready we can start things using the following virtctl command


virtctl start veeam-appliance -n veeam

Let’s then use that command again to check the status of our virtual machine


kubectl get virtualmachine -n veeam

Now we need a console to complete our initial configuration wizard within the TUI. I have installed TigerVNC as my client.

You can then run a virtctl command to connect


virtctl vnc veeam-appliance -n veeam

With this command you will also see the following pop up.

Ok, I think that is enough for this post. I have been storing the code snippets in a repository on GitHub that can be found here. – https://github.com/MichaelCade/vsa_kubevirt

Once you get to this page of the Initial configuration, you can then follow this post to complete the steps. – https://vzilla.co.uk/vzilla-blog/the-veeam-software-appliance-the-linux-experience

The Veeam Software Appliance – The Linux Experience

michaelcade — Mon, 08 Sep 2025 15:18:24 +0000

The first week in September 2025 saw a massive initial release of the Veeam Software Appliance. Since the inception of Veeam and the ability to protect virtual machines on VMware vSphere, Veeam has been a Windows Server based product. Until now.

I have also skipped over how “Virtualisation was just the start”, and it was. Now the Veeam Data Platform protects workloads and data across many different platforms: VMware vSphere, Microsoft Hyper-V, Proxmox, Oracle Linux Virtualisation and lots more hypervisors, as well as public cloud workloads on AWS, Microsoft Azure and Google Cloud. The protection of Kubernetes came almost five years ago with the acquisition of Kasten, now known as Veeam Kasten or Veeam Kasten for Kubernetes. We have M365, Salesforce backup, Entra ID, Agents for Windows, Linux, Solaris and AIX, and we also protect unstructured data by way of NAS devices and Object Storage locations.

As of this week the management layer of Veeam is no longer just an option on Windows. Linux has entered the room.

Home Lab Setup Plan

My morning of the 3rd of September was spent waiting for the downloadable, public release of the Veeam Software Appliance (VSA). We as Veeam made a big splash about this at our event earlier in the year and showed off a lot of the features and functionality, so I am not going to get into too much of that here.

When the downloads became available it was time to start the engines, but not without a solid plan. I have documented my home lab setup in previous posts. As much as this is a home lab, I still want to treat things with some real life accountability. However, what I am about to show you is that my repository will be living on the same virtual environment I am likely going to be protecting, for now. This will be resolved down the line with the Veeam Data Cloud Vault option, where I can store those backups offsite in the Veeam first party solution.

OK, back to the plan.

As you can see from the above image, we had some virtual machines to create: 10 if we include that DNS server down there. We also have some future items that Veeam has documented as coming in the not too distant future.

The VSA is available in OVA and ISO formats. There is then a smaller ISO for the Veeam components such as the Hardened Repository and proxies. You can see we have 5 proxies planned and a Hardened Repository (again, it is bad practice to store a backup repository on the infrastructure you are going to protect, don’t be like me).

For the VBR (Veeam Backup & Replication) server we will use the OVA and import this into our vSphere environment and create a new VM.

For the EM (Enterprise Manager) server we will use the ISO and create a VM and run through the configuration steps to get this setup.

The proxies and hardened repository will use the smaller ISO

Veeam ONE is a powerhouse with this release, still Windows but massively important. We will provision a Windows 2025 Server for this.

I will mention for this plan, I also provisioned a Windows 2025 Server Core for the DNS server and created a new forest called “vzilla.local”

Hopefully that makes everything in the picture a little clearer. The web client can be any OS and any mainstream browser (I have tried with Chrome, Edge and Firefox). The thick client is used to access some of Veeam ONE and the VBR server. I have installed the VBR thick client on the Veeam ONE Windows Server.

VBR Deployment Process

As mentioned, I am going to be using vSphere to deploy my virtual machines for this plan, but anything that supports the OVA can be used. Equally, this is why there is an ISO: for physical systems or platforms where the OVA is not supported.

Select the OVA

I am not going to go through this step by step. Basically, select the OVA, give the machine a relevant name for your environment, choose the DC, cluster or host you wish to deploy to, select some shared storage and finish.

When that import is complete, you will have a powered off virtual machine ready to be powered on. (I created another import to capture the process here and it is named 2025-veeam-vbr1)

Powering on the VM

Next we will see “Veeam Backup & Replication” booting, followed by the initial configuration wizard, starting with the license agreements.

Accept this then we can give the box a name.

Next we can configure our network. I have used the static option to set a standard static IPv4 address, but IPv6 is also available to set here.

Then we need to change the time, I am based in the UK so I selected change and searched for London to update the timezone but left the available NTP servers as is.

Then we are onto setting a host administrator, this process is the same for any managed Veeam machine, proxies, repositories and enterprise manager.

The next screen is configuring the multi factor authentication (MFA), I am using the Microsoft Authenticator for all of my MFA needs so I hit show QR code, scan that within the app and then type the number provided from the app to proceed to the next step.

Following that step you are asked if you would like to create a Security Officer:

At the Security Officer step of the Initial Configuration wizard, configure the default security officer account to perform specific operations in the Host Management console — veeamso. This account type provides an additional security layer to protect your infrastructure against malicious system administration.

The above was taken from the documentation pages – https://helpcenter.veeam.com/docs/vbr/em/deployment_linux_iso_install_security_officer.html?ver=13

If you choose to skip this step, you are prompted with a very red warning: it will not be possible to enable this role later on. For the purposes of this demo walkthrough I have said OK, but in my lab environment I have created that veeamso account on all hosts. When you set a password in this wizard, you will be required to change it when your security officer first logs into the management console to approve tasks.

The final page is a summary of what you have configured

Select finish, then wait for the state configuration to be saved and for the services to come up. When they are up you will be able to access the appliance.

Host Management Console

The Host Management Console allows administrators to perform configuration and maintenance tasks, including managing network settings, server time, host users and roles, backup infrastructure, OS and Enterprise Manager updates, maintenance operations, and security settings.

This can be accessed via https://192.168.169.122:10443/

Obviously change this to the IP address you set. You will then be able to log in here with the veeamadmin account.

Web UI Preview

The Veeam Backup & Replication web UI is a browser-based interface that enables you to manage backup and recovery operations, monitor your backup infrastructure, and configure system settings from any supported device. The web UI provides a modern, streamlined experience designed to simplify daily administration and deliver at-a-glance visibility into your data protection environment.

Please note that this is a preview and not all capabilities within Veeam Data Platform are available here today in this release, you will be able to manage some of your environment from here but you will need the thick client access for all tasks.

This can be accessed via https://192.168.169.122/

Again, change the IP address above to your own.

Thick Client Access

For those of you familiar with the thick client approach, it has been enhanced and looks much better. It’s faster and is designed to let you quickly find the commands you need and perform data protection and disaster recovery tasks.

As stated in my diagram above, you will need a Windows based machine to install and use the thick client. Point the client to your address above.

Next Steps

To complete the home lab setup, I created 6 further virtual machines using the smaller JeOS ISO. This is what was used to create the hardened Linux repository (bad practice to run this on a VM, and the security score in Threat Center will also warn you against this, don’t be like me) and then the five proxies. The process is very much the same as the initial configuration wizard we went through above, and then you add them into your VBR thick client.

With the proxies, I created one proxy per vSphere host. When you add them in, you have the choice of what type of proxy they are going to be.

I then repeated the steps for VMware and VMware CDP as I want these to be the data movers for all tasks.

I had also deployed a Windows Server, downloaded the new Veeam ONE ISO and got things up and running there, maybe another post on those steps to come. Adding Veeam ONE is important as this gives you some great insight into the security posture of your backup environment. The threat center element is pulling data from Veeam ONE to display within the thick client and the web UI.

Next, I wanted to get my Enterprise Manager VM up and running. To do this I used the Veeam Software Appliance ISO, which runs through a similar configuration wizard to the one shown above, but on first boot you get a choice of installing Enterprise Manager or Veeam Backup & Replication. The process was to create a VM with the required CPU, memory and disks and then run through that wizard.

The final step was to add some infrastructure to protect and then create some backup jobs. If you are familiar with Veeam this should be the same process as before.

To caveat once again here, this is a lab environment where I can show demos of Veeam Software, I am using virtualization for components that in production should be on physical hardware and not virtual machines, but for home lab environments what I have built will cover elements of what I need to cover in demonstrations. I will also be adding cloud based protection workloads and Veeam Kasten instances later on down the line that I have running to extend the lab into different platforms.

Fixing a bounced vCenter server

michaelcade — Wed, 23 Jul 2025 09:02:16 +0000

A few times a year my home lab gets bounced for whatever reason. Because I have the vSphere vCenter living on top of the 5 ESXi nodes, this generally causes an issue: when the hosts come back up we are lacking a fully functional vCenter. I probably need to consider a better approach, but here we are.

I am able to get into the vCenter Server Management console at https://192.168.169.181:5480 a port etched into my brain for some reason!

But when we head to services we have many that are not running and they should be.

I check the access settings and ensure that SSH is enabled so we can dive deeper and try and get these services up and running.

I first try ssh root@192.168.169.181 but I am met with

So my alternate way is using

ssh -o IdentitiesOnly=yes -o PreferredAuthentications=password -o PubkeyAuthentication=no root@192.168.169.181

Which gets me in to

I then use the service-control --status command to see the same as what we have in the management UI

Followed by service-control --start --all, which seems to sit here for a very long time, but I am noting this down at this stage to make sure future me doesn’t do anything silly.

While the above was going on, I checked the time tab in the management UI and we were back in time, around 2023. I wondered if this could be why things were not quite right, as we were also using host based NTP. I changed this setting and got things up to date.

In changing the time we were now able to get into

We do have to wait a while for things to finish initializing here. But maybe it was all down to time! I actually got impatient and hit the reboot button from the management UI.

Things do not look all that pretty and I may update if relevant here with what is going on!

The Hypervisor Hunger Games – Service Provider Edition

michaelcade — Sun, 29 Jun 2025 09:52:01 +0000

Many MSPs (Managed service providers) have hedged their platform offering in and around the vSphere ecosystem and now what?

I have said before about the cost conundrum here and these are some decisions that people in all worlds will have to consider. But in a service provider world it’s maybe not a simple rip and replace with Nutanix AHV or another.

Service providers bring value with this stack: they not only bring a relationship with their customers, they can also automate and provide additional wrap-around services and join up the vast ecosystem we have when it comes to VMware.

It’s also very much a prize fight here for MSPs. Value add + capabilities, so spending the winnings on software licensing probably doesn’t add up. Maybe platform replacements like Nutanix AHV or even Red Hat OpenShift are not that much different licensing cost wise compared to the Broadcom tax. (Maybe it’s a valid tax, being the best hypervisor but also the strongest ecosystem.)

What I do think we could see is a lot of service providers looking into KVM based options. Albeit the ecosystem is maybe not as polished and supported it might just be enough to ramp up.

I am talking about options like Proxmox, XCP-NG and maybe even the new hypervisor option from HPE but this will come as a premium as well. These options will also not be free, they will be free like a puppy but the cost will come from elsewhere.

The other option could be KubeVirt. KubeVirt is what underpins Red Hat OpenShift Virtualisation, but it is an open source project that can be used across many Kubernetes distributions and managed with a bit more effort than on OpenShift. Could this be a real option for service providers to accelerate their own offerings into the cloud native ecosystem? An ecosystem that has been built over the last 10+ years.

I am going to share a fantastic resource for the vSphere admin here from my good friend Dean Lewis

Learn KubeVirt: Deep Dive for VMware vSphere Admins

I want to be clear that even though KubeVirt is established and has been around a while, it’s still missing the polish that we have within mainstream hypervisors and platforms like vSphere, Hyper-V and Nutanix AHV. But I remember when vSphere was like this, and we all flocked in that direction.

All I do know is that wherever service providers land, the requirement for data protection and management will be there regardless.

I wrote about protecting these VMs on Kubernetes here

VMs on Kubernetes protected unofficially by Veeam*

Finally, one thing is for sure. Virtual Machines are not going anywhere! We might be in a world surrounded by AI but the trusty virtualisation era isn’t over and will continue to be a staple be it in the data centre. Or…. In the public cloud…. Could the public cloud IaaS be an option instead of on premises for providers?

Taking a look at KubeBuddy for Kubernetes

michaelcade — Thu, 15 May 2025 10:17:42 +0000

I have been meaning to get to this little project for a while, and here we are. You can find a link to the site below. I like this initial in-your-face message though: it tells me that this tool is going to tell me something about my Kubernetes cluster that I didn’t know. For the record, I am going to download and run this on my home lab cluster and see what we get. This is not a production cluster!

So what is it…

KubeBuddy powered by KubeDeck helps you monitor, analyze, and report on your Kubernetes environments with ease. Whether you’re tracking cluster health, reviewing security configurations, or troubleshooting workloads, KubeBuddy provides structured insights.

Let’s get started

Suspiciously this Kubernetes tool is built using PowerShell, I don’t think I can name another tool with this characteristic?

Luckily, PowerShell is now available cross platform, I am using a Mac so as part of this getting started we will also be getting PowerShell installed via brew.


brew install powershell

Other installation steps can be found in the usage section of the page linked above. OK, good stuff: we have PowerShell installed and we can use pwsh from our Ghostty terminal to get into the shell. We can then run a command to get the KubeBuddy module installed.


Install-Module -Name KubeBuddy -Scope CurrentUser

Also from the above we can see that the way to start playing with KubeBuddy is the Invoke-KubeBuddy command.

We can then also use Get-Help Invoke-KubeBuddy -Detailed as a way to understand some additional flags we have access to here.

Are we ready to find out something we didn’t know about our cluster?

As you can see, it was pretty easy to run against my Kubernetes cluster. I am running a Talos cluster, which is designed to be very minimal and extremely secure, so there might be some things reported that are related to this.

The Output

As you can see from the end of the video above, we have an output. For this output we chose HTML, but you can get JSON, and I have seen a save to PDF feature in the report as well.

Here is the HTML output. I am not going to get into the issues it has found, maybe that is a follow up, but I think it’s great that we get a lot of detail without a lot of effort. The tool has taken away having to search and find this.

Navigation along the top allows you to dive into each of those areas and display warnings and errors found in those specifics.

When we scroll down we see some more detail about the cluster, even for home lab 20 Critical seems like something we should investigate further.

Finally, on this initial page we see some information about resources and cluster events, not much going on in the lab right now or something not being picked up is my suspicion here.

As you then go across the tabs at the top you can get more granular detail on each area. All tabs have a similar layout: the initial total of resources and those with issues, then some recommendations and some findings. Again useful, as finding this using kubectl would be a needle in a haystack.

My Thoughts?

This was a very quick overview of this little tool, I am intrigued by the PowerShell, I am intrigued by how this can be progressed and the future of the project and where it can go and highlight.

My Thoughts on Retrieval-Augmented Generation (RAG) and the Power of Vector Databases

michaelcade — Tue, 13 May 2025 08:48:35 +0000

Some of you may have heard of RAG, retrieval augmented generation?

If you want to use an LLM to answer questions about data it wasn’t trained on, you can use the RAG pattern to supplement it with extra data.

Image Source = https://learnopencv.com/rag-with-llms/

But before we get into RAG, I wanted to touch on Vector Databases a little as they have become popular with the world of AI.

TLDR; A Vector Database is fantastic at cataloging how different pieces of data are related to each other.

What is a Vector?

Vectors are arrays of numbers, and when those arrays represent something we call them embeddings. The term vector really just refers to the mathematical concept, whereas an embedding is kind of like an applied vector, if you will. So what do these embeddings represent? Well, technically anything you want, but it’s very common to use vector databases for natural language processing and semantic search, so they usually represent pieces of text.
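To make that a little more concrete, here is a minimal Python sketch of comparing embeddings. The three-number vectors and the words attached to them are made up purely for illustration (real embedding models produce hundreds or thousands of dimensions), and cosine similarity is just one common way of measuring how closely two vectors point in the same direction.

```python
import math


def cosine_similarity(a, b):
    """Return the cosine of the angle between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


# Hypothetical, hand-written embeddings: not the output of any real model.
embeddings = {
    "backup": [0.9, 0.1, 0.0],
    "restore": [0.8, 0.2, 0.1],
    "puppy": [0.0, 0.1, 0.9],
}

related = cosine_similarity(embeddings["backup"], embeddings["restore"])
unrelated = cosine_similarity(embeddings["backup"], embeddings["puppy"])

if __name__ == "__main__":
    print(f"backup vs restore: {related:.3f}")
    print(f"backup vs puppy:   {unrelated:.3f}")
```

With these made-up numbers the first pair scores much closer to 1 than the second, which is the kind of "how related are these two things" signal a vector database is built around.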

Want to learn more about Vector Databases, take on this book! I have not braved it but in the content I have been reading and watching this is mentioned a lot.

Deep Learning: A Visual Approach by Andrew Glassner

Vector databases are just collections of embeddings, and these are organised into indexes. An index is kind of like a table: a collection of rows of embeddings, and we call those rows records.

RAG

Ok this then brings us back to one of the initial things we said:

If you want to use an LLM to answer questions about data it wasn’t trained on, you can use the RAG pattern to supplement it with extra data.

Let’s say you have a bunch of support docs.

These would get turned into embeddings and stored in a vector database. Then when the user types in a prompt, that prompt gets turned into an embedding, which is used to search the vector database for similar information.

What you’re doing here is a similarity search. Basically, you’re just looking for the nearest neighbours to the embedding that you give the database.
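As a rough illustration of that nearest neighbour lookup and how its result ends up in front of the LLM, here is a small self-contained Python sketch. Everything in it is an assumption for demonstration: toy_embed is a word-counting stand-in for a real embedding model, the in-memory list stands in for the vector database, and the three support docs and the prompt template are invented.

```python
import math
from collections import Counter

# Tiny fixed vocabulary so the toy "embedding" has one dimension per word.
VOCAB = ["backup", "restore", "kubernetes", "cluster", "password", "reset"]


def toy_embed(text):
    """Stand-in for a real embedding model: count vocabulary words in the text."""
    words = Counter(text.lower().split())
    return [float(words[word]) for word in VOCAB]


def cosine(a, b):
    """Cosine similarity, returning 0.0 when either vector is all zeros."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


# Hypothetical support docs, embedded once and kept as (text, embedding) records.
DOCS = [
    "how to restore a backup",
    "create a kubernetes cluster",
    "reset your password",
]
index = [(doc, toy_embed(doc)) for doc in DOCS]


def nearest(prompt, k=1):
    """Return the k records whose embeddings are most similar to the prompt."""
    query = toy_embed(prompt)
    ranked = sorted(index, key=lambda record: cosine(query, record[1]), reverse=True)
    return [doc for doc, _ in ranked[:k]]


def build_prompt(question, k=1):
    """Prepend the retrieved text to the question before it goes to the LLM."""
    context = "\n".join(nearest(question, k))
    return f"Answer using this context:\n{context}\n\nQuestion: {question}"


if __name__ == "__main__":
    print(build_prompt("restore my backup"))
```

A real vector database does the same ranking with purpose-built indexes rather than sorting every record, but the shape of the flow is the same: embed the prompt, find the closest records, and hand them to the model alongside the question.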

An example

Obviously, I wanted to get hands-on and start playing with some of this stuff in a world of AI but also as a Data Technologist I wanted to see what was possible with some of this data and see how it would handle being hovered above a powerful LLM.

Which then led me down a rabbit hole of how important do these Vector Databases become after your own data is embedded, how much CPU and GPU time and effort does this cost to re embed if something was to go wrong? Anyway that might be another post shortly.

Above we mentioned

Let’s say you have a bunch of support docs.

Now instead of docs, let’s pretend that we have an amazing community repository called 90DaysOfDevOps full of data and learning information. Kind of similar to support docs! We could probably ask an LLM about 90DaysOfDevOps and get some info back… but it’s going to be vast and wide, and the LLM probably was not trained on this repository.

I am using Ollama with Mistral here… the other model will become clear later.

And if we then ask mistral a question about 90DaysOfDevOps, what do we get?

For some this might be the way we have been interacting with LLMs so far, but what if we were able to take that personal data, or data that we want to specifically embed, and use it against or alongside (not sure of the terms) an LLM? We can surely get a richer response overall?

I have my dataset in the 90DaysOfDevOps repository, locally git cloned to my machine. I then have that mxbai-embed-large model you saw above and a trusty friend of mine: a Postgres database instance running on a VM (but it could be anywhere) with the pgvector extension enabled for knowledge storage. (Maybe another post, let’s see how this one goes first.)

I wrote a little app to deal with that embed process which is then in turn the same app which will allow me to interact with that RAG + LLM via a chat / API interface.

https://github.com/MichaelCade/vector-demo

Again maybe we need to go into more detail about this app another time, but for now. We have our Knowledge from our 90DaysOfDevOps repository. Each of these markdown files contains basically a blog about a topic related to the world of DevOps.

We have our Golang code to embed our data.

When the worlds align, and we run our binary against our data that has access to our likely hard coded postgres database instance…. we should start the embedding process into our vector database.
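Before text is embedded it is usually split into smaller chunks, so that each record in the database covers one manageable piece of a document. The repository linked above is the place to look for the real Go implementation; purely as an illustration of the idea, a simple word-based splitter with overlap might look like this in Python. The function name, chunk size and overlap values are all assumptions.

```python
def chunk_words(text, size=50, overlap=10):
    """Split text into chunks of `size` words, each sharing `overlap` words with the previous one."""
    if size <= 0 or not 0 <= overlap < size:
        raise ValueError("size must be positive and overlap must be smaller than size")
    words = text.split()
    step = size - overlap
    chunks = []
    for start in range(0, len(words), step):
        chunks.append(" ".join(words[start:start + size]))
        # Stop once this chunk has reached the end of the text.
        if start + size >= len(words):
            break
    return chunks


if __name__ == "__main__":
    sample = " ".join(f"word{i}" for i in range(12))
    for number, chunk in enumerate(chunk_words(sample, size=5, overlap=2), start=1):
        print(number, chunk)
```

The overlap means a sentence that straddles a chunk boundary still appears whole in at least one chunk, at the cost of embedding some words twice.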

NOTE: if you made it this far and want to see how to spike your GPU, change the code to use mistral for the embedding process: a model that, unlike the embed model, has not been trained to embed. Another rabbit hole: I found that there are all sorts of models trained for different scenarios.

Here is what things look like within our super secure vector database, that we leaked connection info and all sorts via GitHub.

Using the same Golang binary we ran we can now interact with that API and chat with the vector plus mistral model.

I wanted to be sure that we were indeed getting something from the vector when we did this so added some additional code to tell me the chunks it was using to respond.

Now our whole app looks like the above embed part but also we added Backend API to the same code base. In the GitHub repository, linked above you will see a vector-demo-ui this is the React Frontend… no shame in saying I used vibe coding for this… who likes frontend stuff anyway.

And to top things off, if you don’t want to interact with your AI chat assistant via curl then the frontend almost looks pretty…

Before we wrap things up, we should ask it something specific to the vector embeddings we have provided. First if we ask mistral directly about Day 49 of 90DaysOfDevOps we get:

Then with our RAG + LLM we get:

If you made it this far, I am impressed! We have seen a demise in blog views I think over the last few years so when I jot something down it is mostly for future me, looking for something I have done before, but hopefully this helps spur on someone else to unlock some of their data, and if useful, let me know… Also if you would like to see some content about protecting vector databases, or a deeper dive into the terrible coding I am doing with Golang let me know.

My initial thoughts on using AI to manage Kubernetes Clusters – kubectl-ai

michaelcade — Mon, 12 May 2025 08:52:15 +0000

As with most Mondays, we start with a job and task in mind but quickly as we begin catching up on news from the weekend, we find some interesting rabbit holes to investigate. This Monday morning was no different but I also do not usually have the urge to share such information.

As you all know AI is everywhere, I mean if you do not have a chatbot can you even spell AI!?

My morning started with reading up on a tool called ‘kubectl-ai’ from Google – https://github.com/GoogleCloudPlatform/kubectl-ai

I had seen others doing similar things so was intrigued when Google came out with a project. To name one that I had on my list, there is k8sgpt – https://k8sgpt.ai/

K8sGPT is for understanding and debugging what’s going wrong inside a Kubernetes cluster.

kubectl-ai is for interacting with the cluster more easily, translating your intent into commands.

The premise of these tools is the ability to use AI to manage your Kubernetes cluster and resources using natural language. For me this does a few things. A barrier to entry in learning Kubernetes is the overwhelming number of CLI options and variations; albeit a superpower in itself, it’s a challenge for many people who do not have that background. Kubernetes does have a complexity to it. It’s used in so many different fields precisely because it can do many things, and that brings complexity. My dad used to say to me “children should be seen but not heard”. I never really understood that saying, but Kubernetes is the same… should be used but not seen… by most people… Maybe that works, who knows…

By adding the ability to query your cluster and instruct tasks via this and other tools, we no longer need to memorise everything about kubectl. We can tell it to run this and do that, or to give us feedback on something.

I started off trying to use the Google AI Studio API key but initially it said the model was overloaded and then the key seemed to be wrong.

So I then tried using a local model with Ollama, but I only had my MacBook with me. You need to download the gemini pro model, which is around 8GB, and with no GPU I will wait and do this on my desktop PC… maybe a video on this setup.

You can bring many models and services, so I used my trusty OpenAI key and got to work… exporting the key and asking some initial questions.
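As a rough sketch of what "exporting the key and asking" can look like from the terminal: the `--llm-provider`, `--model` and `--quiet` flags are what I recall from the kubectl-ai README, and the `ask_cluster` helper, the `KUBECTL_AI_MODEL` variable and the model name are purely illustrative, so check `kubectl-ai --help` for your version before relying on any of it.

```shell
# Ask kubectl-ai a one-shot question using an OpenAI model.
# Assumes OPENAI_API_KEY is already exported; the default model name is only an example.
ask_cluster() {
  # Stop early with a readable message if the key has not been exported
  : "${OPENAI_API_KEY:?export OPENAI_API_KEY first}"
  kubectl-ai --llm-provider=openai --model="${KUBECTL_AI_MODEL:-gpt-4.1}" --quiet "$1"
}

# Example (not run here):
#   export OPENAI_API_KEY=your-key-here
#   ask_cluster "show me pods that are not running in the kasten-io namespace"
```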

As I am focused on and interested in the world of data management within Kubernetes, I wanted to see how we could go about creating a backup policy and what I needed to provide to make this work.

Meanwhile up to this point, we were barely touching the spending on our OpenAI key…

As the $$ are low, we can ask a few more things about our backup policies

I then thought, what about getting some insight into our cluster? What's the health of things… maybe things I have not been able to see yet. I can just ask, right, and get a simple output of the things I need to troubleshoot.
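For comparison, this is roughly the set of plain kubectl commands I would otherwise reach for when asking "how healthy is my cluster?". It is a minimal sketch, not what kubectl-ai runs under the hood, and the `cluster_health` name is just a hypothetical wrapper.

```shell
# Quick manual health check: nodes, pods that are not Running, and recent events.
cluster_health() {
  kubectl get nodes
  # Note: this also lists Succeeded (completed) pods, not only broken ones
  kubectl get pods --all-namespaces --field-selector=status.phase!=Running
  kubectl get events --all-namespaces --sort-by=.lastTimestamp
}

# Example (not run here):
#   cluster_health
```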

Very quick post to start with, but I am now intrigued by this simplification. Maybe I could release some of that RAM in my brain where I am storing all those kubectl commands and store something else.

As a beginner to Kubernetes you have the best chance to accelerate on here and get to grips with a lot more much faster… Just ask it to deploy your nginx deployment and expose it via a service… no longer do you have to worry about the YAML and kubectl commands.
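For reference, the nginx request above maps to roughly two kubectl commands. This is a minimal sketch of the manual route, assuming the public `nginx` image and a ClusterIP service on port 80; `deploy_nginx` is a hypothetical helper name.

```shell
# Create an nginx Deployment and expose it inside the cluster on port 80.
deploy_nginx() {
  kubectl create deployment nginx --image=nginx
  kubectl expose deployment nginx --port=80 --type=ClusterIP
}

# Example (not run here):
#   deploy_nginx
#   kubectl get deployment,service nginx
```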

My final thought, this is great for home labs and dev environments… Still be mindful running this on anything important… I also want to give K8sGPT a try as I can see this might do the same and some more things here.

I am sure there are many other tools popping up in this area, but as a quick comparison of the two I created this table.

Feature	K8sGPT	kubectl-ai
Purpose	Diagnoses and explains Kubernetes cluster issues	Helps write and understand kubectl commands using AI
Focus	Cluster health, error analysis, and troubleshooting	Command-line assistant for kubectl
AI Role	Uses AI to explain root causes and suggest fixes	Uses AI to translate natural language to kubectl commands
Installation	CLI tool + CRDs (optional for full diagnostics)	kubectl plugin via Krew or direct install
Integration	Can run inside clusters; supports multi-language output	Works locally in the terminal as a plugin
Common Use Case	Debugging failed pods, misconfigurations, alerts	Helping users construct or correct kubectl commands

Visualising Veeam: Kubernetes Monitoring with Grafana and ArgoCD

michaelcade — Wed, 09 Apr 2025 11:45:55 +0000

I have been concentrating a lot on my home lab this year. In previous posts I have covered the setup, but basically I have a 5 node Talos Kubernetes cluster with rook-ceph as my storage layer, and I needed some monitoring for it.

In a VM I am running Veeam Backup & Replication and I wanted to get some hands-on with Grafana, I have more plans but this was project #1

My good friend Jorge has spent years building Grafana dashboards for Veeam. You can find one of the dashboards here.

The Plan:

We are going to use our Kubernetes cluster to host our Grafana instance. Jorge has shared a script that we are going to repurpose into a cronjob that runs on a schedule, every 5 minutes. It will grab some details via the Veeam Backup & Replication API, giving us some data to visualise inside our Grafana dashboard.

Deployment: Grafana & InfluxDB

We obviously need Grafana to show our Grafana Dashboard, we will also need InfluxDB which is where the cronjob will store our API data collected from Veeam Backup & Replication. There are many ways to deploy Grafana into your Kubernetes cluster, you could use helm (Kubernetes package manager) but I am going to be using ArgoCD.

I am storing my ArgoCD application here in this GitHub Repository.

This will get you up and running with Grafana. Next you need the IP to access your Grafana instance and the secret to go with the default user ‘admin’
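A minimal sketch for finding both of those from the terminal. The `monitoring` namespace matches the rest of this post, but the secret name `grafana` and the key `admin-password` are assumptions based on the usual Grafana Helm chart defaults, and `secret_value` and `list_services` are hypothetical helpers, so adjust them to whatever your ArgoCD application actually created.

```shell
# Print the decoded value of one key from a Kubernetes Secret.
# Usage: secret_value <namespace> <secret-name> <key>
secret_value() {
  kubectl get secret -n "$1" "$2" -o "jsonpath={.data.$3}" | base64 --decode
  echo
}

# List services so you can read off the IP or port that Grafana is exposed on.
list_services() {
  kubectl get svc -n "${1:-monitoring}"
}

# Example (not run here):
#   list_services monitoring
#   secret_value monitoring grafana admin-password
```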

Head over to a browser and get logged in and the first page here you can go and find some more stuff out about Grafana

Select Dashboards. You will notice that I currently have two configured; the one we are focused on is the “Grafana Dashboard for Veeam Backup & Replication”. If you have not added this in your configuration, you can add it manually using the New button in the top right.

and if you have been able to run the cronjob you will have something resembling your Veeam environment

Step Back

Ok all the above is great but I have not really helped you get there yet.

We have used ArgoCD to hopefully deploy Grafana, and you will also see an application in there for InfluxDB, so let's hope that we have those two up and running. But we need to put some more things in place.

First we will need an influx token and we can get this with the following command.


kubectl get secret -n monitoring influxdb-influxdb2-auth -o jsonpath="{.data.admin-token}" | base64 --decode; echo

Second we need a secret to enable our cronjob to hit our Veeam Backup & Replication server. Obviously add your details there.


kubectl create secret generic veeam-influxdb-sync-secret \
--namespace monitoring \
--from-literal=veeamUsername=administrator \
--from-literal=veeamPassword= \
--from-literal=veeamInfluxDBToken=

Then in the same GitHub repository you will find a file called ‘veeam-influxdb-sync.yaml’. This is our cronjob configuration file, so we need to apply it to our cluster as well. Before we get to that, we need to change some of the environment variables within this file, as your environment might be different to mine.


          - name: veeamInfluxDBURL
            value: "http://influxdb-influxdb2.monitoring.svc.cluster.local"
          - name: veeamInfluxDBPort
            value: "80"
          - name: veeamInfluxDBBucket
            value: "veeam"
          - name: veeamInfluxDBOrg
            value: "influxdata"
          - name: veeamBackupServer
            value: "192.168.169.185"
          - name: veeamBackupPort
            value: "9419"
          - name: veeamAPIVersion
            value: "1.2-rev0"

Then deploy that into the cluster


kubectl apply -f veeam-influxdb-sync.yaml

This cronjob will run every 5 minutes but if you wanted to trigger it straight away we can use this command


kubectl create job --from=cronjob/veeam-influxdb-sync veeam-influxdb-sync-manual -n monitoring

You can then check the progress of this process using the following command


POD_NAME=$(kubectl get pods -n monitoring | grep '^veeam-influxdb-sync-manual-' | awk '{print $1}')
kubectl logs -f "$POD_NAME" -n monitoring
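An alternative that skips the pod name lookup is to point kubectl at the job itself. This is a minimal sketch assuming the job name and namespace used in this post; `watch_manual_sync` is a hypothetical helper and the 300 second timeout is an arbitrary choice.

```shell
# Follow the logs of the manually triggered job, then confirm it completed.
# Usage: watch_manual_sync [namespace] [job-name]
watch_manual_sync() {
  ns="${1:-monitoring}"
  job="${2:-veeam-influxdb-sync-manual}"
  # "job/<name>" lets kubectl pick the pod that belongs to the job
  kubectl logs -f "job/$job" -n "$ns"
  kubectl wait --for=condition=complete "job/$job" -n "$ns" --timeout=300s
}

# Example (not run here):
#   watch_manual_sync
```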

A big thank you to Jorge on this one, if it wasn’t for his hard work in this area then we would not have these dashboards! He has also created some amazing content around this and it is also not just Veeam dashboards, lots of great stuff.

Notes

In the final section of the cronjob script I have filtered the results to only show the VMware platform. If you want to change this back, remove the following query string:


?platformNameFilter=VMware

from the end of the URL on this line:


veeamVBRURL="https://$veeamBackupServer:$veeamBackupPort/api/v1/backupObjects?platformNameFilter=VMware"

I am working on an update to see if this can be resolved and catch all objects without filtering.

Iteration

If you made it this far… you must be interested! I was not happy with the above situation where I could only display my VMware or one platform when I have several within my environment. I have iterated and now you will find an updated script that loops through the different platforms providing the data to influx and then in turn to Grafana.

Here is that script
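The linked script is the real thing; the idea behind the loop can be sketched like this. The server and port are the values from the cronjob environment earlier in this post, while the platform list and the `backup_objects_url` helper are assumptions for illustration. Only `VMware` appears in the original script, so check the Veeam REST API reference for the filter values your version accepts.

```shell
veeamBackupServer="192.168.169.185"
veeamBackupPort="9419"
# Hypothetical list: extend it with the platforms present in your environment
platforms="VMware HyperV"

# Build the backupObjects URL for one platform filter value.
backup_objects_url() {
  printf 'https://%s:%s/api/v1/backupObjects?platformNameFilter=%s\n' \
    "$veeamBackupServer" "$veeamBackupPort" "$1"
}

for platform in $platforms; do
  # The real script would call the API here and write the result to InfluxDB;
  # this sketch only prints the URL it would query.
  backup_objects_url "$platform"
done
```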

And from there you can see that I have my macOS backups, Hyper-V backups and Kasten backups all now showing

HomeLab: Trials, Tribulations and Packer

michaelcade — Mon, 24 Feb 2025 14:26:24 +0000

Over the last few weeks I have been lifting, shifting and reshaping some of the home lab and within that process we needed some more templates for both Windows and Linux.

I found an amazing project GitHub Repo – vmware-samples/packer-examples-for-vsphere

And Documentation can be found here

This will let you get some Linux and Windows templates up and running quickly in your vSphere environment.

My advice from the start is do not use WSL (Windows Subsystem for Linux) but that could be my own user error.

I am using an Ubuntu server in my home lab to perform these tasks, and I hit a snag, not with the configuration but with some of the dependencies you need to run.

Fixing the “No Module Named ‘winrm’” Error in Packer + Ansible for Windows VM Provisioning

When using Packer with Ansible to provision Windows virtual machines on vSphere, I recently encountered the following error during the Ansible playbook execution:


fatal: [default]: FAILED! => {"msg": "winrm or requests is not installed: No module named 'winrm'"}

This stopped my automated build in its tracks. After some debugging, I found that Ansible’s execution environment was missing the pywinrm module, which is required for managing Windows hosts via WinRM. Here’s how I diagnosed and fixed the issue.

Understanding the Problem

Ansible relies on the pywinrm Python module to communicate with Windows hosts using the WinRM (Windows Remote Management) protocol. If this module isn’t installed in the correct environment, Ansible cannot establish a connection, resulting in the “No module named ‘winrm’” error.

Even though pywinrm might be installed in the system’s Python, Packer’s execution context (often running inside a virtual environment) might not have access to it.

Step-by-Step Solution

Check if pywinrm is Installed in the Correct Python Environment
Since Ansible was running inside a pipx-managed virtual environment, I first verified whether the winrm module was available:


/home/veeam/.local/pipx/venvs/ansible/bin/python -c "import winrm; print(winrm)"

This returned:


ModuleNotFoundError: No module named 'winrm'

That confirmed the issue—pywinrm was missing from the environment Ansible was using.

Install pywinrm in the Correct Environment
Since Ansible was installed via pipx, I needed to install pywinrm inside the same environment rather than globally.

Option 1: Using pipx to Inject pywinrm


pipx inject ansible pywinrm

This ensures that pywinrm is available within Ansible’s execution context.

Option 2: Installing Directly in the Virtual Environment
If you prefer, you can manually install pywinrm inside the virtual environment:


/home/veeam/.local/pipx/venvs/ansible/bin/pip install pywinrm

If pip is missing, install it first:


apt install python3-pip -y
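One caveat: the system-wide python3-pip package does not necessarily put a pip executable inside the pipx venv. A minimal sketch of a third route is `pipx runpip`, which runs pip inside a named pipx venv. The venv name `ansible` matches the paths above, and `install_into_ansible_venv` is a hypothetical helper.

```shell
# Install a package into the pipx-managed ansible venv, then show what was installed.
# Usage: install_into_ansible_venv <package>
install_into_ansible_venv() {
  pipx runpip ansible install "$1"
  pipx runpip ansible show "$1"
}

# Example (not run here):
#   install_into_ansible_venv pywinrm
```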

Verify the Fix
To confirm that pywinrm is now correctly installed, run:


/home/veeam/.local/pipx/venvs/ansible/bin/python -c "import winrm; print(winrm)"

If no errors appear, the installation was successful!

Re-run Packer and Ansible
With pywinrm installed, I restarted the Packer build:


packer build -var-file=variables.pkrvars.hcl windows-server.pkr.hcl

This time, Ansible successfully connected to the Windows VM over WinRM, and provisioning completed without issues.

Final Thoughts

This issue highlighted an important lesson about managing dependencies within virtual environments. When working with Packer and Ansible, always ensure that required Python modules are installed inside the environment that Ansible is running in.

By using pipx inject, I was able to keep my environment clean while ensuring Ansible had access to the necessary modules. If you run into similar issues, check:

Where Ansible is installed
Which Python environment it’s using
That required modules like pywinrm are installed in the same environment
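That checklist can be sketched as a couple of small shell helpers. This assumes the `ansible` launcher on your PATH is a script whose first line is a shebang pointing at the venv's Python, which is how pipx installs usually look, so treat the result as a hint rather than proof. Both function names are hypothetical.

```shell
# Print the interpreter named in the shebang of the ansible launcher.
ansible_python() {
  head -n 1 "$(command -v ansible)" | sed 's/^#!//'
}

# Succeed if the given interpreter can import the given module.
# Usage: has_module <python-path> <module>
has_module() {
  "$1" -c "import $2" 2>/dev/null
}

# Example (not run here):
#   py=$(ansible_python)
#   has_module "$py" winrm && echo "pywinrm is available" || echo "pywinrm is missing"
```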

Hope this helps anyone facing the same issue!

Veeam Kasten: ARM Support (Raspberry PI – How to)

michaelcade — Sun, 22 Dec 2024 22:16:29 +0000

This has been on my product bucket list for a while, in fact this initial feature request went in on the 9th September 2021. My reasons then were not sales orientated, I was seeing the Kubernetes community using the trusty Raspberry PIs as part of a Kubernetes cluster at home.

In my eyes, supporting this architecture would have opened the door for home users, technologists and the community to have a trusted way to protect their learning environments at home.

Here we are 3 years on and we got the support.

I have a single node k3s cluster running on a single Raspberry Pi 4 with 4 GB of memory, and we had to make some changes to get things up and running.

I chose K3s due to the lightweight approach and I was limited by only having this one box for now, the others are elsewhere in the house serving as print servers and other useful stuff.

I actually also started with minikube on the pi with some nightly builds as this is a very fast way to rinse and repeat things but the resources consumed were too much.

As Veeam Kasten for Kubernetes is focused on protecting, moving and restoring your Kubernetes applications and data, I also need a layer of storage to play with. The CSI hostpath driver is quite easy to deploy and mimics any other CSI driver in a single node cluster. With this in place we also created a StorageClass and VolumeSnapshotClass.

I am not going to repeat the steps as they can be found here.

Deploying Veeam Kasten

With the above Kubernetes storage foundations in place we can now get Kasten deployed and working on our single node cluster.

We will start this process with a script that runs a primer on your cluster to check that you have met the requirements, that storageclasses are present and that a CSI provisioner exists. Run the following command on your system. (This is the same process for any deployment of Kasten; air gap methods can also be found in the documentation.)


curl https://docs.kasten.io/tools/k10_primer.sh | bash

At this point you should have helm and everything else pre installed and available for use here.

As of today, the process to get things installed is the same as for any x86 or IBM Power based cluster deployment of Kasten, and can be as simple as the command below, although you will likely want to check the documentation.


helm install k10 kasten/k10 --namespace=kasten-io --create-namespace
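The install command assumes the Kasten chart repository is already known to helm. A minimal sketch of the full sequence, assuming the public chart repository at https://charts.kasten.io/ and the default values, with `install_kasten` as a hypothetical wrapper:

```shell
# Add the Kasten chart repo, install K10, then list the pods to watch them come up.
install_kasten() {
  helm repo add kasten https://charts.kasten.io/ &&
  helm repo update &&
  helm install k10 kasten/k10 --namespace=kasten-io --create-namespace &&
  kubectl get pods --namespace kasten-io
}

# Example (not run here):
#   install_kasten
```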

In an ideal world you will have all pods come up and be running and this might be the case on your cluster or your single node depending on resources. Within my cluster I have also deployed the bitnami Postgres chart as well so resources were low. But in an ideal world you have this.

I did not… so I had to make some modifications… I am going to state here that this is not supported, but then I don’t think Raspberry Pi deployments on a single node are something we will have to deal with either. I also believe, though, that resources are going to play a crucial part in things later on when we come to protecting some data.

My gateway pod did not have enough memory available to get up and running, so I simply modified the deployment and reduced its resources to get to the above state.

Backing up

In the below demo, I have created a simple policy considerate of local storage space and only keeping a couple of snapshots for test and demo purposes.

My Deployment modification


    resources:
      limits:
        cpu: "1"
        memory: 100Mi
      requests:
        cpu: 200m
        memory: 100Mi

By default the gateway deployment is


    resources:
      limits:
        cpu: "1"
        memory: 1Gi
      requests:
        cpu: 200m
        memory: 300Mi
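One way to apply the reduced values without hand-editing the deployment is `kubectl set resources`. This is a minimal sketch assuming the deployment is called `gateway` in the `kasten-io` namespace; `shrink_gateway` is a hypothetical helper, and a later helm upgrade will most likely put the chart defaults back.

```shell
# Lower the gateway memory limit and request to 100Mi, keeping the CPU values as they were.
shrink_gateway() {
  kubectl set resources deployment gateway --namespace kasten-io \
    --limits=cpu=1,memory=100Mi --requests=cpu=200m,memory=100Mi
  # Wait for the new pod to roll out with the reduced resources
  kubectl rollout status deployment/gateway --namespace kasten-io
}

# Example (not run here):
#   shrink_gateway
```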