I Built a 5-Node Proxmox and Ceph Home Lab with 17TB and Dual 10Gb LACP

Proxmox and Ceph home lab

Over the last few weeks I have been deep into a Proxmox home lab project that is reshaping my home lab as I know it, as I fully remove VMware from running any of my production services. For this, I have chosen an approach that is very different from my history of full-sized racks with VMware running everything: a 5-node Proxmox cluster on top of Minisforum MS-01s, which fit perfectly in my TecMojo 12U mini rack. It has also been quite a deep dive for me, with more serious learning around Ceph, Proxmox networking, and NVMe OSDs. So sit back and enjoy this post, as it is meant to be a walkthrough of the latest rendition of the mini rack, which has evolved quite a bit even from what it was just a few weeks back.

What I have been up to with the home lab

My home lab in 2026 has intentionally become smaller, not necessarily in compute or storage power but in terms of actual footprint. I am going away from my extensive full-size rack gear and collapsing things down to a mini rack this go around.

My new configuration with the mini cluster

This is due to a number of factors, including industry trends, physical space, noise, and power draw. It has also coincided with my move to get totally away from VMware vSphere for my production workloads. Proxmox has become my de facto choice for this migration for many reasons.

First of all, and refreshingly so compared to the VMware and Broadcom shenanigans, it is free and open-source. I no longer have to pay for a VMUG subscription, which no longer gets you licensing anyway. I also wanted to give Ceph a real go in the home lab.

Honestly, I really liked vSAN once I started using it in the VMware-based lab a few years ago, and I have been itching to get back to HCI for home lab workloads. But Broadcom killed vSAN, in my opinion, with licensing, and I saw many businesses walk away from vSAN back to traditional storage simply due to the licensing changes. Ceph gives us a real way forward for HCI that is powerful and very enterprise-ready. Ceph can scale as large as you want it to and is natively integrated with Proxmox.

Why I rebuilt my cluster using uniform nodes

I had played around with Ceph before and even use MicroCeph with my Docker Swarm cluster, but I always had the itch to try Ceph erasure coding. To do erasure coding, I went with a 5-node Ceph cluster. However, if you read my previous post here, you know I started out with dissimilar hardware and other variables. While it worked shockingly well, I was mainly limited by 2.5 GbE networking. I wanted to get back to 10 GbE, which is where HCI in general and Ceph in particular really start performing.

To do that, I decided to replace my original 5-node Ceph cluster of dissimilar hardware with a 5-node uniform cluster made up of Minisforum MS-01s. These are still great little machines and have native 2.5 and 10 gig networking built in, with (2) SFP+ connections. I already had (2) MS-01s that I had previously been using to run my VMware ESXi cluster, connected to an all-flash NVMe Terramaster NAS.

Read my review of the MS-01 here: Minisforum MS-01 Review: Best Home Server Mini PC Early 2024.

Network ports on the Minisforum MS-01 for my Proxmox Ceph home lab

What I decided to do was buy (3) more MS-01 barebones units and use the memory and storage from my other mini PCs that would be replaced. This resulted in a uniform 5 node Ceph cluster running 10 GbE with erasure coding. Awesome!

Minisforum MS-01 exploded view of drives and internals

So the reasons for using uniform nodes come down to the following:

  • 10 gig networking for ALL the nodes in the Ceph cluster
  • Uniform performance from all nodes
  • No inconsistencies for HCI that would require troubleshooting and chasing different hardware

Ceph is extremely forgiving, but HCI in general does MUCH better on uniform hardware with the fewest variables. Same machines, same networking, same storage, etc.

Storage configuration of the MS-01s with Ceph

Currently, I am running the following disks in the (3) NVMe slots of each MS-01. The M.2 slot closest to the U.2 switch is PCIe 4.0 x4 and is the fastest slot in the MS-01, so each of the 5 nodes has a 2TB Samsung 980 NVMe drive in this slot as the first OSD for Ceph. The next slot is PCIe 3.0 x4, which I have also used for Ceph; two of the nodes have a 2TB drive here and three of the nodes have a 1TB drive. Then, in the slot furthest from the U.2 switch, I have the boot M.2 drive. This is the slowest slot in the MS-01, so I reserved it for the Proxmox install itself. These are 1TB PCIe 3.0 NVMe drives in each of my MS-01s.

Viewing the drive configuration in one of my MS-01s
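For reference, this is roughly what adding one of these NVMe drives as a Ceph OSD looks like from the Proxmox CLI. The device names below are examples only, since nvme0/nvme1 numbering can vary between nodes, so treat this as a sketch rather than my exact commands:

# Identify the data drives first - device names are examples only
lsblk -d -o NAME,MODEL,SIZE

# Create an OSD on each data NVMe drive (run on every node)
pveceph osd create /dev/nvme0n1
pveceph osd create /dev/nvme1n1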

Careful migration to get to this

You may be wondering how I did the migration from all dissimilar mini PCs to the 5 MS-01s. This happened very carefully and one mini PC at a time. Believe it or not, I was able to simply pull the drives out of the old mini PCs, install them into the new MS-01s, and Proxmox just picked right up and figured itself out on the new hardware.

The only thing I had to do was reconfigure the networking on each MS-01's Proxmox installation once things booted, since the NIC names changed with the different hardware.

This was a quick edit of the network config file in Proxmox:

/etc/network/interfaces

Here I just simply changed the network interfaces for the default Linux bridge to the names of the new adapters. Below is an example of one of my PVE host’s network configuration:

auto lo
iface lo inet loopback

iface enp2s0f0np0 inet manual
        mtu 9000

iface enp2s0f1np1 inet manual
        mtu 9000

iface wlo1 inet manual

auto bond0
iface bond0 inet manual
        bond-slaves enp2s0f0np0 enp2s0f1np1
        bond-miimon 100
        bond-mode 802.3ad
        bond-xmit-hash-policy layer2+3
        mtu 9000

auto vmbr0
iface vmbr0 inet static
        address 10.3.33.210/24
        gateway 10.3.33.1
        bridge-ports bond0
        bridge-stp off
        bridge-fd 0
        bridge-vlan-aware yes
        bridge-vids 2-4094
        mtu 9000

# VLAN 2 - Dedicated VM bridge (no VM tagging needed)
auto bond0.2
iface bond0.2 inet manual
        vlan-raw-device bond0
        mtu 9000

auto vmbr2
iface vmbr2 inet manual
        bridge-ports bond0.2
        bridge-stp off
        bridge-fd 0
        mtu 9000

# VLAN 10 - Dedicated VM bridge (no VM tagging needed)
auto bond0.10
iface bond0.10 inet manual
        vlan-raw-device bond0
        mtu 9000

auto vmbr10
iface vmbr10 inet manual
        bridge-ports bond0.10
        bridge-stp off
        bridge-fd 0
        mtu 9000

# VLAN 149 - Dedicated VM bridge (no VM tagging needed)
auto bond0.149
iface bond0.149 inet manual
        vlan-raw-device bond0
        mtu 9000

auto vmbr149
iface vmbr149 inet manual
        bridge-ports bond0.149
        bridge-stp off
        bridge-fd 0
        mtu 9000

# VLAN 222 - Dedicated VM bridge (no VM tagging needed)
auto bond0.222
iface bond0.222 inet manual
        vlan-raw-device bond0
        mtu 9000

auto vmbr222
iface vmbr222 inet manual
        bridge-ports bond0.222
        bridge-stp off
        bridge-fd 0
        mtu 9000

auto vmbr0.334
iface vmbr0.334 inet static
        address 10.3.34.210/24
        mtu 9000
#Ceph

auto vmbr0.335
iface vmbr0.335 inet static
        address 10.3.35.210/24
        mtu 9000
#Cluster

auto vmbr0.336
iface vmbr0.336 inet static
        address 10.3.36.210/24
        mtu 9000
#Migration

source /etc/network/interfaces.d/*
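After editing the file, the config can be applied and the bond verified without a reboot. A quick sanity-check sketch using ifupdown2 and the kernel bonding stats:

# Apply the updated network config (Proxmox ships ifupdown2)
ifreload -a

# Confirm the LACP bond negotiated and jumbo frames are active
grep -E 'Bonding Mode|MII Status|Aggregator ID' /proc/net/bonding/bond0
ip link show bond0 | grep -o 'mtu [0-9]*'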

One of my mini PCs required more work, and I documented in another blog post my journey getting from a SATA SSD over to an M.2 drive. I used Hiren's Boot CD and Macrium Reflect to image the SATA SSD over to the M.2 drive, and this worked flawlessly with zero issues.

The Ceph lessons that took the most time to understand

Ceph was one of the areas where the real learning and experience of “doing” really paid off for me. Ceph is incredibly forgiving, to be honest. It will absolutely keep running when you think it may not, and it will try its best to get your data back in shape once things come back up.

One of the things I learned with my setup was not only provisioning Ceph with erasure coding, but also setting up the placement group autoscaler. I configured the pools so that placement groups grow on their own as more storage is added, as shown in the sketch below.
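Enabling the autoscaler is a per-pool setting. A minimal sketch; 'ec-pool' is a placeholder name, not necessarily my actual pool:

# Let Ceph adjust placement groups automatically as capacity grows
ceph osd pool set ec-pool pg_autoscale_mode on

# Review current vs. target PG counts per pool
ceph osd pool autoscale-status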

However, I also shot myself in the foot on one of my nodes and actually wiped and rebuilt the wrong drive! DOH! Looking at the output of lsblk, I misread the NVMe drive and zapped the wrong NVMe identifier. But all was not lost. To be safe, I removed the OSD I had actually wiped from the Ceph config, brought it back in as a new OSD, and let Ceph do what it does: copy and move the data back. Do yourself a favor and double/triple check that you have the right drive before you ever wipe anything.
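One habit I have adopted since then is matching the Linux device name to a physical drive by model and serial number, and confirming which device backs which OSD, before zapping anything. A quick sketch of the checks:

# Match nvmeX device names to physical drives by model and serial
lsblk -d -o NAME,MODEL,SERIAL,SIZE

# Stable by-id symlinks are safer to reference than nvme0/nvme1 ordering
ls -l /dev/disk/by-id/ | grep nvme

# Confirm which device actually backs which OSD
ceph-volume lvm list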

My erasure coding setup

So here are the details of my erasure coding Ceph configuration.

Viewing the details of the erasure coding profile

Erasure coding scheme: 3+2, and what this means:

  • k=3 – Data is split into 3 data chunks
  • m=2 – There are 2 parity/coding chunks
  • Total: 5 chunks stored across your cluster

Storage Efficiency:

  • My cluster can tolerate 2 simultaneous failures (any 2 hosts/OSDs)
  • Storage efficiency: 60% usable (5 chunks stored for every 3 chunks of data, a 1.67x overhead)
  • This is much better than 3x replication (33% efficiency)

What this configuration means in real-world terms (a CLI sketch follows this list):

  • With 5 total chunks, you need at least 3 chunks available to read/write data
  • Can lose any 2 hosts and still access your data
  • Uses roughly 44% less raw storage than triple replication (1.67x overhead vs. 3x)
  • Great balance of efficiency and reliability for this setup
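If you want to replicate something similar, here is a minimal sketch of creating a 3+2 profile and an erasure-coded pool from the Ceph CLI. The profile and pool names are placeholders rather than my exact configuration:

# 3 data chunks + 2 coding chunks, spread across hosts
ceph osd erasure-code-profile set ec-3-2 k=3 m=2 crush-failure-domain=host

# Create an EC data pool and allow overwrites so RBD/CephFS can use it
ceph osd pool create ec-pool erasure ec-3-2
ceph osd pool set ec-pool allow_ec_overwrites true

# Verify the profile settings
ceph osd erasure-code-profile get ec-3-2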

Networking took the most time in the cluster to get right

The main reason that I went with the MS-01s in the Ceph cluster was for 10 gig networking. Ceph really opens up in terms of performance when you have 10 gig in place. So, I had a plan of action in my cluster to get up to 10 gig and an order of operations.

  1. Replace the dissimilar mini PCs with MS-01s to get the (2) SFP+ ports in place
  2. Buy a 10 gig switch to serve as the mini rack's top-of-rack switch (you can read my review on that here)
  3. Migrate from the 2.5 gig ports currently in use on the MS-01s up to the 10 gig SFP+ ports
  4. Get a single 10 gig connection up and running on the MS-01s
  5. Get both SFP+ ports up and running on the cluster with LACP in place

So, I wanted to take a staged approach to introducing 10 gig connectivity in the cluster. Bonding the two ports with LACP seemed to be the obvious choice to make the best use of both “lanes” of 10 gig connectivity in a meaningful way. This was the plan, but I ran into some “pain” along the way, mostly due to some mistakes I made in implementing the config.

Jumbo frame issues

The first issue I had was self-inflicted: jumbo frame problems. I missed the MTU setting in the Proxmox network config on a couple of my MS-01s and spent a few hours troubleshooting some really weird behavior because of it.

The dangerous thing with jumbo frames is that they don't always give you an obvious issue you can easily pinpoint. Instead, you may see some services work just fine with a mismatched jumbo frame configuration, while other apps and services have all kinds of issues. This is exactly what I saw.

Because:

  • Small control packets work
  • ARP works
  • ICMP 1500 works
  • SSH works
  • Web UI works

But:

  • Storage I/O stalls
  • Ceph flaps
  • VM migration fails
  • SMB metadata errors
  • Only large bursts fail

It is easy to forget, but you actually need to make sure you have jumbo frames enabled in the entire chain of communication:

VM → virtio → bridge → VLAN → NIC → Switch → NIC → bridge → VM. Even if one hop is 1500, large frames get dropped somewhere in that chain.
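A simple way to test the whole path is a do-not-fragment ping sized for a 9000-byte frame (8972 bytes of ICMP payload plus 28 bytes of IP/ICMP headers). The peer address below is just an example from my Ceph network:

# Fails if any hop in the path is still at MTU 1500
ping -M do -s 8972 -c 3 10.3.33.211

# Double-check the MTU actually applied on the bond and bridge
ip link show bond0 | grep -o 'mtu [0-9]*'
ip link show vmbr0 | grep -o 'mtu [0-9]*'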

Linux bridge, VLAN tagging, and Firewall flag issues when used in combination

I learned something very interesting about Proxmox and Linux bridges in general after I started enabling LACP across my cluster hosts. After enabling LACP on the first host, everything was working fine. But after enabling it on the second host, I started to see very weird issues in my cluster. I couldn't ping certain hosts on the network, and certain hosts, even on other segments, started going down. Very strange.

When I first looked at this, it looked like “LACP was being flaky,” or I was having issues with my new switch. But, I discovered the issue was not LACP itself, but how Proxmox handles VLAN tagging when the VM firewall is enabled.

When you configure a VM NIC with:

  • bridge=vmbr0
  • tag=<VLAN_ID>
  • and firewall=1

Proxmox dynamically creates additional bridge and VLAN subinterfaces behind the scenes. These are not defined in /etc/network/interfaces. They are generated at runtime to allow per-VM firewall filtering on tagged traffic.

For example, if a VM is configured with tag=149 and firewall enabled, Proxmox can create interfaces such as:

  • vmbr0v149
  • bond0.149
  • nicX.149
  • and related fwbr, fwpr, and tap devices

These interfaces can form separate VLAN processing paths specifically for firewall filtering of your VMs. Under a single physical NIC, this often works without obvious symptoms. However, once I configured an 802.3ad LACP bond, the problems started. The dynamically created VLAN subinterfaces could exist on the bond and possibly on the bond slaves, which can result in multiple VLAN plumbing paths.

I don’t think this created a classic spanning tree loop, but it did create parallel forwarding paths that were not part of my intended traffic flow. So, it led to inconsistent pings, etc.
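If you want to see this plumbing for yourself, the runtime interfaces are easy to list. A sketch, assuming a VM with tag=149 and firewall=1 (the VMID below is an example):

# Show the VM NIC definition that triggers the extra plumbing
qm config 101 | grep ^net

# List the dynamically created firewall bridges, ports, and VLAN devices
ip -br link | grep -E 'fwbr|fwpr|fwln|tap|v149'

# See which ports are enslaved to which bridge
bridge link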

Ceph and CephFS

In this cluster, I am running both Ceph block storage and CephFS. CephFS is a file system layered on top of Ceph, so you get the resiliency of Ceph underneath normal file storage that you can use for all kinds of things.

One of the cool things I do with CephFS is run shared storage for my Kubernetes cluster. With CephFS, you get easy-to-work-with storage that is shared across your nodes, so you can use any tool you like to connect to a Proxmox host and copy or migrate data you want to live in CephFS.
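If you have not set up CephFS in Proxmox before, it only takes a couple of commands once Ceph itself is healthy. A minimal sketch, run from any node:

# CephFS needs at least one metadata server
pveceph mds create

# Create the file system and register it as Proxmox storage
pveceph fs create --name cephfs --add-storage

# Confirm the MDS is active and the file system is healthy
ceph fs status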

Running Talos Kubernetes on Top of CephFS

Once I got my networking sorted out with the bridge, VM VLAN tagging, and firewall flag configuration, layering my Talos Kubernetes setup on top of it was pretty simple. In this configuration, I run Talos Linux as virtual machines stored directly on my Ceph RBD storage, and I have configured a storage provider so that CephFS serves as the shared persistent storage for my Kubernetes cluster.
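I will not reproduce my exact storage provider config here, but a common pattern is the ceph-csi CephFS driver exposed as a Kubernetes StorageClass. The sketch below follows the upstream ceph-csi example; the clusterID, secret names, and namespace are placeholders and assume the driver is already deployed:

kubectl apply -f - <<'EOF'
apiVersion: storage.k8s.io/v1
kind: StorageClass
metadata:
  name: cephfs-shared
provisioner: cephfs.csi.ceph.com
parameters:
  # Ceph cluster fsid - placeholder, fill in from "ceph fsid"
  clusterID: <ceph-cluster-fsid>
  fsName: cephfs
  csi.storage.k8s.io/provisioner-secret-name: csi-cephfs-secret
  csi.storage.k8s.io/provisioner-secret-namespace: ceph-csi
  csi.storage.k8s.io/node-stage-secret-name: csi-cephfs-secret
  csi.storage.k8s.io/node-stage-secret-namespace: ceph-csi
reclaimPolicy: Delete
EOF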

Some people have asked me, and I have seen some criticism on my posts, about why I would choose to run Talos in Proxmox instead of on bare metal, especially with Sidero Omni to manage Talos. Well, I can tell you this setup works beautifully and, I think, gives you a lot of advantages. First of all, when you run Talos on top of Proxmox, it abstracts the Talos nodes from the underlying physical hardware.

Talos Omni running on my Proxmox and Ceph home lab

This means if you want to take down a physical node for maintenance, you can still have ALL of your Talos Kubernetes nodes up and running. Also, it allows me to use the cluster and the hardware for more than just Kubernetes. I can use Proxmox to run standalone Docker nodes, Windows nodes, or really anything else I want to do with the resources. And, for me, options are king in the home lab.

All in all, I think this is a great combination of hypervisor, Kubernetes management, and immutable Kubernetes nodes with Talos that make this a killer setup.

Why five nodes was the right number

Could I have gotten away with three nodes? I could. That would have been the minimum. Four nodes would have added some margin as well. However, five nodes gives the cluster breathing room. With 5 nodes, I can lose two and still have quorum in the Proxmox cluster, and Ceph has more flexibility in data placement.

Viewing the Proxmox cluster nodes with pvecm

Also, maintenance windows are less stressful because taking a node offline doesn’t push the cluster near its limit or failure tolerance threshold.
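During those maintenance windows, a couple of quick checks confirm the cluster is still quorate and that Ceph is recovering as expected:

# Corosync quorum state for the Proxmox cluster
pvecm status | grep -E 'Quorate|Expected votes|Total votes'

# Overall Ceph health, degraded PGs, and recovery/backfill progress
ceph status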

Seventeen terabytes in a mini rack

While 17 terabytes might not sound massive compared to enterprise storage arrays, for me it is pretty awesome to have an all-NVMe distributed storage solution across five physical nodes, with the cluster supporting VM migrations and Kubernetes workloads and tolerating node failures like a boss.

Viewing Ceph storage in Pulse
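The same capacity picture is available from the CLI if you are not using a dashboard:

# Raw vs. usable capacity and per-pool usage
ceph df

# Per-OSD utilization, grouped by host
ceph osd df tree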

And what is even cooler is that all of this lives inside a compact mini rack footprint powered by small mini PCs. A setup like this no longer requires enterprise server gear with all the power draw that comes with it, and it fits into a MUCH smaller footprint and is much quieter.

Wrapping up

I am really excited to see what I can accomplish with my new mini rack Proxmox-powered home lab that is running Ceph and CephFS, along with Talos Kubernetes. It can run basically anything I want to run at this point and I am really looking forward to the workloads and experimentation I can do in the lab in 2026. What do you guys think? Would you change anything that I have done here? I would like your honest feedback.

About The Author

Brandon Lee

Brandon Lee is the Senior Writer, Engineer and owner at Virtualizationhowto.com, and a 7-time VMware vExpert, with over two decades of experience in Information Technology. Having worked for numerous Fortune 500 companies as well as in various industries, he has extensive experience in various IT segments and is a strong advocate for open source technologies. Brandon holds many industry certifications, loves the outdoors and spending time with family. Also, he goes through the effort of testing and troubleshooting issues, so you don't have to.
