Jumbo Frames Broke My Proxmox Cluster and Here’s Why

If you have ever set up a production virtualization environment, whether VMware, Hyper-V, Proxmox, or something else, jumbo frames always come up when the conversation turns to networking and storage efficiency. On paper, a jumbo frame MTU of 9000 lets high-throughput storage traffic move with fewer packets and less CPU spent processing them. With my new Proxmox mini rack running Ceph erasure coding, jumbo frames seemed to be a given. They also make sense for VM migrations and other large traffic loads. However, after I enabled jumbo frames in my cluster, things started breaking in ways that were subtle, frustrating, and at times completely misleading. If you are running Proxmox, read on, as this is what you need to know before you enable jumbo frames.

Why I enabled jumbo frames

When thinking about how to best design my recent mini cluster build with Proxmox, 10 gig networking was a given. I decided to retrofit the Proxmox cluster with (5) Minisforum MS-01s for compute, mainly because these little mini PCs have a wealth of networking packed into a small package. They have (2) 2.5 GbE connections and (2) 10 gig SFP+ connections.

In my opinion, 10 gig networking is where virtualization environments really start to perform and shine in the way that you expect. Anything under this, and you are going to see things that are sluggish from time to time and maybe even terrible performance (in the case of 1 GbE for HCI).

For my configuration, jumbo frames seemed like a “shoo-in” for the config that I was running:

  • Five Minisforum MS-01 nodes
  • Dual 10Gb SFP+ per node
  • LACP configured
  • Dedicated VLANs for Ceph public, cluster, and migrations
  • NVMe OSDs pushing real throughput
Jumbo frames configuration

In case you are wondering why jumbo frames are desirable, they are supposed to reduce CPU overhead and increase efficiency for the large payload transfers that are common with storage traffic and VM migration networks. Ceph traffic definitely fits this description. The Ceph OSDs are constantly replicating data between nodes and reading and writing blocks.
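To put a rough number on that overhead, here is a back-of-the-envelope sketch in shell. It assumes a fully loaded 10 Gb/s link and untagged Ethernet, where every frame costs an extra 38 bytes on the wire. The function name frames_per_second is just something I made up for illustration.

```shell
# frames_per_second LINK_BPS MTU
# Upper bound on full-size frames per second for a link speed in bits/s.
# Assumption: untagged Ethernet, so each frame occupies MTU + 38 bytes
# on the wire (14 header + 4 FCS + 8 preamble + 12 inter-frame gap).
frames_per_second() {
    echo $(( $1 / (($2 + 38) * 8) ))
}

frames_per_second 10000000000 1500   # prints 812743
frames_per_second 10000000000 9000   # prints 138304
```

So a saturated 10 gig link means roughly 813,000 frames per second at MTU 1500 versus about 138,000 at MTU 9000, or about one sixth as many frames for the host to process.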

Reducing packet processing overhead is a win when you are talking about performance. When it comes to jumbo frames and my Proxmox environment I had all of the following requirements met:

  • The switch supported MTU 9000
  • The NICs supported MTU 9000
  • Debian Linux underneath Proxmox supports MTU 9000
  • Proxmox bridges support jumbo frames

So, I had everything configured for jumbo frames, or so I thought.

What happened?

The first signs were not dramatic, and there were no complete “outages” per se, but some really weird issues started to crop up. There were no red lights and no Ceph health errors. The first thing that went was my Docker Swarm cluster running MicroCeph, which started having storage issues. Swarm nodes couldn’t connect to each other, or could only sporadically.

My Swarm cluster basically bit the dust with services going offline and back online, then back off again. I thought it was related to a specific node, but looking back I think it had to do with which specific host this node was on.

Another anomaly that cropped up involved one of my other Swarm hosts, which Veeam could no longer connect to. Why? I wondered:

Veeam error with backing up my swarm host

Outside of the errors mentioned above, I started noticing:

  • VM migrations occasionally hanging
  • Ceph latency spikes that did not make sense
  • SMB transfers stalling mid-transfer
  • Random timeouts between nodes
  • Inconsistent iperf results

The worst kind of problem is the one where things mostly work. Proxmox showed green. Ceph showed active and clean. The cluster had quorum. But performance was inconsistent and unpredictable. When I ran certain tests, I could push high throughput. Other times, the same test would stall or drop. This is the behavior that tipped me off that something wasn’t right or consistent with jumbo frames.

The real problem with jumbo frames

Jumbo frames are not an on or off setting. They are an end-to-end commitment. The problem often comes in, as it did with my configuration, when you inadvertently either fail to enable them somewhere they are needed or enable them on Proxmox bridges where you didn’t mean to.

What do I mean by that? Well, typically, you only want to enable jumbo frames on “service type” network traffic, not on the clients or servers you are running within Proxmox. This can cause issues because each endpoint only knows its own MTU, not the smallest MTU along the path between them. Path MTU Discovery does exist, but it depends on ICMP “fragmentation needed” messages making it back to the sender, and those are often filtered or never sent, so in practice you can’t count on it to save you. By default, the virtual NICs on Proxmox virtual machines are set to inherit the MTU value of the parent bridge device. This means that if a VM is powered up on a Linux bridge that is set for jumbo frames, your VM will inherit this setting as well.

Mtu value is same as bridge for a proxmox vm unless explicitly set

So where this can cause you grief is if you have servers on different networks, for instance, that need to be routed to communicate. In my case, my Veeam server is on a management VLAN, and the servers are on a different VLAN. Even though both bridges in Proxmox were set to jumbo frames, the router/firewall in between was not. So the servers on each side thought they were perfectly fine communicating with jumbo frames, but the router could not pass them, which led to fragmentation, retransmits, and so on.
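One way to see what a routed path will really carry is tracepath from the iputils package, which probes the path MTU hop by hop. The sketch below is only an illustration: the function names are my own, and it assumes tracepath is installed and that its last output line has the usual "Resume: pmtu N ..." form.

```shell
# extract_pmtu: read tracepath output on stdin and print the final path MTU.
extract_pmtu() {
    awk '/Resume:/ { for (i = 1; i < NF; i++) if ($i == "pmtu") { print $(i + 1); exit } }'
}

# path_mtu TARGET: print the path MTU that tracepath discovers toward TARGET.
path_mtu() {
    tracepath -n "$1" | extract_pmtu
}

# Example (run from a VM toward a server on another VLAN):
#   path_mtu <target-ip>
```

If a VM with a 9000 MTU gets 1500 back for a destination on another VLAN, something in between, such as the router or firewall, is not passing jumbo frames. If ICMP is filtered along the way, tracepath may never print a Resume line, which is a finding in itself.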

Jumbo frames and the network path in your proxmox cluster

To reiterate, every device along the network path must support the same MTU.

That means:

  • Proxmox bridge interfaces
  • Bonded interfaces
  • VLAN subinterfaces
  • Physical NICs
  • Switch ports
  • LACP bonds on the switch
  • Any intermediate switches
  • Any firewalls in the path

If you miss one, you create fragmentation or silent packet drops. I didn’t intentionally set jumbo frames on my server VLANs, but in the haste of getting things configured, I inadvertently did. Double and triple check your jumbo frames configuration.
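For the host side of that checklist, a small script can do the double-checking for you. This is a minimal sketch, not a complete audit: check_mtu is a hypothetical helper name, it reads the MTU the kernel exposes under /sys/class/net, and it cannot see switch ports or firewalls, which still have to be checked on those devices.

```shell
# Root of the kernel's per-interface sysfs tree (overridable for testing).
SYSFS_NET="${SYSFS_NET:-/sys/class/net}"

# check_mtu EXPECTED IFACE...
# Print one line per interface; return non-zero if any MTU differs.
check_mtu() {
    expected="$1"; shift
    rc=0
    for iface in "$@"; do
        actual="$(cat "$SYSFS_NET/$iface/mtu" 2>/dev/null || echo missing)"
        if [ "$actual" = "$expected" ]; then
            echo "OK       $iface mtu=$actual"
        else
            echo "MISMATCH $iface mtu=$actual (expected $expected)"
            rc=1
        fi
    done
    return "$rc"
}

# Example, using the interface names from my hosts:
#   check_mtu 9000 enp2s0f0np0 enp2s0f1np1 bond0 bond0.334 bond0.336
```

Running it a second time with 1500 and the list of VM bridges catches the opposite mistake, where jumbo frames ended up somewhere they were never meant to be.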

Don’t enable jumbo frames on virtual machine networks

I think there is a misconception among those getting started. When they hear about jumbo frames, they might think, “Well, I can enable that on my clients and servers and this will make everything faster!” Wrong.

There is usually no need to use jumbo frames on virtual machine networks and the bridges backing them. If you enable them there, even inadvertently, you will get VERY sporadic connectivity and all kinds of problems, as I have shown above.

You may have a special case to do this for a VM or two depending on their role, but you can create a network just for this purpose. In general, when you enable jumbo frames on a VM bridge, most modern operating systems will inherit the bridge setting and use it. Then you are in a world of hurt trying to figure out why things won’t communicate or only communicate sporadically.

So really and truly, jumbo frames are usually only needed or beneficial in your “special service” networks like storage networks, migration networks and other traffic types with large flows.
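If a VM does have to live on a jumbo-enabled bridge but should stay at 1500, the Proxmox network device options include an mtu= setting for VirtIO NICs. Here is a hedged sketch of pinning it from the command line. The with_mtu helper is hypothetical, VM ID 100 and net0 are placeholder values, and I am assuming the NIC is VirtIO, so treat this as a starting point rather than a recipe.

```shell
# with_mtu NETSPEC MTU
# Return a Proxmox net device string with its mtu= option set to MTU,
# replacing any existing mtu= value and keeping the rest (MAC, bridge, ...).
with_mtu() {
    printf '%s,mtu=%s\n' "$(printf '%s' "$1" | sed -E 's/,mtu=[0-9]+//')" "$2"
}

# Example (assumes VM ID 100 with a VirtIO NIC on net0; adjust for your VM):
#   current="$(qm config 100 | awk '/^net0:/ { print $2 }')"
#   qm set 100 --net0 "$(with_mtu "$current" 1500)"
```

Reusing the existing net0 string keeps the MAC address intact, which matters because passing a bare model and bridge to qm set generates a new one.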

Beware of mixed MTU sizes on same-named bridges across different Proxmox hosts

Also, be aware that even on the same named bridge, different Proxmox hosts can have mismatched MTU settings. Each host has its own configuration.

This can show up in weird ways, and you will see errors related to mismatched MTU sizes between hosts in normal everyday operations. While I was changing the MTU sizes for VM bridges, I saw one of these errors when I had reverted a host to a 1500 MTU and tried to migrate a VM to it that still had a 9000 MTU.

Jumbo frames mismatch between bridges on different hosts

As you can see, this can cause all kinds of issues when you have MTU variances not only between different devices, but also between the Proxmox hosts within the same cluster.
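A quick way to catch this kind of drift is to collect the MTU of the same bridge from every host and compare. The sketch below is one possible approach: mtus_agree is a made-up helper, and the example loop assumes root SSH access between nodes and only names two of my hosts.

```shell
# mtus_agree: read "host mtu" pairs on stdin, echo them back, and
# return non-zero when more than one distinct MTU value appears.
mtus_agree() {
    awk '{ print; seen[$2] = 1 }
         END { n = 0; for (m in seen) n++; exit (n > 1) }'
}

# Example: compare vmbr0 across hosts (extend the list to every node).
#   for h in pvehost01 pvehost02; do
#       printf '%s %s\n' "$h" "$(ssh "root@$h" cat /sys/class/net/vmbr0/mtu)"
#   done | mtus_agree || echo "vmbr0 MTU differs between hosts"
```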

LACP can make this even more confusing

When you are running dual 10Gb bonded interfaces, traffic is hashed across the links. If one physical port on the switch has the correct MTU and another does not, traffic flows may succeed on one path and fail on the other.

That can explain random behavior with jumbo frames. Make sure you have your physical links set to 9000 along with the bridge itself and the bond. Strange problems will definitely rear their heads when one flow lands on a link with a clean 9000 path and another lands on a link with a 1500 MTU bottleneck.
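On the host side, the kernel lists the members of a bond in /proc/net/bonding, so you can at least confirm each physical link individually. This is a small sketch with a hypothetical helper name, and it says nothing about the switch ports, which have to be checked one by one on the switch itself.

```shell
# bond_members [FILE]
# Print the member interfaces of a Linux bond, read from the kernel's
# bonding status file (default: /proc/net/bonding/bond0).
bond_members() {
    awk -F': ' '/^Slave Interface:/ { print $2 }' "${1:-/proc/net/bonding/bond0}"
}

# Example: show the MTU of every member of bond0.
#   for m in $(bond_members); do
#       echo "$m $(cat "/sys/class/net/$m/mtu")"
#   done
```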

Lacp links with proxmox jumbo frames misconfigured in the cluster

Basic test to make sure you have jumbo frames enabled

One of the most basic and helpful tests of whether jumbo frames are really working is to run the ping command with parameters that set the packet size and forbid fragmentation.

From one node to another, run the following on your Proxmox hosts or from within a Linux VM on the segment you are testing.

ping -M do -s 8972 <target-ip>

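The -M do option sets the don’t-fragment flag, and -s 8972 is the ICMP payload size: 9000 bytes minus 20 bytes of IP header and 8 bytes of ICMP header. If MTU 9000 is truly working end-to-end, that ping should be successful. Below is me pinging from my pvehost01 to a Ceph IP interface on pvehost02: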

Jumbo frames successful using the ping command with jumbo frame packets

You should expect jumbo frame pings to fail on management interfaces or VLANs where VMs are connected to your Linux bridges, since those stay at 1500. Below is me pinging the management VLAN interface between Proxmox hosts and the untagged VLAN interface where I have a handful of VMs connected.

Jumbo frames failing from one proxmox host to another on a management link
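The two ping tests above can be wrapped in a small helper so you don’t have to remember the payload math. This is a minimal sketch assuming IPv4 and the iputils version of ping. The function names are my own.

```shell
# ping_payload MTU
# ICMP payload that exactly fills one IPv4 packet of the given MTU:
# MTU minus 20 bytes of IPv4 header minus 8 bytes of ICMP header.
ping_payload() {
    echo $(( $1 - 28 ))
}

# probe_mtu TARGET MTU
# Send three don't-fragment pings sized for MTU; fails if they don't fit.
probe_mtu() {
    ping -M do -c 3 -W 2 -s "$(ping_payload "$2")" "$1"
}

# Examples:
#   probe_mtu <ceph-ip> 9000         # should succeed on a jumbo network
#   probe_mtu <management-ip> 9000   # should fail on a 1500 network
#   probe_mtu <management-ip> 1500   # should succeed
```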

Example interface configuration on one of my Proxmox hosts

Here is an example of my working config after I sorted out where jumbo frames needed to be. Note where I have jumbo frames configured:

  • Physical interfaces
  • bond0 interface
  • Ceph interface
  • Migration interface

Below is my file from my pvehost01 found at /etc/network/interfaces.

auto lo
iface lo inet loopback

iface enp2s0f0np0 inet manual
        mtu 9000

iface enp2s0f1np1 inet manual
        mtu 9000

iface wlo1 inet manual

auto bond0
iface bond0 inet manual
        bond-slaves enp2s0f0np0 enp2s0f1np1
        bond-miimon 100
        bond-mode 802.3ad
        bond-xmit-hash-policy layer2+3
        mtu 9000

auto vmbr0
iface vmbr0 inet static
        address 10.3.33.210/24
        gateway 10.3.33.1
        bridge-ports bond0
        bridge-stp off
        bridge-fd 0
        bridge-vlan-aware yes
        bridge-vids 2-4094
        mtu 1500

# VLAN 2 - Dedicated VM bridge (no VM tagging needed)
auto bond0.2
iface bond0.2 inet manual
        vlan-raw-device bond0
        mtu 1500

auto vmbr2
iface vmbr2 inet manual
        bridge-ports bond0.2
        bridge-stp off
        bridge-fd 0
        mtu 1500

# VLAN 10 - Dedicated VM bridge (no VM tagging needed)
auto bond0.10
iface bond0.10 inet manual
        vlan-raw-device bond0
        mtu 1500

auto vmbr10
iface vmbr10 inet manual
        bridge-ports bond0.10
        bridge-stp off
        bridge-fd 0
        mtu 1500

# VLAN 149 - Dedicated VM bridge (no VM tagging needed)
auto bond0.149
iface bond0.149 inet manual
        vlan-raw-device bond0
        mtu 1500

auto vmbr149
iface vmbr149 inet manual
        bridge-ports bond0.149
        bridge-stp off
        bridge-fd 0
        mtu 1500

# VLAN 222 - Dedicated VM bridge (no VM tagging needed)
auto bond0.222
iface bond0.222 inet manual
        vlan-raw-device bond0
        mtu 1500

auto vmbr222
iface vmbr222 inet manual
        bridge-ports bond0.222
        bridge-stp off
        bridge-fd 0
        mtu 1500

# Ceph
auto bond0.334
iface bond0.334 inet static
        address 10.3.34.210/24
        vlan-raw-device bond0
        mtu 9000

# Cluster
auto bond0.335
iface bond0.335 inet static
        address 10.3.35.210/24
        vlan-raw-device bond0
        mtu 1500

# Migration
auto bond0.336
iface bond0.336 inet static
        address 10.3.36.210/24
        vlan-raw-device bond0
        mtu 9000

source /etc/network/interfaces.d/*
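
With this many stanzas, it is easy to lose track of which interface has which MTU. As a rough sketch, the helper below (iface_mtus, a name I made up) pulls every explicit mtu line out of an interfaces file so you can eyeball the whole picture at once. It assumes the simple one-option-per-line layout shown above and ignores interfaces with no mtu line.

```shell
# iface_mtus FILE
# Print "interface mtu" for every iface stanza that sets an explicit mtu.
iface_mtus() {
    awk '$1 == "iface" { name = $2 }
         $1 == "mtu" && name != "" { print name, $2 }' "$1"
}

# Example:
#   iface_mtus /etc/network/interfaces
```

Saving that output per host and comparing the files is a cheap way to spot a bridge that differs from its twin on another node.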

Summary of important points to remember with jumbo frames

Take note of these important points that can help when you introduce jumbo frames into your Proxmox cluster.

| Recommendation | Why It Matters |
|---|---|
| Validate MTU end-to-end using ping -M do -s 8972 before trusting configuration | Interface settings alone don’t prove jumbo frames are working. Only testing with actual 9000-byte packets does |
| Keep jumbo frames isolated to dedicated storage or Ceph networks, or only networks that benefit | Storage replication benefits most. Management and VM networks often cause issues and complexity due to routing, etc. |
| Make sure MTU is consistent across physical NICs, bonds, bridges, VLANs, and switch ports | One mismatched interface creates fragmentation or silent packet drops |
| Check each LACP member port individually on the switch | A single port at 1500 inside a bond can create intermittent and hard-to-diagnose behavior |
| Be cautious with firewall-enabled bridges and VLAN-tagged interfaces | These can introduce additional interfaces that may not inherit the MTU setting you expect |
| Measure your performance before and after enabling jumbo frames | Improvements with jumbo frames are not guaranteed. In some home lab setups, the difference may be minimal |
| Don’t change the MTU value during major network redesigns | When you combine MTU changes with LACP, VLAN restructuring, or Ceph reconfiguration, it can make troubleshooting much harder if things don’t work after the changes are made |
| Document and version control network configuration files | Future tweaks can quietly reintroduce MTU inconsistencies without you realizing it |

Wrapping up

A jumbo frames Proxmox configuration has a lot of advantages for your cluster when you are running hyperconverged storage like Ceph. Even with traditional storage using iSCSI and NFS, there are advantages. The larger packets are more efficient, and they take less of a toll on your CPU than standard 1500 MTU packets. However, like a lot of other things, jumbo frames require absolute perfection when it comes to consistency. If you miss one thing or design things poorly, you will cause more harm than good by turning them on. Also, a key point to remember is that it is almost always the better decision to leave jumbo frames turned off on bridges where you connect virtual machines. How about you? Are you running jumbo frames in your home lab? What issues have you run into?

About The Author

Brandon Lee

Brandon Lee is the Senior Writer, Engineer and owner at Virtualizationhowto.com, and a 7-time VMware vExpert, with over two decades of experience in Information Technology. Having worked for numerous Fortune 500 companies as well as in various industries, he has extensive experience in various IT segments and is a strong advocate for open source technologies. Brandon holds many industry certifications, loves the outdoors and spending time with family. Also, he goes through the effort of testing and troubleshooting issues, so you don't have to.
