I Thought ZFS Was Required for Proxmox Replication in My Home Lab but I Was Wrong

Proxmox replication without ZFS

When it comes to Proxmox replication, in the home lab or production for that matter, if you want replication you need ZFS. That is just how the feature is engineered out of the box, and it is well documented that way. Since I haven’t been running ZFS, and have been using Ceph and now iSCSI with my clusters, I simply designed around that assumption. But recently, that assumption got flipped on its head, and it has changed how I think about storage design in my Proxmox home lab.

Why ZFS replication feels like the only option

If you have spent any time with Proxmox, you know that replication is tightly tied to ZFS. The built-in replication feature works by leveraging ZFS snapshots and send/receive operations. You can read about this in the Proxmox Admin Guide for PVE 9.x here: Proxmox VE Administration Guide.

Proxmox supported storage with replication

That gives you a lot of benefits when using ZFS and replication:

  • Incremental replication
  • Efficient snapshot transfers
  • Built-in scheduling
  • Tight integration with the Proxmox UI
  • Automatically changes direction if you migrate
  • It’s well documented and works

But the downside is that if the storage is not ZFS, the replication feature simply does not apply. LVM, LVM-thin, Ceph, iSCSI, and so on do not get the same native replication capabilities. If you try to enable replication on a VM that sits on something other than ZFS, you will see the following:

No volumes found that can be replicated

So naturally, most people conclude that if you want to make use of replication, you will need to configure and run ZFS storage. This is exactly where I was.

ZFS is great unless you want shared storage in a Proxmox cluster

Don’t get me wrong. ZFS is a great file system that works very well for local storage. However, once you shift into the mindset of a proper Proxmox cluster, things change. In a cluster, you want “shared storage” for all your nodes, meaning every cluster node can access the same storage at the same time and run different workloads from it.

That is not possible with ZFS. So you are stuck choosing between capabilities depending on the type of storage you use natively in Proxmox:

  • If you use ZFS, that storage can’t be shared between cluster nodes
  • But if you use ZFS, you can use native replication between ZFS storage on different nodes
  • If you use a shared storage technology (Ceph, iSCSI), you can share storage between nodes
  • But if you use a shared storage technology (Ceph, iSCSI), you can’t use native Proxmox replication

So it is a bit of a conundrum with the native functionality. Now, can you use something like Veeam to replicate your Proxmox virtual machines between two different environments? Yes, you absolutely can, and this is similar to how many VMware shops running Veeam operate today.

Where the gap actually is in Proxmox

Proxmox itself is not lacking in replication capability. It is just that the built-in replication feature is tied specifically to ZFS. It seems like generalizing this native feature would be an easy win for Proxmox development. But if you take a step back, replication with ZFS in Proxmox is really just:

  • Copying VM disk state from one node to another
  • Doing it in a consistent way
  • Keeping those copies reasonably up to date

ZFS is the vehicle that solves this using snapshots and send streams. But it is not the only way to achieve the same outcome. The gap is not that Proxmox cannot replicate non-ZFS storage; the gap is that it does not have a native, general-purpose replication engine for all storage types.
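To make that concrete, here is a toy, storage-agnostic sketch of the idea: do one full copy, then on each run hash fixed-size blocks of the disk image and ship only the blocks that changed. This is my own illustration of delta replication in general, not PegaProx’s actual implementation:

```python
import hashlib

BLOCK_SIZE = 4096  # illustrative block size

def block_hashes(data: bytes) -> list[str]:
    """Hash each fixed-size block of a disk image."""
    return [
        hashlib.sha256(data[i:i + BLOCK_SIZE]).hexdigest()
        for i in range(0, len(data), BLOCK_SIZE)
    ]

def delta_sync(source: bytes, target: bytearray, target_hashes: list[str]) -> int:
    """Copy only changed/new blocks from source to target; return blocks sent.

    With no prior hashes, every block is "changed", so the first run is a
    full sync; subsequent runs transfer only the delta.
    """
    sent = 0
    for idx, h in enumerate(block_hashes(source)):
        if idx >= len(target_hashes) or target_hashes[idx] != h:
            off = idx * BLOCK_SIZE
            target[off:off + BLOCK_SIZE] = source[off:off + BLOCK_SIZE]
            sent += 1
    return sent
```

The point of the sketch is that nothing here depends on the underlying filesystem, which is exactly why a platform-level replication engine can work across ZFS, LVM, iSCSI, and directory storage alike.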

The tool to replicate non-ZFS VMs

The other day, I stumbled onto functionality in PegaProx that was very intriguing to me, and it came right after I had added iSCSI to my Proxmox cluster that was already running Ceph. In case you haven’t heard of PegaProx, I think it is probably the best third-party UI for Proxmox right now, with many great features that help fill the look-and-feel gap for vSphere admins moving away from the vSphere Client.

Check out my posts on PegaProx so far here:

In PegaProx, underneath the Datacenter options for your cluster, there is the Replication menu. Looking at the replication options there, PegaProx has an option it refers to simply as Snapshot replication. The description for this replication type is:

  • Clone + migrate approach. Works with any storage (LVM, dir, etc)

What is also interesting in terms of efficiency is that the official PegaProx documentation states:

  • Snapshot-based replication copies VM data between clusters. After initial full sync, transfers are incremental (delta only).
Snapshot replication in pegaprox

How to replicate VMs that aren’t on ZFS

Now that we see the tool and functionality to do the snapshot replication, let’s see what the workflow looks like. Click the +Add button in the Snapshot Replication section.

Adding snapshot replication using pegaprox

This will launch a window that looks like the following. As you can see in the dialog box, it has you select:

  • VM/CT
  • Target Node
  • Target Storage
  • Schedule
Creating a replication job dialog with the snapshot replication in pegaprox

Note below that the target storage I have selected is an iSCSI LUN and not ZFS. The scheduling options are also interesting. The default setting is every 6 hours, but as you can see, you have these options:

  • Every 15 minutes
  • Every 30 minutes
  • Every hour
  • Every 6 hours (Default)
  • Daily
  • Custom

This sets up a cron-style schedule that will snapshot and replicate the VM at the specified interval.

Selecting the vm ct target node target storage and schedule for the replication job

Once I completed the dialog box above, set the options, and clicked Create, the VM/CT is listed as Pending. This simply means the replication hasn’t made its initial seed of the virtual machine yet and is waiting on the replication interval.

Zfs replication with snapshot replication pending for a new configuration

You can also kick the job off manually, which starts the task immediately. Select the task in PegaProx to view the status of the job.

Task status after kicking off the replication job manually

If you scroll down in the status window you can see the status of the clone copy in real time.

Progress of the disk clone using the proxmox replication without zfs with pegaprox

After the clone completes successfully, it will display OK on the status of the replica.

Proxmox replication job without zfs has completed successfully

What are the resulting virtual machines called? These are prefixed with “repl”, followed by the VMID and the host that you targeted for the replication.

New replicated proxmox virtual machines in inventory
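The naming convention above can be sketched as a tiny helper. The “repl” prefix plus VMID and target host matches what shows up in my inventory; the exact separator characters are an assumption on my part:

```python
def replica_name(vmid: int, target_host: str) -> str:
    """Build the name a replicated VM appears under in inventory.

    The 'repl' prefix, VMID, and target host match what I observed;
    the hyphen separators are an assumption for illustration.
    """
    return f"repl-{vmid}-{target_host}"
```

This makes it easy to spot (or script against) replicas at a glance, since the original VMID and target node are both encoded in the name.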

How I think this changes our storage options in the home lab

Have you ever used a technology just because you wanted a single feature? Honestly, that may be the case for some who run ZFS. It is a great filesystem, but it doesn’t make sense in all scenarios. So if you only used ZFS to get replication capabilities, this can change that. As we saw above, instead of relying on ZFS snapshots, this approach focuses on VM-level snapshots or disk state and then transfers that between nodes.

I think this is great since it shifts replication from being a “storage” feature to being a “platform” feature, and that can really change things. You are no longer tied to ZFS for replication, which opens up your storage options a lot.

In my case, the following is now possible:

  • Use iSCSI backed LVM or LVM thin storage
  • Leverage external SAN features like thin provisioning and snapshots
  • Keep centralized storage while still having replication
  • Mix and match storage types depending on the workload

Is there still a case for ZFS replication?

Understand that ZFS is probably the most efficient and powerful replication option, since it operates at the filesystem level with incremental changes. So for pure performance and efficiency, ZFS is the best choice, provided you can live with its constraints around shared storage.

Keep in mind that, according to the official PegaProx documentation, it also does incremental transfers, but arguably it still won’t be as efficient as ZFS at the storage layer.

Personally, I would recommend using ZFS replication when you are running local disks and want simple, built-in replication without any setup to speak of. The main reason to steer the other direction is if you need shared storage between multiple nodes.

I think ZFS is still one of the strongest features of Proxmox, and if the tool fits, definitely use it. This is not about replacing it entirely, but about having options and another way to replicate your data if you can’t use ZFS.

Wrapping up

The Proxmox ecosystem is growing and expanding with great tools that offer a lot of really nice features. PegaProx, Pulse, and ProxMenux are the tools that immediately come to mind. They drastically increase the toolset of your Proxmox environment with very little if any overhead, and they are all easy to deploy. So, if you have been holding back from using iSCSI or other non-ZFS storage because you thought replication was off the table, this may be worth revisiting. What about you? Are you currently using ZFS replication? Is this feature of PegaProx something that interests you? Let me know in the comments.


About The Author

Brandon Lee

Brandon Lee is the Senior Writer, Engineer, and owner at Virtualizationhowto.com, and a 7-time VMware vExpert with over two decades of experience in Information Technology. Having worked for numerous Fortune 500 companies as well as in various industries, he has extensive experience across many IT segments and is a strong advocate for open source technologies. Brandon holds many industry certifications, loves the outdoors, and enjoys spending time with family. He also goes through the effort of testing and troubleshooting issues, so you don’t have to.
