
ESXi host on Hybrid cloud Homelab routing question

(@joseap)
Posts: 3
Active Member
Topic starter
 

Hi,

I have a headache with my homelab network and I'm stuck. Maybe someone could help me a bit, as I'm not a networking guy; in fact I'm a hardware tech.

I've been rebuilding my VMware homelab (I know, it's time to look for a new hypervisor; that's one reason to have a working homelab for testing):

Basically, 3 ESXi hosts at home and 1 on a cloud VPS (where vCenter usually lives, at least for now).

The VPS server is a cheap one with only 1 NIC, so I installed pfSense in a VM on it, and also on one of the ESXi hosts at home. I then set up a working WireGuard site-to-site VPN with the pfSense plugin. At home I have the 172.26.0.0/24 network, and on the VPS server I have 172.16.0.0/24. I can manage everything from my home laptop, where I also installed a VPN client for when I'm a road warrior.

Now I've been stuck for days on this issue, which came up when I wanted to add NFS shared storage to the VPS ESXi host:

On my VPS server I created 2 vSwitches:

-vSwitch 0, with the VM Network portgroup where the pfSense WAN is (also vmk0 for the original management network, which I plan to remove), connected to vmnic0 (the only one).

-vSwitch 1, where I made a LAN management network vmk1 and a 'NAT network' portgroup with vCenter, Debian and the pfSense LAN, as an 'isolated net' without a physical NIC. All VMs have internet access and homelab connectivity (from 172.16 to 172.26 and vice versa).

OK, now I wanted to share NFS storage from an ESXi host at home. I can add it perfectly to all the home hosts, and I can also add it manually to the Debian VM on the cloud server. But I can't add it to the VPS ESXi host. Checking, vCenter is able to ping 172.26.0.0/24 (home), since I added all the hosts etc., but when I check from the VPS ESXi host console, I can't ping my 172.26.0.0 network.

Result of esxcli network ip route ipv4 list:

[Screenshot: esxcli route listing]

The marked ones are routes I added today for testing, with no success; they can be ignored. I'm lost tbh. I see all routes are added to the vmk0 interface, and although the VPS ESXi host can ping the 172.16.0.0 network (vCenter and the VMs on it; primary DNS is the pfSense IP), I can't make it route to my home network to reach the storage server. The VPS ESXi host can't ping the internet either (I don't care, I don't want it public since I'm connected with the site-to-site VPN, but I don't know if it matters).
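For readers following along, here is a minimal Python sketch of how a host chooses a row from a routing table like the one listed above: the matching entry with the longest prefix wins. The rows are hypothetical stand-ins (the real table was posted only as a screenshot), and `lookup` is an illustrative helper, not an ESXi tool.

```python
import ipaddress

# Hypothetical rows: (network, gateway, interface).
# The real table was only posted as a screenshot, so these are assumptions.
table = [
    ("0.0.0.0/0", "172.16.0.1", "vmk0"),
    ("172.16.0.0/24", "0.0.0.0", "vmk1"),
    ("172.26.0.0/24", "172.16.0.1", "vmk0"),
]

def lookup(destination, table):
    """Return the matching route with the longest prefix, or None if nothing matches."""
    addr = ipaddress.ip_address(destination)
    matches = [row for row in table if addr in ipaddress.ip_network(row[0])]
    return max(
        matches,
        key=lambda row: ipaddress.ip_network(row[0]).prefixlen,
        default=None,
    )

# Show which row each destination would use.
for dest in ("172.26.0.20", "172.16.0.1", "8.8.8.8"):
    print(dest, "->", lookup(dest, table))
```

The lookup only decides which row applies. Whether the packet actually leaves depends on the gateway being reachable through the interface that row is bound to.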

To me, if everything is connected except the VPS ESXi host, pfSense isn't the issue (it's very permissive, since I'm trying to make it work first before adding extra security).

Could someone point me in the right direction as to whether it's an ESXi host routing or configuration mistake, or whether it's related to pfSense firewall or routing? Maybe it's not the right way to isolate a cloud ESXi host... any help will be appreciated.

Thanks

 

This topic was modified 4 weeks ago by joseap
 
Posted : 25/03/2024 5:24 pm
Brandon Lee
(@brandon-lee)
Posts: 543
Member Admin
 

@joseap welcome to the forums! It definitely sounds like a routing issue to be honest. So let me summarize so you can see if I am understanding correctly:

  • All your VMs can see your home network that is running the wireguard tunnel between the two pfSense instances.
  • Your ESXi server in the VPS can't see your home network at all? So, you can't ping 172.26.0.0/24 network at all from the ESXi server?
  • Also, what is at the 172.16.0.1 IP? Is this your pfSense firewall?
  • What IP address is your ESXi server? Is it on this same subnet?

Thanks @joseap

 
Posted : 25/03/2024 9:25 pm
(@joseap)
 

@brandon-lee Yes, you understood it right.

172.16.0.1 is indeed the pfSense firewall on the VPS server side, just as 172.26.0.1 is on the home side, and the VPS ESXi host is on the same 172.16.0.0/24 network (I can administer it directly via the vmk1 management network at 172.16.0.10 on vSwitch1 through the VPN from home, and obviously also from vCenter), with 172.16.0.1 as primary DNS and a generic one as secondary.
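A small Python sketch, using only the addresses mentioned in this thread, of why the host can reach 172.16.0.1 directly while anything in 172.26.0.0/24 has to go through a gateway. `needs_gateway` is a hypothetical helper name used for illustration.

```python
import ipaddress

# Management address of the VPS ESXi host on vmk1, as given in the post.
vmk1 = ipaddress.ip_interface("172.16.0.10/24")

def needs_gateway(destination, local_interface):
    """True when the destination lies outside the interface's own subnet."""
    return ipaddress.ip_address(destination) not in local_interface.network

for dest in ("172.16.0.1", "172.26.0.1"):
    kind = "via gateway" if needs_gateway(dest, vmk1) else "on-link"
    print(dest, kind)
```

So a working ping to 172.16.0.x only proves the on-link path. The home network additionally needs a route whose gateway (pfSense at 172.16.0.1) is reachable from the interface that route uses.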

What I don't understand is why pfSense isn't routing to the home network if everything else works and it connects the 172.16 network. Wouldn't any blocking firewall rule or config mistake show up as the same issue in the rest of the VMs?

I'm not sure if it's an issue to have 2 vSwitches with 2 vmk (0/1) port groups, since when I try to add a new route directly on the ESXi host to 172.26.0.0/24 using pfSense 172.16.0.1 as the gateway, it gets added to the vmk0 interface (maybe removing vmk0 would change something?)

Thank you for helping

 

 
Posted : 26/03/2024 12:54 am
(@joseap)
 

Well, good news, but little explanation (this usually happens when I've just woken up and start working on something and I'm still not too conscious of what I'm doing LoL). I'll explain what I did; maybe some things make no sense, but I hate finding posts around forums just saying "hey, I fixed it" without any kind of explanation.

I thought about the vmk1 issue I asked about before, and I edited the Management Network port group in vSwitch 1 (I noticed that when I right-click the three-dots menu that offers See/Edit config and Delete, the options stay grey until I move the cursor outside the Virtual Switch window; then Edit and Delete become active). It had DHCP in the IPv4 config, so I changed it to static (keeping the same settings). Then some heartbeat issues appeared in vCenter and the ESXi host couldn't ping the 172.16.0.1 gateway, so I went to the IPMI service and changed the ESXi host IP from 172.16.0.100 to .10 (it gave me an error that it was already in use). Pings were weird, so I restarted the Management Network. I pinged 172.16 again and it worked; I tested a ping to a Google DNS server and it worked, which it hadn't before; so I tested pinging the 172.26 network, and yes, it worked.

Then I listed routes and result is:

[Screenshot: ESXi routing table after the fix, on vmk1]

Here you can see that vmk0 has disappeared (I'll need to test rebooting the VPS server and do more tests) and every route goes through vmk1. So I guess, without knowing exactly why (I'm just learning and fixing everything along the way), that this was what affected the ESXi host routing (maybe it wasn't able to route between vSwitches, and it was fixed once it pointed to vSwitch1?)

If someone can give a logical explanation, I'll be happy to learn. The point of the NFS share was to do some backups before moving forward with distributed switches, in case I mess anything up (I'm trying to isolate my home network from the homelab). So for now I'll do that, and once I can test properly and safely without messing up the whole homelab, I'll try to replicate the issue and check the steps. Hope this helps some future messy guy like me if needed.

Thanks to anyone who looked into this.

 
Posted : 26/03/2024 3:43 pm
Brandon Lee
(@brandon-lee)
 

@joseap Looking closer at your screenshot, I think the issue may be that your default gateway is on vmk0 and your adjacent network is showing on vmk1.

Vmk1 has a default gateway of 0.0.0.0 in your screenshot, so traffic is probably not making it to your default gateway, which is where the pfSense box should know about the WireGuard route.

Notice in the screenshot of a test ESXi host below that the default gateway and adjacent network are on the same VMkernel port:

[Screenshot: routing table of a test ESXi host]

I think if you get your adjacent network and default gateway back on the same vmkernel port, it will resolve the issue 👍 
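To make that check concrete, here is a rough Python sketch that flags routes whose gateway is not on the subnet of the VMkernel port they are bound to. The vmk1 address comes from the thread; the vmk0 address is a placeholder from the documentation range, and the route rows are illustrative assumptions rather than the contents of the actual screenshot. `misbound_routes` is a hypothetical helper.

```python
import ipaddress

# vmk1's address is from the thread; vmk0's address is an assumed placeholder.
interfaces = {
    "vmk0": ipaddress.ip_interface("203.0.113.10/24"),
    "vmk1": ipaddress.ip_interface("172.16.0.10/24"),
}

# (destination, gateway, bound interface); illustrative rows, not the real table.
routes = [
    ("default", "172.16.0.1", "vmk0"),
    ("172.26.0.0/24", "172.16.0.1", "vmk0"),
    ("172.16.0.0/24", "0.0.0.0", "vmk1"),
]

def misbound_routes(routes, interfaces):
    """Return routes whose gateway is not on the subnet of their bound interface."""
    bad = []
    for dest, gateway, ifname in routes:
        gw = ipaddress.ip_address(gateway)
        if gw.is_unspecified:
            continue  # 0.0.0.0 marks a directly connected network
        if gw not in interfaces[ifname].network:
            bad.append((dest, gateway, ifname))
    return bad

problems = misbound_routes(routes, interfaces)
for dest, gateway, ifname in problems:
    print(f"{dest}: gateway {gateway} is not on-link for {ifname}")
```

With every route rebound to vmk1, as in the table posted after the fix, the same check returns an empty list, which matches the idea of keeping the adjacent network and the default gateway on one VMkernel port.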

 
Posted : 26/03/2024 3:48 pm