NSX Edge load balancer nodes not accessible
A while back I ran into an issue where NSX Edge load balancer nodes were not accessible. For this article I recreated a similar setup in my lab, which I will describe first. After that I will explain what is going wrong and describe the solution.
The lab setup contains:
- One NSX Edge with load balancing enabled. It has one interface with the IP address 192.168.0.10, which is also used for the virtual server.
- Two Linux Apache servers, which are the load balancer nodes. I adjusted the background color of the default Apache webpage so we can see which node is being accessed through the load balancer. The servers are reachable on 192.168.0.20 and 192.168.0.30.
- One Windows 10 machine (192.168.0.40) to access the webpages.
The load balancer has the two Linux servers configured as member nodes for a pool, which are monitored using the default http monitor.
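To give an idea of what such a monitor does, here is a minimal Python sketch of an HTTP health check. It is not how the NSX Edge implements its monitor. The function name `check_node`, the path `/`, port 80, the 5 second timeout and the rule "any 2xx status means UP" are all assumptions I made for illustration.

```python
import http.client


def check_node(host, port=80, path="/", timeout=5.0):
    """Return "UP" if an HTTP GET to the node answers with a 2xx status, else "DOWN".

    Hypothetical helper: the path, timeout and 2xx rule are assumptions,
    not the settings of the NSX default http monitor.
    """
    conn = http.client.HTTPConnection(host, port, timeout=timeout)
    try:
        conn.request("GET", path)
        status = conn.getresponse().status
    except (OSError, http.client.HTTPException):
        # Connection refused, a timeout, or packets dropped by a firewall
        # all look the same to the monitor: the node is DOWN.
        return "DOWN"
    finally:
        conn.close()
    return "UP" if 200 <= status < 300 else "DOWN"
```

Calling `check_node("192.168.0.20")` and `check_node("192.168.0.30")` from a machine that is allowed to reach the nodes would report their state. The important detail is that the check only succeeds if the machine running it is allowed to talk to the nodes in the first place.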
There are two security groups present:
- Website SG, contains the two Linux servers
- Management SG, contains the Windows 10 virtual machine
These security groups are used in two security policies:
- Website SP, allows any traffic from and to all objects in the Website SG. This means the two Linux servers are allowed to communicate with each other.
- Management SP, allows any traffic from Management SG to Website SG and back. So the Windows 10 machine can contact the webservers.
Now these particular security policies aren’t really strict, and you will most likely use stricter policies in your production environment, but they will suffice as an example.
With this configuration, the pool statistics on the load balancer show the nodes as “Down”, meaning the load balancer can’t reach the nodes with the default http monitor.
When I use the Windows 10 machine to see if the website can be opened, I get the following results:
|Virtual server 192.168.0.10|Node 192.168.0.20|Node 192.168.0.30|
|Not reachable|Reachable|Reachable|
So we can reach the webservers directly, but not through the load balancer.
Even though the configuration looks good, there is still one component missing, namely the load balancer IP address! With the security groups and policies that are in place, we only allowed the clients and servers to communicate with each other. Nowhere in the configuration do we allow communication to and from the load balancer.
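The reasoning can be sketched with a small Python model of the lab. The group and policy names and the IP addresses come from the setup above. The dictionaries, the `is_allowed` function and the "drop everything that no policy matches" behaviour are simplifications I assume for illustration, not how NSX evaluates rules internally.

```python
# Security groups as sets of member IP addresses (values from the lab setup).
SECURITY_GROUPS = {
    "Website SG": {"192.168.0.20", "192.168.0.30"},
    "Management SG": {"192.168.0.40"},
}

# Each policy allows any traffic between a pair of groups, in both directions.
SECURITY_POLICIES = {
    "Website SP": ("Website SG", "Website SG"),
    "Management SP": ("Management SG", "Website SG"),
}


def is_allowed(src, dst, groups=SECURITY_GROUPS, policies=SECURITY_POLICIES):
    """Return True if any policy permits traffic from src to dst."""
    for group_a, group_b in policies.values():
        a, b = groups[group_a], groups[group_b]
        if (src in a and dst in b) or (src in b and dst in a):
            return True
    # Assumption: anything that no policy matches is dropped.
    return False


print(is_allowed("192.168.0.40", "192.168.0.20"))  # Windows 10 client -> node: True
print(is_allowed("192.168.0.10", "192.168.0.20"))  # load balancer -> node: False
```

The load balancer address 192.168.0.10 is not a member of any group, so no policy ever matches its traffic towards the nodes.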
The problem with an Edge, though, is that you can’t include it in a security group directly. You can’t add it as a “virtual machine”, even though it looks like one in the inventory. And you can’t use security tags either, since you aren’t allowed to place security tags on Edges and DLRs.
To work around this problem you can create an IP set that contains the IP address used by the virtual server on the ESG. If you include this IP set in the Website SG, it will be picked up by the security policies, which allows communication to and from the load balancer.
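If you would rather script this than click through the UI, the sketch below builds an XML body for an IP set in Python. The element names (`ipset`, `name`, `description`, `value`) follow my understanding of the NSX for vSphere REST API, where an IP set is created with a POST to `/api/2.0/services/ipset/globalroot-0`. Treat both the endpoint and the payload shape as assumptions and check them against the API guide for your NSX version. The function name and the IP set name `LB-VIP` are made up for this example.

```python
import xml.etree.ElementTree as ET


def build_ipset_payload(name, addresses, description=None):
    """Build an XML body describing an IP set (hypothetical helper).

    Assumption: the API expects a single comma-separated <value> element.
    """
    root = ET.Element("ipset")
    ET.SubElement(root, "name").text = name
    if description:
        ET.SubElement(root, "description").text = description
    ET.SubElement(root, "value").text = ",".join(addresses)
    return ET.tostring(root, encoding="unicode")


# The IP set only needs the address used by the virtual server on the ESG.
payload = build_ipset_payload("LB-VIP", ["192.168.0.10"])
print(payload)
```

The resulting string would be sent as the request body with content type `application/xml`. Adding the new IP set to the Website SG is a separate step, which I left out here.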
As you can see in the screenshots, the http health monitor now shows both nodes as “UP”, and from the Windows 10 machine I’m now able to reach both webservers via the load balancer.
All in all it isn’t too difficult, it’s more a way of thinking when using microsegmentation :)
I ran into the same problem some time ago while trying to automate this with vRA. Have you tried that part?