- Understand the NIC Teaming failover types and related physical network settings
There are 5 NIC Teaming policies:
Route based on the originating virtual switch port ID
This is the default configuration and the one that’s most commonly used. It works by choosing an uplink based on the virtual port where the traffic entered the virtual switch. Traffic from a given virtual Ethernet adapter is consistently sent to the same physical adapter unless there has been a failover, and replies come back on that same physical adapter. A virtual machine cannot use more than 1 uplink at a time unless it has multiple virtual Ethernet adapters. This setting is the most used for 2 reasons: it’s the default, and it’s also the one with the least overhead on the system.
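To make the "same port, same uplink" behaviour concrete, here is a minimal sketch. The modulo mapping and the `vmnic` names are assumptions for illustration only, not ESXi's actual internal algorithm.

```python
from typing import Sequence

def uplink_for_port(port_id: int, active_uplinks: Sequence[str]) -> str:
    """Pick an uplink from the virtual port ID (illustrative modulo mapping)."""
    if not active_uplinks:
        raise ValueError("no active uplinks available")
    return active_uplinks[port_id % len(active_uplinks)]

uplinks = ["vmnic0", "vmnic1"]  # hypothetical uplink names

# The same virtual port always lands on the same uplink while the team is unchanged.
assert uplink_for_port(7, uplinks) == uplink_for_port(7, uplinks)

# After a failover only the surviving uplink is left to carry the port's traffic.
survivor = uplink_for_port(7, ["vmnic0"])
```

Because the choice depends only on the port, a VM with one virtual adapter never spreads its traffic across 2 uplinks in this model.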
Route Based on IP hash
This NIC teaming policy selects an uplink based on a hash of the source and destination IP address of each packet. Link aggregation (grouping multiple physical adapters together to create 1 logical network pipe for higher bandwidth) is used with this policy, so the physical switch ports need to be configured as a link aggregation group (for example an EtherChannel), and all NICs in the team must be connected to the same physical switch, or to multiple switches in the same stack. This policy doesn’t always provide even load balancing, as the evenness of traffic distribution depends on the number of TCP/IP sessions to unique destinations.
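A rough sketch of why the spread depends on the number of unique destinations: the function below XORs the source and destination IPv4 addresses and takes the result modulo the uplink count. Treat this as a simplified illustration of the idea rather than the exact hash ESXi computes; the function name and addresses are made up.

```python
import ipaddress

def uplink_index_ip_hash(src: str, dst: str, uplink_count: int) -> int:
    """Map a source/destination IPv4 pair to an uplink index (simplified)."""
    src_int = int(ipaddress.IPv4Address(src))
    dst_int = int(ipaddress.IPv4Address(dst))
    return (src_int ^ dst_int) % uplink_count

# One VM talking to two different destinations can land on different uplinks...
first = uplink_index_ip_hash("10.0.0.10", "10.0.0.20", 2)   # index 0
second = uplink_index_ip_hash("10.0.0.10", "10.0.0.21", 2)  # index 1

# ...but every packet between the same pair always uses the same uplink.
repeat = uplink_index_ip_hash("10.0.0.10", "10.0.0.20", 2)
```

A VM that only ever talks to one destination gets no benefit from the extra uplinks in this model, which is the uneven distribution mentioned above.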
Route Based on source MAC hash
Similar to the IP based hash above, this creates a hash on the source MAC address to pick the uplink. This method has very little overhead.
Use explicit failover order
This is just how it’s described: it will use the first physical uplink that is listed under Active Adapters and is still up. Not really load balancing.
Route based on Physical NIC load (vDS only)
This will determine the best NIC to use based on the actual load of all the active physical NICs. This is the only true load balancing option, but it does have an overhead associated with it.
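As a sketch of what balancing on actual load could look like: the function below moves the smallest flow off any uplink that is over a utilisation threshold and onto the least loaded uplink. The 75% threshold, the function name and the sample numbers are all assumptions for illustration, and the real scheduler is more involved.

```python
from typing import Dict

def rebalance_once(assignment: Dict[str, str],
                   port_mbps: Dict[str, float],
                   capacity_mbps: Dict[str, float],
                   threshold: float = 0.75) -> Dict[str, str]:
    """Return a new port-to-uplink map with one port moved off each busy uplink."""
    new_assignment = dict(assignment)
    load = {uplink: 0.0 for uplink in capacity_mbps}
    for port, uplink in new_assignment.items():
        load[uplink] += port_mbps[port]

    for uplink, capacity in capacity_mbps.items():
        if load[uplink] <= threshold * capacity:
            continue  # this uplink is not saturated
        ports = [p for p, u in new_assignment.items() if u == uplink]
        if len(ports) < 2:
            continue  # moving the only port would just shift the problem
        port = min(ports, key=lambda p: port_mbps[p])
        target = min(capacity_mbps, key=lambda u: load[u] / capacity_mbps[u])
        if target == uplink:
            continue
        new_assignment[port] = target
        load[uplink] -= port_mbps[port]
        load[target] += port_mbps[port]
    return new_assignment

capacity = {"vmnic0": 1000.0, "vmnic1": 1000.0}   # hypothetical 1Gb uplinks
before = {"vm-a": "vmnic0", "vm-b": "vmnic0", "vm-c": "vmnic1"}
traffic = {"vm-a": 600.0, "vm-b": 300.0, "vm-c": 100.0}
after = rebalance_once(before, traffic, capacity)  # vm-b moves to vmnic1
```

The extra bookkeeping needed to measure load and move ports is the overhead this policy carries compared with the hash based ones.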
- Determine and apply Failover settings
I would wrap both Network Failover Detection and Failover Order into this.
Network Failover Detection is how the ESXi host determines if a physical network card is offline. There are 2 different methods that can be selected:
Link Status Only: This basically detects the direct physical link from the network card to the switch, so if the cable was unplugged it would show up as disconnected. It’s similar to your home computer: unplug the network cable and Windows will show you a big red X on the network. This is the same deal. The issue with this though is that in most data centers a server will plug into a distribution switch, which then plugs into the core or other switches downstream. If any of these downstream switches were to go offline, Link Status would still report fine, as the first switch it’s plugged into is up and running.
Beacon Probing: This sounds pretty cool, but it basically probes the network: it sends out beacon probes from the NICs in the team, and the information it receives back determines the network status. This setting can pick up misconfigurations as well as physical failures. You cannot use this if the IP Hash load balancing policy is being used.
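The difference between the 2 methods can be sketched with a toy model. The `Uplink` class, its fields and the mode strings are invented for this example and are not part of any VMware API.

```python
from dataclasses import dataclass

@dataclass
class Uplink:
    name: str
    link_up: bool           # state of the cable/port to the first switch
    beacons_received: bool  # whether probes sent by teammates arrived here

def is_healthy(uplink: Uplink, detection: str) -> bool:
    """Decide whether an uplink counts as usable under a detection mode."""
    if detection == "link_status":
        return uplink.link_up
    if detection == "beacon_probing":
        # Beacon probing looks at probe results in addition to link state.
        return uplink.link_up and uplink.beacons_received
    raise ValueError(f"unknown detection mode: {detection}")

# The cable is fine, but a switch further downstream has died.
vmnic1 = Uplink("vmnic1", link_up=True, beacons_received=False)

seen_by_link_status = is_healthy(vmnic1, "link_status")        # True
seen_by_beacon_probing = is_healthy(vmnic1, "beacon_probing")  # False
```

Only the beacon probing check notices the downstream failure in this model, which is the gap in Link Status Only described above.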
Failover Order: this is how it picks which adapters are used in the team and what the failover order is in case of failures. There are 3 sections in this policy:
Active Adapters: These are all the current actively used adapters.
Standby Adapters: Any adapter here is for standby only. It will only be used when there has been a physical failure of an active adapter.
Unused Adapters: Anything listed here will not be used at all.
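The 3 sections can be sketched as a small selection function. This is a simplified model under the assumption that one standby adapter is promoted for each failed active adapter, in list order; the function and adapter names are hypothetical.

```python
from typing import List, Set

def usable_uplinks(active: List[str], standby: List[str],
                   unused: List[str], failed: Set[str]) -> List[str]:
    """Return the adapters that carry traffic given the current failures."""
    healthy_active = [nic for nic in active if nic not in failed]
    missing = len(active) - len(healthy_active)
    # Promote healthy standby adapters, in order, to cover failed actives.
    spares = [nic for nic in standby if nic not in failed][:missing]
    # Adapters in the unused list are deliberately never returned.
    return healthy_active + spares

normal = usable_uplinks(["vmnic0", "vmnic1"], ["vmnic2"], ["vmnic3"], set())
after_failure = usable_uplinks(["vmnic0", "vmnic1"], ["vmnic2"], ["vmnic3"], {"vmnic1"})
```

In the second call `vmnic2` steps in for the failed `vmnic1`, while `vmnic3` stays idle in both cases.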
- Configure explicit failover to conform with VMware best practices
As the title suggests, Explicit Failover is VMware’s best practice, but to be honest, no matter where I have worked it’s never used except in the case where 2 x 10Gb NICs are assigned to a vSwitch that hosts both vMotion and Management. Other than that I don’t see it, and I wouldn’t use it, as it only allows a single physical NIC to be used at any given time. So in situations where you have 5 x 1Gb NICs for the VM network, why use only one when you can balance the traffic over all 5?
Either way, the idea behind this is that you set up a switch with, say, 2 physical NICs like the one below.
Then you would configure the Management port group to use physical NIC 1 as active and put physical NIC 2 into standby. You would then configure the vMotion port group the exact opposite, like below.
This way the 2 different types of traffic are kept separate unless there is a failover event.
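The mirrored active/standby layout described above can be written down as data. This is only a sketch: the dictionary layout, the `uplink_in_use` helper and the `vmnic0`/`vmnic1` names (standing in for physical NIC 1 and 2) are assumptions for illustration.

```python
from typing import FrozenSet, Optional

# Mirrored explicit failover order for the two port groups.
TEAMING = {
    "Management": {"active": ["vmnic0"], "standby": ["vmnic1"]},
    "vMotion": {"active": ["vmnic1"], "standby": ["vmnic0"]},
}

def uplink_in_use(port_group: str,
                  failed: FrozenSet[str] = frozenset()) -> Optional[str]:
    """Return the first adapter in failover order that has not failed."""
    policy = TEAMING[port_group]
    for nic in policy["active"] + policy["standby"]:
        if nic not in failed:
            return nic
    return None  # every adapter in the team is down

# Normal operation: each traffic type has its own NIC.
mgmt_nic = uplink_in_use("Management")   # vmnic0
vmotion_nic = uplink_in_use("vMotion")   # vmnic1

# If vmnic0 fails, both port groups share vmnic1 until it is repaired.
mgmt_after_failure = uplink_in_use("Management", frozenset({"vmnic0"}))
```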
- Configure port groups to properly isolate network traffic
Wherever possible it is best to keep networks separate. Generally, most people will use VLANs to achieve this, as it creates a logical separation on the same physical network.
But in some cases, depending on security requirements, physical separation is not negotiable.
Physical separation is achieved by creating separate vSwitches (Standard or Distributed) and assigning different physical NICs to each. These NICs would then be plugged into separate networks.
VLAN separation can be achieved by creating a single vSwitch (Standard or Distributed) and then configuring each port group with a different VLAN. The VLANs must also be configured downstream on the physical network so the traffic can be routed.
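For the VLAN approach, a quick sanity check of a planned layout might look like the sketch below. The port group names and VLAN IDs are made up, and the check only covers the two rules assumed here: each port group gets its own VLAN, and tagged IDs fall in the 1 to 4094 range.

```python
from typing import Dict, List

# Hypothetical port groups on a single vSwitch, one VLAN each.
PORT_GROUPS = {"Management": 10, "vMotion": 20, "VM Network": 30}

def validate_vlans(port_groups: Dict[str, int]) -> List[str]:
    """Return a list of problems found in a port group to VLAN mapping."""
    problems = []
    seen: Dict[int, str] = {}
    for name, vlan in port_groups.items():
        if not 1 <= vlan <= 4094:
            problems.append(f"{name}: VLAN {vlan} is outside 1-4094")
        elif vlan in seen:
            problems.append(f"{name}: VLAN {vlan} is already used by {seen[vlan]}")
        else:
            seen[vlan] = name
    return problems

issues = validate_vlans(PORT_GROUPS)  # empty list for the layout above
```

The same VLAN IDs still have to exist on the physical switch ports the uplinks plug into, which this check cannot see.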
Below would be 2 examples