Wednesday, 17 February 2016

Large Packet Loss At Guest OS Level in VMware ESXi When Using VMXNET3


Symptoms

When using the VMXNET3 driver on ESXi 4.x, 5.x or 6.0, you see significant packet loss during periods of very high traffic bursts. Symptoms may include one or more of the following:

  • Poor performance
  • Packet loss
  • Network latency
  • Slow data transfer

Causes

This issue occurs when packets are dropped during high traffic bursts. The drops can result from a lack of receive and transmit buffer space, or from receive traffic that is speed-constrained, for example by a traffic filter.

Resolutions

#1 Increase Windows Buffer Settings:


  1. Click Start > Control Panel > Device Manager.
  2. Right-click vmxnet3 and click Properties.
  3. Click the Advanced tab. 
  4. Click Small Rx Buffers and increase the value. The default value is 512 and the maximum is 8192.
  5. Click Rx Ring #1 Size and increase the value. The default value is 1024 and the maximum is 4096.


Notes:
  • These changes take effect immediately, so no reboot is required. However, applications that are sensitive to TCP session disruption are likely to fail and may need to be restarted. This includes RDP, so it is better to make the change from the virtual machine console.
  • This issue is seen in the Windows guest OS with a VMXNET3 vNIC. It can occur with versions besides 2008 R2.
  • Increase the values of Small Rx Buffers and Rx Ring #1 gradually. A drastic increase raises the memory overhead on the host and can cause performance issues if resources are close to capacity.
  • If this issue occurs on only 2-3 virtual machines, set Small Rx Buffers and Rx Ring #1 to their maximum values. Monitor virtual machine performance to see if this resolves the issue.
  • The Small Rx Buffers and Rx Ring #1 variables affect only non-jumbo frame traffic on the adapter.
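
One way to read the advice about raising the values gradually is as a doubling schedule from the default up to the maximum. The sketch below is only an illustration: the doubling rule and the helper name `step_schedule` are assumptions, not part of the original guidance, and the driver's drop-down list may offer intermediate values that are not shown here.

```python
def step_schedule(default, maximum):
    """Return the values to try in order, doubling from the default up to the maximum."""
    values = []
    value = default
    while value < maximum:
        # Doubling is an assumed step size; never go past the stated maximum.
        value = min(value * 2, maximum)
        values.append(value)
    return values


# Defaults and maximums as stated in the steps above.
print(step_schedule(512, 8192))   # Small Rx Buffers
print(step_schedule(1024, 4096))  # Rx Ring #1 Size
```

Apply one step at a time and check for packet loss and host memory pressure before moving to the next value.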

#2 Change the Virtual Machine network adapter type to E1000:

To change the network adapter on a virtual machine:
  1. Right-click the virtual machine and click Edit Settings.
  2. Click Add.
  3. Click Ethernet Adapter and click Next.
  4. In the Type field, select E1000.
  5. Select the desired Network Connection label.
  6. Select Connect at power on and click Next.
  7. Click Finish, then click OK.

At this point, you may remove the old adapter and configure the new network adapter with your desired network settings:
  1. Right-click the virtual machine and click Edit Settings.
  2. Click the original VMXNET3 network adapter.
  3. Make a note of the selected Network Connection label.
  4. Click Remove.
  5. Click the new E1000 network adapter.
  6. Change the Network Connection label to the value you noted in step 3.
  7. Click OK.

Note: The virtual machine may require a reboot for the changes to take effect.

#3 Disable the TCP & UDP Checksum Offloading feature in Windows OS:

The issue may be caused by the Windows TCP stack offloading work from the CPU to the network interface. To resolve this issue, disable the TCP Checksum Offload feature and enable RSS on the VMXNET3 driver.

Open the command prompt as administrator and run these commands:
netsh int tcp set global chimney=disabled
netsh int tcp set global autotuninglevel=disabled
netsh int tcp set global congestionprovider=none
netsh int tcp set global ecncapability=disabled
netsh int ip set global taskoffload=disabled
netsh int tcp set global timestamps=disabled
netsh int tcp set global dca=disabled

To validate, type:
netsh int tcp show global
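
If the validation needs to be repeated on several guests, the output of the show command can be checked by a script instead of by eye. The following is a minimal sketch, not a definitive tool: the function names are hypothetical, and the labels in `EXPECTED` are assumptions based on typical English-language output, so they may need adjusting for your Windows version or locale.

```python
import subprocess

# Labels are assumed from typical English-language netsh output and may
# differ between Windows versions; adjust them to match what your guest prints.
EXPECTED = {
    "Chimney Offload State": "disabled",
    "Receive Window Auto-Tuning Level": "disabled",
    "ECN Capability": "disabled",
}


def parse_tcp_globals(text):
    """Turn 'Label : value' lines into a dict, ignoring all other lines."""
    settings = {}
    for line in text.splitlines():
        label, sep, value = line.partition(" : ")
        if sep:
            settings[label.strip()] = value.strip()
    return settings


def mismatches(settings, expected=EXPECTED):
    """Return {label: (expected, actual)} for settings that differ or are missing."""
    return {
        label: (want, settings.get(label))
        for label, want in expected.items()
        if settings.get(label) != want
    }


def read_tcp_globals():
    """Run the show command; only meaningful inside the Windows guest."""
    result = subprocess.run(
        ["netsh", "int", "tcp", "show", "global"],
        capture_output=True, text=True, check=True,
    )
    return parse_tcp_globals(result.stdout)
```

Inside the guest, `mismatches(read_tcp_globals())` returns an empty dict when every listed setting has the expected value.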

#4 Enable Receive Side Scaling (RSS):

This is required only when the virtual machine has multiple vCPUs.

Open the command prompt as administrator and run this command:
netsh int tcp set global rss=enabled

To validate, type:
netsh int tcp show global
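
Because RSS only matters when the guest has more than one vCPU, a quick check of the CPU count can decide whether the command is worth running. This is a small sketch under that assumption; the helper name `rss_command` is hypothetical.

```python
import os

RSS_COMMAND = "netsh int tcp set global rss=enabled"


def rss_command(vcpus):
    """Return the RSS command when the guest has more than one vCPU, else None."""
    return RSS_COMMAND if vcpus and vcpus > 1 else None


# os.cpu_count() reports the logical CPUs visible to the guest OS (or None if unknown).
command = rss_command(os.cpu_count())
print(command or "Single vCPU: enabling RSS is not required")
```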

References