Thursday, October 11, 2012

VMware ESXi 4/5 APD Lockup Problem

Problem: You click Rescan All... in the vSphere Client and the ESXi host becomes unmanageable because of a dead LUN, a downed path, or an offlined volume, i.e. an All-Paths-Down (APD) condition. I have seen this with iSCSI; I don't know whether other storage types are affected.  The only fix is to hard-reboot the server.

VMware has a long KB article on how to do this cleanly (http://kb.vmware.com/selfservice/microsites/search.do?language=en_US&cmd=displayKC&externalId=2004605), but the procedure is error-prone and most likely will not work for your environment.  I'm not about to connect to 6 different hosts and run all this nonsense just to make sure VMware cleanly unmounts a volume.

The quick fix: Go to Storage Adapters and open the properties of the iSCSI Software Adapter.  Click the Static Discovery tab.  Remove the dead connections.  Then you can rescan without the host locking up.  No other method has proven reliable for me.
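
The same cleanup could in principle be scripted instead of clicked through on every host. Below is a minimal sketch, assuming ESXi 5.x, where the `esxcli iscsi adapter discovery statictarget remove` command is available (ESXi 4.x uses different tooling). The adapter name `vmhba33`, the target address, and the IQN are placeholders for illustration, and the helper function is hypothetical. The script only prints the commands; it does not connect to any host.

```python
import shlex


def static_target_remove_commands(adapter, dead_targets):
    """Build one esxcli command string per dead static iSCSI target.

    adapter      -- software iSCSI adapter name (placeholder, e.g. "vmhba33")
    dead_targets -- iterable of (address, iqn) pairs for the dead connections
    """
    commands = []
    for address, iqn in dead_targets:
        args = [
            "esxcli", "iscsi", "adapter", "discovery", "statictarget", "remove",
            "--adapter=" + adapter,
            "--address=" + address,
            "--name=" + iqn,
        ]
        # Quote each argument so the line can be pasted into a shell safely.
        commands.append(" ".join(shlex.quote(a) for a in args))
    return commands


if __name__ == "__main__":
    # Placeholder values; substitute the dead entries from the Static Discovery tab.
    dead = [("10.0.0.5:3260", "iqn.2001-04.com.example:storage.lun1")]
    for line in static_target_remove_commands("vmhba33", dead):
        print(line)
```

Review the printed lines before running them in an SSH session on each host; `esxcli iscsi adapter discovery statictarget list` shows the entries currently configured.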

Update: removing the dead static discovery entries still didn't fix the issue.  The only real way to overcome this problem is to upgrade to ESXi 5.1, where VMware finally fixed it.