What I learned today – HA Split Brain, USB “Support”, Fix Broken OVF Exports…
Following on from this week’s series – today I learned 3 new things – and I wasn’t even supposed to be working! After getting up at 4.00am to take my partner and her daughter to the airport, I thought I’d take it easy by going back to bed, and then later catching up with some telly on me Sky+ box… But even by doing as little as that I learned a couple of tidbits – mainly by keeping a watchful eye on my email and the forums.
HA Split Brain (Split Brian?):
Split brain is an HA situation where an ESX host becomes “orphaned” from the rest of the cluster because its primary service console network has failed. As you might know, the COS network is used in the process of checking whether an ESX host has suffered an untimely demise. If you fail to protect the COS network by giving vSwitch0 two NICs, or by adding a 2nd COS network to, say, your VMotion switch, undesired consequences can occur. Anyway, the time for detecting split brain used to be 15 seconds; for some reason this has changed to 12 seconds. I’m not 100% sure why, or whether the underlying value has in fact changed – or VMware has merely corrected its own documentation. You see, it’s possible to get split brain in Vi3.5 if the network goes down for more than 12 seconds, but comes back up on the 13th, 14th or 15th second. I guess I will have to do some research on this one. Of course, the duration can be changed – and split brain is a trivial matter if you take the necessary network redundancy steps…
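To make the timing window a bit more concrete, here is a minimal Python sketch. It is only a toy model of the two numbers mentioned above (12 and 15 seconds), not a description of how HA is actually implemented; the function name, parameter names and return labels are all made up for illustration, and the 12-second threshold is left as a parameter precisely because I'm not sure it is the right value.

```python
def classify_outage(outage_seconds, isolation_after=12, failure_detection=15):
    """Place a COS network outage relative to the two timers discussed above.

    isolation_after and failure_detection are assumed values taken from the
    text (12 and 15 seconds); adjust them if your cluster is configured
    differently.
    """
    if outage_seconds <= isolation_after:
        # Network came back before the shorter timer expired.
        return "recovered before isolation threshold"
    if outage_seconds <= failure_detection:
        # Down for more than 12 seconds, back by the 15th: the awkward window.
        return "inside the split-brain window"
    # Network stayed down past the longer timer.
    return "outage outlasted failure detection"


if __name__ == "__main__":
    for seconds in (5, 12, 13, 15, 20):
        print(seconds, classify_outage(seconds))
```

The point of the sketch is simply that the gap between the two timers is where the trouble lives: anything that shrinks that gap, or stops the COS network going down in the first place, shrinks the problem.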
USB “Support”:
This is an odd one. In a VM on ESX4 you can add a USB Controller. The VM must be hardware level 7 for this to be the problem. The trouble is – it doesn’t do anything. And no amount of adding or loading drivers at the COS will allow you to plug a USB device into an ESX host, and have it appear in the VM. You might ask: if it doesn’t work, does that mean it’s a bug? No, it’s not a bug. It’s just there, but you can’t do owt with it. NICE! The best we can say at the moment is that it’s been put there for some future technology which may, or equally may not, become available. As ever, I personally feel some USB-over-IP device is the best way to go.
Fix Broken OVF Exports:
This has happened to me twice now. I power down a perfectly functioning virtual appliance (in my case the UDA 2.0, which is currently in beta) and export it as an OVF Template using vCenter4. To check the export has been successful I re-import it, only to get to 100% complete and end with an unpleasant error message: “Failed to deploy OVF package: The remote server returned an error: (500) Internal Server error”
The last time this happened to me I twittered about it – finally some helpful guy who follows me on twitter helped me fix the problem. [I only wish I could remember his name because he deserves the credit for this fix!] It’s not clear precisely why it happens, but it appears that occasionally the Export function of vCenter4 creates an OVF file that misreports the size of the virtual disk. A quick edit to the OVF file is all that is needed. So below is my “bad” OVF File:
<?xml version="1.0" encoding="UTF-8"?>
<!-- Generated by VMware VirtualCenter Server, User: Administrator, UTC time: 2009-07-21T13:53:10.761609Z -->
<Envelope vmw:buildId="build-162856" xmlns="http://schemas.dmtf.org/ovf/envelope/1" xmlns:cim="http://schemas.dmtf.org/wbem/wscim/1/common" xmlns:ovf="http://schemas.dmtf.org/ovf/envelope/1" xmlns:rasd="http://schemas.dmtf.org/wbem/wscim/1/cim-schema/2/CIM_ResourceAllocationSettingData" xmlns:vmw="http://www.vmware.com/schema/ovf" xmlns:vssd="http://schemas.dmtf.org/wbem/wscim/1/cim-schema/2/CIM_VirtualSystemSettingData" xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance">
<References>
<File ovf:href="uda20-beta-build5-disk1.vmdk" ovf:id="file1" ovf:size="207287808" />
</References>
<DiskSection>
<Info>Virtual disk information</Info>
<Disk ovf:capacity="1" ovf:capacityAllocationUnits="byte * 2^30" ovf:diskId="vmdisk1" ovf:fileRef="file1" ovf:format="http://www.vmware.com/interfaces/specifications/vmdk.html#streamOptimized" />
</DiskSection>
The bad part is the pair of ovf:capacity and ovf:capacityAllocationUnits values on the Disk element. To correct this I found out the actual size of the -flat.vmdk by using ls at the Service Console:
I then modify the .OVF file accordingly:
<?xml version="1.0" encoding="UTF-8"?>
<!-- Generated by VMware VirtualCenter Server, User: Administrator, UTC time: 2009-07-21T13:53:10.761609Z -->
<Envelope vmw:buildId="build-162856" xmlns="http://schemas.dmtf.org/ovf/envelope/1" xmlns:cim="http://schemas.dmtf.org/wbem/wscim/1/common" xmlns:ovf="http://schemas.dmtf.org/ovf/envelope/1" xmlns:rasd="http://schemas.dmtf.org/wbem/wscim/1/cim-schema/2/CIM_ResourceAllocationSettingData" xmlns:vmw="http://www.vmware.com/schema/ovf" xmlns:vssd="http://schemas.dmtf.org/wbem/wscim/1/cim-schema/2/CIM_VirtualSystemSettingData" xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance">
<References>
<File ovf:href="uda20-beta-build5-disk1.vmdk" ovf:id="file1" ovf:size="207287808" />
</References>
<DiskSection>
<Info>Virtual disk information</Info>
<Disk ovf:capacity="2040109056" ovf:capacityAllocationUnits="byte" ovf:diskId="vmdisk1" ovf:fileRef="file1" ovf:format="http://www.vmware.com/interfaces/specifications/vmdk.html#streamOptimized" />
</DiskSection>
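If you'd rather not hand-edit the file, the same change could be scripted. What follows is only a rough Python sketch using the standard library's xml.etree.ElementTree, not anything VMware ships: the function names capacity_in_bytes and fix_disk_capacity are my own inventions, and the sketch assumes the disk element looks like the one above. Be aware that ElementTree rewrites namespace prefixes and drops the "Generated by" comment when it writes the XML back out, so for a one-off fix the manual edit is arguably simpler.

```python
import xml.etree.ElementTree as ET

# Namespace URI taken from the Envelope element of the OVF file above.
OVF_NS = "http://schemas.dmtf.org/ovf/envelope/1"


def capacity_in_bytes(capacity, units="byte"):
    """Convert an ovf:capacity / ovf:capacityAllocationUnits pair to bytes.

    Only handles plain "byte" and the "byte * 2^N" form seen above.
    """
    compact = units.replace(" ", "")
    if compact == "byte":
        return int(capacity)
    prefix = "byte*2^"
    if compact.startswith(prefix):
        return int(capacity) * 2 ** int(compact[len(prefix):])
    raise ValueError("unsupported allocation units: %r" % units)


def fix_disk_capacity(ovf_xml, disk_id, capacity_bytes):
    """Return OVF XML with the given disk's capacity expressed in plain bytes."""
    # Keep the "ovf" prefix on output instead of an auto-generated one.
    ET.register_namespace("ovf", OVF_NS)
    root = ET.fromstring(ovf_xml)
    for disk in root.iter("{%s}Disk" % OVF_NS):
        if disk.get("{%s}diskId" % OVF_NS) == disk_id:
            disk.set("{%s}capacity" % OVF_NS, str(capacity_bytes))
            disk.set("{%s}capacityAllocationUnits" % OVF_NS, "byte")
    return ET.tostring(root, encoding="unicode")


if __name__ == "__main__":
    # What the "bad" file claims: 1 * 2^30 bytes, i.e. 1 GiB.
    print(capacity_in_bytes("1", "byte * 2^30"))
    fragment = (
        '<Envelope xmlns="%s" xmlns:ovf="%s"><DiskSection>'
        '<Disk ovf:capacity="1" ovf:capacityAllocationUnits="byte * 2^30" '
        'ovf:diskId="vmdisk1" /></DiskSection></Envelope>' % (OVF_NS, OVF_NS)
    )
    print(fix_disk_capacity(fragment, "vmdisk1", 2040109056))
```

The capacity_in_bytes helper also shows why the import falls over: the bad file claims 1 × 2^30 = 1073741824 bytes, whereas the disk is really 2040109056 bytes. In practice you would feed fix_disk_capacity the byte count of the -flat.vmdk (for example from os.path.getsize) rather than typing it in.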

July 21st, 2009 at 8:33 pm
I have not yet tested USB on ESX4 as you’ve described but even when using USB over IP you have to remember to provide drivers for the guest OS. When installing Windows OS on VMware this file is not installed. This may be where you’ve run into trouble.
For a Windows VM, copy the usbd.sy_ file from the I386 folder on the OS disk to \system32\drivers\usbd.sys. Give the guest a reboot and then try connecting the USB device.
I hope this helps. If not I hope it helps someone else if they have trouble with a USB-over-IP device.
Thanks for all the great posts.
July 22nd, 2009 at 3:42 am
@Mike – Regarding the USB support; I wonder if this has to do with future VDI support in VMware View possibly?
Carlo.
July 22nd, 2009 at 8:47 am
@Carlo. Yes, but… View already has a USB redirection service… and any USB calls inside a VDI VM would have to be redirected to a network layer to be accessible remotely. Sticking my neck out – the USB controller might become part of VM Direct Path… which allows “direct” access to the underlying PCI bus of the ESX host – thus allowing people to virtualize stuff with hardware that the VM doesn’t provide. It might in the future serve some kind of security function – where the VM talks to the USB security dongle, on an ESX host locked in a cage – a scenario where VMotion wouldn’t be desired….
July 22nd, 2009 at 9:14 am
save a few keystrokes by using “ll” instead of “ls -l” :0)
July 22nd, 2009 at 12:52 pm
it’s the das.failuredetectiontime -2 seconds Mike. (so in your example it’s the 13th second…) I’m writing an article on this subject right now…
Duncan
July 22nd, 2009 at 1:18 pm
Hi Mike.
I think it was me helping out on twitter. The problem, as I remember it, was that the values in the vmdk disk were changed “behind” the back of the VM by using vmdk-tools. The running “object” does not reread the values in the vmdk unless you tell it to by using the vimsh tools:
vimsh -e 'vmsvc/reload VMID'
you can get the VMIDs by running:
vimsh -e 'vmsvc/getallvms'
If you do that and then export the OVF by either using ovftool or the UI in the client you should be OK.
Changing the values in vmdk or vmx files directly is not recommended; everyone should use the API.
July 22nd, 2009 at 3:41 pm
BTW. Thanks Eske. I’ve been thinking about the export/import process, and I’m wondering more and more if it is something to do with how I build my VM before the export. I think it’s me, not VMware. First I create a dummy VM called uda20-beta. Then from the COS I delete the .vmdk file. Then I download the UDA from Carl’s website (ultimatedeployment.org), and import it using vmkfstools -i. Then I create the OVF file. I have a sneaking suspicion that the round-the-back manipulation of the vmdk means the VMX file still thinks the “old” disk is there and reports the old disk size rather than the imported disk size. I will try building the VM in a different way next time… The repeated error doesn’t necessarily indicate a bug as such, but perhaps me creating the same problem in the same way…