Thursday, October 15, 2009

NetApp mbrscan and mbralign for Virtual Machine Alignment In-Depth

Alignment of VMware virtual machines has been an issue for quite some time. The issue exists no matter who the storage vendor is, but I will use NetApp because it is what I know. Here are some links to get you up to speed in case you don't fully understand the situation:

Link to VMware document on alignment
Link to NetApp document on alignment

What has NetApp done about the situation? I hear that NetApp will shortly be releasing a tool that plugs into vCenter, specifically for vSphere 4.0 and ESXi 4.0. In the meantime, if you are still on ESX 3.5 (or vSphere with a Service Console), there is another answer that you can use today. It will not work on ESXi, since the tools need a service console.

Eric Forgette at NetApp created a set of tools about a year ago that has matured and found its way into the NetApp VMware Host Utilities Kit version 5.1. If you do not have this loaded on your ESX/vSphere servers and you are connecting to NetApp, go load it NOW! (A reboot is required for the settings to take effect.) Most of the following is research I conducted alongside conversations directly with Eric in this thread on NetApp Communities.

Included in this tool are two executables, mbrscan (scans a vmdk for alignment) and mbralign (performs the alignment). The default installation location is /opt/netapp/santools.
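
To make "scans a vmdk for alignment" concrete, here is a minimal Python sketch of the underlying idea. It is not how mbrscan is implemented. It reads the four primary partition entries from an MBR (the first 512-byte sector of a disk image) and tests whether each partition's starting byte offset is a multiple of 4096. The 4 KB block size, the function names `parse_mbr`, `is_aligned` and `make_mbr`, and the synthetic test sectors are all assumptions made for illustration. The sketch ignores logical partitions inside extended partitions.

```python
import struct

SECTOR_SIZE = 512   # bytes per sector in the MBR addressing scheme
BLOCK_SIZE = 4096   # assumed 4 KB storage block size for this sketch


def parse_mbr(sector):
    """Return (type, start_lba, num_sectors) for each non-empty primary entry."""
    if len(sector) < 512 or sector[510:512] != b"\x55\xaa":
        raise ValueError("not a valid MBR sector")
    parts = []
    for i in range(4):
        # The partition table starts at byte 446; each entry is 16 bytes.
        entry = sector[446 + 16 * i:446 + 16 * (i + 1)]
        ptype = entry[4]
        start_lba, num_sectors = struct.unpack("<II", entry[8:16])
        if ptype != 0:
            parts.append((ptype, start_lba, num_sectors))
    return parts


def is_aligned(start_lba):
    """True if the partition's starting byte offset falls on a block boundary."""
    return (start_lba * SECTOR_SIZE) % BLOCK_SIZE == 0


def make_mbr(ptype, start_lba, num_sectors):
    """Build a synthetic MBR with a single primary partition (demo helper)."""
    sector = bytearray(512)
    entry = bytearray(16)
    entry[4] = ptype
    entry[8:16] = struct.pack("<II", start_lba, num_sectors)
    sector[446:462] = entry
    sector[510:512] = b"\x55\xaa"
    return bytes(sector)


for start in (63, 64):
    ptype, lba, _ = parse_mbr(make_mbr(0x07, start, 2048))[0]
    state = "aligned" if is_aligned(lba) else "NOT aligned"
    print("start sector %d, offset %d bytes: %s" % (lba, lba * SECTOR_SIZE, state))
```

A partition starting at sector 63 begins at byte 32256, which is not a multiple of 4096, so it is reported as not aligned. Starting at sector 64 gives byte 32768, which is exactly 8 blocks in.
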

While the readme does a good job of going over the basics, there are a number of caveats to running the tools correctly. I will go into each executable in depth, but before I go down into the weeds, you need to know when NOT to run them!

You either cannot use the alignment tool, or need to take extra care with it, in the following cases:
  • Windows Server 2008 is already aligned if the machine was created as a Windows Server 2008 machine. If the machine was upgraded from Windows Server 2003, it will not be aligned.
  • Citrix servers are not supported because they remap the C: drive
  • Windows Dynamic Disks are not supported and will be corrupted if an alignment is performed (but mbrscan will detect them - see below)
  • Linux LVM volumes are not supported (mbrscan may NOT detect all LVM partitions)
  • Windows Server 2003 non-boot disks that have been added (d:, e:, etc) will need to be remapped in Computer Management. The drive letter will be lost on alignment.
  • GRUB booted Linux and Solaris will need to have GRUB reinstalled after alignment
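
Two of the caveats above can be recognized from the partition type byte in the MBR. The following Python sketch is a hypothetical helper, not part of the NetApp tools, that maps a type byte to a piece of advice. The values used are the commonly documented MBR type codes: 0x42 for Windows dynamic disks, 0x8E for Linux LVM, and 0x05, 0x0F and 0x85 for extended partitions. The function name and the advice strings are my own assumptions. An LVM volume that does not carry the 0x8E type will not be caught this way.

```python
DYNAMIC_DISK = 0x42                 # Windows dynamic disk
LINUX_LVM = 0x8E                    # Linux LVM physical volume
EXTENDED_TYPES = (0x05, 0x0F, 0x85) # extended partition containers


def alignment_advice(ptype):
    """Return a short, human-readable hint for a partition type byte."""
    if ptype == DYNAMIC_DISK:
        return "do not align: Windows dynamic disk"
    if ptype == LINUX_LVM:
        return "do not align: Linux LVM"
    if ptype in EXTENDED_TYPES:
        # Logical partitions inside are not listed in the primary table.
        return "inspect manually: extended partition"
    return "candidate: still check the other caveats"


for ptype in (0x07, 0x42, 0x83, 0x8E, 0x05):
    print("0x%02x: %s" % (ptype, alignment_advice(ptype)))
```
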
With all that out of the way, it is a basic two step process: 1. run mbrscan, 2. run mbralign on machines as needed.

Step #1 - mbrscan

  • In order for the mbrscan to give reliable results, the machine must either be powered off or have a VMware snapshot!
  • I have a simple script that I put together that just takes a VMware snapshot on all machines on an ESX/vSphere host.
  • I then execute mbrscan using the scan all virtual machines parameter: mbrscan --all
  • Once I have the scan results I need, I execute another script to remove the VMware snapshots I just created for all machines on the host
  • NOTE: Windows Dynamic Disks will report a partition type of: unknown - 0x42. Do Not Align These Partitions!
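
The snapshot, scan and cleanup sequence above can be sketched as a small Python command planner. This is not the script I use, only an illustration of the order of operations. It builds the commands and prints them rather than running anything. The `vmware-cmd` subcommands `createsnapshot` (with name, description, quiesce and memory arguments) and `removesnapshots` are written from memory of the ESX 3.5 service console and should be treated as assumptions to verify against `vmware-cmd` help on your host. The snapshot name and the .vmx paths are hypothetical.

```python
MBRSCAN = "/opt/netapp/santools/mbrscan"  # default install location


def registered_vms(listing):
    """Extract .vmx paths from text such as the output of `vmware-cmd -l`."""
    lines = (line.strip() for line in listing.splitlines())
    return [line for line in lines if line.endswith(".vmx")]


def scan_plan(vmx_paths, snap_name="mbrscan-temp"):
    """Return the ordered list of commands: snapshot all, scan, clean up."""
    plan = []
    for vmx in vmx_paths:
        # Assumed argument order: name, description, quiesce (0/1), memory (0/1).
        plan.append(["vmware-cmd", vmx, "createsnapshot",
                     snap_name, "temporary snapshot for mbrscan", "0", "0"])
    plan.append([MBRSCAN, "--all"])
    for vmx in vmx_paths:
        # Caution: this is expected to remove ALL snapshots of the VM,
        # including any that existed before the scan.
        plan.append(["vmware-cmd", vmx, "removesnapshots"])
    return plan


# Hypothetical listing, standing in for real `vmware-cmd -l` output.
sample = """/vmfs/volumes/datastore1/web01/web01.vmx
/vmfs/volumes/datastore1/db01/db01.vmx
"""

for command in scan_plan(registered_vms(sample)):
    print(" ".join(command))
```

Printing the plan first makes it easy to review before running anything by hand. Powered-off machines do not need a snapshot, so a real script would probably skip them.
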

Step #2 - mbralign

  • In order to perform an alignment, ALL VMware snapshots MUST be removed!
  • In order to perform the alignment, you NEED an amount of free space equal to the size of the vmdk. mbralign will make a backup of the file as the first step. This backup file has the extension -mbralign-backup.
  • In order to perform the alignment, the virtual machine MUST be powered off!
  • If the virtual machine has multiple vmdks, only one can be aligned at a time!
  • Execute mbralign against the vmdk - I usually get about 1-2GB per minute speed
  • Boot up the virtual machine. If it works, delete the -mbralign-backup file
  • On Windows systems it will ask you to reboot one more time because it detects the hard disks as new hard disks
  • If it doesn't work, run mbralign again and it will detect the -mbralign-backup and ask you if you would like to restore the file. Very Nice!
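
Two of the checks above are simple arithmetic, so here is a minimal Python sketch of a pre-flight check: is there at least as much free space as the size of the vmdk, and roughly how long will the alignment take at 1-2GB per minute? The function names are hypothetical and the throughput range is just my observed figure, not a guarantee. The `preflight` helper assumes that the free space reported for the vmdk's directory reflects the datastore where the backup will be written.

```python
import os
import shutil

GB = 1024 ** 3


def has_room_for_backup(vmdk_size_bytes, free_bytes):
    """mbralign backs up the vmdk first, so free space must cover its size."""
    return free_bytes >= vmdk_size_bytes


def estimated_minutes(vmdk_size_bytes, gb_per_minute=(1.0, 2.0)):
    """Return (best_case, worst_case) minutes for the given throughput range."""
    slow, fast = gb_per_minute
    size_gb = vmdk_size_bytes / GB
    return size_gb / fast, size_gb / slow


def preflight(flat_vmdk_path):
    """Check a real file: returns (enough_space, best_minutes, worst_minutes)."""
    size = os.path.getsize(flat_vmdk_path)
    folder = os.path.dirname(os.path.abspath(flat_vmdk_path))
    free = shutil.disk_usage(folder).free
    best, worst = estimated_minutes(size)
    return has_room_for_backup(size, free), best, worst


# Worked example with plain numbers: a 40 GB vmdk and 55 GB free.
best, worst = estimated_minutes(40 * GB)
print("room for backup:", has_room_for_backup(40 * GB, 55 * GB))
print("estimated time: %.0f to %.0f minutes" % (best, worst))
```

For a 40 GB vmdk this works out to somewhere between 20 and 40 minutes of downtime, on top of the extra reboot Windows asks for.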

15 comments:

Eric Forgette said...

Hi Aaron,
Thanks for putting that post together. I think it's a great resource. While I can't comment on specifics, 'shortly' is probably overly optimistic with regard to a plugin. You can, however, use mbralign and mbrscan on ESX 4.0. In fact, the NetApp VMware Host Utilities Kit is fully supported on ESX 4.0.
Thanks again for all your work on this!
Cheers,
-Eric

Aaron Delp said...

Thanks for the comment Eric!

The new tool is out:

http://blogs.netapp.com/storage_nuts_n_bolts/2009/10/netapp-virtual-storage-console-vsc-for-esx-ready-for-download.html

I am hoping to play with it soon.

bhavani said...

You can do Citrix servers as well,
but there is a procedure: first change the drive letters in the registry, then change the userinit.exe path.

Anonymous said...

Is there a PowerCLI script to shut down the VM and kick off the mbralign process on the VM?

Masa said...

Hi, Aaron,

I am getting a "Device busy" message after taking the snapshot for a VM for mbrscan. Does the VM have to be shut down after taking a snapshot? I took the snapshot from the vSphere client with and without "Snapshot VM's memory" and "Quiesce guest file system", and it didn't seem to help. Once the snapshot is taken, you still run mbrscan against the *-flat.vmdk file, correct?

This "Device busy" message also comes up with version 5.2 of the mbralign tool.

Could I also see the contents of the simple script to take snapshots you are talking about too?

Thanks!

Aaron Delp said...

@anon - There might be a PowerCLI script out there but I haven't seen one. (I really need to play with PowerCLI someday, just no time right now)

@masa That is interesting. Yes, you either need the vm powered off or you need a VMware snapshot for the virtual machine or you will get the resource busy error. Are you sure that the VMware snapshot isn't timing out?

You are correct, you scan the flat file.

If you installed the tool with the NetApp VSC or the HUK version 5.1 or greater, NetApp will now provide support on the tool as well. You may give them a shot or also try the NetApp Communities link listed in the article. I haven't tried in a while, but Eric (original author of the tool) may be able to help you out a little more. Good Luck!

Oh, about the scripts. I'll see if I can dig them up and I'll post them in a separate post later this week.

Thanks!

Masa said...

Hi, Aaron,

I finally figured out what I was doing wrong. Our environment has many ESX hosts with shared storage accessible by all of them, and mbrscan or mbralign --scan needs to be run against the *-flat.vmdk file on the ESX host which is hosting the VM's configuration file, .vmx. I don't remember reading this in the documentation, but I probably missed it somewhere. I was initially thinking mbrscan could be run on any host against any shared *-flat.vmdk file, but that doesn't appear to be the case. Once I determined the correct ESX hosts with vmware-cmd -l, I was able to get mbrscan running for VMs that were currently running with snapshots. I plan on scripting the rest tomorrow.

Thank you again for your quick response!!

Masa said...

Hi, Aaron!

I recently encountered an issue that alignment "appears" to have failed on a W2K8 VM. I hear W2K8 should be aligned automatically, so I assume this VM may have been upgraded from W2K3.
So when I run the mbralign command, it goes through the whole process and says it completed OK. I don't get any error messages. However, when I power on the VM, the console says "Error loading operating system". Somebody was talking about a drive letter change, so I removed the data disk and it should only have C:. Still the same error. The documentation says "Windows VM SHOULD load", but in case it doesn't, it doesn't give you anything to try like it does for the Linux GRUB ones... The question then is "How do you recover the aligned Windows VM that does not boot?". Any info would be appreciated. Thanks!

Aaron Delp said...

Hey Masa - Is the disk a dynamic disk? Dynamic Disks don't align and will become unusable in the way you describe if you try to align them. That is the only thing I can think of. Good Luck!

Anonymous said...

Tried mbrscan and mbralign on a RH5 VM. A good tool, it did the alignment smoothly. But I caught one issue:
the disk partitions overlap at the start/end cylinder, a potential file system crash issue.

Any comment?

thanks,

> Disk /dev/sda: 23.6 GB, 23622348288 bytes
> 255 heads, 63 sectors/track, 2871 cylinders
> Units = cylinders of 16065 * 512 = 8225280 bytes
>
> Device Boot Start End Blocks Id System
> /dev/sda1 * 1 1045 8385898+ 83 Linux
> /dev/sda2 1045 2089 8385930 82 Linux swap
> /dev/sda3 2089 2350 2096482+ 83 Linux
> /dev/sda4 2350 2872 4192975 5 Extended
> Partition 4 does not end on cylinder boundary.
> /dev/sda5 2350 2611 2096451 83 Linux
> /dev/sda6 2611 2872 2096451 83 Linux

Aaron Delp said...

Sorry, nothing I can add on that one. My experience aligning Linux is very limited.

Masa said...

Hi, Aaron,

I was still having some issues aligning Windows VMs (basic disk), and found the following:

http://communities.vmware.com/message/1638108

""Error loading operating system" after running mbralign from EHU 5.2 on a VMDK"

Duh!!

"DESCRIPTION:
Under some circumstances, the mbralign utility that is packaged with the EHU 5.2
and VSC 2.0 may incorrectly calculate the Cylinder/Head/Sector information used
by the bootloader. When this occurs, the resulting GOS will be left in a state
that is not bootable."

The one I was having issues with in particular was a Windows XP VM. EHU 5.2 seemed to work on a Windows 2003 VM when I tried.

So, the current recommendation may be to just use mbralign from EHU 5.1...

Hope this helps somebody...

Aaron Delp said...

Masa - Good catch and thank you for posting! I haven't had a chance to use the 5.2 version so any information you have is very useful. Thank you again!

Satinder Sharma said...

@Masa The problem with the mbralign bundled with VSC 2.0 has been fixed in the latest release of mbralign, bundled with VSC 2.0.1.

vmexplorer said...

"Windows 2008 Server is aligned if the machine was created as a Windows 2008 server. If the machine is upgraded from Windows Server 2003, it will not be aligned."

OR if your build process pushes Windows Server 2008 from an image file to your VM, there is a chance you're not aligned...

Never assume it's aligned... check it always :)