Monday, January 25, 2010

Cisco UCS vs IBM and HP - Where are the Brains?

UPDATE: Thank you to everyone for the great comments!  Please look for the updated sections that I have highlighted below.  I have learned a lot from everyone and I will continue to update this as more information rolls in.  I welcome any and all comments.  Thank you!

As many of you know, my company recently acquired some very nice lab gear for customer demonstrations and proof of concept work.  Many of my peers already know the UCS systems inside and out but I really need hands on to "get it".

As I learn the UCS system I will share my experiences here.  My perspective is to share what is different (good and bad) about UCS compared to the IBM and HP Blade products.  Before anyone asks, I will only be covering IBM and HP.  If you have additional experiences, please share them in the comments.  I also have no intention of picking sides.  At the end of the day I sell and support all of the above systems and I can get the job done with all of them.  They all have their own unique strengths and weaknesses that I intend to highlight.

In case you aren't familiar with what UCS is, I suggest you take a look at Colin's post over on his blog.  He does a great job putting all the pieces together.  Plus, I'm going to steal a few of his graphics. (thanks Colin!!)

A UCS system consists of one or more chassis and a pair of Cisco 6120 switches that provide both the 10Gb bandwidth to the blades as well as the management of the system.  The last part of that statement is the key to understanding how UCS is currently different from the competition.  I define management in this example as the control of the blade hardware state.  This includes identification, power on, power off, remote control, remote media, and the virtual I/O assignments for MACs and WWPNs.

By moving the management from the chassis level to the switch level, the solution can now take advantage of a multi-chassis environment.  Here's a simple modification of Colin's diagram to illustrate this point.


(UPDATED!) What are the limitations to the Cisco UCS model?
Someone asked in the comments how this scales.  Honestly that was a great question.  I'm still learning Cisco and I was wrapped up in making it work.  Let's take a look at that.  Currently you can have up to 8 chassis per pair of UCS Managers (Cisco 6100's).  That number will increase in the upcoming weeks and eventually the limit will top out at 40.  But, the more realistic limitation is either 10 or 20 depending on the number of FEX uplinks from the chassis to the 6100's unless you are using double wide blades.  If you don't understand what that means right now, don't sweat it.  I'll be posting about that shortly.
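To make the "10 or 20" figure a little more concrete, here is a rough sketch of the arithmetic in Python. It assumes each 6120 has 20 usable ports and that every one of them is spent on FEX uplinks; both are my assumptions for illustration, and a real design also needs ports for the upstream network. The function and constant names are made up for this example.

```python
def max_chassis(fabric_ports: int, uplinks_per_fex: int) -> int:
    """Chassis that one 6120 could attach if every port went to a FEX uplink."""
    if uplinks_per_fex < 1:
        raise ValueError("each FEX needs at least one uplink")
    # Each chassis consumes `uplinks_per_fex` ports on each 6120 in the pair.
    return fabric_ports // uplinks_per_fex


# Assumption: 20 usable ports per 6120 (not stated in the post).
ASSUMED_PORTS_PER_6120 = 20

if __name__ == "__main__":
    for uplinks in (1, 2):
        chassis = max_chassis(ASSUMED_PORTS_PER_6120, uplinks)
        print(f"{uplinks} uplink(s) per FEX -> up to {chassis} chassis")
```

Under those assumptions, one uplink per FEX gives 20 chassis and two uplinks per FEX gives 10, which lines up with the numbers above.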


(UPDATED) What if you need to manage more than the chassis limitations today?
If you need to go above the limit, then you have two options.  The first option is to purchase another pair of 6100's to create another UCS System and they will be independent of each other.  The second option is provided by BMC Software.  This will allow you to manage more chassis and the solution also provides additional enhancements.  I admit I know little to nothing about the product so I'll just post the link from the comments and you can take a look.  The brain mapping for that would look like this.



How do you get into the brains?
Each 6120 has an IP address and both 6120s are linked together to create a clustered IP address.  The clustered IP is the preferred way to access the software.  The clustering is handled over dual 1Gb links labeled L1 and L2 on each switch.  They are connected together like this:



Cisco uses a program to manage this environment called, creatively enough, Cisco UCS Manager or UCSM.  To access UCSM, point a browser at the clustered IP address.  Once authenticated, you will be prompted to download a 20MB Java package (yes it is Java, yuck!).  Here is a pic of ours with both chassis powered up.



Notice that both chassis are in the same "pane of glass".  This allows for management of all the blades from one interface and the movement of server profiles (covered later) from one chassis to another within the same management tool.


How does this compare to IBM? 

IBM is a two part answer.

IBM Part One - Single Chassis Interface in AMM

IBM uses a module in each BladeCenter chassis called the Advanced Management Module (AMM).  There can be up to two AMMs in each chassis.  If there are two AMMs, one is active and the other is passive.  They share the configuration and a single IP address on the network.  If the primary fails, the passive module becomes active and communication resumes on the original IP address.  The AMM will control power state, identification, virtual media and remote control out of the box.  Virtual I/O (both WWPN and MAC) is an additional purchased license in the AMM.  The product is called the BladeCenter Open Fabric Manager (BOFM).  I don't know if BOFM supports 10Gb but I know it supports 1Gb Ethernet and 2/4Gb FC.  This is what it would look like with brains in place:


As you can see, each chassis is managed individually.  In my experience, this is the most common configuration I have seen.

IBM Part Two - Multiple Chassis Management with IBM Director

IBM does have a free management product called IBM Director that can pull all this together into a single pane of glass.  The blade administration tasks are built into the interface and virtualized I/O is handled through the Advanced BladeCenter Open Fabric Manager.  Advanced BOFM is a Director plug-in and is a fee based product.  Logically it would look something like this:




The downside to this solution is you now have another server in your environment to manage.  In my experience Director is a little flaky at times but I also haven't tried the newest version which is a redesign to address many of the issues.

How does this compare to HP?


HP is a two part answer as well.  I haven't implemented HP's Virtual Connect over multiple chassis so I will ask that if you know this answer and can throw some links my way, please do and I will update this section.


(UPDATED!) HP Part One - Single Chassis Interface in Onboard Administrator (OA)


HP's approach is very similar to IBM's.  HP's management modules are called the Onboard Administrator and there can be a maximum of two in each chassis.  HP is different from IBM because each module requires an IP address.  At any given time one IP address is active and one IP address is passive.  If you access the passive module on the network, it will tell you that you are on the passive module and instruct you to connect to the active module.  Like the IBM AMM, the OA will control all basic functions such as power state, identification, virtual media, and remote control.  Like IBM, HP has a separate product for virtual I/O called Virtual Connect.  Unlike the IBM and Cisco products, HP's Virtual Connect is implemented at the I/O module level.  The only way to achieve virtual I/O is to purchase the HP I/O modules.  HP's brain mapping is a little different than IBM's because you can connect up to four chassis into one interface.  Since you probably won't be able to power more than four chassis in a rack, think of it as consolidation at the rack level.


(UPDATED!) HP Part Two - Multiple Chassis Interface in HP Insight Tools


After you get to four chassis, HP Insight Tools need to be brought in to fulfill the needs.  Based on the comments below it appears that two products will fit the bill.  To manage the chassis and blade functions you will need Insight Dynamics VSE Server Suite and to manage the virtual I/O you will need the Virtual Connect Enterprise Manager product.  Both the Insight Dynamics VSE Server Suite and the Virtual Connect Enterprise Manager are fee based.



Summary

(If you made it this far, I'm impressed!)  Cisco's approach feels very "up to date".  I really like the idea of not having to add another server (and additional fees for virtualized I/O) to the environment for management of the products.  By moving all of the management centrally to the switches you are better able to see the environment and implement a multi-chassis/multi-rack solution.  IBM and HP offer a similar solution that has grown over time but the roots of the interface are in single chassis/rack management.  But, at the end of the day both IBM and HP offer a centralized management solution.

Thoughts?  Concerns?  Please leave a comment!

NetApp CIFS Tricks

Yes, the Cisco UCS blog posts will start up "soon".  Still putting the finishing touches on the first post.  In the meantime...

I recently had to perform the following during a CIFS (Windows File Sharing) installation from a NetApp storage controller.  The chances of me remembering this again aren't very good so I wanted to post it here for later.  We had two issues that caused us some grief.

Issue #1 - For whatever reason when looking for a domain controller it wasn't "attaching" to the local domain controller.  The system would ask for a list of domain controllers but then try to communicate with remote AD servers, some of which were behind firewalls.  NetApp is nice enough to allow us to "pin" the storage to a preferred list of domain controllers to correct this behavior.  From the command line, use the following commands:
  • cifs domaininfo - Lists which domain controllers the NetApp is communicating with.  The preferred list is a list you specify, the favored list is the list AD thinks are closest to you, and then the rest are listed.
  • cifs prefdc - This command allows you to populate a list of the domain controllers you want to communicate with first.  More than one can be entered in the command separated by spaces in the format: cifs prefdc add (domain) (dc1) (dc2) (etc...)
  • cifs resetdc - After a DC is added you need to reset the connection
  • cifs prefdc print - Shows the list

Issue #2 - The site admin wasn't a domain admin.  This leads to many permission related issues because by default when a NetApp is added to AD only the local NetApp admin (created during CIFS setup) and the Domain Admins are in the machine administrators group.  We needed to add the site admin into the Administrators group on the NetApp.  This was achieved using the useradmin command.  Here is the syntax: useradmin domainuser add (username) -g Administrators

After these two steps were complete, we were able to proceed.
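Since I will probably forget the exact order again, here is a small Python sketch that only assembles the command strings from the two issues above so they can be pasted into the NetApp console. It does not talk to the filer at all. The function names are made up for this example, and the domain, domain controller addresses, and username in the usage section are placeholders, not values from my install.

```python
from typing import List, Sequence


def prefdc_add_command(domain: str, controllers: Sequence[str]) -> str:
    """Build 'cifs prefdc add (domain) (dc1) (dc2) ...' as one line."""
    if not controllers:
        raise ValueError("list at least one preferred domain controller")
    return " ".join(["cifs", "prefdc", "add", domain, *controllers])


def pin_domain_controllers(domain: str, controllers: Sequence[str]) -> List[str]:
    """Commands for Issue #1: pin the DCs, reset the connection, then verify."""
    return [
        prefdc_add_command(domain, controllers),
        "cifs resetdc",
        "cifs prefdc print",
        "cifs domaininfo",
    ]


def domainuser_add_command(username: str, group: str = "Administrators") -> str:
    """Command for Issue #2: add a domain user to a local group on the NetApp."""
    return f"useradmin domainuser add {username} -g {group}"


if __name__ == "__main__":
    # Placeholder values for illustration only.
    for line in pin_domain_controllers("EXAMPLE", ["10.0.0.10", "10.0.0.11"]):
        print(line)
    print(domainuser_add_command("siteadmin"))
```

The output is just text, so it is easy to review before anything is run on the controller.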

Tuesday, January 19, 2010

That's A Lot of Hardware!

Just a quick post today.  As some saw on Twitter yesterday I will be getting my hands on some pretty impressive hardware.  My company has decided to move our customer demo lab to our office and all the gear arrived yesterday.  Here's a few pictures for now but we will be setting all of this up over the next few weeks.  I will be posting some impressions and tips as I go.  With my HP and IBM Blade background I am hoping to write a good bit on the UCS experience.  In addition to the EMC NS-120, I am hoping to integrate our existing EMC NS-960 for some experience with that hardware as well.  Should be interesting!!

Picture #1 - 2x Cisco UCS Chassis each with 4 blades, 2x Cisco 6120 Nexus, 1x Cisco Nexus 5020


Picture #2 - A LOT of NetApp disk shelves (NetApp controller not pictured)


Picture #3 - EMC NS-120 still in the box (but not for long!)

Monday, January 18, 2010

Installing NetApp VSC According to Best Practices

If you haven't checked out NetApp's Virtual Storage Console, you should.  I did an article on it after NetApp Insight which is available here.

Vaughn recently posted on setting up VSC access to the NetApp using RBAC (Role Based Access Control) permissions.  This procedure is not currently in the VSC manual.

Quick tangent: Creating RBAC for every product appears to be an ongoing trend within NetApp.  Documentation exists for RBAC installation on SMVI (it's in the manual), VSC (link above), Snap Drive in a virtual machine, and I think there is a RCU writeup around but I can't find it right now.  This is great from a security perspective but gets a little tedious if you are loading multiple products on the same NetApp controller, and double the pain if it is an HA unit! (HINT to NetApp, figure out a way to consolidate this please!!)

Let's say you were an early adopter to VSC and installed it per the manual.  You probably used root as the user id and you never enabled SSL on the filer.  If this is the case, you are sending the root password in clear text (Yikes!).  Based on Vaughn's article we can easily go back and fix this.

  • Configure and Enable SSL on each NetApp Controller if not already enabled
    • From the command line you can use the secureadmin setup ssl and secureadmin status commands as shown below. This can also be configured from FilerView -> Secure Admin
  •  Create the role, group, and user on each NetApp controller. Enter each line from the command line
    • useradmin role add vsc-role -a login-http-admin,api-aggr-list-info,api-cf-get-partner,api-cf-status,api-disk-list-info,api-ems-autosupport-log,api-fcp-adapter-list-info,api-fcp-get-cfmode,api-license-list-info,api-lun-get-vdisk-attributes,api-lun-list-info,api-lun-map-list-info,api-nfs-exportfs-list-rules,api-qtree-list,api-snmp-get,api-snmp-get-next,api-system-get-info,api-system-get-version,api-volume-autosize-get,api-volume-list-info,api-volume-options-list-info
    • useradmin group add vsc-group -r vsc-role
    • useradmin user add vsc-user -g vsc-group
  • From the vSphere Client, go to the NetApp tab and repeat the following for each controller
    • Right Click on the controller and click Modify Credentials
    • Enter the newly created vsc-user id and password, check Use SSL and click OK

Congratulations, you have just configured your vCenter Server to communicate with the NetApp systems in a safe and secure way!
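Typing that long capability list on every controller (twice for an HA pair) is error prone, so here is a Python sketch that rebuilds the three useradmin lines from the steps above. It only generates text and does not connect to anything. The capability names and the default role, group, and user names are copied from the list above; the function and variable names are my own for illustration.

```python
from typing import List

# Capabilities copied from the role definition in the steps above.
VSC_CAPABILITIES = [
    "login-http-admin", "api-aggr-list-info", "api-cf-get-partner",
    "api-cf-status", "api-disk-list-info", "api-ems-autosupport-log",
    "api-fcp-adapter-list-info", "api-fcp-get-cfmode", "api-license-list-info",
    "api-lun-get-vdisk-attributes", "api-lun-list-info", "api-lun-map-list-info",
    "api-nfs-exportfs-list-rules", "api-qtree-list", "api-snmp-get",
    "api-snmp-get-next", "api-system-get-info", "api-system-get-version",
    "api-volume-autosize-get", "api-volume-list-info",
    "api-volume-options-list-info",
]


def vsc_rbac_commands(role: str = "vsc-role",
                      group: str = "vsc-group",
                      user: str = "vsc-user") -> List[str]:
    """Return the role, group, and user commands in the order they must be run."""
    return [
        f"useradmin role add {role} -a {','.join(VSC_CAPABILITIES)}",
        f"useradmin group add {group} -r {role}",
        f"useradmin user add {user} -g {group}",
    ]


if __name__ == "__main__":
    # Print once, then paste the same three lines on each controller.
    for line in vsc_rbac_commands():
        print(line)
```

The order matters: the role has to exist before the group references it, and the group before the user is added to it.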

    Thursday, January 14, 2010

    #vmtip Archive From Twitter

    For a few weeks now I have been posting VMware and storage related tips to Twitter.  I have been using the hash tag #vmtip for each of them.  I keep an archive of them in Evernote so I can remember what I have done but it isn't organized.  This is simply an attempt to better organize them into categories.  This won't be updated every day, but I will try to keep it somewhat up to date.

    Last Update: January 14th, 2010

    VMware Related Tips
    • #VMware tip: vSphereU1 increases the max# of vm's per host to 160 for up to 8 hosts (was 100), still only 40 vm's per hosts if >8 #vmtip
    • #VMware ESXi local boot only supported option today. ESXi Boot from SAN and PXE are both experimental right now (via @DuncanYB) #vmtip
    • Need to get data from a #VMware Workstation or ESX(i) vmdk? VMware Disk Mount Utility. It saved me this week! http://bit.ly/6rtY8e #vmtip
    VMware vCenter Related Tips
    • Prior to loading #VMware vCenter, make sure you set the final machine name, static ip address, and domain membership! #vmtip
    Virtual Machine Alignment Tips
    • #VMware tip: Windows 2008 vm's do not need alignment if created fresh. If it was upgraded from W2k03, it will be misaligned. #vmtip
    VMware Lab Manager Related Tips
    • #VMware Lab Manager Tip: If using VMFS, there is a maximum of 8 hosts per datastore due to disk chains. There is no limit for NFS. #vmtip
    • #VMWare Lab Manager Tip: Lab Manager disk chains can not span volumes due to the linked clone technology #vmtip
    NetApp Related Tips
    • #VMware on #NetApp tip: VSC will tweak ESXi installs for NetApp. Great since no NetApp Host Utilities Kit for ESXi #vmtip
    • #NetApp on #VMWare vmdk alignment tip: Windows Dynamic Disks, Linux LVM's and Citrix Servers can not be aligned with mbralign #vmtip
    NetApp SMVI Related Tips
    • When doing #VMware SRM on #NetApp and using SMVI, you CAN'T take a VMware snapshot as part of the backup! #vmtip
    • This appears to be undocumented: #NetApp SMVI backup of Windows virtual machine on IDE disk are not eligible for single file restore #vmtip
    NetApp SnapDrive Related Tips
    • #VMware on #NetApp: When installing Snap Drive, check the install account is an admin on BOTH the server & filer before install! #vmtip
    • #NetApp SnapDrive 6.2 for Windows requires .NET 3.5 SP1 and 3 MS hotfixes (with reboot) BEFORE installation of SnapDrive. #vmtip

    Wednesday, January 13, 2010

    Creating VMware NFS Datastores on NetApp in 3 Easy Steps

    I am often asked by customers how to set up a VMware NFS datastore on NetApp storage.  The first time I received the question, I pointed them to the NetApp vSphere TR.  It turns out the information isn't currently in the document.  I spoke to Vaughn about it and this was an oversight that will be corrected.  In the meantime, here is how I create NFS shares in 3 easy steps.  I'm also taking screenshots from my lab for the first time, so let me know what you think of the screenshot format vs. just a bullet list.

    Step One - Create the volume on the NetApp system
    •  Log into FilerView, Open Volumes, Click Add -> Click Next
    • Accept the default value of Flexible Volume and Click Next
    • Create a name for the volume, set the language type, and click Next
    • Choose which aggregate to create the volume in and click Next
    • Set the size of the volume.  Please notice the pull down defaults to MB, NOT GB!  I typically don't set a Snap Reserve but if you don't understand the implications of this, just use the default of 20%. Click Next

    • Click Commit
    • Click Volumes -> Manage and you will see the newly created volume


    • If your NetApp system has been configured for CIFS, you will need to make a slight change to the QTree type of the volume.  Click Volumes -> QTrees -> Manage.  If the QTree type is UNIX, skip to Step Two.  If the QTree type is NTFS, proceed.

    • Click on the volume link (sim3_vmware_01 in this example) to get the following screen
    • Change the QTree type to UNIX and click Apply

    Step Two - Create the NFS Export (Share the Volume)
    •  Click NFS -> Manage Exports -> Click on the Permissions for the newly created volume


    •  Make sure Read-Write Access, Root Access, and Security are all checked. Click Next

    • Click Next at the Export Path Screen but write down this path, you will need it later!
    • At the Read-Write Access Screen, uncheck the All Hosts box and enter the ip addresses of all the VMkernel ports for the vSphere server(s).  NOTE: This is not the Service Console IP address, it is the VMkernel ip address that vSphere will use to "talk" NFS to the storage
    • Repeat this process for the Root Access Screen and click Next
    • Click Next at the Security Menu accepting the defaults
    •  Click Commit. You are now finished configuring the NetApp System!
    Step Three - Create the share in vCenter

    • From the vCenter Client, Click on a vSphere server and click the configuration tab. Click Storage, Click Add Storage

    • Choose Network File System and Click Next
    • Enter the IP address of the NetApp Storage, the path to the export that you wrote down from step 2, and give your datastore a name as it will appear in vCenter

    • You should now see your storage


    UPDATE: I was hoping to stay away from the command line for this article.  This was really designed for users that are just beginning to get their feet wet with NFS.  But, as Mike pointed out, there is one command that should be run on each volume and this can only be achieved from the command line.  It is outlined on page 37 of the 1.0 version of the TR.  The command is: vol options (volume-name) no_atime_update on where volume name is the name of the volume (sim3_vmware_01 in my example).  Thank you for pointing that out Mike!
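If you have more than a couple of NFS volumes, a few lines of Python can print the command from the update above for each one so nothing gets missed. This is only a sketch that builds the text; it does not run anything against the NetApp. The function name is made up, and the volume name in the usage section is the one from my lab example.

```python
from typing import Iterable, List


def no_atime_update_commands(volumes: Iterable[str]) -> List[str]:
    """Build 'vol options (volume-name) no_atime_update on' for each volume."""
    return [f"vol options {name} no_atime_update on" for name in volumes]


if __name__ == "__main__":
    # sim3_vmware_01 is the example volume used in this article.
    for line in no_atime_update_commands(["sim3_vmware_01"]):
        print(line)
```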


    A few final notes.  Once all of this is complete I usually test read/write access by pulling up the datastore browser, creating a folder in the datastore, and then deleting it.  Also, if the datastore will be protected by NetApp's SnapManager for Virtual Infrastructure then I will disable snapshots.  This is all detailed in Vaughn's vSphere TR.

    VMware Disk Mount Utility to the Rescue!

    I had a bit of a scare over the holidays. I usually keep two copies of my data at all times.  One copy is on my laptop and the second copy is on an external hard drive at the house. Well, what happens if you are installing a new OS on your laptop (one copy gone) and as you are copying back all of your data, the external hard drive starts clicking and throwing up errors (two copies gone).  Yikes!

    I tend to "sip my own champagne" (I don't "eat my own dog food", too crude of a reference) so I run my corporate workstation in a virtual machine with VMware Player.  All of my data was in one big 32GB vmdk file on the external hard disk.  I cracked the case on the USB disk and mounted it on my home PC.  The drive was recognized!  I tried to copy the vmdk off to the c:\ so I could transfer it to the laptop. The transfer was VERY slow. It was going to take most of the night and the next day to copy but I needed my data NOW!

    VMware Disk Mount Utility to the rescue!  In case you aren't familiar with the product, you can download it here and the manual is here.  It allows you to mount vmdk's from either Windows or Linux.  There are some limitations as specified in the document but you can mount VMware Workstation as well as ESX virtual machines.

    Using the tool I was able to mount the vmdk and just copy out the data I needed for work the next day.  Over the weekend I was able to recover all the rest of my data.  It took 3 days to copy 100GB worth of vm's and there were a few casualties.  My ESXi machine and a couple of XP builds were corrupt and I will have to recreate them.  I was lucky I got the data back but a big thanks to VMware for the tool!

    Tuesday, January 12, 2010

    Heading to VMware Partner Exchange?

    I will be attending the VMware Partner Exchange for the first time this year!  I'm very excited and I hope to post some useful information so stay tuned.

    I've decided to do a little social experiment via Twitter, we'll see how this works.  If you are interested in getting together for drinks or dinner one night, please let me know via a reply on twitter and I'll add you to the being followed section of the list.  Here's the catch, you will need to follow the list as well.  Here is a link to follow the list. As we get closer to the conference I'll send out information to the list or we can just group argue over where to go.  After the conference, I will remove the list.

    Not sure when this will happen or how formal it will be, just seeing if there is interest (it's an experiment remember!)


    UPDATE: To be clear, this is a "bring the person, not the company" event. Anyone is welcome but please don't plan on pushing/selling anything to this crowd.  This is meant to be a social event only.

    Cisco 4001i Nexus Switch for IBM BladeCenter in Depth

    I have been talking to a few customers recently about the Cisco Nexus 4001i Switch for the IBM BladeCenter.  The product looks very nice and I have recently discovered some additional information that I wanted to share.

    If you are unfamiliar with the product, head over to Kevin's site and take a look at this link and this link.  He has some very good links from Cisco about the switch.  In addition, I found a link to the IBM RedPaper on the switch here.  For those of you that don't like links, here's the summary: It is a 20 port (14 down to blades, 6 uplinks), non-blocking 10Gb FCoE capable switch utilizing the Nexus OS.  The FCoE functionality is optional.  To enable FCoE you need to purchase the FC Enablement Kit (IBM Part Number 49Y9983).

    Here are some additional notes:
    • The 4001i is NOT an FC Forwarder, it is a FIP Snooping switch
    • You may have already noticed but this switch DOES NOT have FC ports.  To talk FC you will need to go out of the 4001i into a Nexus 5k and break out the FC there
    • If using the Emulex Virtual Fabric Blade Adapter there is no vNIC functionality
    • The switch only supports Cisco SFP's/SFP+ cables and it doesn't ship with any. You will need to purchase them separately to go with the switch.  They are NOT resold through IBM.  Since there are 6 uplinks, you will need a maximum of 6 SFP's (or copper SFP+ cables) per switch.
    • The switch will support a 1Gb connection down to the IBM Blade 2port/4port 1Gb Adapter. I questioned why you would need this but on second thought I really like this. This way you can provide additional 1Gb connections to a server that may not need 10Gb without the purchase of additional 1Gb switches.  The fact that you can "share" a 1Gb and 10Gb CFFh slot is VERY nice!
    I'll post more on the switch as I dig deeper!