Monday, March 15, 2010

Cisco UCS QoS vs. HP Flex-10 vNICs in VMware

This post will be more conceptual than technical.  I was recently asked how Cisco's UCS and HP's Flex-10 network design approaches affect vSphere designs.  Even though the industry is moving towards a unified 10Gb fabric, there are different ways to move data through this big "pipe" and still ensure/prioritize delivery.  As you would guess, Cisco and HP approach this problem very differently.  Cisco takes a network-centric approach and HP takes a server-centric approach.

HP's Flex-10

HP Flex-10 takes a 10Gb connection and carves it up into multiple virtual NICs.  The size of the "pipes" can be turned up and down to match the amount of bandwidth needed for the NIC.  Think of it as placing smaller pipes in the big 10Gb pipe.  This approach is great for vSphere admins because the virtual switches in vSphere can be configured to look just like they did with a bunch of 1Gb links into the server.  The transition to this technology is seamless for the vSphere administrator.  I'll borrow a diagram from Barry's awesome article on Flex-10.  If you haven't read it, please do!


What is the down side to this method?
The down side to this approach is that by placing multiple pipes within the larger pipe, you have now placed a CEILING on how much data can pass through each smaller pipe.  Let's say you present a 1Gb vNIC to vMotion and during a vMotion it would be to your advantage to have access to more bandwidth.  Too bad, 1Gb is all you will ever get.

Cisco UCS's QoS

Cisco UCS uses a method known as Quality of Service (QoS).  Most of us "server guys (and gals)" have no idea what this is.  Here is how I have come to understand it.  If this is wrong, please correct me.  Network traffic is given a priority and this priority kicks in WHEN THERE IS CONTENTION on the network.  So, instead of smaller pipes inside a large pipe, you have more of a priority system in place to guarantee certain levels of service.  Think of this as a FLOOR model.  You can have as much as you want as long as everyone else gets their minimums (they get their quality/guarantee of service).  If something needs to spike and there is room, it can spike and then return to normal.  Here is a diagram of our Cisco UCS with traditional switches.  This isn't 1000v but you get the idea.  As you can see, there are two big 10Gb pipes into the virtual switches instead of smaller pipes into multiple virtual switches.


As the vSphere administrator, this looks very different from my old multiple 1Gb links into my multiple virtual switches!
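
To make the ceiling vs. floor idea concrete, here is a minimal Python sketch.  It is a toy model and not how either vendor actually schedules traffic: the class names, the bandwidth numbers, and the rule for sharing spare bandwidth (in proportion to each floor) are all assumptions made for illustration.

```python
def ceiling_allocation(caps, demands):
    """Flex-10 style toy model: each traffic class is capped at its vNIC size."""
    return {name: min(demands[name], caps[name]) for name in caps}


def floor_allocation(link, floors, demands):
    """QoS style toy model: each class is guaranteed its floor, and any
    unused bandwidth goes to classes that still want more.

    Assumes every floor is greater than zero and the floors sum to at
    most the link size.
    """
    # Step 1: every class gets what it asks for, up to its guaranteed floor.
    alloc = {n: min(demands[n], floors[n]) for n in floors}
    spare = link - sum(alloc.values())
    hungry = [n for n in floors if demands[n] > alloc[n]]
    # Step 2: share spare bandwidth among classes with unmet demand,
    # weighted by their floors (an assumed policy, chosen for simplicity).
    while spare > 1e-9 and hungry:
        weight = sum(floors[n] for n in hungry)
        granted = 0.0
        for n in hungry:
            extra = min(spare * floors[n] / weight, demands[n] - alloc[n])
            alloc[n] += extra
            granted += extra
        spare -= granted
        hungry = [n for n in hungry if demands[n] - alloc[n] > 1e-9]
    return alloc


# Hypothetical split of one 10Gb link (all numbers are in Gb).
sizes = {"vmotion": 1.0, "vm_traffic": 6.0, "storage": 3.0}
quiet_network = {"vmotion": 8.0, "vm_traffic": 1.0, "storage": 0.5}

print(ceiling_allocation(sizes, quiet_network))
print(floor_allocation(10.0, sizes, quiet_network))
```

With these made-up numbers, vMotion is held to 1Gb under the ceiling model even though the link is mostly idle, while the floor model lets it use the spare bandwidth.  When every class wants the full link, the floor model falls back to the guaranteed minimums.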

What is the down side to this method?

At this time, QoS for Cisco UCS appears complex to configure and represents a shift in thinking for the vSphere administrator. 

How is the QoS implemented for Cisco UCS and VMware?

That is a very good question.  I can't seem to find any documentation on how to actually do this yet.  I'm sure there is a Cisco internal doc somewhere but I haven't found anything public that lays out the hardware that is needed (do I need 1000v or Palo for this, can I use a CNA and the standard switches?) nor have I found a "cook book" that documents how to properly make QoS happen in a vSphere environment.  I'm sure this will happen in time and if you have a link, please leave a comment!

Which is better?

It depends on your point of view and the comfort level of your team.  I can easily see advantages to both approaches.  One is easier to implement, the other appears to be a more elegant (but complex) solution.  Cisco has once again brought a disruptive technology to the table that can't be ignored.  What are your thoughts?

Virtualization Podcast Directory

I have been a podcast junkie for years.  I've subscribed and unsubscribed to countless feeds over the years.  Up until recently, there has only been a handful of podcasts with a virtualization focus.  I have noticed a nice uptick in this number recently.  So, in the interest of spreading the love to everyone out there, here is a list of the virtualization podcasts I am subscribed to.  As always, I'm behind on my episodes but I hope to catch up soon!  If you know of any others or if you have any additions or corrections, please let me know!

Looks like Marc is the Leo Laporte of the Virtualization Podcast world!

Thursday, March 4, 2010

VMware Lab Manager Install Notes and LDAP Import

Setting up Lab Manager can be a little complex.  It isn't as straightforward as some of the other VMware products so I wanted to provide some tips and tricks to get it all up and running.

Things you will need prior to Lab Manager Installation

  • In vCenter, create the datastores, virtual switches, and Resource Pools that you will need.  The Lab Manager (LM) install will detect them at install and this will make configuration MUCH easier
  • Create all groups and users that you will need in either Active Directory or LDAP
  • If you will be using IP pools, define a block of static IPs ahead of time!
  • The Lab Manager server is currently a Windows 2003 based server.  It can be virtual and on the same ESX hosts that it will be controlling.  If you do this, DON'T name it lab-manager.  If you do, you will get an error during installation because the install tries to create a folder in vCenter called lab-manager.  You will have to rename your virtual machine to proceed.  Also, you will need to change the speed of the vmxnet3 NIC per Jason's article here.
  • Make sure both forward and reverse DNS lookup work between the vCenter server, LM server, and all vSphere servers
  • The LM Server requires IIS 6.0 and .NET 2.0 to be installed.  IIS MUST be installed before .NET 2.0
  • DON'T put the LM Server into the AD Domain.  VMware recommends against this even if you are importing users and groups from AD into the LM Server.  I asked why at Partner Exchange and I was told because it isn't needed and changes to AD could mess up the LM server.
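
The forward and reverse DNS requirement is easy to check before you start the install.  Here is a small Python sketch that uses only the standard library; the function name and the way a "match" is judged are my own assumptions, not anything from the Lab Manager installer.

```python
import socket


def check_dns(hostname, resolve=socket.gethostbyname, reverse=socket.gethostbyaddr):
    """Return (ip, reverse_name, ok) for one host.

    ok is True when the forward lookup succeeds and the reverse lookup of
    that address returns the same name, either as the primary name or as
    an alias (compared case-insensitively).
    """
    try:
        ip = resolve(hostname)
        name, aliases, _addresses = reverse(ip)
    except OSError:
        # socket.gaierror and socket.herror are both subclasses of OSError.
        return None, None, False
    names = {n.lower() for n in [name, *aliases]}
    return ip, name, hostname.lower() in names
```

Run it from the vCenter server, the LM server, and anywhere else that matters, using the fully qualified name of each server (for example `check_dns("vcenter.example.local")`, which is a placeholder name).
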
LDAP/AD Integration

Integration with Active Directory or LDAP is the key to Lab Manager.  Lab Manager allows you to create single users on the box but NOT groups.  This makes security and configuration VERY difficult.  At the same time, the LDAP integration leaves a little to be desired in the implementation.  Here's how to do it:

  • From the Lab Manager interface, choose Settings on the left-hand side and click the LDAP tab:
 
  • Once that is complete, you are ready to import groups.  Click the Import Groups button.
 
  • Here's the magic.  Because the group and users have already been created in Active Directory, you can choose the group and assign it to the users role (the default role is read only so be sure to change it).  All users in this group are now Lab Manager users.

A few interesting notes about this import process.  If you look at the group once it is imported and no one in the group has logged in yet, the group appears blank!  This threw me for a little bit.  I expected it to populate with the users at creation time.  Instead the list populates at each USER's FIRST LOGIN!  My group has three users in it total.  As users log in, they will populate the group and also appear on the Lab Manager's Users list.  Here are a few screenshots as I logged in my test users.

Lab Manager Group with only first test user logged in:


Lab Manager Users Pane with two users created from login:


 Look for more articles as I get everything set up!

Tuesday, March 2, 2010

IBM's New eX5 Server Announcements

I wanted to tell everyone about the new server lines IBM announced today.  I attended IBM Business Partner training on this a few months ago and the products are impressive.  I was under NDA until today to speak about anything.  I can talk about the IBM products specifically but I'm still not able to talk about the Intel Nehalem EX chipset.  I will have in depth posts of the EX chipset when it is officially released.  Also, I am writing this from notes taken a few months ago so a few things might be slightly off.  If you see a mistake, please let me know and I will correct it!

As always, Kevin does an awesome job of laying out the products (and has some great pictures) so head over to his site for an introduction.  Hot off the press, Kevin has another article on just the X5 blade here.

Here are the basics:

  • The servers contain Intel's yet to be announced Nehalem EX chipset.  I can't discuss the details on that since I'm still under NDA.  I will present what has been pre-announced by Intel.
  • The Intel Nehalem EX (Intel 75XX) was designed by Intel to be the 4 socket follow up to the previous generation, the Intel 74XX.  The 74XX was used in the IBM 3850 M2 and the HP DL580 servers.
  • (My opinion here, don't fuss at me Intel and IBM) Intel intended the Nehalem EX to be a 4 socket architecture.  IBM modified the architecture in cooperation with Intel for 2 socket servers.
  • IBM has released the following servers based on Nehalem EX:
    • 2 socket rack server called the x3690 X5.  It can hold two Intel 75XX processors and 32 memory slots
    • 2 socket blade server called the X5 blade.  This was a pre-announce so I can't talk much about it yet.  One thing that will be cool about the blade is it will be "lego based".  By this I mean you can buy one and snap on another for a 4 socket blade
    • 4 socket rack server called the x3850 X5 and the x3950 X5.  This will stack like the previous generation of 3850's and 3950's.  Each 3850/3950 will hold four Intel 75XX processors and 64 slots of memory
    • Additional memory can be bolted on to any of the models above using an IBM exclusive attachment called the MAX5.  This will be a 1U unit (for the rack servers) with 32 memory slots or a 1 blade width attachment that will give you an additional 24 memory slots.  It attaches directly into the Intel QPI (QuickPath Interconnect) bus for easy, low latency memory expansion of the models
    • If I remember correctly, both the 3690 and the 3850/3950 will have 1Gb on board network ports but an Emulex card can be added to the systems to replace the 1Gb with 10Gb on board
What do we know about the Intel EX chipset and why do we care?

I will point you to links here and here.  As I stated before, I'll have in depth analysis of the chipset when it is announced.  The why we care part is actually really cool.  There are some great advancements in the technology but there are also many things to make your life easier at time of purchase as well.

In conclusion, I'm very excited about the 2 socket offerings.  They appear to be very innovative.  I wasn't given access to any other vendor's early release information so I'm not even sure if anybody else will be offering 2 socket servers based on Nehalem EX.  Interesting times indeed...

Tuesday, February 23, 2010

Configuring Multiple Palo Adapters in the Cisco UCS B250 Blade

I found out something interesting today that I wanted to share.  Version 1.1.1 of the Cisco UCS Manager code only supports one Palo card on the B250 at this time.  Here is the line from the release notes and the link to the release notes.

  • Cisco UCS M81KR Virtual Interface Card. In a UCS B250 M1 Extended Memory Blade Server, the UCS M81KR Virtual Interface Card must go in slot 0 only and can not be mixed with other adapters. 
This by itself isn't that big of a deal because I can't see many people wanting more than one Palo card at this time.  The problem is that the Netformix configuration tool allows you to configure the B250 with 2xPalo cards with no errors.  I have no idea if you can take the Netformix configuration and turn it into an order at this time (there might be an error check somewhere else in the system that would kick this out) but somehow I doubt it.  Just in case, I wanted to let everyone know.

Monday, February 22, 2010

How Cisco UCS Deals with Split Brains

This will be a short post this morning.  I wanted to pass along how Cisco UCS deals with a split brain scenario.  I'll start by explaining how you would get into a split brain scenario.  In normal operations, one of the 6100's is the active brain and the other is the stand-by.  A split brain in UCS would happen when both of the cluster interconnects between the 6100 Fabric Interconnects are severed (the L1 and L2 ports).  The active brain still thinks he is active and the stand-by no longer sees the active so he tries to take over.  You now have a potential power struggle because both brains think they are in charge.

Luckily the Cisco UCS folks are way ahead of this scenario.  They added logic to the Serial EEPROM (SEEPROM) in the UCS chassis to resolve the situation.  An odd number of the chassis in a UCS domain act as judges during a split brain.  For example, with four chassis, three act as judges.  A marker is added to the SEEPROM on these chassis to make them quorum resources.  To clarify this a bit more, if there is an odd number of chassis, all of them will be used.  If there is an even number of chassis, the last one is dropped (n-1) so the number of quorum chassis will always be odd.

When the split brain is detected, both 6100's will immediately demote themselves and then claim as many of the quorum resources as possible.  Whoever claims the most quorum chassis wins and promotes himself back to the active manager. The scenario would look something like the following.  Pretty slick!
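
Here is a minimal Python sketch of the quorum rule as I described it above.  The function names and the way claims are represented (sets of chassis numbers) are assumptions for illustration; this is not Cisco's implementation.

```python
def quorum_chassis(total_chassis):
    """Number of chassis used as quorum resources: always odd."""
    if total_chassis < 1:
        return 0
    # Odd count: use them all.  Even count: drop the last one (n-1).
    return total_chassis if total_chassis % 2 == 1 else total_chassis - 1


def elect_active(claims_a, claims_b):
    """Return "A" or "B" for the fabric interconnect that claimed more
    quorum chassis, or None if neither has a majority of the claims.

    claims_a and claims_b are sets of chassis numbers.  With an odd number
    of quorum chassis that are all claimed, a tie cannot happen.
    """
    if len(claims_a) > len(claims_b):
        return "A"
    if len(claims_b) > len(claims_a):
        return "B"
    return None


# Four chassis in the domain means three of them act as judges.
print(quorum_chassis(4))
print(elect_active({1, 3}, {2}))
```

Keeping the number of judges odd is what guarantees a winner: once every quorum chassis has been claimed by one side or the other, the two counts can never be equal.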

Monday, February 15, 2010

Buying an HS22V for VMware? READ THIS!

I have had some interest from our customers in the new IBM HS22V Blade Server.  There is a great overview of the details of the new blade over at Kevin's site here.  I did find out one very interesting thing that I wanted to share.  The HS22V is different from previous models because it will only take up to two 1.8 inch SSD drives.  No hard drives here!  That's a great advancement except for one thing: the list price of ONE of the drives is currently over $1600!!!  This means over $3200 (list prices!) to load an operating system if you want a RAID-1 set.  That is a pretty high price.  Here's a screenshot of the IBM configuration tool with the SSD drive.



But, if you are running VMware ESXi you have another option.  Hidden in the other section (not the storage section) is an option for ESXi version 3.5 or 4.0.  The best thing is it is only $75 list!!

This cost difference brings about an interesting choice for ESX based organizations vs. ESXi.  How much are you willing to pay for that Service Console?

Thursday, February 11, 2010

VMware PEX: Lab Manager Design and Implementation

VMware PEX: Lab Manager Design and Implementation (TECHMGT0922).  This is written as I sit in the session so it could be messy.

Architecture

  • Lab Manager Server (Windows based)
  • vCenter Server
  • One or more ESX 3.5 or 4.0 servers, ESXi 4 servers
  • TCP port 902/903 for virtual machine console access
  • Browser client to Lab Manager (LM) server is TCP 443
  • TCP 5212 from LM Server to ESX servers and TCP 443 from LM Server to vCenter
  • Read the manual for all the requirements (installer checks for them and spits out errors)
  • Install a service account on the LM server - page 14-15 of the user guide has details on the permissions needed
  • Pre-create resource pools if you want to use them so that it will pick them up at install time
  • Pre-create all virtual switches, distributed switches, etc.
  • Don't join it to the domain so nothing from AD would get pushed to it as a member server
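
The port list above is easy to verify with a few lines of Python before the install.  This is only a sketch: the host names are placeholders, and which machine the 902/903 console ports should be tested against is my assumption (the session only listed the ports).

```python
import socket


def port_open(host, port, timeout=3.0):
    """Return True if a TCP connection to host:port can be established."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False


# Ports come from the session notes; host names are hypothetical.
CHECKS = [
    ("Browser to LM server", "lm-server.example.local", 443),
    ("LM server to vCenter", "vcenter.example.local", 443),
    ("LM server to ESX host", "esx01.example.local", 5212),
    ("VM console access", "esx01.example.local", 902),
    ("VM console access", "esx01.example.local", 903),
]


def check_all(checks):
    """Return a list of (label, host, port, is_open) tuples."""
    return [(label, host, port, port_open(host, port))
            for label, host, port in checks]
```

Run `check_all(CHECKS)` from the machine that originates each connection (for example, from the LM server for the 5212 check), since a firewall can pass traffic in one direction and block it in another.
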
Storage Design
  • LM uses Linked Clones to save disk space
  • Make sure the I/O can support your environment (just because you have the space doesn't mean you have enough I/O!)
  • LUN locking and a limit of 8 nodes if using VMFS (NFS doesn't have this limit)
  • Understand the concept of disk chains and how they work (this isn't well documented)
Network Design
  • When Setting up LM server, create the default Physical network  
  • Design Considerations for Networking: Physical Network vs. Virtual, Fenced vs. Non-Fenced - Will IP's be from an IP Pool, DHCP, or Static?
  • Physical network is nothing more than a connection out to a physical network
  • Virtual network - a network that may or may not be connected to a physical network (could be on a different IP, VLAN, etc.) - A virtual network can be connected to a physical network if needed upon deployment
  • Fencing - The ability to create a fence around a configuration (group of vm's) so they are isolated from the rest of the network.  If the fenced configuration needs to get out, it will do so through a NAT router (small Linux vm).  In this case it would have an internal ip inside the fence and an external ip address outside the fence using the NAT router.  This is great for machines and applications that all have the same ip.  This way there will not be ip conflicts on the network.
  • Host spanning - fencing isolation was limited to one host in version 3.  The ability to cross servers with vMotion is called host spanning.  Host Spanning needs the Distributed Switch (and an Enterprise Plus license to get the Distributed Switch)
  • IP Pools - Takes a lot of IP addresses - if using fencing remember that the NAT router needs IPs as well
Fencing Considerations
  • Fencing can't use DHCP (DHCP can't cross the fence to provide the address)
  • IP Static Pool must be used
  • At least one virtual machine needs to connect to the physical network (otherwise it is a virtual network with no outside connection)
  • Fence policy is traffic In and Out, All Blocked, or Out Only.  There is no In Only policy. 
  • Be careful with outside communications if using fencing (many machines with same name but different ip's all hitting an outside source!)
    • Domain Controller - member servers can be in a configuration with the same name as others as long as the machine password with AD doesn't expire (30 days by default).  Otherwise, put a clone of the DC in the configuration and run it private to the configuration group
    • Domain Controller Clone - be careful - a cloned DC will come up with a 169.254.x.x address because it detects one with the same IP address already out there.  The best way to do this is to clone the DC and completely isolate it from the production network.
    • SQL server - if outside the fence, what happens when multiple configurations hit it?  Maybe different instances of the same database on the same server - adds a little bit of manual intervention to the process
    • Can create a workstation that is included in the configuration for the user to use as a workstation "in the fence"

Good Article: VMware KB1000023 - How to Backup the VLM Database

VMware PEX: Site Recovery Manager "Up and Running"

This VMware Partner Exchange (PEX) session was Site Recovery Manager "Up and Running" (TECHBC0321).  I'm writing as I go so this might be a little messy.

This session will focus on the problems typically encountered during SRM implementations.

Required Components
2x vCenter servers
2x SRM Servers
Replication Product from the Storage Vendor
SRA (Storage Replication Adapter) from the Storage Vendor

Install Workflow
1. vCenter at each site
2. SRM Server (separate server)
3. SRM needs a DB instance
4. SRA (Often the most complex and causes the most problems) - Install on SRM server


SRA's Function

  1. Setup - Query for replicated luns, match luns to vm inventory
  2. Failover - Automates promotion of LUNS at remote site, and 
  3. Testing - LUN Snapshot creation
What can go wrong during SRA install?
  • Not all SRAs are created equal.  Each one is different and has a different level of effort put into its development.  Some require an additional framework (Java JRE for example).  Always read all release notes and the install guide prior to the install attempt
  • Always download a fresh SRA FROM THE VMWARE SITE NOT THE VENDOR SITE, many vendors change versions on a frequent basis
  • Whatever you do on one site, do it on the other site
  • When configuring SRA at the protected site, it may fail if not all components are installed at the recovery site (not configured, just installed)
  • What if no datastores appear but the SRA seems to be installed OK?  This is because the datastore doesn't have any vm's on it
  • Always verify you have all the needed license features on BOTH storage systems to fully support replication in BOTH directions
Design Considerations
  • Disparate networks (re-ip of servers) - Most Common
  • Stretch vlans (no re-ip of servers) - Less Common
  • DNS services
  • Active Directory services - Could be dedicated for testing and failover or same production AD 
  • Consider Applications with Hard Coded IPs
  • Remember Default Gateway and Subnet Mask
  • When performing a recovery, the fewer changes the better (DOC-1491 in VMware Communities)
  • SRM Supports RDM's but it isn't recommended
  • If using multiple virtual hard disks, make sure both of them are replicated (or exist) at both locations
  • SRM does not support replicating virtual machines with snapshots
  • Need a port 80 https tunnel between sites for site pairing (it is encrypted but travels on port 80 instead of 443 to make security easier)
  • 150 protection groups / 1000 protected vm's
  • A protection group can consist of multiple datastores if a virtual machine spans datastores

Wednesday, February 10, 2010

VMware PEX: Reliable vCenter Database - Operations, Management and Troubleshooting

I was finally able to attend a VMware Partner Exchange (PEX) session that I am allowed to discuss.  This session was Reliable vCenter Database - Operations, Management and Troubleshooting (TECHBC0330).  This is being written as I'm in the session so it might be a little messy.

  • vCenter at startup takes data from the Windows registry, the vpxd.cfg parameter file, and the vCenter database (VCDB).
  • The executable name of the service is vpxd
  • vpxd -p and -P are important because they are used to reset the password
  • Almost everything you do in vCenter requires interaction with the database.  For example:
    • to start a vm - reads the location in the db and sends the commands
  • If vCenter fails - VMotion and DRS will fail but hosts and vm's will continue to run
  • vCenter won't start with a corrupt or inaccessible database but it will run with an empty database
  • HA will be able to execute commands but won't have any "eyes" to see how to execute them
The workload that relies on the vCenter Server will determine how long vCenter can be down.  For example a farm containing mainly servers that aren't moving around and won't be restarted could go hours/days without a vCenter Server.  On the other hand in environments with View, SRM, etc. the lack of a vCenter server will be noticed very quickly because machines won't be able to be powered on.

Sizing and Location
  • You have the option of a physical or virtual machine and you can co-locate the VCDB or put the VCDB on a separate server.
  • Recommendations
    • vCenter and VCDB - virtual and co-located only up to 40 hosts or 400 vm's
    • if physical - must restart db's together, one could take down the other
    • The speaker recommended separate virtual vCenter and db servers with an anti-affinity rule to keep both vm's from being on the same host. (I'm not sure how I feel about that in the case of power failures)
    • If the database server is virtual, you can take a db backup by cloning the vm
Protecting the vCenter and VCDB's
  • Can be protected many ways (too much information too fast to list) but methods included physical rebuild, VMware HA, vCenter Heartbeat, MSCS, and FT
Registry Components

  • The DSN Information is held in the registry: HKLM\Software\VMwareInc.\VMware Virtual Center\DB 
  • Four objects of value under that: 
    • 1 = DSN Name 
    • 2 = Login ID 
    • 3 = password (encrypted) 
    • 4 = driver being used
Database Structure Components
  • The VCDB is mainly performance information (over 80% of the database typically)
  • The other information is the configuration of accounts and security information for vCenter
  • All db tables have the prefix VPX_ - It is NOT recommended to use the tables directly!

Sunday, February 7, 2010

Why VMware's VMmark Scores Have Become Useless

Well, it was bound to happen.  Every time an industry benchmark standard comes out, the manufacturers eventually figure out ways to "cook the books".  I've seen a LOT of FUD flying around from both HP and Cisco lately about VMmark scores and I have been asked a lot of questions about both platforms.  After taking a close look at the scores, I'm ready to throw in the towel.

Before I go further, take a look at the VMmark 8 core scores posted here.  You will see the Cisco B200 blade is on top right now (of the major vendors, I don't count Fujitsu, sorry Fujitsu) with 25.06 and the HP BL490 is next with 24.54.  A couple of points:

What is the difference between really really fast and really really really fast?

What is the difference between 25.06 and 24.54?  About 2%.  Honestly, not much if they both meet your needs and you won't be pushing them to their limits.  I'm sorry but that is within a margin of error and/or the test could be reconfigured by everybody to meet the score.  At the end of the day both of them will meet your needs very well and the title of "fastest blade" means nothing!
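
For anyone who wants to check the math, here is the gap between the two published scores worked out in Python.  The function name is just something I made up for the example.

```python
def percent_lead(leader, runner_up):
    """Percentage by which the leader's score exceeds the runner-up's."""
    return (leader - runner_up) / runner_up * 100


cisco_b200 = 25.06
hp_bl490 = 24.54
print(f"{percent_lead(cisco_b200, hp_bl490):.1f}%")  # prints 2.1%
```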

Both Cisco and HP sell "big memory solutions" but they are nowhere to be seen!

Take a look at the memory in the details for both of them.  Both the HP BL490 and the B200 use 96GB of memory.  Where is the B250 with the larger memory footprint?  Where is the BL490 with either 144GB or 192GB of memory?  You will also notice that the HP BL490 memory is running at 1333MHz and the B200 is running at 1066MHz.  Since there is no big jump in performance numbers, the VMmark score isn't memory bandwidth bound or HP would have had an advantage.  I suspect (although I don't have proof) that the VMmark score is now CPU bound and any memory above 96GB doesn't help the scores.  I further think (again, no proof) that the test isn't pushing the maximum memory bandwidth because there is no change from 1333MHz to 1066MHz.  It would be interesting to see if the drop to 800MHz by HP would be noticed in the scores.

Cisco is using an EMC SAN with SSD's on the back end!

Take a look at the EMC Storage section on the Cisco benchmark.  They are using an EMC CX-240 with SSD drives!  There is NOTHING wrong with this, SSDs are coming down in price, but they provide a clear, known advantage to the IOPS numbers that could easily be the sole reason for the roughly 2% increase.  I'm willing to bet that if HP used the same storage configuration, they would produce similar scores.

Why didn't Cisco use the Palo card?

Cisco is using the QLogic CNA for the tests.  Why didn't they use the Palo card?  I suspect because it isn't "technically" released yet but that is the benchmark everyone wants to know about.

What am I trying to say here?

What I'm saying is that both HP and Cisco make great products and they will go to great lengths to make the other look bad.  They are so close to each other from a VMmark score perspective that any clear difference can't be shown with the current test.  Don't make a purchase based on a score!

Saturday, February 6, 2010

HP Blades Offer a 16GB DIMM, With a Catch

I found out something interesting on Friday: HP is offering a 16GB DIMM in their blades!  My first thought was wow, that sucker is gonna be expensive (and it is!).  But when I started to dig deeper, as I always do, I found out something that is slightly disturbing.  The 16GB DIMM is actually a quad rank DIMM and not a dual rank DIMM and it is only 1066MHz speed.  Many of you are saying... So what?

Well, this actually can make a difference from a design perspective.  Take a look at the Memory tables in the BL460 and BL490 QuickSpecs and you will see what I mean.

HP BL460 G6 QuickSpecs
HP BL490 G6 QuickSpecs

Most of the DIMMs sold by the major vendors are dual rank and 1333MHz speed.  Let me explain the concept of ranks first.  According to the Intel Nehalem architecture, you can only have 8 ranks per memory channel.  Each memory channel holds either 2 or 3 DIMMs.  I have more information on memory layout in this article I wrote on Scott Lowe's site awhile back.  I never really discussed ranks because it was a limit you didn't hit.  You were only using either 4 ranks (2 x dual rank DIMMs on the BL460) or 6 ranks (3 x dual rank DIMMs on the BL490).  Quad rank DIMMs blow the math out of the water.  You can only put 2 on a memory channel before you hit 8 ranks.  This means the BL490 no longer brings extra memory capacity to the table.  Both the BL460 and BL490 top out at 192GB with the 16GB DIMMs.
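
The rank math is simple enough to sketch in Python.  The 8 rank limit and the DIMM slots per channel come from the discussion above; the three memory channels per socket and two sockets per blade are assumptions I'm adding for these Nehalem based blades, and the function names are made up for the example.

```python
MAX_RANKS_PER_CHANNEL = 8   # limit described above
CHANNELS_PER_SOCKET = 3     # assumption: three memory channels per socket
SOCKETS = 2                 # assumption: both blades are two socket servers


def max_dimms_per_channel(slots_per_channel, ranks_per_dimm):
    """DIMMs usable on one channel: limited by slots and by total ranks."""
    return min(slots_per_channel, MAX_RANKS_PER_CHANNEL // ranks_per_dimm)


def max_capacity_gb(slots_per_channel, ranks_per_dimm, dimm_gb):
    """Maximum memory for the blade with one DIMM type in every usable slot."""
    per_channel = max_dimms_per_channel(slots_per_channel, ranks_per_dimm)
    return per_channel * CHANNELS_PER_SOCKET * SOCKETS * dimm_gb


# Slots per channel: 2 on the BL460, 3 on the BL490.
print(max_capacity_gb(2, 4, 16))  # BL460 with 16GB quad rank DIMMs -> 192
print(max_capacity_gb(3, 4, 16))  # BL490 with 16GB quad rank DIMMs -> 192
print(max_capacity_gb(3, 2, 8))   # BL490 with 8GB dual rank DIMMs  -> 144
```

The third slot per channel on the BL490 only pays off with dual rank DIMMs.  With quad rank DIMMs the rank limit, not the slot count, is what caps the capacity.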

Memory speed is the next issue.  If you are running a memory bandwidth intensive application, you can expect about a 7% boost in performance by keeping the memory speed at 1333MHz instead of dropping down to 1066MHz.  Because the maximum speed of the 16GB DIMM is 1066MHz you will never reach a 1333MHz speed.  Furthermore, populating both slots in the memory channel (the max of 12 DIMMs) drops the speed from 1066MHz to 800MHz.  The performance drop from 1333MHz to 800MHz is over 30%!!  This leads to an interesting trade off of memory capacity vs. memory bandwidth.

While I applaud HP for thinking outside the box and bringing a 16GB DIMM to market, don't assume it is the same DIMM as the others.  Remember, "One of these is not like the others...."