Enterprise Datacenter Management Voodoo
Notes from VMworld: ignoring phase 0
Aug 31st
So far, the first two days of VMworld have exceeded my expectations. It wasn’t so much the sessions (though they were pretty good), and it wasn’t the partner super session. The super sessions basically verbalized what is on everybody’s mind right now. And that’s what makes it exciting: it’s the vibe in the air. Everyone knows that there’s a big change happening in the industry, and with this big change we all sense opportunities.
The cool thing about these opportunities is that the field is wide open. Certainly VMware has a giant lead on everyone, as solidified by the Magic Quadrant they’re hyping on their home page right now. And the way VMware is set up to let partners develop solutions on top of its products gives it a sizable advantage. However, there are other great things afoot, as exhibited by the vendors on the tech floor down below where I write this.
So here are the coolest things so far:
1. The labs: They took a risk and put the lab in the cloud. I remember trying to do something like this at IBM and thinking it opened up a whole new realm of possibilities. Now the labs don’t need to be given strictly at the conference. In fact there’s no reason that VMware couldn’t offer these labs outside of VMworld. A fantastic investment. They’ve had some snags: slow networks, machines not provisioning, etc. However, it is usable and they announced that 3800 labs were deployed.
2. Meeting people whose work we’ve been reading, and just hearing ideas from the people I’m sitting next to. This has been great and the ideas are just flowing. There are lots of brilliant people here that freely share ideas. I really dig this.
3. The ability to sign up and talk to technical experts: a 1 on 1 session for 15 min to get questions answered. This has been great.
4. Seeing all my clients, partners, and longtime friends. This is really what it’s all about: meeting the people, getting contacts and leads, and laughing about the sleepless 24-hour stretches and adventures we shared X years ago.
But there’s also a lot of FUD, marketing, and statements that I don’t agree with.
One statement I disagree with was made in the partner super session and has been repeated many times since. They say:
“virtualization is stage one of any cloud”.
I strongly disagree. Coming from an HPC background, I am adamant that virtualization doesn’t equal cloud. And for that matter, you do not even need virtual machines to have a cloud. Virtualization comes after you’ve got a handle on your data center. This includes switches, physical machines, etc. So if virtualization is stage 1, then stage 0 is getting control of hardware.
Stage 0 means a lights-out data center where it’s dark because people only go in there once in a while to replace failed components. Stage 0 also requires the ability to re-purpose hardware on demand. This is something we’ve been doing in xCAT for many years. Stage 0 means we can power physical machines off and on. We can deploy hypervisors, or native OSes without hypervisors, to physical machines over the network. This requires a centralized deployment engine. And all of this is the bottom layer that we at Sumavi and the open source xCAT community have been working on for many years.
This area of functionality cannot be trivialized, and VMware gets it. However, there is no product in its portfolio to hype other than some beta work, so the problem is largely ignored and relegated to “Look to your hardware vendor to provide this solution”. That ignores multivendor sites, which is pretty much everyone.
All of this has given me the feeling that what we are doing at Sumavi is becoming more and more important. Our partners and our customers have stressed the need for it. You can be sure to see more products in this space as time goes on. And you’ll certainly hear more of it at VMworld 2011.
xCAT for non-root users
Aug 27th
You have a user on your machine and you only want to let them do things like rinv, rvitals, and nodels. You don’t want them to be able to provision nodes, power them on or off, or do all those other awesome things that xCAT can do.
So what do you do?
Suppose your user name is ‘foobar’.
You do this:
1. Set up the policy table so that it contains the following: (tabedit policy)
#priority,name,host,commands,noderange,parameters,time,rule,comments,disable
"1","root",,,,,,"allow",,
"1.1","foobar",,"rinv",,,,"allow",,
"1.11","foobar",,"rvitals",,,,"allow",,
"1.12","foobar",,"nodels",,,,"allow",,
2. Set up the local cert for the user:
/opt/xcat/share/xcat/scripts/setup-local-client.sh foobar
You can allow other commands by adding more rows with another number, like 1.13. The numbers are arbitrary as long as each one is unique. They are the rule priorities: xCAT sorts the policy table on this column, and lower numbers take precedence.
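If you have more than a couple of commands to allow, typing the rows by hand gets tedious. Here is a minimal sketch that prints rows in the same format as the table above so you can paste them into tabedit. The function name policy_rows, the starting suffix 13, and the two example commands are my own choices for illustration, not anything xCAT ships.

```shell
#!/bin/sh
# policy_rows USER FIRST_SUFFIX COMMAND...
# Prints one policy-table row per command, numbering the priorities
# 1.<FIRST_SUFFIX>, 1.<FIRST_SUFFIX+1>, ... so that each row is unique.
# (Hypothetical helper: it only prints text and does not touch the xCAT database.)
policy_rows() {
    user=$1
    suffix=$2
    shift 2
    for cmd in "$@"; do
        printf '"1.%s","%s",,"%s",,,,"allow",,\n' "$suffix" "$user" "$cmd"
        suffix=$((suffix + 1))
    done
}

# Example: two more read-only commands for foobar, starting at 1.13
policy_rows foobar 13 nodestat lsdef
```

Check that the suffixes you pick don’t collide with rows that are already in the table before pasting them in.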
BitTorrent client on CentOS 5.5
Aug 24th
After looking everywhere for a BitTorrent client for CentOS 5.5 I found that the old archives on bittorrent.com provided a perfect match that had no prereq RPMs that I had to download. I got BitTorrent-4.1.3-1.noarch. Installed it with RPM, then ran it like so:
btdownloadgui.py
VMware API in Perl
Aug 20th
Here’s a simple example to connect to a hypervisor:
#!/usr/bin/perl
use Data::Dumper;
require VMware::VIRuntime;
VMware::VIRuntime->import();
use strict;

# try logging into a node:
my $conn;
my $hyp = shift || 'vhost31';
print "performing action on $hyp\n";
eval {
  $conn = Vim->new(service_url=>"https://$hyp/sdk");
  $conn->login(user_name=>'root',password=>'cluster');
};
Now that you’re connected, you probably want to do something. The best way is to go read the VMware API documentation; the Reference Guide seems to be the best. You have to do a lot of guessing since it isn’t written for any particular language. Hopefully I’ll be able to post more on using this later. If you want a huge example, you can look at the ESX plugin in the xCAT source tree. We do pretty much everything you could think of with it. Since it’s open source, you can use it however you want.
VMware ESXi 4.1 install using Western Digital USB Hard Drive
Aug 19th
Wish I had better news. It can’t be done. After disabling all that ‘Smartware’ software as explained on Western Digital’s website, the install still fails when it tries to do an mcopy to grab the kickstart file.
And after loading all you get to see is:
rescanning in 10 second(s), press
I copied everything onto a different drive and it worked fine. Moral of the story: Take back your WD USB hard drive and get one that’s less ‘smart’.
ESXi 4.1 Kickstart on xCAT
Aug 18th
I recently added the ESXi 4.1 base template kickstart file to xCAT. The code is checked in here. We’ve had the ability to do stateless ESXi 4.1 since it came out and we’ve been doing stateless ESXi 4.0 as well. But for some of our customers, we have needed a way to get the ESXi 4.1 server on the disk. This seems to be the most common way people want to install VMware ESX(i) these days. We hope in the future more people will go stateless. But for now, here is our xCAT ESXi 4.1 base kickstart file:
# Sample scripted installation file
# edited and updated by vallard@sumavi.com

# Accept the VMware End User License Agreement
vmaccepteula

# Set the root password for the DCUI and Tech Support Mode
rootpw --iscrypted #CRYPT:passwd:key=vmware,username=root:password#

# clear all partitions.
clearpart --alldrives --overwritevmfs

# Choose the first disk (in channel/target/lun order) to install onto
autopart --firstdisk --overwritevmfs

# The install media is on the network.
install url http://#TABLE:noderes:$NODE:nfsserver#/install/#TABLE:nodetype:$NODE:os#/#TABLE:nodetype:$NODE:arch#

# Set the network to DHCP on the first network adapter
#network --bootproto=dhcp --device=vmnic0
network --bootproto=dhcp

# reboot automatically when we're done.
reboot

# A sample post-install script
%post --interpreter=busybox --unsupported --ignorefailure=true
# tell xCAT management server we are done installing
# have to put in the IP address instead of the hostname because VMware
# ESXi 4.1 cannot resolve hostnames at this point...
echo "<xcatrequest>\n<command>nextdestiny</command>\n</xcatrequest>" | /bin/openssl s_client -quiet -connect #COMMAND: host #TABLE:noderes:$NODE:xcatmaster# | head -1 | sed 's/.*address//g' #:3001 2>&1 | tee /tmp/foo.log

# enable SSH on next boot:
%firstboot --interpreter=busybox --unsupported --level=47
sed -ie 's/#ssh/ssh/' /etc/inetd.conf #ssh is too nice not to have
Since this is an xCAT kickstart template, you see the #TABLE ... # and #COMMAND ... # tags in there. These are just cues for xCAT to look up the different attributes for each node so that it can customize this one template for use across the entire data center. The password, main HTTP server, and xCAT server are all stored in the xCAT database.
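To make the tag idea concrete, here is a rough sketch of the kind of substitution that happens to the install line in the template above. This is not xCAT’s actual code: the real server looks the values up in its database, while this hypothetical expand_table_tags function takes them as arguments, and the address, OS, and arch values in the example are made up.

```shell
#!/bin/sh
# expand_table_tags NFSSERVER OS ARCH
# Reads template text on stdin and replaces the three #TABLE:...# tags
# used by the "install url" line with the values given as arguments.
# (Illustration only; the values must not contain '|' or '&'.)
expand_table_tags() {
    sed -e 's|#TABLE:noderes:\$NODE:nfsserver#|'"$1"'|g' \
        -e 's|#TABLE:nodetype:\$NODE:os#|'"$2"'|g' \
        -e 's|#TABLE:nodetype:\$NODE:arch#|'"$3"'|g'
}

# The install line from the template, single-quoted so $NODE stays literal
line='install url http://#TABLE:noderes:$NODE:nfsserver#/install/#TABLE:nodetype:$NODE:os#/#TABLE:nodetype:$NODE:arch#'
printf '%s\n' "$line" | expand_table_tags 10.0.0.1 esxi4.1 x86_64
```

With those made-up values the line comes out as install url http://10.0.0.1/install/esxi4.1/x86_64, which is the sort of thing you should see in the generated per-node file.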
I have two scripts in here. The first is the %post. This script simply signals back to xCAT that it is done installing so that the next time it reboots, instead of reinstalling, xCAT will tell the node to boot to hard disk. This happens right after the install.
The second is the %firstboot script. Notice that I added --level=47 to the script. This is important because it tells the script when to run. If you look at /etc/vmware/init.d/init you’ll see the levels. Level 48 starts the networking. Before the networking starts, I want to enable SSH, so I just uncomment the line inside /etc/inetd.conf to allow SSH on boot. (Another thing you could do is just run /etc/init.d/TSM-SSH start.)
This template is stored in xCAT in /opt/xcat/share/xcat/install/esx/. You can have a node boot to it (provided the rest of xCAT is set up and copycds has been run) by doing the following:
nodeset <noderange> install=esxi4.1-x86_64-base
rpower <noderange> boot
or just:
rinstall <noderange>
Then the template is copied into the /install/autoinst/ directory, renamed to match the node, and all variables are substituted in. Then the PXE server and DHCP server are set to point the node at the file so it can grab it and install. This is in xCAT 2.5, which you can get now as the development release (make sure you grab the files at the bottom in the ‘Development Builds’ section).
Another thing that is fun to do with the ESXi kickstart file is to make a new VM as part of the kickstart install. Generally I recommend using an NFS server to store your VMs on, but there are cases where you just want them on the local drive. As part of the above kickstart file, the datastore1 partition is created. This is a place where you could now run the vim-cmds during post to create machines. This is easy to do during the firstboot section (you would probably do this at level 99) but not so easy to do in the %post section.
The problem with the %post section is that hostd isn’t running so none of the vim-cmds will work. So you have to start it. This can be done by running:
/etc/init.d/hostd start
But wait, there is another problem! The hostd command doesn’t return and hangs! So you have to use some magic (like creating a script to run it that forks off and returns) otherwise your %post hangs forever. (This is a total bug)
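The “magic” can be as small as pushing the command into a background subshell with its input and output detached. This is only a sketch of that idea, with a made-up helper name (start_detached) and sleep standing in for the hanging init script. I’m assuming the busybox shell in %post handles subshells and & the same way a normal sh does.

```shell
#!/bin/sh
# start_detached COMMAND [ARGS...]
# Runs the command in a background subshell with stdin/stdout/stderr
# detached, so the caller gets control back right away even if the
# command itself never returns.
start_detached() {
    ( "$@" </dev/null >/dev/null 2>&1 & )
}

# In %post the call would be: start_detached /etc/init.d/hostd start
# Here a sleep stands in for the init script that hangs.
start_detached sleep 5
echo "back in the script right away"
```

You would probably still want to pause for a bit after this before calling vim-cmd, since hostd needs time to come up. How long is something to find by experiment.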
Anyway, once you work around that, running commands like these seems to work:
/bin/vim-cmd solo/registervm /vmfs/volumes/datastore1/vm01/vm01.vmx vm01
/bin/vim-cmd vmsvc/power.on 16
But during %firstboot, you’ll have to register them again.
I hope to put more information on this as we go forward with it. I am happy that VMware has made this kickstart file for 4.1 and I can only see it improving over time. The more automation the better and with kickstart we can really automate everything we need.
xCAT Windows Installs
Aug 13th
While working at IBM I wrote an article about how to install Windows Server 2008 using xCAT. The cool thing about this procedure is that you’re using Linux to provision a Windows machine, using the native Windows installer. This isn’t like the other solutions where they just do something like partimage. We think there is still a lot more cool stuff that can be done, and from Sumavi’s perspective (my company) it’s just the beginning of what we’re going to be doing with Windows provisioning.
There are some common pitfalls when doing Windows installations with xCAT. Here they are:
1. Is Samba enabled? This seems to be the biggest issue, and the one I always forget. You’ll know this is your problem if the machine boots all the way to the command prompt and then reboots. My Samba configuration looks like this:
/etc/samba/smb.conf
[global]
    workgroup = MYGROUP
    server string = Samba Server Version %v
    security = share
    passdb backend = tdbsam
    load printers = yes
    cups options = raw

[install]
    path = /install
    public = yes
    writable = no
Once that’s in place, restart Samba and make sure it comes back up on boot:
service smb restart
chkconfig smb on
2. Do you have the drivers in your base WinPE image?
This is the hardest part. If Samba is up and you don’t have the drivers then you need to add them to your base WinPE image. I hope to write more on this later, but this is generally the big problem I run into.
3. Do you have drivers in your /install/drivers directory?
If the machine installs and reboots fine but then errors out, it’s because it can’t find the boot directory. The drivers in /install/drivers are for that reboot, and the script adds them in.
Usually once you get past these issues you can install Windows pretty easily. I hope to write another article on how to do this with the latest updates. Since I left IBM that document has been removed, so if you have troubles either post to the xCAT mailing list or drop me an email and I’ll be glad to see if I can help. We’re trying to make this easier.
How a real man makes a whiteboard
Aug 9th
I’ve been looking for a real whiteboard solution for a long time. The ones I usually run across that look good to me cost $100 or more. I’ll grant that those are great quality, but that’s more than I need. So I was happily surprised two weeks ago while talking with the good people over at Rocky Mountain SuperComputing Center about what to do about this. They told me all I had to do was go to Home Depot and get one of their 8×4 foot melamine boards. I went with the kids to check it out, and what I found was indeed an 8×4 foot whiteboard that was perfect for my room. So, to my wife’s horror, I dragged this thing up the stairs, hacked a few inches off of it, and proudly put 6 screws through it into my wall.
Total cost? $11.43. (screw cost not included) So naturally I looked at this and thought: I need more whiteboard!
So next Saturday when I get some time, I’ll be going back to the Home Depot and getting 3 more of them and it will cover my entire wall. Total cost for an entire wall: ~ $44. You can’t even buy a standard good whiteboard for that much money.
The benefits are endless:
- Teach my kids that it’s OK to write on walls.
- Late night math equations
- No more lost lists of feature requests.
Oh, and here’s the best part: If it gets old and stained, and I need to replace it? No problem, just take the old stuff down and turn it upside down and build a half-pipe out of it and I’ll just skate on it.
There really is nothing that makes a house a home like a wall of whiteboard to write on.
Creating USB key to install VMware ESXi 4.1
Jul 20th
Many servers don’t have DVD or CD drives, so installing ESXi 4.1 from a CD is not optimal. Sure, you could always buy a USB DVD drive, but why not try to be cool and do it with a USB stick instead? I’m actually a bigger fan of doing things through PXE. Network installs are the ideal way to manage datacenters. However, if you’re a roving IT guy like me going from site to site, you can’t always do network installs. An amazing number of the IT shops I have run into shun the idea of network installs. So, for those still in the dark ages, or for those who are on the road a lot, here is how we do it with a USB stick in 10 easy steps.
1. Get the VMware ESXi 4.1 ISO image from VMware.com.
2. Open the image. On my mac, I just click on the image and it opens it for me. On Linux you could do a loop back mount:
mkdir /media/ISO
mount -o loop VMware-VMvisor-Installer-4.1.0-260247.x86_64.iso /media/ISO
If you have Windows, I’m sure you can use your favorite search engine to find a way to do it, but the rest of this tutorial is in Linux.
4. Now get a USB stick. You need to create a large enough W95 FAT32 partition on it and make it bootable. I do this through fdisk:
fdisk /dev/sdc   (or whatever it shows up as)
d      (delete all partitions)
n      # new partition
p      # primary partition
1      # 1 is the partition number.
1      # the first cylinder
+300M  # the size
a      # toggle bootable flag
1      # make partition 1 bootable
t      # change the type
1      # of partition 1
b      # partition type W95 FAT32
w      # write it out
5. Now you need to format it:
mkfs.vfat -n BOOT -F 32 /dev/sdc1
6. Now we need to use syslinux and make it bootable. I do this on Linux like this:
syslinux -s /dev/sdc1
dd if=/usr/lib/syslinux/mbr.bin of=/dev/sdc # note that this is sdc not sdc1
7. Mount the USB stick and copy all the files to it:
mkdir /media/USB
mount /dev/sdc1 /media/USB
cp -r /media/ISO/. /media/USB/
8. Now you have to get rid of the isolinux stuff:
cd /media/USB
rm -rf isolinux.bin
mv isolinux.cfg syslinux.cfg
9. At this point you should be able to umount the USB drive, stick it in a server, boot from it, and start the installer. The problem (in my opinion) is that the installer is hard coded to look for the CD-ROM, so it will error out saying that it can’t find the installation media. This is pretty lame. But that’s OK because I want to automate this anyway. The answer is to make a kickstart file that tells it where to go. So let’s edit syslinux.cfg and add a kickstart file. Both files go in /media/USB, where our USB stick is mounted.
The modified syslinux.cfg file:
Here we simply add the ks=usb argument. This tells it to use kickstart and that the kickstart file is found on the USB drive.
default menu.c32
menu title VMware VMvisor Boot Menu
timeout 80

label ESXi Installer
menu label ^ESXi Installer
kernel mboot.c32
append vmkboot.gz ks=usb --- vmkernel.gz --- sys.vgz --- cim.vgz --- ienviron.vgz --- install.vgz

label ^Boot from local disk
menu label ^Boot from local disk
localboot 0x80
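If you build these sticks often, the ks=usb edit can be scripted instead of done by hand. The sketch below assumes the stock append line starts with "append vmkboot.gz ---", which is what the modified file above looks like with ks=usb taken out. The function name add_ks_usb is made up.

```shell
#!/bin/sh
# add_ks_usb
# Reads a syslinux.cfg on stdin and inserts "ks=usb" right after
# vmkboot.gz on the append line. Lines that already carry ks=usb no
# longer match the pattern, so running it twice is harmless.
add_ks_usb() {
    sed 's/^\([[:space:]]*append vmkboot\.gz\) ---/\1 ks=usb ---/'
}

# Example on a single line instead of a real file
printf '%s\n' 'append vmkboot.gz --- vmkernel.gz --- sys.vgz' | add_ks_usb
```

On the real stick you would feed it the file and write the result to a temporary file first, then move that over /media/USB/syslinux.cfg.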
The Kickstart file (ks.cfg)
My simple kickstart file (ks.cfg) just looks like this:
vmaccepteula
rootpw cluster
autopart --firstdisk --overwritevmfs
install usb
network --bootproto=static --ip=192.168.70.76 --gateway=192.168.70.1 --hostname=sumavihv --device=vmnic0 --nameserver=192.168.70.1 --netmask=255.255.255.0
10. There, now you’re done. Unmount the USB key, put it in the server, and it will install ESXi 4.1 from the USB key without any prompting. Fun in 10 easy steps!
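Since the IP address and hostname are baked into ks.cfg, installing a second host from the same stick means editing that file. A small generator saves the typing. This is just a sketch: make_ks is a made-up name, the gateway, nameserver, netmask, and root password are copied from my ks.cfg above, and the second host’s address and name in the example are invented.

```shell
#!/bin/sh
# make_ks IP HOSTNAME
# Prints a ks.cfg like the one above with the given static IP and
# hostname filled in. Everything else is fixed to the values from the
# example file and would need changing for another network.
make_ks() {
    ip=$1
    name=$2
    cat <<EOF
vmaccepteula
rootpw cluster
autopart --firstdisk --overwritevmfs
install usb
network --bootproto=static --ip=$ip --gateway=192.168.70.1 --hostname=$name --device=vmnic0 --nameserver=192.168.70.1 --netmask=255.255.255.0
EOF
}

# Example: print the file for a second (invented) host
make_ks 192.168.70.77 sumavihv2
```

Redirect the output to /media/USB/ks.cfg while the stick is mounted, then unmount it and boot the next server.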
Simple VMware API script to list vms on a Host
Jul 8th
I’ve been getting back to playing with the VMware API and I’ve completely forgotten everything, so I’m starting off simple. Here is a simple script to connect to a host and to list the names of the VMs that are on it:
#!/usr/bin/perl
use Data::Dumper;
require VMware::VIRuntime;
VMware::VIRuntime->import();
use strict;
# try logging into a node:
my $conn;
my $hyp = 'vhost31'; # you can make this an option to pass in as well.
eval {
  $conn = Vim->new(service_url=>"https://$hyp/sdk");
  $conn->login(user_name=>'root',password=>'cluster'); # placeholder credentials; change these for your host
};
# make sure the connection worked
if($@){
  print "Couldn't connect: $@\n";
  exit 1;
}
my $entity_views = $conn->find_entity_views(view_type => 'VirtualMachine');
foreach my $ev (@$entity_views){
  print $ev->name . "\n";
}
$conn->logout();
That’s pretty much it. Notice that I just printed the name. There are a lot of other things you could print as well on it if you wanted to. Just do a print Dumper($ev) and you’ll see the possibilities.