Enterprise Datacenter Management Voodoo
Linux
More notes on ESXi 4.1 Kickstart
Aug 13th
ESXi 4.1 kickstart is adequate for most things but I still have several issues with it that I consider ‘bugs’:
1. If you’re not connected to a network, it doesn’t work. This is fine since most people will be on a network with VMware anyway right? Fine, I’ll let this one slide. But if you just have a machine and a usb stick, then why do you need the network? Sure you’ll have it eventually but I just want to test it on my server on my desk…
2. The kickstart file likes to stop and give you alerts even if everything is ok. As an example: In the post install script if I don’t put the interpreter it stops and gives me a note: “Interpreter not specified, using busybox” That’s fine, that’s what the default is. Why stop me? The docs state clearly that the default is busybox.
3. Name resolution doesn’t work in postscripts. If you’re trying to get information from other hosts, it doesn’t work. Forget it. Just put in the IP address in your post install script.
4. USB installations without kickstart don’t work. You need to have a CD/DVD image. This is lame. In an era where most servers I deal with don’t have DVD roms, why make me buy a usb DVD drive? A $10 usb stick should do this just fine.
5. Lack of mount support. This kills me. I want to have a USB drive boot up ESXi 4.1 in kickstart and then boot up with a virtual machine. Problem is my virtual machine is 60GB. After digging around, I see that ESXi 4.1 can get files from a FAT32 filesystem by using the mcopy command. (It doesn’t do a mount.) But what I really want is ext3 support so that I can copy 60GB files onto a hard disk. I’m thinking about hacking an ext3 driver for busybox, but I don’t know how difficult that will be. Right now, my only option seems to be breaking up my disk image into 2GB chunks so they fit on the FAT32 partition… lame. Anyway, please don’t tell me to consider NFS and all that stuff, because I know that’s the optimal solution. This project is a little different than what you may be thinking of.
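For what it's worth, the chunking workaround from item 5 can be sketched with nothing but split and cat. This is a minimal sketch, assuming the splitting happens on a regular Linux box before the chunks go onto the USB stick. The function names, file names and chunk size are placeholders I picked, not anything the installer provides.

```shell
#!/bin/sh
# Sketch: break a big disk image into FAT32-friendly chunks and rejoin them.
# split_image and join_image are hypothetical helper names.

# split_image FILE CHUNK_SIZE OUT_PREFIX
# Writes OUT_PREFIXaa, OUT_PREFIXab, ... each at most CHUNK_SIZE bytes.
split_image() {
    split -b "$2" "$1" "$3"
}

# join_image OUT_PREFIX DEST
# The shell expands the glob in sorted order, which matches the order of
# split's suffixes (aa, ab, ac, ...), so the chunks are joined in sequence.
join_image() {
    cat "$1"* > "$2"
}
```

For example, `split_image vm-disk.img 2000m chunk.` on the build machine, then `join_image chunk. vm-disk.img` once the chunks have been copied to the destination disk. Whether cat and enough scratch space are available at that point of the install is something to check on your own setup.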
Anyway, I don’t want to keep complaining, so here are some nice things:
Enable SSH on the TSM.
We want SSH on our machines, even if it’s not supported. So we add this to our kickstart file:
%firstboot --interpreter=busybox --unsupported --level=47
sed -ie 's/#ssh/ssh/' /etc/inetd.conf
Mounting USB drives
As I mentioned you can’t mount USB drives on ESXi 4.1. (At least I haven’t figured it out yet). You can do passthrough with the USB drives so that the VMs can mount them, but you can’t actually mount it on the hypervisor.
However you can copy files from the FAT32 partition. Here is an example of a command to use in a kickstart file:
mcopy -i /dev/disks/mpx.vmhba32:C0:T0:L0:1 \::IMAGEDD.BZ2 /install_cache/IMAGEDD.BZ2
(In fact, this is the exact command used by the installer to grab the bz2 image from the FAT32 partition.)
So if you had a file named foo on there, you could substitute it for the IMAGEDD.BZ2 file name and copy it onto your hypervisor. I would do this for copying *.vmx files or things like that.
There’s one catch: the mcopy command is available during installation, but after the reboot there is no mcopy command! So if you want it, a good idea is to have the kickstart file copy it to some place where you can still get at it after the install.
Anyway, happy VMware VMworld to all you who are going.
opening VNC from behind a firewall
Jul 14th
Here is the cast of characters:
1. blopr: A server behind a company firewall whose VNC session I want to view.
2. netnet: A server that is on the internet that I have access to.
3. Me: The humble system admin who wants to view the VNC session on blopr.
Here is how I do it:
on Blopr:
vncserver :99 -depth 24 # and whatever other arguments you want to have.
ssh -R 5999:localhost:5999 root@netnet.example.com
On NetNet:
redir --lport=5989 --cport=5999 --caddr=127.0.0.1
On yours truly’s humble MacBook Pro:
vncviewer netnet.example.com:89 # enter the password for blopr's vnc session
Presto! You are in there my friend!
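In case the port numbers look like magic: VNC display :N listens on TCP port 5900+N, so display :99 on blopr is port 5999, and asking vncviewer for netnet.example.com:89 connects to port 5989, which redir hands over to 5999. A tiny sketch of that arithmetic (the function name vnc_port is just something I made up):

```shell
#!/bin/sh
# vnc_port DISPLAY_NUMBER
# Prints the TCP port a VNC server uses for the given display number.
vnc_port() {
    echo $((5900 + $1))
}
```

So `vnc_port 99` gives the port to put in the ssh -R line, and `vnc_port 89` gives the --lport value for redir.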
Bonus for you to try: Suppose only SSH is allowed out from blopr? This is left as an exercise for the reader, but the trick is very similar.
Server to Switch Port mapping with SNMP
May 20th
Let’s say you have a switch. And you are on a computer that is connected to the switch. You know the following:
- The name of the switch
- The SNMP community string and version the switch uses.
But, you’re too lazy to walk over to the data center to figure out which port you’re connected on.
Here’s what you do (assuming you’re running Red Hat or CentOS)
yum -y install net-snmp-utils
This gives you the snmpwalk command.
Now:
first, figure out your MAC address:
[root@n33 etc]# ifconfig eth0
eth0      Link encap:Ethernet  HWaddr 00:A0:D1:E9:3E:DC
          inet addr:10.3.0.133  Bcast:10.3.255.255  Mask:255.255.0.0
          inet6 addr: fe80::2a0:d1ff:fee9:3edc/64 Scope:Link
          UP BROADCAST RUNNING MULTICAST  MTU:1500  Metric:1
          RX packets:972965 errors:0 dropped:0 overruns:0 frame:0
          TX packets:637686 errors:0 dropped:0 overruns:0 carrier:0
          collisions:0 txqueuelen:1000
          RX bytes:587009797 (559.8 MiB)  TX bytes:100870077 (96.1 MiB)
          Memory:fcde0000-fce00000
So my MAC address is 00:A0:D1:E9:3E:DC
Nice.
Now let’s look for this MAC address on the switch. Assuming my community string is ‘foobar’, my switch is set up to do SNMP version 1, and my switch name is switch1, I do:
snmpwalk -v 1 -c foobar switch1 SNMPv2-SMI::mib-2.17.4.3.1.1
When you do this, you’ll see a list of the nodes. Among them should be a Hex-STRING that matches your MAC address:
...
SNMPv2-SMI::mib-2.17.4.3.1.1.0.160.209.233.62.220 = Hex-STRING: 00 A0 D1 E9 3E DC
...
Notice that the string:
0.160.209.233.62.220
Is the decimal representation of the MAC address:
00:A0:D1:E9:3E:DC
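If you don't feel like converting hex to decimal in your head, a small shell function can do it. This is a minimal sketch; the function name mac_to_oid_suffix is hypothetical, and it assumes a six-octet MAC separated by colons or dashes.

```shell
#!/bin/sh
# mac_to_oid_suffix MAC
# Turns a MAC like 00:A0:D1:E9:3E:DC into the dotted-decimal form that
# shows up at the end of the SNMP OIDs.
mac_to_oid_suffix() {
    # sed rewrites the MAC as "0x00 0xA0 0xD1 ..." and printf prints each
    # hex value as a decimal number, joined with dots.
    printf '%d.%d.%d.%d.%d.%d\n' $(echo "$1" | sed 's/[:-]/ 0x/g; s/^/0x/')
}
```

Running `mac_to_oid_suffix 00:A0:D1:E9:3E:DC` prints the same 0.160.209.233.62.220 string shown above.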
So you have verified one important piece of information: You are in fact connected on this switch!
Now figure out which port!
We use the decimal representation of the MAC address from this point on to find the port. Usually for me this works:
snmpwalk -v 1 -c foobar switch1 SNMPv2-SMI::mib-2.17.7.1.2.2.1.2
Among the output I get:
SNMPv2-SMI::mib-2.17.7.1.2.2.1.2.413.0.160.209.233.62.220 = INTEGER: 5
This tells me that my node is connected to port 5 on the switch.
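To avoid eyeballing a long snmpwalk listing, the output can be filtered for the line that ends in your decimal MAC. A rough sketch, assuming the output looks like the line above; find_port is a made-up helper name:

```shell
#!/bin/sh
# find_port DECIMAL_MAC
# Reads snmpwalk output on stdin and prints the INTEGER value (the port)
# from the line whose OID ends in the given dotted-decimal MAC.
find_port() {
    # Escape the dots so grep matches them literally.
    pattern=$(echo "$1" | sed 's/\./\\./g')
    grep "\.$pattern = INTEGER: " | sed 's/.*INTEGER: //'
}
```

Used like this: `snmpwalk -v 1 -c foobar switch1 SNMPv2-SMI::mib-2.17.7.1.2.2.1.2 | find_port 0.160.209.233.62.220`. If nothing is printed, the MAC was not in that table.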
Sometimes (on Cisco switches and others) you may need to use a different approach for the MAC-to-index values:
snmpwalk -v 1 -c foobar switch1 SNMPv2-SMI::mib-2.17.4.3.1.2
This is great, in that I never had to leave my chair to figure this out! The pounds pile up and my largeness increases. Anybody got a better/easier way to do this?
KVM with RHEL5.4
Oct 29th
Installing KVM with RHEL5.4 is pretty easy. These were some of my notes that I hope you can somewhat follow along with. I have since done it with RHEL5.5 with the same results. Here’s how I got my virtual machine up:
1. Install Packages
yum -y install kvm python-virtinst libvirt libvirt-python virt-manager virt-viewer
** Note: I got some of these packages from xCAT’s distribution which has some updates. But some of these should still work.
2. Start libvirt
chkconfig --level 345 libvirtd on
service libvirtd start
3. Configure Bridge Network
In my setup I want my virtual machines to be able to access the network. I also want machines on the network to access my virtual machine. Here, eth1 is on the internal network to my cluster. What I will do is create a bridge with eth1. I also need to make a few aliases to handle my different networks.
/etc/sysconfig/network-scripts/ifcfg-eth1
# Intel Corporation 80003ES2LAN Gigabit Ethernet Controller (Copper)
DEVICE=eth1
HWADDR=00:15:17:85:A8:CD
ONBOOT=yes
BRIDGE=br0
/etc/sysconfig/network-scripts/ifcfg-br0
DEVICE=br0
BOOTPROTO=static
ONBOOT=yes
TYPE=Bridge
IPADDR=172.20.0.1
NETMASK=255.255.0.0
/etc/sysconfig/network-scripts/ifcfg-br0:1
DEVICE=br0:1
IPADDR=172.29.0.1
NETMASK=255.255.0.0
ONBOOT=yes
/etc/sysconfig/network-scripts/ifcfg-br0:2
DEVICE=br0:2
IPADDR=172.30.0.1
NETMASK=255.255.0.0
ONBOOT=yes
4. Create Virtual Machine
virt-install --name xcatmgr --ram 1024 --connect qemu:///system --disk path=/install/libvirt/images/xcat.img,size=10 --vnc --cdrom=/install/isos/RHEL5.4-Server-20090819.0-x86_64-DVD.iso -b br0 --os-variant=rhel5
4.2 Copying Virtual Appliance
I created an xCAT appliance that I wanted to run:
modprobe kvm
modprobe kvm-intel
service libvirtd restart
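Which hardware module to load depends on the CPU: Intel chips advertise the vmx flag and use kvm-intel, AMD chips advertise svm and use kvm-amd. Here is a small sketch for picking the right one, assuming the flags are read from /proc/cpuinfo. The function name kvm_module is hypothetical.

```shell
#!/bin/sh
# kvm_module [CPUINFO_FILE]
# Prints the KVM hardware module matching the CPU flags, or returns 1 if
# no hardware virtualization flag is present.
kvm_module() {
    cpuinfo=${1:-/proc/cpuinfo}
    if grep -qw vmx "$cpuinfo"; then
        echo kvm-intel      # Intel VT-x
    elif grep -qw svm "$cpuinfo"; then
        echo kvm-amd        # AMD-V
    else
        return 1            # no flag found; it may be disabled in the BIOS
    fi
}
```

Something like `modprobe "$(kvm_module)"` would then load whichever module fits the box.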
Then import my machine:
virsh define /install/xcatapp/xcatmgr.xml
(or just copy it to /etc/libvirt/qemu and restart libvirtd)
From there it just booted up, launched virt-viewer and away it went. I’m doing this on 1TB SATA disks, and the performance is just horrible.
Even though I did this all with RHEL5.4, this post was very helpful as well.
Now, I also want my nodes to bridge directly to the physical network.
service libvirtd stop
ip link set virbr0 down
brctl delbr virbr0
Create br0 with /etc/sysconfig/network-scripts and bind it to eth1, then make the new net:
brctl setfd vlan1 0
ip addr add dev vlan1 172.20.1.19/16
brctl addif vlan1 eth0
ip addr del dev eth0 172.20.1.19/16
ip link set vlan1 up
virt-viewer xcatmgr
Other good notes:
http://wiki.centos.org/HowTos/KVM#head-c02a0b33e7949b0bc3b151ac6e0bdfb91b6bbd1c
redir and ssh forwarding
Oct 19th
Here’s the situation:
You have a machine called skull that has access to the internet. However, no one can come into skull from the outside.
You also have a machine that is on a private network with skull called bones.
Finally, you have a third machine out on the internet named benincosa.org from which you want to access bones.
To make this happen, you use an SSH backdoor plus redir to set it up. Here’s how it’s done:
1. On skull: ssh -R 2222:localhost:2222 benincosa.org
2. On skull: redir --lport=2222 --cport=22 --caddr=bones
3. Now from anywhere: ssh -p 2222 benincosa.org and enter the passwd for bones and you will magically find yourself on bones.
That is how it is done my friends.
Another case:
from internal firewall machine:
ssh -R 2222:localhost:22 vallard@benincosa.org
On Benincosa.org run:
redir --lport=2223 --cport=2222 --caddr=127.0.0.1
IBM tape drive install
Oct 19th
I had to set up an old x440 with a SCSI-attached tape drive. I did it like this:
1. Go here:
ftp://ftp.software.ibm.com/storage/devdrvr/Linux/lin_tape_source-lin_taped/
Grab IBMtapeutil, lin_tape-1.*, and lin_taped*
2. Make sure you install dependencies: rpm-build, gcc
Easy way: yum -y install rpm-build gcc kernel-devel
3. rpm -ivh lin_tape-1.27.0-1.src.rpm.bin (why they left the bin extension is beyond me)
4. cd /usr/src/redhat/SPECS
5. rpmbuild -ba lin_tape.spec
6. cd ../RPMS/x86_64/
7. rpm -ivh lin_tape*rpm
8. Now install the lin_taped rpm you downloaded:
rpm -ivh lin_taped-1.27.0*rpm.bin
9. Install the IBMtapeutil tarball you downloaded:
tar xf IBMtapeutil.1.5.1.rhel5.x86_64.tar.bin
cd IBMtapeutil.1.5.1
make install
That’s it!
Configuration Management
Sep 29th
Last week I researched a few different configuration management tools. Configuration Management is the art, or act of managing lots of computers in some organized fashion. The act of managing a computer involves what is put on the machine as far as software and also figuring out permissions, environmentals etc. The problem isn’t complex when you deal with maybe 1 or 5 machines. However, when you have a cluster, or a cloud, then having a good way to manage them all becomes very important.
In the world I came from, High Performance Computing, the job was a bit easier because every machine was identical. Every ‘node’ did the same thing. The only differences were the IP address, MAC address, and hostname. We never did any management other than the initial install plus some postscripts to make sure the nodes were configured perfectly. We could spend a few good solid days making sure our postscripts were perfect. That way if a machine died, or a new one needed to be added, installing it was trivial. Because of this we never needed any post-install configuration management. In addition the packages required were rather simple because a lot of the required files, libs, and programs were contained on the distributed file system (NFS, GPFS, or some other way).
Another point to all this is that we usually kept our nodes ‘stateless’, or in other words ‘ram-root’ as it is called. Ram-root just means that the entire operating system resides in memory. You may say “wow, that’s a lot of memory” but keep in mind, the entire OS for HPC environments, including the memory-hogging InfiniBand modules, could fit in an image of less than 200MB. So when your modern Nehalem machines are usually equipped with 24GB of RAM, what is a measly 200MB? Plus your system runs better because it’s only doing what you want. This is all made possible via xCAT.
But, I digress. The world of cloud computing is different. There are different OSes, different applications, and we’re dealing with a very heterogeneous environment. Thus configuring the software on all of these machines is not as trivial of a problem. It’s no longer just one image that you need to be concerned about – it’s many!
Rather than creating my own, (which is never a good idea when there are so many great solutions available), I went to take a look at what was out there.
The most promising that I saw were:
- Bconfig (bcfg2)
- cfengine
- puppet
Nevertheless, let me give some info on what I found:
cfengine
This tool was created by Mark Burgess. There is an interesting talk he gave to Google that is available on YouTube here. cfengine seems to be the most venerable and developed, but it seems from the mailing lists I’ve read that it has lost its luster in favor of puppet.
Puppet
Puppet seems to be what all the cool kids are using these days. The web site is very well developed, and the documentation seems to be well organized and far better than cfengine’s or anything else I looked at. This really impressed me: If you want to make a good open source tool that everyone uses you need to do two things right:
1. You have to present it well on a web site with clear documentation, customer testimonials, and all kinds of good information.
2. You have to make it easy to get, install, and use. IT is too complicated these days. No one wants to spend hours learning something. The easier you make it to use, the more successful it will be.
Puppet may not be better than cfengine (though I think they think it is) and it may not be better than bcfg2. But the presentation is worlds better, and that makes people want to use it. It invites you to use it. xCAT can take a page from that and it’s made me want to double my efforts in revamping the web page.
This shouldn’t be a surprise either. After all, this is what Apple does. They’re a marketing company. Presentation is everything. A good presentation, a good feel, and ease of use will make a tool stand out, even if it isn’t that much better than the rest in the pack.
Part of the marketing is that the person who started puppet used to code vigorously for cfengine, adding lots of modules before striking out on his own. This gives people the idea that puppet is the next generation of cfengine. It’s a good story. The ease of use is there, and so just on that alone, I can see why it’s all the rage nowadays.
bcfg2
bcfg2 or ‘bconfig’ seems to be the lone wolf of the pack. Its web site even mentions that it doesn’t get as much press as it probably should. Well, what do you expect? This is a national lab full of unsexy engineers. (no offense guys/gals). They’re engineers developing tools. Having said that, Ti Leggett and I spoke and he showed me all the cool things bcfg2 could do. The modules in there seemed very cool as well as the client/server implementation.
My decision
So where does this leave me? Which one do you choose? Well, I hate to say it, but in my situation, I was looking for a solution that could handle an NFS root boot up. It was apparent that they could all handle this in a postscript bring up, but the solutions seemed to fall short when we got a little more specific:
Consider the case of an organization that wants their images locked down (meaning NFS root where nearly everything is read-only and can’t be touched). This could be a large global organization, so /etc/resolv.conf in a lab in Spain isn’t going to be the same as one in Montreal, even though they’re all using the same installation source. Nevertheless you want /etc/resolv.conf to boot up as a non-writable file, preferably NFS mounted. Sure the user could unmount the file and then change it as root, however no changes they make would stick.
It was a situation such as that where I couldn’t make use of these tools. Perhaps someone knows of a way to do it, but it seems to me that such a tool would need to be integrated into the creation of the ram disk. In addition this global traversing would have to go through a hierarchy of directories:
/foo/globalfiles/
/foo/usafiles/
/foo/newyorkcity/
/foo/datacenter3/
All of these directories may contain an /etc/resolv.conf or SSH known-hosts keys that have to be integrated and concatenated down. Perhaps we could look at it from an object perspective instead and this would allow us to see if a node belongs to a particular class. If so how do you establish the hierarchy? It didn’t seem to me that the above tools could handle that. Maybe I’m wrong.
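To make the “concatenated down” idea concrete, here is a rough sketch of walking such a hierarchy from most general to most specific and gluing together whatever fragments exist. It is only an illustration of the idea, not something any of the tools above do for you. The function name merge_hierarchy is made up, and plain concatenation is just one possible policy (for some files you would want the most specific copy to win instead).

```shell
#!/bin/sh
# merge_hierarchy RELATIVE_PATH DIR...
# Prints the contents of RELATIVE_PATH from each DIR that has it, in the
# order the directories are given (global first, most specific last).
merge_hierarchy() {
    rel=$1
    shift
    for dir in "$@"; do
        if [ -f "$dir/$rel" ]; then
            cat "$dir/$rel"
        fi
    done
}
```

For the layout above that would be something like `merge_hierarchy etc/resolv.conf /foo/globalfiles /foo/usafiles /foo/newyorkcity /foo/datacenter3`, with the output written into the image while the ram disk is being built.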
But I think like a lot of other people I would go with Puppet. Not because it’s technically better but because the crowd mind would look like this:
1. If everyone’s doing it, then it’s going to stick around and I’m not wasting my time learning a dying tool.
2. With all this documentation it’s so easy to learn that it’s not going to take me a long time.
Thus we see, my friends, my point: Sexy wins.
php system log in
Aug 27th
I’m trying to make a php program that will authenticate users based on if they have a userid on the system. In my environment my system has a number of users who can just ssh into the machine if they want. But I am trying to make some applications available via a web interface.
I first saw this link and tried to explore the posix_getpwnam function. This looked promising, but unfortunately Linux puts the password in /etc/shadow so you can’t parse the hash that’s returned to do pattern matching.
I also didn’t want to change any file permissions on the system. I saw a number of posts that suggested that approach.
So I just did a simple expect script:
lib/passwdV.expect
#!/usr/bin/expect
log_user 0
set argc [llength $argv]
if { $argc != 2} {
    puts "Usage: $argv0 \[userid\] \[password\]"
    exit 1
}
set user [lindex $argv 0]
set password [lindex $argv 1]
spawn su $user -c true
expect "Password:"
send "$password\r"
expect {
    "su: incorrect password" {
        exit 1
    }
}
exit 0
Now I just take my calling script and pass the parameters in:
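// Escape both arguments so shell metacharacters in the input can't inject commands
exec("lib/passwdV.expect " . escapeshellarg($user) . " " . escapeshellarg($passwd), $output, $rc);
if ($rc == 0) {
    echo "You are logged in!!\n";
} else {
    echo "Login Failed!\n";
}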
You can then take the user and set the session logged in portion if you want. This is how I did it. This way I don’t need to store user information in a database since it’s already part of the system.
There is still the security concern that when this exec happens the ps -ef output will actually print the password and userid. A better way would be to encode this. I hope to get back to this soon and fix it.
Update RHEL4 AS U 4 to CentOS 5.3
Aug 25th
I found this great migration guide here on how to go from RHEL4 to CentOS 5.3. The article is actually older and only covers going to CentOS 5, but the concepts were nearly the same, with just changes in the versions of the packages.
Some deviations:
1. Copy the Base Media DVD to the RH4 machine first. Then expand it and have it ready so that the machine can get the RPMs it needs. There will be a period where you can’t ssh into the machine as it will be in a very funky state (as is to be expected). It’s also a good idea to have the RH4 machine NFS mount a working CentOS 5.3 machine so that you can copy over binaries if needed.
2. Follow the directions listed here. When you get to the list of RPMs to put in the Updates section add a few more RPMs:
sqlite-3.3.6-2 glib2-2.12.3-2.fc6.x86_64 nspr-4.7.3-2.el5 nss-3.12.2.0-4.el5 python-iniparse-0.2.3-4.el5.noarch.rpm yum-metadata-parser-1.1.2-2.el5
After that, you may have to add the --force flag to install all the Updates:
rpm -Uvh *.rpm --nodeps --force
3. The updates part is pretty good. I had to remove the following RPMs:
rpm -e VFlib2 autofs kudzu hal lksctp-tools initscripts dmraid
4. Basically everything else works. I also recommend that you install ‘strace’ as you can then see what is wrong. For example, I had to find all those extra packages by running strace to see which library was missing. I also found out that I needed to add the glibc2 package thanks to this handy post here.
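Here is roughly how the strace trick can be boiled down. It is a sketch that assumes strace prints failed opens with ENOENT and the path in double quotes; missing_libs is a hypothetical helper name. Keep in mind the dynamic loader probes several directories for each library, so the list is only a set of candidates: a library that shows up here may still be found in a later directory.

```shell
#!/bin/sh
# missing_libs
# Reads strace output on stdin and prints the unique .so paths that the
# traced program failed to open (ENOENT).
missing_libs() {
    grep 'ENOENT' | grep -o '"[^"]*\.so[^"]*"' | tr -d '"' | sort -u
}
```

Usage would be along the lines of `strace -f yum --version 2>&1 | missing_libs`, since strace writes its trace to stderr.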
After that, reboot the system and all will come back up and you’ll be running!
xCAT and RHAS4 update 4
Aug 25th
Recently I had to add support for xCAT to install Red Hat 4 update 4 for a project I was working on. I’m using development versions of xCAT 2.3 before all the xnba stuff got put in, so it’s a real hodgepodge of code.
Here is how I did it.
1. copy the cds:
copycds -n rh4 RHEL4-U4-x86_64-AS-disc*.iso
2. Make a new tree:
cd /opt/xcat/share/xcat/install
cp -a rh rh4
3. Edit Kickstart file
RH4 update 4 doesn’t have the key --skip option like RH5 does, so I had to edit that out of the kickstart template in:
cd /opt/xcat/share/xcat/install/rh4/
mv compute.rhel4.tmpl compute.tmpl
4. Install a node:
rinstall i03 -o rh4 -p compute -a x86_64
5. Sit back and relax! …Well, if you boot the right node. Turns out I had new hardware and an old OS! So it cried that it couldn’t find the disk driver of a dx360 M2. Now I really didn’t want to go down that path of patching the kernel and ramdisk. I used to do that a lot for systems… So this time, I was fortunate in that I had an older machine (x346) that I could just use instead…
Whew!
It’s so nice when things work right away.