We host the server for a legacy application in our office. Since it’s more like a favor than a real assignment, we don’t care much about the server. However, we had a few network issues lately, so we decided to migrate it to a virtual server running on our hosted server. The machine also produces lots of heat and noise, so we’d rather have it switched off.
This seemed like such an easy task. What could be hard about creating a disk image with CloneZilla, copying it to a server, setting up a virtual machine there with kvm, restoring the image and redirecting all traffic to that machine instead of the one in our office? We estimated it could be done in two to three hours tops, and that we’d get home around 7 PM.
We missed a few points though. The machine had a faulty CD drive, so booting CloneZilla was not so easy. We went for the network boot option, but something just didn’t work out with the PXE boot. So we swapped the CD drive for a working one, well, at least it used to work a few years ago… But not anymore. So we decided to scrap the swap partition, boot CloneZilla from there, and create our image. It worked! Almost. Except that CloneZilla didn’t quite identify the disk partition types, and didn’t see the RAID device at all. It turned out that the utility somehow runs in user mode, and starting the partimage utility with sudo is an acceptable workaround. We then had the disk image on a different server, where we tried to restore it into a 10GB virtual disk image. As it turns out, CloneZilla is unable to restore an image taken from a partition that is bigger than the target partition, even if the data in it would fit.
So we went on a quest for a parted that can shrink our 100GB partition to 10GB. Well, if there was one… As we had no intention of booting from the non-working CD, we popped in a disk from our jukebox server with Ubuntu on it. No wonder it worked flawlessly, until I learned that a RAID 1 device with ext3 on it is not easy to shrink. So I decided to break the RAID array, remove the incompatible ext3 flags, and resize the first partition to 10GB. Then I used partimage to create the backup, which ran at about 300MB/min.
The backup was quickly copied over to the eagerly waiting server that was to host the virtual machine. We fired up the VM with a 10GB disk image and the CloneZilla ISO mounted as a CD, and set out to restore the image to our partition. We didn’t even try to use the CloneZilla UI, except for mounting the host with sshfs to access our backup image. The setup was a breeze, except that the restore speed was only about 50MB/min. No worries, it’s a small application after all, and we don’t need a huge server for it anyway. It was also 1 AM already, so we didn’t pay much attention to detail anymore.
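To put the two speeds side by side, here is a rough back-of-the-envelope sketch in Python. It assumes the worst case of a completely full 10GB image, counted as 10 × 1024 MB; the real amount of data was smaller, and the function name is made up for illustration.

```python
def transfer_minutes(size_mb: float, rate_mb_per_min: float) -> float:
    """Return how many minutes it takes to move size_mb at the given rate."""
    if rate_mb_per_min <= 0:
        raise ValueError("rate must be positive")
    return size_mb / rate_mb_per_min


# Assumption: a completely full 10GB image, counted as 10 * 1024 MB.
IMAGE_MB = 10 * 1024

backup_min = transfer_minutes(IMAGE_MB, 300)   # partimage backup speed
restore_min = transfer_minutes(IMAGE_MB, 50)   # restore speed inside the VM

print(f"backup:  {backup_min:.0f} min")
print(f"restore: {restore_min:.0f} min ({restore_min / 60:.1f} h)")
```

Under that assumption the backup fits in roughly half an hour, while the restore at a sixth of the speed stretches to well over three hours. In hindsight, that gap was already a hint that something was wrong with the VM.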
We redirected all the network traffic to the new virtual interface and tested it while waiting for the restore to complete. During the redirect we fired off a command that killed the network adapter under us, so we were desperately trying to reach the hosting provider to get our server rebooted. We went through all the numbers listed on their homepage. Good thing they have a webcam in their console room, so we could see the admin passing through on the way to our server, obviously not in a good mood! Suffice it to say that was the most fun we had all evening!
After the restore was complete and some minor fixes were done (make the image bootable, install the grub loader, change the root device in the loader AND in the fstab as well, change the network configuration to match the setup, AND /etc/hosts AND /etc/resolv.conf, which are easily overlooked at 3 in the morning), we finally had the image booting up. And oh boy, was it slow! Painfully so! The machine doesn’t support HW virtualization, and SW virtualization just doesn’t cut it.
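Something we could have checked before starting is whether the host CPU advertises hardware virtualization at all. On an x86 Linux host this shows up as the vmx (Intel VT-x) or svm (AMD-V) flag in /proc/cpuinfo. The following is a minimal Python sketch of that check, not what we actually ran that night; the function names and the sample lines are made up for illustration.

```python
def has_hw_virtualization(cpuinfo_text: str) -> bool:
    """Return True if any 'flags' line lists vmx (Intel VT-x) or svm (AMD-V)."""
    for line in cpuinfo_text.splitlines():
        key, _, value = line.partition(":")
        if key.strip() == "flags" and {"vmx", "svm"} & set(value.split()):
            return True
    return False


def host_has_hw_virtualization(path: str = "/proc/cpuinfo") -> bool:
    """Read the cpuinfo file of an x86 Linux host and check its flags."""
    with open(path, encoding="ascii", errors="replace") as f:
        return has_hw_virtualization(f.read())


# Made-up sample lines, for illustration only.
sample_without = "model name : some older CPU\nflags : fpu vme de pse tsc msr pae sse sse2"
sample_with = "flags : fpu vme de pse tsc msr pae sse2 vmx"

print(has_hw_virtualization(sample_without))
print(has_hw_virtualization(sample_with))
```

On the real host one would call host_has_hw_virtualization(). Even when the flag is listed, the feature may still be switched off in the BIOS, so treat a positive result as necessary rather than sufficient.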
So we gritted our teeth and moved the whole bunch of stuff back to where it was, rebuilt the RAID array, and redirected everything to where it was originally. We were not happy at all. We went home at 4 AM, when the streets were empty and the bars were about to close.
I could not believe that a P4 running at 3GHz should be so slow. So today I took the backup from yesterday, installed VirtualBox, and restored the image to a new virtual machine. It took about 2 hours altogether. I’m not saying that VirtualBox is in any way superior to kvm, but I can show that in our case it’s ten times as fast. I’m sure we made mistakes in the deployment, and there might be ways to reach this speed with kvm as well.
No matter how much trouble we went through, it sure was an interesting night. We learned a lot about virtualization and image cloning, which might soon come in handy.