Why the site was down for a month, or: why I’m abandoning Ubuntu

metamusing was down for a month due to a catastrophic update experience with Ubuntu. I had been running an LTS release for the last year or so. In December a patch came out which somehow broke apache on that release – it was running, but not responding to requests. I gave that a few days to resolve itself via subsequent patches, and when it didn’t, decided to update to a newer release. Turns out moving from that LTS release was a convoluted process which involved updates to specific versions in the right order.

The first update went fine. Everything was back up and running, including the previously broken apache, but having looked over what it took to get to a current version from where I had been, I figured I may as well take the time to get current now because the issue was only going to get worse as time passed. I proceeded to the next update.

The second upgrade also went fine, leaving me only 2 updates away from current, so on I went to the next one, which did not go smoothly. On reboot I ended up at the command line with a broken X Windows and no networking stack running, for reasons I never determined. X Windows failing on update has happened periodically with Linux distros, and while I understand the whys of this, it’s still one of the most frustrating aspects of working with the OS for me. Anyway, after screwing around for an hour trying to repair things with no success, I gave up and decided to do the last update to the current release, 11.10, using a CD rather than from the network. This turned out to be my fatal mistake.

Everything appeared to go smoothly – I booted to CD, it correctly recognized the version of Ubuntu currently installed on the machine and asked me if I wanted to update it, which I did, so off we went. During the install process there was a single, worrying error message which I had not seen before, to the effect of ‘some packages cannot be upgraded and will need to be reinstalled.’ It did not enumerate them or offer me any options; it just reported the problem. Everything else finished and I rebooted…to discover that the update had gone disastrously awry. A random sampling of the oddness:

  • Apache was no longer installed, and there was no longer a /var/www directory where I had gigs of binary data (most of it pictures).
  • MySQL was no longer installed and none of my table data was present any longer (!!!!)
  • A huge swath of my previous software stack was gone, including the data that accompanied it.
  • The user directories for my wife and me were still present, but her account was not.
  • The stuff I had installed into /opt was still there, along with the data.

As you might imagine, I was furious. I still am. On the one hand, no doubt this is somehow my fault. I was rushing through this. I do not keep good backups beyond the WordPress tables that have my blog. I’m hardly the most clever Linux user, and my job has removed me from daily use and the good practices that daily use helps enforce. On the other hand, I’ve been running Linux at home since ~’97. I’ve had head crashes, disastrous Red Hat upgrades which pushed me to Ubuntu, and a CPU cooler retaining clip breakage which caused one of my machines to bake itself to death, including its drive, yet despite all of that, never a loss of a whit of data. I’ve always managed to recover everything. But not this time. I won’t descend into details, but I’ve spent countless hours trying to figure out how to recover data from that drive, and as far as I can tell, short of paying through the nose, those MySQL tables are gone, and they’re really the critical missing piece. Everything else I can either recover from the drive, or I have partial or complete backups of, but the SQL tables with all the structure to 12 years’ worth of images in my image gallery? They’re gone.

So…to hell with Ubuntu. I get that this is my fault, but at the same time, I won’t run that distro again, and I might be done with Linux at home. I should have had backups, but it never should have destroyed my data – that upgrade script to 11.10 was somehow disastrously broken.
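A pre-upgrade backup of the kind that was missing here could be as small as the following sketch. It is not the setup from this machine: the function names, the destination path, and the date stamp format are made up for illustration, and the `mysqldump` call assumes credentials are already available to it (for example through an option file).

```python
import tarfile
from datetime import date
from pathlib import Path


def archive_directory(src, dest_dir, stamp=None):
    """Write src into a gzip-compressed tarball under dest_dir; return its path.

    dest_dir should live on a different disk than src, otherwise a bad
    upgrade can take the backup down along with the original.
    """
    src = Path(src)
    dest_dir = Path(dest_dir)
    dest_dir.mkdir(parents=True, exist_ok=True)
    stamp = stamp or date.today().isoformat()
    target = dest_dir / f"{src.name}-{stamp}.tar.gz"
    with tarfile.open(target, "w:gz") as tar:
        # arcname keeps paths inside the archive relative (e.g. "www/...").
        tar.add(src, arcname=src.name)
    return target


def mysqldump_command(out_file):
    """Build (but do not run) a mysqldump invocation that dumps every database."""
    return ["mysqldump", "--all-databases", f"--result-file={out_file}"]
```

Something like `archive_directory("/var/www", "/mnt/backup")` followed by `subprocess.run(mysqldump_command("/mnt/backup/all.sql"), check=True)` before each release upgrade would cover the two pieces lost above; `/mnt/backup` is a placeholder for wherever an external drive is mounted.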

Where does that leave things? I moved my hosting over to Site5 for the time being, and my media server stuff over to a Windows machine. I think I’m going to pick up a Mac mini and move to that for some of this stuff, but maybe keep the web on a hosting provider and not in my house. I’m evaluating image gallery approaches now. I’m not going back to Menalto’s Gallery – they’re not keeping up with the times. I’m not sure what it will be, though Piwigo is looking promising so far. I want video support, mobile support, social networking sharing support, and effective, well-maintained WordPress integration. Suggestions welcome. As for the old Linux drive, I may still pay someone to try and get those SQL tables off of there – they have the first year of my son’s life in pictures, with all the family comments on them. I have all the pictures, but that structure and the comments are the absolute worst loss out of this, and I want them back if I can get them without breaking the bank.

Praise for Tomato, free router firmware replacement

Tomato is one of a number of replacement firmwares for routers. Last week I switched over to it from the stock firmware on my Linksys WRT54G. So far I love it, despite it being responsible for knocking my network offline and forcing me to re-configure everything from scratch. Truth be told, at this point I’m pretty sure the network being offline was my fault (Me? Read the docs? Never!), and the process of rewriting every device’s config from scratch was a good exercise for me since I have a ton of devices and the configs were an accumulation of mistakes small and large.

The whole move to Tomato was caused by Thanksgiving, when one too many devices ended up on my network. This caused a cascade effect of IP addresses being bumped and multiple devices ending up with the same IP assigned to them. This knocked my consoles offline and caused my streaming music to stop working, pushing me to replace the firmware, but I had been planning to do it anyway for a couple of reasons. First, Comcast now has a 250GB monthly bandwidth cap and I want to track how much of it we’ve used at any point in time. Second, there are bugs in the factory firmware on my router which cause UPnP not to work for my gaming consoles.
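The cap tracking itself is simple arithmetic once the router reports daily totals. The sketch below assumes a list of per-day usage figures in GB is already in hand; how those numbers get out of the router is not covered, and the function name, the sample values, and the date are made up for illustration.

```python
import calendar
from datetime import date

CAP_GB = 250.0  # the monthly cap mentioned above


def cap_status(daily_gb, today, cap_gb=CAP_GB):
    """Summarize month-to-date usage against a monthly cap.

    daily_gb: per-day totals in GB for the current month so far,
              one entry per elapsed day.
    today:    any date in the current month (used for the month length).
    """
    used = sum(daily_gb)
    days_in_month = calendar.monthrange(today.year, today.month)[1]
    days_elapsed = len(daily_gb)
    # Straight-line projection: assume the rest of the month matches
    # the average of the days seen so far.
    projected = used / days_elapsed * days_in_month if days_elapsed else 0.0
    return {
        "used_gb": used,
        "remaining_gb": cap_gb - used,
        "percent_used": 100.0 * used / cap_gb,
        "projected_gb": projected,
        "over_cap_projected": projected > cap_gb,
    }


if __name__ == "__main__":
    sample = [6.0, 9.5, 4.5, 10.0]  # made-up daily totals, not real data
    print(cap_status(sample, date(2011, 12, 4)))
```

With the made-up sample, four days at 30 GB total project to 232.5 GB over a 31-day month, which is under the cap but close enough to be worth watching.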

The install process couldn’t have been simpler – just point the default firmware’s update function at the firmware from the site, do an NVRAM reset, and configure. It was even smart enough to pick up my old firmware’s configuration with its dozens of MAC addresses in the wireless access list, and though in the end I think that’s what caused the problems I initially had, I was still impressed that it worked.

UPnP now works on my consoles, and the interface on Tomato is much nicer than the default Linksys one. There are a ton more features, including SSH access, dynamic DNS/domain mapping, full routing functions, various logging/traffic reporting features, and more, and all for free – it’s a fantastic option if you have one of the supported routers. Definitely worth checking out.

For kicks, to give you a sense of scale, here’s a mostly complete list of networked objects in my house, each of which I had to poke yesterday as I resurrected everything:

Hardware:

  • Yamaha receiver
  • Popcorn Hour streaming media box
  • Xbox 360
  • PlayStation 3
  • Squeezebox Duet remote
  • Squeezebox Duet content streamer
  • Linux webserver
  • Gaming PC
  • 2 Mac laptops

Software:

  • PlayOn (PC)
  • Apache (Linux)
  • MyIhome (PC)
  • SqueezeCenter (PC)