
Holtz Communications + Technology

Shel Holtz
Communicating at the Intersection of Business and Technology

Server drama and blogus interruptus

Murphy does more than live. Murphy thrives.

I’d been experiencing some hardware problems with my server for a couple of months. Many readers noted that a visit here was often greeted with some kind of database error. The cause, evidently, was the server suddenly rebooting for no apparent reason: intermittently at first, then as often as eight or 10 times per hour. Then it would work fine for a week or two before the whole cycle began again. The reboots corrupted database tables, which were easy enough to repair once I knew the corruption had occurred.

So (I can hear you ask), why didn’t I just fix the hardware problem?

My server was at a colocation facility that was, essentially, abandoned by its owner. Sending about 40 emails had no effect. I couldn’t get to the server. In fact, when I started making inquiries with the property manager, I found out that the colo owner hadn’t paid his rent (he claims otherwise) and even he couldn’t get into the facility.

Finally, out of the blue, the colo owner called and told me he was on site and if I wanted to get my server out of his facility, I’d better do it now. I broke a few speed limits getting there and retrieved my server and one I manage for a group with which I’m involved in a volunteer capacity. It only took a day to find a new colo service, but my server wouldn’t boot up at all; it didn’t get past the BIOS before it started rebooting.

Of course, this was now the day before Thanksgiving, and my IT guy, Mike Vincenty, was off for a family gathering in Phoenix while my family was meeting up with Michele’s brother and his family in Las Vegas. By the time Mike was able to get back to the server, it was clear that something serious was amiss. I was on the road, but Mike took it into the shop where all my computer work is done, where it was determined that the box had overheated and a capacitor on the mainboard had blown. The solution: a new server featuring the hard drives from the dead box.

That took another couple days, then more time to install new drivers, rebuild the RAID array, and make a host of other adjustments. With everything finished, Mike let it burn in overnight last night, then installed it at the colo facility this morning. I was back up and running by about 2 p.m. PST today.

In the end, my blog and website were offline, as was my email, for nearly two weeks. Knowing emails were bouncing (and that I would never recover them) and that I had few means of getting the word out that my site and blog were offline resulted in a degree of anxiousness I’ve rarely experienced before. Some of my colleagues have suggested that I should scrap my own server and maintain my properties in the cloud (using a hosting service), but that has about as many downsides as owning your own box. Besides, with a brand new server, I should be good for a few years.

In any case, I’m back. I’ve missed you all! And I have a backlog of topics I’ve been waiting to blog about. Of course, I’m traveling this week (three cities: Montreal, Chicago, and New Orleans), but I’ll find the time to get some posts up. And when the travel subsides, you can expect a healthy dose of posts to make up for the fortnight’s absence.

12/03/06 | 9 Comments | Server drama and blogus interruptus

Comments
  • 1.FYI:
    To help avoid repeat:
    * secondary your dns out to a specialised dns hoster. Generally very cheap, can give levels of rapid flexibility.
    * In the event of a major DNS failure - your first problem; repoint your dns at your registrar to your new dns hosting. At least get *that* working and such that you can manually point entries wherever.
    * Get a cheap webhosting account, one of those $6 a month types is fine, that can accept email. Redirect, via dns MX's, all your email thru them.
    This should give you basic webmail/pop3 capability and hence access to new emails.


    This isn't perfect, but is doable *very* rapidly. And at least you won't lose too much email, if any.
    (E)SMTP will try for up to 5 days before giving up entirely, so, so long as you get a working email system in under 5 days, you shouldn't lose anything. And senders should get "undeliverable, but still trying" messages within 24 hours.


    By the look of your DNS, you've now got a series of backup secondary MX's. I don't believe that many to the same box will help but hey...

    Unf that won't really help solve the problem, namely that the end server is off the air for an extended period of time.
    You really need a method of rapidly pushing email over *there* as a final destination, to gain easy access to it.


    Tho, it is possible to extract raw emails out of sendmail etc queues, and I've done it - having a "single & central point of failure SAN" die is entertaining to watch when it's not one's problem. :-). A very quick and nasty perl script will convert mime encoded documents to their unencoded form.

    Knowing Murphy as well as I do, I suspect the next failure will be within 2-3 weeks. :-P

    steve | December 2006
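Steve's point about converting MIME-encoded documents back to their unencoded form can be sketched in a few lines. This is not his Perl script; it is a minimal Python illustration using the standard library's email package, and it assumes you already have one complete raw message (headers plus body) as bytes. Pulling that out of a particular mail server's queue files is a separate step not shown here. The function name extract_parts is made up for this example.

```python
from email import message_from_bytes, policy


def extract_parts(raw_bytes):
    """Decode one raw email message.

    Returns (subject, parts), where parts is a list of
    (filename_or_None, content_type, decoded_bytes) tuples,
    one per non-container MIME part.
    """
    msg = message_from_bytes(raw_bytes, policy=policy.default)
    parts = []
    for part in msg.walk():
        # Multipart containers only hold other parts; skip them.
        if part.is_multipart():
            continue
        # decode=True undoes base64 / quoted-printable transfer encoding.
        payload = part.get_payload(decode=True)
        parts.append((part.get_filename(), part.get_content_type(), payload))
    return msg["Subject"], parts


# Hypothetical usage, assuming a recovered message saved as "recovered.eml":
#
#     with open("recovered.eml", "rb") as fh:
#         subject, parts = extract_parts(fh.read())
#     for filename, content_type, data in parts:
#         print(subject, filename, content_type, len(data))
```

Each tuple carries the decoded bytes, so an attachment can be written straight to disk under its original filename when one is present.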

  • 2.Thanks, Steve.

    One problem: The name server I use is on the same box!

    Shel Holtz | December 2006 | Concord, CA

  • 3.Yayyyy!!! Shel's back!
    It's tough living offline 24/7. When Katrina hit, we were without power for 6 days. Thank Allah Comcast was up when the electricity came back on! I Needed my online fix! lol
    Love & Peace, Clarence

    Clarence Jones | December 2006 | East-Central Mississippi

  • 4.It's good to have you back Shel....If it were not for FIR, we would still be quivering from withdrawal.

    Reid Givens | December 2006 | Rio Rancho, New Mexico

  • 5.Welcome back! Shel.

    Joon Lim | December 2006

  • 6.And like Mr Kotter, welcome back!

    Lee Hopkins | December 2006 | Adelaide Hills, Australia

  • 7.Good to see you back, Shel - and I know it is not funny, but you might consider putting this one up on your server as a practical joke: (OBS! - wait a few seconds - sth. will happen)

    http://larstrup.dk/blog/blog.php

    Karin Hoegh | December 2006 | Copenhagen

  • 8.Glad you're back!

    maggie fox | December 2006 | toronto, canada

  • 9.That's wonderful, Karin; thanks! I've seen a lot of entertaining 404 messages, but nothing quite like that!

    Shel Holtz | December 2006 | Montreal, Quebec, Canada
