Serious electricity problems in the Los Angeles area took our entire network off-line for several hours on Monday afternoon. All services are believed to be up and running at this time. If your service is not fully functional, please contact our support team so we can address any remaining problems as quickly as possible.
Ah, just no /working/ backup power supply at DH:
the entire building where our data center is located’s back-up generators (there are SUPPOSED to be four) stopped working
I’m switching to Verio… a real hosting company.
Do check the small print. Verio advertises “guaranteed uptime” but of course provides nothing of the sort. Their downtime compensation threshold is 0.1%, which is a lot worse than DH have delivered for the last year IME. And they totally exclude email service from the guarantee.
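For a sense of scale, here is a rough sketch of what a 0.1% downtime threshold works out to. The 30-day month and 365-day year are my own assumptions, as are the function names; Verio's actual terms may measure the period differently. The three-hour figure is the outage length mentioned in this thread.

```python
# Rough downtime arithmetic. The period lengths below are assumptions,
# not taken from any provider's actual terms of service.
HOURS_PER_MONTH = 30 * 24   # assumed 30-day billing month
HOURS_PER_YEAR = 365 * 24   # assumed non-leap year


def allowed_downtime_hours(threshold_pct, period_hours):
    """Hours of downtime that fit inside a percentage threshold."""
    return period_hours * threshold_pct / 100.0


def downtime_pct(outage_hours, period_hours):
    """An outage expressed as a percentage of the period."""
    return 100.0 * outage_hours / period_hours


if __name__ == "__main__":
    for name, hours in (("month", HOURS_PER_MONTH), ("year", HOURS_PER_YEAR)):
        allowed = allowed_downtime_hours(0.1, hours)
        outage = downtime_pct(3, hours)
        print(f"0.1% of a {name}: {allowed:.2f} h allowed; "
              f"a 3 h outage is {outage:.3f}% of that {name}")
```

Whether a single three-hour outage crosses the threshold depends entirely on the period it is measured over, which is one more reason to read the small print.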
Fortunately this was just a minor scare. It was easily recovered from, didn’t disrupt services overly much, and it showed Dreamhost that their emergency plans need an overhaul. They should take this as a learning experience, see what failed and why, and make a public statement detailing what steps they’re taking. If they can resist the temptation to blame others for the failure, they would come across as grade A professionals.
[quote]Fortunately this was just a minor scare … didn’t disrupt services overly much[/quote]
In what sense is an outage of apparently all DH websites and email for three hours just a “minor scare”?
[quote]If they can resist the temptation to blame others for the failure, they would come across as grade A professionals[/quote]
Having backups that actually work is what is needed to come over as grade A professionals.
Have to disagree with you on that count. This isn’t a hobby for a lot of customers and an outage of any kind is not minor.
O.K., I’m with you on those statements. I’d like to see the next blog entry address what concrete steps are being taken to assure uptime. This appears to have been a classic case of assuming someone else (the building’s generators) would do their job…
As someone who just switched over to Dreamhost, I was concerned about the lack of a central (U.S.) location for their servers and am now even more concerned about the integrity of their datacenter. Looking forward to hearing about rapidly-deployed improvements.
In the sense that it could have been much, much worse? As in, catastrophic hardware failures and such?
Actually, it’s how you respond to a situation that tells whether you’re a professional, or just someone who technically knows what they’re doing. A grade A professional doesn’t wig out, but calmly and efficiently studies the problem, corrects it, learns from the planning mistakes, takes preventative action, and communicates with the clients. “Professional” is more a frame of mind than anything else. If a professional does not have working backups before they’re needed, they will have them soon afterwards. A professional is not super-human, and lapses will happen. It’s what happens afterwards that really defines a professional.
Disasters happen everywhere. You just plan ahead so that you’re prepared when they happen. Look at DirectNIC, the only ISP that stayed in business through Hurricane Katrina, and is still operating. Most datacenters in Los Angeles kept operating just fine through the outage, and would probably have survived a massive earthquake too. The location was not the problem. For some reason or another things didn’t pan out. As I wrote earlier, it’s the going-forward part that’s important now.
[quote]As someone who just switched over to Dreamhost, I was concerned about the lack of a central (U.S.) location for their servers[/quote]
In what sense are DH’s servers not in a central location? Which US are you referring to that Los Angeles is apparently not a part of?
If you want useful replies, ask smart questions.
Well, ABC and Disney both lost their email servers during the outage (and even when power was restored, their email servers were still not functioning).
I agree with Iri, Dreamhost needs to study the matter and come up with a solution. Maybe taking NS1 and putting it in another state. I am not by any means a techie person who understands how to best serve everyone, especially during an event like the blackout. But my customers were quite happy when they got the call about what was going on. And I would not have been able to give them periodic updates had Dreamhost not let us know what was happening at status.dreamhost.com.
How they choose to work on this is just something those of us (who will continue services with DH) will have to wait and see. I have bookmarked the blog and will check it periodically.
I don’t take this lightly, but realize I cannot control these events and need to leave it in the hands of those who can.
Some of my sites were still down two hours ago. About 20 hours of downtime for a couple hours of power outage is not good at all.
One site is still bogging down horribly.
Geographically speaking, one can hardly call Los Angeles a CENTRAL U.S. location. There is a certain level of security provided by being away from the coasts (why else would all those missile silos have been built on the Great Plains?). In the Internet realm, one, of course, also needs to be located near redundant data pipes.
I’m not dissing L.A… I’d just feel better if a mirror site were located in the heartland.
They did have some infrastructure hardware failures, so the sites will probably come back up once those are fixed.
You know there’s this supervolcano waiting to happen under Yellowstone National Park, just east of the Rockies? That ought to put a damper on datacenters in the midwest if it goes off. As I said, disasters happen everywhere.
You originally said:
I’m not sure how you can say it was “easily recovered from” or that it “didn’t disrupt services overly much” when you also note that some services are still down.
“Grade A Professionals” don’t bother to work on a blog telling everyone that everything is up “as far as we can tell” when sites are still down. It points out:
- Priorities are WAY off. Making a pretty graphic to tell us all is well SHOULD NOT BE A HIGHER priority than actually making sure the sites are up. This is juvenile at best.
- DH doesn’t have much of a clue as to whether their systems are up or down. Eight hours after their message, I still had sites that were down. One site is still sluggish.
[quote]In the sense that it could have been much, much worse?[/quote]
That’s true of 99% of failures. It does not make them just “minor scares”.
[quote]If a professional does not have working backups before they’re needed,[/quote]
… then he’s not a professional.
Think of it this way: it’s a hell of a lot less painful to recover from a couple of dead (unmanaged?) switches than it is to recover from complete failure of all drive arrays and motherboards. A nice unrestrained surge through the power lines could do just that, and then a slow site would be the least of your problems. If simply powering the servers up and replacing a few switches brings most of the sites back online, that’s a graceful recovery from a complete power failure of this magnitude. It ain’t pretty in the big picture, but given the circumstances it’s not bad.
[quote]> If a professional does not have working backups before they’re needed,
… then he’s not a professional.[/quote]
Even professionals are human. Lapses happen, and you shouldn’t go waving your arms and shouting “Armageddon!” if you lose one email. That’s juvenile, and definitely not professional. Look for a pattern before you pass judgment. If you see a pattern, be a professional and do the professional thing and make your backup provider your primary provider.
In this instance DH supposedly had plenty of backup power capacity standing by, but it didn’t make itself available. There’s been an error in judgment, i.e. relying on someone else without doing checks yourself, and a professional will learn from it. I suggest you wait until you see how Dreamhost responds before nailing them to the wall. If they fail to follow through in a professional manner… rip them a new one.
Edit: did you notice what a bunch of us wrote in the off-topic section? I’m not happy with DH’s level of preparedness, in case you’re wondering, but I am waiting 'til I see how they respond.
Yes, actually, it is. Designing a structure for tornado-force winds is easier (and less expensive) than designing for earthquakes or flood waters. We’re not talking trailer parks here - we’re talking datacenters (which should be constructed like bunkers).
My old hosting company (located in the DFW area) wasn’t anything to brag about, but their datacenter(s) never shut down due to a lack of power.
It shouldn’t take much to recover, but the fact of the matter is you said they have recovered when in reality, they have not yet recovered. You cannot judge that the recovery was easy when it’s not complete yet. Does that make sense?