Okay, folks, here’s a somewhat edited version of my reply to a response from Support…
Shin copied and pasted some boilerplate that said:
[quote]Yesterday afternoon, our file server for your cluster crashed and was down
for about an hour.[/quote]
More like 1hr 20min to 1hr 30min.
[quote]Then late last night, we switched the nfs server for
that file server and it was working for a while. However, sometime after
that, at around 3am PST, we started to experience some further problems
and nfs stopped working until about 9:30am PST.[/quote]
Yes, I know. I reported it.
So, the major outages yesterday/last night/this morning lasted a total of 10 hours out of 24.
Today’s mail outage lasted nearly two hours.
I’d also be interested to know why the stats page hasn’t updated to truthfully reflect this series of outages, and why the status page continued to say the services were “Up” when they were obviously “CRITICAL.”
[quote]What this means is that your web server and your mail server could not
communicate with your file server. We have since gotten everything back
up and running and after further investigation found out that there was a
Unfortunately, the problems that have been occurring for your cluster have
been due to hardware problems.[/quote]
Yes, I suggested that, and I suggested that you get Softaware to swap out the server.
[quote]We will be receiving a new file server
tomorrow to replace the current one. We should have this ready to go and
deployed within the next 2 weeks.[/quote]
It takes you two WEEKS to configure and deploy a file server?
[quote]This should alleviate all the issues
that we have been experiencing lately with your cluster.
We do empathize with the frustration and disappointment that you have been
experiencing lately. We are not happy when our customers are not happy.
But once we get the new file server, everything should stabilize. In the
meantime, we are offering you one month of hosting credit to your account.[/quote]
Thanks for the credit, but I care more about making sure that my clients’ services are actually working.
[quote]If you have any further questions regarding this matter, please let us know.[/quote]
Look, I’m new here. I opened an account a couple of months ago and tested the services. Great stuff. Really happy.
So I moved about 53 domains and a half-dozen sites to Dreamhost, with more to come. It took me a hell of a lot of work.
Now this thing comes along… This is a serious problem, and it scares the hell out of me. But what scares me even more is that you guys aren’t using data centers that provide hot-swappable – or at the very least, warm-swappable – backup hardware… and that you think it will take you two weeks to deploy a replacement server.
Dreamhost isn’t just a single client. It’s a massive number of clients. No duh, right? Well, when you have that many people and companies relying on a single file server, it needs to have a backup, and that backup needs to be deployable within a matter of hours, not a matter of weeks.
You’re getting screwed by Softaware, and, in turn, your users are getting screwed. You should be kicking Softaware’s butt right now, and/or upgrading your SLA to provide for swap-outs.
Or, you should be moving the affected accounts to WebVision.
Hey, there’s an idea, huh?
While you’re fooling around for the next two weeks, I’ll be looking for yet another host. I’m at least smart enough to have a backup plan.
I want to stay here. I really, really do. Everyone with DH is extremely nice, and I dig the atmosphere, and I usually get replies to support-form queries within 12 to 24 hours.
But unless you guys learn to stop being so nice and start to kick some vendor butt, I may have no choice but to leave.
Please don’t make me.