Question on backup


When I went to the Wiki, and searched for backup, I could only see the “snapshot” backup feature.

I’m assuming “snapshot” works somewhat like LVM snapshots, i.e. it takes a snapshot of the file system data structures at that point in time and locks it down, but the data still lives on the same physical file system as the main site.

Further, I assume that the shared host machines either use some RAID array with moderate redundancy (RAID 5 or 6), or use a partition of some SAN box with similar redundancy.

So, if I’m wrong so far, please let me know. Now, given this, my question is:

Am I correct in understanding that there is no off-device backup of my account at DreamHost? If the underlying file system block device were somehow wiped (say, an EMP on the RAID array :-), then that’s all she wrote and I’d better have an off-site backup?

Then the follow-up question is: if it’s a RAID 5/6 style array, what’s the ratio of redundancy to capacity? How many disks, out of how many in total, have to fail before data is lost?


Some, but not all, filesystems are on dual-parity RAID 6 NetApp devices.

As you’ll hear from a lot of folks, the default snapshots that everyone gets should not be considered full backups, and the RAID under the filesystem is mostly there for high availability. As the webmaster of your site, you need to do your own backups, at intervals and with a depth of history that only you can determine.

RAID 5 can sustain one disk failure in the array. RAID 6 can sustain two simultaneous failures.
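
To put rough numbers on the redundancy-to-capacity question, here is a minimal sketch. The function name `raid_summary` and the 16-disk group size are made up for illustration; I don't know the actual group sizes on DH hardware, and this only models a single RAID 5 or RAID 6 group.

```python
def raid_summary(level: int, total_disks: int) -> dict:
    """Fault tolerance and usable fraction for one RAID 5 or RAID 6 group."""
    parity = {5: 1, 6: 2}  # disks' worth of capacity spent on parity
    if level not in parity:
        raise ValueError("only RAID 5 and RAID 6 are covered here")
    p = parity[level]
    if total_disks < p + 2:
        raise ValueError("too few disks for this RAID level")
    return {
        "tolerated_failures": p,         # the group survives this many dead disks
        "failures_to_lose_data": p + 1,  # one more than that and data is gone
        "usable_fraction": (total_disks - p) / total_disks,
    }


# Hypothetical 16-disk groups; real group sizes are an assumption.
print(raid_summary(5, 16))
print(raid_summary(6, 16))
```

So in a 16-disk RAID 6 group, 14 disks' worth of space is usable and it takes a third failure, before the rebuilds finish, to lose data.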

I can’t remember when this was last discussed. There has been no clear answer on whether DH uses RAID, and some people doubt that it does because RAID is very expensive to set up.

And we always suggest that users back up their data manually.
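
For what it's worth, a manual backup can be as simple as a timestamped archive that you then download to another machine. This is only a sketch: `make_backup` and the paths in the usage comment are hypothetical, it covers files only (not databases), and it assumes Python 3 is available wherever you run it.

```python
import tarfile
import time
from pathlib import Path


def make_backup(source_dir: str, dest_dir: str) -> Path:
    """Write a timestamped .tar.gz of source_dir into dest_dir; return its path."""
    source = Path(source_dir).expanduser().resolve()
    dest = Path(dest_dir).expanduser().resolve()
    if not source.is_dir():
        raise NotADirectoryError(str(source))
    # Keep dest_dir outside source_dir, or the archive would try to include itself.
    dest.mkdir(parents=True, exist_ok=True)
    stamp = time.strftime("%Y%m%d-%H%M%S")
    archive = dest / f"{source.name}-{stamp}.tar.gz"
    with tarfile.open(str(archive), "w:gz") as tar:
        # arcname keeps paths inside the archive relative to the site folder
        tar.add(str(source), arcname=source.name)
    return archive


# Hypothetical usage; substitute your own directories:
# make_backup("~/example.com", "~/backups")
```

The archive is still on the same storage as the site until you copy it somewhere else, so the download step is the part that actually makes it a backup.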

Last week, when sxi asked DH honcho Michael, “We’re backed by RAID6?” Michael answered, “In many cases yes. All of our storage is not guaranteed to be though.”



Thanks for the answer. I wonder how I can find out what kind of storage my particular account is on.

Btw: there are RAID set-ups other than the low-end RAID 5 or RAID 6 you see on cheap controller cards or “soft raid” controllers. The first RAID array I ever used was a 42-drive array connected to a Connection Machine supercomputer (in the '80s – pathetic by today’s standards). If I remember correctly, that array used 32 drives for data, 5 drives for parity, and 5 drives for standby to rebuild if a drive failed. You’d have to lose 5 drives at the same time to lose data.

The same idea can be used in larger SAN/NAS systems, too, if you want higher availability. Regarding the cost of RAID, for a hosting provider like DH, I would assume it’s a lot cheaper to centralize storage on a SAN, or at least a NAS of some sort, than to put separate drives in separate boxes all over the data center. And once you go the SAN or NAS route, the additional cost of RAID drives is peanuts compared to the gain in reliability (only 10-20% additional cost for drives, which aren’t a particularly big percentage of overall cost).
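
The 10-20% figure is just parity drives divided by data drives. A quick sketch of that arithmetic follows; the two RAID 6 group sizes are assumed for illustration (they are not known DH configurations), `parity_overhead` is a made-up helper name, and standby/spare drives are left out of the count.

```python
def parity_overhead(data_drives: int, parity_drives: int) -> float:
    """Extra drives bought for parity, as a fraction of the data drives."""
    if data_drives <= 0 or parity_drives < 0:
        raise ValueError("need at least one data drive and no negative counts")
    return parity_drives / data_drives


# Illustrative layouts; hot spares are not counted as parity here.
layouts = {
    "RAID 6, 10 data + 2 parity": (10, 2),
    "RAID 6, 20 data + 2 parity": (20, 2),
    "32 data + 5 parity (the 42-drive array above)": (32, 5),
}
for name, (data, parity) in layouts.items():
    print(f"{name}: {parity_overhead(data, parity):.1%} extra drives")
```

The two RAID 6 cases come out at 20% and 10%, which is where the range comes from; the old 32 + 5 layout lands in between.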


DreamHost does use SAN/NAS. As I mentioned in my last post, the current state of the art for DreamHost installations is dual-parity RAID 6 on NetApp appliances.

