I’m a current DH customer and have been for over 2 years. I’m wondering how many other customers have experienced what seems an unusual number of tech problems over the past several months, dating from an announcement in mid-February '06 that DH had installed a “lot” (their word) of new hardware.
The problems we’ve experienced in atypical volume (atypical for DH, to be fair) since mid-Feb '06:
When new email accounts are created, even in batches, a healthy percentage (often 50% or more) simply do not work, despite being created at the same time, etc. Only deleting them and recreating them gets them to work.
Email accounts that are accessible and functional via various local POP clients but not accessible via webmail until, again, we completely delete the account and recreate it.
Putting new domains online seems to be hit or miss. Sometimes they propagate within an hour or less; sometimes they NEVER propagate. About 50% of all new domains we’ve tried to bring online do NOT propagate and have to be recreated or reset. These are domains registered with Dreamhost, so there is no 3rd-party DNS involved.
WebDAV volumes that are accessible and mount fine on UNIX (Sun/SGI) and Mac OS X boxes, but simply are not accessible via Windows XP or Windows 2000, no matter what we try.
Simple .htpassword affecting general PHP execution. Removing the .htpassword file fully restores PHP execution; re-implementing it shuts it down again. This was occurring almost exclusively on one of the servers we live on and not the other, so it seems to depend on which server your site lives on (see below for more about this).
Discrimination (resulting in failure to upload) against files during FTP uploads depending on, primarily, whether they are binary or ASCII in structure. This problem is not OS, hardware, or FTP-client dependent. Compressed (in various compression and encoding formats) postscript seems to be particularly susceptible to this issue.
Interruption of simple POST events, and a propensity for _postdata resends to ALL break one day and ALL work perfectly another day, with the same code on the same site and the same event circumstances, files, and/or other elements. We’ve tested this and can reproduce it with 100% reliability on “bad days”, which seem to occur about once a week but not on the same day or days.
Processes and events which occur reliably without fail on one server (temple, for example) but break or fail over 50% of the time on another (rum, for example), using mirrored code, configuration, et al.
Jabber clients which routinely have to be reset or recreated. They work for several days and then simply stop functioning. This, too, occurs much more often on one server than another.
The frequent inability to execute server-side PHP, JS, or CGIs between approximately 08.30 and 09.00 GMT-5 (05.30-06.00 GMT-8, server/host time). This happens so often that I’m assuming some type of routine maintenance or diagnostics is run at this time, but inquiries to support have not yielded any information to substantiate this idea. Unfortunately our clients often are at work at 07.00 (on the East coast) and do mind waiting for this period of non-use to pass.
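For anyone who wants to compare notes on that last item, here is a minimal sketch of how logged failure times could be tallied against the suspected window. It assumes you already record a timezone-aware timestamp for each failed request; the function names and the sample timestamps are made up for illustration and are not real log data.

```python
from datetime import datetime, time, timedelta, timezone

# Suspected window from the post: 08.30-09.00 at GMT-5,
# which is the same as 05.30-06.00 at GMT-8, i.e. 13:30-14:00 UTC.
WINDOW_START_UTC = time(13, 30)
WINDOW_END_UTC = time(14, 0)


def in_suspect_window(ts):
    """Return True if a timezone-aware timestamp falls inside the window."""
    t = ts.astimezone(timezone.utc).time()
    return WINDOW_START_UTC <= t < WINDOW_END_UTC


def tally(failure_times):
    """Count failures inside and outside the suspected window."""
    inside = sum(1 for ts in failure_times if in_suspect_window(ts))
    return inside, len(failure_times) - inside


# Hypothetical failure timestamps, purely illustrative.
eastern = timezone(timedelta(hours=-5))
pacific = timezone(timedelta(hours=-8))
sample = [
    datetime(2006, 4, 3, 8, 45, tzinfo=eastern),   # inside the window
    datetime(2006, 4, 4, 5, 40, tzinfo=pacific),   # inside the window
    datetime(2006, 4, 5, 11, 0, tzinfo=eastern),   # outside the window
]
print(tally(sample))
```

If most failures over a few weeks land inside the window, that would at least support the maintenance theory when raising it with support.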
That’s about it, but these things often keep us from running our business. It is important to us to get to the bottom of this, and we are very grateful for any feedback or comments that come our way. We do not want to leave Dreamhost, as we like many things about them and also have a very big time and resource investment in them. We’re seeking a positive solution to our problems and hope to receive whatever help others can spare us in this effort.
We’re also wide open to comments about whether Dreamhost is the place to be for moderately mission-critical sites and functionality. Possibly we’ve simply chosen the wrong type of host for our applications. We’re not NASA, but our clients depend on us to deliver and we don’t get paid when we don’t.
In our businesses we sell ourselves on the fact that reliability is not a measure of what you pay; if you charge anything for it, or if you simply commit to do it and plan to put your name on it, then reliability is all or nothing. We’re fuzzy on the concept of how you even target “partial reliability”. We think that would be considerably more difficult to achieve, and more subjective to define, than “total reliability”. Are we setting the bar too high? We find total reliability easier than “throttling back” to a lesser reliability; just our read on it.
F K Bumgarner
ChaosPoint Systems Engineering