Obviously there is no easy way (that is, one requiring no work by the web master) to serve 2TB of PHP, database-driven, application-generated text, even from a dedicated server. This has to do with the CPU and memory requirements of most PHP apps. You can transfer that kind of bandwidth if you are serving video, though, which is really what huge amounts of bandwidth are aimed at. If you are trying to serve terabytes of database-driven, dynamically generated text, you need the experience to optimize your own code.
Even our control panel, which serves about 5 million page views a month, doesn’t use 2TB of bandwidth, because it is primarily a text-based site and bandwidth just isn’t the limiting factor. The real limiting factor on a database-driven site is database performance. (Hence, we are always upgrading our internal db servers, not the web servers.) We have the panel running on a cluster of web servers for redundancy, but CPU- and memory-wise we could run it just fine on one of our shared web servers without crashing it. (We have run it for short periods of time on a single server less powerful than our current new shared web servers.)
So when someone comes along and their site crashes a web server, it’s pretty obvious that the problem is inefficient code. It really has nothing to do with bandwidth. Even though we try to give our customers enough bandwidth and disk space to be sure that they are not limited by those things, the customer does still need to be an intelligent and conscientious web master.
It is not very realistic to expect the laws of computer science and basic program maintenance and optimization to go away just because we give out lots of bandwidth.
If you set up a programming loop, for instance (such as a page that just does a remote call to itself, or a program that recursively allocates memory indefinitely), it’s pretty clear that the site should be disabled, and everyone would agree that the script is broken and should not be run. (This does happen.)
In the middle you have software that serves a purpose but just has not been optimized. Most poorly optimized software never becomes a problem because it is rarely visited. In the rare case, though, some inefficient software might get over-crawled by a search engine or an obsessive visitor, or be the victim of comment spam, in which case it is easiest just to put a stop to the outside abuse.
In the case where a website is truly popular but just has inefficient code, it falls on the web master to optimize the site to handle the traffic. This is true no matter where it is hosted, even if you are hosting it yourself on a cluster of servers. (That is how we host our own sites, and we have had to optimize them too.)
DreamHost has a lot of experience running popular websites, and we are here to give you advice on how best to optimize your website. I have seen a lot of good advice here in these forums as well.
We have customers that use very large amounts of bandwidth without crashing their servers. It is entirely possible, and we do deliver on our plans. We also put a huge amount of behind-the-scenes work into indexing customer tables in order to improve customer db performance. Any large (and especially growing) computing system has bottlenecks, and as a system designer/administrator you spend your time detecting, predicting and removing those bottlenecks. This is a never-ending process. (DreamHost has doubled in size in the last year and a half, so we know this very, very well.)
This all being said, the primary cause of machines crashing or loading slowly is bad code and bad database queries. If your website suffers from this (and you are pretty sure it’s not you causing the crashing), you should contact support immediately so that the offending code can be tracked down.
If the support tech is not helpful in solving the problem, I encourage you to rate the tech poorly in the accompanying survey and then reopen your ticket. We take all customer complaints very seriously and will work with you to make sure your website is performing adequately. And keep posting about any problems you have in this forum as I do check on them fairly regularly.
Here are some guidelines for what types of information to collect for submission to support:
The idea here is to turn up evidence about the following items in the website’s information path:
- visitor network connection to the webserver
- the webserver’s load
- the load on the specific apache instance serving the website
- the webserver’s connection to the filer
- the load on the filer
- the webserver’s connection to its mysql machine
- the load on the mysql machine
For file-based sites you can test the webserver’s connectivity to its file server by copying one of the files in question to /tmp and seeing if the operation is speedy.
time cp /home/user/example.com/images/example.png /tmp
If this is slow, send the timing output to support.
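If you would rather collect that number from a script (for example, to repeat the test at different times of day), here is a minimal Python sketch of the same copy test. The function name time_copy is made up for this example, and the path is just the placeholder from the command above. Keep in mind that a repeated copy of the same file may be served from cache, so the first run is usually the most telling.

```python
import shutil
import tempfile
import time
from pathlib import Path


def time_copy(src, dest_dir):
    """Copy src into dest_dir and return (elapsed_seconds, bytes_copied)."""
    src = Path(src)
    dest = Path(dest_dir) / src.name
    start = time.perf_counter()
    shutil.copyfile(src, dest)
    elapsed = time.perf_counter() - start
    return elapsed, dest.stat().st_size


if __name__ == "__main__":
    # Placeholder path from the example command; substitute one of your own files.
    source = Path("/home/user/example.com/images/example.png")
    if source.exists():
        seconds, size = time_copy(source, tempfile.gettempdir())
        print(f"copied {size} bytes in {seconds:.3f}s")
    else:
        print(f"{source} not found; point this at a real file")
```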
When optimizing the speed of a website, I add timers to the code of the script and measure three things: SQL time (per query and in total), total scripting-language time, and print time. To do this, don’t print anything within your code until the very end. Put a wrapper function around your db calls to get the SQL time. To get scripting time, take the total time up to right before printing and subtract your SQL time. Finally, put timers right before and after you print out your HTML to get print time.
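To make the timer approach above concrete, here is a rough sketch in Python; the same structure should carry over to PHP or any other scripting language. Everything named here is an assumption for illustration only: the PageTimer class, the render_page function, the posts table, and the use of the built-in sqlite3 module in place of a real MySQL connection.

```python
import sqlite3
import sys
import time


class PageTimer:
    """Collects per-query SQL time, script time, and print time for one page."""

    def __init__(self):
        self.start = time.perf_counter()
        self.queries = []            # one (sql, seconds) entry per query
        self.script_seconds = None
        self.print_seconds = None

    def timed_query(self, conn, sql, params=()):
        """Wrapper around the db call: run the query and record its duration."""
        t0 = time.perf_counter()
        rows = conn.execute(sql, params).fetchall()
        self.queries.append((sql, time.perf_counter() - t0))
        return rows

    @property
    def sql_seconds(self):
        """Total SQL time across all recorded queries."""
        return sum(seconds for _, seconds in self.queries)

    def emit(self, html, out=sys.stdout):
        """Print the finished page once, at the very end, and time the printing."""
        before_print = time.perf_counter()
        # Script time = total time up to right before printing, minus SQL time.
        self.script_seconds = (before_print - self.start) - self.sql_seconds
        out.write(html)
        out.flush()
        self.print_seconds = time.perf_counter() - before_print


def render_page(conn, timer):
    """Build the whole page in memory; nothing is printed here."""
    rows = timer.timed_query(conn, "SELECT id, title FROM posts ORDER BY id")
    parts = ["<ul>"]
    for post_id, title in rows:
        parts.append(f"<li>{post_id}: {title}</li>")
    parts.append("</ul>\n")
    return "".join(parts)


if __name__ == "__main__":
    conn = sqlite3.connect(":memory:")  # stand-in for the real database
    conn.execute("CREATE TABLE posts (id INTEGER PRIMARY KEY, title TEXT)")
    conn.executemany("INSERT INTO posts (title) VALUES (?)",
                     [("first",), ("second",)])
    timer = PageTimer()
    timer.emit(render_page(conn, timer))
    for sql, seconds in timer.queries:
        print(f"query {seconds:.6f}s: {sql}", file=sys.stderr)
    print(f"sql={timer.sql_seconds:.6f}s script={timer.script_seconds:.6f}s "
          f"print={timer.print_seconds:.6f}s", file=sys.stderr)
```

The timings go to standard error here so they stay out of the page itself; on a live site you would more likely write them to a log file.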
If an SQL statement takes a long time to run, test it directly on the MySQL server via the mysql command line. If the query is still slow, then it’s the query or the SQL server. Run EXPLAIN on the query and make sure it is properly indexed; if everything is fine with the query itself, then talk with support about your SQL server. If the query is fast on the command line but slow from the script, then it is likely the network between the web server and the MySQL machine.
If you have a long script time, but fine SQL times, and your script time varies by time of day, it is likely a web server load problem. You can always check for a high load or swapping and report those.
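If you want to capture load and swap numbers alongside your timings, something like the following Python sketch may help. It assumes a Linux machine where /proc/meminfo is readable from your shell; the function names are made up for this example. Note that it only shows how much swap is in use, which hints at swapping but does not prove the machine is actively swapping right now.

```python
import os


def load_per_cpu(load_avg, cpus):
    """Load average divided by CPU count; well above 1.0 suggests a busy box."""
    return load_avg / max(cpus, 1)


def parse_swap_kb(meminfo_text):
    """Return (swap_total_kb, swap_free_kb) from /proc/meminfo-style text."""
    fields = {}
    for line in meminfo_text.splitlines():
        name, _, rest = line.partition(":")
        parts = rest.split()
        if parts and parts[0].isdigit():
            fields[name.strip()] = int(parts[0])
    return fields.get("SwapTotal", 0), fields.get("SwapFree", 0)


if __name__ == "__main__":
    try:
        one, five, fifteen = os.getloadavg()
        cpus = os.cpu_count() or 1
        print(f"load averages: {one:.2f} {five:.2f} {fifteen:.2f} "
              f"({load_per_cpu(one, cpus):.2f} per CPU)")
    except (OSError, AttributeError):
        print("load average not available on this system")

    try:
        with open("/proc/meminfo") as fh:
            total, free = parse_swap_kb(fh.read())
        print(f"swap in use: {total - free} kB of {total} kB")
    except OSError:
        print("/proc/meminfo not readable here")
```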
If the print time is high, it can be the network anywhere between the visitor and the webserver, or even the speed of the visitor’s machine or browser. Here traceroutes can be useful, and downloading a file of the same size as the script output from other locations, on the same and other networks, can give good comparisons.
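The rules of thumb in the last few paragraphs can be written down as a small decision function. This is only a sketch of that reasoning in Python, not a tool we run. The Timings fields, the function name, and especially the one-second "slow" threshold are assumptions picked for illustration. The fallback when script time is high but steady is my own addition, in line with bad code being the usual cause.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Timings:
    sql_seconds: float                        # total SQL time measured in the script
    script_seconds: float                     # scripting time, excluding SQL
    print_seconds: float                      # time spent printing the HTML
    cli_sql_seconds: Optional[float] = None   # same query from the mysql command line
    script_varies_by_time_of_day: bool = False


def likely_bottlenecks(t: Timings, slow: float = 1.0) -> List[str]:
    """Map measured timings to the places worth investigating first."""
    findings = []
    if t.sql_seconds > slow:
        if t.cli_sql_seconds is None:
            findings.append("SQL is slow: time the query from the mysql command line")
        elif t.cli_sql_seconds > slow:
            findings.append("query or SQL server: run EXPLAIN and check the indexes")
        else:
            findings.append("network between the web server and the MySQL machine")
    elif t.script_seconds > slow:
        if t.script_varies_by_time_of_day:
            findings.append("web server load: check for high load or swapping")
        else:
            # Assumption beyond the rules above: steady slowness points at the code.
            findings.append("script code itself: profile and optimize it")
    if t.print_seconds > slow:
        findings.append("network between visitor and web server, or the visitor's machine")
    return findings


if __name__ == "__main__":
    example = Timings(sql_seconds=2.4, script_seconds=0.2, print_seconds=0.1,
                      cli_sql_seconds=0.05)
    for finding in likely_bottlenecks(example):
        print(finding)
```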
Any other tests you devise for these items will also be useful. There are a great many things that can affect each of the items above, so any narrowing of the possibilities is very helpful.
For items you can’t test from your user shell, you should ask the support person to check on the performance for you. Our support technicians are given instructions to test these very same things.
If you get a message from a support person that doesn’t address these factors, please fill out the survey that comes with each support message, as these make it to the eyes of the support managers. Customer feedback is very useful in complementing our internal quality controls.
Managing a very busy website can at times be frustrating, but DreamHost is about as good a friend as you are going to find when it comes to helping you work through those problems.