Memory limits


#1

I’ve been having problems with database connections failing lately. Support initially told me it was due to the nameserver attacks recently.

I followed up to let them know that I was still experiencing intermittent, but brief, times when database connections would be very slow or fail with errors such as:

or

To my surprise and astonishment, Support responded with this:

[quote]After looking into this with another tech, we have concluded this is not
an issue with the database. The database errors are a side effect of
another issue…

I am very sorry that it’s taken this long for anyone to get back to you
about this matter.

It seems your scripts have been getting automatically killed by our
Process Watcher script due to your sites going over Memory limits on the
shared server. Below is the most recent event at the time of this
writing:

Mon Dec 12 17:02:21 2011 procwatch3 INFO: PID 17496 (php53.cgi)
username:pg1111111 - 18.2MB ram, 0.24 sec cpu [idle php]: killed for uid
ram[/quote]

What? 18.2 MB of RAM? Really? OK, so maybe I had hundreds of visitors at that time? Actually, no: not at all. At that time only one person was visiting my site:

1.1.1.1 - - [12/Dec/2011:17:02:00 -0800] "POST /ajax/Session.dispatch.json HTTP/1.1" 200 967 "http://example.com/session/monolingual" "Mozilla/5.0 (Windows NT 6.1; WOW64; rv:8.0) Gecko/20100101 Firefox/8.0"
1.1.1.1 - - [12/Dec/2011:17:02:05 -0800] "POST /ajax/Session.dispatch.json HTTP/1.1" 200 938 "http://example.com/session/monolingual" "Mozilla/5.0 (Windows NT 6.1; WOW64; rv:8.0) Gecko/20100101 Firefox/8.0"
1.1.1.1 - - [12/Dec/2011:17:02:08 -0800] "POST /ajax/Session.dispatch.json HTTP/1.1" 200 978 "http://example.com/session/monolingual" "Mozilla/5.0 (Windows NT 6.1; WOW64; rv:8.0) Gecko/20100101 Firefox/8.0"
1.1.1.1 - - [12/Dec/2011:17:02:11 -0800] "POST /ajax/Session.dispatch.json HTTP/1.1" 200 945 "http://example.com/session/monolingual" "Mozilla/5.0 (Windows NT 6.1; WOW64; rv:8.0) Gecko/20100101 Firefox/8.0"
1.1.1.1 - - [12/Dec/2011:17:02:17 -0800] "POST /ajax/Session.dispatch.json HTTP/1.1" 200 977 "http://example.com/session/monolingual" "Mozilla/5.0 (Windows NT 6.1; WOW64; rv:8.0) Gecko/20100101 Firefox/8.0"
1.1.1.1 - - [12/Dec/2011:17:02:21 -0800] "POST /ajax/Session.dispatch.json HTTP/1.1" 200 1013 "http://example.com/session/monolingual" "Mozilla/5.0 (Windows NT 6.1; WOW64; rv:8.0) Gecko/20100101 Firefox/8.0"
1.1.1.1 - - [12/Dec/2011:17:02:26 -0800] "POST /ajax/Session.dispatch.json HTTP/1.1" 200 968 "http://example.com/session/monolingual" "Mozilla/5.0 (Windows NT 6.1; WOW64; rv:8.0) Gecko/20100101 Firefox/8.0"

These requests are handled by a custom framework that always comes in at under 20 MB. WordPress can't even start in under 30! And look at the size of these requests: under 1 KB of data each!
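If you want to double-check Support's story against your own access log, tallying requests per second is a quick sanity check. A minimal sketch, fed a two-line inline sample so it runs as-is; in practice you'd point the `awk` stage at your real log (on DreamHost that's typically ~/logs/example.com/http/access.log, but the path is an assumption to adjust):

```shell
# Tally requests per second: field $4 of a combined-format log line is the
# "[dd/Mon/yyyy:hh:mm:ss" timestamp, so counting duplicates of it counts
# hits per second. The inline sample keeps this self-contained.
printf '%s\n' \
  '1.1.1.1 - - [12/Dec/2011:17:02:00 -0800] "POST /a HTTP/1.1" 200 967 "-" "-"' \
  '1.1.1.1 - - [12/Dec/2011:17:02:00 -0800] "POST /b HTTP/1.1" 200 938 "-" "-"' \
  | awk '{print $4}' | sort | uniq -c | sort -rn
```

For the sample above this prints a count of 2 for the 17:02:00 second; on a real log, the busiest seconds float to the top, which makes "only one visitor was on the site" easy to verify or refute.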

Really Dreamhost? Really? I’m using too much memory?

My site was on Reddit some time back. I had 100+ visitors per hour for 24 hours, all of them hitting the site at once and each making about 100 of these requests. I contacted Support after things calmed down to see how the resource usage had gone. Guess what they said?

[quote][20:28:24] me: Hi, one of my sites recently received a big increase in hits lately and I’m just looking through the logs to make sure everything went ok. Could you please tell me if / how many processes were killed for high memory use or long execution times over the last 5 days or so for user: username on domain: example.com?
[20:33:01] Dan G.: I don’t see any kills for that domain.[/quote]

Dreamhost, what’s going on? Is anyone else experiencing this? I rarely say this, but WTF?

Here’s what [font=Courier]top[/font] reports when there are three concurrent users on the site, doing the same thing that the single user above did to cause the Process Watcher to kill my scripts:

  PID USER  PR NI  VIRT  RES  SHR S %CPU %MEM   TIME+ COMMAND
 1512 user  20  0  208m  18m 8052 S    1  0.1 0:06.10 php53.cgi
 7630 user  22  2 18820 1168  944 R    0  0.0 0:00.02 top
31660 user  22  2 66056 2088 1424 S    0  0.0 0:00.76 sshd
31664 user  22  2 99.7m 2320 1440 S    0  0.0 0:00.52 bash


#2

So, to follow up, I decided to test Support’s claim that excessive resource usage was causing the database connectivity issues. I used LoadImpact to send 200 simultaneous users to my site, each clicking through ~150 pages with a delay of 1-2.5 seconds per page. That is a truly massive load, far above anything my site has ever experienced. Processes and requests were killed left and right, which is reflected in my logs: over 1,000 requests were terminated for either hitting the concurrent-connection limit or running out of memory, out of some 13,000 requests over 10 minutes. As a positive side note, the server seems to have handled it all quite well, as the load reported in [font=Courier]top[/font] never went very high.

Anyway, guess what? No database connectivity issues. None. Zero. Nada. Zilch. I monitored it the entire time, loading some of my scripts that connect to the databases. There were never any problems. I ran this test three times because I was also comparing FastCGI vs. regular CGI vs. PageSpeed. There were never any connectivity issues.

It’s really frustrating that almost every time I’ve contacted Support (which is not often), they send me off on some wild and completely unrelated goose chase like this latest incident, rather than admitting there’s a problem on their end, showing some understanding of their own system, or simply updating their panel to reflect reality so that users can help themselves.

For what it’s worth, I’m still generally pleased with what Dreamhost offers as far as shared hosting, but Support is not really up to snuff. DH’s backups policy and resource limits in general are much more generous than any other shared host I’ve investigated. Fortunately I don’t need to contact Support very often, but it sure is frustrating when I do.

So my conclusion is that the database connectivity issues were on Dreamhost’s end, and fortunately they have cleared up. It appears to be pure coincidence that someone sent 10-20 simultaneous requests for a single page, which did cause the Process Watcher to kill my scripts once, on the same day that I reported the issue. Despite my best efforts to replicate the issue with a massive onslaught of simultaneous requests, I cannot.


#3

SQLSTATE[HY000] [2013] Lost connection to MySQL server at ‘reading authorization packet’, system error: 0

I sometimes get the same error when connecting to MySQL. I’m on a private server with a MySQL PS (psmysql), and my resource usage never goes over 120 MB of the 300 MB allotted, at least according to the control panel.

Also, MySQL connections sometimes take 3 seconds, and PHP scripts can run up to 50 seconds; when everything is normal, these same scripts finish in under 0.05 seconds.


#4

I’ve been using DH for seven years. I’m only noting that because I had never heard of Process Watcher until Monday morning when it killed all of my cron jobs. The reason given to me by support (with log details) was that I had “too many concurrent processes”.


#5

How many did you have?


#6

The limit is 25 processes. You’ll almost always hit your memory limit first.


#7

It’s hard to say. I probably had five ssh sessions open (across two clients), half of which had MySQL clients open. So that’s a dozen processes right there.
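For anyone wondering how close they are to the cap, counting your own processes is a one-liner. A minimal sketch (the 25-process figure comes from andrewf’s post above; `-o pid=` is the portable way to suppress the header so `wc -l` counts only processes):

```shell
# Count processes owned by the current user and compare against the
# ~25-per-user cap mentioned above. Each ssh login typically costs at
# least two processes (sshd + your shell), plus whatever you run inside.
ps -u "$(whoami)" -o pid= | wc -l
```

Running this inside each open session makes it obvious how quickly five ssh logins plus a few MySQL clients add up.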

And everything I have on shared hosting is under a single user.

It just caught me off guard. And Support was pretty circumspect about the thresholds. (Thanks andrewf!)

And I don’t have the time right now to think about migrating sites to different users.

But I should be okay for now by limiting the number of ssh sessions.


#8

If you have a low-traffic or less important site, you should try the panel’s feature for moving it to a new user. I’ve tried it for a few of mine and it’s quite painless! I’ve been slowly containerising my sites after the large spike in site hijackings and security breaches across the internet over the past year.

We’re lucky to have andrewf to give some solid information. I’d guess DH avoids stating hard limits because they want to stay flexible; they claim they’ll try to accommodate more popular sites where possible, so there may be room for exceptions. Other hosts are starting to be more upfront about limits, including memory and CPU time, whereas 2-3 years ago most weren’t. I’m guessing DH will follow suit eventually.

But try separate users. You can put your own public key in each of them and log into all of them with a single key (one passphrase, or none, instead of a separate password per user).
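A sketch of that single-key setup, generating into a throwaway directory so it runs without touching your real keys (use ~/.ssh/id_ed25519 in practice; usernames and host in the usage note are hypothetical):

```shell
# Generate one ed25519 keypair; the same public key, installed in each
# shell user's ~/.ssh/authorized_keys, logs you into all of them.
keydir="$(mktemp -d)"
ssh-keygen -t ed25519 -f "$keydir/id_ed25519" -N '' -q
ls "$keydir"   # id_ed25519 (private key) and id_ed25519.pub (public key)
```

Then install the public key once per user, e.g. `ssh-copy-id user1@server.example.com`, `ssh-copy-id user2@server.example.com`, and one key replaces a pile of per-user passwords.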