Long latency, fast site


#1

I’m having an issue that Dreamhost doesn’t seem to be able to help me with at this point. Perhaps someone here can give me a clue as to what to do next.

The issue is with websites that reside on the same server. But note that the different sites do not hit the same database server.

What happens is that there is a very long latency before getting a response back from the server, and after that the page loads quickly. However, static pages load nearly instantly with little latency. I have gone so far as to create a totally fresh static page to ensure that nothing was cached – it loaded very quickly.

I have tried this from two totally separate systems in different locations (although both are in Kansas City).

Here are a few examples:

http://www.scrooks.net/wiki. This is a one-click install mediawiki site that has not been touched at all since the install. I use it for testing things like this. When I try to hit that site it takes 10+ seconds to load, but nearly all of that time is spent “waiting on scrooks.net” with a blank page.

http://www.crooks.net/ttt.htm. A simple static page with a large picture on it. It loads nearly instantly with no “waiting on crooks.net”.

http://www.crooks.net. A Drupal site. Again, it takes 10+ seconds before I get any kind of response and then the entire page loads nearly instantly.

Dreamhost support staff is trying to tell me that the problems are in my code and that hitting a database can cause delays. But it seems to me that a standard mediawiki install should not take 10+ seconds to load the home page, so something else is wrong and not the code.

Some other data points:
– Before a few days ago I was not experiencing this problem at all.
– Very early this morning the latency seemed better, although still not what I would call speedy or normal. It got dead slow again during work hours.
– My pings are good at around 60ms.
– My server load is usually around 7.
– My server is cookie.
– The wiki database is on spot:alvie, and the drupal database is on puss:lucilla.

I hope someone can help. This is killing me. I’ve been with Dreamhost for 4 years now and have been very happy (and forgiving during those downtime periods). But it’s kind of frustrating that I’m now getting responses on this that essentially say, “everything looks okay, your installation just needs to turn on the cache, and drupal code is inefficient.” What I’m seeing is definitely not okay. Although I can’t say where the problem is, and it’s possible it’s not even at Dreamhost I guess, I know that mediawiki install should respond much, much faster and has in the recent past.

Any pointers for what to try/do/say/ask next?


#2

Can you run a traceroute or two and post the results here? If you have a Linux desktop machine or access to one, can you also run both a tcptraceroute and an lft to the site and post it here?


#3

Sure. Here’s a tracert from WinXP:

C:\Documents and Settings\cersyc>tracert www.scrooks.net

Tracing route to www.scrooks.net [208.113.183.215]
over a maximum of 30 hops:

1 <1 ms <1 ms <1 ms 10.184.34.1
2 <1 ms <1 ms <1 ms 10.184.254.49
3 1 ms 1 ms 1 ms 10.160.56.229
4 1 ms 1 ms 1 ms 10.160.56.78
5 1 ms 1 ms 1 ms 10.160.58.77
6 1 ms 1 ms 1 ms 2800-1int2-e50.cerner.com [10.160.2.46]
7 1 ms 1 ms 1 ms FW1-Trusted.cerner.com [159.140.253.10]
8 2 ms 2 ms 2 ms 159.140.254.5
9 4 ms 3 ms 2 ms 12.125.73.145
10 115 ms 36 ms 39 ms tbr1.sl9mo.ip.att.net [12.123.24.194]
11 37 ms 36 ms 41 ms tbr2.dlstx.ip.att.net [12.122.10.90]
12 38 ms 40 ms 37 ms ggr4.dlstx.ip.att.net [12.122.86.157]
13 27 ms 27 ms 28 ms 192.205.34.82
14 62 ms 60 ms 59 ms 64.209.105.26
15 80 ms 77 ms 69 ms apache2-kant.cookie.dreamhost.com [208.113.183.215]

Trace complete.

And here’s a traceroute from MacOSX:

traceroute to scrooks.net (208.113.183.215), 64 hops max, 40 byte packets
1 192.168.2.1 (192.168.2.1) 1.950 ms 1.162 ms 1.134 ms
2 76.48.80.1 (76.48.80.1) 16.237 ms 13.721 ms 10.875 ms
3 gig2-1.kscymovivi-rtr1.kc.rr.com (24.31.239.161) 10.890 ms 10.214 ms 9.982 ms
4 so5-1-1-CHCGILL3-RTR1.kc.rr.com (24.94.160.81) 31.118 ms 28.328 ms 20.944 ms
5 so-3-1-0.kscymol3-rtr1.kc.rr.com (24.94.160.162) 39.408 ms so0-0-2.kscymoL3-rtr1.kc.rr.com (24.94.161.105) 22.730 ms so1-0-2.kscymoL3-rtr1.kc.rr.com (24.94.160.49) 35.384 ms
6 ge-5-1-203.hsa1.StLouis1.Level3.net (4.79.132.13) 43.368 ms te-1-3.car1.StLouis1.Level3.net (4.79.132.33) 41.525 ms 42.943 ms
7 ae-11-11.car2.StLouis1.Level3.net (4.69.132.186) 141.379 ms 101.335 ms 164.152 ms
8 ae-4-4.ebr2.Chicago1.Level3.net (4.69.132.190) 41.659 ms 57.292 ms 52.424 ms
9 ae-23-54.car3.Chicago1.Level3.net (4.68.101.103) 45.597 ms ae-23-56.car3.Chicago1.Level3.net (4.68.101.167) 41.499 ms ae-23-52.car3.Chicago1.Level3.net (4.68.101.39) 48.898 ms
10 glbx-level3-te.Chicago1.Level3.net (4.68.110.194) 36.789 ms 36.525 ms 35.038 ms
11 64.209.105.26 (64.209.105.26) 94.957 ms 116.991 ms 305.342 ms
12 apache2-kant.cookie.dreamhost.com (208.113.183.215) 97.872 ms 98.208 ms 96.353 ms

The tcptraceroute and lft commands were not available on my Mac, and the one Linux machine I poked at didn’t seem to have them (at least in my path) either…

As another point of data, a friend of mine in Boulder (I’m in KC) said it took him 26 seconds to load the page, with only a second or so between the time he saw anything and the time he saw everything.

Is it possible this is a PHP issue? www.crooks.net uses a local PHP5 (and I’ve tinkered with fastcgi versus not-fastcgi with no effect) and www.scrooks.net/wiki uses the Dreamhost PHP. I’m just grasping at straws here…


#4

Focusing in on your virgin mediawiki installation, my fresh mediawiki install on my Dreamhost PS server takes a total of 2-4 seconds to load everything. Taking a look at the firebug results, there’s a lot of crap that gets loaded with every page, so it’s never going to be blazingly fast.

Have you checked out what the component loading times are?

What are 50DISK50, 3DOM50, and 1IP1DOM50?
More Dreamhost coupons


#5

There’s not blazingly fast and there’s 26 seconds before you see anything. I think 26 seconds is a little bit much.

I haven’t looked at component loading yet, but I’m pretty sure I’m going to see nothing happening for about 25 of those 26 seconds. I’ll verify tomorrow.


#6

Can someone from over there in CA who is closer to the DreamHost farm give http://www.scrooks.net/wiki a hit to see how long it takes to load for them? I have a hard time believing that if DreamHost support saw a 26 second delay they would say it’s pretty much working as it should be. Just trying to narrow things down. I’d sure like to get all my sites back into a usable state…


#7

Hold on, I just tested again and it’s coming back at reasonable speeds. Either someone fixed something or it gets better at night. I’ll try again during work hours tomorrow and report, assuming I haven’t heard good news from support.


#8

It’s responding to me in a shade over 5 seconds and I’m on the east coast with around 80-90ms latency to the dreamhost servers. This is on par with what I saw with my wiki before I “went PS”.



#9

Thanks. One reason I asked you to do this is that if you were being routed over the Sprintlink network, there are some places on that network that are real trouble spots, including KC/Lee’s Summit. But you are not on that network and so that slaps down that idea.


#10

I’ve had similar problems (on millhouse, I see from nslookup you’re on cookie) that were caused by high server load. It affected CMS sites like Mediawiki and Drupal the worst.

This morning, after I had complained last night, the server load went back down and my pages have loaded pretty quickly since.

I made a script that can be used to check load from the browser - this should be uploaded as {site}/cgi-bin/load.cgi

#!/bin/bash
echo "Content-type: text/plain"
echo ""
uptime|awk '{print $1" "$10" "$11" "$12}'
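One caveat with the awk version: the field numbers depend on how uptime formats its line, and the load figures shift position when, for example, the machine has been up for less than a day. A minimal sketch of an alternative that avoids parsing uptime at all, assuming Python 3 is available for CGI scripts on the server (an assumption; only bash is shown in this thread). The function name format_load is made up for illustration:

```python
#!/usr/bin/env python3
import os
import time


def format_load(time_of_day, loads):
    """Format as 'HH:MM:SS 1min, 5min, 15min', like the bash script's output."""
    one, five, fifteen = loads
    return "%s %.2f, %.2f, %.2f" % (time_of_day, one, five, fifteen)


def main():
    # os.getloadavg() returns the 1-, 5-, and 15-minute load averages
    # directly, so nothing depends on uptime's column layout.
    body = format_load(time.strftime("%H:%M:%S"), os.getloadavg())
    print("Content-type: text/plain")
    print()
    print(body)


if __name__ == "__main__":
    main()
```

It would be uploaded and made executable the same way as load.cgi.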


#11

I’m pretty sure I’ve had server loads that are okay. You can now check at http://www.scrooks.net/cgi-bin/load.cgi.

As I suspected, things were not fixed last night, the problem just didn’t show up. This morning everything is very, very slow again. I tried looking at things in firebug and ethereal, but I’m not sure I know what I’m doing well enough. It looks like the server is just taking forever to do an initial response – but only for non-static pages.
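One way to put numbers on that impression without Firebug or Ethereal is to time the first response byte separately from the full download. A minimal sketch in Python, using only the standard library. The function names are made up for illustration, and it does not follow redirects, so the path passed in should be the final URL:

```python
import http.client
import time


def time_request(host, path="/", port=80, timeout=60):
    """Return (seconds until response headers arrive, seconds until body is read)."""
    conn = http.client.HTTPConnection(host, port, timeout=timeout)
    start = time.perf_counter()
    try:
        conn.request("GET", path)      # opens the TCP connection and sends the request
        response = conn.getresponse()  # returns once the status line and headers arrive
        first_byte = time.perf_counter() - start
        response.read()                # download the rest of the body
        total = time.perf_counter() - start
    finally:
        conn.close()
    return first_byte, total


def waiting_share(first_byte, total):
    """Fraction of the total time spent waiting for the server to start responding."""
    return first_byte / total if total > 0 else 0.0
```

Calling time_request("www.scrooks.net", "/wiki") and time_request("www.crooks.net", "/ttt.htm") back to back would show whether the dynamic page's delay sits almost entirely before the first byte, which is what the symptoms here suggest. Note that the first figure also includes the time to open the TCP connection.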

I’m going on 4 days now with sites that are totally unusable during the day. Haven’t heard from support since yesterday morning.

Now can someone in CA try http://www.scrooks.net/wiki to see what their response times look like during work hours?


#12

Mediawiki does a lot of queries to construct its page.

Looking at your wiki now I’m getting 10s to completion rather than the 5s I was getting last night. Some of the components were definitely slowish at 1s or more.



#13

I don’t really have anything valuable to add… I just want to second the notion that MediaWiki seems to be prone to long latency periods – as the original poster notes. I just one-click-installed my MediaWiki. Even though it’s fairly “stock,” it takes approximately 8 seconds of “Waiting for website.com…” before it connects. Then, when it actually connects with the server, the page loads very quickly.

Any tips for improving the speed?


#14

Just out of curiosity, are these sites using “pretty urls”?

Re-write rules can considerably delay server response times. ;)

–rlparker


#15

~1 sec loadtime from Australia (standard).

Care to share that load script? I’d like to take a look at it.


#16

The conclusion to all this is that my server was being pounded (on something other than my sites). Load averages were 10+ during the day. I mistakenly thought that those were already expressed in a percentage, so I didn’t think it was bad. Not sure why Dreamhost didn’t catch it.

For some odd reason, the server had much lower load times yesterday. Maybe the site getting pounded got moved to a new server because that site owner complained? Don’t know. We’ll see what happens today. Dreamhost support says they’ll move me to a new server if that’s what it’ll take.


#17

I’m getting the same long load times on any site that uses a database.

I’ve tested a basic php page with some logic etc but without a database connection and it all seems fine for me that way… but as soon as I try anything with a database the page load time jumps up to between 10 and 30 secs with pages often timing out.

I’ve contacted DH support 3 times about this now and each time they do something that resolves the problem temporarily and then it comes back again a few days later.

Josh did add a month’s worth of hosting fees to my account for the last one, but I’m still having the problem, so that doesn’t really help.

I’m on webserver ajax and database server nanook.

The load script running on my server outputs the following…

10:29:24 10.58, 9.09, 10.16

I have no idea what that means.
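For what it’s worth, those three numbers are the 1-, 5-, and 15-minute load averages that uptime reports: roughly, the average number of processes running or waiting to run over each window. They are not percentages, and they only mean something relative to the number of CPU cores. A rough sketch of how to read a line from load.cgi, where the core count is purely an assumption (nobody in this thread has said how many cores these servers have):

```python
def parse_load_line(line):
    """Split a load.cgi line such as '10:29:24 10.58, 9.09, 10.16'."""
    time_of_day, rest = line.split(None, 1)
    one, five, fifteen = (float(x) for x in rest.replace(",", " ").split())
    return time_of_day, one, five, fifteen


def load_per_core(load, cores):
    """Load average divided by core count; above 1.0 means work is queuing."""
    return load / cores


# Hypothetical core count, for illustration only.
ASSUMED_CORES = 4

if __name__ == "__main__":
    _, one, five, fifteen = parse_load_line("10:29:24 10.58, 9.09, 10.16")
    for label, value in (("1 min", one), ("5 min", five), ("15 min", fifteen)):
        print("%s: %.2f per core" % (label, load_per_core(value, ASSUMED_CORES)))
```

Under that four-core assumption, a load of 10.58 works out to more than two runnable processes per core, meaning requests are waiting in line for CPU time.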


#18

I’m having the exact same problem as the user above.

I have gone back and forth numerous times with support, but with no satisfaction. Each time they do something to make it better temporarily. I’m never told what was done and within a few hours it goes back to the way that it was. Very slow…10-30 seconds and then the page pops.

There is a mysql string error displayed on the mysql control panel, but support claims it is benign. I don’t care if it is, I just want it resolved so that it can be ruled out as the cause.

Also, many mysql variables are in the ‘red’ including the ‘slow queries’ variable. Can’t be good.

I’m on the Triton server, which always displays a ‘high load’ in the diagnostic test run from the control panel.

I’m still waiting to see what they say now. I’ll update with the results.

If anyone else gets a resolution would you update this thread?

Thanks.


#19

I’ve found the “high load” diagnostic warning to be accurate. For some reason, it’s a bear for them to get the load down from a user/site that is overloading the server. I know that sometimes loads are so high they can’t even log in to the server to figure out what’s going on, which obviously makes it time consuming to debug these issues.

I also think they don’t like just immediately shutting down users completely who are killing the server. We could obviously have some good discussions about the needs of the many vs. the needs of the one.



#20

I spent a little time poking around the wiki trying to determine what a “high” load number is. In one place, http://wiki.dreamhost.com/116_Common_Questions (Question 2), it says they try to keep them under the 2 to 3 range. In another place, http://wiki.dreamhost.com/Uptime, it indicates that anything over 1 is bad. And I found another place which I don’t have the URL for where it talks about how on these multiprocessor servers anything under 3 or so is perfectly fine. It’s all a little unclear, and I’m hoping to get an answer to the question, “for what load values should I inform support there’s a problem?” from support in the next day or two.

When I was having problems, my server showed loads that varied from 7 to 12. Now I understand that that was clearly the problem. For the last two days the loads have been 3 or less and my sites have run far better. Astonishingly fast when the load is <1.

I understand these are shared servers and loads will possibly vary wildly, but it would be very handy to know what official value to watch for to raise a red flag with support. (Anyone here ever been told that number?)

If you are seeing 10+ loads, contact support and tell them the numbers.

Note also that my server usually showed the “high load” diagnostic warning, but I glossed over it because I assumed it would be one of the first things support checked, and if they didn’t seem to think it was a big deal it must be a meaningless warning. Now I think that was a really bad assumption. I’m not sure why support wouldn’t check load values first thing with problems like this. I’d like to give them the benefit of the doubt and think that they do check it and there are other reasons why it doesn’t raise an alert…