500+ load averages, anything I can do?


#1

[pacman]$ uptime
06:59:18 up 64 days, 14:35, 1 user, load average: 513.54, 454.29, 419.49

This is just ridiculous. Are all Dreamhost servers this insanely overloaded?


#2

Oh god

[pacman]$ uptime
07:07:20 up 64 days, 14:43, 1 user, load average: 963.75, 763.30, 576.33


#3

Yeah, I had joy like that on Poseidon. I put in enough tickets asking to move and they finally shuffled me off.

It's usually because of NFS bottlenecks… still sucks though.


#4

I think the smoke rising from the server should get their attention.

Just submit a ticket. They’re probably already looking into it, but a ticket is like a squeaky wheel. I’m guessing this is a new server with new users. As users iron out the kinks in their websites, the server calms down, but Support needs to track these problem users down and let them know there’s a problem.

-Scott


#5

If you can actually get a response from uptime with a load that high, it means that there are a lot of processes waiting on an NFS server. Anything stored on the NFS server that is not responding will have trouble, but data on other NFS servers is not affected. Any time a load shoots up like that, the emergency response team gets notified and will be working on it.
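
If you want to check this yourself: on Linux the load average counts processes stuck in uninterruptible sleep (state D, typically blocked on disk or NFS I/O) as well as runnable ones, which is how the number can hit the hundreds while the shell still responds. A rough sketch, assuming a Linux box with the procps version of ps; count_state is just a helper name made up for this example:

```shell
# count_state LETTER
# Reads one process state code per line on stdin and prints how many
# start with LETTER. (Hypothetical helper, not a standard command.)
count_state() {
    awk -v s="$1" 'substr($1, 1, 1) == s { n++ } END { print n + 0 }'
}

# Usage on a live system (assumes procps ps):
#   ps -eo state= | count_state D    # processes blocked on I/O
#   ps -eo state= | count_state R    # processes running or runnable
```

If the D count is large and the R count is small, the load is mostly processes waiting on storage rather than real CPU work.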


#6

Then why did my ticket sit in the queue for an hour before someone decided to reboot the server?


#7

Well, presumably someone was fixing the file server that was causing the problem. The person who answered your question was not that person, though. They may have asked the admin who was working on the file server what was up, and the admin probably said “Yeah, it’s almost fixed, hold on a second,” then “Okay, it’s back up now, soft reboot that server to get the NFS mount back,” or something. Have you ever worked on computers? If a situation is affecting multiple servers, as a file server problem likely would, we usually group the incoming messages about that problem together and then answer them all once the problem is resolved. It may have been a case where 2-3 web servers needed to be rebooted. I am not quite sure what you are expecting, but if support figured out the root issue, coordinated with the fixer, assisted the fixer, and then answered all the related support requests within an hour, as a manager I would consider that a job well done.


#8

You’re generally not going to get a response unless they’ve fixed it or it’s an intermittent problem they couldn’t catch that time around. Response is usually within 24 hours.

-Scott


#9

I’ve also been having lots of load problems these past 2-3 days. Site only loads sporadically, usually after doing a reboot. Last time this happened was back in late March. Can’t even get into my wp-admin panel to deactivate plugins and test further. Any suggestions other than sitting and waiting and pulling hair out of my head?


#10

If you can SSH in, you can see what the load is like on your server with the ‘uptime’ command. If you need to take your site offline, you can rename the index.php to index.old and put a blank index.html file in there.
http://wiki.dreamhost.com/Ssh
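
Here is roughly what the rename trick looks like from the shell. This is only a sketch: site_offline and site_online are names I made up, and the directory you pass in is assumed to be your site's web root (something like ~/example.com, adjust to your own setup).

```shell
# site_offline DIR: hide index.php and leave a blank index.html in its place.
# Runs in a subshell so your current directory is not changed.
site_offline() (
    cd "$1" && [ -f index.php ] && mv index.php index.old && : > index.html
)

# site_online DIR: undo the above once the load has calmed down.
site_online() (
    cd "$1" && [ -f index.old ] && rm -f index.html && mv index.old index.php
)

# Usage (the path is a placeholder):
#   site_offline ~/example.com
#   uptime                      # watch whether the load drops
#   site_online ~/example.com
```

Both functions do nothing and return non-zero if the file they expect is missing, so running one twice by accident will not clobber anything.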

Are you on a Virtual Private Server? Your comment about doing a reboot makes it seem that way.

-Scott


#11

Yes – I am on a VPS. Getting very weird spikes in memory usage and was unable to even get to my plugins yesterday to begin troubleshooting.


#12

Phew:
[seltzer]$ uptime
15:15:24 up 90 days, 5:52, 4 users, load average: 3.26, 2.58, 2.61