High iowait on PS - every day at 5:12AM PST


#1

Ever since I switched to Dreamhost PS, my site (SQLite-based) has been unusable every day from 5:12 AM to 5:45 AM PST (that’s just after 8:00 AM EST).

I have been dealing with support for months with little progress. I’ve got plenty of guaranteed RAM and CPU, but when the disk is getting thrashed by SOMETHING (support can’t seem to figure out what), all that CPU and RAM doesn’t help. Example output:

[code]
[ps4822]$ date; mpstat 1 5
Mon Dec 29 05:22:42 PST 2008
Linux 2.6.22.19-xeon-vserver-c4.1-grsec-grsec2.1.11-vs2.2.0.7 (ps4822) 12/29/08

05:22:42  CPU  %user  %nice  %sys  %iowait  %irq  %soft  %steal  %idle   intr/s
05:22:43  all   0.74   0.74  2.23    96.03  0.00   0.25    0.00   0.00  1886.14
05:22:44  all   7.98   1.25  3.24    87.03  0.25   0.25    0.00   0.00  2081.19
05:22:45  all   0.00   1.49  1.74    96.53  0.25   0.00    0.00   0.00  1819.80
05:22:46  all   0.00   0.99  1.99    96.77  0.00   0.25    0.00   0.00  1861.39
05:22:47  all   0.50   0.50  1.98    96.29  0.25   0.50    0.00   0.00  1914.00
Average:  all   1.84   0.99  2.23    94.54  0.15   0.25    0.00   0.00  1912.50
[/code]
Since this happens every day at the same time, I did some digging on my own and discovered my daily crons were scheduled to run at – gasp – 5:12 AM PST. One of those daily crons is updatedb, which updates the database used by the locate command. I requested my crons get moved to 3:12 AM PST so they’re outside of East Coast business hours – which was done – but I’m still having issues. My only guess is that other VMs sharing a physical disk with me are still running their daily crons at that time, which hurts me just the same.

Has anyone else run into this, or have any ideas? Support has pretty much failed me, even though I’m paying $15/mo PLUS $45/mo for CPU/RAM on PS. They don’t seem to really want to troubleshoot it; their idea is for me to purchase even MORE CPU/RAM to combat their disk being thrashed every day at the same time.


#2

Have you tried ‘lsof’ to see what files are open? And how about looking through system logs? And ‘top’ to see what’s actually running?
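If top looks idle, it can also help to look for processes stuck in uninterruptible sleep (state D), which is how a process waiting on disk usually shows up. Here is a rough sketch: blocked_procs is just a made-up helper name, and it assumes a procps-style ps that accepts -eo state,pid,comm. It presumably only sees processes inside your own PS, so an empty result doesn't rule out a neighbor on the same disk.

```shell
# Read "state pid command" lines on stdin and print the PID and command
# name of every process in uninterruptible sleep (state D, usually I/O wait).
# blocked_procs is a hypothetical helper name, not a standard tool.
blocked_procs() {
    # The ps header line starts with "S", so it never matches /^D/.
    awk '$1 ~ /^D/ { print $2, $3 }'
}

# Usage, assuming a procps-style ps:
#   ps -eo state,pid,comm | blocked_procs
```

Running it a few times during the bad window gives a better picture than a single snapshot.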

Your guess about someone else hogging disk time is a pretty good shot, but only Support would be able to track that down. If all else fails, try asking to be moved to a different machine.

-Scott


#3

Yes, top shows nothing running. SOMETHING is hitting the disk at the same time every day. Support has been pretty poor. Considering I’m paying extra for PS, I didn’t think I’d still be affected by other people.

I’ve tried lsof and top and don’t see anything out of the ordinary. It’s quite frustrating paying for a VM and not having root; there’s only so much I can do…

Really I’m beholden to support at this point, but was hoping someone else on PS could do an mpstat at 5:12AM PST and see if they also see the 0% idle and high iowait times.
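For anyone willing to check, here is a small filter that pulls the bad samples out of mpstat output. It's only a sketch: high_iowait is a made-up name, and it assumes the same column layout as the mpstat output in the first post (%iowait in field 6, %idle in field 10), which may differ on other sysstat versions.

```shell
# Print time, %iowait and %idle for every per-second sample whose %iowait
# is above a threshold (default 50). Header, banner and "Average:" lines
# are skipped. Assumed layout: time CPU %user %nice %sys %iowait ... %idle
high_iowait() {
    awk -v limit="${1:-50}" '
        $2 == "all" && $1 != "Average:" && $6 + 0 > limit { print $1, $6, $10 }
    '
}

# Usage during the bad window:
#   mpstat 1 5 | high_iowait 80
```

If nothing is printed, no sample in that run had iowait above the threshold.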


#4

Hey, same problem here.

Every day, near 06:12 PDT, the server goes down for 30-45 minutes.

Any solutions?


#5

I’m sure it’s their daily crons running.

Run cat /etc/crontab and see when your daily crons are set to run. They just need to move them to a more convenient time, such as 3:00 AM PST, and they need to do it on ALL the VMs.
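If you don't want to read the whole file, something like this prints just the time of the daily run. It's a sketch that assumes a Debian-style /etc/crontab, where the daily jobs are started by a line mentioning cron.daily and the first two fields are plain numbers for minute and hour. daily_cron_time is a name I made up.

```shell
# Print the HH:MM at which the cron.daily jobs start, read from a
# system crontab (default /etc/crontab). Comment lines are ignored.
# Assumed field order: minute hour day-of-month month day-of-week user command
daily_cron_time() {
    awk '!/^[ \t]*#/ && /cron\.daily/ { printf "%02d:%02d\n", $2, $1 }' "${1:-/etc/crontab}"
}

# Usage:
#   daily_cron_time               # reads /etc/crontab
#   daily_cron_time ./my-crontab  # or any file in the same format
```

The time is in the server's local time zone, so compare it against the output of date on the same box.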


#6

I finally got a reply from support, and with my help they identified that all their PS servers on the same cluster were running updatedb at the same time, causing massive disk thrashing.

I still have a 1-minute blip at 5:12 AM PST, but I can live with that.


#7

I’m glad to hear that it got resolved.

-Scott