Oi, what a week.
Last Wednesday our main internal db server (we have two) started dying inexplicably. After pretty much 24 hours of wrestling with it, we finally downgraded it from MySQL 4 to 3.23.55 and that fixed it for good. Internal dbs being down means no Web Panel, Signup, Webmail, or, well, basically anything that might use your web ID or our internal data on you. Customer websites, FTP, and email (apart from webmail access to it) were up though.
Then yesterday afternoon for about an hour and a half something weird happened: our main router (of 2) crapped out for an unknown reason, dropping all of its BGP sessions and creating an HSRP fight between itself and our backup router since they could no longer reach each other. Apparently, the main switch behind those routers got a bit confused (probably due to the ARP traffic generated) and started blocking even non-routed traffic. Which meant ALL of DreamHost… everything… was down to the world.
To add insult to further injury, shortly after that was fixed our OTHER (of 2) internal db server started having the same problems our first one had last Wednesday. Now finally, IT’S been downgraded to 3.23.55 and seems okay again.
I hope everybody can put this week behind them as quickly as we’d like to. I guess go celebrate St. Patrick’s Day LONG AND HARD!
(Also, you can read the official announcements for each of these in the “Status > Announcements” area of our web panel, now that it’s back up.)