Shared hosted memcache service fleet


#1

It’d be really awesome if there were a memcache fleet available for users, with an on-box proxy that provides a simple API that can fetch or distribute keys across the fleet. It would be up to users to hash the keys well for their own security (using something like sha1(secret+key)), and the on-box proxy would just partition across the fleet using consistent hashing or something.
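A rough sketch of both halves of that idea, in Python. The names here (`namespaced_key`, `HashRing`, the `cache-a`/`cache-b`/`cache-c` hosts, the replica count) are made up for illustration and not anything DreamHost provides. The key hashing follows the sha1(secret+key) suggestion literally; the standard library's `hmac` module would be the more careful way to mix a secret into a key.

```python
import bisect
import hashlib


def namespaced_key(secret: str, key: str) -> str:
    """Hash the user's secret together with the key, i.e. sha1(secret+key)."""
    return hashlib.sha1((secret + key).encode("utf-8")).hexdigest()


class HashRing:
    """Toy consistent-hash ring; each node gets several points on the ring."""

    def __init__(self, nodes, replicas=100):
        ring = []
        for node in nodes:
            for i in range(replicas):
                ring.append((self._point(f"{node}#{i}"), node))
        ring.sort()
        self._ring = ring
        self._points = [point for point, _ in ring]

    @staticmethod
    def _point(value: str) -> int:
        return int(hashlib.sha1(value.encode("utf-8")).hexdigest(), 16)

    def node_for(self, key: str) -> str:
        # First ring point clockwise from the key's hash, wrapping at the end.
        idx = bisect.bisect(self._points, self._point(key)) % len(self._points)
        return self._ring[idx][1]


# Hypothetical fleet of three cache hosts.
ring = HashRing(["cache-a", "cache-b", "cache-c"])
wire_key = namespaced_key("my-site-secret", "recent_forum_posts")
host = ring.node_for(wire_key)
```

The point of the ring over a plain `hash(key) % n` is that dropping or adding a host only moves the keys that lived on that host, so the rest of the cache stays warm.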

(I’d provide more details but that would probably infringe on my day job’s NDA/noncompete. Dreamhost folks are pretty smart though.)


#2

That’s a really interesting concept, but I see at least two big issues we’d have to figure out first:

[list=1]
[*] Most off-the-shelf web applications that can be configured to use memcache don’t do any sort of hashing or namespacing for their keys, so every instance of WordPress (for instance) would end up sharing cache entries, causing bizarre behavior.

[*] Memcached’s eviction policies treat all keys as having largely equal importance, so they would likely not be suitable for a shared configuration like this. (As it stands, the cache would end up dominated by whatever application used the cache most aggressively; keys inserted by applications that used it more lightly or infrequently would be less likely to “survive”.)
[/list]
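To illustrate the second point, here is a toy simulation in Python. It models the shared cache as a single least-recently-used list, which is a simplification and not memcached's actual implementation; `TinyLRU`, `simulate`, and the `light:`/`heavy:` key prefixes are hypothetical names for this sketch.

```python
from collections import OrderedDict


class TinyLRU:
    """Simplified LRU cache: every key competes for the same slots."""

    def __init__(self, capacity):
        self.capacity = capacity
        self.items = OrderedDict()

    def set(self, key, value):
        if key in self.items:
            self.items.move_to_end(key)
        self.items[key] = value
        if len(self.items) > self.capacity:
            self.items.popitem(last=False)  # evict the least recently used key

    def get(self, key):
        if key not in self.items:
            return None
        self.items.move_to_end(key)
        return self.items[key]


def simulate(capacity=100, heavy_writes=1000):
    """Return how many of a light user's 10 keys survive a heavy user's writes."""
    cache = TinyLRU(capacity)
    for i in range(10):
        cache.set(f"light:{i}", "x")
    for i in range(heavy_writes):
        cache.set(f"heavy:{i}", "x")
    return sum(1 for i in range(10) if cache.get(f"light:{i}") is not None)
```

In this model, once the heavy application has written more entries than the cache holds, none of the light application's keys are left, regardless of how valuable they were to that user.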

That being said, while it isn’t a feature that we officially support, it’s possible to install memcached on a DreamHost VPS and use it locally. (We do officially provide memcached as a caching component for DreamPress service, but that doesn’t sound like what you’re after here.)


#3

For 1, the on-box proxy could require an auth secret or something, and that could be used to hash the key. It could also be used to instrument the heavier users and drive reaching out to them to suggest they move to a VPS. I was thinking it wouldn’t be the memcached protocol itself but some wrapper to it, and require apps to actually support it specifically rather than just letting anyone configure WordPress’s off-the-shelf thing to talk to it (I’m also assuming that most apps don’t use a client/etc. that does a very good job of partitioning, either).
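Something like the following Python sketch is what I have in mind for the wrapper. Everything in it is an assumption for illustration: the `CacheProxy` class, the tenant names, the plain dict standing in for the memcached fleet, and the use of HMAC-SHA256 to derive the stored key from the auth secret.

```python
import hashlib
import hmac
from collections import Counter


class CacheProxy:
    """Hypothetical on-box proxy: authenticates, namespaces keys, counts usage."""

    def __init__(self, secrets, backend=None):
        self._secrets = dict(secrets)  # tenant name -> auth secret
        self._backend = backend if backend is not None else {}  # stand-in for the fleet
        self.requests = Counter()  # per-tenant request counts (instrumentation)

    def _wire_key(self, tenant, secret, key):
        expected = self._secrets.get(tenant)
        if expected is None or not hmac.compare_digest(
            expected.encode("utf-8"), secret.encode("utf-8")
        ):
            raise PermissionError("unknown tenant or bad secret")
        self.requests[tenant] += 1
        # The secret keys the hash, so two tenants never collide on "front_page".
        digest = hmac.new(secret.encode("utf-8"), key.encode("utf-8"), hashlib.sha256)
        return digest.hexdigest()

    def set(self, tenant, secret, key, value):
        self._backend[self._wire_key(tenant, secret, key)] = value

    def get(self, tenant, secret, key):
        return self._backend.get(self._wire_key(tenant, secret, key))

    def heavy_users(self, threshold):
        """Tenants with at least `threshold` requests: candidates for a VPS nudge."""
        return sorted(t for t, n in self.requests.items() if n >= threshold)
```

Because apps have to call this API rather than speak the memcached protocol directly, an unmodified WordPress install can't be pointed at the fleet and trample other users' keys.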

2 is definitely a problem but if the fleet is big enough then everyone still benefits somewhat. (Heck, the fleet could just be colocated on the web and mail hosts - just set aside, say, 100MB from everyone, and as your userbase and therefore server farm grows, so does the cache.)

Also, if someone isn’t using the cache very aggressively, then the cache probably doesn’t buy them anything anyway. If someone’s got enough cacheable data that’s that critical, then they’d probably want their own dedicated cache host in the first place. So that’s where instrumentation comes in. :slight_smile:

My personal use case here is just that I’d like to have a few pieces of data on a lot of pages on my site, where said data seldom changes but incurs a lot of database activity to read. Having one little widget (recent forum posts) slowed every db-using page on my site down to the point that it was unusable. Removing the widget fixed it. Frankly the widget wasn’t really doing anything useful anyway, but it’s still nice to have, and as an engineer who works in this domain in my day job I thought maybe it’d be interesting to see if you would be interested in adding it. (Like I said, NDA/noncompete prevents me from just handing you a solution.)
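For the widget case, the pattern is just read-through caching with an expiry. A minimal Python sketch, with an in-process dict standing in for the shared cache; `TTLCache`, `get_or_compute`, and `load_recent_posts` are hypothetical names, and the 300-second lifetime is an arbitrary example value.

```python
import time


class TTLCache:
    """Read-through cache: run the expensive query only when the entry is stale."""

    def __init__(self, clock=time.monotonic):
        self._clock = clock
        self._store = {}  # key -> (expires_at, value)

    def get_or_compute(self, key, ttl_seconds, compute):
        now = self._clock()
        hit = self._store.get(key)
        if hit is not None and hit[0] > now:
            return hit[1]  # still fresh, skip the database
        value = compute()
        self._store[key] = (now + ttl_seconds, value)
        return value


def load_recent_posts():
    # Placeholder for the expensive "recent forum posts" database query.
    return ["post 3", "post 2", "post 1"]


cache = TTLCache()
posts = cache.get_or_compute("recent_forum_posts", 300, load_recent_posts)
```

With this in place, the database query runs at most once per expiry window instead of once per page view.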

I could also just do simple filesystem-based caching but that’s kind of silly and causes other problems.

Anyway, does WordPress really not hash its keys? If so that’s pretty dismaying, although par for the course with them. Do they expect it to just be secure? Because memcache sure isn’t - its entire security model is predicated on the assumption that the keys are unguessable (and/or that the port to access it isn’t reachable and you only have a single app talking to a single memcached).