Just a few notes about why this problem is so much more complicated than it might sound.
It’s unlikely that lots of users will start installing SA themselves. It requires a degree of technical sophistication that many users don’t have, and it won’t work on the mxxxxxx type accounts, only on mail users that have corresponding shell / ftp users. I’ve also already seen some resource consumption problems even with the small number of users that currently have it set up. Having a single installation by itself doesn’t really help, although running spamc / spamd, or using some sort of inline content filter, would be a little more efficient.
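To make the spamc / spamd point concrete, here is a rough sketch of handing one message to an already-running spamd through the lightweight spamc client, instead of starting the full SpamAssassin script for every message. It assumes spamd is running and `spamc` is on the PATH; the function names `parse_spamc_check` and `check_message` are made up for illustration and are not part of SpamAssassin.

```python
import subprocess
from typing import Tuple


def parse_spamc_check(output: str) -> Tuple[float, float]:
    """Parse the 'score/threshold' line that `spamc -c` prints."""
    score, threshold = output.strip().split("/", 1)
    return float(score), float(threshold)


def check_message(raw_message: bytes, spamc: str = "spamc") -> Tuple[bool, float, float]:
    """Ask the running spamd about one message.

    Assumption: spamd is already running and reachable with spamc's defaults.
    """
    result = subprocess.run(
        [spamc, "-c"],           # -c: check only, print score/threshold
        input=raw_message,
        stdout=subprocess.PIPE,
        check=False,             # exit status 1 means "spam", not a crash
    )
    score, threshold = parse_spamc_check(result.stdout.decode("ascii", "replace"))
    # spamc -c reports 0/0 with exit status 0 on errors, so trust the exit
    # status rather than comparing the two numbers.
    return result.returncode == 1, score, threshold
```

The expensive rule set is loaded once by the daemon, so the per-message cost is mostly the checks themselves. It does not make the DNS lookups or regex matching any cheaper.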
Setting it up globally is also a lot trickier than setting it up in an officially unsupported way. You really want to use a content-filter type mechanism rather than invoking the filter from procmail. You have to deal with support headaches when users “don’t receive” important messages (i.e., when users get false matches and don’t check their spam folder). You need some sort of mechanism to let users train the Bayesian filters, and some sort of interface for changing user preferences… the list goes on and on. And once we set something up so that users can do it from the panel, it becomes yet another thing we have to support: support staff has to understand how it works and how to troubleshoot it, and if there are any problems, it’s up to us to fix them. This is a big part of the “cost” of implementing something like this.
SpamAssassin is effective mostly because it takes a “kitchen sink” approach. It does a whole lot of DNS lookups, checks headers and body against regular expressions, etc., all of which are pretty resource intensive. Adaptive filters (like bogofilter, spamprobe, etc.) tend to be effective and consume far fewer resources, but are much trickier to deal with when you have a lot of users. We have almost 100 thousand users across our 4 clusters of mail machines, so we’re talking about a lot of work here, and a high likelihood of slowdowns and other problems when there’s an extra heavy mail load, or even if there’s some weird sort of mail loop.
You might consider an adaptive / Bayesian filter on the client side. A server-side one is also an option if you use a console based mailer to read your email, or if you want to write a “training” script. Mozilla and Apple’s “Mail.app” both have adaptive filters which are supposed to work pretty well.
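For anyone curious what a “training” script boils down to, here is a toy sketch of an adaptive filter: it keeps per-user token counts from messages you have labelled, then scores new mail with a naive-Bayes-style calculation. This is only an illustration and not how bogofilter, spamprobe, Mozilla or Mail.app actually do it; `TinyBayes` and `tokenize` are hypothetical names, and the tokenizer and add-one smoothing are simplifying assumptions.

```python
import math
import re
from collections import Counter

# Assumption: a token is any run of 3+ letters, digits, $, ' or -.
TOKEN_RE = re.compile(r"[a-z0-9$'-]{3,}")


def tokenize(text):
    """Lower-case the text and return its set of distinct tokens."""
    return set(TOKEN_RE.findall(text.lower()))


class TinyBayes:
    """Token counts for ONE user; every user needs their own instance."""

    def __init__(self):
        self.tokens = {"spam": Counter(), "ham": Counter()}
        self.messages = {"spam": 0, "ham": 0}

    def train(self, text, label):
        """Record one message the user has labelled 'spam' or 'ham'."""
        self.tokens[label].update(tokenize(text))
        self.messages[label] += 1

    def spam_probability(self, text):
        """Estimate P(spam) from the tokens present, with add-one smoothing."""
        log_odds = math.log((self.messages["spam"] + 1) / (self.messages["ham"] + 1))
        for tok in tokenize(text):
            p_spam = (self.tokens["spam"][tok] + 1) / (self.messages["spam"] + 2)
            p_ham = (self.tokens["ham"][tok] + 1) / (self.messages["ham"] + 2)
            log_odds += math.log(p_spam / p_ham)
        # Clamp so math.exp cannot overflow on very long messages.
        log_odds = max(-50.0, min(50.0, log_odds))
        return 1.0 / (1.0 + math.exp(-log_odds))


# Usage: train on a couple of labelled messages, then score a new one.
user_filter = TinyBayes()
user_filter.train("cheap pills buy now", "spam")
user_filter.train("meeting notes attached for monday", "ham")
looks_spammy = user_filter.spam_probability("buy cheap pills") > 0.5
```

Scoring is just a handful of dictionary lookups, which is why this kind of filter is so light on resources. The catch is that the counts are per user and have to be stored, updated and corrected for every mailbox, which is the part that gets tricky with a lot of users.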
We are working on some options for effective spam and virus filtering… this is actually one of the next projects I’ll probably be working on. If there were an easy, reasonably priced, and effective solution, we’d be in there.