Cleaning Up MediaWiki Spam


#1

Spammers got into my wiki, and there are over 100k spam articles in there now. So I’m trying to figure out whether there is a good way to remove those. I’ve looked into some of the usual solutions like Nuke, but the problem with that is that I have to be able to identify the spam pages for it, and that will take a while.

One thing I’ve noticed is that none of the spam pages are linked to from other pages. So if I could just delete all pages that don’t have a link to them, that would probably take care of the problem pages.

Any thoughts on how to deal with that?


#2

Hello,
Have you seen this article? https://www.mediawiki.org/wiki/Manual:Combating_spam
It has a few different solutions you can try. Also, if you provide me with a domain, I can email you with more details.

Thanks,
MariT


#3

I’ve seen that, but it’s more for spam prevention than spam cleanup.


#4

A page that no other page links to is called an orphan page.
If you’re just trying to find all the orphan pages,
go to “Special pages” (it’s in the sidebar in the default MediaWiki installation),
and under “Maintenance reports” click on “Orphaned pages” which takes you to a URL something like “http:…/Special:LonelyPages” with a list of orphaned pages.
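
If clicking through 100k entries isn't practical, the same report can usually be read through the MediaWiki API's `querypage` list. Here is a rough Python sketch of that, not a tested tool: the `API_URL` value is a made-up placeholder for your own wiki's `api.php`, and the helper names are just for illustration. Keep in mind that on some configurations this report is cached and truncated, so it may not return every orphan in one go.

```python
import json
import urllib.parse
import urllib.request

# Hypothetical endpoint: replace with the api.php URL of your own wiki.
API_URL = "https://wiki.example.org/w/api.php"


def build_params(offset=0, limit=500):
    """Query parameters for one batch of the orphaned-pages report."""
    return {
        "action": "query",
        "list": "querypage",
        "qppage": "Lonelypages",  # same report as Special:LonelyPages
        "qplimit": str(limit),
        "qpoffset": str(offset),
        "format": "json",
    }


def extract_titles(response):
    """Pull the page titles out of one decoded API response."""
    results = response.get("query", {}).get("querypage", {}).get("results", [])
    return [row["title"] for row in results]


def fetch_orphans(api_url=API_URL):
    """Yield orphaned page titles, following the API's continuation offset."""
    offset = 0
    while True:
        url = api_url + "?" + urllib.parse.urlencode(build_params(offset))
        with urllib.request.urlopen(url) as resp:
            data = json.load(resp)
        yield from extract_titles(data)
        next_offset = data.get("continue", {}).get("qpoffset")
        if next_offset is None:
            break
        offset = int(next_offset)
```

Something like `titles = list(fetch_orphans())` would then give you the report as a plain list of titles to review.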

Once every legitimate page is linked from somewhere, the only orphan pages left should be spam. You can then copy that list and feed it into the DeleteBatch script.
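
As a safety net, you could filter the orphan list against a list of pages you know are good before deleting anything. Below is a minimal Python sketch of that step, assuming you already have the orphan titles and a "keep" list in hand; the function name and the output file name are hypothetical. It writes one title per line, which is the list-file format that the `deleteBatch.php` maintenance script in MediaWiki core reads.

```python
from pathlib import Path


def write_delete_list(orphans, keep, out_path):
    """Write orphan titles that are not in `keep` to out_path, one per line.

    Returns the number of titles written.
    """
    # Treat underscores and spaces as equivalent, as MediaWiki titles do.
    keep_normalized = {t.strip().replace("_", " ") for t in keep}
    seen = set()
    lines = []
    for title in orphans:
        t = title.strip().replace("_", " ")
        if not t or t in keep_normalized or t in seen:
            continue  # skip blanks, known-good pages, and duplicates
        seen.add(t)
        lines.append(t)
    Path(out_path).write_text("".join(line + "\n" for line in lines), encoding="utf-8")
    return len(lines)
```

For example, `write_delete_list(titles, {"Main Page"}, "spam_titles.txt")` would produce a file you can look over by eye before handing it to the delete script. The exact command line for that script differs between MediaWiki versions, so check its `--help` output first, and take a database backup before any mass deletion.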

You may find some of these spam cleanup tips useful:

Good luck!