Simon Willison’s Weblog

CSS-Discuss Wiki Spam

The css-discuss wiki has pretty much looked after itself since its inception a year and a half ago, thanks to a small but active community of wiki gardeners. Unfortunately, recent months have seen a rise in the amount of SEO spam hitting the site. Spam gets deleted pretty quickly, but there’s always room for more help to provide a faster turnaround. If you run an aggregator and don’t mind spending a minute or so a day tending the wiki you can sign up for the RecentChanges RSS feed and help check over new changes as and when they are made. Your assistance will be greatly appreciated.

This is CSS-Discuss Wiki Spam by Simon Willison, posted on 27th April 2004.

18 comments

  1. I'm subbed now.

    (My bloglines folders are'nna be all sorts of messy at the end of next month...)

    ben - 27th April 2004 21:58 - #

  2. Mine (Dive Into OS X) is getting hit too, as is the Atom wiki. Mine is relatively low-traffic, so it's not hard to pick out the spam just by watching RecentChanges. But I don't have a lot of free time these days to clean up after shitheads, and I may just disable posting altogether (freeze the wiki) until I get more time, or find another maintainer.

    Mark - 28th April 2004 01:32 - #

  3. I don't know how long wiki spam has been going around, but I always wondered how long it would be before it happened.

    Chris Vincent - 28th April 2004 02:59 - #

  4. No surprise here given how bad blog comment spam is. I'm increasingly in favor of captcha verification that proves you are human on almost any type of anonymous system where user input is displayed.

    I'm also starting to think that hyperlinks should be posted only as text -- never as live links. I.e. take away the incentive for comment spammers.

    Scott Johnson of Feedster - 28th April 2004 03:38 - #

  5. Also subbed.

    It was only a matter of time. My site is pretty low-traffic, and I'm amazed at how much spam is blocked by MT-Blacklist.

    Yvonne Adams - 28th April 2004 05:29 - #

  6. Would this be considered "spam"? "See Page: Akku"? Quite annoying people using such great webpages to spam. *tsts*

    alain - 28th April 2004 15:37 - #

  7. On many wikis with revision history, quick deletion is not a deterrent. Spammers still get the desired page rank boost from the old versions in the archive. Deletion is just an incentive to spam more often and with different links.

    Several wikis (including the DocBook wiki) are being regularly hit by spammers using scripts to add spam links to every page. The spammers are using open proxies to remain anonymous and avoid blocking.

    Manually deleting spam is proving insufficient in these cases.

    Matt Brubeck - 28th April 2004 18:20 - #

  8. Matt,

    Wouldn't a robots.txt block on rev history requests work there?

    Or maybe a CAPTCHA'd "nuke revision" feature...

    I think we're not too far off from strong identity being a requirement for reasonable 'net signal to noise ratio. *sigh*

    Jeremy Dunck - 28th April 2004 19:07 - #

  9. Jeremy: Blocking robots on the revision history is an excellent idea. Or making links inactive in the revision history and diffs. Nuking revisions (even with CAPTCHA verification) is very dangerous, because the revisions are often necessary to repair damage done by spammers or vandals. Only authorized users should be able to nuke revisions (or at least, authorized users should still be able to view nuked revisions).

    Matt Brubeck - 29th April 2004 01:39 - #

  10. Hi Simon, I run the MozTips Wiki and was getting hit by the same Chinese SPAMbot. It checks for certain pages, such as the main page, the Sandbox, and other pages. I used the IP blocking tool to block their SPAMbot. It seems that hundreds (if not thousands) of Wiki-pages have been defaced over the last week. I am thinking of writing some system to thwart these new SPAMbots: email verification or sign-in for those posts that contain non-Wiki links (i.e. external http links). Another option is a CAPTCHA system. Do you know of a good CAPTCHA system for PHP?

    Jay Sheth - 29th April 2004 03:45 - #

  11. Simon, I forgot to mention in my last note that you should probably block access to the Wiki from the following IP range: 222.183.* I am not sure if that is all of China, but at this rate, what is one to do?

    Jay Sheth - 29th April 2004 03:51 - #

  12. I've just blocked wiki access for 222.176.* - 222.183.* which covers the ISP (Chinanet Telecom) of the main offender. Hopefully this will improve things.

    Tim Fountain - 29th April 2004 16:48 - #

  13. Jay - a couple of weeks ago the HN CAPTCHA class for PHP was released. You can find it, with a demo, at PHP Classes.org

    Brian Wahoff - 30th April 2004 01:00 - #

  14. Hi,

    I'm a frequent contributor to the wiki of the POPFile project. The other day, we saw the first spam on our wiki. It was the guy that spams wikis world-wide. That probably is his job. He always adds a link to emmss dot com and he likes to mention "Chongqing".

    We googled a bit and were shocked to see just how many wikis and blogs were getting spammed.

    Simon Willison's idea is very good, but it requires spammers to read the notice and to notice the effect, although there really won't be any effect you could notice.

    I had another idea. Why not fight back? If that guy wants to rank high on "Chongqing", we could even rank higher, with a page that tells the world about those idiots.

    I have no idea what emmss is selling, but they apparently are very keen to get a high page rank. I don't think that they would be as keen on getting negative publicity.

    Here is a first draft of _my_ Chongqing page.

    Kind regards,
    Manni

    Manni - 30th April 2004 13:37 - #

  15. Hi Simon and others,

    I am using the same Wiki software as CSS-Discuss. (In fact, the MozTips Wiki was inspired by the CSS Discuss Wiki.)

    I have created an anti-spam ASCII art CAPTCHA system (because getting GD / ImageMagick to work on shared servers is sometimes hard).

    I thought that this plugin could use some good testing, so if anyone has the time (or need) to test out the plugin, you can find more about it and download it here:

    http://www.moztips.com/index.php?id=222

    If it does not work, bug reports are appreciated.

    Jay Sheth - 30th April 2004 18:05 - #

  16. I've been thinking about this lately, and it really is sad. The moment search engine rank got tied to links from other sites, a link on another person's site became a commodity. In a way this really decreases the general innocence of the web as a whole, and wikis, blog comments, and ANY way that a person could post data to a web site now have to become a battleground in the same way email has long since been. Luckily, it seems like Captcha-type systems are a pretty good solution...for now.

    Daniel Talsky - 2nd May 2004 19:46 - #

  17. Hi guys!

    I got serious about the fighting-back-on-wiki-spam business and registered chongqed.org for the sole purpose of annoying spammers everywhere.

    This idea (just like the wiki idea) will only work when a community is willing to help. A single person will not be able to do much.

    And that's why I post again: to ask you for help. Please visit the site and have a look. And if you then think that the idea might work, pick your favorite spammer and use his keywords to link to chongqed.org.

    Thanks,
    Manni

    Manni - 5th May 2004 16:51 - #

  18. I have a script that can clean wikis of links to blacklisted domains. So if your wiki has serious damage I can repair it quickly if you need. Just post your wiki to the link below. I've only tested it with MoinMoin or UseMod-based wikis. A simpler solution though is to just patch your wiki to redirect URLs through Google like this blog does. That way links do not affect PageRank. See this patch for MoinMoin wikis, for example: http://moinmoin.wikiwikiweb.de/RedirectingExternalLinks

    WikiUser - 12th June 2004 20:52 - #
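
Comment 18 mentions a script that cleans wikis of links to blacklisted domains, but the script itself is not shown. The following is only a rough sketch of how such a cleaner might work, not WikiUser's actual code. The blacklist entry `spam.example`, the URL pattern, and the function names are all assumptions made for illustration.

```python
import re
from urllib.parse import urlparse

# Hypothetical blacklist; a real one would list the spammers' domains.
BLACKLIST = {"spam.example"}

# Loose pattern for bare http/https URLs in wiki text (an assumption,
# not the link syntax of any particular wiki engine).
URL_RE = re.compile(r"https?://[^\s\]\)>\"']+")


def is_blacklisted(url, blacklist=BLACKLIST):
    """Return True if the URL's host is a blacklisted domain or a subdomain of one."""
    try:
        host = (urlparse(url).hostname or "").lower()
    except ValueError:
        # Malformed URL: leave it for a human to look at.
        return False
    return any(host == d or host.endswith("." + d) for d in blacklist)


def strip_spam_links(text, blacklist=BLACKLIST):
    """Remove every URL in text whose host is on the blacklist."""
    def replace(match):
        url = match.group(0)
        return "" if is_blacklisted(url, blacklist) else url
    return URL_RE.sub(replace, text)


page = "See http://www.spam.example/buy and http://example.org/ok"
cleaned = strip_spam_links(page)
```

Matching on the parsed host name rather than on a substring of the URL means a legitimate domain such as `notspam.example` is left alone.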

Comments are closed.
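
The range blocked in comment 12, 222.176.* through 222.183.*, is eight consecutive /16 blocks, which can be written as the single CIDR network 222.176.0.0/13. As a small sketch of how a wiki might test an incoming address against that range (the function name `is_blocked` is hypothetical, and this is not the blocking tool any of the commenters used):

```python
import ipaddress

# 222.176.0.0/13 spans 222.176.0.0 through 222.183.255.255,
# the same range as "222.176.* - 222.183.*" in comment 12.
BLOCKED = ipaddress.ip_network("222.176.0.0/13")


def is_blocked(addr):
    """Return True if the dotted-quad address falls inside the blocked range."""
    return ipaddress.ip_address(addr) in BLOCKED
```

A single network check like this is easier to maintain than eight wildcard rules, though as comment 7 notes, spammers using open proxies can still get around address-based blocking.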

Previously hosted at http://simon.incutio.com/archive/2004/04/27/wikiSpam
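
Comments 8 and 9 suggest keeping robots away from revision history so that spam links in old versions earn no ranking benefit. The sketch below shows what such robots.txt rules might look like and checks them with Python's standard `urllib.robotparser`. The paths `/wiki/history/` and `/wiki/diff/`, the host `wiki.example`, and the crawler name `ExampleBot` are assumptions; a real wiki's history URLs will differ, and only well-behaved crawlers honour robots.txt.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical rules: history and diff views are off limits to all
# crawlers, while current page versions stay crawlable.
ROBOTS_TXT = """\
User-agent: *
Disallow: /wiki/history/
Disallow: /wiki/diff/
"""

parser = RobotFileParser()
parser.parse(ROBOTS_TXT.splitlines())


def crawlable(url, agent="ExampleBot"):
    """Return True if the rules above allow the given crawler to fetch url."""
    return parser.can_fetch(agent, url)
```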
