Simon Willison’s Weblog

rel=“nofollow”

Reading between the lines (which in this case isn’t particularly hard), this and this (don’t forget to view source) suggest that Google are soon to announce that they won’t be calculating PageRank for links with a rel="nofollow" attribute. Finally, an official way of fighting the economics of comment spam by denying PageRank to user-submitted links. Sam Ruby points to Mark Pilgrim’s prediction that spammers won’t care: they’ll spam anyway, on the off chance that they hit somewhere undefended. I’m optimistic: if the major weblog (and wiki) vendors get behind this one, it could help stem the tide.
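For anyone wondering what supporting this might look like in comment-handling code, here is a minimal sketch using only the Python standard library. It assumes comments are stored as HTML and re-emits them with rel="nofollow" added to every link. The names NofollowRewriter and add_nofollow are made up for illustration, and this is not how any particular weblog tool does it.

```python
from html import escape
from html.parser import HTMLParser


class NofollowRewriter(HTMLParser):
    """Re-emit HTML, adding "nofollow" to the rel attribute of every <a> tag."""

    def __init__(self):
        # Keep character references as written instead of decoding them.
        super().__init__(convert_charrefs=False)
        self.out = []

    def _emit_tag(self, tag, attrs, closing=">"):
        if tag == "a":
            rel_tokens = []
            kept = []
            for name, value in attrs:
                if name == "rel":
                    # rel is a space-separated list; keep existing tokens.
                    rel_tokens.extend((value or "").split())
                else:
                    kept.append((name, value))
            if "nofollow" not in rel_tokens:
                rel_tokens.append("nofollow")
            attrs = kept + [("rel", " ".join(rel_tokens))]
        parts = [tag]
        for name, value in attrs:
            if value is None:
                parts.append(name)
            else:
                # The parser hands us unescaped values, so escape on output.
                parts.append('%s="%s"' % (name, escape(value, quote=True)))
        self.out.append("<" + " ".join(parts) + closing)

    def handle_starttag(self, tag, attrs):
        self._emit_tag(tag, attrs)

    def handle_startendtag(self, tag, attrs):
        self._emit_tag(tag, attrs, closing=" />")

    def handle_endtag(self, tag):
        self.out.append("</%s>" % tag)

    def handle_data(self, data):
        self.out.append(data)

    def handle_entityref(self, name):
        self.out.append("&%s;" % name)

    def handle_charref(self, name):
        self.out.append("&#%s;" % name)


def add_nofollow(comment_html):
    """Return comment_html with rel="nofollow" on every link (hypothetical helper)."""
    rewriter = NofollowRewriter()
    rewriter.feed(comment_html)
    rewriter.close()
    return "".join(rewriter.out)
```

Running add_nofollow over each comment body before display would tag signature links and body links alike. Whether body links should be treated differently is a policy choice this sketch does not make.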

As an aside, I have exams starting in a week and plenty to revise, so I’ll probably be on hiatus until the end of the month.

This is rel=“nofollow” by Simon Willison, posted on 17th January 2005.

23 comments

  1. This is exactly what I wanted. Perhaps Google have learnt from the experience of running Blogger?

    Jim Dabell - 17th January 2005 02:42 - #

  2. It's a shame, though, since the non-spam links in comments often do deserve to participate in PageRank. Well, I don't know if signature links really matter, but most links in the body should count.

    Ian Bicking - 17th January 2005 02:45 - #

  3. I've been toying with the idea of "lazy registration". Basically, you post your comment and get an email saying "a comment has been posted to [x] purporting to be from you", with a link to confirm that it was you (and a link to block any further notifications, of course).

    Upon confirmation, you get the opportunity to set a password, or simply approve the browser you are currently using for the next 3 weeks or whatever.

    The way this ties into Pagerank is that you can take off the rel="nofollow" when somebody confirms their identity, without the burden of requiring people to register to post comments.

    It's vulnerable to the possibility of spammers registering and posting under valid email addresses, of course, but I think that's negligible, at least for the time being, and if spammers start doing that, at least you'll have a more reliable way of identifying them to block them.

    Jim Dabell - 17th January 2005 03:15 - #

  4. What would be even better is a way of putting that on a containing block, and have it apply to all links inside the block. Doesn't make too much difference though.

    rjw - 17th January 2005 10:01 - #

  5. Ian, in my implementation, links in comment bodies are also tagged with rel="nofollow". Apparently, in Dave Winer's current implementation, they are not.

    Sam Ruby - 17th January 2005 12:20 - #

  6. Good luck on your exams and project. :)

    Jeremy Dunck - 17th January 2005 23:04 - #

  7. I think this is a very valid approach. blogsnow supports it as of now: http://www.blogsnow.com/nofollow.html

    Andreas Wacker - 18th January 2005 00:21 - #

  8. LOL.

    Are you kidding ?

    It's Google's problem to keep their indexes valuable.

    It's your problem to keep your blog free from spam.

    It is unreasonable to team up with Google to stop spammers. Once they get no benefit from Google, they will simply start promoting their phone numbers, emails, postal addresses, bank accounts to wire money to, or anything else.

    You do not get how spammers think: they need to get attention and, as a result, a few bucks from "stupid" people who will pay them.

    Spammers can also start working for hire to destroy somebody's competitors instead of promoting goods.

    Think about comments posted to every finance-related blog with content like "Bank X is not a good one. I've visited them and ... blah ... blah ... Do not use their services".

    Do not be fooled by PageRank :-(

    AT - 18th January 2005 05:20 - #

  9. AT, I think you miss the point here. Spammers often send comments full of links back to their own pages, with little or no description. These are designed solely to enhance their listing in Google, which ranks pages based on the number of incoming links (amongst other factors).

    Regarding Simon's post, Google are simply offering a way of preventing pagerank inflation from within certain parts of a blog (eg comment pages), which means people will not be able to use blog comments to artificially increase their own pagerank. It will not prevent people from posting comments with links, but will hopefully remove the incentive for spammers to do so.

    People recognise spam when they see it, so whilst spammers may post other details (phone numbers etc), this is particularly unlikely to win them any business. Don't forget that pagerank-inflating comment spam is not designed to get people to click the links - it's merely aimed at increasing Google's pagerank for spammers' pages.

    Chris Beach - 18th January 2005 15:26 - #

  10. Have a read of this very old (November 15, 2003) blog posting about weblog spam.

    The current "nofollow" offer is more like a hack. Google must simply respect the robots.txt exclusion of /cgi-bin redirect scripts and not ask us to implement new technology. It would also be fine if search engines obeyed meta robots tags.

    Adding new standards to the existing ones will force all search engine and web spider developers to implement this new functionality.

    Yes, I agree that the current offer gives better granularity (down to a single link) than robots.txt and meta, but adoption of this new feature must go via W3.org and other interested parties.

    BTW, the current comment spam problem can be solved in a different way: provide search engines with a different view of your website, a special view optimised for indexing (unfortunately not for caching, but I do not like the Google cache anyway). If a user clicks on a link that was fed to the search engine, show the correct page. This way you can separate your valuable information from other textual elements (like your notes to spammers and your note to visitors about supported tags ;-). You can also save on traffic, since you only need to feed valuable information to the search engine and not the page layout. I've done a hack like this for feeding my internal search engine, but not for a global one :-(

    AT - 18th January 2005 17:26 - #

  11. AT, a few points:

    • Google does respect robots.txt. Comment spam has nothing to do with robots.txt.
    • Adding a new "standard" doesn't force spiders to implement it. Google will, and others may. Now, if it were actually a standard, and not an optional extension, then you might be right.
    • Adoption of special rel="" values does -not- need to go through W3. The HTML spec encourages app-specific usages, and there are examples in the wild.
    • Do you honestly feel that all sites providing alternate views for indexing is a better solution? For that matter, if there is less structure or meaning in the alternate view, do you really expect search engines to look only at the alternate views, when they can provide better service if they don't?

    Jeremy Dunck - 18th January 2005 22:39 - #

  12. [Spam comment removed.]

    WRSE - 23rd January 2005 23:17 - #

  13. One reason for introducing nofollow could be that they want to limit the ways to get PR easily. Bloggers have been blessed with high PageRank, and that skews the ranking in favour of blogs instead of just boring info pages that could actually be more relevant but have low PR.

    Jon Berg - 24th January 2005 20:02 - #

  14. I agree that this attribute really only benefits Google, and won't do very much to slow comment spam. But helping Google isn't a bad thing.

    Mr. Nosuch - 26th January 2005 22:25 - #

  15. So it only helps Google now, which I believe is now the most widely used search engine (I remember using it right when it first came out). So in turn this helps you in the long run by keeping your legitimate links closer to the top of searches. Eventually other search engines will follow if they want to be at the top of the game, since it is an obvious benefit. Take for example the link I attached to this post. It is my website but has no merit being with this post. So why should it deserve a point from Google? Simon didn't choose to link to my site. This is not quite comment spam, but it is another form of it.

    Matt - 30th January 2005 04:00 - #

  16. I don't agree with it, to be honest. In my opinion, it is the blogger's job to keep their own blog (it is yours after all!) free from spam, and it is the search engine's job to rank the content.

    This is in essence going to kill the ability of an active participant in whatever field they work in to become prominent through participation. For instance, say you do a lot of work in development and help a lot of people out with problems, yet aren't linked to through people's blogrolls and such. By removing the ability for your page to gain rank through your feedback on other people's sites, you are removing the ability for that person to appear in a search, mainly because the incoming links to that person's site don't exist.

    A better solution would have been to put some sort of human interrogation into the feedback process. The randomly generated images with a few numbers/letters come to mind; asking random but easily answered questions as you submit the post; dynamically generating the postback URL so spammers can't just abuse the fact that they 'assume' that an MT installation will hit comments.cgi or whatever. If it was constantly posting back to a synthetic URL, they couldn't blatantly spam with a Perl/C/C++/Python script by just hammering a known file.

    Bad in my opinion.

    Al.

    Alistair Lattimore - 30th January 2005 13:33 - #

  17. The problem with nofollow is that people can use it to screw their reciprocal link partners. Does anyone know if programs like linksmanager look for this attribute when checking links?

    Crazy Frog - 17th June 2005 12:06 - #

  18. As far as I know, linksmanager doesn't check for rel="nofollow". I have also thought of that trick. As black hat as it may be, it's all fair game until the hole is plugged.

    ReZ - 17th January 2006 18:40 - #

  19. I have a blog site and keep getting hammered with spam in the comments. Do I add rel="nofollow" to the meta section of the ASP page where comments are made, or do I edit the links left by spammers and insert rel="nofollow"?

    Blogwerkz - 24th March 2006 21:18 - #

  20. Drop it into the comments portion of the ASP page. That way the attribute will be inserted automatically.

    Chris - 17th May 2006 22:19 - #

  21. Surely rel="nofollow" is only important if you publish spam! Who would do that?

    Ed - 18th August 2006 12:23 - #

  22. Well, it is right for Google to do it. An author will put a nofollow attribute on a link if he deems it inappropriate or unapproved, and therefore the link shouldn't be counted towards PageRank.

    Standard Warfare - 15th October 2006 11:54 - #

  23. I think using rel='nofollow' will not prevent spam, and I think it will discourage non-spammers from posting on your blogs, purely because there is nothing in it for them.

    Sam Millar - 18th October 2006 12:48 - #
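The "lazy registration" idea from comment 3 (keep rel="nofollow" on a commenter's link until they confirm by email) could be sketched roughly as below. This is only an illustration under stated assumptions: SECRET_KEY, confirmation_token, is_valid_token and render_author_link are hypothetical names, the signed-token approach is one possible way to make the emailed confirmation link hard to forge, and sending the email and storing the confirmed flag are left out.

```python
import hashlib
import hmac
from html import escape

# Hypothetical per-site secret; a real site would load this from configuration.
SECRET_KEY = b"replace-me"


def confirmation_token(email, comment_id):
    """Sign (email, comment_id) so the emailed confirmation link cannot be forged."""
    message = ("%s:%d" % (email, comment_id)).encode("utf-8")
    return hmac.new(SECRET_KEY, message, hashlib.sha256).hexdigest()


def is_valid_token(email, comment_id, token):
    """Check a token from a confirmation link using a constant-time comparison."""
    expected = confirmation_token(email, comment_id)
    return hmac.compare_digest(expected, token)


def render_author_link(name, url, confirmed):
    """Render the commenter's link; rel="nofollow" stays until they confirm."""
    rel = "" if confirmed else ' rel="nofollow"'
    return '<a href="%s"%s>%s</a>' % (escape(url, quote=True), rel, escape(name))
```

When the commenter follows the emailed link and is_valid_token passes, the site would mark the comment as confirmed, and render_author_link would then emit the link without rel="nofollow".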

Comments are closed.

Previously hosted at http://simon.incutio.com/archive/2005/01/17/relNoFollow
