The dangers of PageRank
A well-documented side effect of the weblog format is that it brings Google PageRank in almost absurd quantities. I’m now the 5th result for simon on Google, and I’ve been the top result for simon willison almost since the day I launched. High rankings, however, are not always a good thing, especially when combined with a comment system. A growing number of bloggers have found themselves at the top position for terms of little or no relevance to the rest of their sites, which in turn can attract truly surreal comments from search engine visitors who may never have encountered a blog before.
I know of a couple of entries on my own blog that are attracting this kind of traffic. The most interesting is probably this entry on artificial diamonds, which has attracted comments from both buyers and sellers of artificial gems. My entry on MSN messenger usability problems from 2002 has drawn a steady stream of hilarious comments, no doubt caused in part by its top rating on Google for msn messenger sucks. Amusingly, for a long time Microsoft’s own search engine was giving my page a high rank for a wide variety of less negative messenger-related terms.
My own experiences of this phenomenon pale into insignificance next to some of the others I’ve seen. The most impressive example has to be Jason Kottke’s brief review of the Matrix Reloaded, which drew over 900 comments from Google strays, developed its own micro-community and resulted in Jason pondering who owns the conversation on my web site? Jason eventually decided to close and archive the thread after the page grew to more than a megabyte in size.
The problem can take on a far more disturbing twist. I won’t link directly to these entries for fear of adding to their predicaments, but searches for crime scene cleanup and suicide chat rooms both return blogs in the first two results. The former thread is mostly crime scene cleanup companies marketing their services, but the latter is quite frankly disturbing. It’s certainly led me to double-check the titles of my entries before posting them.
Thankfully, avoiding this kind of unwanted comment traffic is pretty simple. One way is to simply disable comments for entries older than a certain time (generally a couple of weeks), although personally I like to see the occasional comment on old entries. A neater solution proposed by Russell Beattie last year is to simply hide comments from search engine referrals, thus ensuring that random strays won’t leave their mark without understanding the nature of your site first.
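Both approaches come down to a small check at page render time. Here is a rough sketch in Python, not the code behind this site or Russell's implementation: the function names, the two-week window and the list of search engine domains are all assumptions made up for illustration.

```python
from datetime import date, timedelta
from urllib.parse import urlparse

# Hypothetical settings; tune these for your own site.
COMMENT_WINDOW = timedelta(days=14)
SEARCH_ENGINE_DOMAINS = ("google.com", "bing.com", "duckduckgo.com")


def comments_open(posted_on: date, today: date) -> bool:
    """Return True while an entry is still young enough to accept comments."""
    return today - posted_on <= COMMENT_WINDOW


def is_search_referral(referrer: str) -> bool:
    """Return True if the Referer header value points at a listed search engine."""
    host = (urlparse(referrer).hostname or "").lower()
    # Match the bare domain or any subdomain, but not lookalikes such as notgoogle.com.
    return any(host == d or host.endswith("." + d) for d in SEARCH_ENGINE_DOMAINS)


def show_comments(referrer: str) -> bool:
    """Hide the comment thread from visitors who arrived straight from a search."""
    return not is_search_referral(referrer)
```

The Referer header can be missing or forged, so treat this as a gentle filter for confused strays rather than any kind of access control. An empty referrer falls through to showing comments as normal.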