Simon Willison’s Weblog


29 items tagged “seo”


Weeknotes: python_requires, documentation SEO

Fixed Datasette on Python 3.6 for the last time. Worked on documentation infrastructure improvements. Spent some time with Fly Volumes.

[... 1497 words]


datasette-block-robots. Another little Datasette plugin: this one adds a /robots.txt page with “Disallow: /” to block all indexing of a Datasette instance from respectable search engine crawlers. I built this in less than ten minutes from idea to deploy to PyPI thanks to the datasette-plugin cookiecutter template. # 23rd June 2020, 3:28 am
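A rough illustration of what that robots.txt body means to a well-behaved crawler, using Python's standard library parser. This is not the plugin's code; the is_blocked helper and the crawler name are hypothetical, chosen only for the example.

```python
from urllib.robotparser import RobotFileParser

# The robots.txt body described above: every crawler, every path.
BLOCK_ALL = "User-agent: *\nDisallow: /\n"


def is_blocked(robots_txt: str, user_agent: str, path: str) -> bool:
    """Return True if robots_txt tells user_agent not to fetch path."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return not parser.can_fetch(user_agent, path)


if __name__ == "__main__":
    print(is_blocked(BLOCK_ALL, "Googlebot", "/any/table"))
```

robots.txt is advisory, which is why the post says "respectable" crawlers: nothing stops a badly behaved client from ignoring it.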

Building a sitemap.xml with a one-off Datasette plugin

One of the fun things about launching a new website is re-learning what it takes to promote a website from scratch on the modern web. I’ve been thoroughly enjoying using Niche Museums as an excuse to explore 2020-era SEO.

[... 1078 words]
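For readers who have not seen the format, here is a minimal sketch of generating a sitemap.xml document in plain Python. It is not the Datasette plugin from the post; build_sitemap and the example.com URLs are assumptions made for illustration.

```python
import xml.etree.ElementTree as ET

# Namespace defined by the sitemaps.org protocol.
SITEMAP_NS = "http://www.sitemaps.org/schemas/sitemap/0.9"


def build_sitemap(urls):
    """Return a sitemap.xml string with one <url><loc> entry per URL."""
    urlset = ET.Element("urlset", xmlns=SITEMAP_NS)
    for url in urls:
        url_el = ET.SubElement(urlset, "url")
        ET.SubElement(url_el, "loc").text = url
    body = ET.tostring(urlset, encoding="unicode")
    return '<?xml version="1.0" encoding="UTF-8"?>\n' + body


if __name__ == "__main__":
    # Hypothetical page URLs.
    print(build_sitemap([
        "https://example.com/museums/1",
        "https://example.com/museums/2",
    ]))
```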


Evolving “nofollow” – new ways to identify the nature of links (via) Slightly confusing announcement from Google: they’re introducing rel=ugc and rel=sponsored in addition to rel=nofollow, and will be treating all three values as “hints” for their indexing system. They’re very unclear as to what the concrete effects of these hints will be, presumably because they will become part of the secret sauce of their ranking algorithm. # 10th September 2019, 9:16 pm


Googlebot’s Javascript random() function is deterministic. random() as executed by Googlebot returns the same predictable sequence. More interestingly, Googlebot runs a much faster timer for setTimeout and setInterval. As Tom Anthony points out, “Why actually wait 5 seconds when you are a bot?” # 7th February 2018, 2:41 am


What is the plural of blitz? Wow, WordHippo is a straight up masterclass in keyword SEO tactics. Everything from the page URL to the keyword-crammed content to the enormous quantity of related links. # 25th November 2017, 5:42 pm


Whether 404 custom error page necessary for a website?

They aren’t required, but if you don’t have a custom 404 page you’re missing out on a very easy way of improving the user experience of your site, and protecting against expired or incorrect links from elsewhere on the web.

[... 98 words]


What are good sources to learn about SEO?

The Beginner’s Guide to SEO from Moz (previously SEOMoz) is an excellent introduction to SEO fundamentals.

[... 31 words]

Do comments really count for SEO link building?

Most sensible commenting systems will put rel=nofollow on links to discourage comment spam, which significantly reduces the SEO value of those links.

[... 35 words]


What are the ways to Convert Dynamic JSP pages to a Static HTML to Appear in Google search results?

You don’t have to do anything. You’re misunderstanding how dynamic server-side languages like JSP work.

[... 202 words]

What is the optimal description length in the Apple App Store?

Have you ever come across one of those ugly, long pages advertising an ebook, the ones that bang on for dozens of paragraphs with bullet points, pictures, testimonials, headings, more testimonials, more bullet points and so on?

[... 106 words]

Why does Google use “Allow” in robots.txt, when the standard seems to be “Disallow?”

The Disallow command prevents search engines from crawling your site.

[... 59 words]

Why is Google indexing & displaying www1 versions of my site and how might I stop this?

You should stop serving your site to the public on multiple subdomains. Configure your site to serve a 301 permanent redirect from www1-www4 to the equivalent page on www—also, make sure that your site accessed without the www redirects to the right place as well.

[... 269 words]
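The redirect rule described in this answer can be sketched as a small function. The hostnames are placeholders and redirect_target is a hypothetical helper; in practice this logic usually lives in web server or load balancer configuration rather than application code.

```python
from urllib.parse import urlsplit, urlunsplit

# Hypothetical hostnames, used only for illustration.
CANONICAL_HOST = "www.example.com"
ALIAS_HOSTS = {
    "example.com",  # the bare domain without www
    "www1.example.com",
    "www2.example.com",
    "www3.example.com",
    "www4.example.com",
}


def redirect_target(url):
    """Return (301, canonical_url) for alias hosts, or None if no redirect is needed."""
    parts = urlsplit(url)
    host = (parts.hostname or "").lower()
    if host not in ALIAS_HOSTS:
        return None
    # Keep scheme, path and query so each page maps to its equivalent on www.
    canonical = urlunsplit(
        (parts.scheme, CANONICAL_HOST, parts.path, parts.query, parts.fragment)
    )
    return 301, canonical


if __name__ == "__main__":
    print(redirect_target("http://www1.example.com/about?x=1"))
```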

What are the best SEO conferences around Cincinnati?

It doesn’t look like there are many (any?) SEO events in Cincinnati, but Chicago has SES in November 2012:

[... 36 words]


Does domain name masking negatively affect SEO?

Yes, because you’ve made it impossible for people to share links to sub-pages on your site—which means you won’t get incoming links to those pages, a crucial ranking metric.

[... 44 words]


Is it a good idea to allocate URLs such as to users?

There’s an interesting discussion about this issue on this question: How do sites prevent vanity URLs from colliding with future features?

[... 42 words]

If I have data that loads using JSON / JavaScript will it get indexed by Google?

No. Personally I dislike sites with content that is only accessible through JavaScript, but if you absolutely insist on doing this you should look into implementing the Google Ajax Crawling mechanism:

[... 56 words]

Great Literature Retitled To Boost Website Traffic (via) “7 Awesome Ways Barnyard Animals Are Like Communism”. # 17th June 2010, 10:32 am

I’m renaming the book to “Dive Into HTML 5” for better SEO. This is not a joke. The book is the #5 search result for “HTML5” (no space) but #13 for “HTML 5” (with a space). I get 514 visitors a day searching Google for “HTML5” but only 53 visitors a day searching for “HTML 5”.

Mark Pilgrim # 8th June 2010, 8:48 pm


Official Google Webmaster Blog: A proposal for making AJAX crawlable. It’s horrible! The Google crawler would map url#!state to url?_escaped_fragment_=state, then expect your site to provide rendered HTML that reflects that state (they even go as far as to suggest running a headless browser within your web server to do this). Just stick to progressive enhancement instead, it’s far less hideous. It looks like the proposal may have originated with the GWT team. # 8th October 2009, 5:52 pm
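The URL mapping in the proposal can be sketched roughly as follows. The function name is hypothetical and the escaping is an assumption: this version percent-encodes every reserved character in the state, which may be stricter than what the proposal actually specifies.

```python
from urllib.parse import quote, urlsplit, urlunsplit


def escaped_fragment_url(url):
    """Map url#!state to url?_escaped_fragment_=state; other URLs are returned unchanged."""
    parts = urlsplit(url)
    if not parts.fragment.startswith("!"):
        return url
    # Assumption: percent-encode the whole state value.
    state = quote(parts.fragment[1:], safe="")
    param = "_escaped_fragment_=" + state
    # Append to any existing query string instead of replacing it.
    query = parts.query + "&" + param if parts.query else param
    return urlunsplit((parts.scheme, parts.netloc, parts.path, query, ""))


if __name__ == "__main__":
    print(escaped_fragment_url("http://example.com/page#!state"))
```

The server side is the expensive half: for each such URL the site would have to return HTML rendered for that state.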

Specify your canonical. You can now use a link rel=“canonical” to tell Google that a page has a canonical URL elsewhere. I’ve run into this problem a bunch of times (in some sites it really does make sense to have the same content shown in two different places) and this seems like a neat solution that could apply to much more than just metadata for external search engines. # 14th February 2009, 11:28 am
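As a small sketch of how a consumer might read that hint, the following pulls the canonical URL out of a page with Python's standard HTML parser. CanonicalFinder, find_canonical and the sample markup are hypothetical, and this says nothing about how Google itself processes the tag.

```python
from html.parser import HTMLParser


class CanonicalFinder(HTMLParser):
    """Record the href of the first <link rel="canonical"> element seen."""

    def __init__(self):
        super().__init__()
        self.canonical = None

    def handle_starttag(self, tag, attrs):
        if tag != "link" or self.canonical is not None:
            return
        attr = dict(attrs)
        # rel can hold several space-separated tokens.
        rels = (attr.get("rel") or "").lower().split()
        if "canonical" in rels:
            self.canonical = attr.get("href")


def find_canonical(html):
    """Return the canonical URL declared in html, or None if there is none."""
    finder = CanonicalFinder()
    finder.feed(html)
    return finder.canonical


if __name__ == "__main__":
    page = '<html><head><link rel="canonical" href="https://example.com/article"></head></html>'
    print(find_canonical(page))
```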


Underscores are now word separators, proclaims Google. I missed this story last year—the change was announced by Matt Cutts at WordCamp 2007. # 13th August 2008, 1:06 pm

Search Engine Optimization Through Hoax News. Devious new black-hat SEO technique: invent a news story that’s pure link-bait. The recent “13 year old steals dad’s credit card to buy hookers” story was a hoax: it was a pure play for PageRank. # 22nd May 2008, 6:09 pm


Some thoughts on Mahalo. Rich Skrenta with notes on running a large site that lives and dies by SEO traffic. # 20th August 2007, 5:23 pm

If you’re designing social media systems, you should be keeping an eye on the $2B industry that sells links from your site to their clients.

Rich Skrenta # 7th April 2007, 12:32 am

Why people hate SEO... (and why SMO is bulls$%t). Jason Calacanis explains SMO, or “Social Media Optimisation”—digg spamming now has its own TLA. # 8th February 2007, 7:47 am


The dangers of PageRank

A well-documented side effect of the weblog format is that it brings Google PageRank in almost absurd quantities. I’m now the 5th result for simon on Google, and I’ve been the top result for simon willison almost since the day I launched. High rankings, however, are not always a good thing, especially when combined with a comment system. A growing number of bloggers have found themselves at the top position for terms of little or no relevance to the rest of their sites, which in turn can attract truly surreal comments from visitors from search engines who may never have encountered a blog before.

[... 469 words]


Google conspiracy theories

Microdoc News have a poorly researched story suggesting that Google have been engineering their search results to favour their own properties:

[... 582 words]

Google oddities

Dave Winer:

[... 182 words]