Simon Willison’s Weblog

Implementing IndieAuth for Datasette

IndieAuth is a spiritual successor to OpenID, developed and maintained by the IndieWeb community and based on OAuth 2. This weekend I attended IndieWebCamp East Coast and was inspired to try my hand at an implementation. datasette-indieauth is the result, a new plugin which enables IndieAuth logins to a Datasette instance.

Surprisingly this was my first IndieWebCamp—I’ve been adjacent to that community for over a decade, but I’d never made it to one of their in-person events before. Now that everything’s virtual I didn’t even have to travel anywhere, so I finally got to break my streak of non-attendance.

Understanding IndieAuth

The key idea behind IndieAuth is to provide federated login based on URLs. Users enter a URL that they own (e.g. simonwillison.net). The protocol then derives their identity provider from that URL, redirects the user there, and waits for them to sign in and be redirected back. Finally it uses tokens passed in the redirect to prove the user’s ownership of the URL and sign them in.

Here’s what that authentication flow looks like, using this demo of the plugin:

Animated demo: starts at an IndieAuth login screen, enters simonwillison.net, gets redirected to another site where clicking the verify button completes the sign-in and redirects back to the original page.
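The redirect step of that flow can be sketched in a few lines of Python. This is a minimal illustration, not the plugin's actual code: the function names are hypothetical, and the parameter names (me, client_id, redirect_uri, state, response_type) are an assumption based on a reading of the IndieAuth specification. Details the specification may additionally require, such as PKCE and scopes, are left out.

```python
import secrets
from urllib.parse import parse_qs, urlencode, urlsplit


def build_authorization_url(authorization_endpoint, me, client_id, redirect_uri):
    """Return (url, state) for sending the user to their identity provider.

    Hypothetical helper: parameter names are assumed from the IndieAuth spec.
    """
    # Random value the client remembers and checks when the user comes back.
    state = secrets.token_urlsafe(16)
    params = {
        "me": me,
        "client_id": client_id,
        "redirect_uri": redirect_uri,
        "state": state,
        "response_type": "code",
    }
    # Respect any query string the endpoint URL already carries.
    separator = "&" if urlsplit(authorization_endpoint).query else "?"
    return authorization_endpoint + separator + urlencode(params), state


def state_matches(expected_state, callback_url):
    """Check that the state echoed back in the redirect is the one we sent."""
    returned = parse_qs(urlsplit(callback_url).query).get("state", [""])[0]
    return secrets.compare_digest(expected_state.encode(), returned.encode())
```

A client would store the returned state (for example in a signed cookie), redirect the browser to the URL, and refuse to continue if state_matches() fails when the user is sent back.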

IndieAuth works by scanning the linked page for a <link rel="authorization_endpoint" href="https://indieauth.com/auth"> HTML element which indicates a service that should be redirected to in order to authenticate the user.
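As a rough sketch of that scanning step, the following uses only the Python standard library to pull the endpoint out of a page that has already been fetched. The function and class names are hypothetical, and this handles only the HTML link element described above; the specification may allow other discovery mechanisms (such as HTTP headers) that are ignored here.

```python
from html.parser import HTMLParser
from urllib.parse import urljoin


class _LinkRelParser(HTMLParser):
    """Collect href values from <link> elements carrying a given rel value."""

    def __init__(self, rel):
        super().__init__()
        self.rel = rel
        self.hrefs = []

    def handle_starttag(self, tag, attrs):
        if tag != "link":
            return
        attrs = dict(attrs)
        # rel is a space-separated list of tokens; attribute values may be None.
        rels = (attrs.get("rel") or "").lower().split()
        if self.rel in rels and attrs.get("href"):
            self.hrefs.append(attrs["href"])


def find_authorization_endpoint(html, page_url):
    """Return the first authorization_endpoint URL in html, or None."""
    parser = _LinkRelParser("authorization_endpoint")
    parser.feed(html)
    if not parser.hrefs:
        return None
    # Resolve relative href values against the URL of the page itself.
    return urljoin(page_url, parser.hrefs[0])
```

Fetching the page is deliberately left out, so the function can be exercised against a literal HTML string.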

For my own site’s authorization endpoint I’m using IndieAuth.com, an identity provider run by IndieAuth spec author Aaron Parecki. IndieAuth.com implements RelMeAuth.

RelMeAuth is a neat hack where the authentication provider can scan the user’s URL for a <link href="https://github.com/simonw" rel="me"> element, confirm that the GitHub profile in question links back to the same page, and then delegate to GitHub authentication for the actual sign-in.
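The "links back" check at the heart of RelMeAuth can be illustrated like this. It is a simplified sketch with hypothetical names, not how IndieAuth.com implements it: it assumes both pages have already been fetched, accepts rel="me" on both link and a elements, and compares URLs after only stripping a trailing slash.

```python
from html.parser import HTMLParser


class _RelMeParser(HTMLParser):
    """Collect href values from <a> and <link> elements with rel="me"."""

    def __init__(self):
        super().__init__()
        self.urls = []

    def handle_starttag(self, tag, attrs):
        if tag not in ("a", "link"):
            return
        attrs = dict(attrs)
        rels = (attrs.get("rel") or "").lower().split()
        if "me" in rels and attrs.get("href"):
            self.urls.append(attrs["href"])


def rel_me_urls(html):
    """Return every rel="me" URL found in the given HTML."""
    parser = _RelMeParser()
    parser.feed(html)
    return parser.urls


def _normalize(url):
    # Crude comparison key: only the trailing slash is ignored.
    return url.rstrip("/")


def links_are_mutual(user_url, user_html, profile_url, profile_html):
    """True if the user's page links to the profile and the profile links back."""
    user_links = {_normalize(u) for u in rel_me_urls(user_html)}
    profile_links = {_normalize(u) for u in rel_me_urls(profile_html)}
    return (
        _normalize(profile_url) in user_links
        and _normalize(user_url) in profile_links
    )
```

Only once both directions check out would a provider hand the actual sign-in over to the linked service.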

Why implement this for Datasette?

A key goal of Datasette is to reduce the friction involved in publishing data online as much as possible.

The datasette publish command addresses this by providing a single CLI command for publishing a SQLite database to the internet and assigning it a new URL.

datasette publish cloudrun ca-fires.db \
    --service ca-fires \
    --title "Latest fires in California"

This command will create a new Google Cloud Run service, package up the ca-fires.db (created in this talk) along with the Datasette web application, and deploy the resulting site using Google Cloud Run.

It will output a URL that looks like this: https://ca-fires-j7hipcg4aq-uc.a.run.app

Datasette is unauthenticated by default—anyone can view the published data. If you want to add authentication you can do so using a plugin, for example datasette-auth-passwords.

Authentication without passwords is better. The datasette-auth-github plugin implements single-sign-on against the GitHub API, but comes with a slight disadvantage: you need to register and configure your application with GitHub in order to configure things like the redirect URL needed for authentication.

For most applications this isn’t a problem, but when you’re deploying dozens or potentially hundreds of applications with Datasette—each with initially unpredictable URLs—this can add quite a bit of friction.

The joy of IndieAuth (and OpenID before it) is that there’s no centralized authority to register with. You can deploy an application to any URL, install the datasette-indieauth plugin and users can start authenticating with your site.

Even better... IndieAuth means you can grant people permission to access a site without them needing to create an account, provided they have their own domain with IndieAuth set up.

I took advantage of that in the design of datasette-indieauth. Say you want to publish a Datasette that only I can access—you can do that using the restrict_access plugin configuration setting like so:

datasette publish cloudrun simon-only.db \
  --service simon-only \
  --title "For Simon's eye only" \
  --install datasette-indieauth \
  --plugin-secret datasette-indieauth \
    restrict_access https://simonwillison.net/

The resulting Datasette instance will require the user to authenticate in order to view it—and will only allow access to the user who can use IndieAuth to prove that they are the owner of simonwillison.net.
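Conceptually, the access check behind a setting like restrict_access boils down to comparing the URL the user proved ownership of against an allow-list. The sketch below illustrates that idea only; it is not the plugin's code. The helper names are hypothetical, and accepting either a whitespace-separated string or a list is an assumption.

```python
from urllib.parse import urlsplit


def canonical(url):
    """Crude canonical form: lowercase scheme and host, default path of '/'."""
    parts = urlsplit(url.strip())
    path = parts.path or "/"
    return f"{parts.scheme.lower()}://{parts.netloc.lower()}{path}"


def is_allowed(authenticated_url, restrict_access):
    """Return True if authenticated_url appears in the allow-list.

    restrict_access may be a whitespace-separated string or a list of URLs
    (an assumption made for this sketch).
    """
    if isinstance(restrict_access, str):
        restrict_access = restrict_access.split()
    allowed = {canonical(u) for u in restrict_access}
    return canonical(authenticated_url) in allowed
```

With this helper, is_allowed("https://simonwillison.net/", "https://simonwillison.net/") is true, while any other authenticated URL is rejected.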

Next steps

There are two sides to the IndieAuth specification: client sites that allow sign-in with IndieAuth, and authorization providers that handle that authentication.

datasette-indieauth currently acts as a client, allowing sign-in with IndieAuth.

I’m considering extending the plugin to act as an authorization provider as well. This is a bit more challenging, as authorization providers need to maintain some small aspects of session state, but it would be good for the IndieAuth ecosystem for there to be more providers. The most widely used provider at the moment is the excellent IndieAuth WordPress plugin, which I used while testing my Datasette plugin and which really was just a one-click install from the WordPress plugin directory.

datasette-indieauth has 100% test coverage, and I wrote the bulk of the logic in a standalone utils.py module which could potentially be extracted out of the plugin and used to implement IndieAuth in Python against other frameworks. A Django IndieAuth provider is another potential project, which could integrate directly with my Django blog.

Addendum: what about OpenID?

From 2006 to 2010 I was a passionate advocate for OpenID. It was clear to me that passwords were an increasingly unpleasant barrier to secure usage of the web, and that some form of federated sign-in was inevitable. I was terrified that Microsoft Passport would take over all authentication on the web!

With hindsight that’s not quite what happened: for a while it looked like Facebook would win instead, but today it seems to be a fairly even balance between Facebook, Google, community-specific authentication providers like GitHub and Apple’s iPhone-monopoly-enforced Sign in with Apple.

OpenID as an open standard didn’t really make it. The specification grew in complicated new directions (Yadis, XRDS, i-names, OpenID Connect, OpenID 2.0) and it never quite overcame the usability hurdle of users having to understand URLs as identifiers.

IndieAuth is a much simpler specification, based on lessons learned from OAuth. I’m still worried about URLs as identifiers, but helping people reclaim their online presence and understand those concepts is core to what the IndieWeb movement is all about.

IndieAuth also has some clever additional tricks up its sleeve. My favourite is that IndieAuth can return an identifier for the user that’s different from the one they typed in the box. This means that if a top-level domain with many users supports IndieAuth, each user can learn to just type example.com in (or click a branded button) to start the authentication flow—they’ll be signed in as example.com/users/simonw based on who they authenticated as. This feels like an enormous usability improvement to me, and one that could really help avoid users having to remember their own profile URLs.
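Accepting a different identifier safely needs one extra check, which might look something like the sketch below. It rests on an assumption about the specification: that a client should only trust a returned URL that differs from the entered one if that URL declares the same authorization endpoint that was just used. The function name is hypothetical, and discover_endpoint stands in for whatever endpoint discovery the client already has.

```python
def resolve_identity(entered_url, returned_me, endpoint_used, discover_endpoint):
    """Return the URL to sign the user in as, or None if it cannot be trusted.

    discover_endpoint is a caller-supplied function mapping a profile URL to
    the authorization endpoint that URL declares (or None).
    """
    if returned_me == entered_url:
        # Same URL the user typed: discovery already covered it.
        return returned_me
    # Different URL: only accept it if it delegates to the same endpoint,
    # otherwise a provider could claim to speak for someone else's URL.
    if discover_endpoint(returned_me) == endpoint_used:
        return returned_me
    return None
```

In the example.com scenario, a user who types example.com and comes back as example.com/users/simonw is accepted only because that profile URL points at the same endpoint.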

OpenID was trying to solve authentication for every user of the internet. IndieAuth is less ambitious—if it only takes off with the subset of people who embrace the IndieWeb movement I think that’s OK.

The datasette-indieauth project is yet another example of the benefit of having a plugin ecosystem around Datasette: I can add support for technologies like IndieAuth without baking them into Datasette’s core, so trying out something new poses almost no risk to the integrity of the larger project.

This is Implementing IndieAuth for Datasette by Simon Willison, posted on 18th November 2020.
