Simon Willison’s Weblog

Plugin support for Datasette Lite

17th August 2022

I’ve added a new feature to Datasette Lite, my distribution of Datasette that runs entirely in the browser using Python and SQLite compiled to WebAssembly. You can now install additional Datasette plugins by passing them in the URL.

Datasette Lite background

Datasette Lite runs Datasette in the browser. I initially built it as a fun technical proof of concept, but I’m increasingly finding it to be a genuinely useful tool for quick ad-hoc data analysis and publication. Not having any server-side components at all makes it effectively free to use without fear of racking up cloud computing costs for a throwaway project.

You can read more about Datasette Lite in these posts:

Adding plugins to Datasette Lite

One of Datasette’s key features is support for plugins. There are over 90 listed in the plugin directory now, with more emerging all the time. They’re a fantastic way to explore new feature ideas and extend the software to handle non-default use cases.

Plugins are Python packages, published to PyPI. You can add them to Datasette Lite using the new ?install=name-of-plugin query string parameter.

Here’s an example URL that loads the datasette-jellyfish plugin, which adds new SQL functions for calculating distances between strings, then executes a SQL query that demonstrates that plugin:

https://lite.datasette.io/?install=datasette-jellyfish#/fixtures?sql=SELECT%0A++++levenshtein_distance%28%3As1%2C+%3As2%29%2C%0A++++damerau_levenshtein_distance%28%3As1%2C+%3As2%29%2C%0A++++hamming_distance%28%3As1%2C+%3As2%29%2C%0A++++jaro_similarity%28%3As1%2C+%3As2%29%2C%0A++++jaro_winkler_similarity%28%3As1%2C+%3As2%29%2C%0A++++match_rating_comparison%28%3As1%2C+%3As2%29%3B&s1=barrack+obama&s2=barrack+h+obama

That URL uses ?install=datasette-jellyfish to install the plugin, then executes the following SQL query:

SELECT
    levenshtein_distance(:s1, :s2),
    damerau_levenshtein_distance(:s1, :s2),
    hamming_distance(:s1, :s2),
    jaro_similarity(:s1, :s2),
    jaro_winkler_similarity(:s1, :s2),
    match_rating_comparison(:s1, :s2);

It sets s1 to "barrack obama" and s2 to "barrack h obama".
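Here’s a minimal sketch of how a URL like that could be assembled in Python. The build_lite_url() helper is hypothetical, not part of Datasette Lite: the layout (?install= before the #, then /database?sql=... after it) is copied from the example URL above, and repeating install= once per plugin is an assumption. The query is shortened to two functions to keep the example small.

```python
from urllib.parse import urlencode

BASE_URL = "https://lite.datasette.io/"


def build_lite_url(plugins, database, sql, params):
    """Return a Datasette Lite URL that installs plugins and runs a query.

    Hypothetical helper: the URL layout is modeled on the example URL
    in this post, not on any documented API.
    """
    # One install= entry per plugin; repeating the parameter is an assumption.
    install_qs = urlencode([("install", name) for name in plugins])
    # The SQL and its named parameters go in the fragment's own query string.
    fragment_qs = urlencode({"sql": sql, **params})
    return f"{BASE_URL}?{install_qs}#/{database}?{fragment_qs}"


sql = """SELECT
    levenshtein_distance(:s1, :s2),
    hamming_distance(:s1, :s2);"""

url = build_lite_url(
    ["datasette-jellyfish"],
    "fixtures",
    sql,
    {"s1": "barrack obama", "s2": "barrack h obama"},
)
print(url)
```

urlencode() uses plus signs for spaces and percent-escapes for newlines, colons and parentheses, which is the same encoding the example URL uses.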

Screenshot showing the results of that SQL query running in Datasette Lite. It compares the string barrack obama with the string barrack h obama and shows various different scores.
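To give a feel for what "adding a SQL function" means, here’s a rough sketch using only Python’s standard sqlite3 module. This is not the plugin’s code: it swaps in a hand-written Levenshtein implementation, and it registers the function on a plain in-memory connection rather than through Datasette’s plugin hooks.

```python
import sqlite3


def levenshtein_distance(a, b):
    """Classic dynamic-programming edit distance (insert, delete, substitute)."""
    previous = list(range(len(b) + 1))
    for i, char_a in enumerate(a, start=1):
        current = [i]
        for j, char_b in enumerate(b, start=1):
            cost = 0 if char_a == char_b else 1
            current.append(min(
                previous[j] + 1,         # delete char_a
                current[j - 1] + 1,      # insert char_b
                previous[j - 1] + cost,  # substitute or match
            ))
        previous = current
    return previous[-1]


conn = sqlite3.connect(":memory:")
# Register the Python function as a two-argument SQL function.
conn.create_function("levenshtein_distance", 2, levenshtein_distance)

row = conn.execute(
    "SELECT levenshtein_distance(:s1, :s2)",
    {"s1": "barrack obama", "s2": "barrack h obama"},
).fetchone()
print(row[0])  # 2: the two inserted characters are "h" and a space
```

Once the function is registered, any SQL query on that connection can call it by name, which is the same shape as the query shown above.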

Plugin compatibility

Unfortunately, many existing Datasette plugins aren’t yet compatible with Datasette Lite. Most importantly, visualization plugins such as datasette-cluster-map and datasette-vega don’t work.

This is because I haven’t yet solved the challenge of loading additional JavaScript and CSS into Datasette Lite—see issue #8.

Here’s the full list of plugins that I’ve confirmed work with Datasette Lite so far:

How it works

The implementation is pretty simple—it can be seen in this commit. The short version is that ?install= options are passed through to the Python web worker that powers Datasette Lite, which then runs the following:

for install_url in install_urls:
    await micropip.install(install_url)

micropip is a component of Pyodide which knows how to install pure Python wheels directly from PyPI into the browser’s emulated Python environment. If you open up the browser devtools networking panel you can see that in action!

The Firefox Network pane shows a flurry of traffic, some of it to PyPI to look up the JSON descriptions of packages, followed by downloads of .whl files from files.pythonhosted.org.
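The JSON-then-wheel pattern in that traffic can be illustrated with a small sketch. This is not micropip’s actual resolution logic, which also has to deal with versions, dependencies and compatibility tags. It only shows the basic idea of scanning a PyPI-style JSON description for a pure Python wheel. The pick_pure_python_wheel() helper and the sample data are made up for illustration.

```python
def pick_pure_python_wheel(package_json):
    """Return the URL of the first pure-Python wheel in a PyPI-style payload, or None."""
    for file_info in package_json.get("urls", []):
        filename = file_info.get("filename", "")
        # Pure-Python wheels carry the "none-any" ABI and platform tags.
        if filename.endswith("-none-any.whl"):
            return file_info["url"]
    return None


# Hand-written sample shaped like a PyPI JSON response; the package is fictional.
sample = {
    "info": {"name": "example-plugin", "version": "1.0"},
    "urls": [
        {
            "filename": "example_plugin-1.0.tar.gz",
            "url": "https://files.pythonhosted.org/packages/example/example_plugin-1.0.tar.gz",
        },
        {
            "filename": "example_plugin-1.0-py3-none-any.whl",
            "url": "https://files.pythonhosted.org/packages/example/example_plugin-1.0-py3-none-any.whl",
        },
    ],
}

print(pick_pure_python_wheel(sample))
```

The source distribution (.tar.gz) is skipped because only wheels can be installed this way, and only pure Python wheels will work without compiled extensions.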

Since the ?install= parameter is being passed directly to micropip.install() you don’t even need to provide names of packages hosted on PyPI—you could instead provide the URL to a wheel file that you’re hosting elsewhere.

This means you can use ?install= as a code injection attack: you can install any Python code you want into the environment. I think that’s fine. The only person who will be affected by this is the user who is viewing the page, and the lite.datasette.io domain deliberately doesn’t have any cookies set that could cause problems if someone were to steal them in some way.