Simon Willison’s Weblog

185 items tagged “projects”

2020

sqlite-utils 2.14 (via) I finally figured out porter stemming with SQLite full-text search today—it turns out it’s as easy as adding tokenize=’porter’ to the CREATE VIRTUAL TABLE statement. So I just shipped sqlite-utils 2.14 with a tokenize= option (plus the ability to insert binary file data from stdin). # 1st August 2020, 9:19 pm

Fun with binary data and SQLite

This week I’ve been mainly experimenting with binary data storage in SQLite. sqlite-utils can now insert data from binary files, and datasette-media can serve content over HTTP that originated as binary BLOBs in a database file.

[... 947 words]

datasette-media 0.4. datasette-media is my Datasette plugin for serving media (e.g. images) directly from Datasette. The first version used file paths saved in a column and served the data from disk—this new version adds the ability to serve content from BLOB columns, such as those created by the new “sqlite-utils insert-files” command. It also adds configurable support for resizing images based on querystring parameters like ?w=100. # 28th July 2020, 2:22 am

sqlite-utils 2.12 (via) I’ve been experimenting with ways of improving BLOB support in Datasette and sqlite-utils. This new version of sqlite-utils includes a “sqlite-utils insert-files” command, which can recursively crawl directories for files and add their contents to SQLite with configurable columns containing their metadata. I was inspired by Paul Ford who has been creating multi-GB SQLite databases of images and PDFs. It turns out that when disk space is cheap this is a pretty effective way of working with interesting corpuses of documents and images. # 27th July 2020, 7:36 am

pypi-rename. I wanted to rename a PyPI package (renaming datasette-insert-api to datasette-insert as it’s about to grow some non-API features). PyPI recommend uploading a final release under the old name which points to (and depends on) the new name. I’ve built a cookiecutter template to codify that pattern. # 25th July 2020, 11:07 pm

Weeknotes: datasette-copyable, datasette-insert-api

Two new Datasette plugins this week: datasette-copyable, helping users copy-and-paste data from Datasette into other places, and datasette-insert-api, providing a JSON API for inserting and updating data and creating tables.

[... 953 words]

Weeknotes: datasette-auth-passwords, a Datasette logo and a whole lot more

All sorts of project updates this week.

[... 913 words]

datasette-auth-passwords. My latest plugin: datasette-auth-passwords provides a mechanism for signing into Datasette using a username and password (which is verified in order to set a ds_actor authentication cookie). So far it only supports passwords that are hard-coded into Datasette’s configuration via environment variables, but I plan to add database-backed user accounts in the future. # 13th July 2020, 11:39 pm

Building a self-updating profile README for GitHub

GitHub quietly released a new feature at some point in the past few days: profile READMEs. Create a repository with the same name as your GitHub account (in my case that’s github.com/simonw/simonw), add a README.md to it and GitHub will render the contents at the top of your personal profile page—for me that’s github.com/simonw

[... 599 words]

Weeknotes: SBA Covid-19 PPP loans, Datasette talks, Datasette plugin upgrades

This week I’ve mainly been exploring Small Business Administration Covid-19 loans data, pitching some talks and upgrading some plugins for compatibility with Datasette 0.44+.

[... 524 words]

sba-loans-covid-19-datasette (via) The treasury department released a bunch of data on the Covid-19 SBA Paycheck Protection Program Loan recipients today—I’ve loaded the most interesting data (the $150,000+ loans) into a Datasette instance. # 7th July 2020, 2:42 am

Datasette 0.45: The annotated release notes

Datasette 0.45, out today, features magic parameters for canned queries, a log out feature, improved plugin documentation and four new plugin hooks.

[... 863 words]

Weeknotes: cookiecutter templates, better plugin documentation, sqlite-generate

I spent this week spreading myself between a bunch of smaller projects, and finally getting familiar with cookiecutter. I wrote about my datasette-plugin cookiecutter template earlier in the week; here’s what else I’ve been working on.

[... 703 words]

datasette-block-robots. Another little Datasette plugin: this one adds a /robots.txt page with “Disallow: /” to block all indexing of a Datasette instance from respectable search engine crawlers. I built this in less than ten minutes from idea to deploy to PyPI thanks to the datasette-plugin cookiecutter template. # 23rd June 2020, 3:28 am

click-app. While working on sqlite-generate today I built a cookiecutter template for building the skeleton for Click command-line utilities. It’s based on datasette-plugin so it automatically sets up GitHub Actions for running tests and deploying packages to PyPI. # 23rd June 2020, 2:21 am

sqlite-generate (via) I wrote this tool today to generate arbitrarily large SQLite databases, for testing purposes. You tell it how many tables, columns and rows you want and it will use the Faker Python library to generate random data and populate the tables with it. # 23rd June 2020, 2:19 am

A cookiecutter template for writing Datasette plugins

Datasette’s plugin system is one of the most interesting parts of the entire project. As I explained to Matt Asay in this interview, the great thing about plugins is that Datasette can gain new functionality overnight without me even having to review a pull request. I just need to get more people to write them!

[... 914 words]

Weeknotes: Datasette alphas for testing new plugin hooks

A relatively quiet week this week, compared to last week’s massive push to ship Datasette 0.44 with authentication, permissions and writable canned queries. I can now ship alpha releases, such as today’s Datasette 0.45a1, which means I can preview new plugin features before they are completely ready and stable.

[... 728 words]

Datasette 0.44: The annotated release notes

I just released Datasette 0.44 to PyPI. With 128 commits since 0.43 this is the biggest release in a long time—and likely the last major release of new features before Datasette 1.0.

[... 1648 words]

Serving photos locally with datasette-media. datasette-media is a new Datasette plugin which can serve static files from disk in response to a configured SQL query that maps incoming URL parameters to a path to a file. I built it so I could run dogsheep-photos locally on my laptop and serve up thumbnails of images that match particular queries. I’ve added documentation to the dogsheep-photos README explaining how to use datasette-media, datasette-json-html and datasette-template-sql to create custom interfaces onto Apple Photos data on your machine. # 26th May 2020, 3:53 pm

Using SQL to find my best photo of a pelican according to Apple Photos

According to the Apple Photos internal SQLite database, this is the most aesthetically pleasing photograph I have ever taken of a pelican:

[... 1937 words]

Weeknotes: Working on my screenplay

I’m taking an Introduction to Screenwriting course with Adam Tobin at Stanford, and my partial screenplay is due this week. I’m pulling together some scenes that tell the story of the Russian 1917 February Revolution and the fall of the Tsar through the lens of the craftsmen working on the Tsar’s last Fabergé egg. So I’ve not been spending much time on anything else.

[... 226 words]

Weeknotes: Datasette 0.41, photos breakthroughs

Shorter weeknotes this week, because my main project for the week warrants a detailed write-up on its own (coming soon... update 21st May here it is).

[... 867 words]

github-to-sqlite 2.2 highlights thread. I released github-to-sqlite 2.2 today with a new “stargazers” command for importing users who have starred one or more specific repositories. This Twitter thread lists highlights of recent releases and links to a live Datasette demo that shows what the tool can do. # 2nd May 2020, 10:16 pm

Weeknotes: Archiving coronavirus.data.gov.uk, custom pages and directory configuration in Datasette, photos-to-sqlite

I mainly made progress on three projects this week: Datasette, photos-to-sqlite and a cleaner way of archiving data to a git repository.

[... 1132 words]

Weeknotes: Datasette 0.40, various projects, Dogsheep photos

A new release of Datasette, two new projects and progress towards a Dogsheep photos solution.

[... 826 words]

Using a self-rewriting README powered by GitHub Actions to track TILs

I’ve started tracking TILs—Today I Learneds—inspired by this five-year-and-counting collection by Josh Branchaud on GitHub (found via Hacker News). I’m keeping mine in GitHub too, and using GitHub Actions to automatically generate an index page README in the repository and a SQLite-backed search engine.

[... 1100 words]

Weeknotes: Hacking on 23 different projects

I wrote a lot of code this week: 184 commits over 23 repositories! I’ve also started falling for Zeit Now v2, having found workarounds for some of my biggest problems with it.

[... 901 words]

datasette-clone

I released a fun little Datasette utility today: datasette-clone.

[... 379 words]

Goodbye Zeit Now v1, hello datasette-publish-now—and talking to myself in GitHub issues

This week I’ve been mostly dealing with the finally announced shutdown of Zeit Now v1. And having long-winded conversations with myself in GitHub issues.

[... 2049 words]