Simon Willison’s Weblog

Entries tagged projects

Filters: Type: entry × projects ×


Single sign-on against GitHub using ASGI middleware

I released Datasette 0.29 last weekend, the first version of Datasette to be built on top of ASGI (discussed previously in Porting Datasette to ASGI, and Turtles all the way down).

[... 1612 words]

Porting Datasette to ASGI, and Turtles all the way down

This evening I finally closed a Datasette issue that I opened more than 13 months ago: #272: Port Datasette to ASGI. A few notes on why this is such an important step for the project.

[... 1082 words]

Datasette 0.28—and why master should always be releasable

It’s been quite a while since the last substantial release of Datasette. Datasette 0.27 came out all the way back in January.

[... 1326 words]

Running Datasette on Glitch

The worst part of any software project is setting up a development environment. It’s by far the biggest barrier for anyone trying to get started learning to code. I’ve been a developer for more than twenty years and I still feel the pain any time I want to do something new.

[... 998 words]

Generating a commit log for San Francisco’s official list of trees

San Francisco has a neat open data portal (as do an increasingly large number of cities these days). For a few years my favourite file on there has been Street Tree List, a list of all 190,000 trees in the city maintained by the Department of Public Works.

[... 1051 words]

I commissioned an oil painting of Barbra Streisand’s cloned dogs

Two dogs in a stroller looking at a gravestone, as an oil painting
Two identical puffs of white fur, gazing at the tombstone of the dog they are

[... 517 words]

sqlite-utils: a Python library and CLI tool for building SQLite databases

sqlite-utils is a combination Python library and command-line tool I’ve been building over the past six months which aims to make creating new SQLite databases as quick and easy as possible.

[... 1237 words]

Exploring search relevance algorithms with SQLite

SQLite isn’t just a fast, high quality embedded database: it also incorporates a powerful full-text search engine in the form of the FTS4 and FTS5 extensions. You’ve probably used these a bunch of times already: many iOS, Android and desktop applications use SQLite under-the-hood and use it to implement their built-in search.

[... 1398 words]

Zeit 2.0, and building smaller Python Docker images

Changes are afoot at Zeit Now, my preferred hosting provider for the past year (see previous posts). They have announced Now 2.0, an intriguing new approach to providing auto-scaling immutable deployments. It’s built on top of lambdas, and comes with a whole host of new constraints: code needs to fit into a 5MB bundle for example (though it looks like this restriction will soon be relaxed a littleupdate November 19th you can now bump this up to 50MB).

[... 1872 words]

The interesting ideas in Datasette

Datasette (previously) is my open source tool for exploring and publishing structured data. There are a lot of ideas embedded in Datasette. I realized that I haven’t put many of them into writing.

[... 2857 words]

Analyzing US Election Russian Facebook Ads

Two interesting data sources have emerged in the past few weeks concerning the Russian impact on the 2016 US elections.

[... 922 words]

Datasette Facets

Datasette 0.22 is out with the most significant new feature I’ve added since the initial release: faceted browse.

[... 1189 words]

Exploring the UK Register of Members Interests with SQL and Datasette

Ever wondered which UK Members of Parliament get gifted the most helicopter rides? How about which MPs have been given Christmas hampers by the Sultan of Brunei? (David Cameron, William Hague and Michael Howard apparently). Here’s how to dig through the Register of Members Interests using SQL and Datasette.

[... 1167 words]

Datasette plugins, and building a clustered map visualization

Datasette now supports plugins!

[... 751 words]

Analyzing my Twitter followers with Datasette

I decided to do some ad-hoc analsis of my social network on Twitter this afternoon… and since everything is more fun if you bundle it up into a SQLite database and publish it to the internet I performed the analysis using Datasette.

[... 1314 words]

Datasette Publish: a web app for publishing CSV files as an online database

I’ve just released Datasette Publish, a web tool for turning one or more CSV files into an online database with a JSON API.

[... 863 words]

New in Datasette: filters, foreign keys and search

I’ve released Datasette 0.13 with a number of exciting new features (Datasette previously).

[... 1143 words]

Datasette: instantly create and publish an API for your SQLite databases

I just shipped the first public version of datasette, a new tool for creating and publishing JSON APIs for SQLite databases.

[... 968 words]

Implementing faceted search with Django and PostgreSQL

I’ve added a faceted search engine to this blog, powered by PostgreSQL. It supports regular text search (proper search, not just SQL“like” queries), filter by tag, filter by date, filter by content type (entries vs blogmarks vs quotation) and any combination of the above. Some example searches:

[... 3049 words]

WildlifeNearYou: It began on a fort...

Back in October 2008, myself and 11 others set out on the first /dev/fort expedition. The idea was simple: gather a dozen geeks, rent a fort, take food and laptops and see what we could build in a week.

[... 558 words]

Crowdsourced document analysis and MP expenses

As you may have heard, the UK government released a fresh batch of MP expenses documents a week ago on Thursday. I spent that week working with a small team at Guardian HQ to prepare for the release. Here’s what we built:

[... 2051 words]

Django ponies: Proposals for Django 1.2

I’ve decided to step up my involvement in Django development in the run-up to Django 1.2, so I’m currently going through several years worth of accumulated pony requests figuring out which ones are worth advocating for. I’m also ensuring I have the code to back them up—my innocent AutoEscaping proposal a few years ago resulted in an enormous amount of work by Malcolm and I don’t think he’d appreciate a repeat performance.

[... 1674 words]

djng—a Django powered microframework

djng is nearly two weeks old now, so it’s about time I wrote a bit about the project.

[... 1501 words]

rev=canonical bookmarklet and designing shorter URLs

I’ve watched the proliferation of URL shortening services over the past year with a certain amount of dismay. I care about the health of the web and try to ensure that URLs I am responsible will last for as long as possible, and I think it’s very unlikely that all of these new services will still be around in twenty years time. Last month I suggested that the Internet Archive start mirroring redirect databases, and last week I was pleased to hear that Archiveteam, a different organisation, had already started crawling.

[... 920 words]

Rate limiting with memcached

On Monday, several high profile “celebrity” Twitter accounts started spouting nonsense, the victims of stolen passwords. Wired has the full story—someone ran a dictionary attack against a Twitter staff member, discovered their password and used Twitter’s admin tools to reset the passwords on the accounts they wanted to steal.

[... 910 words]

Announcing dmigrations

The team at Global Radio (formerly GCap Media) is the largest group of Django developers I’ve personally worked with, consisting of 14 developers split into two scrum teams, all contributing to the same overall codebase.

[... 625 words]