Simon Willison’s Weblog

Subscribe
Atom feed

Entries

Filters: Sorted by date

Weeknotes: git-history, bug magnets and s3-credentials --public

I’ve stopped considering my projects “shipped” until I’ve written a proper blog entry about them, so yesterday I finally shipped git-history, coinciding with the release of version 0.6—a full 27 days after the first 0.1.

[... 1,013 words]

git-history: a tool for analyzing scraped data collected using Git and SQLite

Visit git-history: a tool for analyzing scraped data collected using Git and SQLite

I described Git scraping last year: a technique for writing scrapers where you periodically snapshot a source of data to a Git repository in order to record changes to that source over time.

[... 2,002 words]

Weeknotes: Shaving some beautiful yaks

Visit Weeknotes: Shaving some beautiful yaks

I’ve been mostly shaving yaks this week—two in particular: the Datasette table refactor and the next release of git-history. I also built and released my first Web Component!

[... 1,307 words]

Weeknotes: Apache proxies in Docker containers, refactoring Datasette

Updates to six major projects this week, plus finally some concrete progress towards Datasette 1.0.

[... 1,630 words]

Weeknotes: git-history, created for a Git scraping workshop

Visit Weeknotes: git-history, created for a Git scraping workshop

My main project this week was a 90 minute workshop I delivered about Git scraping at Coda.Br 2021, a Brazilian data journalism conference, on Friday. This inspired the creation of a brand new tool, git-history, plus smaller improvements to a range of other projects.

[... 1,239 words]

Weeknotes: datasette-jupyterlite, s3-credentials and a Python packaging talk

Visit Weeknotes: datasette-jupyterlite, s3-credentials and a Python packaging talk

My big project this week was s3-credentials, described yesterday—but I also put together a fun expermiental Datasette plugin bundling JupyterLite and wrote up my PyGotham talk on Python packaging.

[... 476 words]

How to build, test and publish an open source Python library

Visit How to build, test and publish an open source Python library

At PyGotham this year I presented a ten minute workshop on how to package up a new open source Python library and publish it to the Python Package Index. Here is the video and accompanying notes, which should make sense even without watching the talk.

[... 2,055 words]

s3-credentials: a tool for creating credentials for S3 buckets

Visit s3-credentials: a tool for creating credentials for S3 buckets

I’ve built a command-line tool called s3-credentials to solve a problem that’s been frustrating me for ages: how to quickly and easily create AWS credentials (an access key and secret key) that have permission to read or write from just a single S3 bucket.

[... 1,618 words]

Weeknotes: Learning Kubernetes, learning Web Components

I’ve been mainly climbing the learning curve for Kubernetes and Web Components this week. I also released Datasette 0.59.1 with Python 3.10 compatibility and an updated Docker image.

[... 1,101 words]

Datasette 0.59: The annotated release notes

Visit Datasette 0.59: The annotated release notes

Datasette 0.59 is out, with a miscellaneous grab-bag of improvements. Here are the annotated release notes.

[... 2,103 words]

Finding and reporting an asyncio bug in Python 3.10

I found a bug in Python 3.10 today! Some notes on how I found it and my process for handling it once I figured out what was going on.

[... 1,789 words]

Weeknotes: CDC vaccination history fixes, developing in GitHub Codespaces

I spent the last week mostly surrounded by boxes: we’re completing our move to the new place and life is mostly unpacking now. I did find some time to fix some issues with my CDC vaccination history Datasette instance though.

[... 514 words]

Weeknotes number 100

This entry marks my 100th weeknotes, which I’ve managed to post once a week (plus or minus a few days) consistently since 13th September 2019.

[... 593 words]

Datasette Desktop 0.2.0: The annotated release notes

Visit Datasette Desktop 0.2.0: The annotated release notes

Datasette Desktop is a new macOS desktop application version of Datasette, an “open source multi-tool for exploring and publishing data” built on top of SQLite. I released the first version last week—I’ve just released version 0.2.0 (and a 0.2.1 bug fix) with a whole bunch of critical improvements.

[... 2,208 words]

Datasette Desktop—a macOS desktop application for Datasette

Visit Datasette Desktop - a macOS desktop application for Datasette

I just released version 0.1.0 of the new Datasette macOS desktop application, the first version that end-users can easily install. I would very much appreciate your help testing it out!

[... 1,761 words]

Building a desktop application for Datasette (and weeknotes)

Visit Building a desktop application for Datasette (and weeknotes)

This week I started experimenting with a desktop application version of Datasette—with the goal of providing people who aren’t comfortable with the command-line the ability to get Datasette up and running on their own personal computers.

[... 1,423 words]

Dynamic content for GitHub repository templates using cookiecutter and GitHub Actions

Visit Dynamic content for GitHub repository templates using cookiecutter and GitHub Actions

GitHub repository templates were introduced a couple of years ago to provide a mechanism for creating a brand new GitHub repository starting with an initial set of files.

[... 1,413 words]

Weeknotes: Getting my personal Dogsheep up and running again

Visit Weeknotes: Getting my personal Dogsheep up and running again

I gave a talk about Dogsheep at Noisebridge’s Five Minutes of Fame on Thursday. Just one problem: my regular Dogsheep demo was broken, so I ended up building it from scratch again. In doing so I fixed a few bugs in some Dogsheep tools.

[... 1,311 words]

Datasette on Codespaces, sqlite-utils API reference documentation and other weeknotes

Visit Datasette on Codespaces, sqlite-utils API reference documentation and other weeknotes

This week I broke my streak of not sending out the Datasette newsletter, figured out how to use Sphinx for Python class documentation, worked out how to run Datasette on GitHub Codespaces, implemented Datasette column metadata and got tantalizingly close to a solution for an elusive Datasette feature.

[... 2,164 words]

Apply conversion functions to data in SQLite columns with the sqlite-utils CLI tool

Visit Apply conversion functions to data in SQLite columns with the sqlite-utils CLI tool

Earlier this week I released sqlite-utils 3.14 with a powerful new command-line tool: sqlite-utils convert, which applies a conversion function to data stored in a SQLite column.

[... 1,941 words]

Exploring the SameSite cookie attribute for preventing CSRF

Visit Exploring the SameSite cookie attribute for preventing CSRF

In reading Yan Zhu’s excellent write-up of the JSON CSRF vulnerability she found in OkCupid one thing puzzled me: I was under the impression that browsers these days default to treating cookies as SameSite=Lax, so I would expect attacks like the one Yan described not to work in modern browsers.

[... 2,198 words]

Weeknotes: datasette-remote-metadata, sqlite-transform --multi

I mentioned Project Pelican (still a codename until the end of the embargo) last week. This week it inspired a new plugin, datasette-remote-metadata.

[... 595 words]

The Baked Data architectural pattern

Visit The Baked Data architectural pattern

I’ve been exploring an architectural pattern for publishing websites over the past few years that I call the “Baked Data” pattern. It provides many of the advantages of static site generators while avoiding most of their limitations. I think it deserves to be used more widely.

[... 1,896 words]

Datasette—an ecosystem of tools for working with small data

Visit Datasette - an ecosystem of tools for working with small data

This is the transcript and video from a talk I gave at PyGotham 2020 about using SQLite, Datasette and Dogsheep to work with small data.

[... 4,655 words]

Weeknotes: sqlite-transform 1.1, Datasette 0.58.1, datasette-graphql 1.5

Work on Project Pelican inspires new features and improvements across a number of different projects.

[... 1,419 words]

It doesn’t take much public creativity to stand out as a job candidate

I’ve spent nearly twenty years blogging, giving talks and releasing open source code. It’s been fantastic for my career, and a huge amount of work. But here’s a useful secret: you don’t have to put very much work at all into public creativity in order to stand out as a job candidate.

[... 495 words]

Datasette 0.58: The annotated release notes

I released Datasette 0.58 last night, with new plugin hooks, Unix domain socket support, a major faceting performance fix and a few other improvements. Here are the annotated release notes.

[... 1,062 words]

Weeknotes: Fun with Unix domain sockets

A small enhancement to Datasette this week: I’ve added support for proxying via Unix domain sockets.

[... 809 words]

Django SQL Dashboard 1.0

Visit Django SQL Dashboard 1.0

Earlier this week I released Django SQL Dashboard 1.0. I also just released 1.0.1, with a bug fix for PostgreSQL 10 contributed by Ryan Cheley.

[... 629 words]

PAGNIs: Probably Are Gonna Need Its

Luke Page has a great post up with his list of YAGNI exceptions.

[... 1,289 words]