Simon Willison’s Weblog

Items in 2019

Filters: Year: 2019 ×


Scaling React Server-Side Rendering (via) Outstanding, detailed essay from 2017 on challenges and solutions for scaling React server-side rendering at Kijiji, Canada’s largest classified site (owned by eBay). There’s a lot of great stuff in here, including a detailed discussion of different approaches to load balancing, load shedding, component caching, client-side rendering fallbacks and more. # 30th December 2019, 10:26 pm

Guide To Using Reverse Image Search For Investigations (via) Detailed guide from Bellingcat’s Aric Toler on using reverse image search for investigative reporting. Surprisingly Google Image Search isn’t the state of the art: Russian search engine Yandex offers a much more powerful solution, mainly because it’s the largest public-facing image search engine to integrate scary levels of face recognition. # 30th December 2019, 10:23 pm

Machine Learning on Mobile and at the Edge: 2019 industry year-in-review (via) This is a fantastic detailed overview of advances made in the field of machine learning on the edge (primarily on mobile devices) over 2019. I’m really excited about this trend: I love the improved privacy implications of running models on my phone without uploading data to a server, and it’s great to see techniques like Federated Learning (from Google Labs) which enable devices to privately train models in a distributed way without having to upload their training data. # 30th December 2019, 10:17 pm

sqlite-utils 2.0: real upserts

I just released version 2.0 of my sqlite-utils library/CLI tool to PyPI.

[... 1140 words]

For creative work, you can’t cheat. My believe is that there are 5 creative hours in everyone’s day. All I ask of people at Shopify is that 4 of those are channeled into the company.

Tobi Lutke # 26th December 2019, 7:06 pm

free-for.dev (via) It’s pretty amazing how much you can build on free tiers these days—perfect for experimenting with side-projects. free-for.dev collects free SaaS tools for developers via pull request, and has had contributions from over 500 people. # 26th December 2019, 10:03 am

Weeknotes: Datasette 0.33

I released Datasette 0.33 yesterday. The release represents an accumulation of small changes and features since Datasette 0.32 back in November. Duplicating the release notes:

[... 678 words]

The Guardian’s nifty old-article trick is a reminder of how news organizations can use metadata to limit misinformation (via) The Guardian displays prominent banners on news stories from more than a year ago warning that it is an older article to help prevent accidental or intentional spread of misinformation using their content as ammunition. Impressively they also display the year prominently on the card images they serve as social media previews fir older articles. # 23rd December 2019, 9:36 am

Building tools to bring data-driven reporting to more newsrooms. I wrote about my fellowship project so far and my goals for the next few months for the JSK Medium publication. My next priority: an invite-only hosted version for newsrooms so that figuring out how to install and manage the software isn’t the biggest barrier to entry. # 20th December 2019, 11:17 am

athena-sqlite (via) Amazon Athena is the AWS tool for querying data stored in S3—as CSV, JSON or Apache Parquet files—using SQL. It’s an interesting way of buliding a very cheap data warehouse on top of S3 without having to run any additional services. Athena recently added a query federation SDK which lets you define additional custom data sources using Lambda functions. Damon Cortesi used this to write a custom connector for SQLite, which lets you run queries against data stored in SQLite files that you have uploaded to S3. You can then run joins between that data and other Athena sources. # 18th December 2019, 9:05 am

GitHub Actions ci.yml for deno. Spotted this today: it’s one of the cleanest examples I’ve seen of a complex CI configuration for GitHub Actions, testing, linting, benchmarking and building Ryan Dahl’s deno JavaScript runtime. # 18th December 2019, 8:49 am

Microbrowsers are Everywhere (via) Colin Bendell introduces a new-to-me term, “microbrowsers”, to describe the user-agents which hit websites to generate unfurled link previews in messenger apps. Twitter and Facebook first popularized them, but today you’re likely getting far more preview-generating traffic from chat clients such as iMessage, WhatsApp and Slack (which won’t execute script and ignore cookies, and hence won’t show up in Google Analytics). Lots of great tips here—one example: if you provide three og:image meta tags iMessage will render them as a collage. # 18th December 2019, 8:32 am

Logging to SQLite using ASGI middleware

I had some fun playing around with ASGI middleware and logging during our flight back to England for the holidays.

[... 2535 words]

Monarch Bear Grove on Niche Museums (via) Monarch Bear Grove is my favourite hidden corner of Golden Gate Park in San Francisco. It has stone circles formed from pieces of a Spanish monastery that was exported to the USA by press baron William Randolph Hearst. And there are druids. You should read the whole thing. (I added paragraph breaks for this using datasette-render-markdown—Niche Museums is basically a full-blown blog now.) # 16th December 2019, 9:19 pm

London Silver Vaults on Niche Museums. I’m keeping up my streak of posting a new museum I’ve visited to niche-museums.com daily—today’s entry is the London Silver Vaults, which I think are one of London’s best kept secrets: 30 specialist silver merchants in a network of vaults five storeys below Chancery Lane. # 12th December 2019, 2:40 am

The Blue Tape List (via) I’ve often thought there’s something magical about your first month at a new job—you can meet anyone and ask any question, taking advantage of your “newbie” status. I like this suggestion by Michael Lopp to encourage your new hires to take notes on things that they think are broken but reserve acting on them for long enough to gain fuller context of how the new organization works. # 10th December 2019, 6:09 pm

Better presentations through storytelling and STAR moments

Last week I completed GSBGEN 315: Strategic Communication at the Stanford Graduate School of Business.

[... 643 words]

Two malicious Python libraries caught stealing SSH and GPG keys. Nasty. Two typosquatting libraries were spotted on PyPI—targetting dateutil and jellyfish but with tricky variants of their names. They attempted to exfiltrate SSH and GPG keys and send them to an IP address defined server. npm has seen this kind of activity too—it’s important to consider this when installing packages. # 5th December 2019, 6:07 am

flk: A LISP that runs wherever Bash is (via) This is a heck of a project: an implementation of LISP written entirely in Bash, meaning you can run it as a script on any machine that has a Bash installation. # 4th December 2019, 5:19 am

Let’s agree that no matter what we call the situation that the humans who are elsewhere are at a professional disadvantage. There is a communication, culture, and context tax applied to the folks who are distributed. Your job as a leader to actively invest in reducing that tax.

Michael Lopp # 3rd December 2019, 1:34 pm

datasette-atom: Define an Atom feed using a custom SQL query

I’ve been having a ton of fun iterating on www.niche-museums.com. I put together some notes on how the site works last week, and I’ve been taking advantage of the Thanksgiving break to continue exploring ways in which Datasette can be used to quickly build database-backed static websites.

[... 1084 words]

In general, reviewers should favor approving a CL [code review] once it is in a state where it definitely improves the overall code health of the system being worked on, even if the CL isn’t perfect.

Google Standard of Code Review # 28th November 2019, 5:40 am

niche-museums.com, powered by Datasette

I just released a major upgrade to my www.niche-museums.com website (launched last month).

[... 1154 words]

With a sufficient number of users of an API, it does not matter what you promise in the contract: all observable behaviors of your system will be depended on by somebody.

Hyrum's Law # 21st November 2019, 10:45 pm

How Do You Remove Unused CSS From a Site? (via) Chris Coyier takes an exhaustive look at the current set of tools for automatically removing unused CSS, and finds that there’s no magic bullet but you can get OK results if you use them carefully. # 21st November 2019, 4:41 am

Weeknotes: datasette-template-sql

Last week I talked about wanting to take ona a larger Datasette project, and listed some candidates. I ended up pushing a big project that I hadn’t listed there: the upgrade of Datasette to Python 3.8, which meant dropping support for Python 3.5 (thanks to incompatible dependencies).

[... 521 words]

datasette-template-sql (via) New Datasette plugin, celebrating the new ability in Datasette 0.32 to have asynchronous custom template functions in Jinja (which was previously blocked by the need to support Python 3.5). The plugin adds a sql() function which can be used to execute SQL queries that are embedded directly in custom templates. # 15th November 2019, 12:59 am

I have sometimes wondered how I would fare with a problem where the solution really isn’t in sight. I decided that I should give it a try before I get too old. I’m going to work on artificial general intelligence (AGI). I think it is possible, enormously valuable, and that I have a non-negligible chance of making a difference there, so by a Pascal’s Mugging sort of logic, I should be working on it.

John Carmack # 14th November 2019, 1:18 am

Datasette 0.31. Released today: this version adds compatibility with Python 3.8 and breaks compatibility with Python 3.5. Since Glitch support Python 3.7.3 now I decided I could finally give up on 3.5. This means Datasette can use f-strings now, but more importantly it opens up the opportunity to start taking advantage of Starlette, which makes all kinds of interesting new ASGI-based plugins much easier to build. # 12th November 2019, 6:11 am

My Python Development Environment, 2020 Edition (via) Jacob Kaplan-Moss shares what works for him as a Python environment coming into 2020: pyenv, poetry, and pipx. I’m not a frequent user of any of those tools—it definitely looks like I should be. # 12th November 2019, 1:30 am