Simon Willison’s Weblog

Items in Feb, 2020


Wildcard: Spreadsheet-Driven Customization of Web Applications (via) What a fascinating collection of ideas. Wildcard is a browser extension (currently using Tampermonkey and sadly not yet available to try out) which lets you add “spreadsheet-driven customization” to any web application. Watching the animated screenshots in the videos helps explain what this means. Essentially it’s a two-way scraping trick, where content on the page (e.g. Airbnb listings) is extracted into a spreadsheet-like table interface using JavaScript, but then interactions you make in that spreadsheet like filtering and sorting are reflected back on the original page. It even has the ability to support editable cells by mapping them to form inputs on the page. Lots to think about here. # 28th February 2020, 7:39 pm

Weeknotes: Datasette Writes

As discussed previously, the biggest hole in Datasette’s feature set at the moment involves writing to the database.

[... 604 words]

I’ve really come to appreciate that performance isn’t just some property of a tool independent from its functionality or its feature set. Performance — in particular, being notably fast — is a feature in and of its own right, which fundamentally alters how a tool is used and perceived.

Nelson Elhage # 24th February 2020, 2:32 pm

Why Google invested in providing Google Fonts for free. Fascinating comment from former Google Fonts team member Raph Levien. In short: text rendered as PNGs hurt Google Search, fonts were holding up the transition away from Flash, Google Docs needed them to better compete with Office, and anything that helps create better ads is easy to find funding for. # 23rd February 2020, 2:13 pm

So next time someone is giving you feedback about something you made, think to yourself that to win means getting two or three insights, ideas, or suggestions that you are excited about, and that you couldn’t think up on your own.

Juliette Cezzar # 21st February 2020, 1:04 am

Things I learned about shapefiles building shapefile-to-sqlite

The latest in my series of x-to-sqlite tools is shapefile-to-sqlite. I learned a whole bunch of things about the ESRI shapefile format while building it.

[... 1073 words]

pup. This is a great idea: a command-line tool for parsing HTML on stdin using CSS selectors. It’s like jq but for HTML. Supports a sensible collection of selectors and has a number of output options for the selected nodes, including plain text and JSON. It also works as a simple pretty-printer for HTML. # 14th February 2020, 4:25 pm
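To make the idea concrete, here is a minimal sketch in Python of selecting nodes from HTML and emitting them as text or JSON. This is not how pup is implemented: the `SimpleSelector` class and `select` function are hypothetical names invented for illustration, only `tag` and `tag.class` selectors are handled, and matches nested inside another match are folded into the outer one.

```python
import json
from html.parser import HTMLParser


class SimpleSelector(HTMLParser):
    """Collect elements matching a `tag` or `tag.class` selector (illustrative only)."""

    # Void elements never get a closing tag, so they must not affect nesting depth.
    VOID = {"area", "base", "br", "col", "embed", "hr", "img", "input",
            "link", "meta", "source", "track", "wbr"}

    def __init__(self, selector):
        super().__init__()
        tag, _, cls = selector.partition(".")
        self.tag, self.cls = tag, cls or None
        self.depth = 0      # greater than 0 while inside a matching element
        self.results = []   # one dict per matched element

    def handle_starttag(self, tag, attrs):
        if tag in self.VOID:
            return
        if self.depth:
            # Already inside a match: just track nesting.
            self.depth += 1
            return
        attrs = dict(attrs)
        classes = (attrs.get("class") or "").split()
        if tag == self.tag and (self.cls is None or self.cls in classes):
            self.depth = 1
            self.results.append({"tag": tag, "attrs": attrs, "text": ""})

    def handle_endtag(self, tag):
        if self.depth and tag not in self.VOID:
            self.depth -= 1

    def handle_data(self, data):
        if self.depth:
            self.results[-1]["text"] += data


def select(html, selector):
    """Return a list of matched elements as plain dicts."""
    parser = SimpleSelector(selector)
    parser.feed(html)
    parser.close()
    return parser.results


sample = ('<ul><li class="item">One <b>bold</b></li>'
          '<li>Two</li><li class="item other">Three</li></ul>')

matches = select(sample, "li.item")
print([m["text"] for m in matches])   # plain-text output
print(json.dumps(matches, indent=2))  # JSON output
```

To mimic the stdin workflow, `sample` could be replaced with `sys.stdin.read()` and the selector taken from `sys.argv[1]`.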

A group of software engineers gathered around a whiteboard are a joint cognitive system. The scrawls on the board are spatial cues for building a shared model of a complex system.

Eric Dobbs # 13th February 2020, 6:48 pm

How to cheat at unit tests with pytest and Black

I’ve been making a lot of progress on Datasette Cloud this week. As an application that provides private hosted Datasette instances (initially targeted at data journalists and newsrooms) the majority of the code I’ve written deals with permissions: allowing people to form teams, invite team members, promote and demote team administrators and suchlike.

[... 885 words]

We write a lot of JavaScript at Basecamp, but we don’t use it to create “JavaScript applications” in the contemporary sense. All our applications have server-side rendered HTML at their core, then add sprinkles of JavaScript to make them sparkle. [...] It allows us to party with productivity like days of yore. A throwback to when a single programmer could make rapacious progress without getting stuck in layers of indirection or distributed systems. A time before everyone thought the holy grail was to confine their server-side application to producing JSON for a JavaScript-based client application.

David Heinemeier Hansson # 8th February 2020, 8:10 am

Deep learning isn’t hard anymore. This article does a great job of explaining how transfer learning is unlocking a new wave of innovation around deep learning. Previously if you wanted to train a model you needed vast amounts of data and thousands of dollars of compute time. Thanks to transfer learning you can now take an existing model (such as GPT2) and train something useful on top of it that’s specific to a new domain in just minutes or hours, with only a few hundred or a few thousand new labeled samples. # 7th February 2020, 8:47 am

Weeknotes: Shaving yaks for Datasette Cloud

I’ve been shaving a lot of yaks, but I’m finally ready for other people to start kicking the tires on the MVP of Datasette Cloud.

[... 392 words]