Simon Willison’s Weblog


json-flatten. A little Python library I wrote that attempts to flatten a JSON object into a set of key/value pairs suitable for transmitting in a query string or using to construct an HTML form. I first wrote this back in 2015 as a Gist—I’ve reconstructed the Gist commit history in a new repository and shipped it to PyPI. # 22nd June 2019, 4:51 am
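
A minimal sketch of the general idea, not the library's actual code: the dotted-path key scheme and the `flatten` name below are assumptions for illustration, and the real library may encode keys and types differently. This version turns every value into a string and silently drops empty dicts and lists.

```python
from urllib.parse import urlencode


def flatten(obj, prefix=""):
    """Flatten nested dicts/lists into a flat dict of string keys and string values.

    Keys are dotted paths (an assumed convention), e.g. {"a": {"b": 1}} -> {"a.b": "1"}.
    """
    pairs = {}
    if isinstance(obj, dict):
        for key, value in obj.items():
            child = f"{prefix}.{key}" if prefix else str(key)
            pairs.update(flatten(value, child))
    elif isinstance(obj, list):
        for index, value in enumerate(obj):
            child = f"{prefix}.{index}" if prefix else str(index)
            pairs.update(flatten(value, child))
    else:
        # Leaf value: None becomes an empty string, everything else is str()'d.
        pairs[prefix] = "" if obj is None else str(obj)
    return pairs


def to_query_string(obj):
    """Encode the flattened pairs as a URL query string."""
    return urlencode(flatten(obj))
```

Because every value ends up as a string, a round trip back to the original JSON would need some way to record types, which this sketch does not attempt.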

Announcing Envoy Mobile. This is a fascinating development: Lyft’s Envoy proxy / service mesh has been widely adopted across the industry as a server-side component for adding smart routing and observability to the network calls made between services in microservice architectures. “The reality is that three 9s at the server-side edge is meaningless if the user of a mobile application is only able to complete the desired product flows a fraction of the time”—so Lyft are building a C++ embedded library companion to Envoy which is designed to be shipped as part of iOS and Android client applications. “Envoy Mobile in conjunction with Envoy in the data center will provide the ability to reason about the entire distributed system network, not just the server-side portion.” Their decision to release an early working prototype and then conduct ongoing development entirely in the open is interesting too. # 18th June 2019, 6:42 pm

Toward a “Kernel Python” (via) Glyph makes a strong case for releasing a slimmed down “kernel” version of Python with the minimal possible standard library, and argues that the current standard library is proving impossible for a single core team to productively maintain. “If I wanted to update the colorsys module to be more modern—perhaps to have a Color object rather than a collection of free functions, perhaps to support integer color models—I’d likely have to wait 500 days, or more, for a review.” # 15th June 2019, 4 pm

When should you be using Web Workers? 85% of worldwide mobile devices are massively less performant than high end iPhones. Surma argues that we should be making aggressive use of Web Workers to keep as much of our JavaScript as possible off the main UI thread, to avoid freezing up the entire interface. # 15th June 2019, 4:31 am

Convert Locations.kml (pulled from an iPhone backup) to SQLite. I’ve been playing around with data from my iPhone using the iPhone Backup Extractor app and one of the things it exports for you is a Locations.kml file full of location history data. I wrote a tiny script using Python’s ElementTree XMLPullParser to efficiently iterate through the Placemarks and yield them as dictionaries, which I then batch-inserted into sqlite-utils to create a SQLite database. # 14th June 2019, 12:45 am
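
Here is a rough sketch of that XMLPullParser pattern, not the script itself. The element names it reads (`name`, `TimeStamp/when`, `Point/coordinates`) and the `iter_placemarks` helper are assumptions about what a Locations.kml file might contain, so adjust them to match the real file.

```python
import xml.etree.ElementTree as ET

# Standard KML 2.2 namespace, in ElementTree's {uri}tag notation.
KML_NS = "{http://www.opengis.net/kml/2.2}"


def iter_placemarks(chunks):
    """Yield one dict per <Placemark>, feeding the parser a chunk at a time."""
    parser = ET.XMLPullParser(events=("end",))
    for chunk in chunks:
        parser.feed(chunk)
        for _event, elem in parser.read_events():
            if elem.tag != KML_NS + "Placemark":
                continue
            yield {
                "name": elem.findtext(KML_NS + "name"),
                "when": elem.findtext(f"{KML_NS}TimeStamp/{KML_NS}when"),
                "coordinates": elem.findtext(f"{KML_NS}Point/{KML_NS}coordinates"),
            }
            # Free the children of the element we just processed.
            elem.clear()
```

The dictionaries it yields can then be collected into batches and handed to whatever does the inserting.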

paginate-json (via) I released a fun tiny utility: paginate-json, which knows how to paginate through JSON APIs that use the HTTP Link header for pagination. I built it so I could pull data from the GitHub API and pipe it directly into SQLite via sqlite-utils. # 12th June 2019, 3:22 pm
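
The core trick can be sketched like this. It is an illustration rather than the tool's code: `next_url`, `paginate` and the `fetch` callable are hypothetical names, and the regular expression is deliberately simplified (it assumes `rel` is the first parameter after the URL and is quoted).

```python
import re

# Matches entries such as: <https://example.com/items?page=2>; rel="next"
LINK_RE = re.compile(r'<([^>]+)>\s*;\s*rel="([^"]+)"')


def next_url(link_header):
    """Return the URL marked rel="next" in a Link header, or None."""
    for url, rel in LINK_RE.findall(link_header or ""):
        if "next" in rel.split():
            return url
    return None


def paginate(fetch, url):
    """Yield items from every page, following rel="next" links.

    `fetch` is a caller-supplied function: fetch(url) -> (items, link_header).
    """
    while url:
        items, link_header = fetch(url)
        yield from items
        url = next_url(link_header)
```

Passing `fetch` in as a parameter keeps the pagination logic separate from the HTTP client, which also makes it easy to try out against canned responses.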

Serverless Microservice Patterns for AWS (via) A handy collection of 19 architectural patterns for AWS Lambda collected by Jeremy Daly. # 12th June 2019, 12:13 am

datasette-render-binary (via) Yet another tiny Datasette plugin. This one attempts to render binary data in a slightly more readable fashion—it shows ASCII characters as they are, and shows all other data as monospace octets. Useful as a tool for exploring new unfamiliar databases as it makes it easier to spot if a binary column may contain a decipherable binary format. # 9th June 2019, 4:22 pm
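
The rendering rule is easy to sketch. This is not the plugin's code: the `render_binary` name and the `[xx]` notation for non-ASCII bytes are my own assumptions for illustration.

```python
def render_binary(data):
    """Show printable ASCII bytes as characters and every other byte as [hex]."""
    parts = []
    for byte in data:  # iterating over bytes yields integers 0-255
        if 32 <= byte < 127:
            parts.append(chr(byte))
        else:
            parts.append(f"[{byte:02x}]")
    return "".join(parts)
```

A value that starts with `bplist00` followed by a run of bracketed octets, for example, is a strong hint that the column holds binary plists.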

datasette-bplist (via) It turns out an OS X laptop is positively crammed with SQLite databases, and many of them contain values that are data structures encoded using Apple’s binary plist format. datasette-bplist is my new plugin to help explore those files: it provides a display hook for rendering their contents, and a custom bplist_to_json() SQL function which can be used to extract and query information that is embedded in those values. The README includes tips on how to pull interesting EXIF data out of the SQLite database that sits behind Apple Photos. # 9th June 2019, 1:26 am
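
A custom SQL function along these lines can be sketched with just the standard library. This is a stand-in for illustration, not the plugin's source: it registers a function on a plain `sqlite3` connection, and the error handling and `default=repr` fallback are assumptions.

```python
import json
import plistlib
import sqlite3


def bplist_to_json(value):
    """Decode a binary plist blob and return it as a JSON string, else None."""
    if not isinstance(value, bytes) or not value.startswith(b"bplist00"):
        return None
    try:
        decoded = plistlib.loads(value)
    except Exception:
        return None
    # Plists can hold dates and raw bytes, which JSON cannot; fall back to repr().
    return json.dumps(decoded, default=repr)


conn = sqlite3.connect(":memory:")
# Expose the Python function to SQL as a one-argument function.
conn.create_function("bplist_to_json", 1, bplist_to_json)
```

With that registered, a query such as `select bplist_to_json(some_column) from some_table` returns JSON text for plist values and NULL for everything else (the table and column names here are placeholders).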

Friday wins and a case study in ritual design. “Culture is what you celebrate. Rituals are the tools you use to shape culture.” # 8th June 2019, 6:14 pm

Los Angeles Weedmaps analysis (via) Ben Welsh at the LA Times published this Jupyter notebook showing the full working behind a story they published about LA’s black market weed dispensaries. I picked up several useful tricks from it—including how to load points into a geopandas GeoDataFrame (in epsg:4326 aka WGS 84) and how to then join that against the LA Times neighborhoods GeoJSON boundaries file. # 30th May 2019, 4:35 am

Building a stateless API proxy (via) This is a really clever idea. The GitHub API is infuriatingly coarsely grained with its permissions: you often end up having to create a token with way more permissions than you actually need for your project. Thea Flowers proposes running your own proxy in front of their API that adds more finely grained permissions, based on custom encrypted proxy API tokens that use JWT to encode the original API key along with the permissions you want to grant to that particular token (as a list of regular expressions matching paths on the underlying API). # 30th May 2019, 4:28 am
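
The permission check at the heart of this could look something like the sketch below. It covers only the path-matching step and assumes the token has already been decrypted and verified; the `is_allowed` helper and the shape of the `claims` dictionary are hypothetical, not taken from the article.

```python
import re


def is_allowed(path, allowed_patterns):
    """True if the requested API path fully matches any allowed pattern."""
    return any(re.fullmatch(pattern, path) for pattern in allowed_patterns)


# Hypothetical payload of a decoded proxy token.
claims = {
    "allowed_paths": [
        r"/repos/example/demo/issues(/\d+)?",
        r"/user",
    ],
}
```

Using `re.fullmatch` rather than `re.match` matters here: a prefix match would let `/user` also grant access to everything beneath `/user/`.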

datasette-jq (via) I released another tiny Datasette plugin: datasette-jq registers a single custom SQL function, jq(), which lets you execute the jq expression language against a JSON column (or literal value) to filter and transform the JSON data. The README includes a link to a live demo—it’s a neat way to play with the jq micro-language. # 30th May 2019, 1:52 am

Falsehoods Programmers Believe About Search (via) These are great. “When you find the boolean operator ‘OR’, you always know it doesn’t mean Oregon”. # 29th May 2019, 8:09 pm

gls: Goroutine local storage (via) Go doesn’t provide a mechanism for having “goroutine local” variables (like threadlocals in Python but for goroutines), and the structure of the language makes it really hard to get something working. JT Olio figured out a truly legendary hack: Go’s introspection lets you see the current stack, so he figured out a way to encode a base-16 identifier tag into the call order of 16 special nested functions. I particularly like the “What are people saying?” section of the README: “Wow, that’s horrifying.” “This is the most terrible thing I have seen in a very long time.” “Where is it getting a context from? Is this serializing all the requests? What the heck is the client being bound to? What are these tags? Why does he need callers? Oh god no. No no no.” # 28th May 2019, 11:13 pm

Zdog (via) Well this is absolutely delightful: Zdog is a pseudo-3D engine for canvas and SVG that outputs 3D models rendered as super-stylish flat shapes. It’s hard to describe with words—go play with the demos! # 28th May 2019, 9:59 pm

Using dependabot to bump Django on my blog from 2.2 to 2.2.1 (via) GitHub recently acquired dependabot and made it free, and I decided to try it out on my blog. It’s a really neat piece of automation: it scans your requirements.txt (plus a number of other packaging definitions across several different languages), checks for updates to your dependencies and opens pull requests against any that it finds. Combine it with a CI service such as Circle CI and your tests will run automatically against the pull request, letting you know if it’s safe to merge. dependabot constantly rebases other changes against the pull request to try and ensure it will merge as cleanly as possible. # 27th May 2019, 1:24 am

sqlite-utils 1.0. I just released sqlite-utils 1.0, with a couple of handy new features over 0.14: it can now automatically add columns to a database table if you attempt to insert data which doesn’t quite fit (using alter=True in the Python API or the --alter option to the “sqlite-utils insert” command). It also has the ability to output nested JSON column values on the command-line using the new --json-cols option. This is the first project I’ve marked as a 1.0 release in a very long time—I’ll be sticking to semver for this project from now on, bumping the major version only in the case of a backwards incompatible change. # 25th May 2019, 1:20 am
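
To show what automatically adding columns means in practice, here is a sketch of the idea using only the standard library `sqlite3` module. It is not how sqlite-utils implements `alter=True`; the `insert_with_alter` helper is hypothetical, and it adds new columns without a declared type.

```python
import sqlite3


def insert_with_alter(conn, table, record):
    """Insert a dict as a row, first adding any columns the table lacks."""
    # PRAGMA table_info returns one row per column; index 1 is the column name.
    existing = {row[1] for row in conn.execute(f"PRAGMA table_info([{table}])")}
    for column in record:
        if column not in existing:
            conn.execute(f"ALTER TABLE [{table}] ADD COLUMN [{column}]")
    columns = ", ".join(f"[{column}]" for column in record)
    placeholders = ", ".join("?" for _ in record)
    conn.execute(
        f"INSERT INTO [{table}] ({columns}) VALUES ({placeholders})",
        list(record.values()),
    )
```

Rows inserted before the new column existed simply read back as NULL for it.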

WebAssembly at eBay: A Real-World Use Case (via) eBay used WebAssembly to run a C++ barcode reading library inside a web worker, passing images from the camera in order to provide a barcode scanning interface as part of their mobile web “add listing” page (a feature that had already proved successful in their native mobile apps). This is a great write-up, with lots of detail about how they compiled the library. They ended up running three barcode solutions in parallel web workers—two using WebAssembly, one in pure JavaScript—because their testing showed that racing between three implementations greatly increased the chance of a match due to how the different libraries handled poor quality or out-of-focus images. # 22nd May 2019, 8:30 pm
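
The racing pattern translates to other environments too. Here is a sketch using Python threads in place of web workers (a swapped-in technique, not what eBay shipped); the `race` helper and the convention that a decoder returns `None` when it finds nothing are assumptions.

```python
from concurrent.futures import ThreadPoolExecutor, as_completed


def race(decoders, image):
    """Run every decoder on the image in parallel; return the first real result.

    A decoder that raises or returns None counts as "no match".
    """
    with ThreadPoolExecutor(max_workers=len(decoders)) as pool:
        futures = [pool.submit(decoder, image) for decoder in decoders]
        for future in as_completed(futures):
            try:
                result = future.result()
            except Exception:
                continue
            if result is not None:
                # Note: leaving the with-block still waits for the slower
                # decoders to finish; they are not cancelled mid-run.
                return result
    return None
```

The benefit comes from the decoders failing on different inputs, so the combined hit rate is higher than that of any single one.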

Terrarium by Fastly Labs. Fastly have been investing heavily in WebAssembly, which makes sense as it provides an excellent option for a sandboxed environment for executing server-side code at the edge of their CDN offering. Terrarium is their “playground for experimenting with edge-side WebAssembly”—it lets you write a program in Rust, C, TypeScript or Wat (WebAssembly text format), compile it to WebAssembly and deploy it to a URL with a single button-click. It’s just a demo for the moment so deployments only persist for 15 minutes, but it’s a fascinating sandbox to play around with. # 21st May 2019, 8:51 pm

Monaco Editor. VS Code is MIT licensed and built on top of Electron. I thought “huh, I wonder if I could run the editor component embedded in a web app”—and it turns out Microsoft have already extracted out the code editor component into an open source JavaScript package called Monaco. Looks very slick, though sadly it’s not supported in mobile browsers. # 21st May 2019, 8:47 pm

Public Data Release of Stack Overflow’s 2019 Developer Survey. Here’s the Stack Overflow announcement of their developer survey public data release, which discusses the Glitch partnership and mentions Datasette. # 21st May 2019, 6:51 pm

Discover Insights in Developer Survey Results. Stack Overflow partnered with Glitch and used Datasette to host the full data set from Stack Overflow’s 2019 Developer Survey! # 21st May 2019, 6:50 pm

django-lifecycle (via) Interesting alternative to Django signals by Robert Singer. It provides a model mixin class which overrides the Django ORM’s save() method, tracking which model attributes have been changed. Then it lets you add methods to your model with a @hook decorator allowing you to specify things like “run this method before saving if the status changed” or “run this after an object has been deleted”. # 15th May 2019, 11:34 pm

Why I (Still) Love Tech: In Defense of a Difficult Industry (via) If you only read one longform piece this week, make it this one. Utterly delightful prose and a bunch of different messages that resonated with me deeply. # 15th May 2019, 3:45 pm

quicktype code generator for Python. Really interesting tool: give it an example JSON document and it will code-generate the equivalent set of Python classes (with type annotations) instantly in your browser. It also accepts input in JSON Schema or TypeScript and can generate code in 18 different languages. # 14th May 2019, 11:35 pm

Amazon’s Away Teams laid bare: How AWS’s hivemind of engineers develop and maintain their internal tech (via) Some interesting insights into how Amazon structure their engineering organization to maximize team productivity in a service-oriented environment. Two things that stood out to me: each service is owned by a “home team”, but sometimes features that are needed by other teams can be built by forming an “away team” to build out that functionality. Secondly, Amazon has a concept of “bar raisers” who are engineers across the organization who help approve key design and architectural decisions. It’s possible to go against the recommendation of a bar raiser but “such a move is noted and made visible to higher levels of management”. # 14th May 2019, 6:32 pm

asgi-cors (via) I’ve been trying out the new ASGI 3.0 spec and I just released my first piece of ASGI middleware: asgi-cors, which lets you wrap an ASGI application with Access-Control-Allow-Origin CORS headers (either “*” or dynamic headers based on an origin whitelist). # 7th May 2019, 12:12 am
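
For a sense of what ASGI 3.0 middleware of this kind involves, here is a minimal sketch. It is not the asgi-cors source: the `cors_wrapper` name and its `allowed_origins` parameter are hypothetical, and it only handles the Access-Control-Allow-Origin header.

```python
def cors_wrapper(app, allowed_origins=("*",)):
    """Wrap an ASGI 3.0 app so HTTP responses carry Access-Control-Allow-Origin."""

    async def wrapped(scope, receive, send):
        if scope["type"] != "http":
            # Pass websockets, lifespan events and so on straight through.
            await app(scope, receive, send)
            return

        # ASGI headers are a list of (name, value) byte-string pairs.
        origin = dict(scope.get("headers", [])).get(b"origin")
        if "*" in allowed_origins:
            allow = b"*"
        elif origin is not None and origin.decode("latin-1") in allowed_origins:
            allow = origin
        else:
            allow = None

        async def send_with_cors(message):
            if allow is not None and message["type"] == "http.response.start":
                headers = [
                    (name, value)
                    for name, value in message.get("headers", [])
                    if name.lower() != b"access-control-allow-origin"
                ]
                headers.append((b"access-control-allow-origin", allow))
                message = {**message, "headers": headers}
            await send(message)

        await app(scope, receive, send_with_cors)

    return wrapped
```

The pattern is the usual one for ASGI middleware: intercept `send`, adjust the `http.response.start` message, and leave everything else untouched.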

Want to see what one digital future for newspapers looks like? Look at The Guardian, which isn’t losing money anymore (via) After losing money every single year since 1998, the Guardian just managed to turn a profit! Detailed analysis of how they did it by Joshua Benton. # 2nd May 2019, 5:49 am

A Conspiracy To Kill IE6 (via) Cracking story by Chris Zacharias about how a team of engineers at YouTube back in 2009 took advantage of some exploits in YouTube’s organization structure (left over from their acquisition by Google) to ship a vague IE6 deprecation warning banner on one of the world’s highest traffic websites, inspiring many other similar banners and resulting in a 10% drop in global IE6 traffic. # 1st May 2019, 8:26 pm