Simon Willison’s Weblog

Subscribe
Atom feed

Blogmarks

Filters: Sorted by date

PugSQL. Interesting new twist on a definitely-not-an-ORM library for Python. With PugSQL you define SQL queries in files, give them names and then load them into a module which allows you to execute them as Python methods with keyword arguments. You can mark statements as only returning a single row (or a single scalar value) with a comment at the top of their file.

# 3rd July 2019, 6:19 pm / orm, python, sql, dan-mckinley

Choose Boring Technology (via) The definitive write-up of Dan McKinley’s presentation on why you should mostly use “boring” technology rather than going after the latest shiniest stack components. There’s so much accumulated wisdom in here. I particularly like how Dan owns up to having introduced Scala and MongoDB at Etsy before eventually helping remove them and go back to something less exciting and far more predictable. Also neat: the site is generated using Dan’s better-keynote-export tool which helps turn Keynote presentations into a flat web page with notes and images.

# 3rd July 2019, 6:09 pm / dan-mckinley, boring-technology

db-to-sqlite 1.0 release. I’ve released version 1.0 of my db-to-sqlite tool, which lets you create a SQLite database copy of any database supported by SQLAlchemy (I’ve tested it against MySQL and PostgreSQL). The tool has a bunch of new features: you can use --redact to redact specific columns, specify --table multiple times to copy a subset of tables, and the --all option now efficiently adds all foreign keys at the end of the import. The project now has unit tests which run against MySQL and PostgreSQL in Travis CI. Also included in the README: a shell one-liner for creating a local SQLite copy of a remote Heroku Postgres database based on extracting the connection string from a Heroku config environment variable.

# 1st July 2019, 1:35 am / mysql, postgresql, projects, sqlite, heroku, datasette

Feature Toggles (aka Feature Flags). I’m a huge fan of feature flags as a way of managing feature releases and keeping incomplete code in master as opposed to maintaining long-running branches. Recently I’ve found myself pointing people to this essay by Pete Hodgson—it’s a great overview of feature flags (here called toggles) and I particularly like how it splits them into four categories: Release Toggles, Experiment Toggles, Ops Toggles and Permissioning Toggles.

# 25th June 2019, 9:13 pm / feature-flags

json-flatten. A little Python library I wrote that attempts to flatten a JSON object into a set of key/value pairs suitable for transmitting in a query string or using to construct an HTML form. I first wrote this back in 2015 as a Gist—I’ve reconstructed the Gist commit history in a new repository and shipped it to PyPI.

# 22nd June 2019, 4:51 am / json, projects, python

Announcing Envoy Mobile. This is a fascinating development: Lyft’s Envoy proxy / service mesh has been widely adopted across the industry as a server-side component for adding smart routing and observability to the network calls made between services in microservice architectures. “The reality is that three 9s at the server-side edge is meaningless if the user of a mobile application is only able to complete the desired product flows a fraction of the time”—so Lyft are building a C++ embedded library companion to Envoy which is designed to be shipped as part of iOS and Android client applications. “Envoy Mobile in conjunction with Envoy in the data center will provide the ability to reason about the entire distributed system network, not just the server-side portion.” Their decision to release an early working prototype and then conduct ongoing development entirely in the open is interesting too.

# 18th June 2019, 6:42 pm / mobile, microservices

Toward a “Kernel Python” (via) Glyph makes a strong case for releasing a slimmed down “kernel” version of Python with the minimal possible standard library, and argues that the current standard library is proving impossible for a single core team to productively maintain. “If I wanted to update the colorsys module to be more modern—perhaps to have a Color object rather than a collection of free functions, perhaps to support integer color models—I’d likely have to wait 500 days, or more, for a review.”

# 15th June 2019, 4 pm / python, glyph

When should you be using Web Workers? 85% of worldwide mobile devices are massively less performant than high end iPhones. Surma argues that we should be making aggressive use of Web Workers to keep as much of our JavaScript as possible off the main UI thread, to avoid freezing up the entire interface.

# 15th June 2019, 4:31 am / javascript, mobile, webworkers, web-performance

Convert Locations.kml (pulled from an iPhone backup) to SQLite. I’ve been playing around with data from my iPhone using the iPhone Backup Extractor app and one of the things it exports for you is a Locations.kml file full of location history data. I wrote a tiny script using Python’s ElementTree XMLPullParser to efficiently iterate through the Placemarks and yield them as dictionaries, which I then batch-inserted into sqlite-utils to create a SQLite database.

# 14th June 2019, 12:45 am / kml, projects, sqlite, xml, sqlite-utils

paginate-json (via) I released a fun tiny utility: paginate-json, which knows how to paginate through JSON APIs that use the HTTP Link header for pagination. I built it so I could pull data from the GitHub API and pipe it directly into SQLite via sqlite-utils.

# 12th June 2019, 3:22 pm / json, projects, webapis, sqlite-utils

Serverless Microservice Patterns for AWS (via) A handy collection of 19 architectural patterns for AWS Lambda collected by Jeremy Daly.

# 12th June 2019, 12:13 am / aws, lambda, patterns, software-architecture

datasette-render-binary (via) Yet another tiny Datasette plugin. This one attempts to render binary data in a slightly more readable fashion—it shows ASCII characters as they are, and shows all other data as monospace octets. Useful as a tool for exploring new unfamiliar databases as it makes it easier to spot if a binary column may contain a decipherable binary format.

# 9th June 2019, 4:22 pm / projects, datasette

datasette-bplist (via) It turns out an OS X laptop is positively crammed with SQLite databases, and many of them contain values that are data structures encoded using Apple’s binary plist format. datasette-bplist is my new plugin to help explore those files: it provides a display hook for rendering their contents, and a custom bplist_to_json() SQL function which can be used to extract and query information that is embedded in those values. The README includes tips on how to pull interesting EXIF data out of the SQLite database that sits behind Apple Photos.

# 9th June 2019, 1:26 am / macos, projects, datasette

Friday wins and a case study in ritual design. “Culture is what you celebrate. Rituals are the tools you use to shape culture.”

# 8th June 2019, 6:14 pm / kellan-elliott-mccrea, management

Los Angeles Weedmaps analysis (via) Ben Welsh at the LA Times published this Jupyter notebook showing the full working behind a story they published about LA’s black market weed dispensaries. I picked up several useful tricks from it—including how to load points into a geopandas GeoDataFrame (in epsg:4326 aka WGS 84) and how to then join that against the LA Times neighborhoods GeoJSON boundaries file.

# 30th May 2019, 4:35 am / data-journalism, geospatial, latimes, pandas, jupyter, ben-welsh

Building a stateless API proxy (via) This is a really clever idea. The GitHub API is infuriatingly coarsely grained with its permissions: you often end up having to create a token with way more permissions than you actually need for your project. Thea Flowers proposes running your own proxy in front of their API that adds more finely grained permissions, based on custom encrypted proxy API tokens that use JWT to encode the original API key along with the permissions you want to grant to that particular token (as a list of regular expressions matching paths on the underlying API).

# 30th May 2019, 4:28 am / apis, encryption, github, proxies, security, jwt

datasette-jq (via) I released another tiny Datasette plugin: datasette-jq registers a single custom SQL function, jq(), which lets you execute the jq expression language against a JSON column (or literal value) to filter and transform the JSON data. The README includes a link to a live demo—it’s a neat way to play with the jq micro-language.

# 30th May 2019, 1:52 am / projects, datasette, jq

Falsehoods Programmers Believe About Search (via) These are great. “When you find the boolean operator ‘OR’, you always know it doesn’t mean Oregon”.

# 29th May 2019, 8:09 pm / search

gls: Goroutine local storage (via) Go doesn't provide a mechanism for having "goroutine local" variables (like threadlocals in Python but for goroutines), and the structure of the language makes it really hard to get something working. JT Olio figured out a truly legendary hack: Go's introspection lets you see the current stack, so he figured out a way to encode a base-16 identifer tag into the call order of 16 special nested functions.

I particularly like the "What are people saying?" section of the README:

"Wow, that's horrifying."

"This is the most terrible thing I have seen in a very long time."

"Where is it getting a context from? Is this serializing all the requests? What the heck is the client being bound to? What are these tags? Why does he need callers? Oh god no. No no no."

# 28th May 2019, 11:13 pm / go, hacks

Zdog (via) Well this is absolutely delightful: Zdog is a pseudo-3D engine for canvas and SVG that outputs 3D models rendered as super-stylish flat shapes. It’s hard to describe with words—go play with the demos!

# 28th May 2019, 9:59 pm / 3d, canvas

Using dependabot to bump Django on my blog from 2.2 to 2.2.1 (via) GitHub recently acquired dependabot and made it free, and I decided to try it out on my blog. It’s a really neat piece of automation: it scans your requirements.txt (plus a number of other packaging definitions across several different languages), checks for updates to your dependencies and opens pull requests against any that it finds. Combine it with a CI service such as Circle CI and your tests will run automatically against the pull request, letting you know if it’s safe to merge. dependabot constantly rebases other changes against the pull request to try and ensure it will merge as cleanly as possible.

# 27th May 2019, 1:24 am / django, github

sqlite-utils 1.0. I just released sqlite-utils 1.0, with a couple of handy new features over 0.14: it can now automatically add columns to a database table if you attempt to insert data which doesn’t quite fit (using alter=True in the Python API or the --alter option to the “sqlite-utils insert” command). It also has the ability to output nested JSON column values on the command-line using the new --json-cols option. This is the first project I’ve marked as a 1.0 release in a very long time—I’ll be sticking to semver for this project from now on, bumping the major version only in the case of a backwards incompatible change.

# 25th May 2019, 1:20 am / projects, semantic-versioning, sqlite, versioning, sqlite-utils

WebAssembly at eBay: A Real-World Use Case (via) eBay used WebAssembly to run a C++ barcode reading library inside a web worker, passing images from the camera in order to provide a barcode scanning interface as part of their mobile web “add listing” page (a feature that had already proved successful in their native mobile apps). This is a great write-up, with lots of detail about how they compiled the library. They ended up running three barcode solutions in parallel web workers—two using WebAssembly, one in pure JavaScript—because their testing showed that racing between three implementations greatly increased the chance of a match due to how the different libraries handled poor quality or out-of-focus images.

# 22nd May 2019, 8:30 pm / webworkers, webassembly

Terrarium by Fastly Labs. Fastly have been investing heavily in WebAssembly, which makes sense as it provides an excellent option for a sandboxed environment for executing server-side code at the edge of their CDN offering. Terrarium is their “playground for experimenting with edge-side WebAssembly”—it lets you write a program in Rust, C, TypeScript or Wat (WebAssembly text format), compile it to WebAssembly and deploy it to a URL with a single button-click. It’s just a demo for the moment so deployments only persist for 15 minutes, but it’s a fascinating sandbox to play around with.

# 21st May 2019, 8:51 pm / rust, webassembly, fastly

Monaco Editor. VS Code is MIT licensed and built on top of Electron. I thought “huh, I wonder if I could run the editor component embedded in a web app”—and it turns out Microsoft have already extracted out the code editor component into an open source JavaScript package called Monaco. Looks very slick, though sadly it’s not supported in mobile browsers.

# 21st May 2019, 8:47 pm / editor, javascript, microsoft, open-source, electron, vs-code

Public Data Release of Stack Overflow’s 2019 Developer Survey. Here’s the Stack Overflow announcement of their developer survey public data release, which discusses the Glitch partnership and mentions Datasette.

# 21st May 2019, 6:51 pm / glitch, stackoverflow, surveys, datasette

Discover Insights in Developer Survey Results. Stack Overflow partnered with Glitch and used Datasette to host the full data set from Stack Overflow’s 2019 Developer Survey!

# 21st May 2019, 6:50 pm / glitch, stackoverflow, surveys, datasette

django-lifecycle (via) Interesting alternative to Django signals by Robert Singer. It provides a model mixin class which over-rides the Django ORM’s save() method, tracking which model attributes have been changed. Then it lets you add methods to your model with a @hook annotation allowing you to specify things like “run this method before saving if the status changed” or “run this after an object has been deleted”.

# 15th May 2019, 11:34 pm / django

Why I (Still) Love Tech: In Defense of a Difficult Industry (via) If you only read one longform piece this week, make it this one. Utterly delightful prose and a bunch of different messages that resonated with me deeply.

# 15th May 2019, 3:45 pm / paul-ford

quicktype code generator for Python. Really interesting tool: give it an example JSON document and it will code-generate the equivalent set of Python classes (with type annotations) instantly in your browser. It also accepts input in JSON Schema or TypeScript and can generate code in 18 different languages.

# 14th May 2019, 11:35 pm / json, python, typescript

Years

Tags