Simon Willison’s Weblog

Items in May, 2021


explain.dalibo.com (via) By far the best tool I’ve seen for turning the output of PostgreSQL EXPLAIN ANALYZE into something I can actually understand: it produces a tree visualization which includes clear explanations of what each step (such as an “Index Only Scan Node”) actually means. # 28th May 2021, 5:41 pm

M1RACLES: M1ssing Register Access Controls Leak EL0 State. You need to read (or at least scan) all the way to the bottom: this security disclosure is a masterpiece. It not only describes a real flaw in the M1 silicon but also deconstructs the whole culture of over-hyped name-branded vulnerability reports. The TLDR is that you don’t really need to worry about this one, and if you’re writing this kind of thing up for a news article you should read all the way to the end first! # 26th May 2021, 3:25 pm

Weeknotes: Spinning back up on Datasette

I’ve been somewhat distracted from Datasette for the past couple of months, thanks to my work on VIAL and the accompanying open source project django-sql-dashboard. This week I scraped back some time to work on Datasette.

[... 401 words]

HackSoft Django styleguide: services and selectors. HackSoft’s Django styleguide uses the terms “services” and “selectors”. Services are functions that live in services.py and perform business logic operations such as creating new entities that might span multiple Django models. Selectors live in selectors.py and perform more complex database read operations, such as returning objects in a way that respects visibility permissions. # 24th May 2021, 7:17 pm

How to look at the stack with gdb. Useful short tutorial on gdb from first principles. # 24th May 2021, 6:23 pm

Flat Data. New project from the GitHub OCTO (the Office of the CTO, love that backronym) somewhat inspired by my work on Git scraping: I’m really excited to see GitHub embracing git for CSV/JSON data in this way. Flat incorporates a reusable Action for scraping and storing data (using Deno), a VS Code extension for setting up those workflows and a very nicely designed Flat Viewer web app for browsing CSV and JSON data hosted on GitHub. # 19th May 2021, 1:05 am

Weeknotes: Velma, more Django SQL Dashboard

Matching locations for Vaccinate The States, fun with GeoJSON and more improvements to Django SQL Dashboard.

[... 555 words]

No feigning surprise (via) Don’t feign surprise if someone doesn’t know something that you think they should know. Even better: even if you are surprised, don’t let them know! “When people feign surprise, it’s usually to make them feel better about themselves and others feel worse.” # 17th May 2021, 4:30 pm

geocode-sqlite. Neat command-line Python utility by Chris Amico: point it at a SQLite database file and it will add latitude and longitude columns and populate them by geocoding one or more of the other fields, using your choice from four currently supported geocoders. # 17th May 2021, 1:15 am

Powering the Python Package Index in 2021. PyPI now serves “nearly 900 terabytes over more than 2 billion requests per day”. Bandwidth is donated by Fastly, a value estimated at 1.8 million dollars per month! Lots more detail about how PyPI has evolved over the past years in this post by Dustin Ingram. # 14th May 2021, 4:50 am

Folks think s3 is static assets hosting but really it’s a consistent and highly available key value store with first class blob support

Brian LeRoux # 13th May 2021, 3:01 pm

Async functions require an event loop to run. Flask, as a WSGI application, uses one worker to handle one request/response cycle. When a request comes in to an async view, Flask will start an event loop in a thread, run the view function there, then return the result.

Each request still ties up one worker, even for async views. The upside is that you can run async code within a view, for example to make multiple concurrent database queries, HTTP requests to an external API, etc. However, the number of requests your application can handle at one time will remain the same.

Using async and await in Flask 2.0 # 12th May 2021, 5:59 pm
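
The behavior described in that quote can be imitated with only the standard library. This is a rough sketch, not Flask's actual implementation: `fetch`, `view` and `handle_request` are hypothetical names, and `asyncio.sleep` stands in for a database query or HTTP call. It shows a worker blocking while an event loop in another thread runs an async view, with the two awaited operations overlapping inside that single request.

```python
import asyncio
import threading


async def fetch(name, delay):
    # Hypothetical stand-in for a database query or an external HTTP request.
    await asyncio.sleep(delay)
    return name


async def view():
    # Both operations run concurrently within the one request.
    return await asyncio.gather(fetch("users", 0.05), fetch("orders", 0.05))


def handle_request(async_view):
    # Assumption: this only imitates the described behavior. The worker starts
    # an event loop in a thread, runs the view there, and waits for the result.
    result = {}

    def runner():
        result["value"] = asyncio.run(async_view())

    thread = threading.Thread(target=runner)
    thread.start()
    thread.join()  # the worker stays tied up until the view finishes
    return result["value"]


if __name__ == "__main__":
    print(handle_request(view))
```

The `join()` call is the point of the second quoted paragraph: the view can do concurrent work internally, but the worker handling it is still occupied for the whole request.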

New Major Versions Released! Flask 2.0, Werkzeug 2.0, Jinja 3.0, Click 8.0, ItsDangerous 2.0, and MarkupSafe 2.0. Huge set of releases from the Pallets team. Python 3.6+ required and comprehensive type annotations. Flask now supports async views, Jinja async templates (used extensively by Datasette) “no longer requires patching”, Click has a bunch of new code around shell tab completion, ItsDangerous supports key rotation and so much more. # 12th May 2021, 5:37 pm

Django SQL Dashboard

I’ve released the first non-alpha version of Django SQL Dashboard, which provides an interface for running arbitrary read-only SQL queries directly against a PostgreSQL database, protected by the Django authentication scheme. It can also be used to create saved dashboards that can be published or shared internally.

[... 2171 words]

A museum bot (via) Shawn Graham built a Twitter bot, using R, which tweets out random items from the collection at the Canadian Science and Technology Museum—using a Datasette instance that he’s running based on a CSV export of their collections data. # 5th May 2021, 7:09 pm

cinder: Instagram’s performance oriented fork of CPython (via) Instagram forked CPython to add some performance-oriented features they wanted, including a method-at-a-time JIT compiler and a mechanism for eagerly evaluating coroutines (avoiding the overhead of creating a coroutine if an awaited function returns a value without itself needing to await). They’re open sourcing the code to help start conversations about implementing some of these features in CPython itself. I particularly enjoyed the warning that accompanies the repo: this is not intended to be a supported release, and if you decide to run it in production you are on your own! # 4th May 2021, 10:13 pm

Plot & Vega-Lite. Useful documentation comparing the brand new Observable Plot to Vega-Lite, complete with examples of how to achieve the same thing in both libraries. # 4th May 2021, 4:32 pm

Observable Plot (via) This is huge: a brand new high-level JavaScript visualization library from Mike Bostock, the author of D3—partially inspired by Vega-Lite which I’ve used enthusiastically in the past. First impressions are that this is a big step forward for quickly building high-quality visualizations. It’s released under the ISC license which is “functionally equivalent to the BSD 2-Clause and MIT licenses”. # 4th May 2021, 4:28 pm

Practical SQL for Data Analysis (via) This is a really great SQL tutorial: it starts with the basics, but quickly moves on to a whole array of advanced PostgreSQL techniques—CTEs, window functions, efficient sampling, rollups, pivot tables and even linear regressions executed directly in the database using regr_slope(), regr_intercept() and regr_r2(). I picked up a whole bunch of tips for things I didn’t know you could do with PostgreSQL here. # 4th May 2021, 3:11 am
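
For a sense of what those three regression aggregates compute, here is a minimal sketch of the textbook least-squares formulas in plain Python. The function name `linear_regression` is made up for illustration, and the code ignores details the database handles, such as NULL inputs or a zero-variance X column. Note that the PostgreSQL functions take their arguments in (Y, X) order.

```python
def linear_regression(xs, ys):
    # Returns (slope, intercept, r_squared) for the least-squares fit y = slope * x + intercept.
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Sums of squared deviations and cross-deviations from the means.
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    r_squared = (sxy * sxy) / (sxx * syy)
    return slope, intercept, r_squared


if __name__ == "__main__":
    print(linear_regression([1, 2, 3, 4], [3, 5, 7, 9]))
```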

Adding GeoDjango to an existing Django project

Work on VIAL for Vaccinate The States continues.

[... 1503 words]

Hosting SQLite databases on Github Pages (via) I’ve seen the trick of running SQLite compiled to WASM in the browser before, but this comes with an incredibly clever bonus trick: it uses SQLite’s page structure to fetch subsets of the database file via HTTP range requests, which means you can run indexed SQL queries against a 600MB database file while only fetching a few MBs of data over the wire. Absolutely brilliant. Tucked away at the end of the post is another neat trick: making the browser DOM available to SQLite as a virtual table, so you can query and update the DOM of the current page using SQL! # 2nd May 2021, 6:55 pm
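
The core arithmetic behind the range-request trick can be sketched in a few lines. This is an illustration under stated assumptions, not the linked project's code: the helper names are hypothetical, and the 4096-byte page size is an assumed default (a real client would read the page size from the database header). SQLite pages are numbered from 1, so page N covers one fixed-size slice of the file.

```python
from typing import Dict, Tuple


def page_byte_range(page_number: int, page_size: int = 4096) -> Tuple[int, int]:
    # SQLite numbers pages starting at 1; page N starts at byte (N - 1) * page_size.
    if page_number < 1:
        raise ValueError("SQLite page numbers start at 1")
    start = (page_number - 1) * page_size
    end = start + page_size - 1  # HTTP byte ranges are inclusive at both ends
    return start, end


def range_header(page_number: int, page_size: int = 4096) -> Dict[str, str]:
    # Build the HTTP header that asks the server for just this one page.
    start, end = page_byte_range(page_number, page_size)
    return {"Range": f"bytes={start}-{end}"}


if __name__ == "__main__":
    print(range_header(3, page_size=1024))
```

An indexed query only needs the handful of pages on the path through the index and the matching table rows, which is why so little of the file has to cross the wire.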

One year of TILs

Just over a year ago I started tracking TILs, inspired by Josh Branchaud’s collection. I’ve since published 148 TILs across 43 different topics. It’s a great format!

[... 224 words]

Query Engines: Push vs. Pull (via) Justin Jaffray (who has worked on Materialize) explains the difference between push and pull query execution engines using some really clear examples built around JavaScript generators. # 2nd May 2021, 2:49 am
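
The same distinction can be sketched with Python generators and callbacks. This is a toy illustration rather than the article's code, and every function name here is made up. In the pull version the consumer drives execution by asking each operator for its next row; in the push version the scan drives execution by handing each row to the operator downstream.

```python
def scan(rows):
    # Pull: yields rows only when the consumer asks for the next one.
    for row in rows:
        yield row


def filter_pull(source, predicate):
    # Pull: requests rows from its source and passes on the matching ones.
    for row in source:
        if predicate(row):
            yield row


def run_pull(rows):
    return list(filter_pull(scan(rows), lambda r: r % 2 == 0))


def scan_push(rows, emit):
    # Push: the scan drives the loop and sends every row downstream.
    for row in rows:
        emit(row)


def filter_push(predicate, emit):
    # Push: returns a callback that forwards matching rows to the next operator.
    def on_row(row):
        if predicate(row):
            emit(row)

    return on_row


def run_push(rows):
    out = []
    scan_push(rows, filter_push(lambda r: r % 2 == 0, out.append))
    return out


if __name__ == "__main__":
    print(run_pull([1, 2, 3, 4]), run_push([1, 2, 3, 4]))
```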