Blogmarks
Filters: Sorted by date
GOV.UK Guidance: Documenting APIs (via) Characteristically excellent guide from GOV.UK on writing great API documentation. “Task-based guidance helps users complete the most common integration tasks, based on the user needs from your research.”
Comby (via) Describes itself as “Structural search and replace for any language”. Lets you execute search and replace patterns that look a little bit like simplified regular expressions, but with some deep OCaml-powered magic that makes them aware of comment, string and nested parenthesis rules for different languages. This means you can use it to construct scripts that automate common refactoring or code upgrade tasks.
simonw/datasette-screenshots (via) I started a new GitHub repository to automate taking screenshots of Datasette for marketing purposes, using my shot-scraper browser automation tool.
Supercharging GitHub Actions with Job Summaries (via) GitHub Actions workflows can now generate a rendered Markdown summary of, well, anything that you can think to generate as part of the workflow execution. I particularly like the way this is designed: they provide a filename in a $GITHUB_STEP_SUMMARY environment variable which you can then append data to from each of your steps.
Heroku: Core Impact (via) Ex-Heroku engineer Brandur Leach pulls together some of the background information circulating concerning the now more than a month long Heroku security incident and provides some ex-insider commentary on what went right and what went wrong with a platform that left a huge, if somewhat underappreciated impact on the technology industry at large.
How Materialize and other databases optimize SQL subqueries. Jamie Brandon offers a survey of the state-of-the-art in optimizing correlated subqueries, across a number of different database engines.
Why Rust’s postfix await syntax is good (via) C J Silverio explains postfix await in Rust—where you can write a line like this, with the ? causing any errors to be caught and turned into an error return from your function:
let count = fetch_all_animals().await?.filter_for_hedgehogs().len();
sqlite-utils: a nice way to import data into SQLite for analysis (via) Julia Evans on my sqlite-utils Python library and CLI tool.
SIARD: Software Independent Archiving of Relational Databases (via) I hadn’t heard of this before but it looks really interesting: the Federal Archives of Switzerland developed a standard for archiving any relational database as a zip file full of XML which is “is used in over 50 countries around the globe”.
Simple declarative schema migration for SQLite (via) This is an interesting, clearly explained approach to the database migration problem. Create a new in-memory database and apply the current schema, then run some code to compare that with the previous schema—which tables are new, and which tables have had columns added. Then apply those changes.
I’d normally be cautious of running something like this because I can think of ways it could go wrong—but SQLite backups are so quick and cheap (just copy the file) that I could see this being a relatively risk-free way to apply migrations.
Web Scraping via Javascript Runtime Heap Snapshots (via) This is an absolutely brilliant scraping trick. Adrian Cooney figured out a way to use Puppeteer and the Chrome DevTools protocol to take a heap snapshot of all of the JavaScript running on a web page, then recursively crawl through the heap looking for any JavaScript objects that have a specified selection of properties. This allows him to scrape data from arbitrarily complex client-side web applications. He built a JavaScript library and command line tool that implements the pattern.
sqlite-utils 3.26.1 (via) I released sqlite-utils 3.36.1 with one tiny but exciting feature: I fixed its one dependency that wasn’t published as a pure Python wheel, which means it can now be used with Pyodide—Python compiled to WebAssembly running in your browser!
PyScript demos (via) PyScript was announced at PyCon this morning. It’s a new open source project that provides Web Components built on top of Pyodide, allowing you to use Python directly within your HTML pages in a way that is executed using a WebAssembly copy of Python running in your browser. These demos really help illustrate what it can do—it’s a fascinating new piece of the Python web ecosystem.
Testing Datasette parallel SQL queries in the nogil/python fork. As part of my ongoing research into whether Datasette can be sped up by running SQL queries in parallel I’ve been growing increasingly suspicious that the GIL is holding me back. I know the sqlite3 module releases the GIL and was hoping that would give me parallel queries, but it looks like there’s still a ton of work going on in Python GIL land creating Python objects representing the results of the query.
Sam Gross has been working on a nogil fork of Python and I decided to give it a go. It’s published as a Docker image and it turns out trying it out really did just take a few commands... and it produced the desired results, my parallel code started beating my serial code where previously the two had produced effectively the same performance numbers.
I’m pretty stunned by this. I had no idea how far along the nogil fork was. It’s amazing to see it in action.
HTML event handler attributes: down the rabbit hole
(via)
onclick="myfunction(event)" is an idiom for passing the click event to a function - but how does it work? It turns out the answer is buried deep in the HTML spec - the browser wraps that string of code in a function(event) { ... that string ... } function and makes the event available to its local scope that way.
Mac OS 8 emulated in WebAssembly (via) Absolutely incredible project by Mihai Parparita. This is a full, working copy of Mac OS 8 (from 1997) running in your browser via WebAssembly—and it’s fully loaded with games and applications too. I played with Photoshop 3.0 and Civilization and there’s so much more on there to explore too—I finally get to try out HyperCard!
Learn Go with tests. I really like this approach to learning a new language: start by learning to write tests (which gets you through hello world, environment setup and test running right from the beginning) and use them to explore the language. I also really like how modern Go development no longer depends on the GOPATH, which I always found really confusing.
jq language description (via) I love jq but I’ve always found it difficult to remember how to use it, and the manual hasn’t helped me as much as I would hope. It turns out the jq wiki on GitHub offers an alternative, more detailed description of the language which fits the way my brain works a lot better.
A tiny CI system (via) Christian Ştefănescu shares a recipe for building a tiny self-hosted CI system using Git and Redis. A post-receive hook runs when a commit is pushed to the repo and uses redis-cli to push jobs to a list. Then a separate bash script runs a loop with a blocking “redis-cli blpop jobs” operation which waits for new jobs and then executes the CI job as a shell script.
WebAIM guide to using iOS VoiceOver to evaluate web accessibility (via) I asked for pointers on learning to use VoiceOver on my iPhone for accessibility testing today and Matt Hobbs pointed me to this tutorial from the WebAIM group at Utah State University.
Web Components as Progressive Enhancement (via) I think this is a key aspect of Web Components I had been missing: since they default to rendering their contents, you can use them as a wrapper around regular HTML elements that can then be progressively enhanced once the JavaScript has loaded.
Glue code to quickly copy data from one Postgres table to another (via) The Python script that Retool used to migrate 4TB of data between two PostgreSQL databases. I find the structure of this script really interesting—it uses Python to spin up a queue full of ID ranges to be transferred and then starts some threads, but then each thread shells out to a command that runs “psql COPY (SELECT ...) TO STDOUT” and pipes the result to “psql COPY xxx FROM STDIN”. Clearly this works really well (“saturate the database’s hardware capacity” according to a comment on HN), and neatly sidesteps any issues with Python’s GIL.
Netlify Edge Functions: A new serverless runtime powered by Deno. You can now run Deno scripts directly in Netlify’s edge CDN—bundled as part of their default pricing plan. Interesting that they decided to host it on Deno’s Deno Deploy infrastructure. The hello world example is pleasingly succinct:
export default () => new Response(“Hello world”)
How to push tagged Docker releases to Google Artifact Registry with a GitHub Action. Ben Welsh’s writeup includes detailed step-by-step instructions for getting the mysterious “Workload Identity Federation” mechanism to work with GitHub Actions and Google Cloud. I’ve been dragging my heels on figuring this out for quite a while, so it’s great to see the steps described at this level of detail.
Litestream: Live Read Replication (via) The documentation for the read replication implemented in the latest Litestream beta (v0.4.0-beta.2). The design is really simple and clever: the primary runs a web server on a port, and replica instances can then be started with a configured URL pointing to the IP and port of the primary. That’s all it takes to have a SQLite database replicated to multiple hosts, each of which can then conduct read queries against their local copies.
Datasette for geospatial analysis (via) I added a new page to the Datasette website describing how Datasette can be used for geospatial analysis, pulling together several of the relevant plugins and tools from the Datasette ecosystem.
datasette-dashboards (via) Romain Clement’s datasette-dashboards plugin lets you configure dashboards for Datasette using YAML, combining markdown blocks, Vega graphs and single number metrics using a layout powered by CSS grids. This is a beautiful piece of software design, with a very compelling live demo.
WebAssembly in my Browser Desktop Environment (via) Dustin Brett built the WebAssembly demo to end all WebAssembly demos: his daedalOS browser desktop environment simulates a Windows-style operating system, and bundles WebAssembly projects that include v86 for 486 emulation, js-dos for DOS emulation to run Doom, BoxedWine to run Wine applications like Notepad++, Ruffle to emulate Flash, ffmpeg.wasm to power audio and video conversion, WASM-ImageMagick for image conversion, Pyodide for a Python shell and more besides that!
geoBoundaries. This looks useful: “The world’s largest open, free and research-ready database of political administrative boundaries.” Founded by the geoLab at William & Mary university, and released under a Creative Commons Attribution license that includes a requirement for a citation. File formats offered include shapefiles, GeoJSON and TopoJSON.
Deno by example (via) Interesting approach to documentation: a big list of annotated examples illustrating the Deno way of solving a bunch of common problems.