Simon Willison’s Weblog

Subscribe

Blogmarks

Filters: Sorted by date

GPSJam (via) John Wiseman’s “Daily maps of GPS interference” —a beautiful interactive globe (powered by Mapbox GL) which you can use to see points of heaviest GPS interference over a 24 hour period, using data collected from commercial airline radios by ADS-B Exchange. “From what I can tell the most common reason for aircraft GPS systems to have degraded accuracy is jamming by military systems. At least, the vast majority of aircraft that I see with bad GPS accuracy are flying near conflict zones where GPS jamming is known to occur.”

# 30th July 2022, 7:51 pm / gis, gps, mapping, john-wiseman

Introducing sqlite-lines—a SQLite extension for reading files line-by-line (via) Alex Garcia wrote a brilliant C module for SQLIte which adds functions (and a table-valued function) for efficiently reading newline-delimited text into SQLite. When combined with SQLite’s built-in JSON features this means you can read a huge newline-delimited JSON file into SQLite in a streaming fashion so it doesn’t exhaust memory for a large file. Alex also compiled the extension to WebAssembly, and his post here is an Observable notebook post that lets you exercise the code directly.

# 30th July 2022, 7:18 pm / json, sqlite, observable, webassembly, alex-garcia

Packaging Python Projects with pyproject.toml. I decided to finally figure out how packaging with pyproject.toml works—all of my existing projects use setup.py. The official tutorial from the Python Packaging Authority (PyPA) had everything I needed.

# 29th July 2022, 11:18 pm / packaging, python

Fastest way to turn HTML into text in Python (via) A light benchmark of the new-to-me selectolax Python library shows it performing extremely well for tasks such as extracting just the text from an HTML string, after first manipulating the DOM. selectolax is a Python binding over the Modest and Lexbor HTML parsing engines, which are written in no-outside-dependency C.

# 27th July 2022, 5:55 pm / html, python

SQLite Internals: Pages & B-trees (via) Ben Johnson provides a delightfully clear introduction to SQLite internals, describing the binary format used to store rows on disk and how SQLite uses 4KB pages for both row storage and for the b-trees used to look up records.

# 27th July 2022, 2:57 pm / algorithms, databases, sqlite, ben-johnson

Cosmopolitan: Compiling Python. Cosmopolitan is Justine Tunney’s “build-once run-anywhere C library”—part of the αcτµαlly pδrταblε εxεcµταblε effort, which produces wildly clever binary executable files that work on multiple different platforms, and is the secret sauce behind redbean. I hadn’t realized this was happening but there’s an active project to get Python to work as this format, producing a new way of running Python applications as standalone executables, only these ones have the potential to run unmodified on Windows, Linux and macOS.

# 26th July 2022, 8:43 pm / python, redbean, cosmopolitan, justine-tunney

viewport-preview (via) I built a tiny tool which lets you preview a URL in a bunch of different common browser viewport widths, using iframes.

# 26th July 2022, 12 am / css, iframes, mobile, projects, testing

Reduce Friction. Outstanding essay on software engineering friction and development team productivity by C J Silverio: it explains the concept of “friction” (and gives great definitions of “process”, “ceremony” and “formality” in the process) as it applies to software engineering, lays out the challenges involved in getting organizations to commit to reducing it and then provides actionable advice on how to get consensus and where to invest your efforts in order to make things better.

# 25th July 2022, 10:25 pm / software-engineering, management

Sqitch tutorial for SQLite (via) Sqitch is an interesting implementation of database migrations: it’s a command-line tool written in Perl with an interface similar to Git, providing commands to create, run, revert and track migration scripts. The scripts the selves are written as SQL in whichever database engine you are using. The tutorial for SQLite gives a good idea as to how the whole system works.

# 24th July 2022, 11:44 pm / databases, migrations, sqlite

You should take more screenshots (via) Alex Chan suggests saving screenshots of your work, since they may well last a lot longer than the projects themselves. I try to do that these days but I have SO many projects from the past that I didn’t capture in this way, and that I really regret not keeping a better visual record of.

# 24th July 2022, 9:03 pm / archiving

Promise Maps. Egbert Teeselink describes a neat JavaScript caching pattern: instead of caching key:value cache key:promise-that-resolves-to-value—doing this gives you dog piling prevention for free, because the first lookup of a value trigers the computation to fetch it while subsequent lookups wait on the same promise to resolve—or resolve instantly if the computation has completed.

# 22nd July 2022, 3:52 pm / dogpile, javascript

How John Wiseman tracks worldwide GPS interference (via) Part of the ADS-B signals broadcast by commercial aircraft include a measure of GPS accuracy. By collecting global data every day, John is able to generate a map of areas that are experiencing higher than expected GPS interference, which generally corresponds to military jamming technology. He sometimes posts the resulting maps on Twitter—he just picked up increasing jamming activity around Moscow.

# 20th July 2022, 11:45 pm / gps, john-wiseman

Visual Studio Code: Development Process (via) A detailed description of the development process used by VS Code: a 6-12 month high level roadmap, then month long iterations that each result in a new version that is shipped to users. Includes details of how the four weeks of each iteration are spent too.

# 20th July 2022, 4:34 pm / microsoft, software-engineering, vs-code

The Checkered Flag Diagram for visualizing SQL joins. I really like this alternative to Venn diagrams for showing the difference between different types of SQL join (left join, right join, cross join etc).

# 20th July 2022, 1:16 pm / sql

Soft Deletion Probably Isn’t Worth It. Brandur argues that soft deletion—where you delete records by populating a “is_deleted” or “deleted_at” column in your table—isn’t worth the additional complexity and risk it adds to other database queries. Instead, he suggests having a separate deleted records table which records the deleted data in a JSON blob—allowing you to review and recover it manually if necessary, and giving you an easy way to expire deleted records that have exceeded your retention policy.

# 19th July 2022, 8:40 pm / databases, brandur-leach

Datasette Discord community (via) I started a Discord chat community for Datasette. 57 people have joined up already!

# 15th July 2022, 3:17 am / datasette, discord

The DALL·E 2 Prompt Book (via) This is effectively DALL-E: The Missing Manual: an 81 page PDF book that goes into exhaustive detail about how to get the most out of DALL-E through creative prompt design.

# 14th July 2022, 11:26 pm / ai, openai, dalle, prompt-engineering, generative-ai

Bringing page transitions to the web (via) Jake Archibald’s 13 minute Google I/O talk demonstrating the page transitions API that’s now available in Chrome Canary. This is a fascinating piece of API design—it works by effectively creating a static image screenshot of the before and after states of the transition, then letting you define CSS animations that animate a transition between the two static images. By default the screenshot encompasses the full viewport, but you can instead define multiple elements within the page and apply separate transitions to them. It’s only available for SPAs right now but the final design should include support for multi-page applications as well—which means transitions with no JavaScript needed at all!

# 13th July 2022, 4:26 pm / css, javascript, google-io, jake-archibald, view-transitions

GPT-3 prompt for spotting nonsense questions (via) In response to complaints that GPT-3 will happily provide realistic sounding answers to nonsense questions, rictic recommends the following prompt:

I'll ask a series of questions. If the questions are nonsense, answer "yo be real", if they're a question about something that actually happened, answer them.

# 10th July 2022, 4:33 am / gpt-3, openai, prompt-engineering, generative-ai, llms

How to Temporarily Disable Face ID or Touch ID, and Require a Passcode to Unlock Your iPhone or iPad. Hold down the power and volume up buttons for a couple of seconds, and your iPhone will no longer allow you to use FaceID to unlock it without first entering your passcode.

# 6th July 2022, 5:38 pm / iphone, security

Helpful 404s for Jekyll (and GitHub Pages). Neat trick from Ben Balter: JavaScript that runs on your 404 page, fetches the sitemap.xml, parses all of the URLs out of it and then uses a levenshtein edit-distance comparison to find the closest URL to the one that you landed on and suggests that as a “Perhaps you’re looking for?”.

# 6th July 2022, 5:31 pm / javascript

Bun. “Bun is a fast all-in-one JavaScript runtime”—this is very interesting. It’s the first project I’ve seen written using the Zig language, which I see as somewhat equivalent to Rust. Bun provides a full Node.js-style JavaScript environment plus a host of packaged tools—an npm install client, a TypeScript transpiler, bundling tools—all wrapped up in a single binary. The JavaScript engine itself extends JavaScriptCore. Bun also ships with its own wrapper for SQLite.

# 6th July 2022, 5:24 pm / javascript, sqlite, npm, typescript, zig, bun

Defensive CSS (via) Fantastic new site by Ahmad Shadeed describing in detail CSS patterns which can help build layouts that adapt well to unexpected content—things like overly long titles or strange aspect ratio images, common when you are designing against UGC.

# 6th July 2022, 5:16 pm / css, frontend, ahmad-shadeed

The Magic Interview Question (via) Jeff Gothelf explains why “Tell me about the last time you [did something]” is the most valuable question you can ask when interviewing a user or potential user.

# 28th June 2022, 2:26 pm / usability

How Imagen Actually Works. Imagen is Google’s new text-to-image model, similar to (but possibly even more effective than) DALL-E. This article is the clearest explanation I’ve seen of how Imagen works: it uses Google’s existing T5 text encoder to convert the input sentence into an encoding that captures the semantic meaning of the sentence (including things like items being described as being on top of other items), then uses a trained diffusion model to generate a 64x64 image. That image is passed through two super-res models to increase the resolution to the final 1024x1024 output.

# 23rd June 2022, 6:05 pm / google, machine-learning, ai, dalle, generative-ai

The State of WebAssembly 2022. Colin Eberhardt talks through the results of the State of WebAssembly 2022 survey. Rust continues to dominate as the most popular language for working to WebAssembly, but Python has a notable increase of interest.

# 20th June 2022, 6:07 pm / rust, webassembly

WarcDB (via) Florents Tselai built this tool for loading web crawl data stored in WARC (Web ARChive) format into a SQLite database for smaller-scale analysis with SQL, on top of my sqlite-utils Python library.

# 19th June 2022, 6:08 pm / archiving, sqlite, sqlite-utils

Making Code Faster. Tim Bray’s detailed guide to using the Go profiler.

# 13th June 2022, 7:40 pm / go, profiling, tim-bray

The End of Localhost. swyx makes the argument for cloud-based development environments, and points out that many large companies—including Google, Facebook, Shopify and GitHub—have made the move already. I was responsible for the team maintaining the local development environment experience at Eventbrite for a while, and my conclusion is that with a large enough engineering team someone will ALWAYS find a new way to break their local environment: the idea of being able to bootstrap a fresh, guaranteed-to-work environment in the cloud at the click of a button could save SO much time and money.

# 8th June 2022, 6:09 pm / software-engineering, swyx

Announcing Pyston-lite: our Python JIT as an extension module (via) The Pyston JIT can now be installed in any Python 3.8 virtual environment by running “pip install pyston_lite_autoload”—which includes a hook to automatically inject the JIT. I just tried a very rough benchmark against Datasette (ab -n 1000 -c 10) and got 391.20 requests/second without the JIT compared to 404.10 request/second with it.

# 8th June 2022, 5:58 pm / jit, performance, python

Years

Tags