Simon Willison’s Weblog

Blogmarks

PyScript Updates: Bytecode Alliance, Pyodide, and MicroPython. Absolutely huge news about Python on the Web tucked into this announcement: Anaconda have managed to get a version of MicroPython compiled to WebAssembly running in the browser. Pyodide weighs in at around 6.5MB compressed, but the MicroPython build is just 303KB—the size of a large image. This makes Python in the web browser applicable to so many more potential areas.

# 9th November 2022, 10:26 pm / python, webassembly, pyodide

Semantic text search using embeddings. Example Python notebook from OpenAI demonstrating how to build a search engine using embeddings rather than straight up token matching. This is a fascinating way of implementing search, providing results that match the intent of the search (“delicious beans” for example) even if none of the keywords are actually present in the text.
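
The ranking step behind this kind of search can be sketched in a few lines. This is a minimal illustration, not the code from the OpenAI notebook: the three-dimensional vectors are hand-made stand-ins for real embeddings, and `cosine_similarity` and `search` are hypothetical helper names.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


def search(query_vector, documents, top_n=3):
    """Rank (text, vector) pairs by similarity to the query vector."""
    scored = [(cosine_similarity(query_vector, vec), text) for text, vec in documents]
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return scored[:top_n]


# Hand-made toy vectors for illustration only, NOT real model output.
documents = [
    ("Best baked legumes I have ever tasted", [0.9, 0.1, 0.0]),
    ("The packaging arrived damaged", [0.0, 0.2, 0.9]),
    ("Tasty chili, will buy again", [0.7, 0.3, 0.1]),
]

# Stand-in for the embedding of the query "delicious beans".
query = [0.8, 0.2, 0.0]

results = search(query, documents, top_n=2)
for score, text in results:
    print(f"{score:.3f}  {text}")
```

Neither of the two top results contains the words "delicious" or "beans"; they rank highly only because their vectors point in a similar direction to the query vector.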

# 9th November 2022, 7:57 pm / machine-learning, search, openai, embeddings

Inside the mind of a frontend developer: Hero section. Ahmad Shadeed provides a fascinating, hyper-detailed breakdown of his approach to implementing a “hero section” component using HTML and CSS, including notes on CSS grids and gradient backgrounds.

# 9th November 2022, 7:54 pm / css, ahmad-shadeed

Blessed.rs Crate List (via) Rust doesn’t have a very large standard library, so part of learning Rust is figuring out which of the third-party crates are the best for tackling common problems. This here is an opinionated guide to crates, which looks like it could be really useful.

# 7th November 2022, 7:25 pm / rust

GOV.UK: Rules for getting production access (via) Fascinating piece of internal documentation on GOV.UK describing their rules, procedures and granted permissions for their deployment and administrative ops roles.

# 5th November 2022, 6:25 pm / security, gov-uk

Nikodemus’ Guide to Mastodon (via) I’ve been reading a bunch of different Mastodon guides and this one had pretty much exactly the information I needed to see when I first started out.

# 5th November 2022, 4:18 am / mastodon

Don’t Read Off The Screen (via) Stuart Langridge provides a fantastic set of public speaking tips in a five minute lightning talk remix of Sunscreen. Watch with sound.

# 4th November 2022, 4:02 pm / speaking, stuart-langridge

RFC 7807: Problem Details for HTTP APIs (via) This RFC has been brewing for quite a while, and is currently in last call (ends 2022-11-03). I’m designing the JSON error messages for Datasette at the moment so this could not be more relevant for me.
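
For a sense of what the RFC asks for, here is a rough sketch of building a problem details object. The member names (`type`, `title`, `status`, `detail`, `instance`) and the `application/problem+json` media type come from RFC 7807; the `problem_details` helper, the example URL and the `table` extension member are assumptions made up for illustration, and this is not Datasette's actual error format.

```python
import json

# Media type defined by RFC 7807 for JSON problem details.
PROBLEM_JSON = "application/problem+json"


def problem_details(status, title, detail=None, type_uri="about:blank",
                    instance=None, **extensions):
    """Build a problem details dict; extension members are added alongside."""
    body = dict(extensions)
    # Core members are set last so extensions cannot overwrite them.
    body["type"] = type_uri
    body["title"] = title
    body["status"] = status
    if detail is not None:
        body["detail"] = detail
    if instance is not None:
        body["instance"] = instance
    return body


# Hypothetical error for a missing table; URL and "table" member are invented.
error = problem_details(
    status=404,
    title="Table not found",
    detail="No table named 'plants' exists in this database.",
    type_uri="https://example.com/problems/table-not-found",
    table="plants",
)

print(f"Content-Type: {PROBLEM_JSON}")
print(json.dumps(error, indent=2))
```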

# 1st November 2022, 3:15 am / errors, http, json, mark-nottingham, rfc, standards

mitsuhiko/insta (via) I asked for recommendations on Twitter for testing libraries in other languages that would give me the same level of delight that I get from pytest. Two people pointed me to insta by Armin Ronacher, a Rust testing framework for “snapshot testing” which automatically records reference values to your repository, so future tests can spot if they change.
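
The core idea of snapshot testing can be sketched in Python with the standard library. This is not insta's API: `assert_matches_snapshot` is a hypothetical helper, and the file naming is an assumption chosen for the example.

```python
import json
import tempfile
from pathlib import Path


def assert_matches_snapshot(value, snapshot_path):
    """Record the value on first run; compare against the record afterwards."""
    serialized = json.dumps(value, indent=2, sort_keys=True)
    path = Path(snapshot_path)
    if not path.exists():
        # First run: store the reference value so it can be committed.
        path.write_text(serialized)
        return "recorded"
    expected = path.read_text()
    if serialized != expected:
        raise AssertionError(
            f"Snapshot mismatch for {path.name}:\nexpected {expected}\ngot {serialized}"
        )
    return "matched"


with tempfile.TemporaryDirectory() as tmp:
    snap = Path(tmp) / "user.snap.json"
    first = assert_matches_snapshot({"name": "Ada", "id": 1}, snap)
    # Key order differs, but sort_keys makes the serialized form identical.
    second = assert_matches_snapshot({"id": 1, "name": "Ada"}, snap)
    try:
        assert_matches_snapshot({"name": "Ada", "id": 2}, snap)
        change_detected = False
    except AssertionError:
        change_detected = True

print(first, second, change_detected)
```

In a real project the snapshot file would live in the repository rather than a temporary directory, so a changed value shows up both as a failing test and as a reviewable diff.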

# 31st October 2022, 1:06 am / armin-ronacher, testing, rust, pytest

About the sqlite3 WASM/JS Subproject. SQLite now maintains an official WebAssembly build. It’s influenced by sql.js but is a fresh implementation with its own API design. It also supports the Origin-Private FileSystem (OPFS), a very new standard that allows websites to save and load files using a dedicated folder on the host machine, though it doesn’t yet have wide browser support.

# 28th October 2022, 11:05 pm / javascript, sqlite, webassembly

Welcome to hell, Elon (via) If you only read one thing about the Elon acquisition of Twitter, make it this, by Nilay Patel. Outstanding insights into what it actually takes to run a commercial social media service.

# 28th October 2022, 3:16 pm / moderation, social-media, twitter, nilay-patel

Leveraging ‘shot-scraper’ and creating image diffs. Üllar Seerme has a neat recipe for using shot-scraper and ImageMagick to create differential animations showing how a scraped web page has visually changed.

# 24th October 2022, 9:34 pm / imagemagick, github-actions, shot-scraper

The Commodordion (via) The Commodordion is “an 8-bit accordion primarily made of C64s, floppy disks, and gaffer tape” by Linus Åkesson. It’s absolutely beautiful.

# 21st October 2022, 11:36 pm / commodore, hacks, music

Simple, Fast, and Scalable Reverse Image Search Using Perceptual Hashes and DynamoDB. Christopher Bong provides a clear explanation of how perceptual hashes can be used to create a string representing the visual content of an image, such that similar images can be identified by calculating a hamming distance between those hashes. He then explains how they built a large-scale system for this at Canva on top of DynamoDB, by splitting those strings into smaller hash windows and using those for efficient bulk lookups of similar candidates.
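
A small sketch of the two ideas described here: Hamming distance between hashes, and splitting each hash into windows so candidates can be found by exact lookup. An in-memory dict stands in for DynamoDB, the 64-bit hex hashes are invented for the example, and the function names are hypothetical rather than taken from the Canva article.

```python
from collections import defaultdict


def hamming_distance(hash_a, hash_b):
    """Number of differing bits between two equal-length hex strings."""
    return bin(int(hash_a, 16) ^ int(hash_b, 16)).count("1")


def windows(hash_hex, size=4):
    """Split a hash into (position, chunk) windows of `size` hex characters."""
    return [(i, hash_hex[i:i + size]) for i in range(0, len(hash_hex), size)]


def build_index(hashes, size=4):
    """Map each (position, chunk) window to the images that contain it."""
    index = defaultdict(set)
    for name, image_hash in hashes.items():
        for key in windows(image_hash, size):
            index[key].add(name)
    return index


def find_similar(query, hashes, index, max_distance=3, size=4):
    """Collect images sharing any window, then filter by Hamming distance."""
    candidates = set()
    for key in windows(query, size):
        candidates |= index.get(key, set())
    return sorted(
        name for name in candidates
        if hamming_distance(query, hashes[name]) <= max_distance
    )


# Invented hashes for illustration; real ones would come from a perceptual hash.
hashes = {
    "cat.png": "f0e1d2c3b4a59687",
    "cat_resized.png": "f0e1d2c3b4a59686",
    "dog.png": "0123456789abcdef",
}
index = build_index(hashes)
matches = find_similar("f0e1d2c3b4a59487", hashes, index)
print(matches)
```

With four windows and a maximum distance of three bits, at least one window of any true match must be identical to the query's, so the exact-match lookup cannot miss it.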

# 19th October 2022, 3:04 pm / images, search

Chris Amico’s Python setup for 2022 (via) Homebrew to install pyenv, then pyenv to install specific Python versions. pipx and pipenv for package management. I need to habitually start using pyenv for everything.

# 18th October 2022, 2:04 pm / python, chris-amico

“You are GPT-3”. Genius piece of prompt design by Riley Goodside. “A long-form GPT-3 prompt for assisted question-answering with accurate arithmetic, string operations, and Wikipedia lookup. Generated IPython commands (in green) are pasted into IPython and output is pasted back into the prompt (no green).” Uses “Out[” as a stop sequence to ensure GPT-3 stops at each generated IPython prompt rather than inventing the output itself.

# 17th October 2022, 4:35 am / gpt-3, prompt-engineering, generative-ai, riley-goodside, llms

Half Moon Bay Pumpkin Festival traffic on Saturday 15th October 2022 (via) It’s the Half Moon Bay Pumpkin Festival this weekend... and its impact on the traffic between our little town of El Granada and Half Moon Bay—8 minutes drive away—is notorious. So I built a git scraper that archives estimated driving times from the Google Maps Navigation API, and used git-history to turn that scraped data into a SQLite database and visualize it on a chart.

# 16th October 2022, 3:56 am / projects, git-scraping, git-history, half-moon-bay

How to create a Python package in 2022 (via) Fantastic tutorial on modern Python packaging by Rodrigo Girão Serrão. I’ve been meaning to figure out Poetry for a while now and this gave me exactly the information I needed to start figuring it out. Great coverage of GitHub Actions, Tox and pre-commit as well.

# 15th October 2022, 10:10 pm / packaging, python, github-actions

Dumping the HTML of a page using shot-scraper. New in 1.0 is the “shot-scraper html URL” command, which outputs the HTML of a page once JavaScript has finished executing there. You can pass in additional custom JavaScript to run before the snapshot is taken, and you can also specify a CSS selector on the page to return just that fragment of HTML.

# 15th October 2022, 9:30 pm / shot-scraper

shot-scraper 1.0 (via) Only a minor release in terms of features, but I decided that I'm comfortable enough with the CLI design at this point that I'm ready to stamp a 1.0 on it and commit to not making backwards-incompatible changes (at least without shipping a 2.0 release, which I'd like to avoid if possible).

Full release notes:

# 15th October 2022, 9:28 pm / cli, projects, shot-scraper

How to implement a “dry run mode” for data imports in Django (via) Adam Johnson describes in detail a beautiful pattern for implementing a dry-run mode for a Django management command, by executing ORM calls inside an atomic() transaction block, showing a summary of changes that are made and then rolling the transaction back at the end.
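
The shape of the pattern can be shown without Django. This sketch swaps in the standard library's sqlite3 module for the Django ORM and atomic() block, so it illustrates the transaction idea rather than Adam Johnson's actual code; the `books` table and `import_rows` function are hypothetical.

```python
import sqlite3


def import_rows(conn, titles, dry_run=False):
    """Insert rows, summarize the change, and roll back when dry_run is set."""
    before = conn.execute("SELECT COUNT(*) FROM books").fetchone()[0]
    try:
        # sqlite3 implicitly opens a transaction before the first INSERT.
        conn.executemany(
            "INSERT INTO books (title) VALUES (?)", [(t,) for t in titles]
        )
        after = conn.execute("SELECT COUNT(*) FROM books").fetchone()[0]
    except Exception:
        conn.rollback()
        raise
    added = after - before
    if dry_run:
        # Undo everything: the summary was computed from real writes.
        conn.rollback()
        return f"DRY RUN: {added} row(s) would be added"
    conn.commit()
    return f"{added} row(s) added"


def count_books(conn):
    return conn.execute("SELECT COUNT(*) FROM books").fetchone()[0]


conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE books (id INTEGER PRIMARY KEY, title TEXT)")
conn.commit()

dry_summary = import_rows(conn, ["Dune", "Emma"], dry_run=True)
count_after_dry_run = count_books(conn)
real_summary = import_rows(conn, ["Dune", "Emma"])
count_after_real_run = count_books(conn)

print(dry_summary, count_after_dry_run)
print(real_summary, count_after_real_run)
```

Because the dry run executes the same statements as the real import, its summary reflects what would actually happen, including any constraint errors the database would raise.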

# 13th October 2022, 4:22 pm / django, transactions, adam-johnson

The AI that creates any picture you want, explained. Vox made this explainer video about text-to-image generative AI models back in June, months before Stable Diffusion was released and shortly before the DALL-E preview started rolling out to a wider audience. It’s a really good video—in particular the animation that explains at a high level how diffusion models work, which starts about 5m30s in.

# 10th October 2022, 3:28 am / ai, dalle, stable-diffusion, generative-ai, text-to-image

Reasons Why I Think 50% Coding 50% Marketing is the Best Framework for Solo Tech Founders (via) Jon Yongfook offers a deliciously simple recipe for splitting up the work of both developing and marketing a product: one week of development, then one week of marketing, then repeat. I really like this concept: I mix the two activities randomly at the moment and constantly find myself feeling guilty that I’m not spending enough focused time on either of them!

# 8th October 2022, 3:43 pm / entrepreneurship, marketing, startup

Can :has Connect 4? (via) Spectacular CSS demo by Jhey Tompkins, implementing a working 3D Connect 4 game using just CSS (brilliant trickery with the new :has() selector) and not a single line of JavaScript.

# 7th October 2022, 5:49 pm / css

Stringing together several free tiers to host an application with zero cost using fly.io, Litestream and Cloudflare. Alexander Dahl provides a detailed description (and code) for his current preferred free hosting solution for small sites: SQLite (and a Go application) running on Fly’s free tier, with the database replicated up to Cloudflare’s R2 object storage (again on a free tier) by Litestream.

# 7th October 2022, 5:47 pm / hosting, sqlite, cloudflare, fly, litestream

py2rs. Extremely useful document providing resources for learning Rust followed by an extensive collection of common Python tasks (building a list, opening a file, spawning a thread, running a simple web server) and their Rust equivalents.

# 7th October 2022, 5:44 pm / python, rust

Getting tabular data from unstructured text with GPT-3: an ongoing experiment (via) Roberto Rocha shows how to use a carefully designed prompt (with plenty of examples) to get GPT-3 to convert unstructured textual data into a structured table.

# 5th October 2022, 3:03 am / data-journalism, ai, gpt-3, openai, prompt-engineering, generative-ai, llms

The Illustrated Stable Diffusion (via) Jay Alammar provides a detailed, clearly explained description of how the Stable Diffusion image generation model actually works under the hood.

# 5th October 2022, 2:58 am / ai, stable-diffusion, generative-ai, text-to-image

libsql (via) A brand new Apache 2 licensed fork of SQLite. The README explains the rationale behind the project: SQLite itself is open source but not open contribution, and this fork aims to try out new ideas. The most interesting to me so far is a plan to support user defined functions implemented in WebAssembly. The project also intends to use Rust for new feature development.

# 4th October 2022, 4:13 pm / open-source, sqlite, rust, webassembly

mod_wasm: run WebAssembly with Apache (via) Brand new Apache module from a team at VMware: mod_wasm builds on top of wasmtime to let you write WebAssembly programs that are exposed to the world by Apache, using a mechanism that looks similar to old CGI scripts (headers passed in environment variables, request body sent to standard input). Includes a demo Docker image that runs using Python-compiled-to-WebAssembly.

# 4th October 2022, 12:53 am / apache, webassembly
