Simon Willison’s Weblog

Subscribe
Atom feed

Blogmarks

Filters: Sorted by date

Cleaning Up Your Postgres Database (via) Craig Kerstiens provides some invaluable tips on running an initial check of the health of a PostgreSQL database, by using queries against the pg_statio_user_indexes table to find the memory cache hit ratio and the pg_stat_user_tables table to see what percentage of queries to your tables are using an index.

# 3rd February 2021, 7:32 am / databases, performance, postgresql, craig-kerstiens

JMeter Result Analysis using Datasette (via) NaveenKumar Namachivayam wrote a detailed tutorial on using Datasette (on Windows) and csvs-to-sqlite to analyze the results of JMeter performance test runs and then publish them online using Vercel.

# 1st February 2021, 4:42 am / tutorials, datasette

Making GitHub’s new homepage fast and performant. A couple of really clever tricks in this article by Tobias Ahlin. The first is using IntersectionObserver in conjunction with the video preload=“none” attribute to lazily load a video when it scrolls into view. The second is an ingenious trick to create an efficiently encoded transparent JPEG image: embed the image in a SVG file twice, once as the image and once as a transparency mask.

# 29th January 2021, 7:05 pm / github, images, javascript, performance, svg

Culture is the Behavior You Reward and Punish (via) Jocelyn Goldfein describes an intriguing exercise for discovering your company culture: imagine a new hire asking for advice on what makes people successful there, and use that to review what behavior is rewarded and discouraged.

# 12th January 2021, 6:09 am

datasette-css-properties (via) My new Datasette plugin defines a “.css” output format which returns the data from the query as a valid CSS stylesheet defining custom properties for each returned column. This means you can build a page using just HTML and CSS that consumes API data from Datasette, no JavaScript required! Whether this is a good idea or not is left as an exercise for the reader.

# 7th January 2021, 7:42 pm / css, projects, datasette, css-custom-properties

Custom Properties as State. Fascinating thought experiment by Chris Coyier: since CSS custom properties can be defined in an external stylesheet, we can APIs that return stylesheets defining dynamically server-side generated CSS values for things like time-of-day colour schemes or even strings that can be inserted using ::after { content: var(--my-property).

This gave me a very eccentric idea for a Datasette plugin...

# 7th January 2021, 7:39 pm / apis, css, css-custom-properties

brumm.af/shadows (via) I did not know this trick: by defining multiple box-shadow values as a comma separated list you can create much more finely tuned shadow effects. This tool by Philipp Brumm provides a very smart UI for designing shadows.

# 6th January 2021, 4:12 pm / css, design

DALL·E: Creating Images from Text (via) “DALL·E is a 12-billion parameter version of GPT-3 trained to generate images from text descriptions, using a dataset of text–image pairs.”. The examples in this paper are astonishing—“an illustration of a baby daikon radish in a tutu walking a dog” generates exactly that.

# 5th January 2021, 8:31 pm / machine-learning, ai, openai, dalle, generative-ai

Everything You Always Wanted To Know About GitHub (But Were Afraid To Ask) (via) ClickHouse by Yandex is an open source column-oriented data warehouse, designed to run analytical queries against TBs of data. They've loaded the full GitHub Archive of events since 2011 into a public instance, which is a great way of both exploring GitHub activity and trying out ClickHouse. Here's a query I just ran that shows number of watch events per year, for example:

SELECT toYear(created_at) as yyyy, count()
FROM github_events
WHERE event_type = 'WatchEvent' group by yyyy

# 5th January 2021, 1:02 am / analytics, github, sql, big-data, clickhouse

hooks-in-a-nutshell.js (via) Neat, heavily annotated implementation of React-style hooks in pure JavaScript, really useful for understanding how they work.

# 4th January 2021, 6:36 pm / javascript, react

sqlite-utils 3.2 (via) As discussed in my weeknotes yesterday, this is the release of sqlite-utils that adds the new “cached table counts via triggers” mechanism.

# 3rd January 2021, 9:25 pm / projects, sqlite, sqlite-utils

Replicating SQLite with rqlite (via) I’ve been trying out rqlite, a “lightweight, distributed relational database, which uses SQLite as its storage engine”. It’s written in Go and uses the Raft consensus algorithm to allow a cluster of nodes to elect a leader and replicate SQLite statements between them. By default it uses in-memory SQLite databases with an on-disk Raft replication log—here are my notes on running it in “on disk” mode as a way to run multiple Datasette processes against replicated SQLite database files.

# 28th December 2020, 7:51 pm / replication, sqlite, datasette

How Shopify Uses WebAssembly Outside of the Browser (via) I’m fascinated by applications of WebAssembly outside the browser. As a Python programmer I’m excited to see native code libraries getting compiled to WASM in a way that lets me call them from Python code via a bridge, but the other interesting application is executing untrusted code in a sandbox.

Shopify are doing exactly that—they are building a kind-of plugin mechanism where partner code compiled to WASM runs inside their architecture using Fastly’s Lucet. The performance numbers are in the same ballpark as native code.

Also interesting: they’re recommending AssemblyScript, a TypeScript-style language designed to compile directly to WASM without needing any additional interpreter support, as required by dynamic languages such as JavaScript, Python or Ruby.

# 19th December 2020, 4:46 pm / performance, security, webassembly

Commits are snapshots, not diffs (via) Useful, clearly explained revision of some Git fundamentals.

# 17th December 2020, 10:01 pm / git, github

datasette.io (via) Datasette finally has an official project website, three years after the first release of the software. I built it using Datasette, with custom templates to define the various pages. The site includes news, latest releases, example sites and a new searchable plugin directory.

# 11th December 2020, 4:11 am / projects, datasette

Deno 1.6 Release Notes. Two signature features in Deno 1.6 worth paying attention to: a built-in language server for code editors like VS Code, and the “deno compile” command which can build Deno JavaScript/TypeScript projects into standalone binaries. The ability to build binaries has turned out to be a killer feature of both Go and Rust, so seeing it ship as a default capability of a interpreted dynamic language is fascinating. I would love it if Python followed Deno’s example.

# 10th December 2020, 1:25 am / javascript, deno

The case against client certificates (via) Colm MacCárthaigh provides a passionately argued Twitter thread about client certificates and why they should be avoided. I tried using them as an extra layer of protection fir my personal Dogsheep server and ended up abandoning them—certificate management across my devices was too fiddly.

# 9th December 2020, 2:41 pm / certificates, dogsheep

Cameras and Lenses (via) Fabulous explotable interactive essay by Bartosz Ciechanowski explaining how cameras and lenses work.

# 8th December 2020, 3:38 am / photography, explorables

The secrets of Monkey Island’s source code (via) To celebrate the thirty year anniversary of the Secret of Monkey Island the Video Game History Foundation interviewed developer Rod Gilbert and produced this comprehensive collection of cut content and material showing how the game was originally constructed.

# 5th December 2020, 4:32 pm / game-design

Command Line Interface Guidelines (via) Aanand Prasad, Ben Firshman, Carl Tashian and Eva Parish provide the missing manual for designing CLI tools in 2020. Deeply researched and clearly presented—I picked up a bunch of useful tips and ideas from reading this, and I’m looking forward to applying them to my own CLI projects.

# 4th December 2020, 8:44 pm / cli, design, usability

Scaling Datastores at Slack with Vitess (via) Slack spent three years migrating 99% of their MySQL query load to run against Vitess, the open source MySQL sharding system originally built by YouTube. “Today, we serve 2.3 million QPS at peak. 2M of those queries are reads and 300K are writes. Our median query latency is 2 ms, and our p99 query latency is 11 ms.”

# 1st December 2020, 9:30 pm / mysql, scaling, sharding, youtube, slack, vitess

New for AWS Lambda – Container Image Support. “You can now package and deploy Lambda functions as container images of up to 10 GB in size”—can’t wait to try this out with Datasette.

# 1st December 2020, 5:34 pm / aws, lambda, docker, datasette

Why is Apple’s M1 Chip So Fast? (via) This explanation by Erik Engheim is exactly the right level of nerdery for me.

# 30th November 2020, 11:20 pm / apple

Datasette Weekly volume 4. Datasette Office Hours, Personal Data Warehouses, datasette-ripgrep, datasette-indieauth, Datasette Client for Observable and more.

# 30th November 2020, 11:02 pm / datasette

Datasette Office Hours (via) I’ve decided to try running Datasette Office Hours every Friday. If you’d like to chat with me over Zoom about the project for 20 minutes for any reason at all you can grab a slot here using Calendly.

# 30th November 2020, 11:01 pm / datasette

Datasette 0.52. A relatively small release—it has a new plugin hook (database_actions(), for adding links to a new database actions menu), renames the --config option to --setting and adds a new “datasette publish cloudrun --apt-get-install” option.

# 29th November 2020, 12:56 am / projects, datasette, cloudrun

Unravelling `not` in Python (via) Part of a series where Brett Cannon looks at how fundamental Python syntactic sugar works, including a clearly explained dive into the underlying op codes and C implementation.

# 27th November 2020, 5:59 pm / c, python, brett-cannon

Datasette Client for Observable (via) Really elegant piece of code design from Alex Garcia: DatasetteClient is a client library he built designed to work in Observable notebooks, which uses JavaScript tagged template literals to allow SQL query results to be executed against a Datasette instance and displayed as inline tables in a notebook, or used to return JSON data for further processing. His example notebook includes a neat d3 stacked area chart example built against a Datasette of congresspeople, plus examples using interactive widgets to update the Notebook.

# 24th November 2020, 6:53 pm / javascript, datasette, observable, alex-garcia

datasette-graphql 1.2 (via) A new release of the datasette-graphql plugin, fixing a minor security flaw: previous versions of the plugin could expose the schema (but not the actual data) of tables in databases that were otherwise protected by Datasette’s permission system.

# 21st November 2020, 10:21 pm / projects, security, graphql, datasette

Years

Tags