Simon Willison’s Weblog

Items in Nov

Filters: Month: Nov ×


Twitter conversation about long-term pre-paid archival storage. I kicked off a conversation on Twitter yesterday about long-time archival storage of web content: “Anyone know of a web hosting provider where I can pay a lump sum of money to host a file at a reliable URL essentially forever? Is this even remotely feasible?”. The thread is really interesting—this is definitely an unsolved problem, and it’s clear that the challenge is more organizational (how do you create an entity that can keep this kind of promise—does it need to be some kind of foundation or trust?) than technical. # 5th November 2018, 6:50 pm

If you stop thinking in terms of MVC you might notice that at its core, React is a runtime for effectful functions that don’t execute “once”, but run continuously while being anchored to a call tree.

Dan Abramov # 3rd November 2018, 9:51 pm

Apple’s New Map (via) Map nerds rejoice! Justin O’Beirne had written another spectacularly illustrated essay about web cartography, this time examining the iOS 12 upgrade to Apple Maps in most of California and a little bit of Nevada. # 3rd November 2018, 9:28 pm

Every bitcoin proof-of-work mined is an incremental addition to a vast distributed summoning ritual powering the demon-soul at the heart of the maze, the computational equivalent of a Buddhist prayer wheel spinning in a Himalayan breeze.

Charles Stross # 3rd November 2018, 5:47 pm

Svelte RFC 0001: Reactive assignments (via) Svelte is a really interesting JavaScript framework: it offers a similar component-based developer experience to React but does it while delivering a tiny amount of code-to-the-browser thanks to being built entirely around a compiler that generates the minimum code necessary. In this RFC the Svelte team propose taking this approach even further, by generating code to accompany every relevant variable assignment that can trigger the corresponding view update. The document also has a very clear explanation of how React, Vue and current Svelte differ in their solutions to the challenge of updating the visible HTML view when the corresponding state changes. # 3rd November 2018, 4:58 pm

jantic/DeOldify (via) “A Deep Learning based project for colorizing and restoring old images”. Delightful (and well documented) project that uses a Self-Attention Generative Adversarial Network to colorize old black and white photos, with extremely impressive results. Built on an older version of the fastai library, and trained by running for several days on a 1080TI graphics card. # 2nd November 2018, 11:13 am

Among other things at Netflix the Mantis Query Language (MQL an SQL for streaming data) which ferries around approximately 2 trillion events every day for operational analysis (SPS alerting, quality of experience metrics, debugging production, etc) is written entirely in Clojure.

diab0lic on Hacker News # 1st November 2018, 2:52 am

Datasette: Ability to customize presentation of specific columns in HTML view. Still a work in progress, but Datasette master now allows you to inject links to one or more additional CSS and JavaScript resources (optionally with SRI hashes) which will be included on every page. Each template also now provides CSS classes on the body element derived from the current database and table names to provide hooks for custom styling. Next up: custom template support. # 30th November 2017, 7:27 am

Can I use... input type=color. TIL <input type=“color”> has reached 78.83% support globally already—biggest gap right now is Mobile Safari. # 29th November 2017, 9:56 pm

Interactive Workflows for C++ with Jupyter. Whoa, this really works... not just an interactive C++ REPL in a Jupyter notebook, but inline graph plotting support and interactive widgets as well. Scroll to the bottom of the article for Binder links which let you fire up an interactive C++ REPL in your browser and start interacting with it instantly. # 29th November 2017, 9:51 pm

Object models (via) Extremely comprehensive and readable discussion of the object models of Python, JavaScript, Lua and Perl 5. I learned something new about every one of those languages. # 29th November 2017, 2:59 pm

Big Data Workflow with Pandas and SQLite (via) Handy tutorial on dealing with larger data (in this case a 3.9GB CSV file) by incrementally loading it into pandas and writing it out to SQLite. # 28th November 2017, 11:02 pm

Boiling the Ocean, Incrementally—How Stylo Brought Rust and Servo to Firefox. Firefox Quantum is the product of an impressive, highly risky chain of software engineering—Rust, Servo, then Stylo. # 28th November 2017, 8:34 pm

For most books, the review is a bell-shaped curve of star ratings; this one has a peak at 1, a peak at 5, and very little in between. How could this be? I think it is because SICP is a very personal message that works only if the reader is at heart a computer scientist (or willing to become one).

Peter Norvig # 28th November 2017, 6:38 pm

Firefox Debugger Playground. Excellent hands-on tutorial to using the Firefox JavaScript debugger. I learned a bunch of neat tricks from this—including using conditional breakpoints to add temporary console.log statements—since that function returns undefined it won’t pause your code, but this saves you from having to remember to remove the lines from your source code later. I also didn’t know that the Firefox debugger can show the value of variables in paused code if you hover over them in the source pane. [UPDATE: Turns out Chrome DevTools do this as well—TIL] # 28th November 2017, 4:01 pm

The Best Request Is No Request, Revisited · An A List Apart Article. In HTTP/2 the rules have changed: serving unnecessary code as part of a larger bundle to avoid extra request overhead no longer makes sense. Splitting your code into many files and loading just the ones needed by the current page can knock seconds off your load time. # 28th November 2017, 3:50 pm

Inside Docker’s “FROM scratch” (via) I’m a big fan of understanding your abstractions. Here’s a neat tutorial that dives deep into Docker’s “scratch” image which offers the smallest possible Docker image, and hence provides a great opportunity to understand what a Docker container at its most minimal does for you. # 27th November 2017, 4:33 pm

A Complete CMS with No Server and 18 Lines of Code | Netlify. Slightly hyperbolic title, but there’s something really interesting going on here. Netlify is a CDN/hosting provider optimized for static site builders—it can hook up to a GitHub repository and build and deploy your site on every commit. Netlify CMS is their open-source CMS tool which works in a fascinating way: it’s a single page React app which stores structured content (as Markdown files with embedded key/value pairs) directly to your GitHub repository. Fire up Chrome DevTools and you can watch it using the GitHub API to construct new commits every time you hit “save”. # 26th November 2017, 5:53 pm

Many Small Queries Are Efficient In SQLite. Since SQLite runs in-process rather than being accessed over a network it avoids the per-query overhead of network round trips. This means that while MySQL or PostgreSQL applications need to avoid N+1 query patterns that create 100s of queries per request, SQLite apps can be designed differently: provided you hit indexes or small tables, 200 queries just means 200 extra cheap function calls. # 26th November 2017, 4:24 pm

SQLite Query Language: WITH clause. SQLite’s documentation on recursive CTEs starts out with some nice clear examples of tree traversal using a WITH statement, then gets into graphs, then goes way off the deep end with a Mandelbrot Set query and a query that can solve Soduku puzzles (“in less than 300 milliseconds on a modern workstation”). # 26th November 2017, 7:23 am

Added TSV example to the README · simonw/csvs-to-sqlite@957d4f5. Thanks to a pull request from Jani Monoses, csvs-to-sqlite can now handle TSV (or any other separator) as well as regular CSVs. # 26th November 2017, 7:02 am

New in Datasette: filters, foreign keys and search

I’ve released Datasette 0.13 with a number of exciting new features (Datasette previously).

[... 1143 words]

harelba/q (via) q is a neat command-line utility that lets you run SQL queries directly against CSV and TSV files. Internally it works by firing up an in-memory SQLite database, and as of the latest release (1.7.1) you can use the new --save-db-to-disk option to save that in-memory database to disk. # 25th November 2017, 5:49 pm

What is the plural of blitz? Wow, WordHippo is a straight up masterclass in keyword SEO tactics. Everything from the page URL to the keyword-crammed content to the enormous quantity of related links. # 25th November 2017, 5:42 pm

VoxelSpace (via) Lovely clear explanation of the voxel space landscape rendering technique used by NovaLogic for Comanche back in 1992, including a working JavaScript demo plus pseudo-code in Python. # 24th November 2017, 7:30 pm

“Just” makes me feel like an idiot. “Just” presumes I come from a specific background, studied certain courses in university, am fluent in certain technologies, and have read all the right books, articles, and resources. “Just” is a dangerous word.

Brad Frost # 24th November 2017, 5:07 pm

TLDR pages. This is an absurdly good idea: a community maintained set of alternative man pages for common commands with a focus on usage examples, plus a “tldr netstat” command to see them. The man pages themselves are maintained on GitHub. # 24th November 2017, 5:38 am

Return of the Obra Dinn: Dithering Process (via) Lucas Pope (creator of “Papers, Please”) has a new game under development: “Return of the Obra Dinn”, a first-person adventure mystery game set in 1807 that is spectacularly rendered in a 1-bit art style. He has a development diary on tigsource.com, and in this entry he describes the extreme lengths he has gone to in order to develop the best possible dithering implementation for rendering his 3D world in 1-bit colour. “It feels a little weird to put 100 hours into something that won’t be noticed by its absence.” # 23rd November 2017, 9:21 pm

How a single PostgreSQL config change improved slow query performance by 50x. “If you are using SSDs and running PostgreSQL with default configuration, I encourage you to try tuning random_page_cost & seq_page_cost. You might be surprised by some huge performance improvements.” # 23rd November 2017, 8:11 pm

From Markdown to RCE in Atom (via) Lukas Reschke found a remote code execution vulnerability in the Atom editor by taking advantage of a combination of Markdown’s ability to embed HTML, Atom’s Content-Security-Policy allowing JavaScript from the local filesystem to be executed, and a test suite HTML file hidden away in the Atom application package that executes code passed to it via query string. # 23rd November 2017, 4:13 pm