Simon Willison’s Weblog

October 2022

68 posts: 8 entries, 28 links, 3 quotes, 29 beats

Oct. 14, 2022

Release datasette-screenshots 0.62 — Screenshots of Datasette, taken using shot-scraper
TIL shot-scraper for a subset of table columns — For [Datasette issue #1844](https://github.com/simonw/datasette/issues/1844) I wanted to create the following screenshot:

Automating screenshots for the Datasette documentation using shot-scraper

I released shot-scraper back in March as a tool for keeping screenshots in documentation up-to-date.

[... 1,810 words]

Oct. 15, 2022

Release shot-scraper 1.0 — A command-line utility for taking automated screenshots of websites

shot-scraper 1.0 (via) Only a minor release in terms of features, but I decided that I'm comfortable enough with the CLI design at this point that I'm ready to stamp a 1.0 on it and commit to not making backwards-incompatible changes (at least without shipping a 2.0 release, which I'd like to avoid if possible).

Full release notes:

# 9:28 pm / cli, projects, shot-scraper

Dumping the HTML of a page using shot-scraper. New in 1.0 is the “shot-scraper html URL” command, which outputs the HTML of a page after its JavaScript has finished executing. You can pass in additional custom JavaScript to run before the snapshot is taken, and you can also specify a CSS selector on the page to return just that fragment of HTML.

# 9:30 pm / shot-scraper

TIL Guessing Amazon image URLs using GitHub Copilot — I was experimenting with the new [Readwise export API](https://readwise.io/api_deets#export) and it gave me back the following JSON:

How to create a Python package in 2022 (via) Fantastic tutorial on modern Python packaging by Rodrigo Girão Serrão. I’ve been meaning to figure out Poetry for a while now and this gave me exactly the information I needed to start figuring it out. Great coverage of GitHub Actions, Tox and pre-commit as well.

# 10:10 pm / packaging, python, github-actions

Oct. 16, 2022

Half Moon Bay Pumpkin Festival traffic on Saturday 15th October 2022 (via) It’s the Half Moon Bay Pumpkin Festival this weekend... and its impact on the traffic between our little town of El Granada and Half Moon Bay—8 minutes drive away—is notorious. So I built a git scraper that archives estimated driving times from the Google Maps Navigation API, and used git-history to turn that scraped data into a SQLite database and visualize it on a chart.

# 3:56 am / projects, git-scraping, git-history, half-moon-bay

Oct. 17, 2022

“You are GPT-3”. Genius piece of prompt design by Riley Goodside. “A long-form GPT-3 prompt for assisted question-answering with accurate arithmetic, string operations, and Wikipedia lookup. Generated IPython commands (in green) are pasted into IPython and output is pasted back into the prompt (no green).” Uses “Out[” as a stop sequence to ensure GPT-3 stops at each generated IPython prompt rather than inventing the output itself.

# 4:35 am / gpt-3, prompt-engineering, generative-ai, riley-goodside, llms
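
To illustrate the stop-sequence idea from the “You are GPT-3” item above, here is a minimal sketch that simulates client-side what a stop sequence does. The `apply_stop_sequence` helper and the sample completion are hypothetical and are not taken from Riley Goodside's prompt; a real completion API would normally apply the stop sequence on the server.

```python
STOP = "Out["


def apply_stop_sequence(generated: str, stop: str = STOP) -> str:
    """Return the text up to (not including) the first occurrence of stop."""
    index = generated.find(stop)
    return generated if index == -1 else generated[:index]


# Hypothetical raw completion: the model writes a command, then starts
# inventing an output line of its own (deliberately wrong here).
raw_completion = "In [1]: 1234 * 5678\nOut[1]: 7006651\n"

# Cutting at "Out[" keeps only the command, which can then be run in a
# real IPython session and its true output pasted back into the prompt.
command_only = apply_stop_sequence(raw_completion)
print(repr(command_only))
```

The invented output line is discarded, so the arithmetic is always done by IPython rather than guessed by the model.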

Oct. 18, 2022

Chris Amico’s Python setup for 2022 (via) Homebrew to install pyenv, then pyenv to install specific Python versions. pipx and pipenv for package management. I need to habitually start using pyenv for everything.

# 2:04 pm / python, chris-amico

Oct. 19, 2022

Simple, Fast, and Scalable Reverse Image Search Using Perceptual Hashes and DynamoDB. Christopher Bong provides a clear explanation of how perceptual hashes can be used to create a string representing the visual content of an image, such that similar images can be identified by calculating a hamming distance between those hashes. He then explains how they built a large-scale system for this at Canva on top of DynamoDB, by splitting those strings into smaller hash windows and using those for efficient bulk lookups of similar candidates.

# 3:04 pm / images, search
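
The two ideas in the reverse image search item above, hamming distance and hash windows, can be sketched in a few lines. This assumes 64-bit hashes split into four 16-bit windows and uses made-up hash values; it is one common way to do window-based candidate lookup and is not necessarily the exact scheme Canva uses.

```python
def hamming_distance(a: int, b: int) -> int:
    """Count the bit positions where two integer hashes differ."""
    return bin(a ^ b).count("1")


def split_into_windows(value: int, total_bits: int = 64, window_bits: int = 16):
    """Split a hash into (position, window) pairs, lowest bits first."""
    mask = (1 << window_bits) - 1
    return [
        (i, (value >> (i * window_bits)) & mask)
        for i in range(total_bits // window_bits)
    ]


# Two made-up 64-bit hashes that differ in exactly two bits.
hash_a = 0xF0F0F0F0F0F0F0F0
hash_b = hash_a ^ 0b101

distance = hamming_distance(hash_a, hash_b)

# Windows that match exactly (same position, same value) can be used as
# exact-match keys to find candidates before computing full distances.
shared = set(split_into_windows(hash_a)) & set(split_into_windows(hash_b))
print(distance, sorted(shared))
```

With four windows, any two hashes that differ in at most three bits must share at least one identical window, so an exact-match lookup on windows cannot miss them.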

Measuring traffic during the Half Moon Bay Pumpkin Festival

This weekend was the 50th annual Half Moon Bay Pumpkin Festival.

[... 2,693 words]

Oct. 20, 2022

TIL Adding a Datasette ASGI app to Django — [Datasette](https://datasette.io/) is implemented as an ASGI application.

Oct. 21, 2022

The Commodordion (via) The Commodordion is “an 8-bit accordion primarily made of C64s, floppy disks, and gaffer tape” by Linus Åkesson. It’s absolutely beautiful.

# 11:36 pm / commodore, hacks, music

Oct. 22, 2022

Release datasette-gunicorn 0.1 — Plugin for running Datasette using Gunicorn

Oct. 23, 2022

TIL Simple load testing with Locust — I've been using [Locust](https://locust.io/) recently to run some load tests - most significantly [these tests](https://github.com/simonw/django_sqlite_benchmark/issues?q=is%3Aissue+is%3Aclosed) against SQLite running with Django and [this test](https://github.com/simonw/datasette-gunicorn/issues/1) exercising Datasette and Gunicorn.
TIL Writing a Datasette CLI plugin that mostly duplicates an existing command — My new [datasette-gunicorn](https://datasette.io/plugins/datasette-gunicorn) plugin adds a new command to Datasette - `datasette gunicorn` - which mostly replicates the existing `datasette serve` command but with a few differences.

Weeknotes: DjangoCon, SQLite in Django, datasette-gunicorn

I spent most of this week at DjangoCon in San Diego—my first outside-of-the-Bay-Area conference since the before-times.

[... 1,184 words]

Oct. 24, 2022

Release datasette 0.63a1 — An open source multi-tool for exploring and publishing data

Leveraging ‘shot-scraper’ and creating image diffs. Üllar Seerme has a neat recipe for using shot-scraper and ImageMagick to create differential animations showing how a scraped web page has visually changed.

# 9:34 pm / imagemagick, github-actions, shot-scraper

Release shot-scraper 1.0.1 — A command-line utility for taking automated screenshots of websites

Oct. 25, 2022

Release datasette-indieauth 1.2.1 — Datasette authentication using IndieAuth and RelMeAuth
TIL os.remove() on Windows fails if the file is already open — I puzzled over this one for [quite a while](https://github.com/simonw/sqlite-utils/issues/503) this morning. I had this test that was failing with Windows on Python 3.11:
Release sqlite-utils 3.30 — Python CLI utility and library for manipulating SQLite databases
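
As a small illustration of the `os.remove()` TIL above: on Windows, removing a file that is still in use raises an exception, so the usual pattern is to make sure every handle is closed first. This is a generic sketch and not the failing test from the linked issue; the `write_then_remove` helper and the file name are hypothetical.

```python
import os
import tempfile


def write_then_remove(directory: str) -> bool:
    """Write a file, close it, then remove it. Returns True if it is gone."""
    path = os.path.join(directory, "example.txt")
    # The with block guarantees the handle is closed before os.remove()
    # runs, which is what Windows requires.
    with open(path, "w") as f:
        f.write("hello")
    os.remove(path)
    return not os.path.exists(path)


with tempfile.TemporaryDirectory() as tmp:
    removed = write_then_remove(tmp)
print(removed)
```

Calling `os.remove(path)` inside the `with` block instead would typically still work on Linux and macOS, which is why this kind of bug tends to show up only in Windows test runs.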

Most researchers don’t share their data. If you’ve ever read the words “data is available upon request” in an academic paper, and emailed the authors to request it, the chances that you'll actually receive the data are just 7 percent. The rest of the time, the authors have lost access to their data, changed emails, or are too busy or unwilling.

Saloni Dattani

# 10:48 pm / open-data, science

Oct. 27, 2022

Release datasette 0.63 — An open source multi-tool for exploring and publishing data

Datasette 0.63: The annotated release notes

I released Datasette 0.63 today. These are the annotated release notes.

[... 1,531 words]

Release datasette-edit-templates 0.1 — Plugin allowing Datasette templates to be edited within Datasette

Oct. 28, 2022

Welcome to hell, Elon (via) If you only read one thing about the Elon acquisition of Twitter, make it this, by Nilay Patel. Outstanding insights into what it actually takes to run a commercial social media service.

# 3:16 pm / moderation, social-media, twitter, nilay-patel
