Simon Willison’s Weblog

Subscribe

TILs

Filters: Sorted by date

TIL Histogram with tooltips in Observable Plot — Given an array of datetime objects, I wanted to plot a histogram. But I wanted to automatically pick a bucket size for that histogram that resulted in something interesting, no matter what range of time the individual points covered.
TIL Track timestamped changes to a SQLite table using triggers — This is more of a "today I figured out" than a TIL.
TIL Building a specific version of SQLite with pysqlite on macOS/Linux — I wanted the ability to test my Python software against specific version of SQLite on macOS. I found a way to do that using [pysqlite3](https://github.com/coleifer/pysqlite3).
TIL Adding Sphinx autodoc to a project, and configuring Read The Docs to build it — I built a [new API reference page](https://sqlite-utils.datasette.io/en/latest/reference.html) today for `sqlite-utils`, using the Sphinx [autodoc extension](https://www.sphinx-doc.org/en/master/usage/extensions/autodoc.html) to extract docstrings from the code and use them to build a full class reference.
TIL Find local variables in the traceback for an exception — For [sqlite-utils issue #309](https://github.com/simonw/sqlite-utils/issues/309) I had an error that looked like this:
TIL Tailing Google Cloud Run request logs and importing them into SQLite — The `gcloud` CLI tool has [the alpha ability to tail log files](https://cloud.google.com/logging/docs/reference/tools/gcloud-logging#live-tailing) - but it's a bit of a pain to setup.
TIL SQLite aggregate filter clauses — SQLite supports aggregate filter clauses (as of [3.30.0, released 2019-10-04](https://www.sqlite.org/changes.html#version_3_30_0)), as described in this [SQL Pivot in all databases](https://modern-sql.com/use-case/pivot) tutorial.
TIL Set a GIF to loop using ImageMagick — I managed to accidentally create a GIF that ran once without looping. I think this is because I created it in [LICEcap](https://www.cockos.com/licecap/) but then deleted some frames and re-saved it using macOS Preview.
TIL Check spelling using codespell — Today I discovered [codespell](https://github.com/codespell-project/codespell/) via [this Rich commit](https://github.com/willmcgugan/rich/commit/9c12a4537499797c43725fff5276ef0da62423ef#diff-ce84a1b2c9eb4ab3ea22f610cad7111cb9a2f66365c3b24679901376a2a73ab2). `codespell` is a really simple spell checker that can be run locally or incorporated into a CI flow.
TIL Search and replace with regular expressions in VS Code — I wanted to replace all instances of this:
TIL Extracting objects recursively with jq — The Algolia-powered Hacker News API returns nested comment threads that look like this: https://hn.algolia.com/api/v1/items/27941108
TIL Using the tesseract CLI tool — Tesseract OCR has a command-line utility which is woefully under-documented. Thanks to [Alexandru Nedelcu](https://alexn.org/blog/2020/11/11/organize-index-screenshots-ocr-macos.html) I figured out how to use it today.
TIL Importing CSV data into SQLite with .import — I usually use my `sqlite-utils insert blah.db tablename file.csv --csv` command to import CSV data into SQLite, but for large CSV files (like a 750MB one) this can take quite a long time - over half an hour in this case.
TIL Using nginx to proxy to a Unix domain socket — I figured this out while adding `--uds` support to Datasette in [#1388](https://github.com/simonw/datasette/issues/1388). Save the following in `nginx.conf`:
TIL Using pysqlite3 on macOS — While trying to use [pysqlite3](https://github.com/coleifer/pysqlite3) on macOS I got the following error:
TIL Increasing the time limit for a Google Cloud Scheduler task — In [VIAL issue 724](https://github.com/CAVaccineInventory/vial/issues/724) a Cloud Scheduler job which triggered a Cloud Run hosted export script - by sending an HTTP POST to an endpoint - was returning an error. The logs showed the error happened exactly three minutes after the task started executing.
TIL Preventing double form submissions with JavaScript — I needed this for [VIAL issue 722](https://github.com/CAVaccineInventory/vial/issues/722). I decided to disable form submissions for two seconds after they are submitted, to protect against accidental double submissions without risk of unexpected issues that could cause the form to be permanently disabled even though it should still be able to submit it.
TIL Installing different PostgreSQL server versions in GitHub Actions — The GitHub Actions `ubuntu-latest` default runner currently includes an installation of PostgreSQL 13. The server is not running by default but you can interact with it like this:
TIL Scraping Reddit via their JSON API — Reddit have long had an unofficial (I think) API where you can add `.json` to the end of any URL to get back the data for that page as JSON.
TIL Mouse support in vim — Today I learned that if you hit `Esc` in vim and then type `:set mouse=a` and hit enter... vim grows mouse support! In your terminal!
TIL Running a MySQL server using Homebrew — First, install MySQL like so:
TIL Exporting Amplitude events to SQLite — [Amplitude](https://amplitude.com/) offers an "Export Data" button in the project settings page. This can export up to 365 days of events (up to 4GB per export), where the export is a zip file containing `*.json.gz` gzipped newline-delimited JSON.
TIL Building a Markdown summary of Django group permissions — [Django SQL Dashboard](https://django-sql-dashboard.datasette.io/) can display rendered markdown generated by a SQL query.
TIL Finding CSV files that start with a BOM using ripgrep — For [sqlite-utils issue 250](https://github.com/simonw/sqlite-utils/issues/250) I needed to locate some test CSV files that start with a UTF-8 BOM.
TIL Running Docker on an M1 Mac — *TLDR: Use `platform: linux/amd64` in `docker-compose.yml` and your containers will probably work!*
TIL Docker Compose for Django development — I had to get Docker Compose working for a Django project, primarily to make it easier for other developers to get a working development environment.
TIL Finding duplicate records by matching name and nearby distance — I wanted to find potentially duplicate records in my data, based on having the exact same name and being geographically located within 500 meters of each other.
TIL Switching between gcloud accounts — I have two different Google Cloud accounts active at the moment. Here's how to list them with `gcloud auth list`:

Years