Simon Willison’s Weblog

TILs

TIL Convert git log output to JSON using jq — I just spent way too long messing around with ChatGPT ([transcript here](https://gist.github.com/simonw/c3b486fa90d7c32a0e8dfb47e151090a)) trying to figure this out. After much iteration, here's a recipe that works (mostly written by me at this point):
TIL Use DuckDB to convert parquet to JSON and then open it in Datasette Lite — [pickapic.io](https://pickapic.io/) is a new tool funded by [stability.ai](https://stability.ai/) which asks people to generate and then vote on images in order to provide data to be used for fine tuning an open source image generation model.
TIL A simple Python implementation of the ReAct pattern for LLMs — A popular nightmare scenario for AI is giving it access to tools, so it can make API calls and execute its own code and generally break free of the constraints of its initial environment.
TIL Scraping Reddit and writing data to the Datasette write API — Today I built a system for monitoring Reddit for new posts that link to various domains that I own.
TIL How to read Hacker News threads with most recent comments first — [Hacker News](https://news.ycombinator.com/) displays comments in a tree. This can be frustrating if you want to keep track of a particular conversation, as you constantly have to seek through the tree to find the latest comment.
TIL Copy rich text to the clipboard — I've been experimenting with a tool for generating the content for a weekly Substack newsletter by querying the Datasette API for my blog and assembling HTML for the last week of content.
TIL Running LLaMA 7B and 13B on a 64GB M2 MacBook Pro with llama.cpp — See also: **[Large language models are having their Stable Diffusion moment right now](https://simonwillison.net/2023/Mar/11/llama/)**.
TIL Using SQL with GDAL — Inspired [by Brad Neuberg](https://twitter.com/bradneuberg/status/1633875601789681666), I decided to take a look at the SQL features in the GDAL family of tools.
TIL Using ChatGPT to write AppleScript — I found a killer application for ChatGPT today: writing AppleScript!
TIL Mocking subprocess with pytest-subprocess — For [apple-notes-to-sqlite](https://github.com/dogsheep/apple-notes-to-sqlite) I needed to write some tests that simulated executing the `osascript` command using the Python `subprocess` module.
TIL A simple Python wrapper for the ChatGPT API — OpenAI [released an API for ChatGPT](https://openai.com/blog/introducing-chatgpt-and-whisper-apis) yesterday. It's 1/10th of the price of the `text-davinci-003` model!
TIL sips: Scriptable image processing system — I wanted to convert some `.webp` images to `.png` on my Mac. I asked ChatGPT:
TIL Training nanoGPT entirely on content from my blog — This is a follow-up to [Running nanoGPT on a MacBook M2 to generate terrible Shakespeare](https://til.simonwillison.net/llms/nanogpt-shakespeare-m2).
TIL Subqueries in select expressions in SQLite - also window functions — I figured out a single SQL query for the following today. Given a table of GitHub repositories, for each repository return:
TIL Avoiding "length" errors in Apache Bench with the -l option — I was using the Apache Bench `ab` command to exercise some new code I'm writing in Datasette and I noticed I was getting a lot of errors:
TIL The SQLite now argument is stable within the same query — I stumbled across an interesting little detail of SQLite today, running the following query:
TIL Building Mastodon bots with GitHub Actions and toot — Twitter [announced today](https://twitter.com/TwitterDev/status/1621026986784337922) that they'll be ending free API access for bots.
TIL Run Python code in a WebAssembly sandbox — I've been trying to figure this out for ages. Tim Bart responded to [my call for help on Hacker News](https://news.ycombinator.com/item?id=34598024) with [this extremely useful code example](https://gist.github.com/pims/711549577759ad1341f1a90860f1f3a5) showing how to run Python code in WebAssembly inside Python, using [wasmtime-py](https://github.com/bytecodealliance/wasmtime-py) and the new Python WASM build [released by VMware Wasm Labs](https://wasmlabs.dev/articles/python-wasm32-wasi/).
TIL Running nanoGPT on a MacBook M2 to generate terrible Shakespeare — [nanoGPT](https://github.com/karpathy/nanoGPT) is Andrej Karpathy's "simplest, fastest repository for training/finetuning medium-sized GPTs".
TIL Calculating embeddings with gtr-t5-large in Python — I've long wanted to run some kind of large language model on my own computer. Now that I have an M2 MacBook Pro I'm even more keen to find interesting ways to keep all of those CPU cores busy.
TIL Using recursive CTEs to explore hierarchical Twitter threads — This TIL is adapted from [a Gist](https://gist.github.com/simonw/656a8c6e4688f720773c474080abe1b0) I put together in 2019, before I started tracking TILs here.
TIL Combining CTEs and VALUES in SQLite — Here's how to use SQLite's `VALUES` syntax with a CTE to create a temporary table that you can then perform joins against in a query:
TIL Installing lxml for Python on an M1/M2 Mac — I ran into this error while trying to run `pip install lxml` on an M2 Mac, inside a virtual environment I had initially created using `pipenv shell`:
TIL SQLite pragma_function_list() — The SQLite `pragma_function_list()` table-valued function returns a list of functions that have been registered with SQLite, including functions that were added by extensions.
TIL Rewriting a Git repo to remove secrets from the history — I decided to make a GitHub repository public today that had previously been private. Unfortunately the revision history of that repository included some secret values, one of which I could not figure out a way to revoke.
TIL Upgrading a pipx application to an alpha version — I wanted to upgrade my [git-history](https://datasette.io/tools/git-history) installation to a new alpha version.
TIL Scraping the Sky News Westminster Accounts, a Flourish application — Sky News in partnership with [Tortoise](https://www.tortoisemedia.com/) published a fantastic piece of investigative data reporting: [the Westminster Accounts](https://news.sky.com/story/westminster-accounts-methodology-12764656), a database of money in UK politics that brought together data from three different sources and made it explorable.
TIL Loading SQLite extensions in Python on macOS — I finally found a workaround for this error when attempting to load a SQLite extension in Python on macOS:
TIL Geopoly in SQLite — I noticed this morning that one of my Datasette installations had the [Geopoly](https://www.sqlite.org/geopoly.html) SQLite extension enabled. I don't know how it got there - it has to be compiled specifically - but since it was there I decided to try it out.