Simon Willison’s Weblog

TILs

TIL Downloading every video for a TikTok account — TikTok may or may not be banned in the USA within the next 24 hours or so. Here's a pattern you can use to download all of the videos from a specific account.
TIL Calculating the size of all LFS files in a repo — I wanted to know how large the [deepseek-ai/DeepSeek-V3-Base](https://huggingface.co/deepseek-ai/DeepSeek-V3-Base) repo on Hugging Face was without actually downloading all of the files.
TIL Named Entity Resolution with dslim/distilbert-NER — I was exploring the original BERT model from 2018, which is mainly useful if you fine-tune a model on top of it for a specific task.
TIL Fixes for datetime UTC warnings in Python — I was getting the following warning for one of my Python test suites:
TIL Publishing a simple client-side JavaScript package to npm with GitHub Actions — Here's what I learned about publishing a single-file JavaScript package to npm for my [Prompts.js](https://simonwillison.net/2024/Dec/7/prompts-js/) project.
TIL GitHub OAuth for a static site using Cloudflare Workers — My [tools.simonwillison.net](https://tools.simonwillison.net/) site is a growing collection of small HTML and JavaScript applications hosted as static files on GitHub Pages.
TIL Running cog automatically against GitHub pull requests — I really like [Cog](https://nedbatchelder.com/code/cog/) ([previously](https://til.simonwillison.net/python/cog-to-update-help-in-readme)) as a tool for automating aspects of my Python project documentation - things like the SQL schemas shown on the [LLM logging page](https://llm.datasette.io/en/latest/logging.html#sql-schema).
TIL Generating documentation from tests using files-to-prompt and LLM — I was experimenting with [wasmtime-py](https://github.com/bytecodealliance/wasmtime-py) today and found the [current documentation](https://bytecodealliance.github.io/wasmtime-py/) didn't quite give me the information that I needed.
TIL Installing flash-attn without compiling it — If you ever run into instructions that tell you to do this:
TIL Using uv to develop Python command-line applications — I finally figured out a process that works for me for hacking on Python CLI utilities using [uv](https://docs.astral.sh/uv/) to manage my development environment, thanks to a little bit of help from Charlie Marsh.
TIL Setting cache-control: max-age=31536000 with a Cloudflare Transform Rule — I ran https://simonwillison.net/ through [PageSpeed Insights](https://pagespeed.web.dev/) and it warned me that my static assets were not being served with browser caching headers:
TIL Running prompts against images, PDFs, audio and video with Google Gemini — I'm still working towards adding multi-modal support to my [LLM](https://llm.datasette.io/) tool. In the meantime, here are notes on running prompts against images and PDFs and audio and video files from the command-line using the [Google Gemini](https://ai.google.dev/gemini-api) family of models.
TIL The most basic possible Hugo site — With [Claude's help](https://gist.github.com/simonw/6f7b6a40713b36749da845065985bb28) I figured out what I think is the most basic version of a static site generated using [Hugo](https://gohugo.io/).
TIL Livestreaming a community election event on YouTube — I live in El Granada, California. Wikipedia calls us [a census designated place](https://en.wikipedia.org/wiki/El_Granada,_California) - we don't have a mayor or city council. But we do have a [Community Services District](https://granada.ca.gov/) - originally responsible for our sewers, and since 2014 also responsible for our parks. And we get to vote for the board members [in the upcoming November election](https://granada.ca.gov/2024-candidate-listing)!
TIL Upgrading Homebrew and avoiding the failed to verify attestation error — I managed to get my Homebrew installation back into shape today. The first problem I was having is that it complained that macOS Sequoia was unsupported:
TIL Collecting replies to tweets using JavaScript — I ran [a survey](https://twitter.com/simonw/status/1843290729260703801) on Twitter the other day to try and figure out what people mean when they use the term "agents" with respect to AI.
TIL Compiling and running sqlite3-rsync — Today I heard about the [sqlite3-rsync](https://sqlite.org/draft/rsync.html) command, currently available in a branch in the SQLite code repository. It provides a mechanism for efficiently creating or updating a copy of a SQLite database that is running in WAL mode, either locally or via SSH to another server.
TIL Building an automatically updating live blog in Django — OpenAI's DevDay event yesterday (October 1st 2024) didn’t invite press (as far as I can tell), didn’t livestream the event and didn’t allow audience livestreaming either. I made a last minute decision [to live blog the event](https://simonwillison.net/2024/Oct/1/openai-devday-2024-live-blog/) myself.
TIL How streaming LLM APIs work — I decided to have a poke around and see if I could figure out how the HTTP streaming APIs from the various hosted LLM providers actually worked. Here are my notes so far.
TIL Testing HTML tables with Playwright Python — I figured out this pattern today for testing an HTML table dynamically added to a page by JavaScript, using [Playwright Python](https://playwright.dev/python/):
TIL Using namedtuple for pytest parameterized tests — I'm writing some quite complex pytest parameterized tests this morning, and I was finding it a little bit hard to read the test cases as the number of parameters grew.
TIL Using sqlite-vec with embeddings in sqlite-utils and Datasette — Alex Garcia's [sqlite-vec](https://github.com/asg017/sqlite-vec) SQLite extension provides a bunch of useful functions for working with vectors inside SQLite.
TIL Using pytest-django with a reusable Django application — I published a reusable Django application today: **[django-http-debug](https://github.com/simonw/django-http-debug)**, which lets you define mock HTTP endpoints using the Django admin - like `/webhook-debug/` for example, configure what they should return and view detailed logs of every request they receive.
TIL Assistance with release notes using GitHub Issues — I like to write the release notes for my projects by hand, but sometimes it can be useful to have some help along the way.
TIL Back-dating Git commits based on file modification dates — I fell down a bit of a rabbit hole this morning. In trying to figure out [where the idea of celebrating World Wide Web Day on August 1st](https://simonwillison.net/2024/Aug/1/august-1st-world-wide-web-day/) came from, I ran across Tim Berners-Lee's original code for the WorldWideWeb application for NeXT on the W3C's website:
TIL HTML video with subtitles — Via [Mariatta](https://fosstodon.org/@mariatta/112883308634473940) I found my [PyVideo speaker page](https://pyvideo.org/speaker/simon-willison.html), and thanks to that page I learned that a talk I gave in 2009 had been rescued from the now-deceased [Blip.tv](https://en.wikipedia.org/wiki/Blip.tv) and is now hosted by the Internet Archive:
TIL Trying out free-threaded Python on macOS — Inspired by [py-free-threading.github.io](https://py-free-threading.github.io/) I decided to try out a beta of Python 3.13 with the new free-threaded mode enabled, which removes the GIL.
TIL Accessing 1Password items from the terminal — I save things like API keys in [1Password](https://1password.com/). Today I figured out how to access those from macOS terminal scripts.
TIL Mocking Stripe signature checks in a pytest fixture — I'm writing some code that accepts webhooks from Stripe. I wanted to simulate hits to this endpoint in my Django tests. Stripe uses a `Stripe-Signature` header and I wanted a way to mock my code so that I didn't need to calculate the correct signature.