Simon Willison’s Weblog


Saturday, 16th September 2023

How CPython Implements and Uses Bloom Filters for String Processing. Fascinating dive into Python string internals by Abhinav Upadhyay. It turns out CPython uses very simple bloom filters in several parts of the core string methods, to solve problems like splitting on newlines, where there are actually eight codepoints that could represent a newline. A tiny bloom filter can rule a character out in a single operation, so the eight comparisons are only performed when that first check reports a possible match.
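Here's a rough Python sketch of the idea, to make the trick concrete. This is not CPython's actual C code: the 64-bit mask width, the function names and the exact set of eight codepoints are my own assumptions for illustration.

```python
# Illustrative sketch only. BLOOM_WIDTH, the helper names and the
# codepoint list below are assumptions, not CPython's real source.
BLOOM_WIDTH = 64  # assumed width of the bitmask, in bits

# Assumed set of eight line-break codepoints.
LINEBREAKS = (0x000A, 0x000D, 0x001C, 0x001D, 0x001E, 0x0085, 0x2028, 0x2029)


def make_bloom_mask(codepoints):
    """Set one bit per codepoint, chosen by its low bits."""
    mask = 0
    for cp in codepoints:
        mask |= 1 << (cp & (BLOOM_WIDTH - 1))
    return mask


LINEBREAK_MASK = make_bloom_mask(LINEBREAKS)


def is_linebreak(ch):
    """Return True if ch is one of the assumed line-break characters."""
    cp = ord(ch)
    # Fast path: if the bit is clear, ch is definitely not a line break.
    if not (LINEBREAK_MASK >> (cp & (BLOOM_WIDTH - 1))) & 1:
        return False
    # Bit is set: could be a real match or a false positive, so confirm
    # with the exact comparison against all eight codepoints.
    return cp in LINEBREAKS
```

Most ordinary characters, like "a", are rejected by the single bit test. A character such as "J" happens to share its low bits with "\n", so it passes the filter and is only rejected by the exact comparison.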

# 10:32 pm / performance, bloom-filters, python

Notes on using a single-person Mastodon server. Julia Evans’ experiences running a single-person Mastodon server (on masto.host, the same host I use for my own) pretty much exactly match what I’ve learned so far as well. The biggest disadvantage is the missing replies issue: your server only shows replies to posts that come from people who you follow, so it’s easy to reply to something in a way that duplicates other replies that are invisible to you.

# 10:35 pm / mastodon, julia-evans
