Simon Willison’s Weblog

Subscribe
Atom feed for bloom-filters

5 posts tagged “bloom-filters”

2024

[On Reddit] we had to look up every single comment on the page to see if you had voted on it [...]

But with a bloom filter, we could very quickly look up all the comments and get back a list of all the ones you voted on (with a couple of false positives in there). Then we could go to the cache and see if your actual vote was there (and if it was an upvote or a downvote). It was only after a failed cache hit did we have to actually go to the database.

But that bloom filter saved us from doing sometimes 1000s of cache lookups.

Jeremy Edberg

# 24th December 2024, 7:13 am / reddit, bloom-filters, scaling

Bloom Filters, explained by Sam Rose. Beautifully designed explanation of bloom filters, complete with interactive demos that illustrate exactly how they work.

# 23rd February 2024, 3:59 pm / explorables, bloom-filters, sam-rose

2023

How CPython Implements and Uses Bloom Filters for String Processing. Fascinating dive into Python string internals by Abhinav Upadhyay. It turns out CPython uses very simple bloom filters in several parts of the core string methods, to solve problems like splitting on newlines where there are actually eight codepoints that could represent a newline, and a tiny bloom filter can help filter a character in a single operation before performing all eight comparisons only if that first check failed.

# 16th September 2023, 10:32 pm / performance, bloom-filters, python

2009

All you ever wanted to know about writing bloom filters. This helped me understand a key use case for bloom filters: reducing the impact of the “worst case search is when there are no matching results so everything gets scanned” problem.

# 30th January 2009, 8:26 am / bloom-filters, search, jonathan-ellis

2008

Bloom Filter Resources. A continuation of the discussion about how to transfer information about a large number of recently updated resources around in an efficient way, Joe provides working code illustrating a simple approach using bloom filters.

# 19th October 2008, 10:22 pm / rest, joe-gregorio, bloom-filters, hashing