Simon Willison’s Weblog

Subscribe

Saturday, 15th June 2024

Using DuckDB for Embeddings and Vector Search (via) Sören Brunk's comprehensive tutorial combining DuckDB 1.0, a subset of German Wikipedia from Hugging Face (loaded using Parquet), the BGE M3 embedding model and DuckDB's new vss extension for implementing an HNSW vector index.

# 2:39 pm / embeddings, ai, duckdb

I understand people are upset about AI art making it to the final cut, but please try to also google artist names and compare to their portfolio before accusing them of using AI. I'm genuinely pretty upset to be accused of this. It's no fun to work on your craft for decades and then be told by some 'detection site' that your work is machine generated and people are spreading this around as a fact.

Johanna Tarkela

# 3:28 pm / ai, ethics, generative-ai

Notes on upgrading by blog’s Heroku database plan. Heroku discontinued the “Basic” PostgreSQL plan I’ve been using for my blog, so I just upgraded to the new “essential-0” tier. Here are my notes as a GitHub issue—it was very straightforward, and I’m really only linking to it now to test that writes to the new database work correctly.

I try to create an issue like this any time I do even a minor ops task, mainly so I have somewhere to drop screenshots of any web UI interactions for future reference.

# 10:29 pm / heroku