WarcDB (via) Florents Tselai built this tool on top of my sqlite-utils Python library. It loads web crawl data stored in WARC (Web ARChive) format into a SQLite database, for smaller-scale analysis with SQL.
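To give a feel for what "analysis with SQL" over crawl data can look like, here is a minimal sketch using only Python's standard `sqlite3` module. The `response` table, its columns, and the sample rows are all made up for illustration; this is not WarcDB's actual schema or API, and the real tool works from WARC files and sqlite-utils rather than a hand-written list.

```python
import sqlite3

# Hypothetical records standing in for parsed WARC response entries.
# The field names are illustrative and are NOT WarcDB's actual schema.
records = [
    {"url": "https://example.com/", "status": 200, "content_type": "text/html", "length": 1256},
    {"url": "https://example.com/about", "status": 200, "content_type": "text/html", "length": 842},
    {"url": "https://example.com/missing", "status": 404, "content_type": "text/html", "length": 120},
    {"url": "https://example.com/logo.png", "status": 200, "content_type": "image/png", "length": 5300},
]


def load_records(conn, rows):
    """Create the (hypothetical) response table and insert the rows."""
    conn.execute(
        "CREATE TABLE IF NOT EXISTS response ("
        "url TEXT PRIMARY KEY, status INTEGER, content_type TEXT, length INTEGER)"
    )
    conn.executemany(
        "INSERT INTO response VALUES (:url, :status, :content_type, :length)",
        rows,
    )
    conn.commit()


def status_counts(conn):
    """Count crawled responses per HTTP status code."""
    return conn.execute(
        "SELECT status, COUNT(*) FROM response GROUP BY status ORDER BY status"
    ).fetchall()


conn = sqlite3.connect(":memory:")
load_records(conn, records)
print(status_counts(conn))
```

Once the crawl records are in a table like this, questions such as "how many 404s did the crawl hit?" or "which content types take the most space?" become one-line `GROUP BY` queries.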