Wikidata is a Giant Crosswalk File. Drew Breunig shows how to take the 140GB Wikidata JSON export, use sed 's/,$//' to convert it to newline-delimited JSON, then use DuckDB to run queries and extract external identifiers, including a query that pulls out 500MB of latitude and longitude points.
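
To make the steps above concrete, here is a minimal Python sketch of the same idea. It is not Drew's code: he uses sed and DuckDB, and this swaps in plain Python for illustration. It assumes the export has the usual Wikidata dump layout (an opening `[` line, one entity per line with a trailing comma, a closing `]` line) and the standard Wikibase entity shape, where each statement's `mainsnak` carries a `datatype` such as `external-id`. The function names `to_ndjson`, `external_ids` and `coordinates` are hypothetical, and P625 is assumed to be the coordinate location property.

```python
import json
from typing import Iterable, Iterator, Optional, Tuple


def to_ndjson(lines: Iterable[str]) -> Iterator[str]:
    """Yield one JSON object per line from an array-style dump.

    Mirrors the effect of sed 's/,$//' by dropping the trailing comma,
    and additionally skips the bare "[" and "]" lines (an assumption
    about the dump layout; sed alone leaves those lines in place).
    """
    for line in lines:
        line = line.strip()
        if line in ("", "[", "]"):
            continue
        if line.endswith(","):
            line = line[:-1]
        yield line


def external_ids(entity: dict) -> dict:
    """Map property ID -> list of external identifier strings."""
    found: dict = {}
    for prop, statements in entity.get("claims", {}).items():
        for statement in statements:
            snak = statement.get("mainsnak", {})
            # Only keep statements that actually carry a value.
            if snak.get("datatype") == "external-id" and snak.get("snaktype") == "value":
                found.setdefault(prop, []).append(snak["datavalue"]["value"])
    return found


def coordinates(entity: dict) -> Optional[Tuple[float, float]]:
    """Return (latitude, longitude) from the first P625 value, if any."""
    for statement in entity.get("claims", {}).get("P625", []):
        snak = statement.get("mainsnak", {})
        if snak.get("snaktype") == "value":
            value = snak["datavalue"]["value"]
            return value["latitude"], value["longitude"]
    return None
```

Because `to_ndjson` accepts any iterable of lines, it can be fed an open file handle and processed as a stream, which matters for a 140GB export. For the full dump, DuckDB is likely the better tool for the querying step; this sketch only shows what the conversion and extraction amount to.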