Conflating Overture Places Using DuckDB, Ollama, Embeddings, and More. Drew Breunig's detailed tutorial on "conflation" - combining different geospatial data sources by de-duplicating address strings such as RESTAURANT LOS ARCOS,3359 FOOTHILL BLVD,OAKLAND,94601
and LOS ARCOS TAQUERIA,3359 FOOTHILL BLVD,OAKLAND,94601
.
Drew uses an entirely offline stack based around Python, DuckDB and Ollama and finds that a combination of H3 geospatial tiles and mxbai-embed-large
embeddings (though other embedding models should work equally well) gets really good results.
Recent articles
- OpenAI DevDay 2024 live blog - 1st October 2024
- Weeknotes: Three podcasts, two trips and a new plugin system - 30th September 2024
- NotebookLM's automatically generated podcasts are surprisingly effective - 29th September 2024