Simon Willison’s Weblog

Subscribe
Atom feed for gis

50 items tagged “gis”

2024

Conflating Overture Places Using DuckDB, Ollama, Embeddings, and More. Drew Breunig's detailed tutorial on "conflation" - combining different geospatial data sources by de-duplicating address strings such as RESTAURANT LOS ARCOS,3359 FOOTHILL BLVD,OAKLAND,94601 and LOS ARCOS TAQUERIA,3359 FOOTHILL BLVD,OAKLAND,94601.

Drew uses an entirely offline stack based around Python, DuckDB and Ollama and finds that a combination of H3 geospatial tiles and mxbai-embed-large embeddings (though other embedding models should work equally well) gets really good results.

# 30th September 2024, 5:24 pm / drew-breunig, gis, duckdb, python, ai, embeddings, overture

Towards Standardizing Place. Overture Maps announced General Availability of its global maps datasets last week, covering places, buildings, divisions, and base layers.

Drew Breunig demonstrates how this can be accessed using both the Overture Explorer tool and DuckDB, and talks about Overture's GERS IDs - reminiscent of Who's On First IDs - which provide stable IDs for all kinds of geographic places.

# 1st August 2024, 11:14 pm / whosonfirst, drew-breunig, geospatial, gis, overture

The many lives of Null Island (via) Stamen's custom basemaps have long harbored an Easter egg: zoom all the way in on 0, 0 to see the outline of the mystical "null island", the place where GIS glitches and data bugs accumulate, in the Gulf of Guinea south of Ghana.

Stamen's Alan McConchie provides a detailed history of the Easter egg - first introduced by Mike Migurski in 2010 - along with a definitive guide to the GIS jokes and traditions that surround it.

Here's Null Island on Stamen's Toner map. The shape (also available as GeoJSON) is an homage to the island from 1993's Myst, hence the outline of a large docked ship at the bottom.

White outline of Null Island on a black background.

Alan recently gave a talk about Stamen's updated custom maps at State of the Map US 2024 (video, slides) - their Toner and Terrain maps are now available as vector tiles served by Stadia Maps (here's the announcement), but their iconic watercolor style is yet to be updated to vectors, due to the weird array of raster tricks it used to achieve the effect.

In researching this post I searched for null island on Google Maps and was delighted to learn that a bunch of entrepreneurs in Western Africa have tapped into the meme for their own businesses:

A null island search returns companies in The Gambia, Côte d’Ivoire, Burkina Faso, Cameroon and Democratic Republic of the Congo.

# 28th July 2024, 5:44 pm / maps, stamen-design, gis, michal-migurski

Searching an aerial photo with text queries. Robin Wilson built a demo that lets you search a large aerial photograph of Southampton for things like "roundabout" or "tennis court". He explains how it works in detail: he used the SkyCLIP model, which is trained on "5.2 million remote sensing image-text pairs in total, covering more than 29K distinct semantic tags" to generate embeddings for 200x200 image segments (with 100px of overlap), then stored them in Pinecone.

# 12th July 2024, 6:07 pm / gis, embeddings, clip

tiny-world-map (via) I love this project. It’s a JavaScript file (694K uncompressed, 283KB compressed) which can be used with the Leaflet mapping library and provides a SVG base map of the world with country borders and labels for every world city with a population more than 48,000—10,000 cities total.

This means you can bundle an offline map of the world as part of any application that doesn’t need a higher level of detail. A lot of smaller island nations are missing entirely though, so this may not be right for every project.

It even includes a service worker to help implement offline mapping support, plus several variants of the map with less cities that are even smaller.

# 21st April 2024, 10:11 pm / svg, serviceworkers, javascript, gis, mapping

A POI Database in One Line (via) Overture maps offer an extraordinarily useful freely licensed databases of POI (point of interest) listings, principally derived from partners such as Facebook and including restaurants, shops, museums and other locations from all around the world.

Their new "overturemaps" Python CLI utility makes it easy to quickly pull subsets of their data... but requires you to provide a bounding box to do so.

Drew Breunig came up with this delightful recipe for fetching data using LLM and gpt-3.5-turbo to fill in those bounding boxes:

overturemaps download --bbox=$(llm 'Give me a bounding box for Alameda, California expressed as only four numbers delineated by commas, with no spaces, longitude preceding latitude.') -f geojsonseq --type=place | geojson-to-sqlite alameda.db places - --nl --pk=id

# 19th April 2024, 2:44 am / drew-breunig, llm, gis, geojson, overture

mapshaper.org (via) It turns out the mapshaper CLI tool for manipulating geospatial data—including converting shapefiles to GeoJSON and back again—also has a web UI that runs the conversions entirely in your browser. If you need to convert between those (and other) formats it’s hard to imagine a more convenient option.

# 23rd March 2024, 3:44 am / shapefiles, gis, geojson, javascript

Claude and ChatGPT for ad-hoc sidequests

Visit Claude and ChatGPT for ad-hoc sidequests

Here is a short, illustrative example of one of the ways in which I use Claude and ChatGPT on a daily basis.

[... 1,754 words]

How to make self-hosted maps that work everywhere and cost next to nothing. Chris Amico provides a detailed roundup of the state of web mapping in 2024. It’s never been easier to entirely host your own mapping infrastructure, thanks to OpenStreetMap, Overture, MBTiles, PMTiles, Maplibre and a whole ecosystem of other fine open source projects.

I like Protomaps creator Brandon Liu’s description of this: “post-scarcity web mapping”.

# 24th February 2024, 4:19 am / maps, chris-amico, gis, overture

2023

Database generated columns: GeoDjango & PostGIS. Paolo Melchiorre advocated for the inclusion of generated columns, one of the biggest features in Django 5.0. Here he provides a detailed tutorial showing how they can be used with PostGIS to create database tables that offer columns such as geohash that are automatically calculated from other columns in the table.

# 11th December 2023, 7:14 pm / postgresql, gis, django

Geospatial SQL queries in SQLite using TG, sqlite-tg and datasette-sqlite-tg. Alex Garcia built sqlite-tg—a SQLite extension that uses the brand new TG geospatial library to provide a whole suite of custom SQL functions for working with geospatial data.

Here are my notes on trying out his initial alpha releases. The extension already provides tools for converting between GeoJSON, WKT and WKB, plus the all important tg_intersects() function for testing if a polygon or point overlap each other.

It’s pretty useful already. Without any geospatial indexing at all I was still able to get 700ms replies to a brute-force point-in-polygon query against 150MB of GeoJSON timezone boundaries stored as JSON text in a table.

# 25th September 2023, 7:45 pm / datasette, geospatial, sqlite, alex-garcia, gis, geojson, tg

TG: Polygon indexing (via) TG is a brand new geospatial library by Josh Baker, author of the Tile38 in-memory spatial server (kind of a geospatial Redis). TG is written in pure C and delivered as a single C file, reminiscent of the SQLite amalgamation.

TG looks really interesting. It implements almost the exact subset of geospatial functionality that I find most useful: point-in-polygon, intersect, WKT, WKB, and GeoJSON—all with no additional dependencies.

The most interesting thing about it is the way it handles indexing. In this documentation Josh describes two approaches he uses to speeding up point-in-polygon and intersection using a novel approach that goes beyond the usual RTree implementation.

I think this could make the basis of a really useful SQLite extension—a lighter-weight alternative to SpatiaLite.

# 23rd September 2023, 4:32 am / c, sqlite, geojson, geospatial, spatialite, gis, tg

Overture Maps Foundation Releases Its First World-Wide Open Map Dataset. The Overture Maps Foundation is a collaboration lead by Amazon, Meta, Microsoft and TomTom dedicated to producing “reliable, easy-to-use, and interoperable open map data”.

Yesterday they put out their first release and it’s pretty astonishing: four different layers of geodata, covering Places of Interest (shops, restaurants, attractions etc), administrative boundaries, building outlines and transportation networks.

The data is available as Parquet. I just downloaded the 8GB places dataset and can confirm that it contains 59 million listings from around the world—I filtered to just places in my local town and a spot check showed that recently opened businesses (last 12 months) were present and the details all looked accurate.

The places data is licensed under “Community Data License Agreement – Permissive” which looks like the only restriction is that you have to include that license when you further share the data.

# 27th July 2023, 4:45 pm / open-data, parquet, gis, meta, overture

2022

GPSJam (via) John Wiseman’s “Daily maps of GPS interference” —a beautiful interactive globe (powered by Mapbox GL) which you can use to see points of heaviest GPS interference over a 24 hour period, using data collected from commercial airline radios by ADS-B Exchange. “From what I can tell the most common reason for aircraft GPS systems to have degraded accuracy is jamming by military systems. At least, the vast majority of aircraft that I see with bad GPS accuracy are flying near conflict zones where GPS jamming is known to occur.”

# 30th July 2022, 7:51 pm / john-wiseman, gps, gis, mapping

A tiny web app to create images from OpenStreetMap maps

Visit A tiny web app to create images from OpenStreetMap maps

Earlier today I found myself wanting to programmatically generate some images of maps.

[... 1,388 words]

Datasette for geospatial analysis (via) I added a new page to the Datasette website describing how Datasette can be used for geospatial analysis, pulling together several of the relevant plugins and tools from the Datasette ecosystem.

# 13th April 2022, 12:48 am / gis, datasette, plugins

geoBoundaries. This looks useful: “The world’s largest open, free and research-ready database of political administrative boundaries.” Founded by the geoLab at William & Mary university, and released under a Creative Commons Attribution license that includes a requirement for a citation. File formats offered include shapefiles, GeoJSON and TopoJSON.

# 24th March 2022, 2:03 pm / shapefiles, gis, geojson

2021

Weeknotes: datasette-tiddlywiki, filters_from_request

Visit Weeknotes: datasette-tiddlywiki, filters_from_request

I made some good progress on the big refactor this week, including extracting some core logic out into a new Datasette plugin hook. I also got distracted by TiddlyWiki and released a new Datasette plugin that lets you run TiddlyWiki inside Datasette.

[... 1,197 words]

Adding GeoDjango to an existing Django project

Work on VIAL for Vaccinate The States continues.

[... 1,503 words]

country-coder (via) Given a latitude and longitude, how can you tell what country that point sits within? One way is to do a point-in-polygon lookup against a set of country polygons, but this can be tricky: some countries such as New Zealand have extremely complex outlines, even though for this use-case you don’t need the exact shape of the coastline. country-coder solves this with a custom designed 595KB GeoJSON file with detailed land borders but loosely defined ocean borders. It also comes with a wrapper JavaScript library that provides an API for resolving points, plus useful properties on each country with details like telepohen calling codes and emoji flags.

# 18th April 2021, 7:37 pm / gis, geojson

Spatialite Speed Test. Part of an excellent series of posts about SpatiaLite from 2012—here John C. Zastrow reports on running polygon intersection queries against a 1.9GB database file in 40 seconds without an index and 0.186 seconds using the SpatialIndex virtual table mechanism.

# 4th April 2021, 4:28 pm / spatialite, gis, sqlite

New call queue ready to test. Also geography.

I just shipped the first working preview version of the new /api/requestCall API [#54]—the API that our caller app uses to get the next location that the caller should be contacting (and lock it so that other users don’t call it at the same time).

[... 697 words]

Serving map tiles from SQLite with MBTiles and datasette-tiles

Visit Serving map tiles from SQLite with MBTiles and datasette-tiles

Working on datasette-leaflet last week re-kindled my interest in using Datasette as a GIS (Geographic Information System) platform. SQLite already has strong GIS functionality in the form of SpatiaLite and datasette-cluster-map is currently the most downloaded plugin. Most importantly, maps are fun!

[... 1,334 words]

Drawing shapes on a map to query a SpatiaLite database (and other weeknotes)

Visit Drawing shapes on a map to query a SpatiaLite database (and other weeknotes)

This week I built a Datasette plugin that lets you query a database by drawing shapes on a map!

[... 950 words]

Weeknotes: datasette-export-notebook, PyInstaller packaged Datasette, CBSAs

Visit Weeknotes: datasette-export-notebook, PyInstaller packaged Datasette, CBSAs

What a terrible week. I’ve found it hard to concentrate on anything substantial. In a mostly futile attempt to distract myself from doomscrolling I’ve mainly been building some experimental output plugins, fiddling with PyInstaller and messing around with shapefiles.

[... 732 words]

2020

Weeknotes: California Protected Areas in Datasette

Visit Weeknotes: California Protected Areas in Datasette

This week I built a geospatial search engine for protected areas in California, shipped datasette-graphql 1.0 and started working towards the next milestone for Datasette Cloud.

[... 1,099 words]

California Protected Areas Database in Datasette (via) I built this yesterday: it’s a Datasette interface on top of the CPAD 2020 GIS database of protected areas in California maintained by GreenInfo Network. This was a useful excuse to build a GitHub Actions flow that builds a SpatiaLite database using my shapefile-to-sqlite tool, and I fixed a few bugs in my datasette-leaflet-geojson plugin as well.

# 21st August 2020, 11:15 pm / shapefiles, github-actions, datasette, projects, california, spatialite, gis

Things I learned about shapefiles building shapefile-to-sqlite

Visit Things I learned about shapefiles building shapefile-to-sqlite

The latest in my series of x-to-sqlite tools is shapefile-to-sqlite. I learned a whole bunch of things about the ESRI shapefile format while building it.

[... 1,073 words]

geojson-to-sqlite (via) I just put out the first release of geojson-to-sqlite—a CLI tool that can convert GeoJSON files (consisting of a Feature or a set of features in a FeatureCollection) into a table in a SQLite database. If you use the --spatialite option it will initalize the table with SpatiaLite and store the geometries in a spacially indexed geometry field—without that option it stores them as GeoJSON.

# 31st January 2020, 6:40 am / geojson, sqlite, spatialite, projects, gis

2019

kepler.gl. Uber built this open source geospatial analysis tool for large-scale data sets, and they offer it as a free hosted online tool—just click Get Started on the site. I uploaded two CSV files with 30,000+ latitude/longitude points in them just now and used Kepler to render them as images.

# 25th October 2019, 4:16 am / geospatial, gis, visualization