Simon Willison’s Weblog

Subscribe

Monday, 8th January 2018

Himalayan Database: From Visual FoxPro GUI to JSON API with Datasette (via) The Himalayan Database is a compilation of records for all expeditions that have climbed in the Nepalese Himalaya, originally compiled by journalist Elizabeth Hawley over several decades. The database is published as a Visual FoxPro database—here Raffaele Messuti‏ provides step-by-step instructions for extracting the data from the published archive, converting them to CSV using dbfcsv and then converting the CSVs to SQLite using csvs-to-sqlite so you can browse them using Datasette.

# 7:26 pm / csv, datasette

Statistical NLP on OpenStreetMap. libpostal is ferociously clever: it’s a library for parsing and understanding worldwide addresses, built on top of a machine learning model trained on millions of addresses from OpenStreetMap. Al Barrentine describes how it works in this fascinating and detailed essay.

# 7:33 pm / machine-learning, nlp, openstreetmap

[On Meltdown's impact on hosting costs] The reality is that we have been living with borrowed performance. The new reality is that security is too important and can not be exchanged for speed. Time to profile, tune and optimize.

Miguel de Icaza‏

# 7:35 pm / migueldeicaza, security

csvkit. “A suite of command-line tools for converting to and working with CSV”—includes a huge range of utilities for things like converting Excel and JSON to CSV, grepping, sorting and extracting a subset of columns, combining multiple CSV files together and exporting CSV to a relational database. Worth reading through the tutorial which shows how the different commands can be piped together.

# 9:03 pm / csv, datasette

2018 » January

MTWTFSS
1234567
891011121314
15161718192021
22232425262728
293031