Blogmarks
Filters: Sorted by date
Observable Beta (via) Observable just released their beta, and it’s quite something. It’s by Mike Bostock (d3), Jeremy Ashkenas (Backbone, CoffeeScript) and Tom MacWright (Mapbox Studio). The easiest way to describe it is Jupyter notebooks for JavaScript supporting reactive programming—so code is evaluated as you type and you can add interactive widgets (like sliders and canvas views) to construct explorable visualizations on the fly.
SQLite: The Spellfix1 Virtual Table (via) A SQLite extension that lets you create a spellfix1 virtual table which can power “fuzzy” search, by suggesting corrections for misspelled words. I haven’t tried this yet but it looks pretty powerful, including a configurable edit distance and the ability to set up custom “soundslike” terms for words with known unusual spellings.
6M observations total! Where has iNaturalist grown in 80 days with 1 million new observations? Citizen science app iNaturalist is seeing explosive growth at the moment—they’ve been around for nearly a decade but 1/6 of the observations posted to the site were added in just the past few months. Having tried the latest version of their iPhone app it’s easy to see why: snap a photo of some nature and upload it to the app and it will use surprisingly effective machine learning to suggest the genus or even the individual species. Submit the observation and within a few minutes other iNaturalist community members will confirm the identification or suggest a correction. It’s brilliantly well executed and an utter delight to use.
How did the Roman Republic determine its budget? Fascinating answer on the AskHistorians subreddit about how taxation worked in the Roman Empire. Since the republic was almost permanently at war, and was very good at it, no taxes were levied on Roman citizens in Italy from 167 B.C. onwards.
Domains Search for Web: Instant, Serverless & Global (via) The team at Zeit are pioneering a whole bunch of fascinating web engineering architectural patterns. Their new domain name autocomplete search uses Next.js and server-side rendering on first load, then switches to client-side rendering from then on. It can then load results asynchronously over a custom WebSocket protocol as the microservices on the backend finish resolving domain availability from the various different TLD providers.
django-postgres-copy (via) Really neat Django queryset add-on which exposes the PostgreSQL COPY statement for importing (and exporting) CSV data. MyModel.objects.from_csv(“filename.csv”). Built by the team of data journalists at the California Civic Data Coalition.
Nicaraguan Address System (via) “Instead of street names or numbers Nicaraguans use reference points from where they start describing a certain address. [...] There are instances, however, in which the reference points do not exist anymore!”
How to turn a list of JSON objects into a Datasette. ramadis on GitHub cleaned up data on 184,879 crimes reported in Buenos Aires since 2016 and shared them on GitHub as a JSON file. Here are my notes on how to use Pandas to convert JSON into SQLite and publish it using Datasette.
GaretJax/django-click (via) I’ve been using Click to write command-line tools in Python recently (big datasette and csvs-to-sqlite use it) and its a delightful way of composing simple and complex CLI interfaces. I’ve always found Django’s default management command syntax hard to fit in my head—django-click means I can combine the two.
Generating polygon representing a rough 100km circle around latitude/longitude point using Python. A question I posted to the GIS Stack Exchange—I found my own answer using a Python library called geog, then someone else posted a better solution using pyproj.
API 2.0: Log-In with ZEIT, New Docs & More. Here’s Zeit’s write-up of their brand new API 2.0, which adds OAuth support and allows anything that can be done with their command-line tools to be achieved via their public API as well. This is the enabling technology that allowed me to build Datasette Publish.
A SIM Switch Account Takeover (Mine). Someone walked into a T-Mobile store with a fake ID in his name and stole Albert Wenger’s SIM identity, then used it to gain access to his Yahoo mail account, reset his Twitter password and post a tweet boosting a specific cryptocurrency. His accounts with Google Authenticator 2FA stayed safe.
How the industry-breaking Spectre bug stayed secret for seven months. It’s pretty amazing that the bug only became public knowledge a week before the intended embargo date, considering the number of individuals and companies that has to be looped in. The biggest public clues were patches being applied in public to the Linux kernel—one smart observer noted that the page table issue “has all the markings of a security patch being readied under pressure from a deadline.”
Telling stories through your commits. Joel Chippendale’s excellent guide to writing a useful commit history. I spend a lot of time on my commit messages, because when I’m trying to understand code later on they are the only form of documentation that is guaranteed to remain up-to-date against the code at that exact point of time. These tips are clear, concise, teadabale and include some great examples.
Notes on Kafka in Python. Useful review by Matthew Rocklin of the three main open source Python Kafka client libraries as of October 2017.
Incident report: npm. Fascinating insight into the challenges involved in managing a massive scale community code repository. An algorithm incorrectly labeled a legit user as spam, an NPM staff member acted on the report, dependent package installations started failing and because the package had been removed as spam other users were able to try and fix the bug by publishing fresh copies of the missing package to the same namespace.
How to compile and run the SQLite JSON1 extension on OS X. Thanks, Stack Overflow! I’ve been battling this one for a while—it turns out you can download the SQLite source bundle, compile just the json1.c file using gcc and load that extension in Python’s sqlite3 module (or with Datasette’s --load-extension= option) to gain access to the full suite of SQLite JSON functions—json(), json_extract() etc.
ftfy—fix unicode that’s broken in various ways (via) I shipped a small web UI wrapper around the excellent Python FTFY library, which can take broken unicode strings and suggest a sequence of operations that can be applied to get back sensible text.
csvkit. “A suite of command-line tools for converting to and working with CSV”—includes a huge range of utilities for things like converting Excel and JSON to CSV, grepping, sorting and extracting a subset of columns, combining multiple CSV files together and exporting CSV to a relational database. Worth reading through the tutorial which shows how the different commands can be piped together.
Statistical NLP on OpenStreetMap. libpostal is ferociously clever: it’s a library for parsing and understanding worldwide addresses, built on top of a machine learning model trained on millions of addresses from OpenStreetMap. Al Barrentine describes how it works in this fascinating and detailed essay.
Himalayan Database: From Visual FoxPro GUI to JSON API with Datasette (via) The Himalayan Database is a compilation of records for all expeditions that have climbed in the Nepalese Himalaya, originally compiled by journalist Elizabeth Hawley over several decades. The database is published as a Visual FoxPro database—here Raffaele Messuti provides step-by-step instructions for extracting the data from the published archive, converting them to CSV using dbfcsv and then converting the CSVs to SQLite using csvs-to-sqlite so you can browse them using Datasette.
Frontend in 2017: The important parts. Keeping track of developments in the frontend and JavaScript community is pretty much a full time job here days, so I found this summary of trends and developments over 2017 very useful for trying to catch up.
Game developer’s guide to graphical projections (with video game examples), Part 1: Introduction. Absolutely delightful series of illustrated essays by Matej ‘Retro’ Jan explaining how different graphical projections can be used for video game art. Each concept is illustrated by screenshots or gifs from a mixture of games spanning four decades. Reading this was a real treat.
My Internet Mea Culpa. Rick Webb asks “What if we were wrong?” about the internet leading to enormous benefit for humankind. I’ve been worrying about this a lot recently: it turns out the internet provides tools that allow bad people to spread lies, propaganda and discrimination with lethal effectiveness. It’s hard to believe that universal access to the sum of all human knowledge can have negative effects, but there are clearly a whole load of negative effects that us internet utopians failed to predict.
Computer latency: 1977-2017 (via) Dan Luu used a 240 fps camera to investigate the latency between hitting a key and having the character show up on the display across four decades of computing devices... and found 1983’s Apple IIe outperformed everything else. He goes to great lengths to explain why in his accompanying write-up.
Google Maps’s Moat. Gorgeous essay by digital cartographer Justin O’Beirne, exploring how Google Maps has evolved over time and how the fantastically useful “areas of interest” feature (where commercial corridors and business districts are automatically highlighted) uses data derived from a combination of Street View business data and 3D building outlines derived from aerial imagery.
How to compile and run a Pony program using Docker. My notes on using the Docker ponylang/ponyc container to compile and execute a Pony program without needing to install anything (since Docker will download and run the image the first time you run the command).
An Early History of Pony. Pony is an interesting looking new programming language, built around actor-based concurrency on top of a mathematically proved type system. The history of the language makes for interesting reading: it’s based on experience with actor libraries in C at an investment bank, combined with research into type systems at Imperial College London.
How do Ruby & Python profilers work? Julia Evans: “As a precursor to writing a Ruby profiler I wanted to do a survey of how existing Ruby & Python profilers work.”
The Mirai Botnet Was Part of a College Student Minecraft Scheme. Fascinating story about last year’s Mirai botnet, which was originally developed to help corner the Minecraft server market.