Simon Willison’s Weblog

Subscribe

Items in Jan, 2019

Filters: Year: 2019 × Month: Jan × Sorted by date


Everyone is angry about CSS again. I’m not even going to try to summarize the arguments. However it always seems to boil down to the fact that CSS is simultaneously too easy to bother with, yet so hard it needs to be wrapped up in a ball of JavaScript in case it scares the horses.

Rachel Andrew # 30th January 2019, 11:14 pm

Since Mozilla moved on from Firefox OS, its derivatives have shipped on an order of magnitude more devices than during its entire time under Mozilla’s leadership and it has gone on to form the basis of the third largest and fastest growing mobile operating system in the world.

Ben Francis # 28th January 2019, 12:59 pm

websocketd (via) Delightfully clever piece of design: “It’s like CGI, twenty years later, for WebSockets”. Simply run “websocketd --port=8080 my-program” and it will start up a WebSocket server on port 8080 and fire up a new process running your script every time it sees a new WebSocket connection. Standard in and standard out are automatically hooked up to the socket connection. Since it spawns a new process per connection this won’t work well with thousands of connections but for smaller scale projects it’s an excellent addition to the toolbok—and since it’s written in Go there are pre-compiled binaries available for almost everything. # 26th January 2019, 2:38 am

Practical Deep Learning for Coders 2019 (via) The deep learning evening course I took a few months ago has now been shared online in full, and it’s outstanding. “After the first lesson you’ll be able to train a state-of-the-art image classification model on your own data”—can confirm: after just the first lesson I built a bobcat v.s. cougar classifier using photos from iNaturalist.

The biggest thing I learned from the course is how powerful transfer learning is. I used to think you needed a huge amount of data to get good results from deep learning. That’s no longer true: you can take an existing model (eg ResNet for image classification) and train on top of it.

ResNet can classify images as 1,000 classes (house, cat, etc)—training an extra few hundred images of e.g. Bobcats vs Cougars only takes a couple of minutes on a GPU and can give you crazily accurate results.

It works because the pre-trained model can already pick up really subtle details—fur patterns, ear shapes etc—so you only need to train a few more layers on it for it to be able to classify against the patterns in your new set of training images.

And this doesnt just work for image classification! Natural language processing benefits from transfer learning too: take an existing model trained on the entire corpus of Wikipedia (so it knows patterns from sentence structures) and you can build IMDB sentiment analysis on top. That’s in lesson 4. # 26th January 2019, 12:32 am

Subscribe to my blog on Telegram (via) I created a Telegram bot that’s subscribed to my Atom feed, so if you want to get notifications when I post to my blog you can do that using Telegram now. # 20th January 2019, 4:11 am

togeojson (via) Handy JavaScript library and command-mine tool for converting KML and GPX to GeoJSON, by Tom MacWright # 18th January 2019, 11:50 pm

[On 5G] This is the great thing about the decentralized, permissionless innovation of the internet—telcos don’t need to decide in advance what the use cases are, any more than Intel had to decide what the use cases for faster CPUs would be.

Ben Evans # 18th January 2019, 6:55 am

SQLite in 2018: A state of the art SQL dialect (via) In 2018 SQLite gained boolean literals, window functions, filter clauses, upserts and the ability to rename a column. If you want to try it out the latest official datasetteproject/datasette Docker image now bundles SQLite 3.26. # 15th January 2019, 4:21 pm

An Interactive Introduction to Fourier Transforms (via) I love interactive exploitable explanations and this is the best I’ve seen in a while: Jez Swanson breaks down exactly what a Fourier transform does, first by letting you interactively draw and deconstruct wave patterns and then by showing Epicycles andcexplsining JPEG compression. All with not a formula in sight! # 12th January 2019, 2:55 am

Usable Data (via) A Paul Ford essay from February 2016 in which he advocates for SQLite as the ideal format for sharing interesting data. I don’t know how I missed this one—it predates Datasette, but it perfectly captures the benefits that I’m trying to expose with the project. “In my dream universe, there would be a massive searchable torrent site filled with open, explorable data sets, in SQLite format, some with full text search indexes already in place.” # 11th January 2019, 6:33 pm

Exploring search relevance algorithms with SQLite

SQLite isn’t just a fast, high quality embedded database: it also incorporates a powerful full-text search engine in the form of the FTS4 and FTS5 extensions. You’ve probably used these a bunch of times already: many iOS, Android and desktop applications use SQLite under-the-hood and use it to implement their built-in search.

[... 1390 words]

Launching LiteCLI (via) Really neat alternative command-line client for SQLite, written in Python and using the same underlying framework as the similar pgcli (PostgreSQL) and mycli (MySQL) tools. Provides really intuitive autocomplete against table names, columns and other bits and pieces of SQLite syntax. Installation is as easy as “pip install litecli”. # 5th January 2019, 11:16 pm