Simon Willison’s Weblog

8 items tagged “machinelearning”

Statistical NLP on OpenStreetMap. libpostal is ferociously clever: it’s a library for parsing and understanding worldwide addresses, built on top of a machine learning model trained on millions of addresses from OpenStreetMap. Al Barrentine describes how it works in this fascinating and detailed essay. # 8th January 2018, 7:33 pm

Indexes are models: a B-Tree-Index can be seen as a model to map a key to the position of a record within a sorted array [...] Our initial results show, that by using neural nets we are able to outperform cache-optimized B-Trees by up to 70% in speed while saving an order-of-magnitude in memory over several real-world data sets.

The Case for Learned Index Structures # 11th December 2017, 6:25 am
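
The "index as a model" idea in that quote can be sketched in a few lines. This is only a toy illustration under stated assumptions: the paper describes neural nets, whereas this sketch swaps in a single least-squares line, and the `LearnedIndex` class and its method names are hypothetical, not taken from the paper. The model predicts where a key sits in a sorted list, and the worst prediction error seen at build time bounds a local binary search.

```python
import bisect


class LearnedIndex:
    """Toy learned index (hypothetical name): a linear model predicts a key's position."""

    def __init__(self, keys):
        if not keys:
            raise ValueError("keys must be non-empty")
        self.keys = sorted(keys)
        n = len(self.keys)
        # Fit position ~= slope * key + intercept by ordinary least squares.
        mean_key = sum(self.keys) / n
        mean_pos = (n - 1) / 2
        var = sum((k - mean_key) ** 2 for k in self.keys)
        cov = sum((k - mean_key) * (i - mean_pos) for i, k in enumerate(self.keys))
        self.slope = cov / var if var else 0.0
        self.intercept = mean_pos - self.slope * mean_key
        # Worst-case prediction error over stored keys bounds the search window.
        self.max_err = max(abs(i - self._predict(k)) for i, k in enumerate(self.keys))

    def _predict(self, key):
        # Round the model output and clamp it to a valid list position.
        pos = round(self.slope * key + self.intercept)
        return min(max(pos, 0), len(self.keys) - 1)

    def lookup(self, key):
        """Return the position of key in the sorted list, or None if absent."""
        guess = self._predict(key)
        lo = max(guess - self.max_err, 0)
        hi = min(guess + self.max_err + 1, len(self.keys))
        # Binary search only inside the error window around the prediction.
        i = bisect.bisect_left(self.keys, key, lo, hi)
        if i < len(self.keys) and self.keys[i] == key:
            return i
        return None
```

The better the model fits the key distribution, the smaller `max_err` becomes and the less searching each lookup needs; for evenly spaced keys the window shrinks to a single slot.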

deeplearn.js imagenet webcam demo (via) This is pretty astonishing... deeplearn.js is a Google Brain research tool that implements a GPU-accelerated neural network in browser-friendly JavaScript (using WebGL fragment shaders to run the algorithms). This demo hooks into your webcam and runs the SqueezeNet image recognition model against it, showing classification in real time and providing a live-updating visualization of the different layers of the network. # 5th December 2017, 11:15 pm

Feature Visualization (via) Another gorgeous paper published on Distill, the journal that prides itself on including interactive visualizations to help provide clear explanations of machine learning. # 7th November 2017, 8:48 pm

How Adversarial Attacks Work. Adversarial attacks against machine learning classifiers involve constructing an input that deliberately produces the wrong classification. This article shows how these can be constructed, and includes examples generated using PyTorch which produce a sports car that gets identified as a toaster and a photo of Sylvester Stallone that gets classified as Keanu Reeves. # 2nd November 2017, 8:25 pm
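
One well-known way to construct such an input is the fast gradient sign method, sketched here against a tiny logistic-regression classifier in plain Python rather than PyTorch. This is not necessarily the construction the linked article uses; the function names, weights and input values are all made up for illustration. Each feature is nudged by `eps` in the direction that increases the classifier's loss for the true label.

```python
import math


def predict(w, b, x):
    """Probability of class 1 under a logistic-regression classifier."""
    z = sum(wi * xi for wi, xi in zip(w, x)) + b
    return 1 / (1 + math.exp(-z))


def fgsm(w, b, x, y, eps):
    """Fast gradient sign method: step each feature by eps to increase the loss."""
    p = predict(w, b, x)
    # Gradient of the cross-entropy loss with respect to the input x.
    grad = [(p - y) * wi for wi in w]
    # Only the sign of each gradient component is used.
    signs = [(g > 0) - (g < 0) for g in grad]
    return [xi + eps * s for xi, s in zip(x, signs)]


# Illustrative (made-up) model and input; the true label is 1.
w, b = [2.0, -3.0, 1.0], 0.1
x = [0.5, -0.2, 0.3]
x_adv = fgsm(w, b, x, y=1, eps=0.5)
```

With these made-up numbers the original input scores above 0.5 for class 1 and the perturbed one scores below it, so the predicted class flips. Image attacks follow the same pattern with a far smaller `eps` per pixel, which is why the altered picture can still look unchanged to a person.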

Oxford Deep NLP 2017 course (via) Slides, course description and links to lecture videos for the 2017 Deep Natural Language Processing course at the University of Oxford, presented by a team from Google DeepMind. # 31st October 2017, 8:39 pm

Hey Siri: An On-device DNN-powered Voice Trigger for Apple’s Personal Assistant (via) “The ‘Hey Siri’ detector uses a Deep Neural Network (DNN) to convert the acoustic pattern of your voice at each instant into a probability distribution over speech sounds. It then uses a temporal integration process to compute a confidence score that the phrase you uttered was ‘Hey Siri’. If the score is high enough, Siri wakes up.” # 20th October 2017, 3:48 am

How Dopplr learns. Dopplr uses global and personal trip histories to disambiguate place names, and your friends’ schedules to help disambiguate dates in airline confirmation emails. # 23rd July 2008, 4:17 pm