Simon Willison’s Weblog

17 items tagged “machinelearning”

jantic/DeOldify (via) “A Deep Learning based project for colorizing and restoring old images”. Delightful (and well documented) project that uses a Self-Attention Generative Adversarial Network to colorize old black and white photos, with extremely impressive results. Built on an older version of the fastai library, and trained by running for several days on a GTX 1080 Ti graphics card. # 2nd November 2018, 11:13 am

Reinforcement Learning with Prediction-Based Rewards (via) Fascinating result: by teaching a reinforcement learning agent that plays video games to optimize for “unfamiliar states”—states where it cannot predict what will happen next—the agent does a much better job of playing some games. “... for the first time exceeds average human performance on Montezuma’s Revenge. RND achieves state-of-the-art performance, periodically finds all 24 rooms and solves the first level without using demonstrations or having access to the underlying state of the game.” # 31st October 2018, 11:51 pm
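
The core trick here, Random Network Distillation, is simple to sketch: train a predictor network to match a fixed, randomly initialised target network, and use the prediction error as a bonus reward. The error is large for unfamiliar states and shrinks as states become familiar. A rough PyTorch sketch of that idea (the network sizes and names are illustrative, not taken from the paper):

```python
# Minimal sketch of a prediction-based intrinsic reward (RND-style), assuming
# PyTorch and a flat observation vector; network sizes are illustrative only.
import torch
import torch.nn as nn

obs_dim, feat_dim = 64, 32

# Fixed, randomly initialised target network: never trained.
target_net = nn.Sequential(nn.Linear(obs_dim, 128), nn.ReLU(), nn.Linear(128, feat_dim))
for p in target_net.parameters():
    p.requires_grad_(False)

# Predictor network: trained to match the target's output on visited states.
predictor_net = nn.Sequential(nn.Linear(obs_dim, 128), nn.ReLU(), nn.Linear(128, feat_dim))
optimizer = torch.optim.Adam(predictor_net.parameters(), lr=1e-4)

def intrinsic_reward(obs_batch):
    """Prediction error is high for unfamiliar states, so it works as a bonus reward."""
    with torch.no_grad():
        target_feats = target_net(obs_batch)
    pred_feats = predictor_net(obs_batch)
    error = ((pred_feats - target_feats) ** 2).mean(dim=1)

    # Train the predictor so that familiar states stop being rewarded.
    loss = error.mean()
    optimizer.zero_grad()
    loss.backward()
    optimizer.step()

    return error.detach()  # add this (scaled) to the environment reward

# Example: a batch of 8 random "observations".
print(intrinsic_reward(torch.randn(8, obs_dim)))
```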

Automatically playing science communication games with transfer learning and fastai

This weekend was the 9th annual Science Hack Day San Francisco, which was also the 100th Science Hack Day held worldwide.

[... 1174 words]

Notebook: How to build a Teachable Machine with TensorFlow.js (via) This is a really cool Observable notebook. It explains how to build image classification that runs in the browser on top of TensorFlow.js, and includes interactive demos that hook into your webcam and let you hold up items and use them to train a classifier. Since it’s built on Observable, every single underlying line of source code is available to browse as part of the essay. # 20th June 2018, 9:10 pm

Text Embedding Models Contain Bias. Here’s Why That Matters (via) Excellent discussion from the Google AI team of the enormous challenge of building machine learning models without accidentally encoding harmful bias in a way that cannot be easily detected. # 17th April 2018, 8:54 pm

Suppose a runaway-success novel/TV/film franchise has “Bob” as the evil bad guy. Reams of fanfiction are written with “Bob” doing horrible things. People endlessly talk about how bad “Bob” is on Twitter. Even the New York Times writes about Bob’s latest depredations when he plays off current events. Your name is Bob. Suddenly all the AIs in the world associate your name with evil, death, killing, lying, stealing, fraud, and incest. AIs silently, slightly ding your essays, loan applications, Uber driver applications, and everything you write online. And no one believes it’s really happening. Or the powers that be think it’s just a little accidental damage, because overall the AI is still doing a great job of sentiment analysis and fraud detection.

Daniel Von Fange # 17th April 2018, 8:51 pm

BearID: Bear Face Detector. Comprehensive tutorial on building a computer vision system to identify faces of bears, using dlib and the Histogram of Oriented Gradients (HOG) technique. Bears! # 1st March 2018, 5:31 pm

A Promenade of PyTorch. Useful overview of the PyTorch machine learning library from Facebook AI Research, described as “a Python library enabling GPU-accelerated tensor computation”. Similar to TensorFlow, but where TensorFlow requires you to explicitly construct an execution graph, PyTorch lets you write regular Python code (if statements, for loops etc.) and constructs the execution graph for you as that code runs. # 21st February 2018, 5:31 am
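
A tiny sketch of what that define-by-run style looks like in practice (illustrative only):

```python
# Sketch of PyTorch's define-by-run style: ordinary Python control flow
# (a loop and an if statement) builds the graph as the code executes.
import torch

x = torch.randn(3, requires_grad=True)
y = x
for _ in range(3):            # a regular Python loop
    y = y * 2
if y.sum() > 0:               # a regular Python if statement
    z = y.relu().sum()
else:
    z = (-y).sum()

z.backward()                  # gradients flow through whichever branch actually ran
print(x.grad)
```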

6M observations total! Where has iNaturalist grown in 80 days with 1 million new observations? Citizen science app iNaturalist is seeing explosive growth at the moment—they’ve been around for nearly a decade but 1/6 of the observations posted to the site were added in just the past few months. Having tried the latest version of their iPhone app, it’s easy to see why: snap a photo of some nature, upload it to the app, and it will use surprisingly effective machine learning to suggest the genus or even the individual species. Submit the observation and within a few minutes other iNaturalist community members will confirm the identification or suggest a correction. It’s brilliantly well executed and an utter delight to use. # 28th January 2018, 8:18 pm

Statistical NLP on OpenStreetMap. libpostal is ferociously clever: it’s a library for parsing and understanding worldwide addresses, built on top of a machine learning model trained on millions of addresses from OpenStreetMap. Al Barrentine describes how it works in this fascinating and detailed essay. # 8th January 2018, 7:33 pm
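
There are Python bindings for it (pypostal); a quick illustration of what using them looks like, assuming both libpostal and the bindings are installed (the address and the outputs shown in comments are just examples):

```python
# Requires the libpostal C library plus its Python bindings ("pypostal");
# the addresses below are only illustrative inputs.
from postal.parser import parse_address
from postal.expand import expand_address

print(parse_address("781 Franklin Ave Crown Heights Brooklyn NY 11216 USA"))
# -> a list of (value, label) pairs along the lines of
#    [('781', 'house_number'), ('franklin ave', 'road'), ('crown heights', 'suburb'), ...]

print(expand_address("Quatre-vingt-douze Ave des Champs-Élysées"))
# -> normalised candidate strings, e.g. "92 avenue des champs elysees"
```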

Indexes are models: a B-Tree-Index can be seen as a model to map a key to the position of a record within a sorted array [...] Our initial results show, that by using neural nets we are able to outperform cache-optimized B-Trees by up to 70% in speed while saving an order-of-magnitude in memory over several real-world data sets.

The Case for Learned Index Structures # 11th December 2017, 6:25 am
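
The core idea can be sketched in a few lines: fit a model that maps a key to its approximate position in the sorted array, record the worst-case error seen during training, and do the final correction with a search confined to that error window. A toy version, using a linear fit in place of the paper's neural networks:

```python
# Toy sketch of the "index as model" idea: fit a model that predicts the
# position of a key in a sorted array, then correct within the observed
# error bound. A linear fit stands in for the paper's neural networks.
import numpy as np

keys = np.sort(np.random.lognormal(size=100_000))       # sorted keys
positions = np.arange(len(keys))

# "Train": least-squares fit of position ~ a*key + b.
a, b = np.polyfit(keys, positions, deg=1)
predicted = a * keys + b
max_err = int(np.ceil(np.max(np.abs(predicted - positions))))  # worst-case slack

def lookup(key):
    guess = int(round(a * key + b))
    lo = max(0, guess - max_err)
    hi = min(len(keys), guess + max_err + 1)
    # Final correction: binary search only inside the model's error window.
    return lo + np.searchsorted(keys[lo:hi], key)

i = lookup(keys[1234])
assert keys[i] == keys[1234]
```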

deeplearn.js imagenet webcam demo (via) This is pretty astonishing... deeplearn.js is a Google Brain research tool that implements a GPU-accelerated neural network in browser-friendly JavaScript (using WebGL fragment shaders to run the algorithms). This demo hooks into your webcam and runs the SqueezeNet image recognition model against it, showing classification in real-time and providing a live-updating visualization of the different layers of the network. # 5th December 2017, 11:15 pm

Feature Visualization (via) Another gorgeous paper published on Distill, the journal that prides itself on including interactive visualizations to help provide clear explanations of machine learning. # 7th November 2017, 8:48 pm

How Adversarial Attacks Work. Adversarial attacks against machine learning classifiers involve constructing an input that deliberately produces the wrong classification. This article shows how these can be constructed, and includes examples generated using PyTorch which produce a sports car that gets identified as a toaster and a photo of Sylvester Stallone that gets classified as Keanu Reeves. # 2nd November 2017, 8:25 pm
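
The simplest such construction is the Fast Gradient Sign Method: take the gradient of the loss for a target class with respect to the input pixels, then nudge every pixel a small step in the direction that favours that class. A minimal PyTorch sketch (the model choice, the epsilon value and the ImageNet class index are illustrative, not the article's exact setup):

```python
# Minimal Fast Gradient Sign Method (FGSM) sketch in PyTorch, one of the
# simplest ways to build the kind of adversarial inputs the article describes.
# Assumes torchvision is installed; `image` is random data standing in for a
# real preprocessed photo.
import torch
import torchvision.models as models

model = models.resnet18(pretrained=True).eval()
image = torch.rand(1, 3, 224, 224, requires_grad=True)   # stand-in input
target_class = torch.tensor([859])   # 859 is commonly listed as "toaster" in ImageNet

# Gradient of the loss for the *target* class with respect to the input pixels.
loss = torch.nn.functional.cross_entropy(model(image), target_class)
loss.backward()

# Step the pixels a small amount against that gradient to push the
# prediction toward the target class.
epsilon = 0.05
adversarial = (image - epsilon * image.grad.sign()).clamp(0, 1).detach()

print("original prediction:   ", model(image).argmax().item())
print("adversarial prediction:", model(adversarial).argmax().item())
```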

Oxford Deep NLP 2017 course (via) Slides, course description and links to lecture videos for the 2017 Deep Natural Language Processing course at the University of Oxford presented by a team from Google DeepMind. # 31st October 2017, 8:39 pm

Hey Siri: An On-device DNN-powered Voice Trigger for Apple’s Personal Assistant (via) “The “Hey Siri” detector uses a Deep Neural Network (DNN) to convert the acoustic pattern of your voice at each instant into a probability distribution over speech sounds. It then uses a temporal integration process to compute a confidence score that the phrase you uttered was “Hey Siri”. If the score is high enough, Siri wakes up.” # 20th October 2017, 3:48 am
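
Purely as an illustration of those two stages (per-frame probabilities from a DNN, then temporal integration into a score compared against a threshold), here is a toy sketch; none of the numbers or structure reflect Apple's actual implementation:

```python
# Illustrative sketch only (not Apple's implementation): per-frame phone
# probabilities stand in for the DNN output, and a sliding window integrates
# them into a single "Hey Siri" confidence score.
import numpy as np

n_frames, n_phones = 100, 20
rng = np.random.default_rng(0)

# Stand-in for the DNN output: one probability distribution over speech
# sounds per short audio frame.
frame_probs = rng.dirichlet(np.ones(n_phones), size=n_frames)

# The phone sequence that spells the trigger phrase (made-up indices).
trigger_phones = [3, 7, 7, 12, 5]

def confidence(frame_probs, trigger_phones, window=30):
    """Best average log-probability of the trigger phones over a sliding window."""
    log_p = np.log(frame_probs + 1e-9)
    best = -np.inf
    for start in range(len(frame_probs) - window + 1):
        # Split the window evenly across the trigger phones and score each chunk.
        chunks = np.array_split(np.arange(start, start + window), len(trigger_phones))
        score = np.mean([log_p[chunk, phone].mean()
                         for chunk, phone in zip(chunks, trigger_phones)])
        best = max(best, score)
    return best

THRESHOLD = -2.5   # made-up; a real system tunes this to trade off misses vs false alarms
print(confidence(frame_probs, trigger_phones) > THRESHOLD)
```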

How Dopplr learns. Dopplr uses global and personal trip histories to disambiguate place names, and your friends’ schedules to help disambiguate dates in airline confirmation emails. # 23rd July 2008, 4:17 pm