Simon Willison’s Weblog

213 items tagged “recovered”

2010

Running Yahoo! Pipes on Google App Engine. “The pipe2py package can compile a Yahoo! Pipe into pure Python source code, or it can interpret the pipe on-the-fly”—makes smart use of Python generators, and comes with tools to run the resulting compiled code on Google App Engine. # 30th October 2010, 12:11 am
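To give a feel for why generators suit this kind of thing, here is a minimal sketch of a pipe built from chained generator stages. This is not pipe2py's actual output: the stage names (`fetch_items`, `filter_items`, `run_pipeline`) and the `FEED` data are hypothetical, chosen only for illustration.

```python
from itertools import islice


def fetch_items(items):
    """Source stage: yield items one at a time (stands in for fetching a feed)."""
    for item in items:
        yield item


def filter_items(items, keyword):
    """Filter stage: pass through only items whose title contains the keyword."""
    for item in items:
        if keyword.lower() in item["title"].lower():
            yield item


def run_pipeline(feed, keyword, limit):
    """Wire the stages together; nothing is computed until the list is built."""
    stages = filter_items(fetch_items(feed), keyword)
    # islice stops pulling from upstream once `limit` items have been produced.
    return list(islice(stages, limit))


# Hypothetical feed data for the example.
FEED = [
    {"title": "Python generators explained"},
    {"title": "Ruby blocks"},
    {"title": "Lazy pipelines in Python"},
    {"title": "More Python"},
]

if __name__ == "__main__":
    for item in run_pipeline(FEED, "python", 2):
        print(item["title"])
```

Each stage only pulls the items it needs from the stage before it, so a truncating stage at the end means the earlier stages never process the rest of the feed.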

mrjob: Distributed Computing for Everybody. Yelp use MapReduce with Hadoop (running on Amazon’s EMR service) to power all sorts of interesting features on the site, including spelling suggestions, review highlights, top searches and “people who viewed X also viewed...”. mrjob is their new open source Python framework for writing MapReduce jobs against the Hadoop streaming API. # 29th October 2010, 11:55 pm
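For anyone who hasn't met the MapReduce shape before, here is a rough single-process sketch of a word count split into a mapper and a reducer. It deliberately does not use mrjob's API or Hadoop: `mapper`, `reducer` and `run_job` are hypothetical names, and the in-memory grouping step stands in for the shuffle that the framework would normally do across machines.

```python
from collections import defaultdict


def mapper(line):
    """Map step: emit a (word, 1) pair for every word on the line."""
    for word in line.lower().split():
        yield word, 1


def reducer(word, counts):
    """Reduce step: combine all the counts emitted for one word."""
    yield word, sum(counts)


def run_job(lines):
    """Run map, group by key (the 'shuffle'), then reduce, all in one process."""
    grouped = defaultdict(list)
    for line in lines:
        for key, value in mapper(line):
            grouped[key].append(value)

    results = {}
    for key in sorted(grouped):
        for out_key, out_value in reducer(key, grouped[key]):
            results[out_key] = out_value
    return results


if __name__ == "__main__":
    print(run_job(["the quick brown fox", "the lazy dog"]))
```

Because the mapper and reducer only ever see one line or one key at a time, the same two functions could in principle be spread over many machines, which is the point of the model.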

Using MySQL as a NoSQL—A story for exceeding 750,000 qps on a commodity server. Very interesting approach: much of the speed difference between MySQL/InnoDB and memcached is due to the overhead involved in parsing and processing SQL, so the team at DeNA wrote their own MySQL plugin, HandlerSocket, which exposes a NoSQL-style network protocol for directly calling the low level MySQL storage engine APIs—resulting in a 7.5x performance increase. # 27th October 2010, 11:10 pm

Bees with machine guns! Low-cost, distributed load-testing using EC2. Great name for a useful project—Bees with machine guns is a Fabric script which fires up a bunch of EC2 instances, uses them to load test a website and then spins them back down again. # 27th October 2010, 11:04 pm

Bleach, HTML sanitizer and auto-linker. HTML sanitisation is notoriously difficult to do correctly, but Bleach (a Python library) looks like an excellent effort. It uses the html5lib parsing library to deal with potentially malformed HTML, uses a whitelist rather than a blacklist and has a neat feature for auto-linking URLs that is aware of the DOM (so it won’t try to auto-link a URL that is already wrapped in a link element). It was written by the Mozilla team for addons.mozilla.org and support.mozilla.org so it should be production ready. # 25th October 2010, 1:32 pm

Firesheep (via) Oh wow. A Firefox extension that makes sniffing for unsecured (non-HTTPS) cookie requests on your current WiFi network, and logging in as that person, a case of clicking a couple of buttons. This was always possible of course, but it’s never been made easy before. Private VPNs are about to become a lot more popular. # 25th October 2010, 9:11 am

Linked Data at the Guardian. The Guardian’s Open Platform API can now be queried by MusicBrainz ID and ISBN, opening up some extremely useful new types of query. # 19th October 2010, 7:11 pm

jQuery 1.4.3 Released. Once again, the thing that impresses me most about this jQuery release is how stable the core API is. Hardly any new methods added, but the existing methods are made faster, more flexible and more predictable. The same has been true for the past several releases as well. It just keeps getting more and more polished. # 17th October 2010, 12:15 am

JS had to “look like Java” only less so, be Java’s dumb kid brother or boy-hostage sidekick. Plus, I had to be done in ten days or something worse than JS would have happened.

Brendan Eich # 16th October 2010, 8:25 am

Annotated backbone.js. Literate programming. # 13th October 2010, 5:24 pm

Backbone.js. As should be expected for a DocumentCloud project, Backbone is a concise, elegant and educational take on the JavaScript MVC pattern. Depends on Underscore.js and plays well with jQuery. # 13th October 2010, 5:23 pm

Tuning Canabalt. Fascinating insight into the parameter tuning needed to make a game feel just right. # 13th October 2010, 8:32 am

Dark Patterns: Forced Continuity example, Audible.com. Dark Patterns are user interfaces that are designed to trick people. I just submitted Audible.com for their habit of signing up users for a $7.49 “gold membership” without making it clear on the checkout screens that this is a recurring monthly charge, not a one-off payment. # 12th October 2010, 10:55 am

Why, for a decade of experience, can we not seem to see the IE 8 zombie coming? It’s not like it’s going to be some big surprise that unless we do something different, we’ll still be supporting it in 2015. That’s right: in 2015, you’ll still be thinking about a browser that doesn’t support canvas or video and doesn’t even have a JITing JS engine.

Alex Russell # 11th October 2010, 11:01 pm

JSON sucks. [...] Every time I need to (correctly) represent a large integer such as 4611686018427387900, I’m forced to do so in a string. It causes me to throw up in mouth a little.

Theo Schlossnagle # 11th October 2010, 11:06 am
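The underlying problem can be sketched in a few lines. Python's own `json` module keeps integers exact, so this example simulates a parser that stores every number as an IEEE 754 double (as JavaScript does) by passing `parse_int=float`. The value `2**62 + 1` and the `"id"` key are just illustrative choices.

```python
import json

# An integer just above 2**62, too large for a double to represent exactly.
big = 2**62 + 1

# Python's json module round-trips arbitrary-size integers without loss.
exact = json.loads(json.dumps(big))

# Simulate a double-only parser: every integer literal becomes a float.
as_double = json.loads(json.dumps(big), parse_int=float)
lossy = int(as_double)

# The usual workaround: send the number as a string and convert it yourself.
as_string = json.loads(json.dumps({"id": str(big)}))
restored = int(as_string["id"])

if __name__ == "__main__":
    print("original:", big)
    print("exact   :", exact)
    print("lossy   :", lossy)
    print("restored:", restored)
```

Doubles near 2**62 are spaced 1024 apart, so the double-only parse silently rounds to the nearest representable value, while the string version survives intact.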

PaintbrushJS. Impressive open source JavaScript library from Dave Shea for applying image filters (sharpen, blur, emboss, greyscale etc) to the canvas element. # 9th October 2010, 11:53 am

What is a Polyfill? Useful new term: a Polyfill is “a shim that mimics a future API providing fallback functionality to older browsers”. # 9th October 2010, 11:48 am

Schneier on Stuxnet. Stuxnet now rivals Wikileaks as the real life plot most likely to have leaked from science fiction. # 9th October 2010, 10:57 am

What is the Open Web? Tantek Çelik describes the three pillars of the open web: open publishing of content, freedom to code and implement the standards needed to access that content, and open access to that content over an unfiltered internet. # 9th October 2010, 10:47 am

It might seem a folly to want to build a gigantic, relatively puny computer at great expense 170 years after its invention. But the message of a completed Analytical Engine is very clear: it’s possible to be 100 years ahead of your own time. With support, this type of “blue skies” thinking can result in fantastic changes to the lives of everyone. Just think of the impact of the computer and ask yourself how different the Victorian world would have been with Babbage Engines at its disposal.

John Graham-Cumming # 6th October 2010, 9:26 am

The 100-year leap. John Graham-Cumming recounts the history of Charles Babbage’s Difference Engine and Analytical Engine, and proposes a project to build a working Analytical Engine 170 years after its invention (the machine built by the Science Museum in London is the Difference Engine). # 6th October 2010, 9:26 am

My First Week with the iPhone. A blind user describes the experience of using VoiceOver on the iPhone, including the joy of discovering the Color Identifier app which speaks the names of colours picked up by the iPhone’s camera. “I used color cues to find my pumpkin plants, by looking for the green among the brown and stone. I spent ten minutes looking at my pumpkin plants, with their leaves of green and lemon-ginger.” # 3rd October 2010, 12:20 pm

Facebook’s Instant Personalization: An Analysis of Fundamental Privacy Flaws (via) Oh FFS. “Instant Personalization” means you visit one of Facebook’s “partner websites” and Facebook instantly tells them your full identity and gives them access to full Facebook connect functionality—without you performing any action other than visiting the site. This will not end well. # 2nd October 2010, 11:53 pm

I think that “bad technology” can kill a startup, but slightly different variations of good technology don’t have much effect. Choose what you know/like best. And Ruby and Python are both in this latter category.

enko on Hacker News # 2nd October 2010, 11:19 am

Velocity: Forcing Gzip Compression. Almost every browser supports gzip these days, but 15% of web requests have had their Accept-Encoding header stripped or mangled, generally due to poorly implemented proxies or anti-virus software. Steve Souders passes on a trick used by Google Search, where an iframe is used to test the browser’s gzip support and set a cookie to force gzipping of future pages. # 30th September 2010, 5:45 pm
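The server-side half of that trick might look something like the sketch below. It is an assumption-laden illustration, not Google's implementation: the cookie name `gzip_ok`, the dict-shaped `headers` and `cookies` arguments and both function names are hypothetical, and the iframe test that would set the cookie in the first place is not shown.

```python
import gzip

# Hypothetical cookie name, set by the (not shown) iframe test when the
# browser proves it can decompress a gzipped response.
FORCE_GZIP_COOKIE = "gzip_ok"


def should_gzip(headers, cookies):
    """Decide whether to compress, trusting the cookie if the header is missing."""
    accept = headers.get("Accept-Encoding", "")
    # Simplified parsing: strip any ";q=..." parameters and ignore their values.
    encodings = [part.split(";")[0].strip().lower() for part in accept.split(",")]
    if "gzip" in encodings:
        return True
    # Header stripped or mangled by a proxy: fall back to the earlier test result.
    return cookies.get(FORCE_GZIP_COOKIE) == "1"


def build_response(body, headers, cookies):
    """Return (body_bytes, extra_response_headers) for the request."""
    if should_gzip(headers, cookies):
        return gzip.compress(body), {"Content-Encoding": "gzip"}
    return body, {}


if __name__ == "__main__":
    page = b"<html>hello</html>" * 50
    # The proxy removed Accept-Encoding, but the cookie says gzip works.
    out, extra = build_response(page, {}, {FORCE_GZIP_COOKIE: "1"})
    print(len(page), "->", len(out), extra)
```

The cookie only ever widens the set of requests that get compressed, so a browser that never passed the test still receives uncompressed pages.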

Content management remains an unsolved problem. Untold billions of dollars (and hours) have been spent building commercial, open source, and custom content management systems since the first Web page was pushed to a Web server using FTP, and yet they all still suck.

Rafe Colburn # 30th September 2010, 12:26 pm

The Web for me is still URLs and HTML. I don’t want a Web which can only be understood by running a JavaScript interpreter against it.

Me, on Twitter # 27th September 2010, 4:37 pm

evercookie—virtually irrevocable persistent cookies (via) Mischievous genius from the chap who created the MySpace worm—evercookie attempts to set an irrevocable cookie using a whole bunch of different methods, including “storing cookies in RGB values of auto-generated, force-cached PNGs using HTML5 Canvas tag to read pixels back out” and an extremely clever scheme built on top of the web history CSS visited link colour vulnerability. # 23rd September 2010, 1:35 am
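The PNG idea boils down to treating pixel colour channels as a place to stash bytes. The sketch below shows only that encoding step, packing a string into a list of (R, G, B) tuples and reading it back. It is not evercookie's code: the function names and the 3-byte length header are my own assumptions, and writing a real PNG, force-caching it and reading it through canvas are all left out.

```python
def encode_to_pixels(value):
    """Pack a string into (R, G, B) tuples, three bytes per pixel."""
    data = value.encode("utf-8")
    if len(data) >= 1 << 24:
        raise ValueError("value too long for a 3-byte length header")
    # Assumed layout: the first pixel holds the payload length, big-endian.
    payload = len(data).to_bytes(3, "big") + data
    # Pad with zero bytes so the payload fills a whole number of pixels.
    payload += b"\x00" * (-len(payload) % 3)
    return [tuple(payload[i:i + 3]) for i in range(0, len(payload), 3)]


def decode_from_pixels(pixels):
    """Reverse encode_to_pixels: flatten the channels and strip the padding."""
    raw = bytes(channel for pixel in pixels for channel in pixel)
    length = int.from_bytes(raw[:3], "big")
    return raw[3:3 + length].decode("utf-8")


if __name__ == "__main__":
    pixels = encode_to_pixels("user-42")
    print(pixels)
    print(decode_from_pixels(pixels))
```

Any lossless image format would preserve these values exactly, which is why the technique relies on PNG rather than something like JPEG.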