Simon Willison’s Weblog

August 2009

Aug. 7, 2009

Richard Jones: Something I’m working on... Python’s with statement appears to provide just enough syntactic sugar to create some really interesting DSL-style APIs—here’s a very promising example for laying out GUI applications.
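
To make the idea concrete, here is a minimal sketch of how with blocks can express nesting in a layout API. It is not Richard's code: the `Layout` class, its `container` and `widget` methods, and the dictionary-based widget tree are all hypothetical names invented for illustration, and no real GUI toolkit is involved.

```python
from contextlib import contextmanager


class Layout:
    """Hypothetical builder that records a nested tree of widget descriptions."""

    def __init__(self):
        self.root = {"kind": "root", "children": []}
        self._stack = [self.root]  # the last item is the current parent

    @contextmanager
    def container(self, kind, **options):
        # Entering the block makes this node the current parent;
        # leaving the block restores the previous parent.
        node = {"kind": kind, "options": options, "children": []}
        self._stack[-1]["children"].append(node)
        self._stack.append(node)
        try:
            yield node
        finally:
            self._stack.pop()

    def widget(self, kind, **options):
        # Leaf widgets attach to whichever container is currently open.
        node = {"kind": kind, "options": options, "children": []}
        self._stack[-1]["children"].append(node)
        return node


def build_login_form():
    ui = Layout()
    with ui.container("window", title="Login"):
        with ui.container("row"):
            ui.widget("label", text="Name")
            ui.widget("textbox")
        ui.widget("button", text="OK")
    return ui.root
```

The indentation of the with blocks mirrors the widget hierarchy, which is most of the appeal: the context manager's enter and exit steps do the bookkeeping that would otherwise need explicit parent arguments.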

# 3:47 pm / python, richardjones, with, gui

Aug. 10, 2009

tr.im is “discontinuing service”. “However, all tr.im links will continue to redirect, and will do so until at least December 31, 2009. Your tweets with tr.im URLs in them will not be affected.” These two statements seem to contradict each other. Will tr.im URLs in tweets stop working after December 31st or not? Any chance they could hand the domain over to the Internet Archive? At any rate, this is exactly why centralised URL shorteners are a harmful trend.

# 11:06 am / internet-archive, trim, redirects, twitter, urls, urlshorteners

How Different Groups Spend Their Day. Classy interactive infographic from the New York Times.

# 3:37 pm / new-york-times, visualisation, infographics, interactives

topia.termextract. Impressive Python term extraction library (similar to the various term extraction web APIs but you can run it on your own hardware), incorporating a Parts-Of-Speech tagging algorithm.

# 9:26 pm / python, topia, termextraction, nlp

Aug. 12, 2009

Yahoo! Term Extraction and Contextual Web Search services to be discontinued. The official closure date is August 31st. Term extraction was really useful—thankfully there are a number of decent alternatives such as Zemanta, OpenCalais and topia.termextract.

# 11:57 am / topia, zemanta, opencalais, yahoo, web-services, termextractor

minixsv (via) As far as I can tell, this is the only library that can validate XML using pure Python (no C extension required). I’d be extremely happy if someone would write a pure Python library (or one that only depends on ElementTree, which is included in the standard library) for validating XML against a Relax NG Compact syntax schema. Even DTD validation would be better than nothing!

# 4:59 pm / relaxng, elementtree, minixsv, python, validation, xml, xmlschema

Aug. 13, 2009

Best of OpenStreetMap (via) I keep on telling people OpenStreetMap is this year’s Wikipedia—at its best, it beats commercially available maps. This “best of” site highlights the areas where OSM really shines (the yellow stars)—the German mapping community in particular have produced some outstanding cartography.

# 12:30 pm / openstreetmap, wikipedia, mapping, maps, cartography

SQL pie chart. Generating ASCII art pie charts using the world’s scariest MySQL SELECT statement.

# 1:04 pm / mysql, graphing, asciiart, sql

When we get the tools to do distributed Twitter, etc., we get the tools to communicate in stanzas richer than those allowed by our decades-old email clients. Never mind Apple being anti-competitive, social networks are the peak of monopolistic behaviour today.

Blaine Cook

# 1:06 pm / apple, blaine-cook, distributedsocialnetworks, facebook, socialnetworks, twitter

Scriptlets—Quick web scripts (via) From the prolific Jeff Lindsay, a pastebin-style tool for short server-side scripts written in Python, JavaScript or PHP that executes them within a Google App Engine powered sandbox. The Java code that implements the service is available on GitHub.

# 1:51 pm / github, jeff-lindsay, webhooks, scriptlets, python, javascript, php, googleappengine, appengine, open-source, java

Mandelbrot set in PostgreSQL. Surprisingly short SQL statement that produces an ASCII art Mandelbrot set.

# 2:23 pm / fractals, mandelbrot, postgresql, sql, asciiart

Python logging from multiple processes. Use Python’s socket log handler to send all log messages to a single server—the python-loggingserver project implements such a server as a Twisted application with a handy web interface for viewing the aggregated logs.
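
A rough sketch of the underlying mechanism, using only the standard library rather than python-loggingserver or Twisted. `logging.handlers.SocketHandler` sends each record as a 4-byte big-endian length followed by a pickled dictionary; the receiver below, the `LogRecordHandler` class, the `received` queue and the `demo` function are my own illustrative names, and a real deployment would want a long-running server instead of this in-process demo.

```python
import logging
import logging.handlers
import pickle
import queue
import socketserver
import struct
import threading

received = queue.Queue()  # (logger name, message) pairs seen by the demo server


class LogRecordHandler(socketserver.StreamRequestHandler):
    """Reads length-prefixed pickled log records from one client connection."""

    def handle(self):
        while True:
            header = self.rfile.read(4)
            if len(header) < 4:
                break  # client closed the connection
            (length,) = struct.unpack(">L", header)
            payload = self.rfile.read(length)
            # Unpickling is only safe when the senders are trusted.
            record = logging.makeLogRecord(pickle.loads(payload))
            received.put((record.name, record.getMessage()))


def demo():
    # Port 0 asks the OS for any free port, so the sketch never collides.
    server = socketserver.ThreadingTCPServer(("127.0.0.1", 0), LogRecordHandler)
    port = server.server_address[1]
    threading.Thread(target=server.serve_forever, daemon=True).start()

    logger = logging.getLogger("worker")
    logger.setLevel(logging.INFO)
    logger.propagate = False  # keep the demo quiet on stderr
    handler = logging.handlers.SocketHandler("127.0.0.1", port)
    logger.addHandler(handler)
    try:
        logger.info("job %d finished", 42)
        handler.close()
        return received.get(timeout=5)
    finally:
        logger.removeHandler(handler)
        server.shutdown()
        server.server_close()
```

Each worker process would attach its own `SocketHandler` pointing at the same host and port, and the server decides how to store or display the aggregated records.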

# 11:55 pm / python, logging, twisted

Aug. 14, 2009

Last night I woke up at 2am and realized that there was a fundamental problem with cursor preservation in today’s real-time collaborative applications [...] MobWrite now has what I believe to be the most advanced cursor preservation algorithm available.

Neil Fraser

# 10:38 am / collaboration, mobwrite, neilfraser, realtime

How do you install lxml on OS X Leopard without using MacPorts or Fink? I’ve asked on Stack Overflow... hope I get a good answer.

# 1:04 pm / python, leopard, lxml, stackoverflow, osx

Microsoft backs long life for IE6. Oh FFS... “The software giant said it would support IE6 until 2014—four years beyond the original deadline.”

# 2:53 pm / ie6, ffs, microsoft, ie, browsers

Aug. 17, 2009

On HTML 5 Drag and Drop. Francisco Tolmasky investigated HTML 5 drag and drop, which allows web apps to implement drag and drop between windows and between the browser and the desktop. He found a number of problems with the spec and proposes detailed solutions.

# 12:31 pm / html5, draganddrop, franciscotolmasky, javascript, standards, 280slides

You Deleted Your Cookies? Think Again (via) Flash cookies last longer than browser cookies and are harder to delete. Some services are sneakily “respawning” their cookies—if you clear the regular tracking cookie it will be reinstated from the Flash data next time you visit a page.

# 3:23 pm / cookies, privacy, security, flash, respawning

Aug. 18, 2009

Data Is Journalism: MSNBC.com Acquires Everyblock. Congratulations Adrian, Wilson and the team! Brady Forrest reports the acquisition within the larger context of the rise of data-driven journalism.

# 12:10 pm / adrian-holovaty, wilson-miner, everyblock, msnbc, brady-forrest, datadrivenjournalism

Caching in ASP.NET with the SqlCacheDependency Class. Interesting cache invalidation concept: set up dependencies between cache entries and tables or rows in the database, then use triggers (which I presume are automatically created for you) to clear your cache.
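
The concept can be sketched outside ASP.NET. The example below is not the SqlCacheDependency API: it is a small Python and SQLite illustration in which triggers bump a per-table version counter and the cache polls that counter on each read, rather than having entries cleared directly. The `table_versions` table, the trigger names and the `TableDependentCache` class are all assumptions made for the sketch.

```python
import sqlite3


def connect():
    db = sqlite3.connect(":memory:")
    db.executescript(
        """
        CREATE TABLE products (id INTEGER PRIMARY KEY, name TEXT);
        CREATE TABLE table_versions (table_name TEXT PRIMARY KEY, version INTEGER);
        INSERT INTO table_versions VALUES ('products', 0);

        -- Any write to products bumps its version counter.
        CREATE TRIGGER products_ins AFTER INSERT ON products BEGIN
            UPDATE table_versions SET version = version + 1
            WHERE table_name = 'products';
        END;
        CREATE TRIGGER products_upd AFTER UPDATE ON products BEGIN
            UPDATE table_versions SET version = version + 1
            WHERE table_name = 'products';
        END;
        CREATE TRIGGER products_del AFTER DELETE ON products BEGIN
            UPDATE table_versions SET version = version + 1
            WHERE table_name = 'products';
        END;
        """
    )
    return db


class TableDependentCache:
    """Cache whose entries are only valid for the table version they were built from."""

    def __init__(self, db):
        self.db = db
        self._entries = {}  # key -> (table version, cached value)

    def _version(self, table):
        row = self.db.execute(
            "SELECT version FROM table_versions WHERE table_name = ?", (table,)
        ).fetchone()
        return row[0]

    def get(self, key, table, compute):
        current = self._version(table)
        entry = self._entries.get(key)
        if entry is not None and entry[0] == current:
            return entry[1]  # table unchanged since the value was cached
        value = compute()
        self._entries[key] = (current, value)
        return value
```

The cheap version lookup replaces the expensive query on every cache hit, and any insert, update or delete makes the next read recompute the value.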

# 12:15 pm / aspdotnet, caching, invalidation

It is amazing how much you can accomplish when it doesn't matter who gets the credit.

Harry S Truman

# 12:20 pm / harrytruman

rather baffling finding: POST requests, made via the XMLHTTP object, send header and body data in separate tcp/ip packets [and therefore,] xmlhttp GET performs better when sending small amounts of data than an xmlhttp POST

Iain Lamb

# 12:27 pm / ajax, get, http, iainlamb, performance, post, xmlhttprequest

Aug. 19, 2009

JavaScript cannot save you. Even if it could, you should not let it, for the price of this short-term salvation is the end of what you like about the web.

Alex Russell

# 11:33 am / alex-russell, javascript

Kung Fu People (via) The first site to launch based on the open source Django code from djangopeople.net!

# 11:37 am / kungfu, django-people, open-source, django, python, peter-bengtsson

easy_install no longer working with SourceForge-hosted projects? Unsurprising, since installation software (which is often run as root) that crawls the web and scrapes HTML pages for download links is a horrible, horrible idea.

# 11:38 am / easyinstall, python, sourceforge

How to find un-indexed queries in MySQL, without using the log (via) Use tcpdump(!) to sniff the MySQL protocol and dump out queries that had the “no index used” bit set.
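
For the curious, the bit in question lives in the server status flags that MySQL sends back with a response. The sketch below only shows the flag test itself; extracting `status_flags` from captured packets is not shown, and the `index_warnings` helper is a hypothetical name of mine. The two constants are the values MySQL defines for these status flags.

```python
# Status flag bits as defined by MySQL's client/server protocol.
SERVER_QUERY_NO_GOOD_INDEX_USED = 0x0010
SERVER_QUERY_NO_INDEX_USED = 0x0020


def index_warnings(status_flags):
    """Return the names of the index-related flags set in a status word."""
    warnings = []
    if status_flags & SERVER_QUERY_NO_GOOD_INDEX_USED:
        warnings.append("no good index used")
    if status_flags & SERVER_QUERY_NO_INDEX_USED:
        warnings.append("no index used")
    return warnings
```

A sniffer would pair each query packet with the status word of the response that follows it and report the queries for which this list is non-empty.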

# 11:42 am / tcpdump, mysql, profiling

By Popular Demand, We’re Keeping the Term Extraction Service. Yahoo! aren’t shutting down the term extractor after all. On the one hand, this is a great decision—but this kind of back and forth (dare I say flip-flopping?) really doesn’t help encourage people to build against hosted APIs.

# 11:44 am / yahoo, ydn, termextractor

Aug. 20, 2009

Eulogy to _why. The pseudonymous hacker/artist _why has deleted his online presence, apparently moving on to other things. John Resig explains why _why has been such an inspiration.

# 9:57 am / john-resig, whytheluckystiff

you seem to think i'm random, but i'm only psuedorandom. you would be exactly this way, were you seeded at the very same time and place.

_why

# 10:26 am / whytheluckystiff

Dive Into HTML 5. Mark Pilgrim’s free online book on HTML 5—currently just one chapter on canvas (which neatly illustrates the coordinate system using a diagram rendered using canvas itself) but certain to become an invaluable resource for anyone looking to take advantage of HTML 5.

# 2:40 pm / mark-pilgrim, html5, web-standards, books, canvas

Aug. 22, 2009

CSS 3: Progress! Alex Russell on the exciting new features going into CSS 3, based on real-world implementations in the current generation of browsers. Of particular interest is the new Flexible Box specification, which defines new layout primitives hbox and vbox (as seen in XUL) and is already supported by both WebKit and Gecko.

# 11:52 am / browsers, css, css3, alex-russell, flexiblebox, hbox, vbox, webkit, gecko, standards
