Simon Willison’s Weblog

Blogmarks

minixsv (via) As far as I can tell, this is the only library that can validate XML using pure Python (no C extension required). I’d be extremely happy if someone would write a pure Python library (or one that only depends on ElementTree, which is included in the standard library) for validating XML against a Relax NG Compact syntax schema. Even DTD validation would be better than nothing!

# 12th August 2009, 4:59 pm / elementtree, minixsv, python, relaxng, validation, xml, xmlschema

Yahoo! Term Extraction and Contextual Web Search services to be discontinued. The official closure date is August 31st. Term extraction was really useful—thankfully there are a number of decent alternatives such as Zemanta, OpenCalais and topia.termextract.

# 12th August 2009, 11:57 am / opencalais, termextractor, topia, web-services, yahoo, zemanta

topia.termextract. Impressive Python term extraction library (similar to the various term extraction web APIs but you can run it on your own hardware), incorporating a Parts-Of-Speech tagging algorithm.

# 10th August 2009, 9:26 pm / nlp, python, termextraction, topia

tr.im is “discontinuing service”. “However, all tr.im links will continue to redirect, and will do so until at least December 31, 2009. Your tweets with tr.im URLs in them will not be affected.” These two statements seem to contradict each other. Will tr.im URLs in tweets stop working after December 31st or not? Any chance they could hand the domain over to the Internet Archive? At any rate, this is exactly why centralised URL shorteners are a harmful trend.

# 10th August 2009, 11:06 am / internet-archive, redirects, trim, twitter, urls, urlshorteners

Richard Jones: Something I’m working on... Python’s with statement appears to provide just enough syntactic sugar to create some really interesting DSL-style APIs—here’s a very promising example for laying out GUI applications.
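To illustrate the general idea (this is not Richard Jones's code), here is a minimal sketch of how the with statement can express nesting in a layout API. The `Widget` class and `build_dialog` function are hypothetical names invented for this example; no real GUI toolkit is involved, only a tree of plain objects.

```python
class Widget:
    """Hypothetical widget: anything created inside a `with` block
    becomes a child of the widget that opened the block."""

    _open = []  # stack of widgets whose `with` block is currently active

    def __init__(self, kind, label=None):
        self.kind = kind
        self.label = label
        self.children = []
        # Attach to the innermost open container, if there is one.
        if Widget._open:
            Widget._open[-1].children.append(self)

    def __enter__(self):
        Widget._open.append(self)
        return self

    def __exit__(self, exc_type, exc, tb):
        Widget._open.pop()
        return False  # never swallow exceptions


def build_dialog():
    # Indentation in the source mirrors the nesting of the interface.
    with Widget("window", "Save changes?") as window:
        with Widget("row"):
            Widget("button", "OK")
            Widget("button", "Cancel")
    return window
```

The appeal is that the shape of the code matches the shape of the interface, with `__enter__` and `__exit__` doing the bookkeeping that would otherwise need explicit `add_child` calls.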

# 7th August 2009, 3:47 pm / dsl, gui, python, richard-jones, with

How to avoid ads in gmail. “After extensive testing I’ve discovered you need 1 catastrophic event or tragedy for every 167 words in the rest of the email.”

# 31st July 2009, 1:40 am / ads, gmail

Making Image Overlays Easy with GGroundOverlay and GGeoXML (via) Surprisingly, there doesn’t appear to be a good online tool for helping align an overlay image with a Google Map and exporting the result as a KML file. This is the best I could find—Yahoo! used to have a tool called MapMixer but it doesn’t seem to exist any more.

# 30th July 2009, 10:58 pm / google-maps, maps, overlays

Collection: Search Patterns. Peter Morville’s enormous collection of screenshots of search engine interfaces.

# 30th July 2009, 12:35 pm / design, patterns, peter-morville, search, ui, usability

Today’s News and Yahoo!’s Developer Program. “For SearchMonkey and BOSS, we currently do not have anything concrete to tell you” ... “We wanted to let you know that today’s news does not affect these products [YUI, YQL, Pipes]”.

# 30th July 2009, 12:20 pm / boss, pipes, searchmonkey, yahoo, yahoo-pipes, ydn, yql, yui

Building Rome in a Day (via) “The first system capable of city-scale reconstruction from unstructured photo collections”: computer vision techniques used to construct 3D models of cities using tens of thousands of photos from Flickr. Reminiscent of Microsoft PhotoSynth.

# 29th July 2009, 3:41 pm / 3d, computer-vision, flickr, photos, photosynth, research, rome

Django: Security updates released. A fix for a directory traversal attack in the Django development server (the one with the big “never run this in production” warnings in the documentation). It is also a reminder that the release of 1.1 means that 0.96, released over two years ago, has reached end of life and will not receive any further bug fixes after the just-released 0.96.4.
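For readers unfamiliar with this class of bug, here is a minimal sketch of a directory traversal guard using only the standard library. It is not Django's actual patch; `resolve_under_root` is a hypothetical helper written for this example, and it assumes the server maps a requested URL path onto a single document root.

```python
import os


def resolve_under_root(root, requested):
    """Map a requested path onto `root`, refusing anything that escapes it.

    Hypothetical helper for illustration; raises ValueError on traversal.
    """
    root = os.path.realpath(root)
    # Strip leading slashes so the request is always treated as relative.
    candidate = os.path.realpath(os.path.join(root, requested.lstrip("/")))
    # After normalisation, the result must still live under the root.
    if os.path.commonpath([root, candidate]) != root:
        raise ValueError("requested path escapes the document root")
    return candidate
```

The key step is normalising the joined path before checking it, so that sequences such as `../` are resolved first and cannot slip past a naive prefix test.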

# 29th July 2009, 1:45 pm / django, python, security

Toy Chest: Online or Downloadable Tools for Building Projects (via) “Toy Chest collects online or downloadable software tools/thinking toys that humanities students and others without programming skills (but with basic computer and Internet literacy) can use to create interesting projects”—a fantastic list compiled by the English Department at UCSB.

# 29th July 2009, 12:12 pm / tools, toychest, ucsb

Django 1.1 release notes (via) Django 1.1 is out! Congratulations to everyone who worked on this, it’s a fantastic release. New features include aggregate support in the ORM, proxy models, deferred fields and some really nice admin improvements. Oh, and the testing framework is now up to 10 times faster thanks to smart use of transactions.

# 29th July 2009, 9:34 am / aggregates, django, django-admin, open-source, orm, python, releases

JSONP Memory Leak. Neil Fraser advocates iterating over and deleting every property on a JSONP script DOM node after you removeChild it from the DOM, to protect against memory leaks of “in excess of 15 MB per hour”.

# 28th July 2009, 12:46 pm / javascript, jsonp, memoreleaks, neil-fraser

My Sys-Con Nightmare. This is just ridiculous. Don’t speak at or attend Sys-Con conferences (which include AJAXWorld, the Cloud Computing Expo and Ajax in the Cloud), don’t write for or buy their journals (including AJAXWorld Magazine, JDJ and .NET Developer’s Journal), and don’t visit or advertise on any of their sites.

# 28th July 2009, 12:39 pm / ajaxworld, aral-balkan, boycott, syscon

NASA NEBULA Services (via) NASA’s new NEBULA cloud computing platform appears to be built entirely on open source infrastructure, including Python, Django, Fabric, Eucalyptus, RabbitMQ, Trac and Solr.

# 28th July 2009, 12:10 pm / cloud-computing, django, eucalyptus, fabric, nasa, nebula, open-source, python, rabbitmq, solr, trac

Fabric, Django, Git, Apache, mod_wsgi, virtualenv and pip deployment. I’m slowly working my way through this stack at the moment—next stop, fabric.

# 28th July 2009, 11:56 am / apache, deployment, django, fabric, gareth-rushgrove, git, modwsgi, pip, python, virtualenv, wsgi

Learning to compile things from source (on Unix/Linux/OSX). I asked on serverfault.com for tips on learning how to solve configure/make/install problems on my own, and got some extremely useful replies.

# 27th July 2009, 4:21 pm / compiling, linux, macos, questions, serverfault

Why we migrated from MySQL to MongoDB. Includes some useful information on MongoDB’s limitations—for example, running many different collections can waste disk space and repairing large datasets or bulk deleting many rows can block and lock the database for the duration of the operation.

# 27th July 2009, 10:49 am / databases, documentstores, mongodb, mysql

AdSense for Feeds: What’s all the hubbub about PubSubHubbub? “Today we’re happy to announce initial support in FeedBurner for the PubSubHubbub protocol.”

# 24th July 2009, 6:45 pm / feedburner, google, pubsubhubbub, pushbutton, realtimeweb

The Pushbutton Web: Realtime Becomes Real. Anil Dash is excited by the potential for PubSubHubbub and Webhooks to make near-real-time scalable event publishing accessible to regular web developers. So am I.

# 24th July 2009, 6:30 pm / anil-dash, pubsubhubbub, pushbutton, realtime, realtimeweb, webhooks

MoD sticks with insecure browser. Tom Watson MP used parliamentary written answers to find out that the majority of government departments still require their staff to use IE6, and not all of them have upgrade plans to 7 or 8. Not a single department considered an alternative browser. “Many civil servants use web browsers as a tool of their trade. They’re as important as pens and paper. So to force them to use the most decrepit browser in the world is a rare form of workplace cruelty that should be stopped.”

# 24th July 2009, 10:18 am / browsers, civilservice, politics, tom-watson, ukgovernment

EtherPad. Outstanding implementation of an online real-time collaborative text editor—basically SubEthaEdit in your browser. I can see myself using this a lot.

# 24th July 2009, 12:35 am / appjet, comet, etherpad, javascript, realtime, subethaedit

xmlwitch. An XML building library for Python that doesn’t suck (I love ElementTree for parsing XML, but I’ve never really liked it for generation). Makes smart use of the with statement.
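As a rough illustration of why the with statement suits XML generation, here is a small standard-library sketch. It is not xmlwitch's API; `XmlBuilder`, its `element` and `text` methods, and `build_feed` are hypothetical names made up for this example.

```python
from contextlib import contextmanager
from xml.sax.saxutils import escape, quoteattr


class XmlBuilder:
    """Hypothetical builder: each `with` block opens a tag and closes it on exit."""

    def __init__(self):
        self._parts = []

    @contextmanager
    def element(self, name, **attrs):
        attr_text = "".join(
            " %s=%s" % (key, quoteattr(value)) for key, value in attrs.items()
        )
        self._parts.append("<%s%s>" % (name, attr_text))
        yield self
        # Reached only when the block finishes without an exception.
        self._parts.append("</%s>" % name)

    def text(self, value):
        self._parts.append(escape(value))

    def __str__(self):
        return "".join(self._parts)


def build_feed():
    xml = XmlBuilder()
    with xml.element("feed", lang="en"):
        with xml.element("title"):
            xml.text("Blogmarks & links")
    return str(xml)
```

Because closing tags are emitted when each block exits, a tag cannot be left unclosed by accident, and the indentation of the Python source follows the nesting of the document.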

# 24th July 2009, 12:33 am / python, withstatement, xml, xmlwitch

Webhooks behind the firewall with Reverse HTTP. Hookout is a Ruby/Rack adapter that lets you serve a web application from behind a firewall, by binding to a Reverse HTTP proxy running on the internet (such as the free one provided by reversehttp.net). Useful for far more than just webhooks, this means you can easily expose any Ruby web service to the outside world. An implementation of this as a general purpose proxy server would make it useful for applications written in any language.

# 22nd July 2009, 1:46 pm / comet, hookout, reversehttp, ruby, webhooks

Django 1.1 release candidate available. If all goes well, the final release will be out next week.

# 22nd July 2009, 12:19 pm / django, python, releasecandidate

Fancy Fast Food (via) “These photographs show extreme makeovers of actual fast food items purchased at popular fast food restaurants.”

# 22nd July 2009, 11:51 am / fancyfastfood, fastfood, food

moddims (via) Apache 2 module which exposes ImageMagick as a URL-driven service, allowing you to request an image from a whitelisted host server and resize, thumbnail or alter the quality of it.

# 21st July 2009, 6:18 pm / apache, imagemagick, images, moddims, resizing, thumbnails
