Simon Willison’s Weblog

Subscribe

Blogmarks tagged python in 2010

Filters: Type: blogmark × Year: 2010 × python × Sorted by date


HotQueue. A super-simple Python work queue using Redis. The API is neat, and makes clever use of generators for blocking consumption of queue items. # 22nd December 2010, 11:51 am

Bleach, HTML sanitizer and auto-linker. HTML sanitisation is notoriously difficult to do correctly, but Bleach (a Python library) looks like an excellent effort. It uses the html5lib parsing library to deal with potentially malformed HTML, uses a whitelist rather than a blacklist and has a neat feature for auto-linking URLs that is aware of the DOM (so it won’t try to auto-link a URL that is already wrapped in a link element). It was written by the Mozilla team for addons.mozilla.org and support.mozilla.org so it should be production ready. # 25th October 2010, 1:32 pm

ijson. A SAX-style streaming JSON parser for Python, using ctypes to talk to the yajl C library. # 22nd September 2010, 9:59 pm

Writing your own traceroute in Python. How to implement traceroute in Python, using the low-level socket module. # 9th August 2010, 12:58 pm

What to do when PyPI goes down. My deployment scripts tend to rely on PyPI these days (they install dependencies in to a virtualenv) which makes me distinctly uncomfortable. Jacob explains how to use the PyPI mirrors that are starting to come online, but that won’t help if the PyPI listing links to an externally hosted file which starts to 404, as happened with the python-openid package quite recently (now fixed). The comments on the post discuss workarounds, including hosting your own PyPI mirror or bundling tar.gz files of your dependencies with your project. # 21st July 2010, 10:19 am

simplegeo’s python-oauth2. The Python OAuth library scene is frighteningly complicated at the moment. This seems to be the most actively maintained, and the readme includes working example code for talking to the Twitter API (including integration with Django auth). # 18th July 2010, 5:22 pm

MapOSMatic. Clever service built on top of OpenStreetMap, which renders double sided city maps with a map and grid on one size and an A-Z street name index on the other. Runs on top of Mapnik, PostGIS and Cairo, with a few thousand additional lines of Python and Django. # 11th July 2010, 12:15 pm

python/trunk/Lib/httplib.py in 1994 (via) Python’s original httplib implementation, checked in by Guido 16 years and 4 months ago. Not much younger than the Web itself. # 4th July 2010, 11:25 pm

Python 2.7 Release. Includes three of my favourite features from Python 3: odicts, set literals and set and dictionary comprehensions. # 4th July 2010, 11:21 pm

Slide, Inc.—open source. slide.com have open sourced a whole bunch of interesting Python libraries, most of them involving C extensions or greenlet non-blocking I/O. wirebin (fast binary serialization of native Python types) and meminfo (an extension for finding precise in-memory sizes of Python objects) look particularly interesting. No documentation yet—not even a readme. # 17th June 2010, 8:05 pm

Appending the request URL to SQL statements in Django. A clever frame-walking monkey-patch which pulls the most recent HttpRequest object out of the Python stack and adds the current request.path to each SQL query as an SQL comment, so you can see it in debugging tools such as slow query logs and the PostgreSQL “select * from pg_stat_activity” query. # 2nd June 2010, 9:09 am

django-boss (via) Management commands are one of the few bits of Django that I still have to look up in the documentation whenever I write them. django-boss offers a smart alternative to regular management commands, based around decorators and taking the containing app as the first argument. # 1st June 2010, 10:02 am

Headroid1—a face tracking robot head. Kind of creepy—Ian Ozsvald’s openCV + pySerial motorised camera follows your face around the room, and will soon be able to react to your emotions. # 21st May 2010, 4:59 pm

Django 1.2 release notes (via) Released today, this is a terrific upgrade. Multiple database connections, model validation, improved CSRF protection, a messages framework, the new smart if template tag and lots, lots more. I’ve been using the 1.2 betas for a major new project over the past few months and it’s been smooth sailing all the way. # 17th May 2010, 9:11 pm

Ruby-style Blocks in Python. Yes, yes, yes, yes. A proposal for muli-line lambda support in Python that doesn’t trip up on significant whitespace. If this gets in before the proposed feature freeze I’ll be a very happy Pythonista. UPDATE: This is a post from over a year ago, and it looks like the proposal has since stalled. # 23rd April 2010, 11:19 am

Flask 0.1 Released. Armin’s Flask (a Python microframework built around Werkzeug and Jinja2) is looking pretty solid for a two week old project—extensive documentation, comprehensive unit test support (and example applications with unit tests) and some very tidy API design. # 16th April 2010, 5:12 pm

Introduction to Surlex. A neat drop-in alternative for Django’s regular expression based URL parsing, providing simpler syntax for common path patterns. # 11th April 2010, 7:23 pm

The Onion Uses Django, And Why It Matters To Us. The Onion ported their main site from PHP and Drupal to Django in three months with a team of four developers, including a full migration of their archived content. Their developers answer questions about the switch in this thread on the Django sub-reddit. # 25th March 2010, 6:43 pm

Fun with TextMate and PDB. TextMate bookmarks (against lines in a file) are stored as OS X extended attributes, which can be accessed from Python using the xattr module. Here’s a clever piece of code that uses bookmarks to set breakpoints in the command-line pdb debugger. # 23rd March 2010, 9:48 am

The Web Server Benchmarking We Need. Ian Bicking asks for a WSGI benchmark which emphasises error handling over raw performance—can the server keep serving requests if some of them are CPU bound, I/O bound, wedged or cause a segfault? # 17th March 2010, 10:05 am

Automated deployments with Fabric—tips and tricks. “If it’s not in a Fabric fabfile, it’s not deployable”—I’m slowly applying this philosophy to my personal projects. # 16th March 2010, 11:19 am

Installing PIL on Mac OS X Snow Leopard for use in Google App Engine. PIL installation instructions that actually work... the ’export CC=“gcc -arch i386”’ incantation in particular. Make sure you run setup.py install using the Python version that the App Engine dev tools are using (I ran “sudo /usr/bin/python2.6 setup.py install”). # 15th March 2010, 4:06 pm

Introducing the PyPy 1.2 release. It’s been a long time coming, but 1.2 is the first PyPy release to ship with a Just-in-Time compiler! Performance looks pretty impressive. # 12th March 2010, 11:54 pm

Cache Machine: Automatic caching for your Django models. This is the third new ORM caching layer for Django I’ve seen in the past month! Cache Machine was developed for zamboni, the port of addons.mozilla.org to Django. Caching is enabled using a model mixin class (to hook up some post_delete hooks) and a custom caching manager. Invalidation works by maintaining a “flush list” of dependent cache entries for each object—this is currently stored in memcached and hence has potential race conditions, but a comment in the source code suggests that this could be solved by moving to redis. # 11th March 2010, 7:35 pm

Is johnny-cache for you? “Using Johnny is really adopting a particular caching strategy. This strategy isn’t always a win; it can impact performance negatively”—but for a high percentage of Django sites there’s a very good chance it will be a net bonus. # 2nd March 2010, 11:44 am

jmoiron.net: Johnny Cache. The blog entry announcing Johnny Cache (“a drop-in caching library/framework for Django that will cache all of your querysets forever in a consistent and safe manner”) to the world. # 1st March 2010, 11:48 am

Johnny Cache. Clever twist on ORM-level caching for Django. Johnny Cache (great name) monkey-patches Django’s QuerySet classes and caches the result of every single SELECT query in memcached with an infinite expiry time. The cache key includes a “generation” ID for each dependent database table, and the generation is changed every single time a table is updated. For apps with infrequent writes, this strategy should work really well—but if a popular table is being updated constantly the cache will be all but useless. Impressively, the system is transaction-aware—cache entries created during a transaction are held in local memory and only pushed to memcached should the transaction complete successfully. # 28th February 2010, 10:55 pm

Unit Testing Achievements. A plugin for Python’s nose test runner that adds achievements—“Night Shift: Make a failing suite pass between 12am and 5am.” # 28th February 2010, 3:56 pm

PiCloud. An interesting twist on cloud computing for Python. “import cloud; cloud.call(my_function, arguments)” serialises my_function and its arguments, pushes it up to one of their EC2 servers and hands you back a job ID which you can poll (or block on) for a response. They suggest using it for long running tasks such as web crawling or image processing. # 26th February 2010, 6:25 pm

jacobian’s django-deployment-workshop. Notes and resources from Jacob’s 3 hour Django deployment workshop at PyCon, including example configuration files for Apache2 + mod_wsgi, nginx, PostgreSQL and pgpool. # 19th February 2010, 2:28 pm