Simon Willison’s Weblog

Blogmarks

Announcing django-viewtools. A really excellent idea: run ./manage.py viewtools --pdb /path/on/site/ to drop into the Python debugger for a view in your Django project that is raising an error, or use --profile to run the full request cycle for that URL through the profiler.

# 17th February 2009, 9:35 pm / debugging, django, djangoviewtools, eric-moritz, pdb, profiler, python

CloudMade: A Summary of the Future of Mapping. CloudMade are now offering commercially supported APIs on top of OpenStreetMap, including geocoding, routing and tile access libraries in Python/Ruby/Java and a very neat theming tool that lets you design your own map styles. This is really going to kick innovation around OpenStreetMap up a notch.

# 17th February 2009, 11:25 am / cloudmade, geocoding, java, mapping, openstreetmap, python, routing, ruby, tiles

Dulwich. A pure Python implementation of the Git file format and protocols. Reinforces my impression that a key to Git’s success is stable, well designed and documented on-disk formats.

# 16th February 2009, 10:27 pm / dulwich, git, python

“Recover my account” link on the login page. For the record, collecting and verifying e-mail addresses is a VERY good idea, even (especially?) if you accept OpenID. A verified e-mail address is still absolutely the best way to deal with lost passwords or “my OpenID isn’t working”.

# 16th February 2009, 10:22 pm / accounts, email, identity, openid

Write to a Google Spreadsheet from a Python script. I didn’t know Google Spreadsheets could directly serve dynamic images that automatically update when the underlying data changes.

# 16th February 2009, 9:02 pm / google, google-docs, googlespreadsheets, python

Web Hooks and the Programmable World of Tomorrow. Tour de force presentation on Web Hooks by Jeff Lindsay. Tons of really good ideas—provided your application isn’t Flickr sized, there’s a good chance you could implement web hooks pretty cheaply and unleash a huge flurry of creativity from your users. GitHub makes a great case study here.
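
A minimal sketch of what "implementing web hooks" can look like on the sending side, assuming subscribers have registered plain callback URLs and are happy to receive a JSON POST. The names `post_json` and `fire_hooks`, and the payload shape, are hypothetical and not taken from the presentation.

```python
import json
import urllib.request


def post_json(url, payload, timeout=5):
    """POST payload as JSON to url and return the HTTP status code."""
    data = json.dumps(payload).encode("utf-8")
    request = urllib.request.Request(
        url,
        data=data,
        headers={"Content-Type": "application/json"},
        method="POST",
    )
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return response.status


def fire_hooks(callback_urls, event, payload, send=post_json):
    """Notify every registered callback URL about an event.

    Returns a dict mapping each URL to either the value returned by
    send() or the exception it raised.
    """
    results = {}
    for url in callback_urls:
        try:
            results[url] = send(url, {"event": event, "data": payload})
        except Exception as exc:
            # One failing subscriber should not stop the others being notified.
            results[url] = exc
    return results
```

In a real application the loop would more likely run from a background queue than inside the request that triggered the event, which is roughly where the "Flickr sized" caveat comes in.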

# 16th February 2009, 9 pm / apis, flickr, github, jeff-lindsay, webhooks

Google App Engine 1.1.9 boosts capacity and compatibility. Niall summarises the recent changes to App Engine. urllib and urllib2 support plus massively increased upload limits and request duration quotas will make it a whole lot easier to deploy serious projects on the platform.

# 16th February 2009, 8:35 pm / google, google-app-engine, niall-kennedy, urllib

The Django and Ubuntu Intrepid Almanac. Will Larson’s impressively comprehensive guide to configuring and securing an Ubuntu VPS from scratch to run Django, using PostgreSQL and Apache/mod_wsgi behind nginx.

# 14th February 2009, 3:42 pm / apache, django, modwsgi, nginx, postgresql, sysadmin, ubuntu, vps, will-larson

Xapian performance comparison with Whoosh. Whoosh appears to be around four times slower than Xapian for indexing and empty cache searches, but Xapian with a full cache blows Whoosh out of the water (5408 searches/second compared to 26.3). Considering how fast Xapian is, that’s still a pretty impressive result for the pure-Python Whoosh.

# 14th February 2009, 1:15 pm / full-text-search, python, richard-boulton, search, whoosh, xapian

Tokyo Cabinet and Tokyo Tyrant Presentation. By Tokyo Cabinet author Mikio Hirabayashi. The third leg of the Tokyo tripod is Tokyo Dystopia, a full-text search engine which is presumably a modern replacement for Mikio’s older hyperestraier engine.

# 14th February 2009, 11:34 am / full-text-search, hyperestraier, mikiohirabayashi, tokyocabinet, tokyodystopia, tokyotyrant

Tokyo Tyrant Tutorial. Buried at the bottom of the Tokyo Tyrant protocol documentation, this is the best resource I’ve seen yet for getting up and running with the database server (including setting up replication).

# 14th February 2009, 11:29 am / databases, keyvaluepairs, replication, tokyocabinet, tokyotyrant

Specify your canonical. You can now use a link rel=“canonical” to tell Google that a page has a canonical URL elsewhere. I’ve run into this problem a bunch of times (on some sites it really does make sense to have the same content shown in two different places) and this seems like a neat solution that could apply to much more than just metadata for external search engines.

# 14th February 2009, 11:28 am / canonical, google, metadata, relcanonical, search-engines, seo, urls

pytyrant. A pure-python client library for the Tokyo Tyrant binary protocol (used to access Tokyo Cabinet databases over a network). The library appears to be developed by Bob Ippolito and the team at Mochi Media.

# 14th February 2009, 11:19 am / bob-ippolito, mochimedia, python, pytyrant, tokyocabinet, tokyotyrant

Tokyo Cabinet: Beyond Key-Value Store. Useful overview of Yet Another Scalable Key Value Store. Interesting points: multiple backends (hash table, B-Tree, in memory, on disk), a “table” engine which enables more advanced queries, a network server that supports HTTP, memcached or its own binary protocol and the ability to extend the engine with Lua scripts.

# 14th February 2009, 11:17 am / databases, hash, http, keyvaluepairs, lua, memcached, tokyocabinet

Twitter Don’t Click Exploit. Someone ran a successful ClickJacking exploit against Twitter users, using a transparent iframe holding the Twitter homepage with a status message fed in by a query string parameter. This will definitely help raise awareness of ClickJacking! Twitter has now added framebusting JavaScript to prevent the exploit.

# 12th February 2009, 7:56 pm / chris-shiflett, clickjacking, framebusting, javascript, security, twitter

EuroDjangoCon 2009. Tickets are now on sale for the conference, scheduled for 4th-6th of May (not March as I originally said) in Prague (followed by two days of development sprints).

# 12th February 2009, 4:59 pm / django, djangocon, eurodjangocn, prague, python

Whoosh. A brand new, pure-python full text indexing engine (think Lucene). Claims to offer performance in the same league as wrappers to C or Java libraries. If this works as well as it claims it will be an excellent tool for adding search to projects that wish to avoid a dependency on an external engine.

# 12th February 2009, 12:49 pm / full-text-search, lucene, open-source, python, search, whoosh

Django Settings Tip—Setting Relative Paths. This is the first thing I do in every single one of my Django projects—it makes projects relocatable to other machines with just a couple of lines of code. I wouldn’t be at all upset to see it added to the default Django settings.py file created by ./manage.py startproject
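
A minimal sketch of the general technique, assuming the usual approach of deriving every path from the location of the settings file itself. The helper name `path_relative_to` is hypothetical, and the linked tip may spell this differently.

```python
import os


def path_relative_to(settings_file, *parts):
    """Join parts onto the directory that contains settings_file."""
    root = os.path.dirname(os.path.abspath(settings_file))
    return os.path.join(root, *parts)


# In a real settings.py you would pass __file__ as the first argument:
#   MEDIA_ROOT = path_relative_to(__file__, "media")
#   TEMPLATE_DIRS = (path_relative_to(__file__, "templates"),)
```

Because nothing is hard-coded to one machine's directory layout, the project can be checked out anywhere and still find its media and templates.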

# 12th February 2009, 12:30 pm / django, gareth-rushgrove, python, settings

Plaxo sees 92% success rate with OpenID/OAuth hybrid method. Really wish I could have been at the OpenID UX Summit hosted by Facebook yesterday—sounds like an awful lot of important problems are being solved.

# 11th February 2009, 5:20 pm / comcast, facebook, google, openid, plaxo

JsonML (JSON Markup Language). An almost non-lossy serialization format for sending XML as JSON (plain text in between elements is ignored). Uses the (element-name, attribute-dictionary, list-of-children) tuple format, which sadly means many common cases end up taking more bytes than the original XML. Still an improvement on serializations that behave differently when a list of children has only one item in it.
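
To make the shape concrete, here is a rough sketch that follows the description above: name, attribute dictionary, list of children. It is not the JsonML reference implementation, the function names are hypothetical, and it deliberately drops text that sits between sibling elements.

```python
import json
import xml.etree.ElementTree as ET


def to_triple(element):
    """Return [name, attributes, children] for an ElementTree element."""
    children = []
    # Keep text that appears before the first child element.
    if element.text and element.text.strip():
        children.append(element.text)
    for child in element:
        children.append(to_triple(child))
        # child.tail (text between sibling elements) is deliberately dropped.
    return [element.tag, dict(element.attrib), children]


def xml_to_json(xml_text):
    """Parse an XML string and serialize it in the triple shape as JSON."""
    return json.dumps(to_triple(ET.fromstring(xml_text)))
```

An empty `<b/>` comes out as `["b", {}, []]`, which shows why common cases can take more bytes than the XML they came from.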

# 10th February 2009, 3:03 pm / json, jsonml, serialization, xml

Yahoo! Query Language thoughts. An engineer on Google’s App Engine provides an expert review of Yahoo!’s YQL. I found this more useful than the official documentation.

# 9th February 2009, 10:29 pm / google, google-app-engine, yahoo, yql

Open in Browser Firefox Add-on (via) Solves the “application/json wants to download” problem, among others.

# 9th February 2009, 10:24 pm / firefox, json, plugins

A Unix Utility You Should Know About: Pipe Viewer. Useful command line utility that adds a progress bar to any unix pipeline.
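
The underlying idea is simple enough to sketch: copy bytes from input to output unchanged while counting them, and report the running total somewhere else. This is an illustration of the concept only, not how Pipe Viewer is implemented, and `copy_with_progress` is a hypothetical name.

```python
def copy_with_progress(source, sink, report, chunk_size=64 * 1024):
    """Copy source to sink in chunks, calling report(total_bytes) after each.

    source and sink are binary file-like objects. Returns the total number
    of bytes copied.
    """
    total = 0
    while True:
        chunk = source.read(chunk_size)
        if not chunk:
            break
        sink.write(chunk)
        total += len(chunk)
        report(total)
    return total


# As a pipeline stage you might call it like this, sending progress to stderr
# so that it does not pollute the data flowing through stdout:
#   import sys
#   copy_with_progress(
#       sys.stdin.buffer,
#       sys.stdout.buffer,
#       lambda n: sys.stderr.write("\r%d bytes" % n),
#   )
```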

# 9th February 2009, 10:15 pm / cli, pipes, pipeviewer, unix

Facing up to Fonts. Slides and notes from Richard Rutter’s excellent typography presentation at a recent SkillSwap Brighton. Includes some new thinking about the font stack (comma separated list of fonts provided to the font-family property) you should use to get the best possible implementation of a given font on various different platforms.

# 9th February 2009, 9:16 pm / design, fonts, fontstacks, richard-rutter, skillswap, skillswapbrighton, typography

YQL opens up 3rd-party web service table definitions to developers. This really is astonishingly clever: you can create an XML file telling Yahoo!’s YQL service how to map an arbitrary API to YQL tables, then make SQL-style queries against it (including joins against other APIs). Another neat trick: doing a SQL “in” query causes API requests to be run in parallel and recombined before being returned to you.
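
The parallel "in" trick can be pictured as a fan-out followed by a merge. The sketch below shows that pattern with a thread pool. It is an illustration under assumptions, not Yahoo!'s implementation: `run_in_query` and the `fetch_one` callable are hypothetical stand-ins for whatever issues a single API request.

```python
from concurrent.futures import ThreadPoolExecutor


def run_in_query(fetch_one, keys, max_workers=8):
    """Call fetch_one(key) for every key concurrently and merge the rows.

    fetch_one must return a list of rows for one key. Rows come back
    grouped in the same order as keys.
    """
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        # map() runs the calls in parallel but yields results in input order.
        results = list(pool.map(fetch_one, keys))
    # Recombine the per-key result lists into a single result set.
    return [row for rows in results for row in rows]
```

Total latency is then roughly that of the slowest single request rather than the sum of all of them.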

# 9th February 2009, 9:08 pm / apis, sql, yahoo, yql

Introduction to Information Retrieval (via) This looks excellent—a modern guide to implementing search engines written by some of the engineers behind Yahoo! Search. The full text is available online, but it looks like it’s well worth investing in the dead tree edition.

# 9th February 2009, 8:54 pm / books, freebooks, search, yahoo-search

1901EasternTelegraph.jpg (via) A map of undersea telegraph cables as of 1901.

# 9th February 2009, 8:44 pm / cables, maps, undersea

Four reasons why public Facebook status updates won’t kill Twitter. Mike Butcher highlights the importance of “follow” rather than “friend” in social software.

# 9th February 2009, 7:04 pm / facebook, follow, friend, mike-butcher, social-software, twitter

Google App Engine: A roadmap update! Receiving e-mail, background tasks and XMPP. I predict a bunch of sites will start building small parts of their overall functionality on App Engine when some of these features land (much easier than hosting your own custom XMPP server).

# 9th February 2009, 7 pm / cloud-computing, email, google, google-app-engine, python, xmpp
