Simon Willison’s Weblog

Investigate your MP's expenses. Launched today, this is the project that has been keeping me ultra-busy for the past week—we’re crowdsourcing the analysis of the 700,000+ scanned MP expenses documents released this morning. It’s the Guardian’s first live Django-powered application, and also the first time we’ve hosted something on EC2.

14 comments

  1. Excellent work, very impressed all round.

    Julian Burgess - 19th June 2009 02:02 - #

  2. Very nice.
    Quick bug report; the date parser for line items seems quite strict, and the error message isn't obvious; could it get a red highlight or something?

    Iain Broadfoot - 19th June 2009 02:12 - #

  3. Marvellous job. Thank you for your example. It is true that "think different is not enough".

    Are you going to liberate the code?

    Javier de la Cueva - 19th June 2009 02:21 - #

  4. Amazing, amazing work.

    Stuff like this reminds you of everything that's good and right about the internet.

    How big was the team building this, and how long were they working on it?

    Chris Adams - 19th June 2009 08:52 - #

  5. Quite wonderful.
    How is the EC2 going?

    布里斯班 - 19th June 2009 09:18 - #

  6. Fantastic work: great idea + excellent interface = epic win

    john - 19th June 2009 09:51 - #

  7. I started development last Thursday, we added a designer, client-side engineer and operations guy on Tuesday, then yesterday a bunch more (15 people at least, mainly working to solve the PDF conversion problem and dealing with QA) got involved in a massive team effort to help push it through to launch.

    Simon Willison - 19th June 2009 10:31 - #

  8. Well done on getting this created so fast! A superb example of crowdsourcing.

    Jake Brumby - 19th June 2009 10:55 - #

  9. Could you add a 'rotate image' feature? Some invoices are in landscape (e.g. http://mps-expenses.guardian.co.uk/page/332224/) making them hard to read.

    Mike Dimmick - 19th June 2009 20:58 - #

  10. How do you sign in with a username once set? I see a "set username" box, and it asked for a password too, but can't find the login box...

    Matthew Pettitt - 19th June 2009 21:15 - #

  11. How did you set up your EC2 instances to handle this amount of traffic? Plays nice even with all those post requests and "uncachable" things... Impressive. I'm still relying upon Django generating hundreds of static files every 10 minutes to handle server loads...

    Anders Eriksen - 19th June 2009 22:08 - #

  12. Anders: it took a lot of effort after the app had launched! We made every mistake in the book; I plan to write up some of the lessons we learnt as a full blog entry.

    Simon Willison - 20th June 2009 01:11 - #

  13. Thanks for the good work on this.

    When the huge US Recovery bill was signed earlier this year, the full "final" text had been available for less than a day, and then, only in the form of scans with handwritten annotations.

    It occurred to me then that this is a common tactic that could easily have been thwarted in no time by a news organization that had an army of volunteers to help with transcription and analysis.

    I wasn't surprised though that no newspaper seemed up to the task. At the time journalists were spending most of their attention on the "you'll miss us when we are gone," and getting ready to work themselves into a frenzy about Twitter.

    I hope that your work at the Guardian will show the way, and will be shamelessly copied and improved on across the world.

    eas - 23rd June 2009 19:27 - #
