Blogmarks
Filters: Sorted by date
pudb. A full-screen, curses console based visual debugger for Python, built using the urwid console UI library.
No PDFs! The Sunlight Foundation point out that PDFs are a terrible way of implementing “more transparent government” due to their general lack of structure. At the Guardian (and I’m sure at other newspapers) we waste an absurd amount of time manually extracting data from PDF files and turning it in to something more useful. Even CSV is significantly more useful for many types of information.
The Secret Identity of the Peep Show Tweeter. Like many others, I had assumed the Peep Show character accounts were “official”—especially when they started live-tweeting their thoughts in real time as the episodes were aired. Turns out it was actually a very clever fan.
memcache-top. Useful self-contained perl script for interactively monitoring a group of memcached servers.
JSLitmus. “A lightweight tool for creating ad-hoc JavaScript benchmark tests”. Includes an ingenious hack for graphing the results—it generates a Google Chart, then provides a TinyURL for viewing that chart in the future. The TinyURL is generated by pointing an inconspicuous iframe at the TinyURL API and letting the user copy-and-paste the resulting shortened URL directly out of the iframe.
Underscore.js. A new library of functional programming primitives for JavaScript—each, map, all, any, inject, detect etc. Unlike some similar libraries this one doesn’t extend the built-in objects, instead opting to bind the new functions to the underscore symbol. A jQuery-style noConflict() option is available if even that is too much namespace pollution for you.
PostgreSQL 8.5 alpha 2 is out. “P.S. If you’re wondering about Hot Standby and Synchronous Replication, they’re still under heavy development and still (at this point) expected to be in 8.5.”—Hot Standby is PostgreSQL-speak for MySQL-style master/slave replication for scaling your reads.
Django 1.2 planned features.
The votes are in and the plan for Django 1.2 has taken shape - features are split in to high, medium and low priority. There's some really exciting stuff in there - outside of the things I've already talked about, I'm particularly excited about multidb, Model.objects.raw(SQL), the smarter {% if %} tag and class-based generic views.
Toiling in the data-mines: what data exploration feels like. Useful advice from Tom Armitage on the exploratory development approach required when starting to build a project against a large, complex dataset. Tips include making sure you have a REPL to hand and using tools like gRaphael to generate graphs against pretty much everything, since until you’ve seen their shape you won’t know if they are interesting or not.
Twisted inlineCallbacks and deferredGenerator. inlineCallbacks are a brilliant (but seemingly under-promoted) feature of Twisted which use the ability to return a value from a yield statement to make asynchronous callbacks look much more like regular sequential programming.
Play framework for Java. I’m genuinely impressed by this—it’s a full stack web framework for Java that actually does feel a lot like Django or Rails. Best feature: code changes are automatically detected and reloaded by the development web server, giving you the same save-and-refresh workflow you get in Django (no need to compile and redeploy to try out your latest changes).
Bits of Evidence (via) A slide deck from Greg Wilson: “What we actually know about software development, and why we believe it’s true”.
Introducing BERT and BERT-RPC. Justification for inventing a brand new serialisation protocol: Thrift and Protocol Buffers both use IDLs and code generation, XML “is not convertible to a simple unambiguous data structure in any language I’ve ever used” and JSON lacks support for unencoded binary data. The result is BERT—Binary ERlang Term—which extracts a format from Erlang in much the same way that JSON extracted one from JavaScript.
How We Made GitHub Fast. Detailed overview of the new GitHub architecture. It’s a lot more complicated than I would have expected—lots of moving parts are involved in ensuring they can scale horizontally when they need to. Interesting components include nginx, Unicorn, Rails, DRBD, HAProxy, Redis, Erlang, memcached, SSH, git and a bunch of interesting new open source projects produced by the GitHub team such as BERT/Ernie and ProxyMachine.
Introducing Cloudera Desktop. It’s a GUI for Hadoop, and under the hood is a whole stack of open source software, including Python, Django, MooTools, Twisted, lxml, CherryPy, Mako, Java and AspectJ.
Comcast: Twitter Has Changed The Culture Of Our Company. “Frank Eliason (@Comcastcares on Twitter) now has 11 people working under him simply to respond to information about Comcast being broadcast on Twitter.”
High-end Varnish-tuning. Tuning the Varnish HTTP cache to serve 27K requests/second on a single core 2.2GHz Opteron.
nginx_http_push_module. More clever design with webhooks—here’s an nginx module that provides a comet endpoint URL which will hang until a back end process POSTs to another URL on the same server. This makes it much easier to build asynchronous comet apps using regular synchronous web frameworks such as Django, PHP and Rails.
“I made the first animated under construction icon”. twoleftfeet on MetaFilter describes how he created the first ever Under Construction animation in 1995, after discovering his server-push animations could be replaced by the exciting new animated GIF.
The State of Solid State Hard Drives. From Jeff Atwood’s report it sounds like the price/performance ratio for SSD hard drives has got to a point where switching is the most cost effective way of improving a personal machine’s performance. Anyone know what’s involved in putting one of these things in a MacBook Pro?
Temporary Mapping: Solar Decathlon. The OpenStreetMap default renderer supports start_date and end_date tags, meaning you can map temporary installations (in this case the 2009 Solar Decathlon on the DC National Mall) and have them automatically appear and disappear at the correct times.
MySQL backups with EBS snapshots. Assaf Arkin’s 45 line ruby script shows how to lock tables / XFS freeze / create an EBS snapshot / unfreeze and unlock, with hourly snapshots preserved for the past 24 hours and daily snapshots for the past week. Is an EBS snapshot enough to restore your data to somewhere other than EC2 though?
OSM static map api. A very welcome addition to the OpenStreetMap world (with plenty of options for overlaying points, polygons etc) slightly marred by the size and relative ugliness of the OpenStreetMap watermark.
OpenStreetMap Rendering Database. Amazon have added an OpenStreetMap snapshot as a public data set, thanks to some smart prompting by Jeremy Dunck.
WebKit, Mobile, and Progress. Alex Russell responds to PPK’s analysis of the many different WebKit variants in today’s mobile phones, pointing out that the replacement cycle and increasing quality of WebKit in more recent phones means the situation still looks pretty good.
Django security updates released. A potential denial of service vulnerability has been discovered in the regular expressions used by Django form library’s EmailField and URLField—a malicious input could trigger a pathological performance. Patches (and patched releases) for Django 1.1 and Django 1.0 have been published.
Micro Men. “Affectionately comic drama about the British home computer boom of the early 1980s.”—aired last night, and on BBC iPlayer for the next week. I thought it was absolutely charming, as well as being a thought provoking history of the rise and fall of the British computer industry in the early 80s.
MichaelMoore.com in Django. A seriously impressive case study—a complete rebuild from the ground up completed in just five weeks using Django, Solr and Haystack for a high traffic site with a top 10,000 US Alexa ranking.
Cloudvox. A brand new startup offering “API-driven phone calls” with a beautifully simple webhooks based API.
Official Google Webmaster Blog: A proposal for making AJAX crawlable.
It's horrible! The Google crawler would map url#!state to url?_escaped_fragment_=state, then expect your site to provide rendered HTML that reflects that state (they even go as far as to suggest running a headless browser within your web server to do this). Just stick to progressive enhancement instead, it's far less hideous. It looks like the proposal may have originated with the GWT team.