Operations engineering does not consist of firefighting your shitty software, it is the science of delivering value to users.

Charity Majors # 14th February 2019, 1:12 am

Setting up Munin on Ubuntu. Useful guide to setting up my favourite graphing/monitoring tool for personal projects. # 1st September 2010, 2:05 pm

Zero-downtime Redis upgrade discussion. GitHub have a short window of scheduled downtime in order to upgrade their Redis server. I asked in their comments if they’d considered trying to run the upgrade with no downtime at all using Redis replication, and Ryan Tomayko has posted some interesting replies. # 28th May 2010, 2:50 pm

Linux performance basics. This kind of Linux knowledge is rapidly becoming a key skill for server-side web development. # 24th January 2010, 1:50 pm

Round-robin Django setup with nginx. An nginx trick I didn’t know: a low proxy_connect_timeout value (e.g. 2 seconds) combined with the proxy_next_upstream setting means that if one of your backends breaks, a user won’t even see an error; they’ll just have a short delay before getting a response from a working server. # 21st December 2009, 3:43 pm
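
The failover behaviour described here can be modelled in a few lines of Python. This is only a sketch of the logic, not nginx configuration and not how nginx implements it: `first_working_backend`, `BackendDown`, `fake_connect` and the backend addresses are all hypothetical names made up for illustration.

```python
from typing import Callable, Iterable, Tuple


class BackendDown(Exception):
    """Raised when a backend cannot be reached within the connect timeout."""


def first_working_backend(
    backends: Iterable[str],
    connect: Callable[[str, float], str],
    connect_timeout: float = 2.0,
) -> Tuple[str, str]:
    """Try each backend in order and return (backend, response) for the
    first one that answers. A backend that raises BackendDown is skipped,
    so the caller sees a delay rather than an error."""
    last_error = None
    for backend in backends:
        try:
            return backend, connect(backend, connect_timeout)
        except BackendDown as exc:
            # Roughly the role of "try the next upstream": move on quietly.
            last_error = exc
    raise BackendDown(f"all backends failed: {last_error}")


# Hypothetical stand-in for a real network call: one backend is broken.
DOWN = {"10.0.0.1:8000"}


def fake_connect(backend: str, timeout: float) -> str:
    if backend in DOWN:
        raise BackendDown(f"{backend} did not accept a connection in {timeout}s")
    return "200 OK"


backend, response = first_working_backend(
    ["10.0.0.1:8000", "10.0.0.2:8000"], fake_connect
)
print(backend, response)
```

The short timeout is what keeps the delay small: the broken backend costs the user at most `connect_timeout` seconds before the next one is tried.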

Announcing Kong: A server description and deployment testing tool. An ultra simple website monitoring tool written in Django which makes it easy to manage a list of Twill scripts for testing different sites. It was developed at the Lawrence Journal-World. Eric showed me a demo of this a year or so ago and I’ve been hoping they would open source it. # 18th November 2009, 12:47 pm

Using Graphics Card Memory as Swap (via) Interesting idea: “Graphic cards contain a lot of very fast RAM, typically between 64 and 512 MB. With Linux, it’s possible to use it as swap space, or even as RAM disk.” # 3rd November 2009, 11:01 am

I loathe [hardware load balancers]. They’re expensive, restrictive, slow, and generally cause you a lot more pain and suffering than they’re worth. At my last job, one of my projects was to convert most of one of our existing clusters from a load-balancing appliance to use keepalived. Why would we do this? Because the $100k worth of appliance wasn’t capable of doing the job that $15k worth of commodity hardware and an installation of keepalived were handling with ease.

Matt Palmer # 3rd November 2009, 10:45 am

We’re all ops people now. Edd’s experience reflects my own: the kind of systems I’m building these days involve way more than just development, they often involve significant sysadmin type skills as well. Desperately need to get better at that stuff. # 20th June 2008, 9:02 pm