What are some system administration best practices? If someone is running a production web server, what are the basic things they should be doing?
Graph everything. I’m not a good sysadmin, but one thing I’ve learned from working with good sysadmins is that they spend a lot of time looking at graphs.
A graph showing CPU usage, load average, disk operations per second, network traffic and so on over the past day, week or month makes any strange behaviour really easy to spot.
I’ve found Munin relatively easy to set up, and hosted services like Cloudkick can help as well.
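To make the idea concrete, here is a minimal Python sketch of what a graphing tool does underneath: sample a metric on a schedule, keep the history, and look for values that stand out. It is not how Munin or Cloudkick work internally. The function names `sample_load` and `find_spikes` and the threshold factor of 3 are assumptions made up for illustration, and `os.getloadavg()` is only available on Unix-like systems.

```python
import os
import statistics
import time


def sample_load():
    """Take one timestamped sample of the system load averages (Unix only)."""
    load1, load5, load15 = os.getloadavg()
    return {
        "timestamp": int(time.time()),
        "load1": load1,
        "load5": load5,
        "load15": load15,
    }


def find_spikes(values, factor=3.0):
    """Return the indexes of values more than `factor` times the median.

    The default factor of 3 is an arbitrary illustrative threshold,
    not a recommended setting.
    """
    if not values:
        return []
    baseline = statistics.median(values)
    return [i for i, v in enumerate(values) if v > factor * baseline]


if __name__ == "__main__":
    # Collect a few one-second samples of the 1-minute load average.
    history = []
    for _ in range(5):
        history.append(sample_load()["load1"])
        time.sleep(1)
    print("samples:", history)
    print("spike indexes:", find_spikes(history))
```

In practice you would sample every few minutes, store the values for weeks, and plot them; a spike that a crude rule like `find_spikes` flags is usually obvious at a glance on the graph, which is why a real graphing tool is the better choice.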