How the Wayback Machine works
31st August 2002
How the Wayback Machine Works is a must-read for anyone geeky enough to be interested in cheap clustered databases on a huge scale. The interview includes some fascinating details on the cost-effectiveness of Linux clusters:
What’s amazing to me is the fact that the hardware is free. For doing things even in the hundreds of terabytes, it costs in the hundreds of thousands of dollars. When you talk to most people in IT departments, they spend a couple hundred thousand dollars just on a CPU, much less a terabyte of disk storage. You buy from EMC a terabyte for maybe $300,000. That’s just the storage for 1 TB. We can buy 100 TBs with 250 CPUs to work on it, all on a high-speed switch with redundancy built in.
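To put the numbers in that quote side by side, here is a rough back-of-the-envelope sketch. The EMC price and the 100 TB capacity come straight from the quote; the cluster's total price is not stated exactly ("hundreds of thousands of dollars"), so the range used below is an assumption, and the names are just illustrative.

```python
# Figures taken from the quote above.
EMC_PRICE_PER_TB = 300_000      # "a terabyte for maybe $300,000"
CLUSTER_CAPACITY_TB = 100       # "We can buy 100 TBs with 250 CPUs"

# Assumption: the quote only says "hundreds of thousands of dollars",
# so bracket the cluster price rather than pick a single number.
ASSUMED_CLUSTER_COST_RANGE = (100_000, 900_000)


def cost_per_tb(total_cost, capacity_tb):
    """Return the price of one terabyte given a total cost and capacity."""
    return total_cost / capacity_tb


# What 100 TB would cost at the quoted EMC price (storage only).
emc_cost_for_100_tb = EMC_PRICE_PER_TB * CLUSTER_CAPACITY_TB

# Per-terabyte cost of the cluster at both ends of the assumed range.
low, high = (cost_per_tb(c, CLUSTER_CAPACITY_TB) for c in ASSUMED_CLUSTER_COST_RANGE)

print(f"EMC storage alone for 100 TB: ${emc_cost_for_100_tb:,}")
print(f"Cluster cost per TB (assumed range): ${low:,.0f} to ${high:,.0f}")
print(f"EMC price per TB is {EMC_PRICE_PER_TB / high:.0f}x to {EMC_PRICE_PER_TB / low:.0f}x higher")
```

Even at the expensive end of the assumed range, the cluster works out to a small fraction of the quoted EMC price per terabyte, and that price covers the 250 CPUs as well as the disks.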