Open Mosix
I can’t remember how I stumbled across it, but Open Mosix looks like a really interesting project. It’s a Linux kernel extension that makes creating a Linux cluster as simple as installing a kernel module on a number of machines and supplying each one with a shared config file.
Once the cluster is set up, any of the machines on the network can farm long running processes out to a different box. The clustering only kicks in when a process running in the background takes more than a few seconds to execute. Once that happens, Open Mosix checks the load averages on the machines in the cluster and, if a better host is found, migrates the process over to that machine. The whole thing is completely transparent; as far as the end user is concerned, the processes they run just keep on running until they terminate.
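To make the decision described above concrete, here is a minimal sketch of it in Python. This is only an illustration of the idea as I understand it, not Open Mosix's actual algorithm (which lives in the kernel and is surely more involved). The function name, the three second threshold and the dictionary of load averages are all assumptions made up for the example.

```python
from typing import Dict, Optional


def choose_migration_target(
    local_host: str,
    load_by_host: Dict[str, float],
    runtime_seconds: float,
    min_runtime_seconds: float = 3.0,  # assumed threshold, not an Open Mosix setting
) -> Optional[str]:
    """Return the host a process should move to, or None if it should stay put."""
    # Short-lived processes are never considered for migration.
    if runtime_seconds < min_runtime_seconds:
        return None

    # Find the least loaded machine in the cluster (hypothetical load table).
    best_host = min(load_by_host, key=load_by_host.get)

    # Only migrate if some other host is strictly less loaded than this one.
    if load_by_host[best_host] < load_by_host[local_host]:
        return best_host
    return None
```

Calling `choose_migration_target("desk1", {"desk1": 2.0, "desk2": 0.5}, runtime_seconds=10.0)` picks `desk2`, while the same call with a runtime of one second returns `None` because the process hasn't been running long enough to be worth moving.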
Since clustering only kicks off for longer running processes, this wouldn’t be much use for something like a web server farm, but it could be ideal for tasks such as compilation or rendering, where individual processes perform computationally intensive work for a long period of time.
Even more fascinating is ClusterKnoppix, a modified Knoppix distro that uses an Open Mosix enabled kernel. Burn a bunch of CDs, boot some standard networked PCs with them and you’ve got an instant cluster.
I’ve been doing some pretty intense log crunching today on a year’s worth of daily web server logs, each one between 15 and 20 MB in size. Since most of the work is in running a regular expression on each line of each file, it’s likely a cluster would have sped the whole thing up considerably. Unfortunately, I doubt my co-workers would have been overjoyed with me turning their machines into cluster nodes for the day.
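For what it's worth, here is a rough sketch of how that kind of job could be split into separate processes, which is the unit Open Mosix moves around. It's plain Python using the standard `multiprocessing` module and makes no Open Mosix calls at all; whether a cluster would actually migrate these workers is an assumption on my part. The pattern (lines recording a 404) and the function names are hypothetical stand-ins, not the expression I was actually running.

```python
import re
from multiprocessing import Pool
from typing import Iterable, List, Tuple

# Hypothetical pattern: a 404 status in a common log format line.
PATTERN = re.compile(r'" 404 ')


def count_matches(path: str) -> Tuple[str, int]:
    """Count the lines in one log file that match PATTERN."""
    hits = 0
    with open(path, encoding="utf-8", errors="replace") as handle:
        for line in handle:
            if PATTERN.search(line):
                hits += 1
    return path, hits


def crunch_logs(paths: Iterable[str], workers: int = 4) -> List[Tuple[str, int]]:
    """Spread the log files across a pool of worker processes."""
    with Pool(processes=workers) as pool:
        # Each file is handled whole by one worker, so the workers
        # never need to talk to each other while they run.
        return pool.map(count_matches, list(paths))
```

To try it, call `crunch_logs(glob.glob("logs/*.log"))` from inside an `if __name__ == "__main__":` block (the `logs/` directory is made up). Each worker spends its time on CPU-heavy regex matching over many files, which is the long running, independent sort of process described above.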
"Unfortunately, I doubt my co-workers would have been overjoyed with me turning their machines in to cluster nodes for the day."
Is that really so? I mean, given that they're not cluster nodes now, your cow-orkers might have complained if you said "stop what you're doing while I install a new OS there", but I can imagine an office being set up like this initially. You'd need some relatively clever task scheduling tools in Mosix to farm out CPU-intensive work to CPU-unbusy boxes in the cluster, but it could all happen under the covers and no-one would know. I can't really see a downside to this -- people not doing a lot might suddenly find their machine slows down as it starts doing work for other people in the cluster, but they're not doing a lot, so they won't notice as much.
I saw something recently (a LazyWeb idea: something which grabs each page you look at in Firebird and chucks it in a search engine so you can search for stuff to work out where it was) about a game development team who used IncrediBuild (or something) on Windows, which is like the shared gcc thing on Linux: they both push out compilation to idle machines in your workgroup, so compiling a big project takes two minutes instead of two hours. Everyone who does this loves it. More to the point, you could probably make it the default on machines and announce your machine's willingness with ZeroConf or something, and then everyone using your OS would suddenly get massive performance benefits in a network environment of computers running that OS.
sil - 20th December 2003 06:06 - #
Simon Willison - 20th December 2003 20:43 - #
I was reading something about Half-Life 2 modding the other day (since I have a few ideas for mods I'd like to try), and apparently they'll let you farm out work to other machines on the network when compiling levels. Very useful when you have a home network with a few comps that are usually idle. Actually, I'd probably be interested in setting the comps up for one of those distributed computing projects if bandwidth weren't so expensive...
Or is this all too off-topic?
Lach - 21st December 2003 02:37 - #
Johann - 17th February 2004 12:09 - #
Zippo - 4th September 2004 15:27 - #
Martin Carlsson - 13th July 2005 09:11 - #