
Simon Willison’s Weblog

Implementing filesystems in Python

LUFS-Python provides a relatively simple API for implementing new Linux filesystems in pure Python. You install the package, write a class implementing methods for handling filesystem operations such as creating a directory, opening/reading/writing/closing a file, creating symlinks etc., and finally mount your new filesystem with some special arguments to the mount command.
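
To give a feel for the shape of such a class, here's a rough sketch. It is not the LUFS-Python API: the class name `MemoryFS` and every method name and signature are invented for illustration, and it just keeps everything in Python dicts rather than being mounted anywhere.

```python
import posixpath


class MemoryFS:
    """Toy in-memory filesystem. Hypothetical names, NOT the LUFS-Python API."""

    def __init__(self):
        self.dirs = {"/"}   # set of directory paths
        self.files = {}     # file path -> bytes
        self.links = {}     # symlink path -> target path

    def _check_parent(self, path):
        # Refuse to create anything inside a directory that does not exist.
        parent = posixpath.dirname(path)
        if parent not in self.dirs:
            raise FileNotFoundError(parent)

    def mkdir(self, path):
        self._check_parent(path)
        self.dirs.add(path)

    def write(self, path, offset, data):
        self._check_parent(path)
        old = self.files.get(path, b"")
        # Pad with zero bytes when writing past the current end of the file.
        old = old.ljust(offset, b"\0")
        self.files[path] = old[:offset] + data + old[offset + len(data):]
        return len(data)

    def read(self, path, offset, count):
        return self.files[path][offset:offset + count]

    def symlink(self, target, path):
        self._check_parent(path)
        self.links[path] = target

    def readlink(self, path):
        return self.links[path]

    def readdir(self, path):
        if path not in self.dirs:
            raise NotADirectoryError(path)
        entries = list(self.dirs - {"/"}) + list(self.files) + list(self.links)
        # Return the names of everything whose parent is the requested directory.
        return sorted(posixpath.basename(p) for p in entries
                      if posixpath.dirname(p) == path)
```

A real filesystem class would also need to deal with things like permissions, timestamps and open file handles, which this sketch ignores entirely.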

At first glance, this is a bit of a gimmick: why would you want to write your own filesystem in the first place? We’ve been talking about this at work and came up with a few ideas. How about a filesystem where HTML files saved in a certain directory were instantly run through HTMLTidy and converted into valid XHTML? Or a custom network filesystem that saves files on a remote server using GnuPG to encrypt them before transfer? How about a read-only filesystem that lets you browse the contents of a MySQL database? Just imagine being able to use tools such as grep and find to search your database. A module that maps someone else’s public web server to your own filesystem, making mirroring as easy as running a recursive cp command. A filesystem that updates a swish-e full-text index every time a file is saved to it, years before Microsoft release Longhorn. The possibilities are endless.
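
The database idea is probably the easiest to picture in code. The sketch below is only a guess at how the mapping might look, using the standard library's `sqlite3` as a stand-in for MySQL. `DatabaseFS`, `listdir` and `read` are made-up names, and the layout (one directory per table, one file per row) is just one possible choice.

```python
import sqlite3


class DatabaseFS:
    """Read-only view of a database as /<table>/<rowid> (hypothetical layout)."""

    def __init__(self, connection):
        self.db = connection

    def _tables(self):
        rows = self.db.execute(
            "SELECT name FROM sqlite_master WHERE type = 'table' ORDER BY name")
        return [name for (name,) in rows]

    def _quoted_table(self, table):
        # Identifiers cannot be bound as parameters, so only accept known tables.
        if table not in self._tables():
            raise FileNotFoundError(table)
        return '"%s"' % table.replace('"', '""')

    def listdir(self, path):
        parts = [p for p in path.split("/") if p]
        if not parts:
            return self._tables()          # the root lists the tables
        if len(parts) != 1:
            raise NotADirectoryError(path)
        table = self._quoted_table(parts[0])
        rows = self.db.execute("SELECT rowid FROM %s ORDER BY rowid" % table)
        return [str(rowid) for (rowid,) in rows]

    def read(self, path):
        parts = [p for p in path.split("/") if p]
        if len(parts) != 2:
            raise IsADirectoryError(path)
        table = self._quoted_table(parts[0])
        try:
            rowid = int(parts[1])
        except ValueError:
            raise FileNotFoundError(path) from None
        cursor = self.db.execute(
            "SELECT * FROM %s WHERE rowid = ?" % table, (rowid,))
        row = cursor.fetchone()
        if row is None:
            raise FileNotFoundError(path)
        columns = [d[0] for d in cursor.description]
        # Render the row as "column: value" lines so line-based tools can search it.
        return "".join("%s: %s\n" % pair for pair in zip(columns, row))
```

With something along these lines behind a mounted filesystem, running grep over a table's directory would amount to calling `read` on each row and searching the text that comes back.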

Here’s a really fun idea: a filesystem that implements a dynamic website. Instead of using tools like mod_python to dynamically create pages, implement a filesystem that dynamically creates HTML files as they are requested and set up a stock Apache install with the dynamic filesystem as the document root. Then point ProFTPD at it so you can log in via FTP and mess with your content dynamically. We’re thinking about building an FTP interface to our new database-driven CMS, but we could just build a filesystem interface and point our FTP server straight at it.
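
As a very rough illustration of "files created as they are requested", here's a sketch in which each path is backed by a function that builds the page at read time. Everything in it (`DynamicSiteFS`, `register`, `render_entry`, the dict standing in for CMS content) is hypothetical and has nothing to do with any real filesystem API.

```python
import html


class DynamicSiteFS:
    """Files whose contents are generated on every read (hypothetical names)."""

    def __init__(self):
        self.generators = {}   # path -> zero-argument function returning HTML text

    def register(self, path, generator):
        self.generators[path] = generator

    def listdir(self):
        return sorted(self.generators)

    def read(self, path):
        try:
            generator = self.generators[path]
        except KeyError:
            raise FileNotFoundError(path) from None
        # The page is rebuilt on each read, so it always reflects current content.
        return generator().encode("utf-8")


def render_entry(entry):
    """Turn a dict of (pretend) CMS content into a minimal HTML page."""
    title = html.escape(entry["title"])
    body = html.escape(entry["body"])
    return ("<html><head><title>%s</title></head>"
            "<body><h1>%s</h1><p>%s</p></body></html>" % (title, title, body))
```

The web server would never know the difference: as far as it is concerned it is serving a static file, and whatever changes in the underlying content shows up on the next read.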

I’m sure there are performance and stability issues that make most of the above more trouble than it’s worth, but I think you’ll agree it’s a pretty exciting technology.

This is Implementing filesystems in Python by Simon Willison, posted on 10th December 2003.



13 comments

  1. Earlier this year Martin and myself found ourselves having this same conversation, prompted by a suggested final year project to implement a file system interface to a database. We initially discussed the database idea. We didn't actually mention read-only; I think it would be a mistake to make it read-only. I was proposing the idea of directories representing tables, then sub folders representing rows and columns, so rows and tables could be grepped etc. Then the idea of stored queries/views came to mind, and THEN the idea that you could create a file, type your SQL in there, chown it to a certain user, and it's automatically recreated as the results. Handy. We also talked about propagating the permissions in the database to the file system.

    It was at this point that we decided that the setup for the system would probably be best done by the sysadmin :P Handing a lot of potential issues over to them to deal with :D. Then of course applications. We basically came up with the ones you had. I particularly was interested from the perspective of document stores. You could share out this system over SMB so that Windows users could do all this funky stuff (just imagine trying to code THIS on Windows...). Adding the web interface, adding the automatic indexing and you've got a very nice base there for quick prototyping of file stores.

    Swannie - 10th December 2003 23:49 - #

  2. This is slightly off topic, but Seth Nickel seems to be implementing a metadata friendly 'virtual*' filesystem (like WinFS is supposed to be) in the Gnome Storage project.

    *In the sense that the true file system that deals with the actual bits on the disk is something more traditional like ext3 or whatever. As far as I know, this is the same approach that WinFS uses.

    jgraham - 10th December 2003 23:50 - #

  3. Slightly more on-topic, a great use for this type of thing would be email; I'd love to be able to navigate my email as a set of directories and run grep and friends on it.

    jgraham - 10th December 2003 23:55 - #

  4. Hey, I love the SMB idea. You could point Samba at your custom filesystem and instantly make it available to any Windows machine on the network - if your filesystem is the interface to your CMS, Windows users can edit the stuff in it using whatever tools they are most comfortable with.

    A related concept to all of this is KDE's ioslaves, which allow you to define new protocols over which files can be browsed and opened in KDE applications. Unfortunately it's remarkably hard to find information about this on Google, but they allow you to do things like open files over HTTP, browse external servers over SFTP and access thumbnails from a digital camera all in the KDE file chooser interface. I'm sure there was an ioslave for accessing rows and tables in a Postgres database but I can't find it anywhere.

    Simon Willison - 11th December 2003 00:15 - #

  5. I asked David McNab about the LUFS-Python project today. He told me that he now actually uses FUSE instead, another userland filesystem with Python support. http://sf.net/projects/avf I haven't tried FUSE myself, but David gives it high praise.

    Shane Hathaway - 11th December 2003 01:49 - #

  6. It's possible KDE's IOSlaves are the coolest thing ever. Rip CDs using the audiocd:// protocol, read any file over (S)FTP in any text editor using (s)ftp://, access servers over SSH using fish://, browse WebDAV servers via webdav:// ... It boggles my mind that this feature hasn't gotten the attention it deserves.

    I've already shown Simon this, but some other readers might benefit: It's possible to make your own IOSlave in Python. Oh, and here's a helpful IOSlave development article and the official IOSlaves documentation.

    Simon, you're right about the lack of helpful Google results for this topic. I could find only one list of IOSlaves, but any KDE user can check the KDE Control Center | Information | Protocols for a complete list.

    Adrian Holovaty - 11th December 2003 02:46 - #

  7. "I'd love to be able to navigate my email as a set of directories and run grep and friends on it." If you use the MH email format, rather than mbox, I think you'll find that it does organise itself in directories, with each email being a file. Thus it would be 'grep'able. To take that to the next step, you could even set up a separate partition for your mail.

    Ben Thorp - 11th December 2003 08:54 - #

  8. A certain vendor of a very very expensive CMS has done something like this for a while. They have a filesystem layer on top of a relational database that acts as a version control system. All users also have sandboxes in the database, which lets them play some neat tricks when combining the virtual filesystem with mod_proxy. The user can edit a web page and then when they point their browser at the site they (and only they, until they commit changes) see their changes. Thus, no need for link rewriting when publishing a site. The system works well enough that one rather large company has tens of thousands of employees worldwide using it to maintain their corporate site, and to each employee it feels like they are editing the main site.

    Personally, the reason I like this idea is because I can write my web apps in whatever language or framework I want, and as long as it generates data that matches the mimetype it'll run on any webserver (including IIS if exported via Samba!). Hmm, the high speed Cthulu web server uses mem mapped I/O for all files... I wonder how well that works with LUFS :)

    Van Gale - 11th December 2003 20:48 - #

  9. In response to Van Gale: Yes, I've used that CMS before ;)

    Paul Hart - 11th December 2003 21:12 - #

  10. LUFS is a good implementation of a great idea from one of the best computer scientists of all time.

    I am not sure how many of you know anything about Plan9, from the smartest man I have personally had the opportunity to meet, Ken Thompson (father of Unix, multiple languages such as the B language which was modified by K+R to give you C, the first master chess program, etc.). In Plan9, Ken and the boys at Bell Labs took his original idea for a file-centric OS in Unix to a file-centric networked device filesystem. If you have ever used a Plan9 account you know the power of using simple ideas like namespace construction, mounting, and binding to effect adaptive execution.

    If you would like to know the power of modifying your own filesystem namespace, and some "ideas" for how to use the resulting namespace in a file-centric system, you need not go any further than the papers of the Plan9 researchers.

    Anthony Tarlano

    Anthony Tarlano - 12th December 2003 09:49 - #

  11. On the subject of IOSlaves in Python, you should take a look at this page for more details:

    http://www.boddie.org.uk/david/Projects/Python/KDE/index.html

    I suppose that it is a tradeoff with respect to different notions of portability when considering writing IOSlaves or LUFS filesystems: IOSlaves should work in KDE on any of the supported operating systems, but only be available to KDE applications; LUFS filesystems will only work on Linux, but be available to all applications.

    Paul Boddie - 12th December 2003 11:17 - #

  12. The Python Wiki page containing the IOSlave tutorial is shamefully incomplete because:
    1. I realised that both the example IOSlaves I'd written were rather more verbose than is good for a tutorial.
    2. Some of the code for synchronising the state of the user's bookmarks is rather clunky.
    3. There's way too much non-IOSlave related stuff going on, which detracts from the explanation.
    I've got a better example which I'll aim to put up there once I've rolled it into the components framework. I think I'll also try and create a LUFS-Python filesystem with the same backend, too, just to see how they compare. I've still no idea whether anyone else has had success with Python IOSlaves. All I know is that they work fine on my system and fail on one particular flavour of Red Hat Linux...

    David Boddie - 12th December 2003 19:04 - #

  13. Thinking about the comments about Interwoven, couldn't this be handled fairly easily with normal WebDAV? You can either make the view dependent on who is logged in, or give every user their own subdomain (the subdomain being significantly more convenient and powerful, I would think -- then the subdomain is really a branch, where each user may be working in their own branch).

    WebDAV seems like the right cross-platform solution to creating virtual filesystems. Linux doesn't have a very good implementation, as far as I know, but it should, and other operating systems already do. It also gives explicit hooks for all the extended metadata that you might need, even if current tools don't make much use of them.

    Ian Bicking - 12th December 2003 22:13 - #

Comments are closed.

Previously hosted at http://simon.incutio.com/archive/2003/12/10/pythonFilesystems
