18 Jul 2002 dutky   » (Journeyer)

I'm considering a new project, along with an Advogato article submission to gather some input and ideas: an end-user file system that would better support undeletion of files and allow rollback of metadata changes. Here is a draft of the article:

<ARTICLE>

<TITLE>End-User File System</TITLE>

<LEAD>Current Linux file systems offer poor support for recovering deleted files or modified metadata. A new file system should be written to address this deficiency.</LEAD>

<BODY> <P>While Linux provides a wide selection of high-performance, high-reliability file systems, it is in dire need of a file system targeted at the consumer/end-user. Current unix file systems are very good at providing fast, reliable access to disk storage, but if you accidentally delete a file, or change the file metadata (mode, uid, gid, etc.), you are pretty well screwed as far as recovering the deleted file or the previous metadata. Some attempts have been made to document file recovery under Linux, but they are not for the faint of heart and their effectiveness is not guaranteed. As for recovering modified metadata, there is no known method of attack, even on some commercial systems renowned for their user-friendliness.

<P>In order to allow recovery of deleted files, the file system must make a couple of guarantees. First, the allocation of free blocks should be done in a strictly least recently used (LRU) or first-in first-out (FIFO) order, so that the blocks freed longest ago are reused first and recently deleted data survives as long as possible. Second, each block in the file system should keep enough information, even in the free state, to determine what file it last belonged to and what part of the file it represented. With these two properties guaranteed by the file system, it is a simple process to scan the free blocks and recover all deleted files whose blocks have not already been reused.
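<P>The two guarantees can be sketched as a toy in-memory model. This is Python purely for illustration (a real file system would live in the kernel), and every name in it (ToyDisk, Block, write_file, delete_file, undelete) is made up for this example, not part of any existing design.

```python
from collections import deque
from dataclasses import dataclass


@dataclass
class Block:
    """One disk block; the owner fields survive when the block is freed."""
    data: bytes = b""
    file_tag: int = 0      # file this block belongs to, or last belonged to (0 = never used)
    index: int = 0         # position of this block within that file
    free: bool = True


class ToyDisk:
    def __init__(self, nblocks):
        self.blocks = [Block() for _ in range(nblocks)]
        # FIFO free list: the block freed longest ago is reused first.
        self.free_list = deque(range(nblocks))

    def write_file(self, file_tag, chunks):
        """Store each chunk in the oldest free block; return the block numbers used."""
        used = []
        for index, chunk in enumerate(chunks):
            n = self.free_list.popleft()
            self.blocks[n] = Block(chunk, file_tag, index, free=False)
            used.append(n)
        return used

    def delete_file(self, block_numbers):
        """Free the blocks but leave their data and owner fields intact."""
        for n in block_numbers:
            self.blocks[n].free = True
            self.free_list.append(n)

    def undelete(self, file_tag):
        """Scan free blocks for a tag; return the data if indices 0..k-1 are all present."""
        found = {b.index: b.data for b in self.blocks
                 if b.free and b.file_tag == file_tag}
        if not found or sorted(found) != list(range(len(found))):
            return None    # tag unknown, or some blocks were already reused
        return b"".join(found[i] for i in sorted(found))
```

<P>In this toy, a deleted file stays recoverable until the free list wraps around to its blocks. The scan cannot tell whether a trailing block is missing, so a real design would presumably also record the file length somewhere.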

<P>A similar issue to the recovery of deleted files is the recovery of metadata. While file deletion can be difficult to recover from in most unix file systems, metadata changes are impossible to recover. If you accidentally chmod or chown all the files in a directory, there is no way to find out what the old ownership and mode bits were, much less restore them. The simple solution to this is to keep a record of the most recent previous metadata in the inode itself. Then, if you accidentally chmod or chown a bunch of files, you can run a utility that looks at the saved metadata entries and restores the previous values. A better solution might keep track of more than one previous set of metadata, but that is probably more effort than it is really worth.
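<P>A minimal sketch of the single saved copy, again in Python for illustration only. The Meta and Inode classes and their method names are hypothetical, and the fields are limited to the mode, uid, and gid mentioned above.

```python
from dataclasses import dataclass, replace
from typing import Optional


@dataclass(frozen=True)
class Meta:
    mode: int
    uid: int
    gid: int


@dataclass
class Inode:
    current: Meta
    previous: Optional[Meta] = None   # the one saved copy of older metadata

    def change(self, **fields):
        """Apply a chmod/chown-style change, saving the old values first."""
        self.previous = self.current
        self.current = replace(self.current, **fields)

    def rollback(self):
        """Restore the saved metadata; return False if nothing was saved."""
        if self.previous is None:
            return False
        self.current, self.previous = self.previous, None
        return True
```

<P>With only one saved copy, a second change before the rollback overwrites the values you wanted back, which is the limitation accepted above.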

<P>The metadata recovery mechanism can be trivially added to existing Linux file systems, since it only requires a few more entries in the inode structure to keep the saved metadata. The file undeletion features, however, require a redesign of almost the entire file system. The file ownership links must be maintained whenever a file's data is modified. The LRU allocation of free blocks may not be too difficult to add, but would still be a fair amount of work. Because of the difficulty of adding the undeletion features to something like Ext2 or Ext3, I suggest that a new file system should be written for use in purely consumer/end-user applications. If the undelete and metadata recovery features are worthwhile, either the new file system will be expanded to include performance and stability features from existing file systems, or existing file systems will adopt the recovery features from the new file system.

<P>My questions to the Advogato community are: First, are there any other file system projects addressing these issues? I've done some cursory searches, but haven't seen anything that deals with the issues at the file system design level. Second, what are some good references for writing a Linux file system, aside from the source code? (I can wade through the source code as well as the next guy, but I'd prefer a guided tour.) Third, any other suggestions or comments? </BODY>

</ARTICLE>

I figure that a simple block-structured file system can be used. Each block will be one of several types: FS_MASTER, which keeps track of the important file system properties; FILE_NODE, which is the equivalent of an inode in traditional unix file systems; FILE_DATA, which contains the actual data of the file; and FILE_EXTENT, which builds a tree structure for rapid access to a file's data by location.

The FS_MASTER block keeps links to a free block list and a used block list, a unique file tag that is incremented each time a new file is created, and other bookkeeping information.

The FILE_NODE, FILE_DATA, and FILE_EXTENT blocks will have a pair of links joining them to either the free or used block list, and a unique file tag indicating which file they are part of (or were last part of). The FILE_NODE will have links to the list of data blocks and the extent tree, as well as assorted inode-like information (including the saved metadata and probably some indication of one of the names used for the file).
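To make the per-block bookkeeping concrete, here is a rough sketch of what a common block header might look like, using Python's struct module for illustration. The field order, the field widths, and the explicit free flag are all assumptions of mine; nothing above fixes them, and the FS_MASTER block would presumably carry different fields.

```python
import struct
from enum import IntEnum


class BlockType(IntEnum):
    FS_MASTER = 1
    FILE_NODE = 2
    FILE_DATA = 3
    FILE_EXTENT = 4


# Assumed header layout: type, free flag, prev link, next link, file tag.
# "<" means little-endian with no padding; B = 1 byte, I = 4 bytes, Q = 8 bytes.
HEADER = struct.Struct("<BBIIQ")


def pack_header(btype, is_free, prev_block, next_block, file_tag):
    """Serialize the header that would sit at the start of every block."""
    return HEADER.pack(btype, int(is_free), prev_block, next_block, file_tag)


def unpack_header(raw):
    """Read the header back from the first bytes of a raw block."""
    btype, is_free, prev_block, next_block, file_tag = HEADER.unpack(raw[:HEADER.size])
    return BlockType(btype), bool(is_free), prev_block, next_block, file_tag
```

Because the file tag is part of the header rather than of the payload, it stays readable after the block moves from the used list to the free list, which is what the undelete scan depends on.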

Unlike most unix file systems, there would be no separation between inodes and data nodes, which would eliminate the problem of running out of inodes while there is still available disk space in a partition.
