Recent blog entries for PaulWay

*sigh* I have to apply for ten new jobs a fortnight in order to qualify for unemployment allowance ('the dole'). And finding ten new jobs each fortnight through people who actually know me - agencies - is proving difficult. Trawling the job sites in Australia brings my quota up, but I can't help feeling like I'm prostituting myself. Still, it's better than not being paid at all.

I was originally recommended to Advogato by Skud, the international jetsetter and twice winner of the All Australia Turkish Bath Competition. "Hey, if you've got ideas, go to Advogato and you can post them in your diary and other people read them and then they reply and you start chatting and next thing you know you're doing an IPO" were her words, or something like them. And I did, because the ideas were, at that point in time, bursting at the seams of my cerebellum and the desire to actually talk to someone about these things without having their eyes glaze over was overwhelming.

But so far you can count the people who have commented on my ideas on the fingers of one foot. I've browsed other people's diaries, but there seems to be little way to actually find people who are interested in similar things. I suppose I could always find some project that's similar to what I'm thinking about with file systems, but it doesn't seem likely...

Anyway, I should probably be looking for work rather than waffling here.

I'm currently grovelling through the ReiserFS documentation, which is not so much documentation as a sort of doctoral thesis on the reasons behind the underlying principles. It's like trying to work out where you are in a city by feel - you might work out what this bit is, roughly, but every time you go somewhere else you get lost again, and you can't link the bits together easily.

It's given me an idea, though. I'm not going to attempt a Balanced Tree representation, especially since I can't entirely work out what ReiserFS uses the B*Tree for. However, the idea is to divide the free space up, initially, into (arbitrary but roughly equally sized) blocks. When you allocate space to a file, you merely find an appropriate block and subdivide it into two. When you create a directory, the directory gets a portion of a block as its own free space. This way, you could get locality of reference - i.e. looking for files in the same directory means looking at (relatively) close parts of the disk - speeding up access times.
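
For concreteness, here's a minimal sketch of that split-on-allocate idea in C, assuming a simple in-memory free list - the structure and names are my own invention, not anything implemented:

    /* Free space as a linked list of regions; allocation carves bytes
     * off the front of the first region big enough, leaving the rest
     * as a smaller free block - the subdivision described above. */
    #include <stdint.h>
    #include <stdlib.h>

    struct free_block {
        uint64_t start;             /* first byte of this free region */
        uint64_t length;            /* bytes in this region */
        struct free_block *next;
    };

    /* Returns the start byte of the allocation, or UINT64_MAX on failure. */
    uint64_t alloc_from_free_list(struct free_block **head, uint64_t want)
    {
        for (struct free_block **pp = head; *pp; pp = &(*pp)->next) {
            struct free_block *b = *pp;
            if (b->length < want)
                continue;
            uint64_t start = b->start;
            b->start  += want;      /* subdivide: shrink this block */
            b->length -= want;
            if (b->length == 0) {   /* consumed exactly: unlink it */
                *pp = b->next;
                free(b);
            }
            return start;
        }
        return UINT64_MAX;          /* no region large enough */
    }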

This links in with an idea I had when I first looked at this: some drive configurations (e.g. IDE drives) don't perform uniformly, but have sections of space which are quicker to access than others - the disk being CAV (constant angular velocity), the outer portions spin past the head faster than the inner portions. So if you wanted to, you could run a profiling operation on the disk and then divide the space up into blocks of roughly the same speed. Directories of files that needed to be accessed quickly could then be put in the faster areas. The FS code could even watch the operations and move frequently accessed files into the quicker areas.
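
A rough sketch of what that profiling pass might look like, assuming a POSIX system - the device path is a placeholder, and honest numbers would need uncached reads, but it shows the shape of the measurement:

    /* Probe read throughput at evenly spaced offsets across a device. */
    #include <fcntl.h>
    #include <stdint.h>
    #include <stdio.h>
    #include <time.h>
    #include <unistd.h>

    #define SAMPLE (4 * 1024 * 1024)        /* bytes read per sample */

    static double read_mb_per_sec(int fd, off_t offset)
    {
        static char buf[64 * 1024];
        struct timespec t0, t1;
        ssize_t done = 0;

        clock_gettime(CLOCK_MONOTONIC, &t0);
        while (done < SAMPLE) {
            ssize_t n = pread(fd, buf, sizeof buf, offset + done);
            if (n <= 0)
                return -1.0;
            done += n;
        }
        clock_gettime(CLOCK_MONOTONIC, &t1);

        double secs = (t1.tv_sec - t0.tv_sec)
                    + (t1.tv_nsec - t0.tv_nsec) / 1e9;
        return (done / (1024.0 * 1024.0)) / secs;
    }

    int main(int argc, char **argv)
    {
        /* /dev/hda is hypothetical; pass whatever device you have */
        int fd = open(argc > 1 ? argv[1] : "/dev/hda", O_RDONLY);
        if (fd < 0) { perror("open"); return 1; }

        off_t size = lseek(fd, 0, SEEK_END);
        for (int i = 0; i < 10; i++) {      /* ten evenly spaced zones */
            off_t off = (size / 10) * (off_t)i;
            printf("zone %d: %.1f MB/s\n", i, read_mb_per_sec(fd, off));
        }
        close(fd);
        return 0;
    }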

The other offshoot of this idea is that a directory also stores blocks of free space, which are aggregated on the fly and allocated to files when needed. This saves having to build a memory map of free disk space on mounting the disk - the structures are already stored on the disk itself - and means we also get the locality of reference I talk of above.

Can anyone who reads this and knows something of the ReiserFS workings email me? I'm having a hard time getting into it and I don't want to get to the coding stage just to find out that all my ideas have been duplicated elsewhere - and the ReiserFS looks like the most likely candidate.

BTW, hands up all those who find the 2GB-per-file limit you hit under ext2fs restrictive? From what I've read, that's as much the 32-bit kernel's signed file offsets as ext2 itself, but the effect is the same - and having managed to write a 2.6GB file (editing CD-quality stereo audio seems to do that sort of thing) I would find it a bit annoying to be told 'no, you can't do that'.

Further thoughts on the Linear FS:

It could easily be emulated on another OS by creating one large file which housed the FS structure. While this file may not be contiguous in its native OS, it would at least be a way to experiment with the FS without having to use a partition editor when you wanted to examine the LinFS structure's workings (e.g. during debugging).
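
A sketch of how thin that emulation layer could be - all the names here are hypothetical, with the backing file standing in for a raw disk:

    /* All FS code addresses the "disk" through these two calls, so
     * moving to a real partition later means changing only
     * linfs_open_image(). */
    #include <fcntl.h>
    #include <stdint.h>
    #include <unistd.h>

    static int disk_fd;

    int linfs_open_image(const char *path, uint64_t size_bytes)
    {
        disk_fd = open(path, O_RDWR | O_CREAT, 0644);
        if (disk_fd < 0)
            return -1;
        /* extend to the full "disk" size; sparse on most host FSes */
        return ftruncate(disk_fd, (off_t)size_bytes);
    }

    ssize_t linfs_read(uint64_t byte, void *buf, size_t len)
    {
        return pread(disk_fd, buf, len, (off_t)byte);
    }

    ssize_t linfs_write(uint64_t byte, const void *buf, size_t len)
    {
        return pwrite(disk_fd, buf, len, (off_t)byte);
    }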

Taking inspiration from the Unix FS method of storing the first ten or so block pointers in a table within the inode itself, for really small files it might even be possible to store the contents of the file in the directory entry, for the ultimate in zero redirection. Of course, you wouldn't want them to be very big, or changing very often, but it's an idea. See what happens.
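
One hypothetical shape for such an entry, with a union so that tiny files keep their bytes where their metadata lives - the threshold and field names are invented for illustration:

    #include <stdint.h>

    #define INLINE_MAX 128      /* arbitrary threshold for this sketch */

    struct dir_entry {
        char     name[256];
        uint64_t size;
        uint32_t flags;         /* e.g. a hypothetical ENTRY_INLINE bit */
        union {
            uint64_t first_chunk;       /* normal file: data on disk */
            uint8_t  data[INLINE_MAX];  /* tiny file: data lives here */
        } u;
    };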

Life goes on. Went to Netizen's disintegration party - felt like, well, a straight boy at a goth club, really. You talk the same language, but there's this undertone of "you're probably really shocked that I've got piercings and weird sexual habits, eh?". Which I'm not, of course :-) But mostly just the feeling of being someone who's amongst good friends who can't be bothered meeting new people when they're having fun with the old people. Hey, I've been in the other position, so I don't blame them at all :-)

Ho hum. Job search goes on.

20 Sep 2000 (updated 21 Sep 2000 at 00:33 UTC)

Life goes on. If anyone working in Melbourne (Australya :-) has a position for an experienced Windows NT (yeah, I know) admin with SQL and internet experience, I'd appreciate it. paulway@nospam.earthling.net is my address - minus my request for no spam, of course.

Any advogates out there who own an empeg car player?

A lot of my thoughts on file systems come from an underlying idea that a lot of the relatively common things we want to do with computers are actually quite tricky - writing to any part of a file being a prime example - simply because we're still in the habit of thinking in terms of 20 year old operating systems.

I'm not suggesting removing all paradigms that remind us of the physical representation of our objects. I personally think the idea of having memory and disk as one single space makes about as much sense as trying to go down to the chemist in a 747 or trying to get to another continent by scooter. But we can start looking at the way we use our computing resources and see if we can make them work better for us. I think a disk cache that lives in 'free memory' and scales dynamically as memory is required and freed by other applications is an excellent example: the overhead of managing it is now outweighed by the performance it buys, and the benefits are larger than those of a fixed-size cache implementation.

I can't think of any other good examples of computing operation paradigms that need shifting, though. :-)

This diary entry is sort of summarising all the thoughts I've had about a new file system methodology. I think it's new because I've never heard of anything like it; I think it has a lot of good features that make things easier for programs, users and even the OS itself; and I think it makes the best use of a fully multi-tasking OS without causing too much hassle on non-multi-tasking systems (i.e. this functionality is optional). In these days, when someone can actually write the code to do this without being employed by a huge powerful corporation (and without having to justify its development and testing on anything but eir own basis), it's time to get it out of my mental closet.

The essential difference between this FS and all others is that this treats a disk as a single contiguous array of sequential bytes. No sectors. Each object stored on the disk, be it a file or a directory or something else, is stored as a series of chunks, where each chunk starts at a point on the disk and consists of a contiguous stream of bytes from there. All these numbers are 64-bit, to cope with modern storage requirements. Disks are numbered starting from byte 0 and working up, in some sort of established order through the sectors, cylinders and heads, to provide a linear contiguous byte array up to the last byte on the disk.
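
For illustration, the conventional way to flatten cylinder/head/sector geometry into that single byte index might look like this (512-byte sectors assumed; all names are mine):

    #include <stdint.h>

    #define BYTES_PER_SECTOR 512

    struct geometry {
        uint64_t heads;               /* heads per cylinder */
        uint64_t sectors_per_track;
    };

    /* sector is 0-based here; classic CHS counts sectors from 1 */
    uint64_t linear_byte(const struct geometry *g,
                         uint64_t cyl, uint64_t head, uint64_t sector)
    {
        uint64_t lba = (cyl * g->heads + head) * g->sectors_per_track
                     + sector;
        return lba * BYTES_PER_SECTOR;
    }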

Let's start by looking at how a file is arranged. The file entry in the directory contains the file name, date, total size, and the usual attributes. Then follows a list of the chunks in the file, consisting of ([Start Byte],[Offset on disk],[Length]) tuples. [Start Byte] is the number of the first byte of this chunk within the file, [Offset on disk] is the physical disk location of the start of this chunk, and [Length] is obviously the length of this chunk. The chunk lengths should always add up to the total size of the file.
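
Rendered directly as C structures - the field names are mine, but the layout follows the description above:

    #include <stdint.h>
    #include <time.h>

    struct chunk {
        uint64_t start_byte;    /* logical position of the chunk in the file */
        uint64_t disk_offset;   /* physical byte where the chunk begins */
        uint64_t length;        /* bytes in this chunk */
    };

    struct file_entry {
        char     name[256];
        time_t   date;
        uint32_t attributes;
        uint64_t total_size;    /* must equal the sum of chunk lengths */
        uint32_t chunk_count;
        struct chunk chunks[];  /* the list, in start_byte order */
    };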

When writing a file, ideally one simply finds a chunk of free space and writes the entire file contiguously into it. If, for some reason, you're writing a file that is larger than the largest single chunk of free space, you just write multiple chunks, using up each chunk of space before moving on to the next. I haven't done much research into which allocation methodology is best - best fit, worst fit, random fit, whatever - but that can be tested and possibly even configured at time of use. Reading a file is fairly obvious.
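
A first-fit sketch of that spill-over case, assuming a flat array of free regions - carve pieces off until the whole file has a home (self-contained and hypothetical, mirroring the structures above):

    #include <stddef.h>
    #include <stdint.h>

    struct free_region { uint64_t offset, length; };
    struct chunk       { uint64_t start_byte, disk_offset, length; };

    /* Fill 'out' with up to max_chunks chunks covering file_size bytes.
     * Returns the number of chunks used, or -1 if space runs out. */
    int allocate_chunks(struct free_region *freelist, size_t nfree,
                        uint64_t file_size,
                        struct chunk *out, size_t max_chunks)
    {
        uint64_t placed = 0;
        size_t used = 0;

        for (size_t i = 0; i < nfree && placed < file_size; i++) {
            if (freelist[i].length == 0)
                continue;
            if (used == max_chunks)
                return -1;
            uint64_t take = file_size - placed;
            if (take > freelist[i].length)
                take = freelist[i].length;  /* too small: spill over */
            out[used].start_byte  = placed;
            out[used].disk_offset = freelist[i].offset;
            out[used].length      = take;
            used++;
            freelist[i].offset += take;     /* shrink the free region */
            freelist[i].length -= take;
            placed += take;
        }
        return placed == file_size ? (int)used : -1;
    }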

The immediate advantages: there's no wasted space at the end of a file's last sector, because there are no sectors - small files take up exactly their amount of space, no more. Files are mostly contiguous, and fragmentation is kept to a minimum by virtue of the fact that it's relatively easy to find and use a contiguous block of storage. We waste no space on large FATs or inode lists (most of which are listing sequential sectors anyway).

Some things we can do with this that are more difficult in other FS methodologies:

  • We can write sparse files (e.g. a 100MB file with only the last byte actually storing any data) simply by setting the chunk's Start Byte to the position within the file of the data we want to store.
  • We can insert an arbitrarily large amount of data anywhere in the file by writing that chunk and then manipulating the chunk list appropriately. For instance, we can write data into the start of a file by writing that chunk and then adding the new offset to the Start Bytes of all the other chunks in sequence.
    These two concepts would lead to a radically new and easier way to write and read all sorts of files that have had to be done in roundabout ways - such as writing to the start or middle of a file without having to rewrite the entire file (see the sketch following this list).
  • There are multiple ways to defragment a file. For instance, we can examine the amount of free space immediately after each chunk, and if that free space can take the next chunk we join the two chunks and shorten the chunk list.
  • We can even be defragmenting a file while it's in use, either by only defragmenting the chunks of a file that aren't in use at the time or by making sure that the new file is written out as a new, single, contiguous chunk.
  • Supporting compressed files is much easier, because we break the file up into arbitrarily large segments and compress each as a chunk. When we want to rewrite part of a file we can write that chunk afresh and re-offset the existing chunks; we don't have to touch any chunk other than the one being modified.
  • I'd like to build hooks into the operating system that allow it to gradually defragment files in quiet time.
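
And here's the chunk-list surgery from the second item above, sketched in C - the new data is assumed to have been written to free space already, so only the list changes (the entry's total size would need bumping too):

    #include <stdint.h>
    #include <string.h>

    struct chunk { uint64_t start_byte, disk_offset, length; };

    /* Prepend 'length' bytes, already written at 'disk_offset', to a
     * file by list manipulation alone. 'chunks' must have room for
     * one more entry. */
    void prepend_chunk(struct chunk *chunks, uint32_t *count,
                       uint64_t disk_offset, uint64_t length)
    {
        /* shift the list down a slot, sliding every logical offset up */
        memmove(&chunks[1], &chunks[0], *count * sizeof chunks[0]);
        for (uint32_t i = 1; i <= *count; i++)
            chunks[i].start_byte += length;

        chunks[0].start_byte  = 0;
        chunks[0].disk_offset = disk_offset;
        chunks[0].length      = length;
        (*count)++;
    }
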
I'm still thinking on the workings of directories, but the basic plan here is that there should be little or no need to greatly expand a file's number of chunks in one go. My preference is to give each directory entry a 16-bit length and then allocate a reasonable amount of extra space for future expansions of the chunk list. But, for instance, Read Only files could have their directory entries contracted down to minimum space, because we *know* there's going to be no other block inserted into the file (at this point in time, anyway) and thus no extra chunks added.
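
For what it's worth, one hypothetical on-disk shape for that: each entry records both the space it occupies and the space it actually uses, so the chunk list can grow into the slack without the entry moving.

    #include <stdint.h>

    /* Entries laid out back-to-back, each prefixed with its length and
     * padded with slack for the chunk list to grow into. A Read Only
     * file can be written with used == allocated, i.e. no slack. */
    struct entry_header {
        uint16_t allocated;     /* bytes this entry occupies on disk */
        uint16_t used;          /* bytes currently meaningful */
        /* name, attributes and the chunk list follow, within 'used' */
    };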

I might leave it there for one day and see what comment (if any) that invites.
