Older blog entries for baruch (starting at number 53)

Filed a bug report with Mercurial and it got fixed pretty fast, though I was asked to actually produce a small test case script. This turned out to be a good experience.

I've also filed bug reports and feature requests at AlbumShaper, and they get solved one by one. I'm still stuck with a bug that prevents drag-n-drop from working when using the Ion2 window manager.

I've been distracted from work by Gramps, a program for building the family tree. It is a really nice program and I've already got a lot of data into it. It generates some nice graphs with GraphViz. I need to figure out how to make it work properly with Hebrew text; currently Hebrew is munged, so I use English names instead. This is a bit uncomfortable for those in my family whose English is weaker than their Hebrew.

I got my parents to go out and ask for the missing pieces of information, so now I have better knowledge of my family tree, and hopefully a chance to preserve some of the family history.

I just found AudioScrobbler. The idea sounds nice, though I haven't found any musical neighbours yet.

The XMMS plugin is just crap! It hangs when there is an ID3v2 tag. I fixed that and submitted the patch to the Debian BTS. The author should learn about reusing libraries and find one that already implements this code, or at least copy code from another GPLed app instead of inventing his own.

drag-n-drop woes

I've been trying to use AlbumShaper to manage my digital photo albums, but for a while now I couldn't reorder the photos in it because drag-n-drop didn't work.

After much trying and playing I found that when I'm working in my normal window-manager (ion2) I can't drag-n-drop, but when I switched to another environment, such as Gnome or KDE, with another window-manager, everything works.

Now I've got my albums in order, but I haven't figured out the reason for this weird behaviour.

We've had a conference call with Diane Peters from the OSDL and Matt McCooe, who is responsible for licensing at NUIM. This will hopefully help to clear up the issues.

It was suggested that releasing the code under the GPL grants an implicit license to the patent, but that to ensure no-one will fear future litigation we can license the patent to OSDL for sub-licensing.

This is in the works now, and I've communicated the above to the kernel folks and at least Dave Miller said that he sees no trouble licensing-wise with our contributions.

Hopefully our patches will now receive the technical review they need to continue their journey into Linus's tree.

Bugs, Bugs & Bugs

I've been drilling into the inner workings of my changes. They seem to work for all of my tests, but I wanted some more confidence.

It appears that my patches break the counting of fackets_out (whose use I haven't deciphered yet). It causes no known effect, but that does not mean it's not a bug.

Where did I put the pesticider? (An environmentally clean pesticider! -- It is actually proven to work, it hasn't killed any pest! :-)

p.s. relayfs rulez! I'm pumping tons of data out of the kernel for tracing purposes, and it just works, with little damage to performance. (Derry, this is the case where gdb is not helpful).

27 Apr 2005 (updated 27 Apr 2005 at 19:25 UTC) »
Licensing, Patents and in between...

We have finally got the approval to release H-TCP under the GNU GPL. It's been a long time and quite a bit of work to convince the university administration that it is ok to do so and that they will come to no harm by approving it. So I submitted a note about that to LKML, only to get back a message that we need an explicit patent license for the pending patent. That means another long wait, and more dealings with the bureaucracy to get that sorted.

The problem begins with the way research is funded: you always have some strings attached. You need to show later in some review how good your research is and the fruitful results it brought humanity. Only everyone is just too damn busy to really understand the results, so they prefer to see more "tangible" results, such as patents, articles and such. You have some new code for the Linux kernel? We care not about it, we prefer an almost useless patent that is unlikely to bring any profit over real benefits to humanity.

And so you have a patent, except that you don't own it. The university does. So in order to allow others to use your invention, you need to ask someone (more often than not, more than one person) for permission to release your work for the general good. It sucks.

Basically the administrations that run the universities have lost sight of the real purpose of universities and are just looking at everything through the financial reports. No wonder RMS quit MIT to do his Free Software; he'd probably still be fighting with the administration to get a single program released with sources.

On a happier note

On the suggestion of David Miller I've contacted Diane Peters from OSDL to help us understand what we need to provide, so we won't stumble through the maze like blind folks. She has agreed to help, which is great, so we are trying to organise a meeting with her to get the information and then we'll (hopefully) be able to sort it all out.

Performance issues with 2.6.11

I'm close to desperate with 2.6.11 with regard to TCP performance. Some change or another between 2.6.6 and 2.6.7 killed performance and I can't find anything wrong, not in the TCP code and not in the e1000 driver.

On 2.6.6 with only H-TCP patches, I'm getting 300Mbit/s (and 40ms rtt) with no troubles at all. In 2.6.11 I'm lucky when I'm getting 150Mbit/s. The H-TCP port between the versions is simple enough and has no real changes.

I'm not sure if it's something that I did wrong in my porting of the code, or something in Linux itself. I'm close to leaving it for now and returning to it later, but it does mean that I'll probably leave my patches unsubmitted. Which means more work later on when I come back to it, forward porting the patches from 2.6.6 to whatever will be the version at the time.

I managed to get my presentation done in time and in a good manner. I covered most of what I wanted, though I left out the actual technical details of my work. It was a postgrad seminar presentation and my time slot was short as it was (30 minutes).

I didn't get a lot of feedback, but what I got was that it was good.

English not being my mother tongue, I sometimes got stuck looking for the best word or phrase, but got over it.

Not too bad overall. The slides are temporarily at http://baruch.ev-en.org/Baruch_seminar.pdf

Now I need to go back to my work, get performance of 2.6.11 to be at least that of 2.6.6, get the patches in order and send them to netdev for review.

This time I hear the H-TCP patches might have their license issue solved, so they have a fair chance of being included.

I'm using quilt to maintain my kernel patches, and it's nice and dandy, but sometimes you need to validate that all your patches apply and that the kernel compiles at every stage. Enter the quilt-compile-all script:

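#!/bin/bash
# quilt-compile-all: check that every patch applies and that the tree builds after each one.
# Exit codes are tested explicitly below, so do not abort on the first error.
set +e

function die {
	echo "$1" >&2
	exit 1
}

[ -d patches ] || die "Are you in a quilt managed directory?"

# Start from a clean tree with no patches applied.
# quilt pop -a exits with 2 when there was nothing to pop, which is fine here.
quilt pop -a
rc=$?
[ $rc -ne 0 ] && [ $rc -ne 2 ] && die "Quilt pop -a failed."
make clean || die "Make clean failed."
make -j2 || die "Initial make failed."

# Apply the patches one at a time, rebuilding after each.
while [ "$(quilt unapplied)" != "" ]; do
	quilt push || die "Quilt failed."
	make -j2 || die "Make failed."
done

echo "Compilation succeeded."
exit 0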

I have two patch sets for the kernel folks to review: one is the HTCP congestion control algorithm and the other is performance improvements to the kernel TCP transmit code.

The first is tied up in licensing issues and there seems to be a deadlock: the kernel folks are not interested in reviewing it before the licensing is taken care of (and I understand them!), and the officials are in no hurry to solve the licensing until they know it will go in. I've context switched to other tasks to see who will break first.

The second is a set of performance improvement patches initially done by my supervisor Doug Leith, which I'm now improving. Part of my work is to port it from 2.6.6 to 2.6.11+, which is what I'm doing now, as well as refactoring the code and throwing away bits I don't think should be there.

I've switched from the original use of web100 to use relayfs and a script to add logging code, loosely based on ideas from Aspect-Oriented-Programming.

Along the way I tried kprobes (cool stuff!), but the overhead is too much for my code so I'd prefer to avoid it. Kprobes had a bug when placed on a ret instruction, which my testing helped uncover. Relayfs has gained some improvements based on my input, so my work is helping others.

And all of that in the name of having clean code so that it will be easier to submit to lkml when I'm done. No measurement code is embedded in my production code at any time.

Thanks to Tom Zanussi for his help with relayfs and the systemtap folks for their help with kprobes.

I got bored waiting for the test machines to come back up so I could run my tests on them, so I wrote two small utilities. One is a multicast sender: I set it to run when the test machine boots, and it sends a message with the machine name in it. The other is a receiver, which waits for one or more machines to notify that they are up.

Now I can set my kernel to compile and load on the test machines (a script for that already exists :-), then set the receiver to run, and when it exits the test is automatically run through ssh with public-key authentication.

All this means is that the usual cycle of 10 minutes waiting for a proper compile, boot and test, is now done with no human interaction, and I have time to get some more coffee (or write a blog entry).
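The waiting half of the receiver can be sketched in a few lines of shell. This is only an illustration, not the actual utility: it leaves out the multicast transport and assumes that something else delivers one machine name per line on stdin. The function name wait_for_hosts and the host names in the usage line are made up for the example.

```shell
# wait_for_hosts HOST...
# Reads one machine name per line from stdin and returns 0 as soon as every
# HOST given as an argument has reported; returns 1 if the input ends first.
# Assumes plain host names (no spaces) and no duplicate arguments.
wait_for_hosts() {
    wanted=" $* "      # hosts we are waiting for, space-delimited
    seen=" "           # hosts that have already reported
    remaining=$#
    while [ "$remaining" -gt 0 ]; do
        IFS= read -r name || return 1
        case "$wanted" in *" $name "*) ;; *) continue ;; esac   # not one of ours
        case "$seen" in *" $name "*) continue ;; esac            # repeated report
        seen="$seen$name "
        remaining=$((remaining - 1))
    done
    return 0
}
```

With a hypothetical receiver command printing names as they arrive, the test could then be chained as: receiver | wait_for_hosts test1 test2 && ssh test1 ./run-test.sh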

I've been working to improve the performance of the Linux Kernel TCP stack. Apparently it's not very efficient and there are several things that can be done. The measurements I'm doing and the improvements I've done so far will be the topic of my OLS 2005 paper (should it be accepted...).

The problems with the current implementation are not evident until you actually go to long links of 200Mbit/s, and they become obviously painful when you get to 1Gb/s.

The other reason the problems go unnoticed is the use of plain TCP: its congestion control algorithm (NewReno) is not effective and does not utilize the link when talking about 1Gb/s links. That's where the HTCP algorithm, which was developed at the Hamilton Institute, comes in.
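To put a rough number on that: after a loss, NewReno halves its congestion window and then grows it back by about one packet per RTT, so recovery takes roughly half a window's worth of round trips. The sketch below works this out for a 1Gb/s link. The 40ms RTT and the 1500-byte packet size are assumptions picked for illustration, and the function names are made up.

```shell
# bdp_packets RATE_BITS_PER_SEC RTT_MS PACKET_BYTES
# Packets that must be in flight to fill the link (bandwidth-delay product).
bdp_packets() {
    echo $(( $1 * $2 / 1000 / ($3 * 8) ))
}

# reno_recovery_ms RATE_BITS_PER_SEC RTT_MS PACKET_BYTES
# Time to grow from half the window back to the full window at one packet
# per RTT. Integer arithmetic, so the result is rounded down.
reno_recovery_ms() {
    w=$(bdp_packets "$1" "$2" "$3")
    echo $(( w / 2 * $2 ))
}

bdp_packets 1000000000 40 1500        # prints 3333
reno_recovery_ms 1000000000 40 1500   # prints 66640, i.e. about 67 seconds
```

Under these assumptions a single loss costs over a minute of reduced throughput, which is why the plain algorithm struggles to keep such a link full.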

BIC sucks

A colleague has been comparing different new congestion control algorithms, and BIC is so unfair that it's the algorithm of choice if you want to starve other users of bandwidth.

In addition there is a bug in the current implementation of the Linux BIC congestion control algorithm, which makes it so much more aggressive.

Full report and patches will be provided soon.
