Older blog entries for pasky (starting at number 10)

28 Jul 2004 (updated 28 Jul 2004 at 21:15 UTC) »

blm: Hi, I couldn't find your email on your homepage (which apparently doesn't exist; it looks too much like a default page ;), so I hope you will read this through the recentlog or so. And this could be interesting for other people as well.

You say that no one wants you in their project, but that's not how most free software projects work (at least the bazaar-like ones, and I suggest you prefer those, especially when starting out). You do not ask people to let you join their team. You join it and integrate into it smoothly by sending patches. Most projects are based on meritocracy - your virtual "position" in the virtual team is based on your merit for the project. The more you bring in, the more credit you get and the more people listen to you.

So, you do not look for projects with a team willing to accept you. You look for projects which are exceptionally interesting for you, which give you some motivation for working on them (be it ideological, technical fascination or scratching your own itch).

From these, you choose those where you have a clear idea what to contribute, and of course you choose projects which aren't technically over your head, be it because of code complexity, insane coding style, being too low-level, or the code simply being too big for you (it takes too long to update, compile or grep on your machine). But do not be too afraid; of course when you are a newbie Perl coder you do not start by hacking Perl6 internals, but do not be afraid to peek into Mason. Remember, you get past the newbie level only through experience, and hacking someone else's code is often much more valuable than coding something from scratch.

Actually, often the best way to start is to fix some easy bug. Do not be scared by looking at the code. It could grin at you and make obscene remarks, but it can't harm you. It is laid out on the canvas of your screen, waiting to be read, understood, grasped and touched by your fine coding hand. It is too big and takes too long to understand? Keep yourself focused.

First, be sure you skim over the documentation, both user and dev (if there is any). Then, you can look around briefly, but do not try to read everything, just try to see how the code is generally organized and what the grand scheme of things looks like; this step is optional. And then, do not look left, do not look right, look ahead and stay focused. Try to find the exact location of the code you will need to work on - grep the sources. Grep for the irritating error messages, grep for likely related keywords, patiently go through the results and identify the victim. Then look at what causes it - be sure you at least generally grasp what the routine does (do not be afraid to look at the functions it calls, just do not descend too deep; grep - or better, use ctags), grep for the callers and work straight towards the fix, not spending too much time on the uninvolved code which is just distracting you. So, the synonym for getting oriented in the codebase and understanding the code is grep. Once you get used to grepping, the code holds no mystery for you.
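If you have no grep at hand, the same "search the sources for the error message" step can be sketched in a few lines of Python. This is only an illustration: the function name grep_tree and the default list of file extensions are my own assumptions, not part of any project's tooling.

```python
import os
import re


def grep_tree(root, pattern, extensions=(".c", ".h")):
    """Yield (path, line_number, line) for every source line matching pattern.

    'extensions' is a tuple of file suffixes to search; the default is an
    assumption that suits a C codebase.
    """
    regex = re.compile(pattern)
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in sorted(filenames):
            if not name.endswith(extensions):
                continue  # skip files that are not source code
            path = os.path.join(dirpath, name)
            # errors="replace" keeps the search going over odd encodings
            with open(path, encoding="utf-8", errors="replace") as handle:
                for number, line in enumerate(handle, start=1):
                    if regex.search(line):
                        yield path, number, line.rstrip("\n")
```

Feeding it the text of the irritating error message gives you the file and line to start reading from; feeding it a function name gives you the callers.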

So, you fix the bug (ok, we skipped a lot now, but that's already up to you; I told you to find something easy ;). Now you make the patch (google it if you don't know how to make a patch); the devs will prefer that you make it against the very latest version, preferably from their CVS or SVN or whatever they are using; but usually, a patch against the latest release will do too, if you are too afraid. So, you submit the patch and wait until it gets integrated. Do not be afraid to take the criticism; learn from it, absorb the conventions used by the project, be sure to look at how they actually changed the patch before integrating it. Sometimes, the patch gets ignored; you either don't care and let it stay forever in the mailing list archives, googlable for anyone, or you care and push it; just do not push too hard, resend it once per week or so and eventually someone will at least tell you what's wrong. Again, absorb the criticism, adapt yourself and your patch. You don't like the way the project operates? First work your way up with the patches; you will either see why it is good for the project or change it. Or, if you really think they are stupid, you care enough, your contribution is big, and the project is small, you can fork.
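For those who have never seen one: a patch is just a unified diff between the original file and your fixed version. Normally you would produce it with diff or with your version control tool, but a minimal sketch using Python's standard difflib shows what ends up in it. The function name make_patch and the a/ and b/ path prefixes are merely a convention I assume here.

```python
import difflib


def make_patch(old_text, new_text, path):
    """Return a unified diff turning old_text into new_text.

    The result is an empty string when the two texts are identical.
    """
    old_lines = old_text.splitlines(keepends=True)
    new_lines = new_text.splitlines(keepends=True)
    diff = difflib.unified_diff(
        old_lines,
        new_lines,
        fromfile="a/" + path,  # the untouched upstream version
        tofile="b/" + path,    # your fixed version
    )
    return "".join(diff)
```

Lines starting with - are removed, lines starting with + are added, and the few unchanged lines around them are the context that lets the maintainers apply the patch even if their tree has moved on a little.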

So, that's how it usually works. To sum it up - you choose a project which is exceptionally interesting for you and not over your head, and start by doing some simple patch. When understanding the codebase, grep is your best friend, and you focus only on the bit of the codebase relevant to you. When submitting the patch, you learn from the criticism and do not get turned away by the absence of a warm and enthusiastic response.

Minor update

I'm sorry that I didn't post any updates lately, but basically I had to do other things and started another massive time waste, and I started delaying next posts after this and that and the BlueCode release etc. Well, let's see.

BTW I finally got invited to Orkut a few days ago, so it was another timesink for a while. It's amazing to see the social network emerging around you and see that you are just one or two nodes away from very interesting people. OTOH the communities I'm interested in are a little less... attuned to my carrier frequency ;-) (or rather vice versa).

So right now we agreed with thement and left IRC. For some time at least. Gotta do useful things, really. So, finished writing another sequel to my CVS series for root.cz, and now time for BlueCode.

#

The surprise is always there, even right near end of the journey

[Boring school part start (just skip it ;)]

So, on the last Real School Report (in this half-year; then, only the mature exam report will come to follow), I will have the first note of 4 in my life (on a school report, that is).

The Czech school notes system is that 1 is the best, 5 is the worst, and it scales variously between. I never did worse than 3 on the reports, and was pretty embarrassed even by the 3s, back then ;-). But the last time I get such a report, I will have one note of 4 there, for the first time.

It is from Biology and I pretty honestly believe that it's absurd. We had just ONE written exam for the half-year, and whatever you get you have on the final report, which violates even a few rules of the school system (where it is said that you need to have at least two notes, preferably three at the minimum). But we have the schoolmaster for Biology, so she can do whatever she wants, and it's tough when she's an... well. No comment.

In fact, I didn't really care anymore, I gave the corrected exam paper just a blank stare when I saw the note on it. It is not my profile subject (that is, anything which would really matter when getting into the university), nor would it carry my report average over any significant threshold. So who cares, when it's the last year? And I couldn't really do anything about it anyway, she said there's no way to correct it. So why worry at all?

Otherwise it wasn't too bad today, I was writing a quite exhaustive and tiresome exam paper in Math (vectors and analytic geometry in the plane and in space). It wasn't hard, but full of really long and exhausting computations (and well, I didn't have a look at it at all, so given that it is a revision from a year ago I had already forgotten a few details, but the book of math tables we are allowed to use, which contains most of the useful formulas, came to help). At the end of the two-hour marathon, I had just one sign in one part of a vector wrong, otherwise it looks like it's all ok, so I'm really happy I didn't make any more significant mistake in the whole thing.

[Boring school part end]

Besides, I was watching The Virgin Suicides (no, that is not any porn; and imdb is strange, I can't see how this could be a Comedy in any way), as someone recommended in his diary entry on Advogato (unfortunately I cannot remember who it was; if you read it, please tell me; we really need a way to do a full-text search over the diaries ;-). And it was really a great movie. Very emotional, and relieving in some strange sense. Quite sad but I would not say depressing, unlike some "sweeter" ones. I do not know how the director achieved this, but she did it really well! I think I "liked" (it sounds cynical in the context of the storyline but I mean it in the artistic sense; basically what touched me most) especially the opening part, and then the ending, from the point when Lux was forced to burn her records up to what happened when Lux left the visitors alone in the house (carefully worded to include only minimal spoilers). I didn't pay much attention to the final party so I lost the story there a little, but I'm certainly going to watch this again anyway. Certainly a must-see.

Now I'm getting Eisenstein's Battleship Potemkin and then Citizen Kane, I really wonder what it will feel like. They say these are the very best movies of all, and I have no clue what to imagine under that, how they will look and flow etc. So I'm looking forward to finally seeing them :-).

Otherwise some fixing of GTS scripts and I also really finished both seminar papers today. Tomorrow afternoon I will take them to the copy shop and by Friday I should have this whole thing finally sorted out. I hope I will get back to doing something interesting on Friday, and it will be ELinks. Besides finally releasing 0.9.1, I also hope to get a peek at a few ugly bugs together with Jonas, and I will write some simple TODO list for 1.0.0; Bugzilla is good for tracking bugs but I think it is too clumsy for a simple list of major features which you could have in front of you all the time.

Thanks to Slashdot, I have read some Tog's articles about proper user interface design etc, and it inspired a whole chain of new ideas in me. I've dreamt about writing my own clone of FluxBox which would follow the guidelines presented there, and doing something with the terrible UI of GIMP (others who did some research said there's currently no project working on redoing GIMP's UI; on the other side even others said that the development branch of GIMP made a huge leap in this aspect; I'm sure someone informed on Advogato could follow up, please...?). Well, whatever, you know the dreams :-). I have so many dreams that I don't know what to do (I mean which to fulfill) first (regarding the software ones, of course).

I'm so lazy to properly hyperlink my diary entries, I'm sorry.

#

Overslept

First, as you can probably see (if it's all working as it should), I've improved the Advogato XML-RPC push a little. Now, the links etc should work fine, and you Advogato people can actually see the caption somehow.

I've overslept today. I woke up at 8:45, and I decided it made no sense to go to school anymore (only for P.E. and then one class of English). So I slept on to about 11:00 ;-). I usually attend school relatively regularly (unless I'm sorta ill/sick), but it's getting worse with me over time.

Then, I released ELinks-0.9.1rc2, because the rc1 release was of course messed up (and besides, some interesting bugfixes appeared during the night). I got surprisingly negative feedback by some people about the compile-time configuration switch from ./configure arguments (except the very major stuff) to feature.h (sorta vim-like). So I opened a thread for it at the mailing list, we'll see how it turns out.

Looks like they've released Linux-2.6.1. I like Linus' jokes. I think splitting the maintainership workload with Andrew did him a lot of good, at least judging subjectively. I think his mailing list activity rose by a few hundred percent (at least) and his writing seems more relaxed :-).

Otherwise, this was another lazy day. I got some new TODO items from the GTS people, I checked whether Ariadne is still working somehow (it is) and otherwise I was slowly working on BlueCode. It is already basically usable but some vital features are still missing for a public release.

I was looking forward to listening to "Toulky ceskou minulosti" on the radio (it's a very interesting broadcast about Czech history, I've become addicted to it), but they stopped broadcasting it at this time! (I'll have to tune in on Thursday afternoons, which sucks.) I was seriously annoyed and it disrupted me quite a lot. But coding is a very good cure for a bad mood, in my experience.

#

Not much new today. I've glanced at the official solutions for the Math Olympics --- it's not looking exceptionally good, it is less similar to what I wrote than I expected ;-). I was also watching a BBC documentary about weather (wind, in particular). It was quite interesting, I didn't know about the jet stream at all.

Otherwise, I was in M-Soft (computer shop / small ISP) doing another seminar for its workers (I'm sorta contracted so that I take care of their security [more or less, well ;] and do various things for them that they can't do themselves, and I get free net [64kbps microwave] from them in exchange). It boiled down to ratelimiting Postfix, some SpamAssassin dances and then fixing the Debian installation over there to upgrade their kernels (while the system there was still hand-hacked RedHat, it was running 2.2.25 and I didn't have to worry about things like mremap()! ;-).

Oh and of course I've released ELinks 0.9.1-rc1 today. I originally planned to just roll out 0.9.1, but when I saw the huge amount of changes from the last night and today, I decided to let people catch at least the most obvious mispasted-code bugs. Go ahead to test it if you want ;-). It will bring various bugfixes, progressbar in status, new compile-time configuration (the ./configure parameters were getting out of hand so now you tweak feature.h), support for saving and restoring a session etc. Jonas is already the main driving force, which is good. More power to him! ;-)

The problem with me is my love for surprises. I really like making (hopefully pleasant) surprises for people, and because people also read this weblog (what a brave assumption), I can't speak out freely even here. However that doesn't matter; to a large extent, writing a diary is important for me because of the reflection, and usually I write this stuff down anyway, then I realize I would be telling things I shouldn't, and delete it again ;-).

Sorry for the lack of an update yesterday, I was quite tired because of too little sleep and it's the same today. So briefly about yesterday, basically I was attending the Czech Math Olympics - category P (Programming). It was the regional round, and there usually aren't too many people there from this region --- we've been six or seven at best. It's the Computer Science style, that is, not those stupid "Code a silly mouse-click-detecting gadget in half an hour" practical exercises (I've been to a few as well). You had just paper and you were supposed to describe your thoughts and write your program there. The tasks weren't easy.

Actually, one thing I really hate is writing computer programs (in C or Pascal or another computer language) on paper. It just seems really stupid. They are there to program computers, therefore they should be written into a computer. On the other hand, they were mostly really just supplementary to the algorithm description, which was the main challenge.

Last year, I thought I did it rather well. Then my best rank was IIRC five of ten points in one task or so. So I'd better be quite sceptical this time, but I still believe I did better than last year. I certainly didn't reach the optimal solutions in all tasks (at least in two I know I'm being suboptimal), but they all should work. Let's see. It's quite important because if you're good enough to progress to the national round, you're also automatically enrolled at MFF of CUNI CZ. That's really tempting and a good motivation. Maybe I'll describe some of the problems from the regional round later if I'm in the mood and have enough time.

Ihaquer confirmed my 2.2 theories, it is indeed not vulnerable.

We've been celebrating the Orthodox Christmas this night.

I'm currently mostly working on BlueCode, chattering on IRC, reviewing a few ELinks patches (and I implemented some really simple quick'n'dirty <object> tag support) and now I should be sleeping. Tomorrow I'm gonna visit my beloved school again.

I should implement paging of my weblog page (it's already over 100kb), and I should maybe insert some small automagic footer to the XML-RPC message sent to Advogato that my real weblog is elsewhere and only selected entries are cc'd to there (or here, depending on your POV ;). But I'll do that after I will actually sometimes submit some non-technical entry which would have nothing to do on Advogato.

Now I should still have a look at the History stuff we're supposed to take an exam on tomorrow. Bah. (Roughly) 1400 --- 1620. That's gonna be fun. And I should finish the missing bits of my seminar papers. I'm translating one piece from Douglas Hofstadter's GEB: EGB in my seminar paper for English (I must add some one-page introduction and closing for the translation); I'll surely be writing about that marvelous book in some other entry. The other paper is for Math, I'm doing some general overview of Fibonacci (and related) numbers there. It's really just browsing the net looking for interesting stuff. But now I'll be doing only some practical applications (it's easy to talk about various occurrences in nature, then some applications in art etc) and a little bit of history (Bonacci, Lucas, ...).

Woohoo, I've actually finally added at least some workaround to always reconnect to the db, this should temporarily fix my weblog page being blank besides the caption most of the time ;-).

Okay, I've planned to do various great stuff today. Well, regarding that, nothing came out of it (as usual). Oh well, I've written one paragraph for my article. Otherwise, well, I'm not exactly sure what I have been doing; I can remember some more ELinks reviewing, and some IRC idling.

Then, I've noticed that Linux-2.4.24 came out (fixing the mremap() vulnerability), so I had at least some fun for the evening. First, I was hunting for the vulnerability in the 2.2 tree, as it was suggested by the advisory. However, when trying to backport the 2.4 fix (not Linus' 2.6 one, which is quite general and potentially has some side effects, so I'm afraid to port it to 2.2), I've found that sys_mremap() in 2.2 is missing the code in question altogether.

Basically, (supposedly) the exploit relies on the ability to pass newaddr to mremap(), but in 2.2 mremap() doesn't take it yet. So the only alternative way to exploit that is to have the new address automatically assigned by mremap() (and the new address would need to be a "dangerous" one). However, that's impossible because the new size passed to mremap() must be zero for the exploit and in that case, mremap() would always just munmap() it. Possibilities:

  • There is something I don't know about.
  • The advisory is mistaken and the 2.2 line is NOT vulnerable.
  • If it is vulnerable and I'm not even more mistaken, the 2.4 fix is not complete.

Anyway, mremap() looks dirty, maybe some other vulnerabilities will be discovered in it soon, some people say they are on a track of something. Quite possibly nothing will come up from it, though.

Then I was cleaning up the Kernel's homepage. Basically enhancing w3c compatibility and tidying it up a little. Now I'll just sit and watch how hpa rejects or drops my patches ;-).

And last I was talking with Dave Weinehall, and he said that he'll make 2.0.40-rc7 available later this week (-rc6 is from May 2002) if everything will go well. Great to see 2.0 still maintained.

And I'll be sleeping for only 6 hours again. Screw it.

You know, I committed myself to so many great things today. I slept very little last night so I promised to go to sleep early (between 11pm and midnight, hah!) today, I'm going to school tomorrow again, after two weeks of Christmas holiday (I'll write some more about last-year school pains in some other entry). And I thought I'd be reviewing some more ELinks commits (well I did but far less than I expected) and fixing some ELinks glitches (well I did but far less than I expected). And I wanted to do some more BlueCode work (will be worth mentioning only if I get back to it and then you'll read about it for sure in another entry as well), I did none. At least I finally fixed all the GTS bugs I knew about.

Pah, this CIA thing is indeed pretty time-consuming. But I'm finished with it, I'll just fix bugs in my ciabot.pl CVS CIA client from now on! And I also did some hacking on my own CVS commits-over-mail announcer (which I consider the best one available by now [I'll be happy to be presented by some better one so that I can catch up ;], except that it doesn't support CCMAIL and CVS_SILENT [yet]).

My current plan is pretty clear. First BlueCode, then PaVS (how I'm looking forward to that). That mixed with random writing of articles for root.cz (high-priority) and leading ELinks to 1.0 (lower-priority). And that interwoven with doing stuff to earn some money, of course (that is GTS hacking for now). Bah. Well, yes. There's the school as well ;-).

My sister had her birthday today (5 yrs old). She got a nice toy cooking set, some crayons etc - only few presents, she received most at Christmas already ;-). And she'll have her name day in two weeks! Well, we had at least a good tasty cake, done by dad - he can do great cakes.

I didn't do as much as I wanted today :-(. Most of the day I was doing some annoying debugging for GTS (my logic is rather complicated and sick in one part of the exchange alarms watching application I'm doing for them), and I didn't fix it fully yet, so I'll have to finish it tomorrow :/. Otherwise, I was working together with scanline on converting ciabot.pl to talk XML to CIA and we were successful.

Also poked Advogato again, and rated a few other people, and marked myself as involved in another few projects. It's funny how the web-of-developers works, it's not easy to browse through people certifying themselves and not hit someone you know (as in you at least talked to him through IRC or email directly) every few nodes. Maybe it's sad, but I guess one of my mid-term life goals is to benefit the OpenSource community enough that people decide I attained the Master level (which is rather just a motivating factor for finishing the large number of started projects of mine to some usable shape).

XS26's noc machine is dead again, it seems; I just randomly poked it when looking for something. Sad state. I don't think I'll get myself largely involved again, no one is really interested, it seems. Zebra sucks ;-).

Yippie! Finally got the XMLRPC export to Advogato working --- I can't say that either XMLRPC::Lite or RPC::XML is well documented, rather the opposite is true, and it took a bit of googling and trying random approaches to figure out how to make at least RPC::XML work reasonably. I have to admit that it isn't really as elegant as in Python, though.
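For comparison, here is roughly what the Python side of such a push could look like, using only the standard xmlrpc.client module. This is a sketch and not my actual exporter: the method name diary.set and its (cookie, index, html) parameters are assumptions about the Advogato interface, so check them against the real documentation. The sketch only builds the request body and sends nothing over the network.

```python
import xmlrpc.client


def build_diary_call(cookie, index, html):
    """Serialize one XML-RPC request that would post a diary entry.

    'diary.set' and the (cookie, index, html) argument order are assumed
    here; they are not confirmed by anything in this entry.
    """
    return xmlrpc.client.dumps((cookie, index, html), methodname="diary.set")


# Hypothetical values, just to show the shape of the call.
payload = build_diary_call("session-cookie", -1, "<p>Overslept again.</p>")
```

Actually sending it is then a matter of creating an xmlrpc.client.ServerProxy for the server's endpoint URL and calling the method on it, which performs this same serialization internally.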

Advogato people, please tell me if you feel that my overly long, elaborate and boring rants are inappropriate for Advogato and I should rather keep them to myself (there you could also leave your opinions in the discussion theoretically available near each entry). Oh and also apologies for the empty entry I submitted a while ago, I naively thought that I could delete it afterwards. Hmm, actually I could implement these two features (deleting of diary entries and discussion attached to them) in Advogato, what do you think? Would it (especially the discussions) be feasible?

Aside from the usual IRC idling and submitting a new hostfs patch for 2.5 UML, I didn't do that much today. I started to resurrect my activities (and unfortunately pretty much the only activities) at XS26 a little, repairing some more broken db entries, taking care of some users who have been crying for a long time already and playing with a few peerings. I think I should finally allocate a few days for a complete rewrite of the software, releasing it as opensource, finishing support for dialup users and user BGP4+ peerings, fixing up the routing and bringing XS26 to as much MIPP as possible and reasonable. So much to do... is it even worth it? Is it going to actually help enough people?

One of the main problems is actually with the routing software. Plainly: we use zebra now, and it sucks. The NL PoP which enables our virtual presence at AMSIX is disabled now due to zebra eating all the CPU there, ospf6d particularly likes crashing and going into infinite loops as well, bgpd is probably the least problematic nowadays but sometimes it does strange things as well. And from time to time, the whole zebra suite happens to read from some blocking fd and hangs happily. Zebra just apparently can't keep up in a mixed Linux/*BSD environment (that shouldn't be much of an issue, though) with a relatively big (around 5000 entries) and very dynamic (entries flying in and out with an average frequency of one change per minute or so) routing table, at least not for us. Hacking it seems a pretty ugly job due to the nature of the bugs, which are hard to trigger and unreproducible, happening seemingly randomly. But what are the alternatives? I know only of BIRD, which would have to be taught about IPv6 OSPF (we are using OSPF for choosing routes between individual PoPs and BGP for all the other routes) and the *BSD world.

I've been thinking about some usable enough versioning/revision keeping system, having become annoyed with CVS yet another time and thinking what to migrate the ELinks source code to and what to use as a suitable platform for tracking my bunch of Linux Kernel patches (maybe it will be best done manually anyway, it seems... I will see yet). Why don't I like CVS, again? (sorted roughly by importance)

  1. The concept of changesets (tracking of whole commits and ability to work with them compactly) is totally absent --- that is particularly annoying during the daily work and pretty much defeats the version tracking ideal.
  2. When merging two branches, intermediate history between the branchpoint and mergepoint of one of the branches is lost, being compressed to one compact diff applied to the other branch. You lose the possibility to individually track the evolution of the branch, revert various specific changes etc. This flaw becomes particularly annoying when you are merging with someone else who tracks his working copy of the project privately, so you can't even go and stuff the original branch revision numbers into your CVS to get the needed diffs (see also below about distributed development).
    Being able to merge a branch repeatedly without any problems is obviously a natural requirement as well.
  3. One cannot move and rename files reasonably, I mean without losing history.
  4. Treating directories too specially, being unable to remove them, move them etc. Also the directory structure should be versioned if possible. Let me also stuff here that having 'CVS' subdir crying at you there could be disturbing for some people :^). I can't see reason why it can't be ./.CVS or ./.cvs ...
  5. Dumb permissions system. You can't finegrain permissions gained to individual developers over branches, files etc.
  6. (At least partial) absence of metadata tracking --- files permissions and ownership, in particular. Symlinks and other special files storage is included in this point as well.
  7. It has problems with binary files. Ie. binary diffs support is missing.
  8. Inefficient communication over the network, and usually a hackish (pserver) way of accomplishing this communication at all.
  9. Absence of distributed development support, where people could easily track their personal copies of the repository and do what they need there, then be able to ask the maintainer of the project's central repository to review their changes and optionally pull all of them, or only some changesets, back to the central repository, where they would merge intelligently without losing the history. Obviously per-branch write permissions and intelligent merging alone could do, but the number of branches could grow in an uncontrollable fashion, the administrative load would be high (since to be able to benefit from this, each potential patch maker who would like to have the history tracked would ask for an account, a branch for feature A and a branch for feature B) and being able to work on a local copy of a repository is still incredibly useful (did you ever want to code something while on vacation or in an airplane? do you have a dialup connection?).
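To make the first point of the list concrete: CVS records every file revision separately, so a "changeset" has to be guessed after the fact. A minimal sketch of one common heuristic follows: group per-file commits that share an author and a log message and lie close together in time. Everything here (the FileCommit record, the group_changesets name, the 300-second window) is an illustrative assumption, not something CVS provides.

```python
from collections import namedtuple

# One per-file commit, roughly what a CVS log gives you for each file.
FileCommit = namedtuple("FileCommit", "path revision author message timestamp")


def group_changesets(file_commits, window=300):
    """Group per-file commits into changesets (lists of FileCommit).

    Commits land in the same changeset when they share author and log
    message and each follows the previous one within 'window' seconds.
    """
    changesets = []
    last_seen = {}  # (author, message) -> (changeset, last timestamp)
    for commit in sorted(file_commits, key=lambda c: c.timestamp):
        key = (commit.author, commit.message)
        entry = last_seen.get(key)
        if entry is None or commit.timestamp - entry[1] > window:
            changeset = []  # too far apart (or first one): start a new set
            changesets.append(changeset)
        else:
            changeset = entry[0]
        changeset.append(commit)
        last_seen[key] = (changeset, commit.timestamp)
    return changesets
```

A system with real changesets stores this grouping at commit time instead of guessing it, which is what makes reverting or pulling "that one change" across many files reliable.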

So I've also been looking around for various OpenSource tools (yes, the favorite here would certainly be BitKeeper, which seems to implement most of these ideas excellently) which would be able to accomplish at least a part of this. Also it should be able to offer a CVS-compatible interface at least for readonly access --- 95% (or more) of OpenSource people out there just use CVS and I want them to still be able to easily track the latest ELinks tree.

The first obvious candidate was SubVersion, but the intelligent merging is mandatory for me and svn looks to entirely miss that kind of thing, unfortunately; also, I didn't look at the code but I tend to usually trust Al Viro regarding it ;-). But when forgetting SubVersion, there's frankly not much out there I've found. There's OpenCM, which misses most of the mandatory features listed above, though. Looking at Perforce, I've found only marketing talks and no real list of features which matter. A few other projects lacked any easily accessible features list at all. Anyway, the promising projects are Aegis (from what I saw at web, it is almost ideal; its problem is that the documentation is in .pdf only, which is a highly hostile document format for me) and arch (I'm not sure if that one would be that easy to have CVS frontend).

First I've thought about writing some version management system reasonable for me from scratch, but now I hope I could adapt and possibly extend (if it already has the most fundamental features, things like metadata tracking or CVS frontend [provided that the underlying concepts aren't *TOO* different] shouldn't be that hard) either Aegis or arch. I will check them out tomorrow and we will see if anything comes out of it ;-). Do you have any other tips for good tools? Or any suggestions regarding my working "Pasky's Ideal Versioning System Features Specification"?

Oh, look, I've been *again* writing too much. I should rather sleep, recover from the illness, and not forget to take Bromhexin regularly. And... Don't forget to water the flowers! Don't forget to water the flowers!
