Free Source Project Management

Posted 4 Nov 2000 at 22:01 UTC by rlk

Project management and engineering is a largely neglected aspect of free source development. Monty R. Manley addressed this issue in a recent article on linuxprogramming.com, but I have reservations about his particular recommendations. Nevertheless, it's an issue we need to tackle.

Introduction

Project management and engineering is a largely neglected aspect of free source development. Monty R. Manley addressed this issue in his article Managing Projects the Open Source Way. He proposes a much more formal style of software development for free source (my term for the union of free software and open source) projects, along the lines of traditional commercial development. I absolutely agree about the need for better engineering practices. What I think we need to do, though, is recognize how free source operates, extract the best practices from it, and from that try to find engineering practices that mesh well with the culture.

Manley's article is very thought-provoking, and prodded me to give some thought to issues that have been at the back of my mind for a while. As a free source project lead myself (for gimp-print), I've had to face a lot of these issues. In my professional career, I've frequently been both a developer and release engineer, and I've developed some insights from this experience.

What I'd like to do is explore some of the issues Manley raises, analyze how they apply to the free source community, and come up with some suggestions, or at least points for future work.

The Waterfall Model and Formal Methodology

For many years, the accepted methodology for software development was what's often called the "waterfall" model: starting from a carefully done requirements analysis, the team proceeds to do a functional specification (architecture), then a high level design, then a detailed design, and only then does coding commence. After coding comes testing, and only then is the software released. This seems logical; one shouldn't start building something until one understands what's being constructed. By constructing software in this disciplined fashion, the engineer knows exactly what needs to be done at each step along the way. Many people have suggested methodologies loosely based around this model; there are various commercial tools available that supposedly make these various steps more mechanical and therefore less likely to go wrong.

As I understand Manley's thesis, it is a weakness of free source development that the early steps (requirements and design analysis) are largely neglected. I disagree with this on two counts:

  1. The waterfall model has serious weaknesses.

  2. Free source development does in fact perform requirements analysis, just in a different manner.

Let's start with the weaknesses in the waterfall model itself. Manley asks

How can you write a program if you don't know what it is supposed to do?

That's a good question. How can one build anything without knowing what it's supposed to do? Nobody in their right mind would dream of building a bridge without knowing what it connects, the traffic load it is to bear, the underlying geology, and so forth. Why should software be any different?

In many cases it shouldn't be. The software team supporting the Space Shuttle is famous (as documented by Richard Feynman) for delivering bug-free software on time without working 80-hour weeks. Programmers writing the embedded code to control a medical monitor have to deliver under equally rigorous conditions, and do so routinely. In such cases, well-disciplined teams do follow this kind of model: they are given a set of requirements, and carefully go through repeated design iterations and reviews before committing a single line of code to disk. The coding is in many cases a largely mechanical process, and bugs in even early builds are a serious concern, because they indicate that something went wrong early on. But is this model really efficient or appropriate for free source software, or even for most commercial (as opposed to dedicated) software? I believe that it's neither efficient nor appropriate much of the time.

The examples I gave are examples of mission-critical applications, where the software is just one component of a larger deliverable on which people's lives depend. It is a slow, painstaking process that trades off innovation for reliability. I hope that if I ever need to be hooked up to a medical monitor that the programmer emphasized reliability and correctness over the latest whiz-bang graphics. Most software that most people interact with directly is not mission-critical in that way; it only needs to be good enough to get the job done effectively.

That isn't to say that I agree with the tradeoffs that, say, Microsoft makes; they have taken this to extremes even to the low levels of the operating system, so that the base is not robust. The point, though, is that most computer users benefit from having additional functionality at the expense of ultimate perfection; the perfect is the enemy of the good, and the hard part is deciding what is "good enough". It simply isn't necessary to do a perfect architecture in many cases, for example. On the other hand, not giving it enough attention means problems down the road. The more other software will rely on this package, the more essential solidity is.

The Weakness of Traditional Requirements Analysis

The other part of the problem is that the user base often doesn't know what the program is supposed to do. The space shuttle mission specialists do know exactly what they need, and they've learned over the years how to express it. Users of a word processor, for example, know that they want to edit and format text, but only at a general level. It simply isn't possible to gather comprehensive requirements before starting work. It's easy for a user to say "Oh, I really don't like having to indent each paragraph" if the editor doesn't do that automatically; it may be hard for a user to think of that in a vacuum. A user not suitably trained may not know how to express a requirement that he or she actually does understand.

In this situation, the familiar GIGO (garbage in, garbage out) principle comes into play: if the initial requirements are useless, then any functional spec written from those requirements is equally useless. Following the waterfall model may yield a perfect white elephant.

What I've sometimes observed is that in order to follow the rules, a programmer will indeed write a functional spec and design, but only after writing and debugging the code to the point of doing something useful. Sometimes this is winked at; sometimes it's blamed for delays and other problems, and sometimes management has no clue what's going on. Sometimes, in my view, it's a perfectly rational response to a perfectly rational requirement.

Why is that? Well, let's get back to the issue of incomplete or incorrect requirements. The fact that the requirements are faulty is only detected when the prototype (which is what it really is) runs well enough so that the prospective user actually sees that it isn't doing the right thing. As long as the requirements can be corrected at low enough cost, this iterative process can work.

The demand for requirements, architecture, and design documentation on the part of management is actually rational, even if the way it's handled often isn't. Even after-the-fact design documentation tells the next generation of programmers what's going on, and usually a lot more effectively than documentation written ahead of time that isn't updated to reflect reality. The requirements document at least demonstrates what problem the software ultimately purports to solve.

In order for requirements to be corrected cheaply, the programmer has to be close to the end user. That's a lot easier in the free source world than in the commercial world. The free source programmer doesn't feel compelled to hide what she's doing for fear that her competitors will steal a march on her, or to withhold functionality now so that it can be sold as an upgrade later.

However, for this to work well, it must be easy for the end user to find and try new packages. There have been steps taken in this direction; the GNU configure system provides a common way to build packages, for example. However, it's still very difficult and time-consuming for someone to download and build a package to try it out, but then discard it (and restore the system to its prior state, with no loss of data or system stability) if it doesn't work out. It's also often not obvious how to give feedback. None of the existing packaging systems really support this.
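
To make this concrete, here is a minimal sketch of a try-and-discard workflow, assuming the package follows the standard GNU conventions (./configure --prefix, make, make install). The ~/try layout and the function names are invented for illustration, and this only approximates "restoring the prior state" for packages that install entirely under their own prefix:

    #!/usr/bin/env python3
    """Build a source package into a throwaway prefix so it can be tried out
    and then discarded without touching the rest of the system.  Assumes the
    usual GNU configure conventions; the ~/try layout is made up."""

    import shutil
    import subprocess
    import sys
    from pathlib import Path

    def try_package(source_dir: str, name: str) -> Path:
        """Configure, build, and install source_dir into ~/try/<name>."""
        prefix = Path.home() / "try" / name
        prefix.mkdir(parents=True, exist_ok=True)
        for cmd in (["./configure", f"--prefix={prefix}"], ["make"], ["make", "install"]):
            subprocess.run(cmd, cwd=source_dir, check=True)
        return prefix

    def discard_package(name: str) -> None:
        """'Restore the prior state' simply by deleting the throwaway prefix."""
        shutil.rmtree(Path.home() / "try" / name, ignore_errors=True)

    if __name__ == "__main__":
        prefix = try_package(sys.argv[1], sys.argv[2])
        print(f"Installed under {prefix}; delete that directory to discard it.")

A real packaging system would also have to track dependencies, handle packages that touch files outside their prefix, and give the user an obvious channel for feedback, which is exactly the gap described above.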

Creating, Copying, and Cloning

It's often said that free source is a lot better at copying and extending than innovating from scratch. I'm not entirely convinced this is true; some of the most innovative software is the product of an open development environment: the ARPAnet, UNIX (yes, UNIX started in a largely free environment), the Lisp Machine, and the very notion of a windowing system (at PARC). Furthermore, there really isn't all that much innovation from scratch, particularly in the software world; most projects build on existing work, in both commercial and free software. I agree that it's probably harder to start from scratch on an initial, very complex goal with no prior example and without the funding required to support a team of developers working full time for an extended period, but such projects are few and far between anywhere. The examples I gave of truly innovative free projects all received substantial outside funding.

But is this really a strike against free source development? I think not. It's just as satisfying for the user in the end to have a better way of doing something familiar as it is to have something truly new and innovative. Linux is a derivative of UNIX, but with a new implementation and the chance to do things better (in many ways, it's much lighter in weight than commercial UNIX products). The GIMP started out life as another image editor modeled on Adobe Photoshop; it has evolved in its own direction and in many ways is more powerful than Photoshop.

One of the major goals of object-oriented programming, in fact, is to permit the creation of components that can be used as building blocks to produce something greater. What a lot of people fail to recognize is that the groundwork for this was laid by the likes of Aho, Weinberger, and Kernighan (yes, the authors of awk), and the other tool builders of the early UNIX days. Using a simple shared data representation (whitespace-delimited fields in lines of ASCII text), they built up a set of tools that could be glued together (with shell scripts) to produce much more powerful tools. Of course, that's not true object oriented programming, but it embodies much of the spirit of reuse.
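
As a rough illustration of that shared data representation, here are two small filters written against the same model -- whitespace-delimited fields in lines of text -- and composed the way the classic tools are composed in a pipeline. The field number and the ls -l example are invented for this sketch, not taken from the original tools:

    """Two tiny filters sharing awk's data model (whitespace-delimited fields
    in lines of text), chained the way shell tools are chained in a pipeline."""

    import sys
    from collections import Counter
    from typing import Iterable, Iterator

    def select_field(lines: Iterable[str], n: int) -> Iterator[str]:
        """Roughly awk '{print $n}': emit the n-th (1-based) field of each line."""
        for line in lines:
            fields = line.split()
            if len(fields) >= n:
                yield fields[n - 1]

    def count_values(values: Iterable[str]) -> Counter:
        """Roughly sort | uniq -c: count how often each value appears."""
        return Counter(values)

    if __name__ == "__main__":
        # e.g.  ls -l | python3 fields.py   -- counts files per owner (field 3)
        for value, count in count_values(select_field(sys.stdin, 3)).most_common():
            print(count, value)

The point is not these particular filters, but that because they agree on such a simple format, either one can be replaced or reused without touching the other.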

This kind of flexible tool building and reuse is a characteristic of software that distinguishes it from other engineering endeavors. The capital cost of tooling up to produce a new hardware widget, even a minor variation on an existing one, is very high. Machinery must be rebuilt, new dies must be cast, new containers designed, and so forth. A new production run must be started. If the new widget turns out to be defective or incorrect, expensive materials must be scrapped, and precious production time is lost. In software, in contrast, the production cost for a unit item is essentially zero; the design and implementation cost is the only cost. This encourages free experimentation and creative use of existing components.

In any event, the free sharing of code that characterizes the very core of free source is a tremendous strength of this form of software development. Indeed, if there's a weakness, it's that there is so much out there that nobody can keep track of it, and finding it is a challenge. This suggests an urgent need to catalog and index the variety of free source out there, so that people who might like to use it can find it more easily.

This also suggests why patents and overly-strict interpretations of copyright are potentially so devastating to software, because they inhibit this free interchange of ideas and methods. Patents are intended to encourage innovation by granting the original creator a limited monopoly on use of the innovation. But software does not need grand innovations so much as wider and more clever use of existing methods, and so patents actually serve to inhibit the kind of innovation that software needs.

Release Early, Release Often, Release Never?

Manley also takes issue with the "release early and often" concept. He specifically argues that pre-alpha (defined as feature- or API-incomplete) software should generally never live outside of the development team. He also claims that "release early and often" stands formal release methodology on its head. I disagree with both these points -- not only do I believe that development software should be available broadly (and I'll give some concrete examples of why shortly), but also, if done correctly, it is not at all at odds with good release methodology, which I endorse.

First of all, let's understand what "release early and often" is intended to accomplish. The goal of this methodology is to allow prospective users the chance to experiment with something and report problems, or request new features, or add new features themselves and contribute them back. However, simply releasing early and often doesn't guarantee that this will happen; if the project quality as a whole is poor, or it doesn't do anything that anyone cares about, this will not happen. If releases come so frequently that users get frustrated, or if they don't offer enough, they will also fail in their purpose.

Large companies frequently do internal releases periodically (anywhere from nightly to monthly), and encourage employees who wish to be on the cutting edge to try these builds. This is a form of "release early and often", even if the audience is smaller. So this really isn't something unique to free source, although fully public release is.

Let's take another look at the goal of release early and often: to allow the user to experiment with something new. I discussed above an iterative model of requirements analysis: give the user a prototype and refine it based on feedback. Doesn't that sound familiar? By releasing frequently, we give the user an opportunity to offer feedback.

To be effective, though, this must be done well. Simply cutting a new release every night and telling everyone to upgrade isn't going to work; users have no idea what to expect and will spend all their time upgrading, quickly growing tired of the exercise. Users who want to do that should use the development repository. To be useful, a release must:

  1. Contain sufficient new functionality or bug fixes to be worth the effort.

  2. Be spaced sufficiently far apart to allow the user time to work with the latest release.

  3. Be sufficiently functional so that the user can get work done (quality).

This doesn't mean that each release must be a fully polished product, but it does mean that it must have the characteristics of a good product: it must be coherent, worth the upgrade effort, and clean. Note that I didn't say complete -- we already understand why it's hard to create a complete product without understanding what the user needs.

There's another benefit to all this: it forces developers to continuously test their work, since it's never too far from visibility. None of this is incompatible with a good release methodology, which addresses all of these things.

Keeping a project too close to the development team -- not allowing outsiders to use it until very late in the game -- means cutting off the ability to find evolving requirements. Here are two sets of contrasting examples, involving very similar projects, one released publicly and the other one not:

Emacs

Emacs (the text editor) -- there are currently two versions derived from the GNU Emacs base, the continuing GNU Emacs and XEmacs, which split off in the early 1990's. GNU Emacs is developed by the Free Software Foundation, and XEmacs is developed by a broader team. The split occurred for reasons involving code copyright, but the development methodologies have been very different.

GNU Emacs is developed by a small team within the Free Software Foundation. Currently, version 21 has been in development for several years; version 20.7 is the current release.

XEmacs is developed in a public manner, with a stable and a development branch, and a public web site. The current stable branch is 21.1.11, and the current development branch is 21.2.35.

XEmacs has many visible differences (including embedded images and proportional fonts), but the internal architecture is now very different; it's much more refined, with things such as characters, events, and keymaps being first class objects. Source: http://www.xemacs.org/About/XEmacsVsGNUemacs.html. Note that this is somewhat out of date, but many of the differences persist.


GCC

GCC (the C compiler) -- the Free Software Foundation developed GCC in a similar fashion, with only targeted releases being made available externally. Development stalled out around 1996, and with no visibility into the project, nothing happened. A few years later, a team based at Cygnus Solutions took the existing code, added some outstanding patches, reworked some things, and released EGCS (Experimental GNU Compiler Suite). This was developed in an open fashion, and made rapid progress. Eventually this split was healed when the FSF transferred stewardship of GCC to the EGCS team.

In summary, the advantages of the public development model are:

  1. Better accountability -- people outside of the project can see into it, and the development team feels more pressure to keep things moving along.

  2. More early testing.

  3. More attraction for prospective developers, and so a broader developer base.

Manley is quite correct, though, to note the distinction between development (or pre-alpha), alpha, beta, release candidate, and release. Let's take a closer look at what the user base typically is:

  • Development releases are used by people who enjoy living on the edge, or who are going to form the core user base of the new feature and who really need to see what's going to be happening to offer early feedback. These people might not be interested in contributing code (developing), but sometimes these users do become developers. People who use development releases have a responsibility to understand the hazards. Developers have a responsibility to make these hazards clear.

  • Alpha releases are for people whose needs are less urgent, but who want to see something that more or less looks like the final product. Early adopters should experiment with alpha releases, and be prepared for problems, but the development team needs to start exercising more discipline.

  • Beta releases are for mainstream/early adopters, who reasonably expect good functionality and reasonable polish, and who want to help test the ultimate product. The development team should exercise strong discipline at this point.

  • Release candidates should be very close to the final release. The development team should stand behind release candidates as though they are final releases.

  • Final releases should be a product that the project team is comfortable with anyone in the target audience using.

As long as it's clearly understood by all parties -- developers and users -- what's expected at each step, there shouldn't be any problems. The hard part -- and this seems to be as difficult for commercial developers as for free source developers -- seems to be exercising the appropriate discipline at each step. Usually the problem is exercising too little discipline from alpha forward, but exercising too much discipline too early runs the risk of the project growing tedious and not maintaining forward progress. The release engineer needs to understand the process. This is one place where there's no substitute for an iron fist.

In my experience, it's usually around the transition from alpha to beta that projects, both commercial and free, start to lose their way, and beta is often handled poorly. Beta is usually entered too early (with the project not complete enough), and the project is not willing to do enough beta releases (and thereby spend enough time) to ensure a clean product. Beta really should be feature complete. If testing reveals too many problems (either deep bugs or clearly missed requirements), the beta should be withdrawn, and the schedule should be slipped appropriately. If that's not acceptable, it should be understood that the release will be flawed.

An interesting development is the rise of Linux distributions. The organizations producing these distributions are essentially system integrators, and the good ones take an active role in monitoring development and integrating packages cleanly into their distributions. This is an interesting model, and it may have implications for free software development. The distributions could be a very useful source of feedback to the development projects. If distributions were to arrive at a common set of standards and practices for developers to follow, and publish cutoff dates for integration into their individual distributions, that would help guide developers. Perhaps commercial Linux distributions, which have revenue, could perform as a service some of the less pleasant tasks that a lot of free source projects don't tend to do internally.

Maintenance

The first major release is always easy. There are no pre-existing expectations; the initial code base was small and easy to work with, and everyone's excited to have their project out there for the first time. The second one is hard. People are burned out; figuring out where to go next is harder (all of the obvious good things were done the first time); the code base is more complex; people feel cocky from having done it once; and the team doesn't have the experience yet to know what happens next.

This is not unique to free source projects. I've seen exactly the same thing happen in closed source projects. I've been involved twice with major new software products, and both times I've seen a good first release and confusion in the second. This has been documented, in the context of business rather than technology, in Crossing the Chasm by Geoffrey A. Moore (with a foreword by Regis McKenna). While I'm not convinced that the book has all the answers, it does at least document the problem. About the only solution is perseverance and organization. I'm facing this right now with gimp-print; our first release (4.0) has been very successful thus far, but figuring out what comes next is much harder.

Where Do We Go From Here?

In the spirit of stimulating further thought, here are some concluding thoughts and recommendations.

  1. The pure waterfall model seldom works well in commercial projects, and it's likely to be even harder to apply in free projects. However, there are useful lessons to be drawn, and even performing some of the steps out of order carries benefits.

  2. "Release early and often" performs a lot of the functions for free source that regular internal releases do for closed source. If handled correctly, it is of great benefit to the project, and there are actually representative case histories that demonstrate this. Project leads should emphasize frequent convergence to allow clean (not necessarily complete, and not necessarily bug-free, but usable) frequent releases. Rather than being poor practice, it actually forces good engineering discipline.

  3. The free software model has certain unique strengths, such as the freedom to share code, that are very much in accord with contemporary development practices. However, in order to share code effectively, it must be of a certain quality and functionality. There's also so much code out there that nobody knows where it is. If we can devise a system to index all of the code, so that people can take more components off the shelf and use them with relatively little modification, we can leverage this to great effect. While it's often said that free source is weak at de novo innovation, perhaps the real answer is that free source is particularly strong at synthesis, since there are no strategic business reasons for avoiding use of somebody else's code.

  4. Good release engineering is good release engineering, free source or otherwise. A lot of that is just good self discipline. Engineering is 90% common sense applied to 10% specialized knowledge.

  5. The Linux distributions and other free source-related vendors could offer more services to the free source community. While this would carry costs for the vendors, it would also benefit them by improving the overall quality and functionality level of free source software.

  6. Getting good feedback from users is hard. What's the best way to do it? A web-based form, a mailing list, or a feedback tool built into the application? If the latter, can we come up with a common mechanism for that purpose? (One possible shape for such a mechanism is sketched after this list.)

  7. It's always easier doing the first release than the second. The first release is very exciting, and usually has the biggest jump in functionality. How do we get past this barrier?

  8. Free source (particularly free software) developers are usually volunteers. How do we motivate them (ourselves, really) without pushing too hard? What kinds of organizational structures work best?

  9. What developer tools do people really need, and how can we minimize the spinup time? SourceForge is trying to address this, but it's not perfect. Analyzing what works and what doesn't work with SourceForge could go a long way toward improving it.

  10. Perhaps we need some kind of free source engineering summit, like the printing and database summits?
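
Expanding on point 6, one possible shape for a common, built-in feedback mechanism is sketched below: the program collects its own version and platform details so the user only has to describe the problem. The program name, version, and report address here are purely hypothetical:

    """A minimal built-in feedback helper: gather environment details
    automatically so bug reports arrive with useful context attached."""

    import platform

    def build_report(program: str, version: str, user_comment: str) -> str:
        """Assemble a feedback report the user can paste into a mail or web form."""
        header = (
            f"To:       bug-{program}@example.org\n"
            f"Program:  {program} {version}\n"
            f"Platform: {platform.platform()}\n"
        )
        return header + "\n" + user_comment + "\n"

    if __name__ == "__main__":
        print(build_report("myeditor", "0.3.1",
                           "Auto-indent stops working after pasting text."))

Whether the report then travels by mail, a web form, or something else is a separate question; the common piece is the report format and the automatic context.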


Experiences with GNU Parted hacking, posted 5 Nov 2000 at 04:02 UTC by clausen » (Master)

Hi all,

Interesting article (i.e. I agree with lots of it :-P)

I'm the GNU Parted maintainer (wrote ~70% of the code). Here's some reflections:

  • Parted is a fairly small program (~25000 lines, ATM), with me being the only permanent active hacker, but with lots of hackers sporadically interested and making useful contributions. (E.g.: ext2 support, PC98 support, help with Mac support, user interface issues, etc.)

  • Quality control is very important for Parted, because people's data is at stake! Therefore, Parted has regression tests, and assertions compiled into MAINSTREAM releases, etc.

  • Parted is certainly "release early, release often". Like Linux, we fork stable and development versions (porting fixes, etc. between versions), and before converting a development version to a stable version, go through a long 1.X.0-preY phase. This preY phase usually lasts ~1-2 months. Like rlk said, important missing features or design flaws are often found at this point. I think these issues (as opposed to mere bugs) should be dealt with at this point, rather than keeping to a strict "discipline" of applying bug fixes only. I have always dealt with them, because it seems Wrong to have bad code going out to The Masses (in a "stable" release).

    In fact, I sometimes fix design flaws in STABLE versions - particularly if the design flaw makes the code difficult to understand/debug. This might involve rewriting 100s of lines of code. Some people might think this is insane - but I think it leads to better reliability in the next stable revision (I still go through a short preY phase of ~1 week).

    So, my point here is: a "release engineer" should think about the impact the changes will have on the users - not simply follow "discipline".

  • Parted gets quite a bit of user feedback :-) I put encouraging messages in the documentation:

    Feel free to ask for help on this list - just check that your question isn't answered here first. If you don't understand the documentation, please tell us, so we can explain it better. General philosophy is: if you need to ask for help, then something needs to be fixed so you (and others) don't need to ask for help.

    Also, as I said earlier, Parted has lots of assertions (a rough sketch of the idea appears after this list). If one fails, an error message comes up:

    You found a bug in GNU Parted. Please email a bug report to bug-parted@gnu.org containing the version (<VERSION>), and the following message: <ASSERTION-DESCRPTION>

    I've probably received ~50 bug reports this way (just did "grep | wc" on mail)

  • Parted, being a small project, hasn't really needed any special tools for this kind of thing. OTOH, it could probably benefit from something like Aegis (which is GPL, BTW).

  • "Motivating" people isn't really an issue, is it? If some people in a project want good quality control, etc., then these issues will be discussed, and dealt with (if not, then there are more fundamental problems...?)

Andrew Clausen

Counterpoint, posted 5 Nov 2000 at 17:43 UTC by mrorganic » (Journeyer)

As the author of the original article, I felt I should post a response to this article. However, my philosophy is encapsulated in my article, so I will not restate it here; rather, I want to address the idea that OSS developers -- particularly Linux developers -- are somehow exempt from the pitfalls and problems encountered by other development models. This simply isn't true -- in fact, the OSS development model presents even more problems due to the highly distributed nature of the development teams.

I also vigorously dispute the claim that the "traditional" method does not work. When the formal method is followed, it works quite well. It's just that the software industry as a whole doesn't do a good job of following the model.

My concern is that the Linux development teams tend to disparage the formal model because it is, in their minds, a "closed source" way of doing things and is therefore bad. Nothing could be further from the truth: good engineering practices remain good, whatever the philosophy of the developers. ESR's paper "The Cathedral and the Bazaar" is often used as a defense of this method, but it's worth noting that even ESR took Linus to the virtual woodshed for using his "inbox as a patch-queue".

As for the objection to the traditional method of gathering requirements, I can only say that if you can't gather good requirements, you're not asking the right questions. You cannot -- absolutely -- write good software without knowing to a pretty detailed degree what the software is supposed to do and how it is supposed to do it.

In my mind, this is why so much software on Unix tends to be derivative rather than revolutionary. Unix programmers are excellent implementors, but tend to have trouble innovating -- I believe this is due to a lack of skills in the requirements and design fields. You can't innovate if you don't know how.

And the only way to get good at it is to start doing it.

Methodologies, posted 5 Nov 2000 at 19:12 UTC by nymia » (Master)

Here are some additional links for methodologies used in project management:

1. COCOMO
2. Fagan
3. UML
4. Yourdon
5. Meyer
6. Etc.

Counter-counter-point, posted 5 Nov 2000 at 21:14 UTC by rlk » (Journeyer)

I certainly don't believe that Linux and other free source development is free of the problems faced by closed source development models, although I can see why some of the things I said might be interpreted that way. I do believe that free source development does have certain advantages, and that we should understand what they are and learn how to leverage them -- that's the central thesis of my article. I'm certainly not a fan of the way Linus runs kernel development; I think that without the likes of Alan Cox and Ted Ts'o, it would be in fairly serious trouble by now.

However, the motivations behind (most) free and commercial development are quite different. The goal of commercial development is to make as much money as possible, which usually means reaching the broadest audience. Most free source developers aren't looking at that end goal; they either want to solve their own particular problem, or come up with a "Better" way of doing something. In particular, reaching the widest audience in the shortest time is probably about the closest thing to an explicit non-goal that many free source projects have.

In particular, few if any free source projects start out as "I want to do something that a lot of people will want to use; I know my basic product space, and I'm going to do the market research needed to find out what these people want". Linux didn't. Emacs didn't. Few free source developers have the resources or free time to do this, particularly early on, and I believe that rather than fighting that we need to recognize it and learn how to most effectively utilize the resources that people do have available.

Gimp-print, for example, simply started out as a need on my part to make my Stylus Photo EX print under Linux. It only grew into something more when I started seeing other people use it. At that point, it was possible to start gathering requirements, in the form of email that people spontaneously sent me. I did put together a fairly coherent roadmap (which is available on gimp-print.sourceforge.net; I won't reproduce it here), and when I went back to look at it I was quite surprised that we had actually accomplished most of the goals we set back in January. If things had headed off in a different direction I wouldn't have been at all surprised.

Clearly, I don't believe that one size fits all; different projects demand different techniques. If most free source projects are derivative, because that's how most free source programmers tend to think, then by golly let's learn how to leverage that to best advantage, and come up with the specific engineering techniques that such programmers will feel comfortable using that will assist in the production of the best software they can do.

No detailed response yet, just a minor nit, posted 6 Nov 2000 at 00:21 UTC by RoUS » (Master)

I need to read and ruminate on this in more depth before I can add anything meaningful (if then, and if it hasn't already been added). However, one minor quibble sprang out at me:

> free source (my term for the union of
> free software and open source)

I feel compelled to point out that 'free source' isn't a new term; I used it two years ago in my free source developer survey, and probably others used it before me. So it's not just your term, rlk, but one that already belongs to a community. :-)

Most OSS programs have no architecture, posted 6 Nov 2000 at 03:02 UTC by ajv » (Master)

The single biggest issue I see in all of these projects is "Version 2.0". Because there's no architecture, detailed design or requirements analysis, development of new features is ad-hoc, and occasionally detrimental to the overall conceptual integrity of the original design.

I've worked on a couple of projects over the years, including big ones like XFree86. The most recent projects have been pnm2ppa and reiserfs, which integrates into the Linux kernel.

For example, XFree86's XAA architecture, whilst zillions of times better than what came before, simply tried to retrofit OO into a big C program. When Metrolink provided us with dynamic loading of modules, it helped us use less memory and promised platform-independent driver modules for any processor architecture. That never happened, because drivers still require too much information about the platform they came from. When it came time for multihead, it was a major drama because there was no forethought in the original server design. When it came time to squeeze in direct 2D access for games like CivCTP or similar, there were a few false starts before DRI came along. When it came time to make 3D work, there were a few false starts, and now, even as a developer with one of the best 3D cards in the current marketplace, I find it difficult to get 3D working. My mum would never make it happen.

All I'm getting at is that the further a project without an architecture moves along, the harder it gets to maintain and improve during "2.0". This is true of pnm2ppa 1.0, of reiserfs (4.0 is basically going to be a rewrite from the ground up), and of XFree86 today.

Flexibility is key, posted 6 Nov 2000 at 07:59 UTC by Bram » (Master)

The conclusion I draw from the lack of serious up-front design in free software projects is that they work better without it.

The key to getting work done in this way is to start with a minimal feature set then gradually add to it, preferably maintaining a functional system at all times. Note that having a small feature set does not imply low quality software - it simply means a small feature set. Working this way is much more fun, since you're coding all the time, and very productive, since you get tremendous feedback from having a running system.

This is not to say that you shouldn't design. Design is crucial; it should be an integral part of the coding process. You should think about what you're doing, why you're doing it, and whether it could be done better every time you write a line of code, not just when you first start out.

Any open source project which isn't fun to work on will fail. Anything which makes it less fun to work on the codebase introduces serious risk of technical implosion.

Individual Style is also a big factor...., posted 6 Nov 2000 at 19:56 UTC by mechanix » (Journeyer)

I've noticed a similarity between working on Open Source code and maintenance coding in the commercial world. It really is an exercise in psychology - not programming. The more people have worked on the code, the harder it becomes. Hehe...guess you could say it becomes mob psychology :-)

Coders tend to impose some 'character' on a program almost automatically. It may be top down, or bottom up...but it is consistent. Some sort of naming convention may be used. Stub functions may even be provided because you _know_ someday you're going to need to go back and implement feature such-and-such. As you become more familiar with the code, you begin to pull information from it other than simple program logic. The author's quirks, sense of humor, etc - all become apparent. This 'feel' enables you to make intuitive guesses as to where problems lie, or what things to modify for a particular enhancement. You're able to tell _why_ something was done a certain way, and what the authors were thinking when they created a constant PI with a value of 22 :-P

The code is released and modified by the Community, each member with his/her own unique style and idea of how things should be done. In an ideal world, each contributor's work would match the 'feel' of the program. In the worst case, modifications are made, naming conventions come and go, and structure begins to disintegrate. Along the way, the 'feel' has been lost. This isn't to say the program doesn't work, merely that the 'hidden' information the code conveyed has been lost.

This isn't really something that can be regulated via style guidelines or standards. It's merely a reflection of different coders thinking different ways. The key is to be aware of it, and to try to adapt your style to it. If the entire program uses buckets starting with "Yearly_" then creating an "Annual_Salary" bucket is probably a bad idea. Remember, the program should look like a whole entity...not a collection of parts :-)

Linux is beginning to sound like a cult!, posted 6 Nov 2000 at 21:43 UTC by mrorganic » (Journeyer)

Maybe it's just my age (just over 33, not so old!), but many comments written by Linux programmers strike me less as engineering arguments than touchy-feely religious babblings of the sort you hear on late-night television. "It's how the programmer feels," they say. "It's about the culture. We're different!"

You may enjoy the act of programming a great deal -- I know I do. You may get a great deal of personal satisfaction out of it. You may consider it an art form. But at the end of the day, programming is an engineering exercise, and is bound by engineering rules. You may not like it, but that's how it is, and all the posturing and phrasemaking won't change it.

The key to robust computer software, as in most other hard sciences, is rigor. Unfortunately, rigor seems to be the one thing many Linux programmers fear -- to be rigorous means having to do all that necessary but unfun stuff like documentation, debugging, and careful design. It's not so much that every software program must be perfect; it's that programmers who do not cultivate good development habits early carry their bad habits with them into other projects.

The big trap for Linux developers is to fall prey to the idea that good software will somehow magically "just happen" if that software is open-sourced. Open source can lead to good software, but not inevitably so.

You're missing the point, posted 7 Nov 2000 at 06:06 UTC by samth » (Journeyer)

mrorganic -

You seem to be missing the point people keep trying to make. No one has suggested that it wouldn't be beneficial, all other things being equal, to have some sort of standard engineering practice (what that would be, people might disagree on). But what you are proposing is totally unrealistic. Free software is mostly developed by people for fun. That's right, fun. And requirements documents aren't fun. Sitting around talking about release engineering isn't fun. And certainly spending years designing before you start coding isn't fun. And if it's not fun, people won't do it. Most people in the free software community don't do this out of the desire to make world-class software for its own sake. What you are suggesting is that volunteers act like they were getting paid, even though they aren't. It's just not going to happen.

More concretely, what about the example rlk gave of gimp-print? He started it to fix a problem he had. Are you seriously suggesting that he should have instead drawn up lengthy and detailed requirements documents? If he had had to do that to start, would gimp-print exist today? I doubt it.

It may well be that your ideas are the best way to deliver software to a spec on a deadline, when you are being paid. But as none of those conditions apply, why are you trying to apply the techniques?

My last diatribe on this subject (I promise!), posted 7 Nov 2000 at 15:09 UTC by mrorganic » (Journeyer)

I'll sum up so we can all move on to other things:

Saying that Linux programmers will only work if the work is fun pretty much guarantees that most of the software on Linux will be crap. Not all of it: many engineers are careful, talented, and rigorous enough to overcome the problems inherent in OSS development. But programmers who don't want to commit to doing the hard stuff because it isn't fun are going to keep producing derivative, badly-thought-out, and buggy software until the heat-death of the universe.

This is bad because an entire generation of programmers are learning horrible habits. Sure, they may be working for fun now, but eventually they will probably want to get paid for what they do, and what happens then? They've never learned how to do rigorous software engineering, and will have a tough time unlearning all the bad habits they picked up early on.

To me, the issue isn't "fun" (although I do mostly enjoy writing software). It's craftsmanship. It's taking pride in doing something right, even if that something is relatively small or trivial. It's not just about "liking" programming, but loving it enough to do it as well as you can every single time. Just because a piece of software isn't mission-critical doesn't mean it shouldn't be as well done as I can do it. Ultimately my software speaks of me and the value I place on what I do. If I don't care much about it and treat it as a lark, then my work will reflect that.

It's like the advice your parents gave you: if you're not going to do it right, don't do it at all.

Res ipsa loquitur.

On NASA software...., posted 7 Nov 2000 at 22:55 UTC by billgr » (Journeyer)

The latest IEEE Spectrum had a report about space station code. According to Spectrum, the NASA software methodologies have turned out questionable code for the space station. Due to aggressive code freezes and the like, there are hundreds of "SPNs" (station program notes) telling the crew how to work around bugs in the control computer system.

This isn't to say they should host the project on Sourceforge. :-) It's just evidence suggesting that conventional software development methodology isn't so sewn up and impregnable that it's no longer possible to propose improvements, even when used where it is most at home.

I'm NOT saying that there's only one way to develop!, posted 8 Nov 2000 at 01:50 UTC by rlk » (Journeyer)

mrorganic,

I think you're entirely missing my point, and that of samth. It's no more true that Linux programming must all be fun than it is that all programming must be done according to strict methodology. However, there are a lot of "informal" programmers in the free source space (with apologies to RoUS, I hadn't heard the term used before, but I'm not surprised that it was), and telling them that they must either use formal methodology or give up programming altogether means that a lot of useful stuff simply won't get written at all.

For what it's worth, I've been a professional software engineer for about 15 years (2 years as an undergrad at Project Athena, which was unquestionably professional level work, and 13 years since). I've seen good engineering and bad engineering. I'm generally a stickler for quality myself; I'm rather annoyed that gimp-print went out with a couple of nasty bugs which didn't get fixed until 4.0.2 (at least one of these should have been caught; the other one should have been also, but it's a bit closer). I've been a release engineer plenty of times (on projects up to maybe 500 Kloc, about 15x the size of gimp-print), and I generally espouse the 2x4 method of release engineering (read: I'm a big guy carrying a 3' long piece of 2x4, and if I walk into your office when a bug shows up late in the process, you're really motivated to fix it and, more importantly, not to have me walk into your office again). I can't claim to have used a lot of formal engineering methodologies (the kinds of things that Ed Yourdon and friends like to write about), but I have done my share of formal architecture and design work. For big, complex projects, this stuff really is necessary. However, save for the flagship projects, in the free source world it's overkill.

There are examples of well-engineered free source projects, such as KDE. There are others that IMHO need a bit more control. However, for the vast majority of stuff on freshmeat (for example), that level of formality isn't necessary. As a project grows larger, more design work is likely to be necessary. One could look at it as though a lot of wasted work has been done and the earlier junk has largely poisoned the source; one could also look at it as a potential learning experience. Certainly 4.1 is going to be somewhat painful for gimp-print because we didn't anticipate what we would need down the road. On the other hand, I think if we had tried to anticipate everything we would need (real CMYK, color management, and such), it would never have gotten off the ground. We've learned a lot from our "prototype", as it were, and in the process a lot of people have access to useful software.

The point that I've been trying to make, and that just about everyone seems to have missed, is that those of us who do understand good engineering principles should try to come up with a streamlined approach that's easier for inexperienced programmers to apply and that will assist them in creating better software. Maybe that's pie in the sky, but I believe that there are some relatively simple principles that people could apply that would help them create better software. That's what I'd like to get out of this discussion.

Free Software Project Management HOWTO, posted 29 Apr 2002 at 16:55 UTC by mako » (Master)

If anyone stumbles across this article, I thought I would place a pointer to a HOWTO I wrote on the subject. As a HOWTO, it is geared toward a bit of a different audience and has a very different tone, thrust and content. I actually used this article as a source while writing the HOWTO (and I mention it in the bibliography which I'd also love feedback on if anyone has other recommendations for things to include in it).

The HOWTO is hosted by the LDP as the Software-Proj-Mmgt-HOWTO (I think) or it's available from a project homepage that I've put up for the project. I hope this is helpful to some.
