The Pyramids and the Bazaar

Posted 18 Sep 2009 at 05:08 UTC by KlausWuestefeld

Eric Raymond's software bazaar is a fantasy.

What really goes on in open source projects has nothing to do with his "great babbling bazaar of differing agendas and approaches".

In his classic "The Cathedral and the Bazaar", Eric calls us "happy networked hordes of programmer/anarchists". We are hordes alright, but are we anarchists? Workers building pyramids is what we are.

Why do you recognize names such as Linus Torvalds, Miguel de Icaza and Guido van Rossum? Because they are written over the entrance to the Linux, Gnome and Python pyramids respectively. They are charismatic pharaohs' names.

As Eric put it: "In order to build a development community, you need to attract people, interest them in what you're doing, and keep them happy about the amount of work they're doing. Technical sizzle will go a long way towards accomplishing this, but it's far from the whole story. The personality you project matters, too."

And why does nobody know your name? Because nobody knows the name of a pyramid worker.

We are lucky that, according to Eric, the fear of having their pyramids forked by dissenting princes keeps these dictators "benevolent". They let us play inside their pyramids for free, after we build them. That is cool. We are happy, voluntary workers in a win-win situation.

What Eric describes is a tremendous improvement, no doubt, over our previous condition of pilgrims paying to pray at Redmond Cathedral.

But it is still not the bazaar.

In a true software bazaar you will use software produced/enhanced by your neighbors directly and they will use software produced/enhanced by you directly too.

You will not have to haul your one-ton-stone-block contribution to the presence of the pharaoh and his court of "core developers" for appraisal of worthiness to be part of their marvelous pyramid.

I quote Eric: "...this reflects a distinction between the project core (typically quite small; a single core developer is common, and one to three is typical) and the project halo of beta-testers and available contributors (which often numbers in the hundreds)."

Why don't I just send my neighbors a one-ton-stone-block patch then? Because in a few months it will no longer fit. The internals of Pyramid 2.0 will not be backward compatible with the internals of Pyramid 1.0.3g, will they? We will be lucky if even the externals look vaguely similar.

Unless you spend half your time making your way into the court of core-developers, there are also no "differing agendas and approaches". There is only the pharaoh's agenda. There is only the pharaoh's approach. Live with it or go fork yourself.

The notion of multiple bazaars (one for each open source project) makes no sense either.

When it comes to be, there will only be "the" software bazaar. It will be a singleton, like the Internet. There we will trade directly with one another the software components we wish, produced and enhanced by ourselves, any way we like, to build our own custom pyramids.

We need a decent component deployment and component sharing platform to enable the software bazaar, not mere mailing lists and code repositories.

Still according to Eric: "Provided the development coordinator has a communications medium at least as good as the Internet, and knows how to lead without coercion, many heads are inevitably better than one."

Regardless of coercion, since when does a bazaar need someone to lead? Since when does a bazaar need a coordinator?

Maybe he got the "babbling" part right but what Eric describes is definitely NOT a software bazaar. So where did he get this beautiful image?

Who cares? Make it happen.

Nice idea, but..., posted 18 Sep 2009 at 17:42 UTC by cdfrey » (Journeyer)

The ideas of sovereign computing (linked to from the page linked above) are nice and all, but somewhere along the line, I fail to see how the rubber actually meets the road.

You complain about how you can't offer your own giant patch, but this is just the result of practical concerns. If you look at software packages as whole units, you can of course offer your giant patch to your friend, but your friend must stay with the version you give him.

How can it be any other way? The idea of backward compatibility and stable APIs is also a fantasy... a fantasy that some people and some corporations work very hard to maintain, but it is still a fantasy.

The only way to achieve perfect API stability is to stop progress. And people aren't happy with that.

So if you have many sovereign people working in sovereign ways to make sovereign code, how do you get them to work together?

This takes work, and the best way to make it work, so far, is to put it all together, so that when one thing changes in one part of the system that breaks the other parts, those breakages are found quickly.

Linus is right when he says that the average person only trusts about 10 people when working on a project like this. This trust ends up forming the pyramids that you decry. The good thing about Free Software is that you can make an instant copy of the existing structure for your own purposes. You can't copy all the workers, nor can you make them do what you want, but you can copy their product.

And in a world of sovereign computing, that's the way it has to be.

Re: Nice idea, but..., posted 19 Sep 2009 at 00:11 UTC by KlausWuestefeld » (Master)

Software packages today are too big: millions of stone blocks patched together with tons of glue code. Pyramids.

You can never hope to stabilize the API for something like that, I agree.

The stone-block contribution is a reasonable effort for an individual but a very small part of a pyramid. The metaphor we use in Sneer is that of the lego brick.

We are betting on the creating, changing, forking, versioning and especially extinction of VERY small components. Our bricks are made of less than two classes on average.

Tiny components die or stabilize VERY fast. If the Java ternary operator had a dedicated mailing list, for example, it would be pretty quiet by now.

How do you get people to work together?

That's the thing. In a bazaar, you don't have to. Everyone does his own thing and the net effect is still positive. We just need to enable "bazaar mode" technically. Current tools and runtime environments are still too bureaucratic, geared toward pyramid-style building. XML-ridden, committee-designed OSGi is an example.

We built the platform to enable a chaotic ecosystem of bricks. We know it will not be zero, but how many components will it be able to foster? Ten? Thousands? Millions? Enough to actually attract users? We have no idea.

I think I know now how Ward must have felt...

Re: Nice idea, but..., posted 19 Sep 2009 at 06:34 UTC by cdfrey » (Journeyer)

That does sound good, but please pardon me as I struggle to understand how this all works to make something useful.

For example, how many ternary operator sized lego bricks would it take to build an editor? How many for a word processor? How many for a game? How many for an operating system?

And then, once you've calculated this number, is this number small enough for the average end user to put together? Is it in the ballpark of the number of people one person can trust and work together with?

One of the advantages of the pyramid model is that the pharaohs have a name and a reputation to protect. This provides a little bit of incentive to not screw the user. Witness the recent NoScript fiasco, and how that damaged the author's reputation.

A true bazaar doesn't have this advantage, as far as I can see. And so, yes, each user has more freedom, but also a heavy load of responsibility as well.

this is predicted in snowcrash, posted 24 Sep 2009 at 18:05 UTC by lkcl » (Master)

software development a la "bricks" is predicted and described by neal stephenson, in Snowcrash. he describes the hero of the book as being a "hacker" i.e. "a programmer capable of programming well below the level of lego brick style programming" in what neal stephenson describes as "flatland". the majority of "programming" is done via 3D assembly in virtual reality, with little or no actual knowledge of "real" programming required. interestingly he refers indirectly to the prevalence of what can only be surmised as free software to be responsible for the creation of so many easy-to-use "bricks".

but - i digress. i love the idea of sovereigncomputing. however, the instantaneous turn-off and barrier for me is: java as the first implementation. leaving that aside, here are some thoughts:

1) the pyramid: a very interesting take on exactly what i've been saying now for at least ten years. the article is absolutely spot-on. time and time again i've provided strategic contributions to various free software projects, that would take their usefulness to a whole new level of interoperability, and the contributions have been shunned in the extreme. the message is clear, each and every time: "you are not one of us. your contributions threaten our clique. you can go to hell".

i've been saying for years that the linux kernel design is way too top-heavy, and that the patches for should be added to it, in order to emphasise the "core kernel" (which is only about 100,000 lines), and to DEemphasise the "importance" of the drivers (which should be run in userspace, just like they are in gnu/hurd. yes i'm aware of the implications).

now that linus himself has said that linux is bloated, somehow it's finally got through to people.

2) to do interoperable components, you need an object model. to do that properly, you will need COM. to do it _really_ properly, you need DCOM. the opinion that free software developers have of the value of DCOM and DCE/RPC is extremely low (see zaitcev's comments regarding MSRPC as a prime example). this is an indication that free software developers only have themselves to blame for the present situation - the "priesthood" / de-facto "cartels" / cliques.

3) mozilla actually has a near-complete implementation of COM (not DCOM) which they've called XPCOM. it is sufficiently similar to COM (without the networking) such that python-win32com could be ported across to do EXACTLY the same job, and be called "python-xpcom".

the only thing is that XPCOM does not have "IDispatch" interfaces, and the IDL compiler does not have support for the concept of "CoClasses", which is a serious lack that is causing significant problems for the mozilla team.

not only that, but their stupid focus on "speed, speed, speed" has them ripping out XPCOM in favour of "javascript, javascript, javascript"-only interfaces.

this will turn out to be a mistake of the absolute first order, as projects like wine (with their support for MSHTML interoperability being entirely dependent on the xpcom / xulrunner interface) and other xulrunner-based applications are being told, basically, to get stuffed.

long reply. huh. i'll leave it at that.

Re: Nice idea, but..., posted 27 Sep 2009 at 04:16 UTC by KlausWuestefeld » (Master)

"how many ternary operator sized lego bricks would it take to build an editor? How many for a word processor? How many for a game? How many for an operating system?"

These guys are targeting an entire operating system with utilities and basic office apps at 20k lines of code. They implemented their TCP stack in under 200 lines of code (typically done with 10k~20k lines in C).

"is this number small enough for the average end user to put together?"

The average end user does not have to put everything together. He can use ready-built lego toys from well-known vendors if he wants. But he is no longer forced to do so.

Re: this is predicted in snowcrash, posted 27 Sep 2009 at 05:32 UTC by KlausWuestefeld » (Master)

We intend bricks to be written in any language that will run on the JVM. Scala and boojay are lined up to be the first two after Java.

Yes, the idea of "bricks" or reusable components is not new. It has been the grail ever since structured programming and OO.

What we are doing is putting component development in the face of the end user, just like wikis did with web page editing.

HTML is editable. Wikis "just" make it much easier. COM DLLs, OSGi packages and Maven projects are composable. We are "just" making it much easier.

And, yes, we do need a component object model and we are using the Java VM for that. The Java security model is well suited for that. Our model requires a bit more overhead than a regular Java class (allowing static or cyclic dependencies just will not do) but MUCH LESS overhead than a CORBA object, OSGi package or Maven project. Our bricks are still pure Java code: no IDL, no XML.
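The rule against cyclic dependencies between bricks can be illustrated with a small check. This is only a sketch under assumptions: the brick names and the dependency map are hypothetical, it is written in Python for brevity rather than Java, and it is not taken from the Sneer code. It walks the dependency graph depth-first and reports the first cycle it finds.

```python
from typing import Dict, List, Optional

WHITE, GREY, BLACK = 0, 1, 2  # unvisited, on the current path, fully explored


def find_cycle(deps: Dict[str, List[str]]) -> Optional[List[str]]:
    """Return one dependency cycle as a list of brick names, or None."""
    colour = {name: WHITE for name in deps}
    path: List[str] = []

    def visit(name: str) -> Optional[List[str]]:
        colour[name] = GREY
        path.append(name)
        for dep in deps.get(name, []):
            state = colour.get(dep, WHITE)
            if state == GREY:
                # dep is already on the current path, so we have come full circle
                return path[path.index(dep):] + [dep]
            if state == WHITE:
                found = visit(dep)
                if found:
                    return found
        path.pop()
        colour[name] = BLACK
        return None

    for name in list(deps):
        if colour[name] == WHITE:
            found = visit(name)
            if found:
                return found
    return None


# Hypothetical bricks, invented for this example.
acyclic = {"chat": ["contacts", "clock"], "contacts": ["clock"], "clock": []}
cyclic = {"chat": ["contacts"], "contacts": ["presence"], "presence": ["chat"]}

print(find_cycle(acyclic))
print(find_cycle(cyclic))
```

A platform could run a check like this when a brick is published and refuse the ones that close a loop.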

As for the distributed side of things, I have seen too many distributed-object models fail. People seem to be reasonably happy with web services and message queues. We are providing a distributed tuple-space implementation.

distributed object models, posted 27 Sep 2009 at 17:33 UTC by lkcl » (Master)

"As for the distributed side of things, I have seen too many distributed-object models fail."

that's because their developers fail to understand the implications and the complexities.

the development behind DCOM dates back thirty years, through MSRPC, to DCE/RPC, in parallel through both Transarc and also ultimately to NCA/NCS which was developed by Apollo. if you're not aware of the level of experience of the people behind these projects and companies, you have absolutely no business doing an object model, let alone a distributed one.

not being funny or anything, but if you don't have the equivalent of DOM IDispatch and other interfaces; if you don't have DCOM "CoClasses"; if you don't have the equivalent of what GObject calls "introspection" and what it also calls "Interfaces"; then you are already on shaky ground.

so - again: not being funny or anything, but if the object model being developed is "less overhead" than CORBA, then that is an indication that you don't yet fully understand the implications.

the point of DCOM and also mozilla's XPCOM is that you can link disparate systems together.

i can't _stand_ java. it doesn't matter how good the system being developed is: i absolutely _will_ not install it, get involved with it - nothing.

however, if you have a specification and an object model runtime interoperability layer where i can write code in c, c++, python or in fact anything that conforms to the object model specification, _now_ you've got my attention.

"Our bricks are still pure Java code: no IDL, no XML."

pure java: bad. no idl: bad. no XML: *great*!

Re: Nice idea, but..., posted 28 Sep 2009 at 08:14 UTC by cdfrey » (Journeyer)

Thanks for the link, Klaus.

I took a look at the TCP stack code linked to from the PDF, and from what I can see, there's a reason why TCP stacks usually clock in at about 30K lines of code.

This one had no routing, no DNS, no ARP, no UDP, no frag support.

I liked how he used the ASCII art tables in the RFC documentation to create the data structure for the packets. The documentation was the code! Cool. But I'm pretty sure that a C struct would have been competitive in lines of code.... maybe even less if you count the structure.k code needed to turn ASCII art into structs.
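To make the "documentation is the code" trick concrete, here is a rough sketch of the idea in Python. It is not the project's code, and the function names are made up for this example. It assumes the drawing style used in RFC 791, where every bit is two characters wide and fields are separated by '|', and it uses the first 32-bit row of the IPv4 header as input.

```python
from typing import Dict, List, Tuple

# First row of the IPv4 header diagram, in the RFC 791 drawing style.
HEADER_ROW = "|Version|  IHL  |Type of Service|          Total Length         |"


def fields_from_row(row: str) -> List[Tuple[str, int]]:
    """Return (field name, width in bits) pairs for one diagram row."""
    cells = row.strip().strip("|").split("|")
    # A field of n bits occupies 2n - 1 characters between its separators.
    return [(cell.strip(), (len(cell) + 1) // 2) for cell in cells]


def unpack_row(fields: List[Tuple[str, int]], word: int,
               total_bits: int = 32) -> Dict[str, int]:
    """Slice one row, held as an integer, into named fields (first field in the high bits)."""
    values = {}
    shift = total_bits
    for name, width in fields:
        shift -= width
        values[name] = (word >> shift) & ((1 << width) - 1)
    return values


FIELDS = fields_from_row(HEADER_ROW)
print(FIELDS)
print(unpack_row(FIELDS, 0x45000054))
```

The parsing helper is the overhead mentioned above: a hand-written struct for the same row would be about as short, so the gain is more about keeping the code and the RFC in sync than about line count.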

I also think that their goal of reducing lines of code is only a partial success. I'm very curious how long it took them to create that TCP stack in their programming dialect. To get to the level of Joe User Rapid Development, we'll need small code and short development time.

Sometimes lines of code take a hit when you're focusing on rapid development.

I'm not saying that their project isn't good... I don't have the time to evaluate the whole thing, unfortunately, and while it looks a lot like LISP everywhere, I hope their research produces great things.

But their small code doesn't look like the "lego block" breakthrough that I suspect will be needed for the average user to put his own systems together. It looks an awful lot like the cryptic code that the average programmer has to deal with every day. Only a bit less of it, and without the baggage of historical backward compatibility.

Java?, posted 29 Sep 2009 at 07:58 UTC by KlausWuestefeld » (Master)

"i can't _stand_ java. it doesn't matter how good the system being developed is: i absolutely _will_ not install it, get involved with it - nothing."

I was a professional smalltalker for 7 years before having to move to Java, so I don't like Java either.

But I am reasonable about it.

The VM is good and mature. Lib availability is great. The community is there. Portability is good. The security system is in place. VM and compiler are open source.

Static typing is the way to go. All arguments against static typing boil down to verbosity and I agree with them. The solution to verbosity, though, is inference, not type elimination.

So the language sucks but, as I said, we can run other languages on the VM.

Re: distributed object models, posted 29 Sep 2009 at 08:09 UTC by KlausWuestefeld » (Master)

We are not providing a distributed object model.

Distributed apps can use any distribution mechanism they want. The preferred mechanism we are providing is a tuple space. Our wire protocol for tuples will probably be Google protocol buffers.
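For readers who have not met tuple spaces, the following is a minimal in-process sketch of the Linda-style operations, written in Python. It says nothing about how Sneer implements or distributes its tuple space: the class, the method names and the sample tuples are all assumptions made for illustration, and a real implementation would add blocking reads, networking and a wire format.

```python
from typing import Any, List, Optional, Tuple

ANY = object()  # wildcard marker for template fields


class TupleSpace:
    """Toy tuple space held in one process."""

    def __init__(self) -> None:
        self._tuples: List[Tuple[Any, ...]] = []

    def out(self, *fields: Any) -> None:
        """Publish a tuple into the space."""
        self._tuples.append(tuple(fields))

    @staticmethod
    def _matches(template: Tuple[Any, ...], candidate: Tuple[Any, ...]) -> bool:
        return len(template) == len(candidate) and all(
            t is ANY or t == c for t, c in zip(template, candidate)
        )

    def rd(self, *template: Any) -> Optional[Tuple[Any, ...]]:
        """Return the first matching tuple without removing it, or None."""
        for candidate in self._tuples:
            if self._matches(template, candidate):
                return candidate
        return None

    def take(self, *template: Any) -> Optional[Tuple[Any, ...]]:
        """Return and remove the first matching tuple, or None."""
        found = self.rd(*template)
        if found is not None:
            self._tuples.remove(found)
        return found


space = TupleSpace()
space.out("chat", "alice", "hello")
space.out("chat", "bob", "hi")
space.out("presence", "alice", True)
print(space.rd("chat", ANY, ANY))
```

Producers and consumers never address each other directly. They only agree on the shape of the tuples, which is what makes the model attractive for loosely coupled components.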

"Average User", posted 29 Sep 2009 at 08:53 UTC by KlausWuestefeld » (Master)

"But their small code doesn't look like the "lego block" breakthrough that I suspect will be needed for the average user to put his own systems together."

We don't expect average users to build systems from lego bricks.

We just expect many more >programmers< to be able to participate in a true software component bazaar, rather than just a handful of pyramid pharaohs and occasional contributing pyramid workers.

To justify today's overhead required for packaging, distribution and deployment, software "things" tend to be large and complex, like those specific "shark", "boat" or "sail" lego parts that are good at what they do but are unmodifiable and useless in any other context.

Not being funny or anything, notice the "DOM IDispatch, DCOM CoClasses and XPCOM-style" elastic band glue-code necessary to make today's software things interoperate.

The best lego bricks are the small ones.

The best lego bricks are the small ones., posted 1 Oct 2009 at 22:14 UTC by lkcl » (Master)

i think it was einstein who said something like "make it simple, but no simpler than it needs to be".

if you write COM in c, it's a bit of a hairy mess. if you do COM in c++, it's a bit better. by the time you get to dynamic languages such as python (and dare i say it, visual basic), all the complexity is _gone_ and you're back to the level of function calls that really _look_ like function calls, with transparent ("lazy dispatch" as python-comtypes likes to call it) access to properties and functions "behind the scenes".

even CoClasses get transparently accessed, neatly, in the dynamic languages.
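A toy illustration of why late-bound dispatch looks so clean from a dynamic language. This is not python-comtypes and does not talk to COM at all. The real IDispatch interface resolves member names to ids (GetIDsOfNames) and then calls by id (Invoke), and the hypothetical classes below only imitate that two-step shape, with every member treated as a callable to keep things short.

```python
from typing import Any, Dict, Tuple


class FakeDispatch:
    """Stand-in for a late-bound component: look a member up by name, invoke it by id."""

    def __init__(self) -> None:
        self._ids: Dict[str, int] = {"title": 1, "navigate": 2}
        self._title = "blank"

    def get_id_of_name(self, name: str) -> int:
        return self._ids[name]  # raises KeyError for unknown members

    def invoke(self, member_id: int, args: Tuple[Any, ...]) -> Any:
        if member_id == 1:
            return self._title
        if member_id == 2:
            self._title = "page at " + args[0]
            return None
        raise ValueError(member_id)


class LazyProxy:
    """Turns plain attribute access into name lookup plus invoke, resolved on first use."""

    def __init__(self, target: FakeDispatch) -> None:
        self._target = target

    def __getattr__(self, name: str) -> Any:
        # Only called for names that normal attribute lookup did not find.
        try:
            member_id = self._target.get_id_of_name(name)
        except KeyError:
            raise AttributeError(name) from None

        def call(*args: Any) -> Any:
            return self._target.invoke(member_id, args)

        return call


browser = LazyProxy(FakeDispatch())
browser.navigate("http://example.org/")
print(browser.title())
```

The caller writes ordinary-looking method calls and the lookup-then-invoke plumbing stays out of sight, which is the effect described above.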

when i did the MSHTML port of pyjamas desktop, i was dreading it, but after three weeks, i'd settled on python-comtypes (a 250k download), stripped out everything else, and was having an absolute ball. it works, and it works extremely well. there are a couple of warts, and they're entirely hidden from the pyjamas-desktop developer.

what this taught me was that the combination of DCOM and python is incredibly powerful, which was something i wasn't expecting, and, because the DCOM implementation, by microsoft, is proprietary, i'd stayed away from it until now.

so - once again: a free software development project ignores the benefit of the experience of some incredible software engineers, who unfortunately produced some of the best designed proprietary software i've ever seen.

All blocks don't have to be the same size, posted 20 Nov 2009 at 16:57 UTC by pixelbeat » (Journeyer)

You describe python and linux as pyramids. I just consider them to be big blocks. Take python for example which makes extensive use of other blocks like ICU, and is in turn used by bigger blocks like django. All blocks don't need to be the same size, but you can consider them to be tetrahedral if you want :)

Quote By Alan Kay, posted 17 Dec 2010 at 06:29 UTC by KlausWuestefeld » (Master)

"Most software today is very much like an Egyptian pyramid with millions of bricks piled on top of each other, with no structural integrity, but just done by brute force and thousands of slaves." Alan Kay
