Interop in the Bazaar

Posted 19 Sep 2002 at 17:12 UTC by gregorrothfuss

Do open source projects want to intercommunicate and share? This article explores that question in the context of OSCOM Interop, a new project to foster interoperability and sharing between open source content management systems.

by Paul Everitt and Gregor J. Rothfuss

Everybody loves the idea of the bazaar. Small, autonomous shops doing commerce in the wild, right in the shadow of the centrally-planned economy of the cathedral. But even a bazaar needs rules, right? Coordination and cooperation don't always spring up out of thin air.

In the world of open source, developers wonder if KDE and Gnome will ever interoperate in a meaningful way. But first we have to ask whether the question is even legitimate. Should they?

This article discusses a budding effort towards interoperability between open source content management systems, while evaluating the question, "Why interoperate?"

Background

The market of content management has always been associated with the big boys. Large software, large consulting teams, and very large prices.

Due to a number of factors, this mindset is in decline. Smaller approaches to content management, including open source projects, are popping up constantly. The open source projects are attracting attention from mainstream analysts and journalists.

From this grew OSCOM, an international non-profit for open source content management. The basic idea is to foster communications and community amongst the creators of these open source projects. A very successful conference was held in Zurich earlier this year. Another is slated for Berkeley in September.

After Zurich, some of the presenters discussed ways to make future meetings less a parade of individual projects, and more a forum for sharing ideas and working together. This led to a discussion of interoperability amongst open source content management projects, particularly in relation to JSR 170, a Java Community Process proposal for content repositories, created by and for the big boys.

To test drive our ability to tackle interop issues, the OSCOM folks are working on a single problem: a common way to give presentations using a "SlideML" format and a set of rendering templates.
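
To make the idea concrete, here is a minimal sketch, in Python, of what "a format plus rendering templates" means in practice. The element names are invented for illustration; they are not the actual SlideML schema.

    # A toy version of the SlideML idea: slides in a shared XML format,
    # rendered to HTML by a simple template. The element names here are
    # invented for illustration; they are NOT the real SlideML schema.
    import xml.etree.ElementTree as ET

    SAMPLE = """
    <slideshow title="Interop in the Bazaar">
      <slide title="Why interoperate?">
        <point>Lower the cognitive burden on developers</point>
        <point>Avoid locking customers in</point>
      </slide>
    </slideshow>
    """

    def render_html(source):
        """Render the slide markup as simple HTML, one heading per slide."""
        root = ET.fromstring(source.strip())
        parts = ["<h1>%s</h1>" % root.get("title", "")]
        for slide in root.findall("slide"):
            parts.append("<h2>%s</h2>" % slide.get("title", ""))
            parts.append("<ul>")
            for point in slide.findall("point"):
                parts.append("  <li>%s</li>" % (point.text or ""))
            parts.append("</ul>")
        return "\n".join(parts)

    print(render_html(SAMPLE))

Any project that can read the common format can then plug in its own rendering templates, which is the whole proposition in miniature.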

Reality Check

We are eager to continue these discussions face-to-face in Berkeley. But we should also step back and ask, "Is interop a bunch of crap?"

It's a serious question. Why should a project leader or project team do the extra work? Many of the best open source projects aren't really architectures. They are programs that started by scratching an individual itch. Later in their life, if they live long enough, they realize the bigger picture and do a massive rewrite, thus getting an architecture. But rarely is this new architecture designed with the idea of talking to other, similar systems.

So interop can impose serious scope creep on the architecture of a project. Strike one.

Next, how powerful is the motivation for working with "the competition"? At best, a project leader has little cultural involvement with other projects, and thus lacks that good old maternal feeling that sparks late hours doing something for free. At worst, one project can view another with condescension, envy, or any other mixture of emotions that comes from the tribalism of balkanized projects.

Strike two.

Finally, aren't there already enough standards? Writing standards is a difficult process, one that doesn't come naturally to open source developers with the ethic of "speak with code". Shouldn't we embrace the man-years of existing standards and focus on good implementations? (Note: the answer is "yes".)

Beneficiaries and Their Expectations

We now have a stark, bleak picture. So what is the driving need for interop, and who are its beneficiaries?

The first benefit is reducing the "cognitive burden" that our projects place on developers. Imagine you are a consultant who has become an expert at Midgard, but you have a project where you need to work with AxKit. On top of the difference in programming languages, everything about the world of content management is different: concepts, jargon, etc. If interop can give you the tenuous grip of 5% commonality in approach, that can at least provide the mental connections to the next 25% of functionality.

The second beneficiary is customers who might have more than one project in use, or want to reserve the right to throw out their current project next year if they aren't happy. Can they even get 25% of the current site's content and configuration migrated? If not, then they are locked in. It is often argued that open source does not lock you in, but is this really true in a meaningful way? While it is certainly possible to migrate data between open source projects, or content management systems for that matter, it is by no means an easy and painless process.

The third beneficiary is the implementor of authoring tools. Imagine you are a developer at Mozilla, OpenOffice, Xopus, Bitflux, or KDE. You'd like to tie the client-side environment, where *real* authoring happens, into the server side environment, where collaboration happens.

There are over ten projects presenting at OSCOM. If Mozilla has to talk to ten different systems in ten different ways, they will probably opt to talk to none of them. However, if the various projects agree to a thin set of common capabilities, then there is a basis for authoring-side integration.
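
What might such a thin set look like? Here is a hypothetical sketch of a minimal contract; the interface and method names are invented for illustration, and no such shared interface exists today:

    # A hypothetical sketch of a "thin set of common capabilities" that
    # an authoring client could target. The interface and method names
    # are invented for illustration; no such shared contract exists yet.
    from abc import ABC, abstractmethod

    class ContentRepository(ABC):
        """The minimum a CMS could expose to an authoring tool."""

        @abstractmethod
        def list_documents(self, folder):
            """Return the paths of the documents under a folder."""

        @abstractmethod
        def get_document(self, path):
            """Fetch the raw content of one document."""

        @abstractmethod
        def put_document(self, path, content):
            """Create or replace one document."""

An authoring tool written against a contract like this would work with every CMS that implements it, instead of needing ten one-off integrations.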

But we're all open source veterans here, so let's cut the crap. Do any of these people have a right to ask for interop? This is open source, scratch your own itch, I-do-this-because-I-like-it territory. The time spent serving these beneficiaries could be better spent implementing Gadget B, which my mailing list tells me will cure cancer. Right?

Wrong, but first, let's explore the hidden costs of the process of interop.

Hidden Costs

Doing interop is hard. It's a lot harder than starting your own software project. Just review the mailing list archives for an interoperability project such as WebDAV: the messages go on for months and years. It takes time to distill the common wisdom from diverse perspectives into a standard that can have multiple implementations.

Harder, though, are the human issues. As we have learned with the SlideML project, you have to bootstrap a culture and a process. Most of the participants are used to being the big fish in their pond. So who is the big fish in a shared pond? How do decisions become final?

From a process perspective, standards require a different kind of rigor than software. In fact, the purpose is to produce something that exists separately from the software.

Similar to the projects themselves, though, successful efforts seem to show character traits that combine intellectual confrontation with patient encouragement, with a strong dose of humor and enjoyment.

The Revenge of the Upside

We have discussed the reality check of interop, explored the beneficiaries and questioned their rights, and surveyed the hidden costs. So that's the downside. What's the upside of interop that makes it worthwhile?

The authors of this article are promoting the idea of pursuing interop between open source content management systems. We are advocates. So we'll focus this article on the provocative questions of interop in general, and limit the upside to one discussion point.

In the world of open source web servers, there is one project that has a majority of the gravity. For databases, there are a couple of projects that split the gravity. Same for desktop environments. But for content management, there are a trillion. This kind of market splintering helps ensure that the big boys are safe to dominate the mainstream, where size and stability matter more than innovation and revolution.

Interop efforts, such as the Linux Standards Base, reduce risks for the mainstream customer. Not completely, perhaps not much at all initially. But it proves that we are interested in the motivations of the mainstream.

But interop is not solely a "feature" to appeal to the mass market; it can also unleash many new possibilities. Consider XML-RPC, which brought interop to scripting languages and is now baked into dozens of scripting environments on various platforms.
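
For instance, making a remote call over XML-RPC from Python takes only a few lines. The endpoint URL and method name below are placeholders, not a real service:

    # An XML-RPC call with Python's standard library. The endpoint URL
    # and method name are placeholders, not a real service.
    import xmlrpc.client

    server = xmlrpc.client.ServerProxy("http://example.org/xmlrpc")
    # The arguments are marshalled into XML, POSTed over HTTP, and the
    # response is unmarshalled back into native Python types.
    print(server.demo.add(2, 3))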

Possible Progress

The existence of OSCOM, the conferences, the budding community, SlideML, and the interop workshops in Berkeley next week are all signs that this interop effort is taking baby steps. At this early stage, we can all be prognosticators and foretell the future with 100% certainty. Take your pick:
  • Prediction One: Interop between open source projects is a fool's errand.

  • Prediction Two: If we stay practical and focus on small steps, we can provide value with lower risk.

  • Prediction Three: We'll stumble across the Big Idea that is the bait to get the fish (project leaders) on the hook in a big way.

  • Prediction Four: Somebody will get sued over a patent infringement and we'll all move to Norway.

Open Questions

There are no easy answers for interop, nor are the questions that need to be answered unique to the content management space.

How and when does interop become "sexy" and arouse interest among developers? What can be learned from interop efforts that succeeded?

Is lowest common denominator functionality still worth anything? Let's say the choices are 100% interoperability (fantasy), 0% interoperability (surrender), or 20% interoperability (pragmatism).

Is 20% better than nothing?

But ... where does interop end?, posted 19 Sep 2002 at 20:46 UTC by Malx » (Journeyer)

What is the level at which you no longer need to bother with interop?

It is a real question. Many developers go to FS/OS just because they like to do things the way they like. If you insist on such "standards" you will lose them.

And the REALLY BIG INTEROP QUESTION is: "when will you create the ultimate translator from any natural language on Earth to any other?" :)
Or at least for a computerized subset of languages...
English is not the answer. Just look at local communities. :(

My solution is: _you_ should solve the interop question yourself. You shouldn't blame developers. You shouldn't ask them for any standards compliance.
Need an example? FreeBSD ports: they don't ask you (the developer) to make your software support FreeBSD. They just port your software, independent of your wish for them to do so.

is porting / converting a good use of time?, posted 19 Sep 2002 at 21:02 UTC by gregorrothfuss » (Journeyer)

malx:
i think "do it yourself" very often misses the point. not everyone has the skills or time to port every software / convert data from one format to another.

im a developer myself, and i only have time / resources to work on a few projects. when i spend my spare time i want to make sure it's maximally useful. i personally consider converting data / code between platforms / applications not the best or most satisfying use of my time :)

re: levels of interop, i believe 20% (to pick a number) is worth it. another consideration for the adoption of standards is how much effort it takes to implement them, and what implementing a standard gives you. an example: i implemented the blogger api for postnuke. it was very simple to do. it took me about one evening, with no prior experience with XML-RPC. the payoff is that i can now use various nifty tools to blog from the desktop, or from a PDA / mobile. standards should have these characteristics in my opinion.
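
for the curious, this is roughly what those desktop tools do under the hood: one XML-RPC call against the blogger api endpoint the CMS exposes. the url and credentials below are placeholders.

    # One Blogger API call over XML-RPC, as a desktop blogging tool
    # would make it. The URL and credentials are placeholders.
    # blogger.newPost takes (appkey, blogid, username, password,
    # content, publish) and returns the id of the new post.
    import xmlrpc.client

    api = xmlrpc.client.ServerProxy("http://example.org/backend/xmlrpc")
    post_id = api.blogger.newPost(
        "0123456789ABCDEF",           # appkey (ignored by most servers)
        "1",                          # blog / site id
        "gregor",                     # username
        "secret",                     # password
        "posted from my desktop :)",  # post body
        True,                         # publish immediately
    )
    print("new post id:", post_id)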

Do it yourself is the only option, posted 19 Sep 2002 at 23:17 UTC by neil » (Master)

You have no right to demand anything of the developers that you aren't willing to do yourself.

im a developer myself, and i only have time / resources to work on a few projects. when i spend my spare time i want to make sure it's maximally useful. i personally consider converting data / code between platforms / applications not the best or most satisfying use of my time :)

Well guess what? The same is true of other developers, too. Developers would rather be making their software more useful for themselves and their contributors than help further a goal that you won't even further yourself.

That's life in the bazaar. Didn't you say you liked the bazaar?

i will incorporate interop into my software, posted 20 Sep 2002 at 00:18 UTC by gregorrothfuss » (Journeyer)

neil:
i was being imprecise. im ready to support interop in the programs i write / contribute to (a small subset of open source programs). im not ready to go all the way for programs which i merely use (most open source programs). i was wondering how other devs make this call for their own software. is it a non-issue? what is the sweet spot for them?

Unix Philosophy: Solve one problem at a time, posted 20 Sep 2002 at 11:39 UTC by CaptainNemo » (Journeyer)

Is lowest common denominator functionality still worth anything? Let's say the choices are 100% interoperability (fantasy), 0% interoperability (surrender), or 20% interoperability (pragmatism).

Is 20% better than nothing?

It depends on the means by which interoperability is obtained. Personally, I would prefer an RSS feed and well-commented XHTML to full compliance with a "standard" that delivers 20% of what I need.

For the last 25 years the Unix world has been creating small programs that usually solve one problem and one problem only. The only attempt at interoperability they have made is to output decently formatted, understandable text to STDOUT. Others provided the "glue" languages (sed, awk, Perl, etc.), and this has proven to be not only sufficient, but very efficient as well.
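
To see how little that contract demands, here is a complete Unix-style filter in Python; pipe anything into it and it numbers the lines, like a tiny cat -n:

    # A complete Unix-style filter: read lines from stdin, write
    # transformed lines to stdout, and let pipes do the interop.
    # This one just numbers its input, like a tiny `cat -n`.
    import sys

    for number, line in enumerate(sys.stdin, start=1):
        sys.stdout.write("%6d  %s" % (number, line))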

I recently wrote a script to convert my Mozilla bookmarks.html file to the Web Links module of a PostNuke site. It was a relatively simple task, as the bookmarks.html file was easy to parse. However, there is a "standard" protocol for the storage of bookmarks (the name escapes me), and if Mozilla (and PostNuke) had both adhered to the protocol it would have made my job MUCH easier.

I guess what I'm trying to say is that rather than create yet another protocol, we should start by encouraging CMS creators to support the smaller protocols and standards. I would be VERY happy if a bulletin board would export using the mbox format. It would be VERY easy for the developers to include, and IMO it would go a long way toward helping people who are interested in interoperating with your program.
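
As a sketch of how cheap such an export could be, here is the mbox idea in a few lines of Python. The posts below are stand-in data; a real bulletin board would pull them from its database:

    # Sketch of an mbox export: each forum post becomes one message in
    # a standard mbox file that any mail reader can open. The posts are
    # stand-in data; a real bulletin board would query its database.
    import mailbox
    from email.message import EmailMessage

    posts = [
        ("alice@example.org", "Re: interop", "I think 20% is worth it."),
        ("bob@example.org", "Re: interop", "Only if it stays simple."),
    ]

    box = mailbox.mbox("forum-export.mbox")
    try:
        for author, subject, body in posts:
            msg = EmailMessage()
            msg["From"] = author
            msg["Subject"] = subject
            msg.set_content(body)
            box.add(msg)
    finally:
        box.close()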

When it comes to interop, one of the best examples I've seen is ODBC. Great stuff.

Sorry if this post is a bit confusing and/or pointless... it was written two lines at a time over a period of four hours :)

Yes but *which* smaller protocols?, posted 20 Sep 2002 at 14:35 UTC by brendan » (Apprentice)

I agree that we should "start by encouraging the CMS creators to support the smaller protocols", but the question is: which protocols? There really aren't many around in the CMS space. WebDAV and the nascent JSR 170 are about all we have that were specifically created for content management -- everything else (W3C XML Schema? XSLT?) was created for other purposes but could be transferred to the CMS space.

Perhaps we need a kind of W3C Technical Architecture Group style report, stating "best practices for content management software". I deliberately didn't say "open source" in that sentence as I think this goes beyond open source -- many developers (and users) of commercial systems would love to see some more standards and interoperability in this space.

To take a straightforward but controversial example, let's look at templating languages -- a crucial part of any CMS. Can we really hope to standardise on one templating protocol? Even something like XSLT is implemented with different extension functions in each system, without even considering the proliferation of competing templating "standards" (HTML::Template, Velocity, WebMacro, etc etc). But the big thing in XSLT's favour is that it's cross-platform, supported by an independent collaborative standards body, and has multiple interoperable implementations. You can't say that about too many other systems.

So I think interop for CMS should start at the 37,000 foot level: come up with some overall design principles, look at what's out there piece by piece, and create some best-practice standards that can be supported by any system that cares about standardisation and interoperability.

next steps?, posted 20 Sep 2002 at 18:47 UTC by gregorrothfuss » (Journeyer)

tamnir asks a good follow-up question:

What baby steps should the OSCOM Interop project work on first?

two possible angles: leveraging rss much more, working closely with mozilla.

interoperability is about sharing, not about beauty, posted 20 Sep 2002 at 23:36 UTC by Ankh » (Master)

Disclaimer: I work for the W3C, an organization dedicated to trying to promote interoperability.

Interoperability is about people being able to use tools from multiple sources, multiple vendors, multiple authors, and have them work together, on the same data, in the same sort of way.

To do this, you have to compromise.

As a result, the specifications that are the most widely adopted are often those that have the strongest marks of compromise, which tends to make them look technically ugly.

For example, XML has features that few would call beautiful, but that were included because they were seen as significant for increasing early adoption. Of course, today we're stuck with them, and people who use them depend on them. But that's not necessarily a bad thing.

Where you can't get 100% interoperability, you sometimes *can* get 100% interoperability of a smaller specification, and then have an extension mechanism so that at least people can recognise and delineate the boundaries, the edge of what's interoperable.

Sometimes the hard part is getting people to agree to talk in the first place, and to agree to be bound by a decision if one is made, even if it loses some functionality.

Maybe this is too general to be of much use, I don't know.

Why re-invent the wheel?, posted 21 Sep 2002 at 00:17 UTC by kevindumpscore » (Master)

Why re-implement an XML standard for creating slides (SlideML) when one already exists?

Download the slides package (derived from the DocBook XML DTD) from here:
DocBook Open Repository

Here's a slide presentation that I created using the Slides XML DTD. You can download the XML source files and the final rendered HTML files from here:
Writing Technical Documentation with DocBook (download)

BTW: here's an on-line version of the same slide presentation:
Writing Technical Documentation with DocBook (on-line)

Thoughtful Programming and Forth, posted 24 Sep 2002 at 16:52 UTC by badvogato » (Master)

by Jeff Fox

We don't want to re-invent the wheel at all, posted 4 Oct 2002 at 16:50 UTC by Roger » (Journeyer)

Maybe kevindumpscore should visit this page ;)

SlideML to Docbook Slides

The following was written earlier:

At OSCOM (www.oscom.org) we created SlideML precisely because there are several XML presentation languages around and everyone thinks his or hers is the best. With SlideML we didn't just invent another language for writing OSCOM presentations, but also a translator with which you can go to DocBook Slides, as well as to OPML, SVG, whatever.
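
(As a sketch of how such a translator is typically wired up: one XSLT stylesheet per target format, applied here with Python's lxml. The file names are hypothetical placeholders, not the actual OSCOM stylesheets.)

    # Applying a SlideML-to-DocBook stylesheet with lxml. The file
    # names are hypothetical placeholders for the real stylesheets.
    from lxml import etree

    slides = etree.parse("talk.slideml.xml")
    to_docbook = etree.XSLT(etree.parse("slideml2docbook.xsl"))
    result = to_docbook(slides)
    result.write_output("talk.docbook.xml")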

IMHO SlideML is a new presentation markup language that isn't developed from scratch but integrates the best from the already existing XML presentation formats. It is also a true interop format (which none of the other presentation languages is today). It is lightweight (XHTML) and powerful (making full use of the newest XML trends: Namespaces, XInclude, Dublin Core) at the same time. It is extensible, but also simple.

As for docbook

I myself like DocBook quite a lot (we use a subset in our Bitflux CMS, and we did our presentation at OSCOM 1 in DocBook Slides with our CMS), and I am very glad that XHTML 2.0 comes nearer to DocBook than any X/HTML before. But for some people DocBook is too heavy, and they would prefer to use XHTML to write their presentations. So SlideML is a compromise, but a compromise with a lot of benefits for everyone (the DocBook Slides enthusiasts, as well as the AxKit, OPML, SVG, ... enthusiasts).

I hope that clarifies some misunderstandings that I thought I read in Kevin's comment.

Excuses, posted 11 Oct 2002 at 00:54 UTC by Roger » (Journeyer)

I just saw that I posted my reply three times, whereas I thought I was just previewing. Sorry about that. Can I take my first two posts out again?
