Older blog entries for mikehearn (starting at number 18)

8 Jan 2004 (updated 8 Jan 2004 at 18:36 UTC) »

Ah, my first diary entry of the new year. I haven't posted much lately; it seemed I never had much to say.

Christmas was pretty average but New Year's Eve was fun - all my old friends got together and we all had a big house party. There must have been 25-30 people there (ok, so not that big).

I've been pretty busy with Wine and autopackage related stuff lately. In particular, I got tired of people asking how to install Internet Explorer on #winehq, so I wrote a script to do it for you. It was posted to Wine Weekly News, frankscorner.org and linux-gamers.net CounterStrike howto, so it's been getting a steady ~250 downloads a day since then. As you can imagine, my inbox has been groaning.

For those who want it btw, you can get it here

One thing that really annoys me is that roughly half the script exists to work around stupid packaging bugs. Some of them are really, really stupid - e.g. Wine won't even start at all on an out-of-the-box install. Have these people done any QA whatsoever? Clearly not.

That's one category of problems (obvious and stupid brokenness). The other category is when packagers try to "improve" upon the install sequence we use in the source tarballs. For instance, they decide that because the wineserver has the word "server" in its name, it'd be a good idea to put it into the init scripts. Unfortunately this stops changes to the config file from taking effect, as the config is loaded at wineserver startup. And so on. This sort of thing goes on and on.

The problem is clearly that packagers for distros are typically not Wine developers, they are users. Therefore they make mistakes, even though we have a dedicated packaging guide (it seems people don't always read it).

There are two possible solutions.

The first is pretty obvious - winehq should ship one officially blessed package that works anywhere. If people want to use distro-specific packages built by who-knows-whom then great, but if it breaks, you keep the pieces. Unfortunately, short of rolling something with Loki Setup a la codeweavers, we currently can't do that well. This is exactly the sort of thing autopackage is built for of course, but I might make a Loki Setup installer in the meantime anyway.

The second is that every distro package should be done by a developer. We have this for Red Hat, and the RH/Fedora packages generally work well. But for most distros we don't have a developer-packager, so it's clearly not a reliable solution.

So it seems my two entirely separate projects are intertwined in some way ... interesting :)

I'm starting to regret taking on this Wine porting work. The app basically works fine, as far as I can tell (though printing is proving problematic) but it is riddled with small yet annoying buglets. For instance, if you correct a spelling mistake in an editor, the context menu pops up shortly afterwards. Why? Because the application itself pokes the richedit control with a WM_RBUTTONUP message. Inscrutable.

Win32 has a very useful ability, namely an operating system level understanding of exceptions. It's called SEH, for "structured exception handling", and is what I feel like discussing at the moment. Linux has no such equivalent. Let's examine what happens when a thread faults on Windows and Linux, and see what we can learn from the other side.

Typically exceptions are handled by the language runtime in use; however, moving this functionality down into lower-level shared code has some significant advantages.

Let's pretend that a thread in my application dereferences a null pointer. This is invalid, as typically the lower regions of memory are protected, and therefore a page fault/access violation/segmentation fault will occur.

In Win32, what happens is as follows. The CPU delivers an interrupt which is handled by kernel mode code. The kernel then transfers control back to user mode at the KiUserExceptionDispatcher function, which begins the process of SEH invocation. Win32 SEH frames are stored on the stack, with a thread-local linked list pointing to each different frame in turn (the head of the list is actually stored in the TEB). The SEH frames define things like a callback to be executed, which is passed a structure describing the nature of the exception. An exception type is identified by an unsigned int; an access violation, for instance, is 0xc0000005.

As the stack is unwound, the callbacks are called twice. The first call is to figure out how to handle the exception; the second happens as the stack is unwound, giving implementors the ability to have "finally" type functionality. Eventually, even if the application has itself registered no handlers, control reaches an exception handler put in place by the operating system loader code. This calls the default handler code (UnhandledExceptionFilter) which on Windows calls some debugging hooks before finally displaying the famous "This application has performed an illegal operation" dialog box. Visual C++ exceptions are partly implemented on top of SEH, however SEH is not rich enough to fully support C++'s needs, so there are limitations in interoperability. In particular C code cannot trap C++ exceptions.
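The frame chain and two-pass dispatch can be sketched in portable C. This is not real Win32 SEH (no kernel involvement, no TEB; all the names here are invented for illustration), but it shows the shape of the mechanism: pass one searches the thread-local list for a willing handler, pass two unwinds the frames inside it, running their cleanup callbacks.

```c
#include <stddef.h>

#define EXCEPTION_CONTINUE_SEARCH 0
#define EXCEPTION_EXECUTE_HANDLER 1

typedef struct EH_FRAME {
    struct EH_FRAME *prev;                  /* next outer frame            */
    int (*filter)(unsigned int code);       /* pass 1: can we handle this? */
    void (*unwind)(struct EH_FRAME *self);  /* pass 2: "finally" work      */
    int unwound;                            /* did pass 2 reach us?        */
} EH_FRAME;

static EH_FRAME *eh_chain;  /* in real SEH the list head lives in the TEB */

void eh_push(EH_FRAME *f) { f->prev = eh_chain; f->unwound = 0; eh_chain = f; }

/* Pass 1: find a frame willing to handle the exception code.
 * Pass 2: unwind every frame up to (but not including) that handler,
 * running its cleanup callback. Returns the handling frame, or NULL. */
EH_FRAME *eh_dispatch(unsigned int code)
{
    EH_FRAME *f, *handler = NULL;
    for (f = eh_chain; f; f = f->prev)
        if (f->filter(code) == EXCEPTION_EXECUTE_HANDLER) { handler = f; break; }
    for (f = eh_chain; f && f != handler; f = f->prev)
        if (f->unwind) f->unwind(f);
    eh_chain = handler ? handler->prev : NULL;
    return handler;
}

/* demo: an outer frame that handles access violations, an inner
 * frame that only wants cleanup (a "finally") */
static int ignore_all(unsigned int code) { (void)code; return EXCEPTION_CONTINUE_SEARCH; }
static int handle_av(unsigned int code)
{ return code == 0xc0000005 ? EXCEPTION_EXECUTE_HANDLER : EXCEPTION_CONTINUE_SEARCH; }

static int cleanups_run;
static void count_unwind(EH_FRAME *self) { self->unwound = 1; cleanups_run++; }

int eh_demo(void)
{
    EH_FRAME outer = { 0, handle_av, 0, 0 };
    EH_FRAME inner = { 0, ignore_all, count_unwind, 0 };
    eh_chain = 0; cleanups_run = 0;
    eh_push(&outer);
    eh_push(&inner);
    /* simulate an access violation: inner declines, gets unwound;
     * outer handles it */
    return eh_dispatch(0xc0000005) == &outer && cleanups_run == 1;
}
```

The two loops are the two passes described above; real SEH interleaves them more subtly, but the handler-search-then-unwind structure is the same.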

Linux as an OS imposes no particular exception handling system on programs; in fact, it's not even aware of exceptions. Typically a signal delivered to an app will be handled, and the exception handling machinery built into the binary by the compiler will be invoked.

That works OK, but has a few problems. Firstly, it's not possible to throw C++ (or Python, or Java, or .NET etc) exceptions across C libraries. Even though you can compile C code with -fexceptions, typical programs are not expecting the stack to be unwound underneath them and so internal state can be corrupted by doing so. Even if people wanted to write exception-aware C code, there is no C API or ABI for doing it. There's no way to robustly translate exceptions from one environment to another - for instance, C++ exceptions thrown in a library used from Python would probably go badly wrong.

One possible solution is to have a third party library (or maybe a new part of glib) provide an explicit API for exception handling in C, similar to SEH. If rich enough, the most common error handling systems could be mapped to it, bringing us a little bit closer to having interoperating code modules. It would also lead to the ability to catch up to Windows in some aspects - for instance, a unified crash handler window. Doing this properly would probably require co-operation from the libc and gcc maintainers however.
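To give a flavour of what such a C-level API might look like, here is a sketch built on setjmp/longjmp. All the names (G_TRY, g_throw and friends) are invented; glib has no such API today, and real SEH-style unwinding of arbitrary stack frames would need exactly the libc/gcc co-operation mentioned above - this toy version only jumps between cooperating TRY blocks.

```c
#include <setjmp.h>

typedef struct GTryFrame {
    jmp_buf env;
    struct GTryFrame *prev;
} GTryFrame;

static GTryFrame *try_stack;      /* innermost frame; per-thread in real code */
static unsigned int g_last_code;  /* code of the exception in flight          */

/* enter a TRY block: push the frame and arm setjmp */
#define G_TRY(f)    ((f)->prev = try_stack, try_stack = (f), \
                     setjmp((f)->env) == 0)
/* leave a TRY block that completed without throwing */
#define G_ENDTRY(f) (try_stack = (f)->prev)

/* throw: unwind to the innermost TRY block */
void g_throw(unsigned int code)
{
    GTryFrame *f = try_stack;
    if (!f) return;               /* unhandled; real code would abort() */
    try_stack = f->prev;          /* pop before jumping                 */
    g_last_code = code;
    longjmp(f->env, 1);
}

/* usage sketch: catch an SEH-style access violation code */
unsigned int g_try_demo(void)
{
    GTryFrame frame;
    if (G_TRY(&frame)) {
        g_throw(0xc0000005);      /* "access violation" */
        G_ENDTRY(&frame);
        return 0;                 /* never reached */
    }
    return g_last_code;           /* we got here via longjmp */
}
```

Mapping C++, Python or .NET exceptions onto something like this would still take per-runtime glue, but at least there would be one agreed hub to translate through.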

Ahhhhh, I came to Advogato because I wanted to be distracted from work, and here I am again, trying to avoid writing this History essay. I love you Interweb!

I started hacking on something for GStreamer last night. I hope it won't take too long, I get jumpy if I'm away from autopackage for a while, and I keep feeling I should spend all my spare time on this contract work for Wine. Nonetheless, it's kind of important that GStreamer is more robust - the fact that it works at the moment is mostly a matter of blind luck (on single processor machines).

Luckily dolphy wants to build a high end streaming server using it, which means fixing SMP breakage, which means making GStreamer finally use threads sanely.

I'm not working for dolphy unfortunately, but as a part of that and my recurring interest in low level kung-fu I am writing an inter-thread marshaller, or, to be more accurate (as no actual marshalling is done), a thread thunker.

I fiddled with using inline assembly, some interesting hacks with the stack, gcc extensions and so on for a bit until I resigned myself to the inevitable - code generation is the easiest way to get portable and easy-to-understand code. The way it basically works is that you place a macro call as the first line of your function (so you need a C99 compiler), for instance:

void some_object_method(SomeObject *self, int param)
{
    THREAD_THUNK(some_object_method, self, param);
    /* ... real method body, always executed in the owning thread ... */
}

"self" is a GObject (or really any structure) which tells you which GThread owns the current object. If that thread != the current thread, the call is redirected to a stub which places all the parameters into a structure which is then pushed onto an asynchronous queue. We then block on a condition until the owning thread dispatches the function call via a stub which unpacks the structure. The proxy/stub code should hopefully be transparent. So far, so simple.

Issues I don't currently deal with - the code generator is non-existent, I'll prolly try hacking that tonight. If you call a thunked function and that function calls another, the second function will be executed in the owning thread of the first, which might not be what you expect. Signals have the same problem: they aren't marshalled into other threads, so you can get callbacks on a different thread to the one you expect.

Typically that's not a problem but code which uses TLS slots (GPrivate in glib), for instance OpenGL, can break if that happens.

You can also get deadlock as there's no re-entrancy yet. If you have two objects owned by two threads (A and B), A can call into B which can then call back into A but function dispatch is not being done, so the threads freeze. Probably the simplest solution to this is to have the stub dispatch incoming thunked calls while waiting on the result of the previous one. I say "simple" but of course this can get hairy really quickly - we don't want to do what DCOM/Bonobo do and re-enter the main loop, or perhaps even dispatch any non-related incoming method call while blocking inside another one. The principle of least surprise tells me I should somehow restrict the scope of thunk re-entrancy, but I need to think it through more clearly.

Fun stuff, anyway :)

I need to get more productive. University is enormous fun, but a lot of my free time seems to come in bite-size chunks which aren't useful for getting things done. I either need to increase my productivity or scale back my commitments - prolly my open source ones :(

More object-model thoughts

I wish I had time to work on this stuff, but I don't. As a compromise I might as well write up my thoughts. Maybe somebody else will see them and act on them, or at least get interested as a result.

Recently I have started to fancy the D language. It's cute. More importantly, it's not cute in a theoretical abstract way, it's cute because it is an extremely practical language. Physically it resembles Java or C#, however it has lots of features and more importantly fits into the traditional workflow - there is a compiler which outputs native .o binaries given input text files. Those binaries have no external dependencies (assuming you statically link the fledgling standard library). There are no VMs.

Unfortunately, D currently has a few problems which make it unsuitable for use in the free software community. The first is the lack of a free-speech compiler. The second is the lack of a working GObject bridge, which means that despite excellent C compatibility, it does not have access to any Linux APIs.

As I pondered this problem, an idea occurred to me - what if, instead of a random home-grown ABI for D on Linux, an experimental compiler were written that outputs GObject C? The resultant C could be compiled using gcc (one of the main blocking issues for a GNU D compiler is the complexity of writing gcc frontends).

Why would this be interesting? Basically, because it means that you could write a class in D (think similar to C++ but cleaned up), compile it, then use it from C with no intermediate stage. An extra step or two could make it automatically bound into Python, perhaps with the bindings embedded in the same shared library that the original code is in.

Ahhh... we would be one step closer to a unified object model on Linux. So what are the difficulties with this approach?

The most obvious one is that D has many constructs that cannot be supported natively in C, like exceptions. Exception handling in C *is* possible, it's done on Windows (google for "win32 seh"), but we currently have no standard for it on Linux as far as I'm aware. Other constructs that might cause problems would be visibility modifiers on methods, and method overloading. Ideally there'd be an obvious mapping from symbols in D to symbols in C (which would be their native "mangling"), however making method overloading reliable would make this a lot harder.
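To illustrate the kind of mapping I mean: a trivial D class could lower to GObject-flavoured C along these lines. No real D compiler emits this, and glib itself isn't used in the sketch; the symbol scheme just follows GObject naming conventions (type_method names, an explicit self pointer), which would be the "native mangling" mentioned above.

```c
/* Hypothetical lowering of the D class:
 *
 *     class Greeter { int count; int greet() { return ++count; } }
 */
#include <stdlib.h>

typedef struct Greeter {
    /* a real lowering would embed a GObject here to join the type system */
    int count;
} Greeter;

/* constructor: D "new Greeter()" becomes a _new function */
Greeter *greeter_new(void)
{
    Greeter *self = calloc(1, sizeof *self);
    return self;
}

/* method: D "g.greet()" becomes an explicit-this C call */
int greeter_greet(Greeter *self)
{
    return ++self->count;
}

void greeter_free(Greeter *self) { free(self); }
```

From C the class is then usable with no intermediate stage - `Greeter *g = greeter_new(); greeter_greet(g);` - and a binding generator could pick the symbols up mechanically. Overloaded methods are exactly where this breaks down: `greet(int)` and `greet(char*)` would need some suffix scheme, and there's no obvious one.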

Nonetheless, the possibilities are interesting. There's no real need to create a home-grown ABI for D on Linux, rather, we can reuse what already exists. Another alternative would be to write a D compiler for .NET, however D can be at times quite low level, for instance, it's possible to have direct mappings onto C structs and export D functions with C linkage (indeed, this is one of its assets), so I'm not sure the .NET CTS would be a good match here.

Unfortunately I don't have time to work on this project.

chakie: IIRC such a program already exists, I certainly remember using it when I was a KDE user. I don't recall the name unfortunately.

You can hide the preview pane in Evolution (at least in evolution 1.4) by pressing ` - this option by the way is in the menus, in "View | Preview Pane", so I'm not sure why you had so much difficulty finding it.

Wow, I've just started university and so far, it's been a blast! Fortunately things are starting to settle down now I'm well into the second week, and I once again have time for hacking.

We released autopackage 0.3.5 the other day, and we are now looking at what cool features to work on next. I think network resolution seems to be a popular one, so I posted some design notes tonight. The problems of building a decentralised package download network are fascinating.

Tonight though, I felt a bit rough so decided to try and package Straw rather than go to the bar. Straw is not easy to package. I started on the python libdb bindings. Writing the package wasn't so hard, though it took a while because I had to add extra infrastructure for python modules, and fix bugs that only really showed themselves when working with a very different build system to autotools.

It builds and installs OK now, and the specfile is simple enough, you can see it here. Unfortunately libdb library versioning madness means that the test cases almost all fail - hopefully the maintainer can help me get to the bottom of it. It should just be a case of rebuilding the package once it's been figured out.

Next up, adns. But first I have to do a quote for a guy who wants a veterinary app working on Wine. Fun stuff :)

My last blog post pondered what was needed to share objects between "platforms" (ie .NET, C, Python, C++/std, C++/Qt and so on). This time, I'll be thinking out loud about potential systems that we could use in free software land. It'll be long, humour me. I promise I won't do it again ;)

I think there are basically 3 different types that stand a chance of being accepted. I'm going to exclude CORBA from this list on the grounds that it has already had a good attempt at the market, and failed to make a noticeable impact. So:

The first type is some derivative of COM

COM has been endlessly imitated throughout the years - because its core is so simple, it's easy to duplicate. For instance, Mozilla uses XPCOM, which is a simpler, slightly cleaner variant (but is designed for internal usage only rather than operating-system-wide usage). From the XPLC web site, I'd guess it's similar to COM also. The basic features of a COM-style system are that you pass around arrays of function pointers (an interface), and each object must expose at least one standard interface, typically similar to IUnknown (XPCOM uses nsISupports).
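That "arrays of function pointers" core is small enough to sketch in plain C. This mimics the IUnknown shape for illustration only (MyObject and my_object_create are invented names); real COM adds GUID-based interface lookup in QueryInterface, HRESULTs, apartments and much more.

```c
#include <stdlib.h>

typedef struct IUnknownVtbl IUnknownVtbl;
typedef struct IUnknown { const IUnknownVtbl *lpVtbl; } IUnknown;

struct IUnknownVtbl {
    /* every COM interface begins with these three slots */
    void *(*QueryInterface)(IUnknown *self, const char *iid);
    unsigned (*AddRef)(IUnknown *self);
    unsigned (*Release)(IUnknown *self);
};

/* a concrete object implementing the interface */
typedef struct {
    IUnknown base;      /* vtable pointer must come first in memory */
    unsigned refs;
} MyObject;

static void *my_qi(IUnknown *self, const char *iid)
{
    (void)iid;          /* only one interface in this sketch */
    return self;
}
static unsigned my_addref(IUnknown *self)
{
    return ++((MyObject *)self)->refs;
}
static unsigned my_release(IUnknown *self)
{
    MyObject *obj = (MyObject *)self;
    unsigned n = --obj->refs;
    if (n == 0) free(obj);  /* COM objects free themselves */
    return n;
}

static const IUnknownVtbl my_vtbl = { my_qi, my_addref, my_release };

IUnknown *my_object_create(void)
{
    MyObject *obj = calloc(1, sizeof *obj);
    obj->base.lpVtbl = &my_vtbl;
    obj->refs = 1;      /* caller owns the first reference */
    return &obj->base;
}
```

A caller only ever sees the IUnknown pointer and goes through the table - `o->lpVtbl->AddRef(o)` - which is why the layout agreement alone is enough for cross-compiler, cross-language use.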

I don't think a COM derivative is right for us. In order to be truly useful, especially for performing language bindings, many things must be built on top, like for instance a variants API. Because all this must fit into the basic framework of cascaded function tables, it's difficult to get a natural API. Microsoft themselves are leaving COM behind in favour of .NET - it would be foolish to head in a direction the dominant technology provider is leaving. COM systems often require free-floating external magic, for instance a registry somewhere. Getting people to agree to depend on this would probably be difficult - the "what's in it for me?" factor gets in the way. While COM would work, it's not the best we can do. There is also no widely deployed implementation available - having working code on people's systems is a great boon. A COM-style system does have the big advantage of being a known quantity, and is at least the most politically acceptable, if not technically so.

The second type of system is a common language runtime

.NET provides a CLR and CTS (common type system). As far as I know, .NET is the first attempt to produce a truly generic compiled ABI - traditionally they have always been designed for a particular language.

A quick review - .NET allows language interop not through making languages "skins" of C# as some have cynically suggested, but by providing a unified binary format and type system. Managed C++ extends C++ for instance, but it's still possible to compile the entirety of Microsoft Word with it. Some people seem to think that if a language has a form that can be compiled to MSIL it must be "crippled" - in fact this is no more the case than a language being crippled because at some point it must be translated into x86 opcodes.

The "crippled" accusations normally arise because people see that in order to share objects with other .NET languages they must conform to the common type system. This much is obvious - only C++ supports multiple inheritance so an API that makes use of this feature cannot be used from a language that does not support it. That doesn't mean you can't use multiple inheritance in .NET, it just means that objects you create using it will only be usable by other Managed C++ programs.

Clearly, in order to share code you must have a common ground; this was covered in my previous posts. The .NET CTS is an extremely rich common ground, and therefore .NET APIs can be very expressive. Because the .NET class hierarchy is available to all .NET languages that understand the CTS, it's possible to write an API in say Python that returns an XML document usable from C#, as long as they use the common ground provided by that shared class library. You're free to write and use APIs that work in terms of PyXML of course, but don't expect to be able to export that to other languages. Sharing code between platforms is always going to involve a set of compromises, but this one is unusually good.

The approach .NET takes to allowing code interop is technically elegant, and makes the most sense. Creating a rich shared ABI, object model, a strong introspection framework for interpreters and also providing a shared class library is probably the ideal way to enable code sharing at the technical level.

We now hit a problem

Creating all this infrastructure takes a huge amount of effort. It's taken Microsoft years to produce .NET - so far there is no equivalent being produced by the open source community. No, Parrot is not it - even assuming they overcome the technical problems that some see with their designs, it still only provides one part of the puzzle, namely a VM and a new opcode/binary format. As far as I know, Parrot does not provide a shared object model or type system, and it most certainly does not provide a shared class library. Parrot is not specified anywhere, it's magic; by contrast, the core of .NET is specified in ISO standards.

It could be done of course, by a team sufficiently dedicated. But, what you'd end up with would simply be something very much like .NET but incompatible - where is the logic in that? It makes more sense then to clone .NET exactly, and remain compatible with Windows - something we need anyway if we are to run many of the apps of the future. For this of course we have Mono, and a very important project it is even if you really don't like .NET at all. It's the Wine of the year 2010.

Mono, of course, has understandable political problems. Remaining compatible with .NET has a tangible cost, both technically and socially. It not only causes weird hacks and warts like using the PE format for binaries instead of the ELF format more common on Linux (and somewhat superior, imho), but socially it places control of the shared platform in the hands of a sworn enemy of not only Linux but the entire free software movement.

It's not surprising that people question the wisdom of writing free software for Linux on this framework. Don't get me wrong - patents do not enter into my argument here - rather, I'm more concerned about the cost of having Microsoft dictate the design of the framework and the statistical difficulty of "forking" it - we need Mono to be compatible with MS.NET for the apps, so we cannot simultaneously take it in another direction if we disagree with what Microsoft are doing.

Is there a 3rd alternative?

Of course there is. It's not perfect, far from it. It's not as wide-reaching or elegant as .NET, it's not as simple or well understood as COM. It is, like all attempts to bridge different platforms, a set of compromises.

It's GObject, or rather, a derivative of it. Most people here probably see GObject as a way of bringing object orientation to C, something already done by C++ and Objective-C years ago. In fact, one of the primary motivations for developing GObject was the need for a way to make language bindings very easy to do, and at this task it succeeds admirably. Often the parts of GObject that seem unnecessarily complex such as the dual phase destruction process are there to support language bindings - in this case, dual phase destruction allows garbage collectors to break up reference counting cycles before destroying/freeing the individual objects.
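A plain-C sketch may make the dual-phase point concrete. This is not GObject's actual implementation (which spells the two phases dispose and finalize on GObjectClass); the Node type and cycle_demo here are invented. Phase one drops the references an object holds, so a collector can break a cycle; phase two frees the memory only when the refcount truly hits zero.

```c
#include <stdlib.h>

typedef struct Node {
    int refs;
    struct Node *other;   /* may point back at us: a reference cycle */
} Node;

Node *node_new(void) { Node *n = calloc(1, sizeof *n); n->refs = 1; return n; }

void node_unref(Node *n);

/* phase 1 ("dispose"): release what we hold, keep our own memory alive.
 * Clearing the pointer before unreffing matters: dispose can re-enter
 * through the cycle, and must be safe to run twice. */
void node_dispose(Node *n)
{
    Node *o = n->other;
    n->other = 0;
    if (o) node_unref(o);
}

/* phase 2 ("finalize"): reclaim the memory */
static void node_finalize(Node *n) { free(n); }

void node_ref(Node *n) { n->refs++; }
void node_unref(Node *n)
{
    if (--n->refs == 0) { node_dispose(n); node_finalize(n); }
}

/* a cycle collector can force phase 1 early to break a->b->a loops */
int cycle_demo(void)
{
    Node *a = node_new(), *b = node_new();
    a->other = b; node_ref(b);
    b->other = a; node_ref(a);
    node_unref(a); node_unref(b);   /* the cycle keeps both alive  */
    int alive = (a->refs == 1 && b->refs == 1);
    node_dispose(a);                /* break it: frees b, then a   */
    return alive;
}
```

With destruction as a single phase, the collector's only options for that cycle would be to leak it or to free objects that other objects still point at.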

GObject is not especially fun to define objects in. It typically requires a lot of boilerplate code, a lot of tedious typing and so on. This is not a valid reason for avoiding GObject in our quest to share code. The need to write lots of code to define an object is only true in C, and by its nature, everything in C is tedious and involves a lot of typing. If you want ease of use, use Python, not C.

The reason GObject is valuable for sharing objects is that it provides a rich object model, type system and more importantly it exposes it all via a C API - accessing it from C++, Python, .NET and so on is easy.

Typically today, GObjects are written in C. There is no particular reason why that must be the case, in fact there is already a proof-of-concept program that uses .NET reflection to examine a .NET object hosted by Mono and spits out libmono->GObject bindings. You can then apply the PyGTK codegen code to bind it into Python, C++ and so on.

You can of course also go direct from .NET to Python, in at least two different ways. The first is to write a Python interpreter that spits out MSIL, which is then JITed by the mono runtime. I think ActiveState were working on something like this. Unfortunately Python is largely magic, remember - there are no guarantees that code runs perfectly unchanged when we do that. The other way is to do direct Mono<->Python bindings, i.e. ones that use the libmono/libpython C APIs to directly map objects between the two. This is possible because both platforms support good reflection/introspection capabilities.

What advantages does using C/GObject as our "hub" bring?

The first is that GObject is a very rich object model - almost as rich as the .NET one, I think. It supports things like signals, which typically have to be bolted onto a language's native object model (see libsigc++). It supports a variety of memory management models. It supports properties, static and virtual methods (kind of, see below), and it supports arbitrary string->value associations. You can do inheritance with it, in contrast to COM which does not really know about that.

The second is that the combination of GObject, GLib and C also provides a lot of the infrastructure needed to usefully share code, in the form of a common type system, variants API, marshalling infrastructure, common data types and so on, all of which are kept very minimal. There's no need to haggle over the definition of a list, because it's already been defined, and has only the bare essentials. The binding logic is free to layer on top any API it sees fit - translating a GList (which is little more than struct { void *data; struct GList *next, *prev; }) into a QList which provides a much richer API is straightforward.

GObject has a few practical advantages. It already exists, which is a major plus, and better, is already deployed - every Linux (and FreeBSD and Windows) system that has GTK has GObject. It's well understood - GObject has been used to bind many types of objects into other languages, not just widgets or GNOME APIs - the GNU Triangulation Library, for instance. There is documentation, experience and tools available to make the job of producing the wrapper code easy.

The final advantage of GObject is that it doesn't depend on the user to set anything up. If GTK is installed, that's all you need. There's no registry, no runtime needed - as long as you can deal with shared objects, you're sorted. It can be used entirely internally, for instance you can write parts of a program in C and parts in Python.

GObject though falls short in a few places.

For instance, an object bound into Python from C will not respect virtual method overrides - GObject/GType knows nothing about virtual methods, and there is no set convention for them. A mix of styles is currently used, with no easy way to override methods as a result. I have some ideas for how that could be made better. The defs file format which expresses an API (similar to COM/CORBA IDL) is not specified anywhere, nor even really documented, it seems. The tools to bind objects from C are plentiful but the tools to bind objects to C are virtually non-existent. Thread safety is not really handled, unlike in COM where thread safety is (rightly) specified as part of the object's interface (well, actually it's specified in the registry, not IDL) - though arguably this should not be a part of the object system at all.

Its last problem is that its name starts with a G. Yes, this is really sad, but it causes problems for some people. Hopefully the political problems (which really only seem to affect KDE developers) associated with "another desktops library" are more easily surmounted than the problems with .NET - at any rate, an extension of GObject to deal with the problems outlined above could be called, for instance, CoreObject: maybe that would help.

Why COM rocks

So, this post is about what COM gets right, and therefore what we need to copy from it :)

I do believe that we need to copy some things from it. COM on Windows has been a great success in terms of code sharing. When the Jabber team wanted Windows clients, Peter Millard wrote JabberCOM. There were no arguments over what language to use, what dependencies it could have, or how to expose the API - it was written in Delphi (a dialect of object pascal) exposed via COM and that's how it was used. End of story. Contrast this to the stupid and childish arguments that happen in the community over trivial matters of code style and utility libraries, and you see why while the rule of a dictator may be hated and unjust, it is at least somewhat orderly.

The current scenario on Linux where libraries are written in C then bound into other languages has a few advantages, but also a few major disadvantages.

The biggest advantage is that you do tend to end up with a reasonably sane, well integrated API that follows the idioms of the platform you're using. For instance, pygtk does not require you to make calls like CreateObject("{3a36cb61-e38a-43b3-aa0d-407e7cfb2168}") in order to construct a window, you can just do "import pygtk" like every other native Python library. This can be done regardless of the design of the underlying API.

It does however have many disadvantages (obviously, or I wouldn't be saying we need something better). For starters, wrapping is a man-hour-intensive process. The bindings have to be created, then maintained. It has to be done over and over again for each language. Worse, this software doesn't actually do anything; it is "metasoftware" that simply helps us write other software better. The more time we spend writing this metasoftware, the less time we have for software that solves end user problems.

Perhaps one of the biggest disadvantages is that while theoretically you can bind from any platform into any other, in practice all the experience and tools are in going from C to somewhere else. It's common to find bindings for a library written in C to Python, but rare (in fact I've never heard of one) going from Python into C. Both are theoretically possible; it's just that one is unexplored territory, so the conventional wisdom is that if you want to share code, it must be written in C.

Clearly this attitude is not productive. C is not an especially efficient language to work in. Higher level languages can be used to get things done a lot quicker, and while Python (and maybe in the future mono) is taking off for applications the same is not true of software components.

So what do we need in order to share software components between "platforms" (which for want of a better word is the term I'll use to mean C vs C++ vs .NET vs Python and their associated standard libraries)?

The basic requirement is a shared object model. An object model tends to consist of a) a set of agreements on how things should be done and b) some sort of "magic", by which I mean opaque implementation details. COM for instance is roughly equal parts agreements and magic - COM components must use the COM reference counting model, must implement IUnknown and so on, however they don't necessarily have to rely on the CoCreateInstance() activation registry magic. While IDL is a formalized agreement, the code that turns that agreement into code you can invoke from your chosen platform is magic.

The magic-to-agreements ratio of object models varies wildly. The C++ object model for instance is almost entirely based on agreements, for instance between compiler vendors on the ABI to use, and as embodied in the C++ language specification. The lack of any ABI agreements on Windows was one of the driving forces behind the uptake of COM. From the user's perspective a lot of this takes place automatically, but they could if they wished format the ABI themselves (to construct vtables on the fly, for instance) - it's just that nobody does it, because it's so complex.
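To make "construct vtables on the fly" concrete: when the ABI agreement is essentially "the first field points at a table of function pointers", nothing stops a program from assembling that table by hand at runtime instead of letting the compiler emit it. A sketch in C, with invented Shape/area names:

```c
#include <stdlib.h>

typedef struct Shape Shape;
typedef struct {
    int (*area)(const Shape *self);   /* one "virtual method" slot */
} ShapeVtbl;

struct Shape {
    const ShapeVtbl *vtbl;  /* the whole ABI agreement: slot layout */
    int w, h;
};

static int rect_area(const Shape *s) { return s->w * s->h; }

/* build the vtable at runtime rather than having a compiler emit it */
Shape *make_rect(int w, int h)
{
    ShapeVtbl *vt = malloc(sizeof *vt);
    Shape *s = malloc(sizeof *s);
    vt->area = rect_area;   /* fill the slot by hand */
    s->vtbl = vt;
    s->w = w; s->h = h;
    return s;
}
```

A call through it - `s->vtbl->area(s)` - is exactly what a C++ compiler generates for a virtual call; the difference is only who wrote the table.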

In contrast, the Python object model is almost entirely magic. There is only one canonical implementation of the Python object model, the one produced by the Python project. Alternative implementations such as found in Psyco tend to be a little buggy. The advantage of a magic based object model is that you're not tied down by standards, and you are therefore much freer to extend or improve it. Magic is also a lot easier to work with, you can rely on quirks of the magic to get things done.

In order to share objects between platforms a common ground is needed, in other words you need a shared object model. In the case of COM, this model is very simple, it can hardly even be called an object model at all. CORBA is rather more advanced but rather more complex.

Of course COM does not just provide an object model. It also provides network transparency and thread safety to its components. I'd argue that these things are actually entirely separate issues that should not be glued onto a shared object framework simply because it's sort of possible, though there may well be a separate, clearly defined way to get these features in the context of the framework. The overheads and complications that seep into COM because of the location transparency generally aren't worth it, IMHO.

We don't have very many possibilities here, unfortunately. There are at least three different shared object models that we could use in the free software community (and that have a slight chance of being accepted). Next time I'll go over what they are, and which I personally would choose.

Perhaps in future once autopackage is up and running, I might try my hand at hacking one of them up. We'll see how it goes.

COM and the COMpetition

So pphaneuf was wondering what was up with COM. I'm not the guy who wrote the original rant, but I've worked quite a bit with the COM/OLE subsystem of Wine lately, so perhaps I can provide some insights.

COM attempts to solve a simple problem - providing a stable, language-neutral, in-process ABI that allows components from many providers to work together without source code.

Well. That's the theory. In fact, COM has been used to do all kinds of things over the years. So pretty quickly we reach the first problem with COM:

1) When all you have is a hammer, everything looks like a nail.

COM and the various technologies built upon it (which are often referred to just as COM) such as OLE/ActiveX/DCOM and so on, have been used for the following things over the years:

  • To allow Excel to embed itself in Word and vice-versa
  • To try and make the Win32 API object oriented
  • As a replacement for designing network protocols
  • As a system for IDE plugins/OCX controls (which is really just a specialisation of OLE compound documents)
  • As a Java-applet killer
  • As a way to expose code to Visual Basic. In fact OLE Automation was originally developed for the VB engine (which is why it uses BSTRs, the B stands for BASIC)

That's just off the top of my head. For all these things, a technology like COM can be used, but that doesn't mean it's actually the best technology for that particular purpose. Unfortunately, because COM is so flexible, there has traditionally been a tendency in Redmond to try and ram a square peg into a round hole.

COM is mind-bogglingly complicated. That might surprise people who only know the basics - and in a way it is surprising, because the core of COM is very simple. Components are black boxes that support interfaces. Interfaces can inherit from each other, and all interfaces inherit from IUnknown, which defines the basic services every COM object must implement.

So we reach the 2nd reason COM sucks:

2) COM is not actually object oriented.

You cannot inherit any functionality from IUnknown, only a contract, so you have to reimplement the IUnknown functionality each time. Because of that, IUnknown is very simple and provides almost no functionality.

Interfaces are basically vtables, ie tables of function pointers. IUnknown provides reference counting and the ability to query for other interfaces. That's it. In comparison, the GObject base class provides methods, properties, public variables, signals, arbitrary data association, reference counting and so on. So COM objects tend to be quite primitive.
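
Here's what "a contract but no functionality" feels like in practice - a hedged sketch of the IUnknown pattern in portable C++ (no Windows headers; plain strings and bools stand in for the real GUIDs and HRESULTs, and every name here is invented). Note how the component has to implement the refcounting and querying machinery itself:

```cpp
#include <cassert>
#include <cstring>

// Stand-in for IUnknown: reference counting plus interface querying.
struct ISketchUnknown {
    virtual unsigned AddRef() = 0;
    virtual unsigned Release() = 0;   // object deletes itself at zero
    virtual bool QueryInterface(const char* iid, void** out) = 0;
    virtual ~ISketchUnknown() {}
};

struct IGreeter : ISketchUnknown {
    virtual const char* Greet() = 0;
};

// Every component reimplements the whole contract by hand - there is
// nothing to inherit from the base interface but pure virtuals.
class Greeter : public IGreeter {
    unsigned refs_ = 1;
public:
    unsigned AddRef() override { return ++refs_; }
    unsigned Release() override {
        unsigned n = --refs_;
        if (n == 0) delete this;
        return n;
    }
    bool QueryInterface(const char* iid, void** out) override {
        if (!std::strcmp(iid, "IGreeter") || !std::strcmp(iid, "IUnknown")) {
            *out = static_cast<IGreeter*>(this);
            AddRef();                 // querying hands out a new reference
            return true;
        }
        *out = nullptr;
        return false;
    }
    const char* Greet() override { return "hello"; }
};
```

Multiply that boilerplate by every object in your component and you can see why the IDE-generated-code industry sprang up around COM.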

Actually, I lied earlier. You can get something resembling implementation inheritance in COM by using a technique called aggregation. Unfortunately this is such a pain in the ass that hardly anybody does it.

Seems simple enough so far, right? To build a COM object, all you need to do is implement IUnknown and then whatever interfaces you need. Well, almost.

3) COM sucks because it forces often unnecessary optimizations upon you

All COM objects, if they are going to be instantiated via the standard activation APIs, must have an associated class factory. Because having COM instantiate an object for you is expensive, the designers decided that CoGetClassObject should return a class factory - ie an object which exists only to create the actual object you want (CoCreateInstance is just a shortcut that goes through the factory for you). Once you have the factory, you can create instances of the object very cheaply, just by calling the right method on the IClassFactory interface.

Of course, not all objects need creation efficiency. Many are one-shots: you don't need to create them thousands of times per second, you create them perhaps once in the lifetime of the app. Unfortunately, COM forces the class factory paradigm on you anyway. And of course, you have to implement it yourself.
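
The pattern being forced on you looks roughly like this - a sketch with invented names (Widget, GetClassObject), standing in for the real IClassFactory/CoGetClassObject machinery:

```cpp
#include <cassert>

// The object you actually want.
struct Widget { int id; };

// The factory contract: an object whose only job is making Widgets.
struct IWidgetFactory {
    virtual Widget* CreateInstance() = 0;
    virtual ~IWidgetFactory() {}
};

class WidgetFactory : public IWidgetFactory {
    int next_id_ = 0;
public:
    Widget* CreateInstance() override { return new Widget{ next_id_++ }; }
};

// Stand-in for CoGetClassObject: activation hands you the factory,
// not the object - even if you only ever want one instance.
IWidgetFactory* GetClassObject() { return new WidgetFactory(); }
```

For an object you create once at startup, all this indirection buys you nothing - you implement it anyway because the activation machinery demands it.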

Well, that isn't quite true. You can get toolkits, frameworks and IDEs which automate a lot of this boilerplate for you - unfortunately we just fell into Windows syndrome where problems with the underlying framework are disguised by having an IDE write code for you, rather than fixing the problem.

4) COM sucks because it starts simple, then rapidly gets complicated.

OK, so far it's all been conceptually easy, if rather a lot of typing. But at the moment, your object is not that useful. For it to be used from other languages/environments, you need to define the interface in IDL. MS IDL is not a simple language. It's the job of the midl compiler to turn IDL into header files that describe the object, so you can easily invoke its methods.

IDL can be compiled into several different things. Header files for C/C++ is one obvious target, but you can also make "type libraries", which are binary equivalents to the IDL. Sort of. Unfortunately, even though there are two different TLB file formats (with the same extension), neither of them can completely represent everything available in IDL.

For instance, a type library cannot represent the IDispatch interface, which is a core COM interface. You need support for that built into the framework.

Type libraries can also be compiled into a DLL as a resource. At that point you basically need an IDE to extract it and generate the boilerplate code necessary to represent the vtables in your language.

So back to IDispatch. What's that?

5) COM sucks because late binding is truly horrid

IDispatch allows you to perform "late binding" on an object. Rather than needing to know the layout of the vtable at compile time, you can pass IDispatch a string representing the method you wish to invoke, and a set of arguments. It'll then invoke that method, and return the result. It's needed if you want your COM object to be accessible from a scripting language like JScript, VBScript, or ... yes, Visual Basic itself (which internally makes heavy use of OLE Automation).

Unfortunately, IDispatch is the interface from hell. MSDN recommends that you don't attempt to implement it yourself, because you'll get it wrong. It over-optimizes (again): for instance, you don't pass the method name to IDispatch::Invoke, you pass a numeric ID, which you must retrieve beforehand from IDispatch::GetIDsOfNames. Why? For speed. Unfortunately, Visual Basic doesn't actually use this optimization - it was decreed that keeping a cache of method names to member IDs was too complex and easy to screw up, so it does the lookup each time. You still have to implement it, of course.
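
A toy version of the two-step name-to-ID-to-Invoke dance, stripped of VARIANTs and HRESULTs, might look like this. Everything here (the Dispatcher class, the method names) is invented for illustration - the real interface traffics in DISPIDs and VARIANT argument arrays:

```cpp
#include <cassert>
#include <functional>
#include <map>
#include <string>

// Late binding in the IDispatch style: look a method up by name to
// get a numeric id (GetIDsOfNames), then invoke by id (Invoke).
class Dispatcher {
    std::map<std::string, int> ids_;                 // name -> id
    std::map<int, std::function<int(int)>> methods_; // id -> method
    int next_ = 0;
public:
    void Register(const std::string& name, std::function<int(int)> fn) {
        ids_[name] = next_;
        methods_[next_] = fn;
        ++next_;
    }
    int GetIDOfName(const std::string& name) const {
        auto it = ids_.find(name);
        return it == ids_.end() ? -1 : it->second;   // -1: unknown name
    }
    int Invoke(int id, int arg) { return methods_.at(id)(arg); }
};
```

The split exists so callers can cache the id and skip the string lookup on later calls - the optimization the entry notes Visual Basic never actually used.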

In order to make this situation suck less, Microsoft provides a few implementations of IDispatch for you. One of them, the one it's probably best to use, uses type libraries! Ah, that's better. You just have to write your IDL, compile it to a type library, and then delegate IDispatch to ... oh, wait. LoadTypeLib, then ITypeLib::GetTypeInfoOfGuid, then CreateStdDispatch - and then you have even more arbitrary limitations placed on you.

Don't worry. If you're using Delphi, this is all sorted out. Just don't forget that you can only have one IDispatch per object. If you have three interfaces you'd like to expose via OLE automation, you have to choose one, or get into "dispinterfaces".

Oh yes. Finally, remember that IDispatch::Invoke works in terms of the variant API. The Win32 VARIANT is really scary. Variants can contain about a zillion types of numbers, as well as things like BSTR, the most annoying string type in the world. Variants can also contain other dispatch interfaces.
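
The VARIANT idea in miniature is just a value slot plus a type tag. As a hedged sketch, C++'s std::variant can stand in for the Win32 union-plus-VT-code layout (SketchVariant and describe are invented names; the real thing has dozens more alternatives and manual VariantInit/VariantClear bookkeeping):

```cpp
#include <cassert>
#include <string>
#include <variant>

// One slot, several possible types, plus a way to ask which is live -
// the essence of what VARIANT's VT_* discriminant provides.
using SketchVariant = std::variant<int, double, std::string>;

std::string describe(const SketchVariant& v) {
    switch (v.index()) {
        case 0:  return "int";
        case 1:  return "double";
        default: return "string";
    }
}
```

Every argument crossing an Invoke boundary gets packed into and unpacked out of one of these, which is a large part of why late-bound calls are both slow and fiddly.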

6) COM sucks because DCOM is really complicated

DCOM lets you remote interfaces into other contexts. What is a context? COM defines an "apartment model", where an apartment typically means a thread. If a COM object lives in a single-threaded apartment (STA) - as most COM objects do - then it's not thread safe, so if you want to access it from another thread you must marshal the interface into that thread. Remember to run a message loop though! If you don't, DCOM will start sending window messages to that hidden window you just implicitly created, and your app will deadlock.

COM supports many different threading models, not all of which are supported by all versions of Windows. What the difference between the apartment, free, or both threading models is, is buried inside books like "Essential COM".

You can also marshal interfaces into other processes or machines, of course. Which brings us back to the thorny issue of exactly how the interfaces are marshalled. One way is to use type libraries again, but remember they can't represent everything that IDL can, so sometimes you'll have to generate marshalling code manually. I say manually - of course the RPC NDR APIs are so baroque (NdrUserMarshalMarshall() anybody?) that this isn't really possible, so you have to let midl do it for you. The code midl generates does some extremely weird things - look but don't touch, and don't ask questions whatever you do. The code is also messy, using a bizarre mix of indent styles.

I'm not going to continue; this entry/rant is already far too long. Just take a look through the (extensive) COM APIs some time, and note how many functions exist for no better reason than to invoke a certain method on an interface, or to create a certain object while bypassing the standard mechanisms for doing so.

Next time - why COM doesn't suck, and what we can do in free software land to get the benefits of COM (which are many) while avoiding the suckiness of a system that's been grown over a period of a decade.

Congrats to the GNOME developer team on 2.4, I'm looking forward to playing with it when the next Red Hat is out.

Wine wise, things are good. WineCfg continues to make good progress, mini-flamewars over instant apply on the side. Some highly dubious statistics were thrown around, for instance "70% of linux users use KDE, 90% of people use Windows, so therefore 99% of people are not used to instant apply".

Personally, having used both types quite a bit, I find instant apply to be much nicer, it actually makes the machine feel faster (or does to me), and I didn't have any problems adapting to it. On the other hand, I didn't have any problems with the Gnome2 button ordering either, and I know some people found both these things very disruptive (well, a few did, because they bitched about it online).

Now, there are arguments for doing both things in winecfg. Most seem to agree that on its own, instant apply makes sense, and has good usability. The arguments for the old way basically boil down to "That's what the rest of the world does!" which at least in my view is a rather weak argument. Nonetheless, conformance does have some merits.

The difficulty seems to be a belief that consistency with Windows and KDE is a good thing. KDE tends to be consistent mostly with itself and Windows, not necessarily with any particular set of usability guidelines, so I'm not inclined to use it as an example. Likewise, despite being implemented using Win32, winecfg will be used exclusively by users of the Linux desktop (well, FreeBSD etc as well). At that point it boils down to consistency with either KDE or GNOME, and given the usability arguments each side can present, I favour the route chosen by GNOME.

Hence - we have an instant apply winecfg.

Can I be bothered with the flamewars and controversy though? Probably not. It's a very little thing. Worse, Win32 does not have good support for instant-apply dialogs - the PropertySheet API we're using, for instance, does not support them. The transactions system I wrote internally can support both modes easily enough. Possibly a command line switch could be used, so the GNOME menu launches "instant apply mode" and the KDE menu launches "OK/Cancel mode". You could use the OnlyShowIn property of the .desktop format to control this.

I can't help feeling that it would ultimately be a bad idea. User interfaces are more malleable than APIs, and while Wine is a fairly conservative project, in this respect I think it should be more flexible. Alexandre appears to have little opinion on the matter: he has generally indicated he wants instant apply in the future, but has not yet weighed in on the debate.

On the other hand, it's not all been argument - I might be getting a contract from Relux Informatik AG to port their software using Wine, as they are interested in a Linux version. Being a specialist lighting-design app, a full native port is out of the question - it would basically mean a rewrite, which is clearly not acceptable. Is this the start of my own little company? I hope so :)
