Solid reusability
Posted 16 Aug 2000 at 20:48 UTC by gord
To date, there have been many promises that if we write software a
certain way, it will be reusable in the future. Free software authors
would especially benefit from the fulfillment of such a promise, because
many of us are interested in maximizing the usefulness of our work. I
have some ideas on what we can do to nail down this dream, and I want to
use this discussion to help make software reuse a reality.
I've spoken often about Figure, which is my
attempt to build a software environment that is useful for all kinds of
programs, and is portable to all kinds of environments.
The thing that I find wrong with existing portable environments is
that they are bound to policy (whether to use a garbage collector,
whether to run programs in the same address space or separate spaces,
what object representation to use, what instruction set to use, what
programming language, etc). I feel that this is a natural consequence
of starting to design and implement a new software environment (thereby
fixing on certain policies), then choosing to make it portable at a
later date. Portability, to me, means to be independent of policy, and
so I believe it is difficult to attain once the policies have snuck into
the design.
What makes Figure different is that it was designed from the ground
up to be portable. A common theme in Figure is interfaces that are
chosen to hide policies from Figure's user. For example, it is possible
to write garbage-collected and reference-counted programs and have
Figure use the memory reclamation scheme you choose. To go one step
further, you can use the same interfaces Figure uses for memory
reclamation, and thereby make your program as portable (in this respect)
as Figure is.
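To make the idea of a policy-hiding interface concrete, here is a minimal sketch in Python (chosen only for brevity). It is not Figure's actual interface: the names retain, release, collect, RefCounting and DeferredCollector are invented for illustration. The point is only that run_job is written once and works under either reclamation policy.

```python
class RefCounting:
    """Policy 1: an object is freed the moment its count reaches zero."""

    def __init__(self):
        self._counts = {}

    def retain(self, name):
        self._counts[name] = self._counts.get(name, 0) + 1

    def release(self, name):
        self._counts[name] -= 1
        if self._counts[name] == 0:
            del self._counts[name]

    def collect(self):
        pass  # nothing to do: memory was already reclaimed in release()

    def live(self):
        return set(self._counts)


class DeferredCollector:
    """Policy 2: a toy stand-in for a garbage collector. Dropping the
    last reference frees nothing until collect() runs."""

    def __init__(self):
        self._roots = {}
        self._heap = set()

    def retain(self, name):
        self._heap.add(name)
        self._roots[name] = self._roots.get(name, 0) + 1

    def release(self, name):
        self._roots[name] -= 1
        if self._roots[name] == 0:
            del self._roots[name]

    def collect(self):
        self._heap &= set(self._roots)  # keep only what is still referenced

    def live(self):
        return set(self._heap)


def run_job(mem):
    """Client code: written against retain/release, unaware of the policy."""
    mem.retain("buffer")
    mem.retain("config")
    mem.release("buffer")
    return mem.live()
```

Under RefCounting the buffer is gone as soon as run_job releases it; under DeferredCollector it lingers until collect() is called. The client code is identical in both cases.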
The ideas used by Figure are not particularly innovative... I'm
mostly trying to collate the different approaches that programmers have
found useful in the past, and structure them in a way that hides their
differences. However, note that this does not introduce much
overhead... you as a programmer are still able to exert as much
platform-dependent control as you need. Figure is a toolbox, not a
religion.
I started designing Figure three years ago, when I was looking for a
way to write an application that would run under both Linux and the
Hurd, while taking advantage of the special features of each platform.
Since that time, I have not found an application framework that would do
the job, and so I began writing Figure.
Now, I am looking for people who know of existing software that fills
this job, who have comments and suggestions, or who want to help with
implementation (including documentation). But before I turn the floor
over to you, I recommend that you take a look at the existing design (if
you feel comfortable with C, nested functions and lots of FIXMEs), which
you can find at the Figure home
page.
Why should I use it?, posted 16 Aug 2000 at 23:45 UTC by kjk »
(Journeyer)
I understand that this post is half an invitation to discussion about
reusability and half an evangelization for the project you're working
on. Nothing wrong with that. I'll only post my comments about the
latter. My comments will be negative, but please don't treat them as a
flame. My intent is to provide a constructive critique.
Two minor points: web site sucks (I know that it's modelled after
Pliant's web site, but their web site
sucks too). Mailing list should have an archive.
One major point: I heard about Figure some time ago (I think you
announced it on Pliant, or I came to your web page through Pliant). I've
also read the current web page carefully. And I still don't know what
Figure is all about. It looks like its goal is the same as
Java's (portable application framework). But I'm not sure. I know it
will be object-oriented, it will have its own programming language
etc. but I don't know how it will help me achieve anything.
So here's a constructive critique part. I assume that the main
purpose of the web page is to attract developers (I might be wrong but
that's my best guess) so think of your visitors as users or customers.
You want to sell them sth. (an idea) and they will have to pay (with
sth. very valuable: their time). So they are developers with very
little time and looking (maybe) for sth. to do. They encounter your
page (e.g. by following a link from Advogato), they read it (if you're
lucky) and they're gone unless you give them a good reason to stay. The
current web page doesn't give a good reason to stay. I would have to
download the sources and probably look at them to get even the
slightest idea on what Figure is all about. What I would like to see is
a (convincing) answer to the following question: how can Figure help
me?
I presume that the goal of Figure is to help in writing programs so the
more precise question is: can Figure help me write better software?
Will I be able to write software faster? If yes, how? Give me concrete
examples (such and such program has been written in half the time it
would take in C and it ran faster on 57 platforms, including PDP-11).
Compare it with other approaches (will it be better than Lisp, Perl,
whatever?). Ok, I know What kind of programs will be able to use it
(well, one sentence claims that it will be useful for all kinds of
programs but this doesn't seem convincing and is unlikely to be true -
there is NO silver bullet). In other words: think about marketing.
Decide who you are targeting Figure at (application developers? script
writers?). Then figure out what they need and convince them that Figure
gives them what they need. And more.
I, too, am a little lost (though I'm not interested in critiquing). I've been studying (or trying to study) Figure for a bit since you mentioned it
in my article on path-relocatable software. I guess I'm trying to figure out how it will help my goals and
others'.
You mentioned in a diary entry that you had a well-received presentation on it; perhaps
you could post the contents of that presentation?
A White Paper would be a very good way of presenting your proofs.
Perhaps, you could post your presentation online too.
code reuse, posted 17 Aug 2000 at 05:30 UTC by mettw »
(Observer)
One thing I've never seen with code reuse projects is a solid study
of which forms of programming result in good reuse and which don't.
If you want Figure to meet its design goals then I think you'll
have to start with such a study rather than just a new idea.
My experience with code reuse evangelism is that people seem to put all
of their faith in yet another programming language or idea. We now
have GNOME putting all of its faith in CORBA. Given the woeful
failure of other such claims I think I'm justified in being skeptical.
But to be more constructive, here are the successes in code reuse that
I've seen:
- Good API
Systems like Java don't achieve good code reuse because of any
characteristic like VM, OOP or so forth. Most code reuse with it
comes from the well designed standard APIs such as DOM. If a programme
only uses the DOM API then it is assured to run regardless of what
DOM library the computer actually has. This is especially important
in cases where every programme has a different GUI library. If there
was a standard GUI API then each library could have wrappers for this
API and therefore you would only need one GUI library on a system.
- Low levelness
The standard C libraries enjoy enormous code reuse mainly because
they only implement the bare bones. There are no policies among the
C libraries that can conflict with a programmer's view of the world
simply because they are too low level to have any policy at all.
- Pipes and lazy eval
Unix pipes and lazy evaluation in some functional languages give
great code reuse because they have a common, unrestrictive way to
share data between programmes. The most important part being that the
second programme doesn't need to know how the first is going to serve up
the data because the first can only serve it up one way - sequentially.
This of course needs some strong glue to tie together the different
programmes/functions by converting output from the format of the first
programme to that of the second. In the shell world you have AWK for
this, in functional programmes the language itself provides the glue.
- Types
C++ templates try to get at something existent in functional
languages, but miss the mark IMO. Basically, in a language like
Haskell you define a function like add a b = a + b and this
function will work for any type for which the `+' operator
makes sense. This is the ultimate in code reuse.
- Flexibility
Going back to UNIX shell utilities, a large part of the code
reuse in them comes from their incredible flexibility. There is
no such thing as a view in the unix shell world so you just don't
see things like the DOM and RAX APIs being needed for differing
views of the same thing.
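The "pipes and lazy eval" item above can be sketched with Python generators. This is only an illustration: the stage names and the "item-N" text format are made up. Each stage sees nothing but a sequence, the glue stage plays the role AWK plays in the shell, and because evaluation is lazy the consumer can stop an endless producer after three items.

```python
import itertools


def produce():
    """First 'programme': serves its data one item at a time, forever."""
    for n in itertools.count(1):
        yield f"item-{n}"


def glue(lines):
    """AWK-like glue: converts the first stage's output format into
    the input format the next stage expects."""
    for line in lines:
        yield int(line.split("-")[1])


def squares(values):
    """Second 'programme': knows nothing about where its input came from."""
    for v in values:
        yield v * v


# Roughly: produce | glue | squares | head -3
first_three = list(itertools.islice(squares(glue(produce())), 3))
```

Swapping produce() for any other iterable of "item-N" strings requires no change to the later stages, which is the reuse being described.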
No matter how much effort you put into getting things right in your
base system though, you'll never get good code reuse without good
engineering on the part of those using it.
Essentials?, posted 17 Aug 2000 at 10:20 UTC by tetron »
(Journeyer)
mettw: you're dancing around a few points, let me see if I can put my
finger on them.
The basics of software reuse are twofold: interfaces and adaptability.
I'll take the second one first. Adaptability or flexibility is not only
how general a case of the problem at hand this code solves, but what
sort of meta-level hooks the code has to be extended with. A good
example of this might be the C library function qsort(). Instead of
enforcing a policy ("this function only sorts integers") for example, it
allows the user to supply their own function pointer which qsort() will
then use when it does comparisons as it sorts. Taking this further out,
you get ideas like parametric polymorphism (which C++ templates try to
be, and is found properly in languages like ML and Haskell), reflection
(the ability to analyze program structure at runtime, as in the Java
reflection API java.lang.reflect) and, way out there, meta-object
protocols such as those found in the Common Lisp Object System (CLOS).
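As a rough illustration of the qsort() idea in Python rather than C: the sorting routine below fixes no policy about what is being sorted, and the caller supplies the comparison. The helper names sort_with, by_length and descending are invented for this sketch; functools.cmp_to_key is the standard-library adapter for three-way comparison functions.

```python
from functools import cmp_to_key


def sort_with(items, compare):
    """Return a sorted copy of items, ordered by a caller-supplied
    three-way comparison (negative, zero or positive), in the spirit
    of the comparison callback that C's qsort() takes."""
    return sorted(items, key=cmp_to_key(compare))


def by_length(a, b):
    """Order strings by length, shortest first."""
    return len(a) - len(b)


def descending(a, b):
    """Order values from largest to smallest."""
    return (a < b) - (a > b)


words = sort_with(["pear", "fig", "banana"], by_length)
numbers = sort_with([3, 1, 2], descending)
```

The same sort_with serves both callers; only the hook differs.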
What these things do is allow the programmer to take the essential
structure of a piece of code and decorate it with her OWN code. The
MVC pattern (Model-View-Controller) is a good architectural example of
this, because it is built on the concept of using events (model changes)
to trigger hooks (viewers) which have been added on at a later time by
the application programmer. The point here is that you are not simply
using the code, but actually changing or augmenting the way it works
for your application. Applications with embedded scripting languages are
another good example of this; the scripting system then lets you reuse
the native application logic (for example, Emacs and elisp).
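A minimal sketch of the event-and-hook structure described above, in Python. The class and method names (Model, attach, set_value) are assumptions made for the example and do not come from any particular MVC framework.

```python
class Model:
    """Holds a value and notifies every attached viewer when it changes."""

    def __init__(self):
        self._value = 0
        self._viewers = []

    def attach(self, viewer):
        """Register a hook; the model neither knows nor cares what it does."""
        self._viewers.append(viewer)

    def set_value(self, value):
        self._value = value
        for viewer in self._viewers:  # the model change is the event
            viewer(value)


# Viewers added later by the application programmer.
log = []
model = Model()
model.attach(lambda v: log.append(f"text view: {v}"))
model.attach(lambda v: log.append("bar view: " + "#" * v))
model.set_value(3)
```

The Model code is reused unchanged while its behaviour is extended from outside, which is the kind of adaptability meant here.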
The other part of reuse is interfaces. These of course define how
exactly one module is going to interact with other modules that need to
use it. Interfaces and extensibility are orthogonal issues, I think;
you can have a useful API that is totally nonextensible, or a very
extensible system with a horrible API. However, the best systems are
going to have both.
One of the dreams of component-based architectures (and when I say
components here I'm also referring to object-oriented systems more
generally) is that no piece of code needs to be written more than once.
Then all other programs can simply use the interface exported by that
component, and everyone is happy. Well, this suffers from several
problems. One is that the API may not match up exactly to your needs
and if it is not extensible as I discussed above you're out of luck.
Another problem is that APIs tend to change over time, which breaks
dependencies. Also, as anyone who has done much object-oriented
programming can attest, components themselves tend to form dense
networks of interdependencies, so that component X depends on component
Y which depends on component Z (and if you're really unlucky, X and Z
will both depend on Q, but different versions of Q).
I should note that interfaces bear many resemblances to (and are in
fact related to) type systems. For example, a pipe is a universal
interface, but it's a lot like "void *" in C. It is completely untyped,
and left up to the parties at each end to make any sort of sense out of
the data being exchanged. They have to agree on their own protocol, and
a change of how data is interpreted at one end of the pipe may
completely throw off the logic at the other end. At the other end you
have strong interfaces, such as CORBA IDL (Interface Definition Language),
which are somewhat like typesafe languages like ML. You get strong
typechecking at compile time, and are basically guaranteed that data
will come in in a certain format, or not at all. If one end sends a bad
message to the ORB, it will (I assume) reject it for not following the
previously-agreed interface. The problem here now becomes that you have
lost the ability to change your interfaces at runtime, which actually
works against the ability to extend your program logic at runtime.
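The contrast between an untyped pipe and a strict interface might be sketched like this in Python. The "width,height" wire format and the Size message are invented for the example, and Python only checks at run time, so this approximates rather than reproduces the compile-time checking an IDL gives you.

```python
from dataclasses import dataclass


def parse_untyped(raw):
    """Pipe-style reader: assumes the writer sends b'width,height'."""
    width, height = raw.decode().split(",")
    return int(width), int(height)


@dataclass(frozen=True)
class Size:
    """Typed message: fields are named and checked at the boundary."""
    width: int
    height: int

    def __post_init__(self):
        for field in ("width", "height"):
            if not isinstance(getattr(self, field), int):
                raise TypeError(f"{field} must be an int")


# The writer silently switches to b'height,width'; the reader cannot tell
# and hands back the numbers in the wrong roles.
misread = parse_untyped(b"600,800")  # writer meant width=800, height=600

# With named, checked fields the reordering is harmless...
ok = Size(height=600, width=800)

# ...and a malformed message is rejected instead of being misinterpreted.
try:
    Size(width="800", height=600)
    rejected = False
except TypeError:
    rejected = True
```

The price is the one noted above: once Size is agreed on, neither side can change it freely.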
It's an interesting problem. If interfaces were a piece of rope, very
strict interfaces would be enough rope to tie yourself up with, and pipes
would be enough rope to hang yourself with. Interfaces are something
that most programming languages don't give you much choice about, of
course. Most of them tend towards the strict, typesafe model with Lisp
being a notable exception; C would be more strongly typed but has a much
too lenient compiler (that is to say, C has a decent type system, but
the compiler tends to ignore it.)
Good interface design is a difficult thing, as you need to balance ease
of use with complexity with overall power, in addition to the
aforementioned problem of what to do when interfaces don't quite
match up. A good API, rather than simply providing certain services,
should completely encapsulate a certain conceptual computational
structure, and more importantly expose that computational structure in
the API at both high and low levels - for example, the OSI (I think
that's the right acronym) layered network model, going from hardware
protocols (ethernet) to routing (IP) and streams (TCP). There, the
application can, at its discretion, select which layer it actually deals
with, or tweak operations at a lower layer to better support the
higher-layer operations.
Whoops! I just realized it's 6:25 in the morning and I want to go to
bed so I'm
going to stop here :-)
Documentation, posted 17 Aug 2000 at 20:19 UTC by gord »
(Master)
Since the primary complaint so far is that Figure is lacking
documentation, I've made that my priority for the next little while.
My only regret is that the docs may not be useable for a while yet,
so it seems this discussion may be premature. On the other hand, I like
the points raised by mettw and tetron.
I will try to post more as soon as I get the chance. In the
meantime, I hope there may be others with their own thoughts on what
portability and reuse entail.
It looks like you're soliciting for ideas, so here are some of my
simple ideas worth two cents.
I've always looked at software reuse in the light of the following:
(1) Abstractions
(2) Syntax and Semantics
(3) Frameworks
Abstractions
Software reuse became possible because of abstraction. In compiler
design, abstraction is done by defining a type for an object and is
known as typing. A type would mean a storage location having defined
operations. For example, an integer might be a 16 bit storage
capable of performing addition and subtraction. Multiplication and
division can then be implemented using addition and subtraction
respectively. With that kind of system, programmers can easily grasp
the idea and reuse it on a language level.
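As a worked illustration of that example, here is a small Python sketch of a 16-bit integer type whose only primitive operations are addition and subtraction, with multiplication and division built on top of them. The names Int16, multiply and divide are invented here, and the two helpers assume non-negative operands (and a positive divisor) to stay short.

```python
class Int16:
    """A 16-bit signed storage location supporting only + and -."""

    def __init__(self, value):
        # Wrap into the signed 16-bit range -32768..32767.
        self.value = (value + 2**15) % 2**16 - 2**15

    def __add__(self, other):
        return Int16(self.value + other.value)

    def __sub__(self, other):
        return Int16(self.value - other.value)


def multiply(a, b):
    """a * b by repeated addition; assumes b >= 0."""
    total, count, one = Int16(0), Int16(0), Int16(1)
    while count.value != b.value:
        total = total + a
        count = count + one
    return total


def divide(a, b):
    """Quotient of a / b by repeated subtraction; assumes a >= 0, b > 0."""
    quotient, one = Int16(0), Int16(1)
    while a.value >= b.value:
        a = a - b
        quotient = quotient + one
    return quotient
```

Callers of multiply and divide reuse the type through its two defined operations and never touch the storage directly.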
Syntax and Semantics
Once the types are defined, the next step is to establish how an
idea gets translated down into an expression that is both readable and
maintainable. Readable in the sense that I can understand with ease
what the expression is trying to accomplish. As a result, it then
becomes maintainable: programmers can add more features as the code
goes through upgrades.
Frameworks
This is an area where most compiler makers stumble. They seem to be
afraid of defining a concrete definition of what kind of framework a
language should have. An example of that would be K&R and Stroustrup
where they believed that it was not the language's responsibility to
force programmers to work inside a framework. That's why we hear or
read Stroustrup stating C++ is a language, not an environment. IMHO, if
these people had managed to define a framework for the language then,
probably, software reuse wouldn't be an issue at all. But that didn't
happen, and now we are seeing projects like Java, Jini, COM, CORBA,
GNOME, KDE and the New Amiga tackling this issue.
Conclusion
Overall, I simply must submit myself to the idea that software reuse at
the framework level will never happen. Or maybe I'm wrong, if only you
can prove it.
OK, I've thought some more about code reuse and I think I may have hit
upon two suggestions for how to get it.
- Expressions
By expressions I mean things like regex and
XPath. These offer enormous flexibility and their use is not
fixed at compile time. Compare for example a series of DOM requests
with a function like xpath(Node*, char*). The former is
limited to how many cases you account for in your code, while the
latter can have the char* argument completely constructed during
execution, giving much greater flexibility. You also won't see DOM and
RAX APIs with XPath because XPath accounts for every view of the
underlying data. So, generally, every data structure should be
examined using expressions rather than an API.
- Cascading Style Sheets
These have many more applications
than just HTML. I came up with this idea while trying to come up with
a structure for a generalised GUI API. My idea was that, rather than telling
the GUI library how to do something, the programme should only tell it
what to do and use CSS to convert this into a GUI. For example, a
programme wouldn't tell a GUI library to put a menu bar at the top and
construct these menus/submenus etc. Instead the programme would tell
the GUI library that it wishes to export the following callbacks
to the UI and would also group them together by what sort of callback
they contain. This then leaves the user free to have his own style
sheet for binding keys to these callbacks, or if he is a hacker or
blind he might have a stylesheet to bind all of these callbacks to
an emacs style minibuffer instead of a menubar. Someone new to the
application, but who wishes to use a minibuffer might have a stylesheet
that binds the callbacks to both a menu system as well as custom
keybindings and a minibuffer. The real point here being that
the setting of policy is taken away from the programmer and given
to the user.
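The style sheet idea above could look roughly like this in Python. Everything here is hypothetical: the "group.action" naming, the dictionary used as a stand-in for a style sheet, and the bind helper are assumptions made for the sketch, not an existing GUI API.

```python
class App:
    """The programme only exports grouped callbacks; it never mentions
    menus, key bindings or layout."""

    def __init__(self):
        self.log = []

    def exported(self):
        return {
            "file": {"open": lambda: self.log.append("open"),
                     "save": lambda: self.log.append("save")},
            "edit": {"undo": lambda: self.log.append("undo")},
        }


# Two user-owned "style sheets": trigger -> "group.action".
KEYS_SHEET = {"C-x C-f": "file.open", "C-x C-s": "file.save", "C-_": "edit.undo"}
MINIBUFFER_SHEET = {"open": "file.open", "save": "file.save", "undo": "edit.undo"}


def bind(callbacks, sheet):
    """Apply a style sheet: map each user-chosen trigger to a callback."""
    bound = {}
    for trigger, target in sheet.items():
        group, action = target.split(".")
        bound[trigger] = callbacks[group][action]
    return bound


app = App()
keys = bind(app.exported(), KEYS_SHEET)
minibuffer = bind(app.exported(), MINIBUFFER_SHEET)
keys["C-x C-s"]()     # a key-binding user saves
minibuffer["undo"]()  # a minibuffer user undoes
```

App never changes when a user swaps one sheet for another, so the policy sits with the user rather than the programmer.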
Heh..., posted 18 Aug 2000 at 14:37 UTC by sab39 »
(Master)
mettw: Three letters - X U L :)
Seriously, I think that things like XUL and Glade, combined with
flexible scripting architectures that give accessibility to the
underlying components of the application, have definite promise in this
area. Aphrodite, for example, is a complete re-implementation of a
XUL-based browser that doesn't share much (any?) XUL or javascript with
the Navigator browser, but can do all the same things through use of the
same components. Basically, XUL allowed the whole user-interface to be
re-implemented from scratch at very little cost (cost == time, of
course).
This has obvious implications for experimentation with good UIs and
alternative placement of elements, but it also has implications for code
reuse - in a XUL/scripting environment, if I want to write (say) an HTML
enabled help application, I just need to pop up a XUL window and embed
the same HTML viewing component that the browser does. If I want an HTTP
connection, I can (presumably) instantiate the scriptable http object.
If I want to construct and send a MIME email from my application, there
are objects for it. Scriptable components go a long way towards code
reuse, and the best way to ensure that the scripting interfaces are
useful is to build your actual application using those scripting
interfaces!
Wide-spread adoption, posted 18 Aug 2000 at 22:31 UTC by kjk »
(Journeyer)
The single most important thing in reuse is widespread
adoption. It's much more important than engineering details like
the stuff already mentioned: good API, abstraction, flexibility etc.
This statement doesn't imply that those are unimportant. It's just that
if you have widespread adoption you can get away with not having the
most brilliant design; if you have the most brilliant design and no
adoption at all, no reuse will happen. You want proof? I could try to
win a popularity contest here and pick on a few technologies from a
certain company we all love to hate but I'll limit myself to the
wonderful world of Unix. Exhibit number one: X Windows. xlib is just
another name for the Evil. Really. People wrote programs in it only
because they had no choice and xlib was there, on every Unix machine.
Then came Motif. Some would say it's even more Evil but people were just
desperate and would do anything not to use xlib directly. And Motif was
there, on every Unix machine (I'm talking about pre-Linux times when
Motif was shipped with every major Unix workstation). Now we have Gtk.
Do people use it because it's oh so great? No. File manager sucks like
you wouldn't believe, list/tree widget is useless if you have a large
number of items, and fonts... My point is: even if you have a less than
stellar design, wide adoption will make up for it tenfold. libc, perl,
Java are all evidence in support of this statement. Conclusion: the
lesson here is that if anyone wants to promote reuse of any technology
(like Gord:Figure) he should strive for technical excellence and wide
adoption, with the latter being more important.
[Don't mind me... I'll just keep posting to this thread.]
As before, I appreciate the comments that have come out thus far.
I just wanted to mention that I have an online demonstration of a bit
of Figure, that you can check out if you're interested in seeing it, but
don't want to bother with the code yet.
Just do telnet fig.org and follow your nose.
I always try to divide the code into 3 different pieces:
- Client is outside the system you're building and outside of your
control. You need to make client's life as simple as possible.
- Co-ordinator is inside your system, but only deals with the
complexity of one system.
- Reusable code is inside your system, but it needs to deal with the
complexity of all systems that it can be used with.
The important part here is that client and co-ordinator are at the same
level of abstraction. They are only designed to work with one system,
while reusable code should be usable with many systems. Co-ordinator
thus hides many details that are irrelevant for the client, but which
reusable code provides for other systems. Co-ordinator's interface
should be simpler than the reusable code ever can be -- this comes from
the fact that it needs to implement a much smaller set of requirements
than the reusable code.
This division of roles/responsibilities gives the following design (UML):
Client <>---->1 Co-ordinator <>----->n Reusable_Class1
Usually Co-ordinator's interface is very simple -- with methods like
DoAllOfIt(), while reusable code has been split into very many small
methods that each do a very simple part of the whole problem.
Maintenance of reusable code usually involves adding new methods to
customize different aspects of the code -- rewrites of the main
algorithm inside reusable code are often needed when new parameters
appear. Soon that code implements support for many parameters and
co-ordinators using it need to do much setup to utilize the code.
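The client / co-ordinator / reusable code split might be sketched as follows in Python. The class names TextFormatter and ReportCoordinator and their methods are made up for illustration; only textwrap.fill is a real standard-library call.

```python
import textwrap


class TextFormatter:
    """Reusable code: many small knobs, because it has to serve many
    different systems."""

    def __init__(self):
        self.width = 70
        self.indent = 0
        self.uppercase = False

    def set_width(self, width):
        self.width = width

    def set_indent(self, indent):
        self.indent = indent

    def set_uppercase(self, flag):
        self.uppercase = flag

    def format(self, text):
        if self.uppercase:
            text = text.upper()
        pad = " " * self.indent
        return textwrap.fill(text, width=self.width,
                             initial_indent=pad, subsequent_indent=pad)


class ReportCoordinator:
    """Co-ordinator: does all the setup one system needs and exposes a
    single simple method to the client."""

    def __init__(self):
        self._formatter = TextFormatter()
        self._formatter.set_width(20)
        self._formatter.set_indent(2)
        self._formatter.set_uppercase(True)

    def do_all_of_it(self, text):
        return self._formatter.format(text)


# Client: one call, no knowledge of widths, indents or case policy.
report = ReportCoordinator().do_all_of_it("solid reusability needs adoption")
```

A second system with different needs would get its own co-ordinator over the same TextFormatter, rather than a change to the client.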
This is also called Facade in the Design Patterns. (Though, the
difference from the above is that a facade is usually done after you
have existing classes and you want to make them simpler, while the
co-ordinators should be designed into the system -- Also Design Patterns
does not mention the important connection between client and the
co-ordinator: that they're at the same level of abstraction, and
co-ordinators are only responsible for one kind of client, while the
subsystem/reusable classes are responsible for all possible clients --
there are usually many co-ordinators for the same reusable classes,
which utilize different aspects of the reusable code.)
For example, this has happened to Gtk+: they've added a large number of
methods to Gtk+ to make it usable in as many contexts as possible. This
allows people to build co-ordinators (Gtk+ based libraries/applications)
that provide simple interfaces to clients... => Most of the existing
code already works this way...