Older blog entries for wingo (starting at number 327)

girations

time

Well, I do declare! It took me until after lunch to realize this day is has a personal significance: it was five years ago today that I came to Spain, not for the first time, but for this time.

Europe's been good to me, but the heartstrings still tug homewards. Here my word choice betrays me. True, there is no one place for me to go back to in the States, a "home" of relationships; but there is something there. Something green, something makeshift; something not entirely settled.

It sure isn't the health care. Or the architecture, for that matter, I notice as I hear the bells chime three, from the office where I sit.

I guess one doesn't have to explain the pull of a native land.

travellin

I'm told that in France one may wish bonne année all throughout January, as long it's the first time you see someone. So bonne année, tubes! Nice to rap at ya.

These waning days of my twenties are somewhat dislocated; or bilocated, perhaps. I spend a fair amount of time in Paris. Modern times, modern relationships, right? So it's me, my girlfriend, and the Talgo. I slept four nights on the overnighter last month, it will be four this month, and next month at least two.

It's not the cheapest way to travel, but I just feel bad about taking planes all the time -- apart from the environmental impact, plane travel just doesn't do a body right. You're alternately treated as a terrorist or a consumer. Your mind doesn't have time to arrive. It just ain't natural.

Anyway, until soon. Ciao!

Syndicated 2010-01-15 14:26:28 from wingolog

gnu, gnome, and the fsf

I have been meaning to write this article for a couple weeks now, ever since the recent GNU Hackers Meeting over in Göteborg. I'd rather be hacking tonight, but some other circumstances make such a report a bit more timely; so, here goes!

GHM

I arrived a little late, having missed the first day's talks. I was particularly irked to have missed Bruno Haible's talk on modularity and extensibility, but I understand that there will be some video up soon.

I didn't know enough at that time to miss having seen José Marchesi's talk, but now that I know the fellow, it is a shame indeed. Actually that was one of my biggest take-aways from that event: that I never really knew GNU as a community of maintainers. GNU in 2009 is not the kind of organization that flies people all over to meet each other, and it's to our loss. I hope to see much more of GNU hackers in the future.

So everyone that goes to conferences knows why they are great: the hallway track. Or the bar track, as it might be. So what were the topics?

GNU is not FSF

This point really surprised me: that GNU is not the Free Software Foundation. There a relationship, but they are not the same thing, not by a long shot.

So here's the deal. In the beginning, there was GNU. But Richard realized that in the world of 1985, with proprietary software on all sides, it would be easier to defend the small but growing software commons if hackers collaborated by assigning their copyright to one U.S.-based organization. The FSF was set up as the legal entity associated with GNU, with RMS at its head.

25 years later it's still like that. The copyright assignment paperwork that every GNU contributor signs has some clauses that obligate the FSF to keep their conributions free (in the Free Software sense), but ultimately trust in the FSF is a trust in RMS, and in his principles. It is a testament to RMS's character that there are 35000 lines in fencepost.gnu.org:/gd/gnuorg/copyright.list, representing at least 5000 contributors.

But what's up now? As free software became more and more successful, it became clear that there were other ways to get involved than just writing code. There's all kinds of advocacy work that needs doing, for example. The FSF was a natural concentration point for these efforts.

However, also inevitably, with the influx of people, the composition of the FSF changed. There are very few hackers in the FSF now. There is RMS of course, whose work these days doesn't involve programming, but that's about it. Recently Mako was added, which is an important step to redressing that situation, and more on that later.

I mean, look at www.gnu.org and www.fsf.org. See the difference?

So the advocacy and campaigning is with the FSF, and the hackers are with GNU. I think every GNU hacker is really down with the message of freedom; I mean, we are the ones hacking the hack. But, as a majority, the GNU hackers are not down with "Windows 7 Sins" or "Bad Vista" or most of these negative campaigns the FSF has been running recently.

I say this as a GNU maintainer, not as a representative of GNU, but I believe I have the facts right.

GNU and RMS

All GNU hackers respect RMS. We respect his ethical principles, his vision, his tenacity, and his hack. I mean, GNU Emacs: this, for most of us, is the best software in existence. The early days of GCC. Texinfo. Really remarkable contributions, these, without which the world would be a poorer place.

(If you disagree, that's cool; but I'm just trying to explain where we come from. I guess also I should clarify what I mean by GNU here. I mean, "people who have signed paperwork to assign copyright to the FSF". I know that mentioning the FSF there is a bit ironic, but it's a strict definition.)

But nowadays, while RMS's principles remain strong (thankfully), he's a bit absent, technically. That would be OK -- he has certainly put in his hack-time -- but that the GNU project doesn't really have a means to make technical decisions on its own right now. Each maintainer can mostly make decisions about their modules, and we can talk to each other (mostly via gnu-prog-discuss), but there is no governance structure for the GNU project itself.

Worse, sometimes RMS decides things without any input at all, when such input really is needed. The decision to bless Bazaar as the official GNU version control system, for example, really sits poorly with a lot of people. The adoption of SCM and MIT Scheme as two additional Schemes in the GNU project also are real WTF decisions.

These "blessings" don't have much of an effect on people's behavior -- most active GNU modules use git, for example -- but they make people lose faith in GNU's technical coherence.

The issue here is that the GNU project is a community of people working for software freedom, yes, but we have some specific values binding us together. One, the ethical principles of supporting user freedom; more on that later. Secondly, there is a technical vision of a "GNU project" as a cohesive, well-thought-out, integrated technical whole. One can now argue about the extent to which that is currently true, but it is a very important value to GNU hackers.

Things were more or less fine when there was more of UNIX that needed to be reimplemented, and when RMS himself was more on the ball. But now there is a significant measure of dissatisfaction within GNU itself about the way decisions are made. There are some steps being taken to fix this (a recently created GNU advisory board, for example), but it is ironic that some decisions in GNU really do come from one man.

At the same time there is actually a widespread concern within GNU about what would happen if decision making were to be made more open. No one wants outsiders making decisions, it would still have to be maintainers; but there is concern that things might get out of hand, that there might be too many pointless discussions, that bad decisions could be made, that there would be a tyranny of the talkers, etc. This is why things are changing really slowly. Everyone wants more autonomy, but they don't want to lose the solidarity.

GNU and GNOME

So what's the deal with GNU and GNOME? Everything, and nothing. Allow me to explain.

GNU people see GNOME as an "outside" thing at this point. To most GNU hackers, GNOME is not quite GNU. GNOME doesn't follow GNU's coding standards (recall the point about technical integrations), assigns no copyright to the FSF, and seems to prefer LGPLv2+ and not GPLv3+. There is hardly any communication between GNOME people and GNU people. In GNOME, people look at problems their systems are having, and decide to solve them within GNOME -- PolicyKit, for example.

(I don't want to pick on David's excellent software; but I'm equally sure that it never crossed David's mind to suggest PolicyKit for inclusion into the GNU project. There are many similar examples.)

Hell, many GNU people even use things like Sawfish or Xmonad or StumpWM or other such things. Admittedly, GNU folks would be more likely to install GNOME on their cousin's computer; but as for themselves, Emacs is the only program many people need.

GNOME people on the other hand are in two camps, more or less. Broadly speaking these camps correspond to Free Software and to Open Source, respectively. The former feel a strong heart-tie with the GNU project; perhaps they started working on GNOME back when it really was a GNU project, or perhaps they started with it because they believed in GNU's ethical principles and chose that environment because they wanted to spread user freedom to their friends. As I say, most GNOME people are disconnected from GNU per se, so this tie really is more cultural than practical, but it is there, and it is strong in its own way.

The Open Source people tend to value GNOME in particular from the days when Qt was GPL-only, or even proprietary, and GNOME allowed them to develop open source applications as well as proprietary ones. Open Source people appreciate the technical comraderie, as well as the technical excellence of the platform, but often disagree with the GNU project's copyleft principles, instead appealing to individual choice as to whether to use or make proprietary software or not.

I hope I haven't been uncharitable to anyone here. Please correct me in the comments if so.

is GNOME GNU?

So, Andre posts, wondering about the relationship between GNU and GNOME. I hope I have been able to add some broader context to the question.

I think that Andre's characterization of RMS as "fascistic" is totally wrong. There are serious problems of decision-making within the GNU project, yes, but "fascistic" takes it a bit too far, and almost brushes Godwin's Law ;)

The particular context in which the discussion has been brought up, that nasty thread on foundation-list, is unlikely to bring about any mutual understanding. It is ironic that the topic is the code of conduct, which was designed to promote understanding.

Bottom line, GNOME can be GNU if it wants to. But I don't think a decision is necessary right now, and certainly not on foundation-list, not in that thread of the talkers.

GHM redux

I promised to talk more about hallway tracks at the GHM, but I've run out of semicolons. Until next time!

Syndicated 2009-12-13 21:33:44 from wingolog

maxwell's equations of software

There was some interesting feedback on my last article, and I ended up wanting to say too much in reply that here I am again, typing at the ether.

Stephen reflects the common confusion that somehow this is just a little file that might or might not work. No man, this is Guile! That's the implementation of Guile's eval. So unit tests, yes, we have more than 12000 of them. Only some of them specifically test the evaluator, but all of them go through the evaluator.

But to be honest, I think unit tests help, but when I'm hacking the compiler the most useful test is to simply rebuild the compiler. If it successfully bootstraps, usually we're doing pretty well.

Zed raises a more direct criticism:

This is why I hate lisp: [link to my article] Long and dense, no comments, not semantic. That's "awesome code"?

I don't think I used the word "awesome", but yes, I believe so, in the sense of "inspiring awe" -- at least for me.

I'm going to call on my homie Alan Kay for some support here. There was an oft-cited interview with Alan Kay a few years back, when he described eval as "Maxwell's equations of software". I quote:

[Interviewer:] If nothing else, Lisp was carefully defined in terms of Lisp.

[Alan Kay:] Yes, that was the big revelation to me when I was in graduate school—when I finally understood that the half page of code on the bottom of page 13 of the Lisp 1.5 manual was Lisp in itself. These were “Maxwell’s Equations of Software!” This is the whole world of programming in a few lines that I can put my hand over.

I realized that anytime I want to know what I’m doing, I can just write down the kernel of this thing in a half page and it’s not going to lose any power. In fact, it’s going to gain power by being able to reenter itself much more readily than most systems done the other way can possibly do.

Just for context, here is that half-a-page -- Maxwell's equations of software:

evalquote is defined by using two main functions, called eval and apply. apply handles a function and its arguments, while eval handles forms. Each of these functions also has another argument that is used as an association list for storing the values of bound variables and function names.

 evalquote[fn;x] = apply[fn;x;NIL]

where

 apply[fn;x;a] =
       [atom[fn] → [eq[fn;CAR] → caar[x];
                    eq[fn;CDR] → cdar[x];
                    eq[fn;CONS] → cons[car[x];cadr[x]];
                    eq[fn;ATOM] → atom[car[x]];
                    eq[fn;EQ] → eq[car[x];cadr[x]];
                    T → apply[eval[fn;a];x;a]];
        eq[car[fn];LAMBDA] → eval[caddr[fn];pairlis[cadr[fn];x;a]];
        eq[car[fn];LABEL] → apply[caddr[fn];x;cons[cons[cadr[fn];
                                                   caddr[fn]];a]]]

 eval[e;a] =
       [atom[e] → cdr[assoc[e;a]];
        atom[car[e]] → [eq[car[e];QUOTE] → cadr[e];
                        eq[car[e];COND] → evcon[cdr[e];a];
                        T → apply[car[e];evlis[cdr[e];a];a]];
        T → apply[car[e];evlis[cdr[e];a];a]]

pairlis and assoc have been previously defined.

 evcon[c;a] = [eval[caar[c];a] → eval[cadar[c];a];
               T → evcon[cdr[c];a]]

and

 evlis[m;a] = [null[m] →  NIL;
               T → cons[eval[car[m];a];evlis[cdr[m];a]]]

From the LISP 1.5 Programmer's Manual

What a mess! Where are the unit tests? Where are the comments? A little bit of whitespace, please! They use "cdar", "caar", and "cadr". They should be using named accessors! What if you eval an atom that's not found in the association list? Did they really name a function "evlis"? Et cetera.

Now, I do think the LISP 1.5 (it wasn't yet spelled "Lisp" then) evaluator is awesome, as it was in McCarthy's first papers on the subject. If you disagree with that, that's cool, I see why one would "hate" my poor imitation.

But given the Lisp history of meta-circular evaluators, one doesn't need to comment every one as if it were the first. In the same way that one recognizes an if statement without the need for a design pattern behind it, I would want anyone who's hacking on Guile's evaluator to have read or perhaps even written several evaluators; and once you've written one, you know the pattern.

More substantively, Charles speculates on source-to-source transformations to ease interpretation. I admit almost total ignorance regarding PreScheme; I've been meaning to learn about it for years now. But the point of this evaluator was to have it use the same representation as VM-compiled procedures; ideally it should run fast, but the first priority is for it to run on the VM itself instead of on a separate stack. There's certainly many more clever things that can be done there, and thankfully since eval is implemented in Scheme and compiled like anything else, it will also reap advantages of an improving compiler.

Regarding Emacs, I hope to say more on that point within a week or so, when the video for my talk at the recent GNU Hackers Meeting gets put up.

Finally, Bubo asks if this work is in a released tarball. Not yet is the answer, but it will be in Tuesday's 1.9.6 release.

Happy hacking!

Syndicated 2009-12-11 14:23:10 from wingolog

in which our protagonist forgoes modesty

I wrote the most beautiful code of my life last week.

I would like to explain it to you all, but I have to tell a little bit of backstory first.

the distant future, the year 2000

Until earlier this year, Guile has been an interpreted Scheme. Guile ran your code by parsing it into tree-like data structures, then doing a depth-first evaluation of that tree. The evaluation itself was performed with a C function, ceval, which would recursively call itself when evaluating sub-expressions.

ceval was OK, but not so great. Instead of recursively traversing a tree, it's better to pre-examine the expressions you're going to evaluate, and then emit linear sequences of code to handle those steps. That's to say that it's better to have a compiler than an interpreter. So I dug up Keisuke Nishida's old bytecode compiler that he wrote back in 2001, and eventually hacked it into a shape that we merged it into Guile itself.

That was a pretty sweet hack, to retrofit a compiler into Guile. But it wasn't as beautiful as the code I wrote last week.

the present

See, the problem was that now we had two stacks: the C stack that ceval used, and the virtual machine stack used by byte-compiled code. This was a debugging headache, as to get backtraces you had to ping-pong back and forth between the two stacks, interleaving their frames together in the right order. Also with two stacks it's practically impossible to write a real debugger that does single-stepping, inspection and modification of stack frames, etc.

The two-stack solution (ahem) had another problem: you couldn't tail-call between interpreted and compiled code, because the procedures used different stacks. Normally this wasn't a big deal because all code was compiled, but it would occasionally bite you. (The usual case would be when you had a compiled Guile, but just pulled new code from the git repository, then tried to compile it again -- so some of your compiled code was out-of-date and therefore not loaded, you had a mix of compiled and interpreted code in some important places, and your loop starts consuming stack.)

Finally, interpreted code behaved differently than compiled code in some cases. For example, consider the following code:

;; Returns two values: the value, if found,
;; and a flag indicating success.
(define (table-lookup table key)
  (let ((handle (assq key table)))
    (if handle
        (values (cdr handle) #t)
        (values #f #f))))

(define (trace-call f . args)
  (let ((result (apply f args)))
    (format #t "\nfunction returned ~a\n" result)
    result))

(trace-call table-lookup '((x . y)) 'x)

So if I try this at my Guile 1.8 prompt, I get this:

guile> (trace-call table-lookup '((x . y)) 'x)
function returned #<values (y #t)>
$1 = y
$2 = #t

We see that trace-call returns two values, and the tracing printout shows a "multiple values object" -- a Scheme object like any other, but that the primitive call-with-values knows how to destructure. The toplevel repl is wrapped in a call-with-values, so t and #t print separately.

Now if I fire up Guile 1.9, let's see what we get:

> (trace-call table-lookup '((x . y)) 'x)
function returned y
$1 = y

Guile 1.9's repl compiles its expressions, by default, and indeed we see different behavior -- the trace printout has a naked value, y, and only one value is returned.

Both of these behaviors are compatible with standard Scheme from R5RS on. The origin of the difference is that the behavior of values within a continuation that was not created with call-with-values is unspecified. Relatedly, it is not specified what will happen when you return N values to a continuation accepting M values, where N != M.

What's happening is that the compiler actually has two return addresses in each stack frame -- one for the normal singly-valued case, and one for multiple values. values will return to the multiple-value return address (MVRA), and anything else will go to the normal return address. So actually, compiled code can choose what to do when it gets multiple values. Instead of raising an error when two values are returned to the (let ((result ...)) ...) continuation, Guile chooses to do what you (probably) expect and just drop the second value.

In contrast, with a C evaluator, even noticing that two values were returned to a singly-valued continuation is a pain -- because you have to check and branch every time you recursively call ceval to see if you're getting a multiple-values object.

But I digress. I promised something nice, and here I am noodling about something else.

exit strategy

The solution to all these problems, of course, is to use just one stack, and have that stack be the same as the one that compiled code uses.

Practically what this means is that eval should not be a C function, because Guile does not compile to C; it should be something that ends up as compiled code.

(For now, compiled code is bytecode, run on the VM. I'm being a little vague here because Guile doesn't do native compilation yet, but it will, within a year or two, and the same considerations apply.)

I actually toyed with the idea of writing a hand-coded eval in VM bytecode, but I came to my senses soon enough, and the answer was delightful.

eval in scheme

Of course! Scheme's eval should be written in Scheme itself. Then we just compile it to bytecode, like any other Scheme procedure.

At this point, anyone who's actually had to do Scheme at university (not me) will recognize this as the meta-circular evaluator pattern. To be honest I had never written one before -- and I think the reason was that they always seemed so peripheral. When you write a meta-circular evaluator, the language you really work in is the one that implements the meta-circular evaluator, not the one implemented by the evaluator -- or at least, that's the case if you're trying to get something done, rather than learn about language.

But this is different. This time the meta-circular evaluator actually sits at the heart of Guile -- in fact, we use eval, as implemented in Scheme, and compiled to bytecode, to compile the compiler -- which itself is written in Scheme of course.

In the end, though, you have to have a Scheme compiler to compile eval.scm itself, so we do end up keeping around an evaluator in C. Its only purpose is to interpret the compiler, so we can compile eval.scm: then the compiled version of eval.scm compiles the rest of Guile, including the compiler.

Another option would have been to require a new-enough version of Guile itself to compile the compiler. But I want to be able to sanely bootstrap Guile's compiler, so that's out of the question. We could implement the compiler in portable Scheme, but that would forbid the compiler from making use of any of Guile's niceties.

the code

So here it is (and below). I don't claim that it is actually the most elegant code I have written, though I can think of none better at the moment; nor is it the fastest code, nor the most concise. But it sits in such a powerful place, and in so few lines, that I cannot help but to be pleased with it.

(define primitive-eval
  (let ()
    ;; The "engine". EXP is a memoized expression.
    (define (eval exp env)
      (memoized-expression-case exp
        (('begin (first . rest))
         (let lp ((first first) (rest rest))
           (if (null? rest)
               (eval first env)
               (begin
                 (eval first env)
                 (lp (car rest) (cdr rest))))))
      
        (('if (test consequent . alternate))
         (if (eval test env)
             (eval consequent env)
             (eval alternate env)))
      
        (('let (inits . body))
         (let lp ((inits inits) (new-env (capture-env env)))
           (if (null? inits)
               (eval body new-env)
               (lp (cdr inits)
                   (cons (eval (car inits) env) new-env)))))
      
        (('lambda (nreq rest? . body))
         (let ((env (capture-env env)))
           (lambda args
             (let lp ((env env) (nreq nreq) (args args))
               (if (zero? nreq)
                   (eval body
                         (if rest?
                             (cons args env)
                             (if (not (null? args))
                                 (scm-error 'wrong-number-of-args
                                            "eval" "Wrong number of arguments"
                                            '() #f)
                                 env)))
                   (if (null? args)
                       (scm-error 'wrong-number-of-args
                                  "eval" "Wrong number of arguments"
                                  '() #f)
                       (lp (cons (car args) env)
                           (1- nreq)
                           (cdr args))))))))

        (('quote x)
         x)

        (('define (name . x))
         (define! name (eval x env)))
      
        (('apply (f args))
         (apply (eval f env) (eval args env)))

        (('call (f . args))
         (let ((proc (eval f env)))
           (let eval-args ((in args) (out '()))
             (if (null? in)
                 (apply proc (reverse out))
                 (eval-args (cdr in)
                            (cons (eval (car in) env) out))))))
      
        (('call/cc proc)
         (call/cc (eval proc env)))

        (('call-with-values (producer . consumer))
         (call-with-values (eval producer env)
           (eval consumer env)))

        (('lexical-ref n)
         (let lp ((n n) (env env))
           (if (zero? n)
               (car env)
               (lp (1- n) (cdr env)))))
      
        (('lexical-set! (n . x))
         (let ((val (eval x env)))
           (let lp ((n n) (env env))
             (if (zero? n)
                 (set-car! env val)
                 (lp (1- n) (cdr env))))))
        
        (('toplevel-ref var-or-sym)
         (variable-ref
          (if (variable? var-or-sym)
              var-or-sym
              (let lp ((env env))
                (if (pair? env)
                    (lp (cdr env))
                    (memoize-variable-access! exp (capture-env env)))))))

        (('toplevel-set! (var-or-sym . x))
         (variable-set!
          (if (variable? var-or-sym)
              var-or-sym
              (let lp ((env env))
                (if (pair? env)
                    (lp (cdr env))
                    (memoize-variable-access! exp (capture-env env)))))
          (eval x env)))
      
        (('module-ref var-or-spec)
         (variable-ref
          (if (variable? var-or-spec)
              var-or-spec
              (memoize-variable-access! exp #f))))

        (('module-set! (x . var-or-spec))
         (variable-set!
          (if (variable? var-or-spec)
              var-or-spec
              (memoize-variable-access! exp #f))
          (eval x env)))))
  
    ;; primitive-eval
    (lambda (exp)
      "Evaluate @var{exp} in the current module."
      (eval 
       (memoize-expression ((or (module-transformer (current-module))
                                (lambda (x) x))
                            exp))
       '()))))

Syndicated 2009-12-09 20:51:00 from wingolog

codification

Quoth Johnny Rotten: "You're only twenty nine, got a lot to learn". In my case, both conditions hold, and, regarding the latter:

Who knew?

Syndicated 2009-11-17 21:39:42 from wingolog

hacker culture, permaculture

callings

Here's an idea: of everyone out on the ether brushed by these bytes, there has to be a good number of us that are in "it" not just for the game, but for life: in the sense that our actions can be life-affirming, that our interactions can help bring about a more beautiful world.

So, with that realization in mind, I call "book club". Let's read a book together!

What book, you ask? Here's mine for now: Permaculture: Principles and Pathways Beyond Sustainability. It's by David Holmgren, one of the originators of the permaculture design system.

I think that Holmgren has as much to say about how we live life as Alexander. He's also a builder, in a way. It really seems to me that Holmgren's work fosters Alexander's quality-without-a-name. All that is by way of introduction, to say that people that like Alexander, of whom there are many in the hacker world, might well enjoy Holmgren.

But why this book now? The answer is that Lyn Gerry, the host of the radio show, Unwelcome Guests, is reading it to us: an hour every week. It's nice to hear a book. It's also nice to hear it like this, over time, so every part has a chance to seep in.

So. For the first installment I'll give a direct link to the MP3: here. It starts with a poem and a tune, then the reading. I'm not sure where the reading starts in second installment, from this week's episode, because I just downloaded it. Usually it's at the start of the second hour -- the hours are separated in the direct downloads -- but I always listen to the whole thing anyway, so I download them both. There's a podcast link too on the archives page.

I would like for our conversations about the book to be open, in the sense that radio is open, for people to tune in and listen to if they want. By that I mean to say let's not have a mailing list -- what do people think about seeing if we can have a cross-blog-and-comments, cross-identica/twitter discussion? That could fail of course, but it sounds like a nice thing to try.

Anyway, let me know if you want to join. I know that if you're interested, that will make at least two of us.

cold, cold part of the world

I'm off to Sweden tomorrow for the 2009 GNU Hackers Meeting, co-located with FSCONS, the Free Society Conference and Nordic Summit. I'll talk at the GHM about recent developments in Guile, and Guile's place within the GNU universe. If you're going to be at FSCONS, let's meet up!

Syndicated 2009-11-10 22:53:15 from wingolog

class redefinition in guile

Yes hello hello!

Long-time readers will perhaps recall this diagram:


figure zero: things as they were.

It comes from an article describing how Guile represents its objects in memory, with particular concern for Guile-GNOME.

I was hacking on this code recently, and realized that this representation was not as good as it could be. Our switch to the BDW garbage collector lets us be more flexible with the size of our allocations, and so we can actually allocate the "slots" of an object inline with the object itself:


figure one: things how maybe they could have been.

Alas, during the hack, I discovered a stumbling block: that this representation doesn't allow for classes to be redefined.

redefine a data type, what?

Yes, Guile's object-oriented programming system (GOOPS) allows you to redefine the types of your data. It's OK! CLOS lets you do this too; it's an old tradition. Redefining a class at runtime allows you to develop by incremental changes, without restarting your program.

Of course once you change a class, its instances probably need to change too, probably reallocating their slots. So we have to reintroduce the indirection -- but allowing for locality in the normal, non-redefined case. Like this:


figure two: things how they are, almost.

So updating an instance is as simple as swapping a pointer!

Almost.

Well, not really. This is really something that's unique to Lisp, as far as I can tell, and not very widely-known in the programming community, and hey, I didn't completely understand it -- so man, do I have a topic for a blog or what.

step one: make a hole in the box

The way redefinition works is that first you make a new class, then you magically swap the new for the old, then instances lazily update -- as they are accessed, they check that their class is still valid, and if not update themselves. It's involved, yo, so I made a bunch of pictures.


figure three: class redefinition begins with defining a new class.

So yeah, figure three shows the new class, lying in wait beside the old one. Then comes the magic:


figure four: same identity, different state.

What just happened here? Well we just swapped the vtable and data pointers in the old and new classes. For all practical purposes, the old class is the new class, and vice versa. All purposes except one, that is: eq?. The old class maintains its identity, so that any code that references it, in a hash table for example, will see the same object, but with new state.

The class' identity is the same, but its state has changed. That's the key thing to notice here.

Now we mark the old class's data as being out of date, and the next time its instances check their class... what? Here we reach another stumbling block. The old class has already has new state, so it is already fresh -- meaning that the instance will think nothing is wrong. It could be that the instance was allocated when its class declared two slots, but now the class says that instances have three slots. Badness, this; badness.

So what really needs to happen is for instances to point not to the identity of their classes, but to the state of their classes. In practice this means pointing directly to their slots. This is actually an efficiency win, as it removes an indirection for most use cases. Comme ça:


figure five: instances actually point to class state, not class identity.

As we see in the figure, a well-known slot in the class holds the redefinition information -- normally unset, but if the class is invalidated, it will allow the instance to know exactly which version of the class it is changing from and to.


figure six: new equilibrium.

And finally, figure six shows the new state of affairs -- in which slot access has been redirected for all redefined classes, and all of their instances, transitively.

efficiency

All in all, this is quite OK efficiency-wise. Instance data is normally local, and class data is one indirection away. A redefined instance will have nonlocal data, but hey, not much you can do otherwise, without a copying collector.

There is one efficiency hack worth mentioning. Accessors, discussed in an earlier article, don't need to check and see if their class is up to date or not. This is because they are removed from the old class and re-added to the new one as part of the redefinition machinery.

summary

Redefinition is complicated, but pretty neat.

really, that's the summary?

Yes.

Syndicated 2009-11-09 14:16:27 from wingolog

optionals, keywords, oh my!

OK! Where were we?

In my last dispatch, I talked about case-lambda in guile. The gist of it is, let procedures parse their own arguments, and they can do neat stuff like multiple-arity dispatch.

Also, neat stuff like optional and keyword arguments! Consider our my-write example from last time:

(define my-write
  (case-lambda
    ((obj port) (write obj port))
    ((obj)      (my-write obj (current-output-port)))))

It's a little silly, to write it this way. It's not essentially one procedure with two different bodies, it's one procedure with one required argument, and one optional argument. The optional argument defaults to (current-output-port).

So, as you would imagine, there is a better way to express this "design pattern": lambda*, and its sugary friend, define*.

In this case, we would simply define my-write like so:

(define* (my-write obj #:optional (port (current-output-port)))
  (write obj port))

So nice, so clear. Default values are only evaluated if the argument is missing. (It's a rare Python programmer that's not surprised about Python's behavior in this regard; but I digress.)

keyword args too

Optional arguments are good at allowing for concision and extensibility, but code that uses them can be confusing to read. Actually this is a problem with positionally-bound arguments in general.

I like how Carl Worth puts it: that nice prototypes can result in inscrutable code. His solution in C is to have function names encode their arities, but we can do better in Scheme, with keyword arguments.

So let's say we want to add a "detailed" argument to my-write. We can add a keyword argument:

(define* (my-write obj
                   #:optional (port (current-output-port))
                   #:key (detailed? #f))
  (if detailed?
      (format port "Object ~s of type ~s" obj (class-of obj))
      (write obj port)))

Invocations are really nice to read:

(my-write 'foo #:detailed? #t)
=| Object foo of type #<<class> <symbol> 8c4fca8>

(my-write 'foo (open-output-file "foo.log") #:detailed? #t)
; writes the same thing to foo.log

The second example gives an explicit port; and indeed, I am left wondering what it is, when I read it. Keyword arguments make for more readable code.

Keyword arguments also allow for better extensibility. But don't take it from me, take it from P. Griddy:

Most of the operators in Rtml were designed to take keyword parameters, and what a help that turned out to be. If I wanted to add another dimension to the behavior of one of the operators, I could just add a new keyword parameter, and everyone’s existing templates would continue to work. A few of the Rtml operators didn’t take keyword parameters, because I didn’t think I’d ever need to change them, and almost every one I ended up kicking myself about later. If I could go back and start over from scratch, one of the things I’d change would be that I’d make every Rtml operator take keyword parameters.

-- Paul Graham, from a talk he gave back when he didn't talk about startups so durn much

And, there's one more thing, which applies both to optional and keyword arguments: the default values are evaluated in the lexical context of their preceding arguments. So you can have a later argument referring to an earlier one. For example, Guile's compile is defined like this:

(define* (compile x #:key
                  (from (current-language))
                  (to 'value)
                  (env (default-environment from))
                  (opts '()))
  ;; wizardly things here
  ...)

See how env's default value references from? Awesome, yes? I thought so.

newness

So what's new about all this? Not much, semantically. Guile has supported lambda* and define* for more than 10 years. But now they are available in the default environment, and they are fast fast fast -- for the same reasons that case-lambda is faster now. There are special opcodes to process stack arguments into optionals, and to shuffle and bind keyword arguments, all without consing a single cell.

Also, now the toolchain knows about optional and keyword arguments, so that backtraces and printouts show them nicely. For example, my-write prints like this:

#<program my-write (obj #:optional port #:key detailed?)>

Ah, there is case-lambda*; though it is of dubious utility, given that it can only reasonably dispatch on the required and optional arity, and not on keyword args. But there it is.

In any case, I look forward to using lambda* more in the future, without speed trepidations. Just say no to rest arguments masquerading as optionals!

Syndicated 2009-11-08 12:33:29 from wingolog

case-lambda in guile

Oh man, does the hack proceed apace. I really haven't had time to write about it all, but stories don't tell themselves, so it's back behind the megaphone for me.

Guile is doing well, with the monthly release train still on the roll. Check the latest news entries for the particulars of the past; but here I'd like to write about a couple aspects of the present.

First, case-lambda. The dilly here is that sometimes you want a procedure that can take N or M arguments. For example, Scheme's write can be invoked as:

(write "Hi." (current-output-port))
=| "Hi."

(=| means "prints", in the same way that => means "yields".)

But actually you can omit the second argument, because it defaults to the current output port anyway, and just do:

(write "Hi.")
=| "Hi."

Well hello. So the question: how can one procedure take two different numbers of arguments -- how can it have two different arities?

The standard answer in Scheme is the "rest argument", as in "this procedure has two arguments, and put the rest in the third." The syntax for it is not very elegant, because it introduces improper lists into the code:

(define (foo a b . c)
  (format #t "~a ~a ~a\n" a b c))
(foo 1 2 3 4)
=| 1 2 (3 4)

You see that 1 and 2 are apart, but that 3 and 4 have been consed into a list. Rest args are great when your procedure really does take any number of arguments, but if the true situation is that your procedure simply takes 1 or 2 arguments, you end up with code like this:

(define my-write
  (lambda (obj . rest)
    (let ((port (if (pair? rest)
                    (car rest)
                    (current-output-port))))
      (write obj port))))

It's ugly, and it's not expressive. What's more, there's a bug in the code above -- that you can give it 3 arguments and it does not complain. And even more than that, it actually has to allocate memory to store the rest argument, on every function call. (Whole-program analysis can recover this, but that is an entirely different kettle of fish.)

The solution to this is case-lambda, which allows you to have one procedure with many different arities.

(define my-write
  (case-lambda
    ((obj port) (write obj port))
    ((obj)      (my-write obj (current-output-port)))))

implementation

You can implement case-lambda in terms of rest arguments, with macros. Guile did so for many years. But you don't get the efficiency benefits that way, and all of your tools still assume functions only have one arity.

Probably the first time you make a VM, you encode the arity of a procedure into the procedure itself, in some kind of header. Then the opcodes that do calls or tail-calls or what-have-you check the procedure header against the number of arguments, to make sure that everything is right before transferring control to the new procedure.

Well with case-lambda that's not a good idea. Actually if you think a bit, there are all kinds of things that procedures might want to do with their arguments -- optional and keyword arguments, for example. (I'll discuss those shortly.) Or when you are implementing Elisp, and you have a rest argument, you should make a nil-terminated list instead of a null-terminated list. Et cetera. Many variations, and yet the base case should be fast.

The answer is to make calling a procedure very simple -- just a jump to the new location. Then let the procedure that's being called handle its arguments. If it's a simple procedure, then it's a simple check, or if it's a case-lambda, then you have some dispatch. Indeed in Guile's VM now there are opcodes to branch based on the number of arguments.

So much for the VM; what about the compiler and the toolchain? For the compiler it's got its ups and downs. Instead of a <lambda> that just has its arguments and body, it now has no arguments, and a <lambda-case> as its body. Each lambda-case has an "alternate", the next one in the series. More complicated.

Then you have the debugging information about the arities. The deal here is that there are parts of a procedure that have arities, probably contiguous parts, and there are parts that have no arity at all. For example, program counter 0 in most procedures has no arity -- no bindings have been made from the arguments to local variables -- because the number of arguments hasn't been checked yet. And if that check fails, you'll want to show those arguments on your stacktrace. Complication there too.

And the introspection procedures, like procedure-arguments and such, all need to be updated. On the plus side, and this is a big plus, now there is much more debugging information available. Argument names for the different case-lambda clauses, and whether they are required or rest arguments -- and also optional and keyword arguments. This is nice. So for example my-write prints like this:

#<program my-write (obj port) | (obj)>

So yeah, Guile does efficient multiple-arity dispatch now, and has the toolchain to back it up.

Next up, efficient optional and keyword arguments. Tata for now!

Syndicated 2009-11-07 11:52:25 from wingolog

digital interfaces

I made some tomato and red pepper soup for lunch yesterday. Before I had a chance to eat it, the universe decided it still needed more red, and that I should try something stupid with a pocketknife. I sliced up my left forefinger and thumb pretty good.

This blog seems to be specializing in thoughts just before blood, so here it is: ah fuck, going to have to get stitches. I knew that in the first second.

Thankfully, there's a CAP (Centre d'Atenció Primària) in most neighborhoods, so after laying down on the couch to make sure I wouldn't faint, and grabbing chocolate from the cupboard, considering I hadn't yet eaten the soup, I walked the 10 minutes to the CAP, my hastily bandaged hand held high. People looked at me funny.

An hour and a half later, well, four stitches in the index finger, three on the thumb. But I'm ok.

* * *

I hear that some family of mine is going to these "tea party" protests. If you're not plugged into the States political scene, the deal is this: the Republican brand is broke, and everyone knows it. But there is so much anger at their base. So voila Republican anger without the Republican state trappings, a catchment of the neofascist tendencies in all of us, whipped up around a symbol: the idea that Obama is a foreign element, an outsider, not of us, coming to enslave us all.

One of my family writes, referring to the return of another from these "tea party" protests:

When you are released and your tracking anklet has been removed. . . what do you say we move to Montana and prepare for the movie Red Dawn? We don't have to paint Wolverine on the side of every Afghan or Obamian tank that we destroy, but we can live off the land, sleep under the stars and pee in overheated radiators. The enemy will be obvious out there. I remember Sesame Street. . . . which one does not look like the others. . . . got it. Always look for the red dot and the table cloth on the head.

This makes me so sad. And the thing that's really binding them together, even the less racist, is hatred of "Obamacare" -- the idea that one should be able to walk into a clinic, get treated well and kindly, and walk out, regardless of your employment status, without signing for anything, without paying anything, as I did yesterday, in this foreign land.

Syndicated 2009-11-07 10:08:48 from wingolog

318 older entries...

New Advogato Features

New HTML Parser: The long-awaited libxml2 based HTML parser code is live. It needs further work but already handles most markup better than the original parser.

Keep up with the latest Advogato features by reading the Advogato status blog.

If you're a C programmer with some spare time, take a look at the mod_virgule project page and help us with one of the tasks on the ToDo list!