27 Aug 2010 tampe   » (Journeyer)

Compress that for me, please

So, if you are in a habit of analyze lists and tree structures like code, and like pattern matching, you may end up writing ton's of code like.


(def sum-of-leaves
   ((X . L)      (+ (sum-of-leaves X) (sum-of-leaves L)))
   (()           0)
   ( X           X))

e.g. if you have a list sum the first and the sum of rest the empty list yields 0 and all non list elements -the leaves, assumed numbers - end up being just counted as the value they represent.

So how should we code this destruction into a primitive lang? One idea I used is to work with the guile VM directly and use a few extra primitives there.

So consider destructuring


(define (f A)  (match A ((X 'a (X . L) . U) (list X L U) ...)

we could do this with a stack machine like,


(push A)         ;; Stack -> (A)
(explode-cons)   ;; Stack -> ((car A) (cdr A))
(set X)          ;; Stack -> ((cdr A))
(explode-cons)   ;; Stack -> ((cadr A) (cddr A))
(match-eq? 'a)   ;; Stack -> ((cddr A))   
(explode-cons)   ;; Stack -> ((caddr A)  (cdddr  A))
(explode-cons)   ;; Stack -> ((caaddr A) (cdaddr A)
(cdddr A)))
(match-eq? X)    ;; Stack -> ((cdaddr A) (cdddr A))
(set L)          ;; Stack -> ((cddr A))
(set U)          ;; Stack -> ()
(list X L U)
;;; note if an explode-cons or eq? fails it will 
;;; reset and the next pattern will be tried.

And what you note is that this is a more compact reduction of pattern matching then doing it with the standard VM of guile. So the end results is that code executed on this VM is both faster and more compact then using the standard setup. But of cause if we would like to compile this to the native platform then the standard compilation to pure scheme probably has an edge.

Interestingly though (I'm implementing a prolog on top of guile), this pattern can be generalized to the case where A - the input is a unifying variable. The destruction will look almost the same, but we need to tell the VM that we are in a mode of destructing a unifying variable. Meaning that if X is not bound we will set that variable to a cons and push two new unbound variables (car A) and (cdr A) onto the stack.

Cheers!

Latest blog entries     Older blog entries

New Advogato Features

New HTML Parser: The long-awaited libxml2 based HTML parser code is live. It needs further work but already handles most markup better than the original parser.

Keep up with the latest Advogato features by reading the Advogato status blog.

If you're a C programmer with some spare time, take a look at the mod_virgule project page and help us with one of the tasks on the ToDo list!