nikodemus is currently certified at Journeyer level.

Name: Nikodemus Siivola
Member since: 2003-04-17 09:27:17
Last Login: 2008-06-07 13:49:51

FOAF RDF Share This

Homepage: http://random-state.net/

Notes: I finally have access to my Advogato account again after several years... Let's see if I can make this thing get updates from my actual blog.

Recent blog entries by nikodemus

Syndication: RSS 2.0

29 Jun 2008 »

ELS 2008 report

Thank You, for everyone who worked to make The 1st European Lisp Symposium the roaring success it was.

For me, the symposium started by having dinner with other attendees (and some organizers) on Wednesday. This was obviously the right thing to do, because the whole conference went in the same vein: great people, good food, excellent wine. No presentations on Thurdsay -- but if you ask me, they are not really the thing that makes a conference tick for me.

Thirsday was the first day on the official program. After the initial welcome session, and kibbitzing over coffee people split into two tracks: the work-in-progress track, and birds-of-feather sessions. Not having submitted a WIP paper I did not attend that track, but from what I heard it was definitely worth attending.

  • The BOF track started off with a session on concurrent and distributed processing by Sebastin Gonzlez. Questions of code mobility, identity, and structure of communication between nodes were talked about. Sebastin is working on a distributed and concurrent processing system, and has obviously spent quite some time thinking about the issues. From my perspective the project will be interesting to follow, and it is especially interesting to see what sort of things they will want from the underlying Lisp -- but generally speaking my own interests are on a slightly lower level.
  • Next session was on image processing APIs, chaired by Matthieu Villeneuve -- working towards possibly consolidating Imago and ch-image into a single library. I admit I spent most of the session hacking on SBCL and checking my email, but I gather that those who participated properly found it engaging. Not quite sure were it ended up.
  • Third was the CLIM (or perhaps rather McCLIM) session. I'm sorry to say it did not go as well as I would have hoped. The McCLIM hackers who were prepared for it had prepared to talk about McCLIM internals and future directions with each other, but due to the largely non-cognoscenti audience it turned into an impromptu CLIM tutorial/demonstration, which no-one was prepared for: demo-effects were plentiful, and direction was a bit lacking. Could have been worse, but could have been a lot better. It would be really great if someone could make a screencast to showcase the stuff that should have been in the demos -- but time being in a globally short supply I totally understand if that doesn't happen anytime soon.
  • Lunch in a nearby resteraunt was simply excuisite -- and part of the symposium fee. Quite probably the mellowing effects of good food had a lot to do with the success of the following session... (Actually, maybe lunch was before the CLIM session? I forget.)
  • Final, quite impromptu, session was on a future successor to ASDF. Andreas Fuchs, I, and several other intrested parties had realized this might be a good time to hash some things out face to face, and took the opportunity. Several conclusions were made -- in no particular order: Purely declarative system definitions are good: loading a system definition from a file should not use LOAD, just READ, and then some sort of an EVAL-SYSTEM. System definition extensions required for doing operations on a given system should live in a separate file. Manually managed fine grained dependencies are too brittle, and complete automated dependency management is too hard. Therefore encourage module-level dependencies (with stronger dependecy semantics then ASDF) as a half-way house between purely serial definitions and brittle file-level dependencies. Protocols are good: extending existing operations to new kinds of components is good and simple. Adding new operations for existing components is also good, but trickier. Need to examine several existing ASDF systems and extensions to see what kinds of things are being done with the protocol. Plan-then-execute is nasty: makes a lot of things harder, and doesn't buy very much. No need to globally replace ASDF as long as the "future successor" can depend on ASDF systems (and as long as we can extend ASDF to depend on the new kinds of systems.) No project plans were made, but after everyone has had a bit of time to reflect on this, maybe something will happen.

Syndicated 2008-05-27 14:31:36 from Nikodemus Siivola

5 May 2008 »

Quick Notes

The only valid measurement of code quality: WTFM

Getting Git series will continue later this week, but in the meanwhile I would like to bring to your attention an oft-forgotten output operator in Common Lisp: WRITE. It is perfect for both REPL and code in many cases, since you don't need to bind printer control variables around it -- just pass the ones you care about as keywords. Similarly, you don't have to worry about it munging variables your callers may care about.

Presenting ESRAP 0.1. It is a simple packrat parser for Common Lisp. It's been almost a year since I wrote it, and it seems unlikely that I'll work more on it in near future. In its current state it is neither particularly optimized or polished, nor does it have a great deal of fancy features, but it did what I needed it to do at the time, and I figured someone else might find it a more useful starting point for their own needs then CL-PEG. The feature list reads:

  • dynamic redefinition of nonterminals
  • inline grammars
  • semantic predicates
  • simple introspective facilities

Examples:

(parse '(or "foo" "bar") "foo") ⇒ "foo", NIL

(add-rule 'foo+ (make-instance 'rule 
                 :expression '(+ "foo"))) 
  ⇒ FOO+

(parse 'foo+ "foofoofoo") 
  ⇒ ("foo" "foo" "foo"), NIL

(add-rule 'decimal
           (make-instance 'rule
            :expression '(+ (or "0" "1" "2" "3" 
                                "4" "5" "6" "7" 
                                "8" "9"))
            :transform 
            (lambda (list) 
              (parse-integer (format nil "~{~A~}" 
                                     list)))))
  ⇒ DECIMAL

(parse '(oddp decimal) "123") ⇒ 123, NIL

(handler-case
    (parse '(oddp decimal) "124")
  (error (e)
    (format t "~&oops: ~A~%" e))) ⇒ NIL
; output
oops: Expression (ODDP DECIMAL) failed at 0.

(parse 'foo+ "foofoofoobar" :junk-allowed t) 
  ⇒ ("foo" "foo" "foo"), 9

(parse '(evenp decimal) "123" :junk-allowed t) 
  ⇒ NIL, 0

(add-rule 'foos-or-decimal
          (make-instance 'rule
           :expression '(or foo+ decimal)))
  ⇒ FOOS-OR-DECIMAL

(describe-grammar 'foos-or-decimal) ⇒ NIL
; output
Grammar FOOS-OR-DECIMAL:
   FOOS-OR-DECIMAL <- (OR FOO+ DECIMAL)
   FOO+    <- (+ "foo")
   DECIMAL <- (+ (OR "0" "1" "2" "3" "4" "5" "6" "7" "8" "9"))

Existence of bugs is guaranteed. Licence is zero-clause MIT.

Syndicated 2008-02-05 14:57:29 from Nikodemus Siivola

5 May 2008 »

Getting Git, part 3

Note: updated to correct a logical error. When development converges, a child will have multiple parents -- not vice-versa. Kudos to Johannes Grødem for sharp eyes and a heads-up.

Intermission</b>

I've learned that some people are reading this and wondering if they really need to know all this to use Git?.

No.

If all you're using Git for is "edit-diff-commit, edit-diff-commit" cycle, you don't. This seems to be 99% of what many people use a VCS for, and there is nothing wrong with that. You can even go quite a bit beyond that, and still you don't need to know anything about what's really going on -- just follow the a simple recipe, and you're good to go.

It's when you move beyond the recipe level that you need to understand the model the VCS uses, just like you need to with CVS, Subversion, Darcs, or any other VCS. If you don't think that's true, you've either internalized the model without noticing, or you're just hammering out recipes.

Specifically, this series is written to teach enough of the Git model to be able to look at a bunch of disparate branches you need to merge somehow, figure out the kind of history you want to build, and then do that. No recipe in the world is going to to this for the general case: you need to know what is going on before you can decide what you want.

What's In A Commit

So, where were we? Ah, storing history. If you haven't read it yet, here's where you can read the story so far.

Commits are the third kind of object stored in the object database. You could call commit "a moment in history", but let's see exactly what it contains:

  • Tree: the entire contents of the directory tree associated with the commit.

  • Parent(s): a commit has one or more parent commits. A parent commit is the "previous" commit: the changes introduced by a commit can be seen by comparing the trees of the commit and its parent(s).

    The common case is a single parent: this represents normal linear development.

    Multiple parents represent converging lines of development. A commit with multiple parents is called a merge commit. Two parents is the norm, but multiple branches can be merged with a single commit.

  • Comment: some text describing the commit.

  • Committer: the person who actually created the commit, and the date this was done.

  • Author: the person responsible for the change represented by the commit, and the date. Often the author and the committer are the same, but when eg. submitting patches by email Git automatically preserves information about the original author.

Commits form a DAG via parents. When development diverges, multiple children will share that same parent. When development converges, a single child will have multiple parents. If this is not clear, get a piece of paper, and draw a few dags -- it's more effective then any fancy graphic I might cook up.p>So, if you have hold of a commit object, you have hold of the entire history up to that point -- but you don't know anything about the future. In other words: History is the DAG rooted at any given commit.

<s>Back to our regularly scheduled Erlang envy.</s> Review: What is a commit object? Can multiple commits refer to a single tree; if so, what does it mean; if not, why not? Can multiple commits share parents; if so, what does it mean; if not, why not? If the tip of a branch is a commit, can you guess what the history of the branch is?

Next time: tagsoup.

Also, here's some moral support for me: Git is the next Unix says apenwarr. I may not agree with the metaphor, but it's a nice read:

Git was originally not a version control system; it was designed to be the infrastructure so that someone else could build one on top. And they did; nowadays there are more than 100 git-* commands installed along with git. It's scary and confusing and weird, but what that means is git is a platform. It's a new set of nouns and verbs that we never had before. Having new nouns and verbs means we can invent entirely new things that we previously couldn't do.

Syndicated 2008-02-03 06:18:20 from Nikodemus Siivola

5 May 2008 »

Getting Git, part 2

Read part 1 first.

At the core of Git is the object database. It is not an implementation detail, but a fundamental part of the whole. Don't ignore it, and don't be scared of it. You don't have to use it directly, but knowing the basics makes life a lot easier.

So:

The object database lives somewhere under .git/ in each and every repository clone -- nevermind where exactly.

The object database is garbage collected: the only way to delete an object is to remove all roots that point to it, and let the GC reclaim it. Roots also live under .git/, but are distinct from the database.

Since content is immutable, there is no way to mutate anything in the database -- you can only add new objects.

Now, there are four kinds of objects in the database. Today we will cover just two of them -- the lower level, if you will. This is content at its most content-seeming:

Blobs are binary content. They don't contain any pointers. Blobs are used to store file contents. Not file names, etc -- just contents. If you have files x/foo.txt and y/bar.txt, which both contain just the string "foobar", then in the object database there will be a blob that stores the string "foobar" -- representing the content of both files. Remember: content is identity.

Trees are lists of entries. The entries represent other objects in the database: for each entry the tree stores the object type, the pointer/SHA1 for the object, the object name, and the mode (the executable bit, really.) A single tree object represents a single directory with its files and subdirectories; in normal circumstances a tree willl only contain blob and tree entries. Again, content is identity: if you have to two directories containing identically named files with identical contents and executable bits, both will be represented by the same tree object.

Consider directories and the file here: x/y/z.txt

We have:

  • Blob-object B<sub>z</sub> for the contents of z.txt.
  • Tree-object T<sub>y</sub> for y/, containing the name z.txt, and a pointer to B<sub>z</sub>.
  • Tree-object T<sub>x</sub> for x/, containing the name y, and a pointer to T<sub>y</sub>.

Now, if we change the contents of z.txt, and commit the new content to the object database -- what happens to the object graph as a whole?

Remember: Content is identity, and not just for blobs, but all objects. If you change the contents of a file, you need a new tree object containing a pointer to the new blob, etc. This is a really important bit, so make sure you understand this: any pointer into the object database is a unique identifier for the whole object graph reachable from that point.

So, you will have:

  • Blob-object B<sub>z2</sub> for the new contents of z.txt.
  • Tree-object T<sub>y2</sub> for y/, containing the name z.txt, and a pointer to B<sub>z2</sub>.
  • Tree-object T<sub>x2</sub> for x/, containing the name y, and a pointer to T<sub>y2</sub>.

The old versions are still there: content is immutable -- as long as GC hasn't reclaimed them, we can get at them.

Now, as long as you remember that a tree object represents the whole state of the whole directory structure under it, including file contents, you can forget about blobs. Just think of trees, and you will be fine.

Review: <s>Why did the hacker cross the road?</s> Where does Git store content? How are files and directories stored? Can you mutate stored content; if so, how; if not, why not? Can you deleted stored content; if so, how; if not, why not? What does a tree object represent?

Next time: how history is content, and how it is stored.

Syndicated 2008-01-27 13:10:42 from Nikodemus Siivola

5 May 2008 »

Getting Git, part 1

Many people -- me included -- find Git really nice, but equally many seem somewhat confused by it. I think this is due to Git being conceptually different compared to other VCS we're used to: unless you have an at least mostly correct theory of Git your expectations based on experiences with other systems will lead you astray.

This is the first part in a series of posts that tries to address this issue, by providing the aforementioned theory. In this part I talk about general concepts: the terms I use are not Git terms; I'm trying to start you thinking in a Git-compatible way as opposed to whatever CVS and others have taught you over the years. Details and real terms will start in the next episode.

Ready? Buckle up!

Git stores content, not metadata. Some of the data stored by Git may very well describe some other data also stored by Git, but it is all content to Git.

Git stores content, not changes. It can reason about changes in content, but only content is stored.

Content is identity. If two "things" have identical content, they are the same thing for Git.

Content is immutable. If you think you are mutating something stored by Git, think again: what you're doing is making an altered copy and possibly throwing away the original. Since content is identity, you cannot mutate an object and have it retain its identity.

That's it.

I hope this was simple enough, but if you've had trouble understanding Git before, please review the points above a time or two to make sure you understand what I'm saying. You don't yet need to understand how the points above relate to Git -- just try to grok das Ding an sich. When you think you have it, ask yourself:

<s>Conan! What is best in life?</s> What does Git store? How would you describe this stuff that Git stores?

Next time we'll pop open the hood and see what's inside.

Syndicated 2008-01-24 08:28:30 from Nikodemus Siivola

8 older entries...

 

nikodemus certified others as follows:

  • nikodemus certified nikodemus as Apprentice
  • nikodemus certified biz as Apprentice
  • nikodemus certified wmf as Journeyer
  • nikodemus certified bonzini as Master
  • nikodemus certified varjag as Journeyer
  • nikodemus certified dan as Master
  • nikodemus certified shillo as Journeyer
  • nikodemus certified mdanish as Journeyer
  • nikodemus certified Akira as Journeyer
  • nikodemus certified jolan as Journeyer

Others have certified nikodemus as follows:

  • nikodemus certified nikodemus as Apprentice
  • bonzini certified nikodemus as Apprentice
  • jolan certified nikodemus as Apprentice
  • lerdsuwa certified nikodemus as Apprentice
  • ak certified nikodemus as Journeyer
  • xach certified nikodemus as Journeyer
  • cyrus certified nikodemus as Journeyer
  • technik certified nikodemus as Journeyer
  • pthierry certified nikodemus as Journeyer

[ Certification disabled because you're not logged in. ]

New Advogato Features

FOAF updates: Trust rankings are now exported, making the data available to other users and websites. An external FOAF URI has been added, allowing users to link to an additional FOAF file.

Keep up with the latest Advogato features by reading the Advogato status blog.

If you're a C programmer with some spare time, take a look at the mod_virgule project page and help us with one of the tasks on the ToDo list!

X
Share this page