Older blog entries for fxn (starting at number 521)

8 Sep 2009 (updated 19 Sep 2009 at 22:05 UTC) »

FluidDB Terminology


This post explains the terminology I’ve come up with after two weeks working on Net::FluidDB.

There’s no Perl in this post: although that module is my laboratory, what I want to communicate are abstractions, not a particular implementation. We talk here about nouns and verbs, and introduce a model to some extent.

Note that the focus is on the chosen terminology; this post does not explain the concepts themselves. The high-level docs explain FluidDB as such.

Objects

We start with objects. Objects are the central entities in FluidDB. They have an id, which is a meaningless UUID, and an about attribute, which is an optional arbitrary string.

An object knows the paths of the existing tags on it: a set of strings called tag_paths. The following section explains what a tag and a path are.

Tags

The next most important entity in FluidDB is the tag. For example, “fxn/was-here”.

Tags have a description, which is a string attribute, and can be indexed, which is a boolean. They also have an object, explained later.

Tags have a path, “fxn/was-here”, and a name. The name is the rightmost fragment of the path, “was-here” in the previous example.

Each tag belongs to a namespace. Namespaces are explained later.

To tag an object you associate a tag and (optionally) a value to it.

In an object-oriented language tagging an object could look like this:

    object.tag(rating, 10)

The method name is a verb, and the first argument is a tag.

A library may provide a convenient way to tag an object given a tag path:

    object.tag("fxn/rating", 10)

There “fxn/rating” acts as an identifier: it points to the tag with that path, if any. In fact that’s what the REST API asks for, but that’s low-level stuff; the schema we are presenting runs at a higher level.

In a dynamic language you can have such a dynamic signature. In a statically typed language you would probably have different methods for different signatures. But that’s not important for the mental model we are building.
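As a rough illustration of such a dynamic signature, here is a minimal Ruby sketch. The Tag and TaggedObject classes are hypothetical and made up for this example; they are not the Net::FluidDB API or the REST API, they only mirror the nouns and verbs introduced so far.

```ruby
# Hypothetical classes, for illustration only; not an existing API.
Tag = Struct.new(:path)

class TaggedObject
  attr_reader :tags

  def initialize
    @tags = {} # tag path => value (nil when the tag carries no value)
  end

  # Accepts either a Tag or a tag path given as a string.
  def tag(tag_or_path, value = nil)
    path = tag_or_path.is_a?(Tag) ? tag_or_path.path : tag_or_path.to_s
    @tags[path] = value
    self
  end

  # The paths of the tags on this object.
  def tag_paths
    @tags.keys
  end
end

rating = Tag.new("fxn/rating")
object = TaggedObject.new
object.tag(rating, 10)       # the first argument is a tag
object.tag("fxn/was-here")   # the first argument is a tag path, no value
```

In a statically typed language the two branches inside tag would probably become two overloaded methods.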

Tags in FluidDB are not typed. You could tag an object as having an “fxn/rating” of 10, and tag another object as having an “fxn/rating” of “five stars”. Values are typed.

Values

Tagging involves an object, a tag, and optionally a value. Values are typed. There’s some technical stuff related to encodings and such, but for the purposes of this post I think we do not need to go further.

Namespaces

To organize tags FluidDB provides namespaces.

Namespaces have a description, which is a string attribute, and an object, explained later.

Namespaces can contain other namespaces and tags. Tags cannot contain anything; they are leaves.

The namespace_names attribute of a namespace is the possibly empty set of the names of its child namespaces. The tag_names attribute of a namespace is the possibly empty set of the names of its tags.

Each namespace that is not top-level has a parent. A concrete implementation may define the parent of a root namespace to be some sort of null object.

Any namespace has a path, and a name, akin to tags. The namespace with path “fxn/reading/books” has name “books”. Its parent has path “fxn/reading”, and name “reading”.
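The relationship between paths, names, and parents can be sketched with a few string helpers. The Path module and its method names are assumptions of mine for this example, not part of any FluidDB library; the only thing taken from the model is that fragments are separated by "/".

```ruby
# Hypothetical helpers; the names are assumptions, not an existing API.
module Path
  SEPARATOR = "/"

  # Rightmost fragment of the path.
  def self.name(path)
    path.split(SEPARATOR).last
  end

  # Path of the containing namespace, or nil for a top-level namespace.
  def self.parent(path)
    fragments = path.split(SEPARATOR)
    fragments.size > 1 ? fragments[0...-1].join(SEPARATOR) : nil
  end

  # Path of a child namespace or tag, given the container path and a name.
  def self.child(path, name)
    [path, name].join(SEPARATOR)
  end
end

Path.name("fxn/reading/books")      # => "books"
Path.parent("fxn/reading/books")    # => "fxn/reading"
Path.child("fxn/reading", "books")  # => "fxn/reading/books"
```

Returning nil for the parent of a top-level namespace is just one choice; an implementation could return some sort of null object instead.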

You can compute the path of any child namespace or tag from the path of the containing namespace and their respective names.

Permissions

Each possible action on each existing tag and namespace has a permission associated with it. A permission consists of a policy and an exception list, to be applied to a certain category and action.
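A permission can be modelled as a small value object. The sketch below is hypothetical: the Permission class and its allows? method are mine, and it assumes that a policy is either open or closed, that an open policy lets everybody in except the usernames in the exception set, and that a closed policy lets nobody in except them. This post does not spell out those semantics, so take them as an assumption.

```ruby
require "set"

# Hypothetical class. Assumes :open means "everybody but the exceptions"
# and :closed means "nobody but the exceptions".
class Permission
  attr_reader :policy, :exceptions

  def initialize(policy, exceptions = [])
    unless [:open, :closed].include?(policy)
      raise ArgumentError, "policy must be :open or :closed"
    end
    @policy = policy
    @exceptions = Set.new(exceptions) # a set: order is irrelevant
  end

  # Whether the given username may perform the action this permission guards.
  def allows?(username)
    listed = @exceptions.include?(username)
    @policy == :open ? !listed : listed
  end
end

open_but_one = Permission.new(:open, ["troll"])
closed_but_one = Permission.new(:closed, ["fxn"])
```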

The policy may be open or closed, and the exception list is a set of usernames. (Note: FluidDB in general lacks ordered collections, read “set” when you see “list”.)

Policies

Each possible action on tags and namespaces has a default set of permissions. When you create a tag or a namespace, each one of the possible actions gets such defaults. Each of those defaults is called by definition a policy.

There’s a name clash here which is not good. It is inherited from the API. I’ve departed in some places from the API, but I believe we need to stick to it in this case: A policy consists of a policy and an exception list, to be associated with certain category and action on behalf of a certain user.

The policy attribute may be open or closed, and the exception list is a set of usernames. (Note: FluidDB in general lacks ordered collections, read “set” when you see “list”.)

Users

Users have a name, a username, and an object, explained in the following section.

Where are the IDs?

If you are familiar with the API you may be wondering where the IDs of tags, namespaces, and users went, and what those objects I’ve mentioned are.

Tags, namespaces, and users are not FluidDB objects themselves. They have no ID, they have no about, you can delete them.

The proper identifier of a tag or a namespace in the system is its path, and that of a user is its username.

FluidDB, however, creates an object for each tag, namespace, and user. They can be found in their object attribute. So, for example, if you wanted to tag the user whose username is “fxn” there’s a canonical object in the system for it. You can tag that object, but you cannot tag the user itself.

If the user is deleted, the corresponding object is not. Remember, objects are immortal. In particular if the user was tagged the tags are still there, with the object that represented it in FluidDB. This parallels the object for any other thing in life that once existed.

Syndicated 2009-09-08 02:39:54 from FluidThinking

4 Sep 2009 (updated 4 Sep 2009 at 17:58 UTC) »

Rails Tip: What is the difference between request.xhr? and format.js?

Since Ruby on Rails has respond_to, the following has become a common idiom for routing Ajax requests in controllers:

   respond_to do |format|
     format.html { ... } # ordinary request
     format.js   { ... } # Ajax request, kinda, keep on reading
   end

Testing for format.js like that often suffices, but strictly speaking this is not testing whether the request is Ajax. As with everything else, abusing some logic is fine as long as you know what you are doing.

You know an Ajax call is just a fancy name for an ordinary HTTP call that is performed from JavaScript. Ajax requests sent by any of the major JavaScript frameworks include an HTTP header that distinguishes them:

   X-Requested-With: XMLHttpRequest

Thus, the proper test to detect Ajax calls checks that header, and this is what request.xhr? does.
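To make the idea concrete, here is a minimal sketch of that kind of check outside of Rails. The ajax_request? helper is hypothetical and takes a plain hash of headers rather than a real request object; it assumes the distinguishing header is X-Requested-With with value XMLHttpRequest, which is what the major frameworks send. The actual request.xhr? may differ in details such as case sensitivity.

```ruby
# Hypothetical helper mimicking the idea behind request.xhr?; it works on
# a plain hash of HTTP headers instead of a real Rails request.
def ajax_request?(headers)
  headers["X-Requested-With"] == "XMLHttpRequest"
end

ajax_request?("X-Requested-With" => "XMLHttpRequest")  # => true
ajax_request?("Accept" => "text/javascript")           # => false
```

Note that the second call asks for JavaScript and still does not count as Ajax, because the test only looks at the one header.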

On the other hand, Ajax calls do not necessarily expect JavaScript. Remember, they are ordinary HTTP calls, so they may ask for HTML, JSON, whatever. What you ask for goes in the HTTP Accept header. For example, Ajax functions in Prototype send by default:

   Accept: text/javascript, text/html, application/xml, text/xml, */*

And that is something format.js tests for (in addition to an optional explicit format parameter somehow encoded in the URL). If the Accept header starts with "text/javascript", or "application/javascript", or "application/x-javascript" you'll get routed into format.js.
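The Accept side of the test can be sketched like this. The helper name and the parsing are my own simplification, not what Rails does internally: it only looks at the first media type in the header and ignores any explicit format parameter in the URL.

```ruby
JS_MIME_TYPES = %w[
  text/javascript
  application/javascript
  application/x-javascript
].freeze

# Hypothetical helper: true if the first media type in the Accept header
# is one of the JavaScript MIME types mentioned above.
def accept_starts_with_js?(accept_header)
  first = accept_header.to_s.split(",").first.to_s
  media_type = first.split(";").first.to_s.strip.downcase
  JS_MIME_TYPES.include?(media_type)
end

prototype_default = "text/javascript, text/html, application/xml, text/xml, */*"
accept_starts_with_js?(prototype_default)                  # => true
accept_starts_with_js?("application/xml, text/xml, */*")   # => false
```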

In fact, jQuery by default does not send an Accept header like that, so vanilla jQuery calls do not enter format.js. The jRails plugin plays nicely with that idiom.

Note that even link_to_remote with the :update option is routed to format.js. That is fine from the HTTP point of view, because the header says it is OK to send HTML, but it is kind of weird to serve HTML from within format.js, isn't it?

But you can set whatever Accept header you need. jQuery even provides some shortcuts like "xml", which tells the library to send

   Accept: application/xml, text/xml, */*

So, in practice format.js kind of works, but the two tests are not equivalent:

  • An Ajax call asking for "text/xml" is xhr? but won't be routed through format.js.
  • A call triggered by an ordinary SCRIPT tag that reaches your application (perhaps it serves dynamic JavaScript) is routed through format.js, but it is not xhr?.

That being said, if I control the interface and know what's what I am personally fine using format.js. Routing through respond_to is concise and clean, and you may know that in that action the only expected JavaScript calls are Ajax. If that holds you are still not testing for true Ajaxness but are testing a sufficient condition, which is correct anyway.

20 Aug 2009 (updated 20 Aug 2009 at 17:54 UTC) »

Ruby Regexps and Unicode

In Ruby 1.8 strings have no associated encoding; from Ruby's point of view they are just a handful of bytes. Regexps are agnostic in that sense as well: they match bytes against bytes, unless you pass one of the flags /u for UTF8, /s for SJIS, or /e for EUC-JP. By the way, note that /s in Ruby has a different meaning than in Perl, and it is not the only flag that conflicts.

If you set $KCODE to "u" then source code itself is assumed to be UTF8 and Ruby turns the /u flag on. Ruby on Rails does that since version 1.2 for example.

AFAICT it is not clearly defined what support Ruby 1.8 provides for Unicode in regexps. For example, Flanagan & Matz have little about it except for some vague descriptions. You could say it is just not supported, but some things do work. For example, it is a known trick that counting /./ matches gives you the length of a UTF8 string, whereas #length returns the number of bytes.
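The counting trick can be sketched as follows. The snippet is written for Ruby 1.9 or later, where strings do carry an encoding, so it uses #bytesize to show the number that #length would report in 1.8; the sample string is just an example of mine.

```ruby
# Written for Ruby 1.9 or later. The \u escape yields a UTF-8 string
# regardless of the encoding of the source file.
s = "Bar\u00E7a"      # "Barça": 5 characters, 6 bytes in UTF-8

s.bytesize            # => 6, the number of bytes (what 1.8's #length counts)
s.scan(/./u).size     # => 5, counting /./ matches with the /u flag
s.length              # => 5 on Ruby 1.9 or later
```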

A couple of important bits with definitely partial support are the character classes \w and \s (and thus their negations \W and \S).

In general, the definition of a word char depends on the locale. In Catalan "ò" is a word char. Regexp engines are locale-aware and the meaning of \w depends on it. That is, \w is equivalent to [a-zA-Z0-9_] only in ASCII-like locales. In Ruby, if source code is UTF8 and /u is enabled "ò" matches \w.

That's important of course: a Rails application that validates domain or account names against \w, for example, is permitting accented letters. If they should not be allowed you need to write the character class explicitly: [a-zA-Z0-9_].
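A validation with the explicit character class could look like the sketch below. The constant and method names are hypothetical, chosen for this example; the regexp itself behaves the same whether or not /u or $KCODE are in effect, because it names every allowed character.

```ruby
# Hypothetical validator: accepts only ASCII letters, digits, and underscore.
ACCOUNT_NAME = /\A[a-zA-Z0-9_]+\z/

def valid_account_name?(name)
  !!(ACCOUNT_NAME =~ name.to_s)
end

valid_account_name?("fxn_42")       # => true
valid_account_name?("cami\u00F3")   # => false, "ó" is outside the class
```

The anchors are \A and \z rather than ^ and $ because in Ruby the latter match at line boundaries, which would let a multi-line string through.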

On the other hand, since "ò" and friends match \w you could be tempted to validate Unicode against \w. I certainly have been more than tempted :-). Wrong! There are characters that match but shouldn't. For example "¿" or "¡", or "·".

Whitespace support is also poor. NEL (U+0085) is whitespace in Unicode and should belong to \s, but it doesn't in Ruby 1.8. A string that consists of NELs is not only not blank in Rails, it also matches \w in Ruby 1.8! Two gotchas for the price of one!

If you need proper Unicode support, among other goodies, you can switch to Oniguruma. That's the regexp engine used in Ruby 1.9, which is available for 1.8 as a gem:

    sudo gem install oniguruma

That needs a C library available as a tarball, and also packaged for Ubuntu (at least):

    sudo apt-get install libonig-dev

The API is here.

3 Aug 2009 (updated 6 Aug 2009 at 08:46 UTC) »


I am excited to announce I joined Terry Jones and esteve in building FluidDB.

I am very happy: Terry and Esteve are terrific, and I sincerely think FluidDB might be something revolutionary. I believe there's something latent there related to data sharing that could be big.

17 Jul 2009 (updated 17 Jul 2009 at 00:14 UTC) »

What is a browser?

Have a look at this video of a Google guy asking people in Times Square what a browser is. People have basically no idea.

I don't know whether the interviews in this video can really be extrapolated, but my instinct says there's something to it. When you are into technology you need to look ahead and construct the future, but a corner of your head has to keep you balanced and take into account that the man in the street is very, very far from your view. Just a reality check: your duty is to be ahead, as is the duty of any specialist in any field.

PS: I saw this video in a post by Seth Godin, who uses it to illustrate an unrelated point.

13 Jul 2009 (updated 13 Jul 2009 at 18:51 UTC) »

Rails Contributors

Almost a year ago I started to work on a script to count the number of people that have contributed to Ruby on Rails. The aim was to be able to give a good approximation in my keynote at Conferencia Rails 2008.

That's not a direct count: since Subversion does not track authors, credit was given by hand following a few conventions. For example, the committer typically put your name, email, nick, or whatever at the end of the commit message. Even nowadays with Git the author of a commit to Rails is not always the Git author, so some munging is still needed for fine tracking.

So the script identified authors where they appear, and normalized the names to identify every handle, typo, etc. and map them to a "canonical" name.
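The gist of that normalization can be sketched in a few lines. This is not the actual script: the alias table, the names in it, and the assumption that credit appears in square brackets at the very end of the commit message are all made up for illustration.

```ruby
# Hypothetical alias table; all names are invented for this example.
CANONICAL_NAMES = {
  "jdoe"             => "Jane Doe",
  "jane@example.com" => "Jane Doe",
  "Jane Do"          => "Jane Doe",    # typo
  "rroe"             => "Richard Roe"
}.freeze

# Assumes credit is given as "[name, name]" at the end of the message.
def credited_names(message)
  match = message.match(/\[([^\]]+)\]\s*\z/)
  return [] unless match
  match[1].split(/\s*,\s*/).map { |name| CANONICAL_NAMES.fetch(name, name) }
end

# Number of distinct contributors after mapping to canonical names.
def count_contributors(messages)
  messages.flat_map { |message| credited_names(message) }.uniq.size
end

messages = [
  "Fix typo in docs [jdoe]",
  "Add missing test [Jane Do, rroe]",
  "Refactor routing [jane@example.com]",
  "Bump version"
]
count_contributors(messages)  # => 2
```

Names missing from the table fall through unchanged, so unknown handles still count as contributors until someone maps them.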

I am very happy that effort took shape in the official Rails Contributors index, with design by José Espinal. It has been online for a while but I hadn't blogged about it yet.


Three years after its founding, I left ASPgems at the beginning of June. I have no plan B, it was simply something I felt I had to do.

I did some contract work for the rest of June but I am currently taking a break with my family at the seaside. I am gonna have fun with my daughter on the beach, read, sleep, walk, ride our bikes, do some open source, and gain perspective to think about what's next.

16 May 2009 (updated 16 May 2009 at 23:19 UTC) »

EuRuKo 2009

EuRuKo 2009 is over!

SRUG is very, very happy about the outcome. We put in the effort and organised the conference with enthusiasm, and people felt it and had a really great time. Talks were interesting and, most important, people had the chance to chat, sit on the grass, go to the beach at night...

We were honoured that Matz came to the conference to give the opening keynote; he took a 22-hour flight from Japan! We tried to make him feel at home. Matz actually attended the conference. I mean, you know those stars who give their keynote and then go sightseeing. Not Matz: he stayed at the conference and talked with everybody, and if you looked across the hall he was mixed in with the audience like any other attendee. Hat tip to him.

Next year EuRuKo goes to Kraków, our best wishes to the organisers. We met them in Barcelona and we are sure they are going to run an extraordinary conference.

/me waves from Scotland on Rails.

14 Mar 2009 (updated 14 Mar 2009 at 10:15 UTC) »

Why Did I Write Acme::Pythonic

Acme::Pythonic is a Perl module of mine that allows the user to write Pythonic code as valid Perl code. I mean, you feed this code to perl:

    use Acme::Pythonic; # this semicolon yet needed
    sub delete_edges:
        my $G = shift
        while my ($u, $v) = splice(@_, 0, 2):
            if defined $v:
                $G->delete_edge($u, $v)
                my @e = $G->edges($u)
                while ($u, $v) = splice(@e, 0, 2):
                    $G->delete_edge($u, $v)

and perl executes it right away, directly. There's no intermediate file being generated or anything. Sounds like magic unless you know what a source filter is.

But some people don't get that even with the work behind this module, the test suite, etc. this module is just a fucking joke! That's why it belongs to the Acme:: namespace in the first place.

It is a joke about taking programming languages too seriously. To hell with that, there you have Python and Perl mixed together. Sublimation. Climax. You can put that code against a wall and do vipassana contemplating it, release your attachments to this mundane world!

Rails Documentation Team

Rails now has an official documentation team! That's Pratik, Mike, and me. I am very happy this converged this way; there has been a great deal of work in docrails and Rails Guides that finally takes shape.
