Older blog entries for fxn (starting at number 530)

11 Apr 2010 »

Rails Committer

I've been contributing to Ruby on Rails on a regular basis, had almost 100 code patches, and about 500 doc patches, and was recently granted commit right. That's so great! It will allow me to work with more agility and have a little more freedom to do stuff.

10 Mar 2010 (updated 10 Mar 2010 at 15:05 UTC) »

An around filter to temporarily apply changes

A client of mine has an admin tool where people can edit stuff, but modifications need approval. They are represented in the application rather than applied right away as you normally do.

Changes are basically stored in the form of method names, ID of receiver, and serialized arguments, there are a handful of them. Applying changes is dynamic invocation, you get the idea.

My client wanted an approval interface where he could select a handful of changes to the same model, submit, and get a split view with the current website on the left, and the resulting website on the right.

The tricky part is that views may access the database, for example using named scopes (from a MVC viewpoint that's fine, the view is clean it happens that a named scope triggers a query). And the 3-line solution was to write an around filter that started a transaction, yielded to the action, and rolled the transaction back.

30 Jan 2010 (updated 30 Jan 2010 at 18:23 UTC) »

Tracking Class Descendants in Ruby (II)

My previous post explains a way to keep track of a class' descendants, and encapsulates the technique into a module.

There are two things you may want to do different: Since all descendants inherit the descendants class method you may prefer them to be functional. On the other hand, the module defines the inherited class method into the base class because it needs it to be a closure. That may work for some particular need, but it is not good for a generic solution. The inherited hook is the business of your client's code.

Now we'll see a different approach that addresses both concerns. Using the same hook any class in the hierarchy may easily keep track of its direct subclasses, and compute its descendants:


    class C
      def self.inherited(subclass)
        subclasses << subclass
      end
 
      def self.subclasses
        @subclasses ||= []
      end
 
      def self.descendants
        subclasses + subclasses.map(&:descendants).flatten
      end
    end

In the previous solution the inherited hook needed to ensure descendants was invoked on the root of the hierarchy. In this solution it doesn't care because we precisely take advantage of polymorphism. The way it is written a class pushes into its own @subclasses instance variable, which is what we want.

The module that encapsulates that pattern is much simpler:


    module DescendantsTracker
      def inherited(subclass)
        subclasses << subclass
        super
      end
 
      def subclasses
        @subclasses ||= []
      end
 
      def descendants
        subclasses + subclasses.map(&:descendants).flatten
      end
    end
 
    class C
      extend DescendantsTracker
    end

You know extend is like doing an include in the metaclass of C. In particular we are not defining C.inherited, we are defining a method with the same name in an ancestor of the metaclass. That way C can still define its own inherited class method. A call to super within such a C.inherited will go up the next ancestor of the metaclass, eventually reaching the inherited from DescendantsTracker.

29 Jan 2010 (updated 30 Jan 2010 at 18:25 UTC) »

Tracking Class Descendants in Ruby

I am going through all Active Support core extensions lately because I am writing the Active Support Core Extensions guide, due for Rails 3. There are some patches in master as a result of that walkthrough, and I am now focusing on keeping track of descendants in a class hierarchy.

A known technique uses ObjectSpace.each_object. That is a method that receives a class or module as argument and yields all objects that have that class or module among their parents. Since classes are instances of the class Class, you can select descendants of class C this way:


    descendants_of_C = []
    ObjectSpace.each_object(Class) do |klass|
      descendants_of_C << klass if klass < C
    end

That is a brute force approach, it works, but it is inefficient. JRuby even disables ObjectSpace by default for performance reasons.

A better approach is to leverage the inherited hook. Classes may optionally implement a class method inherited that is called whenever they are subclassed. The subclass is passed as argument:


    class User
      def self.inherited(subclass)
        puts 0
      end
    end
 
    class Admin < User
      puts 1
    end
 
    # output is
    0
    1

That's a perfect place to keep track of descendants:


    class C
      class << self
        def inherited(subclass)
          C.descendants << subclass
          super
        end
 
        def descendants
          @descendants ||= []
        end
      end
    end

In that code we have an array of descendants in @descendants. That is an instance variable of the very class C. Remember classes are ordinary objects in Ruby and so they may have instance variables. It is better to use an instance variable instead of a class variable because class variables are shared among the entire hierarchy of the class and we need an exclusive array.

Another fine point is that we force descendants to be the one in the C class. If we didn't and we had A < B < C, the hook would be called when A was defined, but by polymorphism it would be B.descendants what would be called, thus setting B's instance variable @descendants. That is not what we want.

The call to super is just a best practice. In general a hook like this should pass the call up the hierarchy in case parents have their own hooks.

That pattern can be implemented in a module for reuse indeed:


    module DescendantsTracker
      def self.included(base)
        (class << base; self; end).class_eval do
          define_method(:inherited) do |subclass|
            base.descendants << subclass
            super
          end
        end
        base.extend self
      end
 
      def descendants
        @descendants ||= []
      end
    end
 
    class C
      include DescendantsTracker
    end

A class only needs to include DescendantsTracker to track its descendants.

When the module is included in a class Ruby invokes its inherited hook. The hook receives the class that is including the module, and we leverage that to inject the class methods we saw before. For inherited we open the metaclass of base and define the method in a way that has base in scope, which is something we saw before we need. After that we add the descendants class method with an ordinary extend call.

Update: There's a followup to this post.

3 Jan 2010 (updated 3 Jan 2010 at 20:13 UTC) »

Rails Tip: Avoid mixing require and Rails autoloading

I've seen in a few Rails applications warnings about constants being redefined at some point. Problem was always the same: a file was autoloaded and required afterwards, and this results in the file being interpreted twice. If the class or module defines ordinary constants you may be lucky and see a warning, but if not you may not even be aware of it. Let me explain why this happens.

For example, given:


  # lib/utils.rb
  module Utils
    X = 1
  end

If we autoload the module and then require the file:


   $ script/runner 'Utils; require "utils"'

a warning is issued:


   warning: already initialized constant X

This is an artificial example, but in practice it may be the case for instance that some initializer autoloads Utils and later a model requires lib/utils.rb. Of course you don't need and shouldn't do that, but perhaps the model was written by someone not conversant in Rails or whatever.

OK, the warning is telling us lib/utils.rb is being interpreted twice. That happens both in development and production modes, but for different reasons.

In development mode Rails autoloading uses Kernel#load by default to be able to reinterpret code per request. So, the usage of the constant Utils triggers the interpretation of lib/utils.rb with load, and since require knows nothing about that file it happily interprets its content again.

In production mode Rails autoloading uses require, and that is supposed to run the file once, what's the matter?

When require loads something, it stores its path in the array $":


   $ ruby -rdate -e 'puts $"'
   enumerator.so
   rational.rb
   date/format.rb
   date.rb

require checks that array to see whether a given file was already loaded before it attempts to go for it. If it was there's nothing to do and just returns (false). The point is that require does not detect whether two different paths point to the same file, and Rails autoloading passes absolute pathnames to autodiscovered source files:


   $ script/runner -e production 'Utils; require "utils"; puts $"' 
   ...
   /Users/fxn/tmp/test_require/lib/utils.rb
   utils.rb

Since they do not match, the file is again loaded twice.

The solution to this gotcha is as simple as removing the call to require. An idiomatic Rails application names its files after the classes or modules they define and delegate their loading to the dependencies mechanism. Generally, the only calls to require load external libraries.

19 Dec 2009 (updated 20 Dec 2009 at 11:40 UTC) »

Kinesis Advantage Pro

My family of somewhat special keyboards includes a IBM Model M, a couple of Happy Hackings, a Unicomp Customizer and now Santa came with a Kinesis Advantage Pro.

This is the less conventional of them as you can see in this set of photos I took while unpacking it. Besides the design itself, it supports macros, you can remap keys, have Mac and PC layouts (hot-switchable), builtin support for QWERTY and DVORAK (you can choose any with a key combo), and support for foot switches. In addition to emit a beautiful minimal beep-beep sound I love done by software to add feedback. Everything operated from the very keyboard, no additional software distributed. The computer only sees an ordinary USB keyboard.

I've known personally two seasoned programmers that have only used them for years, having owned several units, and do not want to use anything else. I think I understand them. Typing with this keyboard feels really good. Your arms are relaxed and your hands, fingers, and wrists work with little stress compared with standard desigs. This is not only an investment for your health, I think the typing experience is hands down better, something very important if you like me spend hours and hours day after day typing.

That's mainly because of three characteristics: First, as you can observe in the pictures keys are aligned in columns, instead of being staggered. So, your fingers move mostly in straight lines. Second, hands are more quiet thanks to the convex shape, the curve matches naturally your fingers reach. Third, the space bar, backspace, return key, commad, option, and a few more keys are pressed with your thumbs.

I've been using the Kinesis non-stop since Tuesday. I think you need to do an immersion to get used to it and don't want to use any other keyboard at least until I finish the adaptation period. The box includes a booklet with exercises designed to ease the transition. It is not a touch typing course, they are designed to help you to adapt. I was not a touch typist so it is taking me a while to get a decent speed, though as of today it is already quite acceptable.

The Pro model comes with a foot switch. That is a switch you can map to any key. I have it mapped to Shift but I am unconvinced by now. Problem is that even if the pedal is built in a way that prevents quite well accidental presses, I can't for the life of me train myself to relax my foot on it for a long time period. So I have some tension which is not good. I'll try a little more before giving up though.

Keyboard shortcuts is something you need to retrain completely because your current muscle memory is useless here, since involved keys are in different spots.

That's pretty normal, it is known that you need a few weeks to get at full speed and have your brain rewired, but I believe from my experience so far it is going to be worth the effort.

25 Oct 2009 (updated 26 Oct 2009 at 11:00 UTC) »

Emulating Ruby Blocks in Perl

Ruby Blocks

Ruby blocks allow you to run code in the caller from within a method:


   # definition
   def twice
     yield 1
     yield 2
   end
 
   # example call
   twice do |n|
     puts "croak! (#{n})"
   end
 
   # output
   # croak! (1)
   # croak! (2)

In case you never saw something like that before, twice expects to be called with a block. That's the code between the do ... end keywords in the example call. Each call to yield runs the block in the caller. Arguments may be passed, in the example we pass a integer.

This is a powerful tool, Ruby is full of all sorts of internal iterators thanks to it:


    array.each_with_index do |item, i|
      # ...
    end

You can wrap/decorate chunks of code with them providing a very easy and intuitive usage. For example:


    File.open(filename) do |file|
      # ...
    end

opens a file, passes the filehandle to the block, and when the block is done it closes it.

Another example from my Rails Contributors application (simplified here):


    acquiring_sync_file do
      # ...
    end

The code within the block runs only if acquiring_sync_file could get a lock on some given sync file. If it could, once the block is done the lock is released.

Ruby Blocks in Perl

Perl does not have blocks like that, but you can emulate them in a way that may cover some use cases:


    acquiring_sync_file {
      # ...
    };

The trick comes from a feature of Perl that is not widely used: subroutine prototypes.

You may have noticed that some builtin Perl subroutines behave in a strange way. For example push:


    push @array, 5;

If push was an ordinary subroutine, it would receive the elements of @array and a 5 in @_, with no possible way to modify @array, which was lost by its evaluation in list context before the function call.

But push somehow is able to tell @array from the 5, and actually modify it. There's something special going on there. You can write you own subroutines like push and other thanks to subroutine prototypes. They are documented in perlsub.

Point is: if the first argument of a call is an anonymous subroutine, Perl forgives you the sub keyword if you set an appropriate prototype:


   sub twice(&) {
       my $coderef = shift;
       $coderef->(1);
       $coderef->(2);
   }
 
   # as if it said twice sub { ... }
   twice {
       my $i = shift;
       print "croak! ($i)\n";
   };

The rationale for that is to give you a way to write your own subroutines with map-like syntax. As you see, we can pass arguments as well.

Opening a file with automatic close would be:


   sub xopen(&@) { ... }
 
   xopen {
       my $fh = shift;
       # ...
   } $mode, $filename;

As you see xopen arguments would go after the anonymous subroutine, because this trick is only allowed if the coderef is the very first argument.

Drawbacks

The whole range of subroutine prototypes have a few gotchas, but what's specifically needed here works fine and some Ruby idioms can be emulated quite well. Subroutine prototypes do not work at all in methods though.

rubyisms

Simon Cozens' rubyisms Perl module provides a hackish yield so that you do not need to execute coderefs explicitly.

20 Oct 2009 »

An Object-Oriented Design for FluidDB Interfaces

Introduction

This post outlines the object-oriented design of Net::FluidDB.

This model may serve as a starting point to other OO libraries. Only the most important interface is documented, not every existing method. The purpose is to give a picture of how the pieces fit together.

Net::FluidDB is a Perl interface but there’s little Perl here. If you are interested in further implementation details please have a look at the source code. You can either download the module, or click the “Source” link at the top of each documentation page in CPAN.

Design Goals

Some design goals of Net::FluidDB:

To offer a higher abstraction to FluidDB than the plain REST API.
To find a balance between a normal object-oriented interface and performance, since most operations translate to HTTP calls.
To provide robust support for value types keeping usage straightforward.

Goal (1) means that you should be able to work at a model level. For instance, given that tags are modeled you should able to pass them around. Users should be able to tag an object with a Net::FluidDB::Tag instance and a value, for example. For convenience they can tag with a tag path as well, but there has to be a complete interface at the model level.

Goal (2) is mostly accomplished via lazy attributes. For example, one would expect that a tag has an accessor for its namespace but it wouldn’t be good to fetch it right away, so we load it on-demand.

Goal (3) is Perl-specific and I plan a separate post for it. The problem to address here is that the FluidDB types null, boolean, integer, float, and string have no direct counterpart in Perl, because Perl handles all of them under a single scalar type.

Communication

Net::FluidDB

FluidDB has a REST interface and thus you need a HTTP client to talk to it. Net::FluidDB in particular uses a very mature Perl module called LWP::UserAgent.

Calls to FluidDB need to set authentication, Accept or Content-Type headers, payload, … It is good to encapsulate all of that for the rest of the library:

Of course some defaults may be handy, like a default protocol, host, or environment variables for credentials. The constructor new_for_testing() gives an instance pointing to the sandbox with test/test for people to play around.

Net::FluidDB::JSON

FluidDB uses JSON for native values and structured payloads, so you’ll need some JSON library. Net::FluidDB uses JSON::XS at this moment with a little configuration. That’s encapsulated in Net::FluidDB::JSON:

The actual class has a few more methods for goal (3), but that’s a different post.

Future Extensions

FluidDB may speak more protocols and serialisation formats in the future. When that happens it may be the case we need a few more abstractions to plug them into the library. But for the time being this seems simple and enough.

Resources

Net::FluidDB::Base

Objects, tags, namespaces, policies, permissions and users, need an instance of Net::FluidDB and of Net::FluidDB::JSON to be able to talk to FluidDB. We set up a root class for them:

(Labels in the previous diagram have no Net::FluidDB namespace because the image was too wide with them, but classes do belong to the Net::FluidDB namespace.)

Net::FluidDB::Object

Objects are:

The signature of the tag() method is quite flexible. For example, you can pass either a Net::FluidDB::Tag instance to it, or a tag path. You can tag with native or non-native values. Values may be either scalars plus options, or self-contained instances of Net::FluidDB::Value, not covered in this post. I plan to support tagging given a filename with automatic MIME type, etc.

In Perl and others it is fine to offer a single method like tag() whose behaviour depends on the arguments when the contract is clear and having a single method pays off. Some other programming languages may prefer to split tag() into multiple methods with different names or signatures.

The signature of the value() method also accepts a Net::FluidDB::Tag instance or a tag path.

Net::FluidDB::HasObject

Tags, namespaces, and users have a canonical object for them in FluidDB. You have its object_id, and a lazy object() getter. You would use this object for example to tag those resources themselves.

We factor the common functionality out to a Moose role. A role is like a Ruby mixin. A bunch of attributes and methods that can’t be instantiated by themselves, but can be somehow thrown into a class definition as if they were part of it:

Net::FluidDB::HasPath

Tags and namespaces also have some common stuff modeled as a role:

There’s an interesting bit here: Both name and path are lazy attributes.

Net::FluidDB::HasPath is thought to be consumed by classes that implement a parent() accessor. By definition, the parent of a namespace is its parent namespace, and the parent of a tag is its containing namespace.

In general, instances make sense as long as they have either a path, or a name and a parent. If you set a path instances will lazily sort out their name and parent if asked to. Given a parent and a name, instances may compute their path if needed. This is easily implemented thanks to the builtin support for lazy attributes in Moose.

Net::FluidDB::Tag

Tags are modeled like this:

The namespace() reader loads the namespace a tag belongs to lazily.

Net::FluidDB::Namespace

Namespaces are similar to tags:

Again, the parent() getter loads the parent namespace on-demand.

Net::FluidDB::ACL

Both policies and permissions have an open/closed policy, and a set of exceptions. Net::FluidDB::ACL provides stuff common to both:

Net::FluidDB::Policy

With that in place policies are:

Net::FluidDB::Permission

And permissions are:

Net::FluidDB::User

Finally, users are pretty simple:

Credits

These cool diagrams are a courtesy of yUML.

Syndicated 2009-10-20 22:48:52 from FluidThinking

17 Oct 2009 (updated 17 Oct 2009 at 09:29 UTC) »

Moose

Not blogging too much lately, I am a bit overclocked with all the work and university classes. Though in the positive side I enjoy my work and it feels like I am doing my hobbies full-time, which is actually pretty cool.

My open source work these days concentrates around the Perl interface for FluidDB, Net::FluidDB, the Rails documentation project docrails, and writing the Active Support Core Extensions Rails Guide, which I'd like to finish for Rails 3. I am doing Net::FluidDB from Fluidinfo by the way.

With Net::FluidDB I've discovered Moose. That's the de facto standard nowadays for writing object-oriented Perl and it is fucking awesome. Very concise, good abstractions, and powerful.

Moose supports a lot of stuff, for instance attributes are declared. You can declare also their visibility, predicates, whether they are required, or lazy... There's multi-inheritance and roles (think Ruby mixins). Method delegation, method modifiers (before callbacks and such as in CLOS or Rails). Meta-classes and other stuff for meta-programming... It's an amazing module, doing OO with it is a joy. And it is very well designed in my view, usage feels just right.

Even if you are not into Perl I think the Moose Manual is worth having a look at.

8 Sep 2009 (updated 19 Sep 2009 at 22:05 UTC) »

FluidDB Terminology

<h2>Introduction</h2>

This post explains the terminology I’ve come up with after two weeks working in Net::FluidDB.

There’s no Perl in this post because albeit my laboratory is said module what I want to communicate are abstractions, not some particular implementation. We talk here about nouns and verbs, and introduce a model to some extent.

Note the focus is on a chosen terminology, this post does not explain the involved concepts themselves. The high-level docs explain FluidDB as such. <h2>Objects</h2>

We start with objects. Objects are the central entities in FluidDB. They have an id, which is a meaningless UUID, and an about attribute, which is an optional arbitrary string.

An object knows the paths of the existing tags on it, that’s a set of strings called tag_paths. The following section explains what is a tag and a path. <h2>Tags</h2>

The next most important entity in FluidDB is the tag. For example, “fxn/was-here”.

Tags have a description, which is a string attribute, and can be indexed, a boolean. They have also an object, explained later.

Tags have a path, “fxn/was-here”, and a name. The name is the rightmost fragment of the path, “was-here” in the previous example.

Each tag belongs to a namespace. Namespaces are explained later.

To tag an object you associate a tag and (optionally) a value to it.

In an object-oriented language tagging an object could look like this:

    object.tag(rating, 10)

The method name is a verb, and the first argument is a tag.

A library may provide a convenient way to tag an object given a tag path:

    object.tag("fxn/rating", 10)

There “fxn/rating” acts as an identifier, it points to the tag with that path, if any. In fact that’s what the REST API asks for, but that’s low-level stuff, the schema we are presenting runs at a higher level.

In a dynamic language you can have such a dynamic signature. In a statically typed language you would probably have different methods for different signatures. But that’s not important for the mental model we are building.

Tags in FluidDB are not typed. You could tag an object as having an “fxn/rating” of 10, and tag another object as having an “fxn/rating” of “five stars”. Values are typed. <h2>Values</h2>

Tagging involves an object, a tag, and optionally a value. Values are typed. There’s some technical stuff related to encodings and such, but for the purposes of this post I think we do not need to go further. <h2>Namespaces</h2>

To organize tags FluidDB provides namespaces.

Namespaces have a description, which is a string attribute, and an object, explained later.

Namespaces can contain other namespaces, and tags. Tags cannot, tags are leaves.

The namespace_names attribute of a namespace is the possibly empty set of the names of its children namespaces. The tag_names attribute of a namespace is the possibly empty set of the names of its tags.

Each namespace that is not top-level has a parent. A concrete implementation may define the parent of a root namespace to be some sort of null object.

Any namespace has a path, and a name, akin to tags. The namespace with path “fxn/reading/books” has name “books”. Its parent has path “fxn/reading”, and name “reading”.

You can compute the path of any child namespace or tag from the path of the containing namespace and their respective names. <h2>Permissions</h2>

Each possible action on each existing tag and namespace has a permission associated with it. A permission consists of a policy and an exception list, to be applied to a certain category and action.

The policy may be open, or closed, and the exception list is a set of usernames. (Note: FluidDB in general lacks ordered collections, read “set” when you see “list”.) <h2>Policies</h2>

Each possible action on tags and namespaces have a default set of permissions. When you create a tag or a namespace, each one of the possible actions gets such defaults. Each of those defaults is called by definition a policy.

There’s a name clash here which is not good. It is inherited from the API. I’ve departed in some places from the API, but I believe we need to stick to it in this case: A policy consists of a policy and an exception list, to be associated with certain category and action on behalf of a certain user.

The policy attribute may be open or closed, and the exception list is a set of usernames. (Note: FluidDB in general lacks ordered collections, read “set” when you see “list”.) <h2>Users</h2>

Users have a name, a username, and an object, explained in the following section. <h2>Where are the IDs?</h2>

If you are familiar with the API you may be wondering where did the IDs of tags, namespaces, and users go, and what are those objects I’ve mentioned.

Tags, namespaces, and users are not FluidDB objects themselves. They have no ID, they have no about, you can delete them.

The proper identifier of a tag or a namespace in the system is their paths, and the one of a user its username.

FluidDB, however, creates an object for each tag, namespace, and user. They can be found in their object attribute. So, for example, if you wanted to tag the user whose username is “fxn” there’s a canonical object in the system for it. You can tag that object, but you cannot tag the user itself.

If the user is deleted, the corresponding object is not. Remember, objects are immortal. In particular if the user was tagged the tags are still there, with the object that represented it in FluidDB. This parallels the object for any other thing in life that once existed.

Syndicated 2009-09-08 02:39:54 from FluidThinking

521 older entries...