29 Jan 2010 fxn   » (Master)

Tracking Class Descendants in Ruby

I am going through all Active Support core extensions lately because I am writing the Active Support Core Extensions guide, due for Rails 3. There are some patches in master as a result of that walkthrough, and I am now focusing on keeping track of descendants in a class hierarchy.

A known technique uses ObjectSpace.each_object. That method receives a class or module as argument and yields every live object that is a kind_of? that class or module. Since classes are instances of the class Class, you can select the descendants of a class C this way:


    descendants_of_C = []
    ObjectSpace.each_object(Class) do |klass|
      descendants_of_C << klass if klass < C
    end

That brute force approach works, but it is inefficient because it walks every class in the process on each call. JRuby even disables ObjectSpace by default for performance reasons.
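As a self-contained illustration of that scan, here is a minimal sketch. The Shape hierarchy is hypothetical, and the singleton_class? filter is an addition of mine, assuming Ruby 2.1 or later: singleton classes of instances can also turn up in the scan and would otherwise count as descendants.

```ruby
# Hypothetical hierarchy, used only for illustration.
class Shape; end
class Polygon < Shape; end
class Square < Polygon; end

# Walk every live Class object. `klass < Shape` is true only for strict
# descendants (false for Shape itself, nil for unrelated classes).
shape_descendants = ObjectSpace.each_object(Class).select do |klass|
  klass < Shape && !klass.singleton_class?
end

# Scan order is not guaranteed, so sort before printing.
p shape_descendants.sort_by(&:name)  # => [Polygon, Square]
```

Every call repeats the full walk over all classes, which is where the cost comes from.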

A better approach is to leverage the inherited hook. Classes may optionally implement a class method inherited that is called whenever they are subclassed. The subclass is passed as argument:


    class User
      def self.inherited(subclass)
        puts 0
      end
    end
 
    class Admin < User
      puts 1
    end
 
    # output is
    # 0
    # 1

That's a perfect place to keep track of descendants:


    class C
      class << self
        def inherited(subclass)
          C.descendants << subclass
          super
        end
 
        def descendants
          @descendants ||= []
        end
      end
    end

In that code we have an array of descendants in @descendants. That is an instance variable of the class C itself. Remember that classes are ordinary objects in Ruby, and so they may have instance variables. An instance variable is a better choice than a class variable here because class variables are shared by the entire hierarchy of the class, and we need an array that belongs to C alone.
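The difference between the two kinds of variable can be seen in a small sketch. The class names and accessors below are hypothetical and exist only to show the sharing behavior:

```ruby
# Hypothetical classes, for illustration only.
class Base
  @@registry = []        # class variable: one array for the whole hierarchy
  @registry  = [:base]   # instance variable of the Base class object only

  def self.shared_registry
    @@registry
  end

  def self.own_registry
    @registry            # resolved against self, so each class sees its own
  end
end

class Child < Base
  @registry = [:child]   # Child's own instance variable, separate from Base's
end

Child.shared_registry << :added_via_child

p Base.shared_registry   # => [:added_via_child]
p Base.own_registry      # => [:base]
p Child.own_registry     # => [:child]
```

The write made through Child is visible from Base because both names reach the same class variable, while each class object keeps its own @registry.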

Another fine point is that we force descendants to be the one in the C class. If we didn't, and we had A < B < C, the hook would be invoked on B when A was defined, so by polymorphism it would be B.descendants that got called, setting B's instance variable @descendants instead of C's. That is not what we want.
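To see what goes wrong without the explicit receiver, here is a sketch of the same hook with a bare call to descendants. Root, Middle and Leaf are hypothetical names standing in for C, B and A:

```ruby
class Root
  class << self
    def inherited(subclass)
      descendants << subclass   # implicit receiver: self is the direct parent
      super
    end

    def descendants
      @descendants ||= []
    end
  end
end

class Middle < Root; end   # hook runs with self == Root
class Leaf < Middle; end   # hook runs with self == Middle

p Root.descendants    # => [Middle]
p Middle.descendants  # => [Leaf]
```

Leaf never reaches Root's array: it lands in Middle's own @descendants, so the tracking is split across the hierarchy.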

The call to super is just a best practice. In general a hook like this should pass the call up the hierarchy in case parents have their own hooks.
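A short sketch of why super matters, assuming a hypothetical hierarchy in which both a parent and a grandparent define the hook. CALLS is just an illustrative log:

```ruby
CALLS = []  # hypothetical log of hook invocations

class Grand
  def self.inherited(subclass)
    CALLS << [:grand_hook, subclass]
    super
  end
end

class Parent < Grand
  def self.inherited(subclass)
    CALLS << [:parent_hook, subclass]
    super   # without this call Grand's hook would never see Kid
  end
end

class Kid < Parent; end

p CALLS
# => [[:grand_hook, Parent], [:parent_hook, Kid], [:grand_hook, Kid]]
```

The last entry exists only because Parent's hook passes the call up. The default Class#inherited at the top of the chain does nothing, so calling super is always safe.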

That pattern can be packaged in a module for reuse:


    module DescendantsTracker
      def self.included(base)
        (class << base; self; end).class_eval do
          define_method(:inherited) do |subclass|
            base.descendants << subclass
            super(subclass) # define_method bodies need explicit super arguments
          end
        end
        base.extend self
      end
 
      def descendants
        @descendants ||= []
      end
    end
 
    class C
      include DescendantsTracker
    end

A class only needs to include DescendantsTracker to track its descendants.
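A usage sketch follows. It repeats the module so that it runs on its own, with one adjustment: super receives its argument explicitly, because current Ruby versions raise an error for a bare super inside a define_method body. The Vehicle hierarchy is hypothetical.

```ruby
module DescendantsTracker
  def self.included(base)
    (class << base; self; end).class_eval do
      define_method(:inherited) do |subclass|
        base.descendants << subclass   # base is captured by the closure
        super(subclass)                # explicit argument, see note above
      end
    end
    base.extend self
  end

  def descendants
    @descendants ||= []
  end
end

# Hypothetical hierarchy, for illustration only.
class Vehicle
  include DescendantsTracker
end

class Car < Vehicle; end
class SportsCar < Car; end
class Truck < Vehicle; end

p Vehicle.descendants  # => [Car, SportsCar, Truck]
p Car.descendants      # => []
```

All descendants are recorded on the including class, in definition order. Car responds to descendants too, but its own array stays empty because the hook always writes to Vehicle.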

When the module is included in a class, Ruby invokes the module's included hook. The hook receives the class that is including the module, and we leverage that to inject the class methods we saw before. For inherited we open the metaclass of base and define the method with a block, so that base stays in scope; as explained above, we need that to always reach the descendants array of the including class. After that we add the descendants class method with an ordinary extend call.

Update: There's a followup to this post.
