Jay Fields' Thoughts: extend


Friday, July 25, 2008

Ruby: Dwemthy's Array using Modules

Following my blog entry Underuse of Modules, an anonymous commenter asks:

Can you show another example, then, of how one might implement the "magic" of Dwemthy's Array (http://poignantguide.net/dwemthy/) just using modules? I can never remember how to do this sort of thing, and if modules can make it conceptually simpler it would be most useful.

I'm not sure exactly what magic they were referring to, but I'll assume they mean what allows creature habits to be defined in class definitions. Based on that assumption, I pulled out this code from the example.

class Creature
  def self.metaclass; class << self; self; end; end

  def self.traits( *arr )
    return @traits if arr.empty?

    attr_accessor(*arr)

    arr.each do |a|
      metaclass.instance_eval do
        define_method( a ) do |val|
          @traits ||= {}
          @traits[a] = val
        end
      end
    end

    class_eval do
      define_method( :initialize ) do
        self.class.traits.each do |k,v|
          instance_variable_set("@#{k}", v)
        end
      end
    end
  end
end

class Rabbit < Creature
  traits :bombs
  bombs 3
end

Rabbit.new.bombs # => 3

Note, I removed the comments and added the Rabbit code. The Rabbit is there to ensure the magic continues to function as expected.

The version using a module isn't significantly better, but I do slightly prefer it.

class Creature
  def self.traits( *arr )
    return @traits if arr.empty?

    attr_accessor(*arr)

    mod = Module.new do
      arr.each do |a|
        define_method( a ) do |val|
          @traits ||= {}
          @traits[a] = val
        end
      end
    end

    extend mod
    
    define_method( :initialize ) do
      self.class.traits.each do |k,v|
        instance_variable_set("@#{k}", v)
      end
    end
  end
end

class Rabbit < Creature
  traits :bombs
  bombs 3
end

Rabbit.new.bombs # => 3

The above version is a bit clearer to me because it defines methods on a module and then extends the module. I know that if I extend a module from the context of a class definition the methods of that module will become class methods.
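As a quick illustration of that rule, here is a minimal sketch. Labelled and Widget are hypothetical names, and singleton_class (available in newer Rubies) stands in for the metaclass helper used in the first example.

```ruby
# Hypothetical names, for illustration only.
module Labelled
  def label
    "#{name} label"
  end
end

class Widget
  extend Labelled   # self is Widget here, so the module's methods land on the class
end

Widget.label                                         # => "Widget label"
Widget.new.respond_to?(:label)                       # => false
Widget.singleton_class.ancestors.include?(Labelled)  # => true
```

The module sits in the ancestors of Widget's metaclass, which is why its methods are callable on the class but not on instances.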

Conversely, the first example forces me to think about where a method goes if I do an instance_eval on a metaclass. By definition all class methods are instance methods of the metaclass, but there are times when you can be surprised. For example, where def puts a method depends on whether you use instance_eval or class_eval, but define_method behaves the same in both (change metaclass.instance_eval in the original example to metaclass.class_eval and the behavior doesn't change). This type of thing is an easy concept that becomes a hard-to-find bug.
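The def versus define_method difference can be sketched as follows. Example and the method names are made up for illustration.

```ruby
# Hypothetical class and method names, for illustration only.
class Example; end
metaclass = class << Example; self; end

metaclass.class_eval do
  # def inside class_eval: an instance method of the metaclass,
  # which makes it a class method of Example.
  def from_class_eval; :class_eval; end
  define_method(:defined_in_class_eval) { :class_eval }
end

metaclass.instance_eval do
  # def inside instance_eval: a singleton method of the metaclass object itself,
  # so Example never sees it.
  def from_instance_eval; :instance_eval; end
  # define_method is sent to self (the metaclass), same as under class_eval.
  define_method(:defined_in_instance_eval) { :instance_eval }
end

Example.respond_to?(:from_class_eval)           # => true
Example.respond_to?(:from_instance_eval)        # => false
metaclass.respond_to?(:from_instance_eval)      # => true
Example.respond_to?(:defined_in_class_eval)     # => true
Example.respond_to?(:defined_in_instance_eval)  # => true
```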

If you spend enough time with metaclasses it's all clear and easy enough to follow. However, modules are generally straightforward and get you the same behavior without the mental overhead. I'm sure someone will argue that metaclasses are easier to understand, which is fine. Use what works best for you.

However, there are other reasons why it might make sense to use modules instead, such as wanting to have an ancestor (and thus the ability to redefine and use super).
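To make the ancestor point concrete, here is a sketch that reuses the module-based Creature from above. The doubling override in Rabbit is a made-up requirement, used only to show super reaching the generated method.

```ruby
class Creature
  def self.traits(*arr)
    return @traits if arr.empty?

    attr_accessor(*arr)

    mod = Module.new do
      arr.each do |a|
        define_method(a) do |val|
          @traits ||= {}
          @traits[a] = val
        end
      end
    end

    extend mod

    define_method(:initialize) do
      self.class.traits.each do |k, v|
        instance_variable_set("@#{k}", v)
      end
    end
  end
end

class Rabbit < Creature
  traits :bombs

  # Hypothetical override: the generated bombs method lives in an ancestor
  # module of Rabbit's metaclass, so super can still reach it.
  def self.bombs(val)
    super(val * 2)
  end

  bombs 3
end

Rabbit.new.bombs # => 6
```

With the metaclass version, the generated method and def self.bombs are both defined on the same metaclass, so the later definition replaces the earlier one and super has nothing to call.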

Again, it probably comes down to personal preference.

Thursday, July 24, 2008

Ruby: Underuse of Modules

When I began seriously using Ruby I noticed two things that I didn't like about the language.

There's no method to get the metaclass.
Some people create modules as a hint that they reopened the class.

It's a few years later and I've noticed a few interesting things.

I can't remember the last time I actually wanted a method to get the metaclass.
If you use a module to add behavior, that behavior becomes part of the ancestor tree, which is significantly more helpful than putting your behavior directly on the class.

No metaclass method necessary
In April 2005, Why gave us Seeing Metaclasses Clearly. I'm not sure the article actually helped me see metaclasses clearly, but I know I pasted that first block of code into several of my first Ruby projects. I used metaclasses in every way possible, several of which were probably inappropriate, and saw exactly what was possible, desirable, and painful.

I thought I had a good understanding of the proper uses for metaclasses, and then Ali Aghareza brought me a fresh point of view: defining methods on a metaclass is just mean.

We were on the phone talking about who knows what, and he brought up a blog entry I'd written where I dynamically defined delegation methods on the metaclass based on a constructor argument. He pointed out that doing so limited your ability to change the method behavior in the future. I created some simple examples and found out he was right, which led to my blog entry on why you should Extend modules instead of defining methods on a metaclass.

Ever since that conversation and the subsequent blog entry, I've been using modules instead of accessing the metaclass directly. "Just in case someone wants to redefine behavior" isn't really a good enough reason for me if the level of effort increases, but in this case I found the code to be easier to follow when I used modules. In programming, there are few win-win situations, but Ali definitely showed me one on this occasion.

If you interact with a metaclass directly, do a quick spike where you introduce a module instead. I think you'll be happy with the resulting code.
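A spike along those lines might look like the sketch below. It is not the code from the earlier blog entry. MetaclassProxy, ModuleProxy, and Shouting are hypothetical names standing in for delegation methods defined from constructor arguments.

```ruby
# Hypothetical classes, for illustration only.
class MetaclassProxy
  def initialize(target, *names)
    metaclass = class << self; self; end
    names.each do |name|
      # Defined directly on the instance's metaclass.
      metaclass.send(:define_method, name) { |*args| target.send(name, *args) }
    end
  end
end

class ModuleProxy
  def initialize(target, *names)
    delegations = Module.new do
      names.each do |name|
        define_method(name) { |*args| target.send(name, *args) }
      end
    end
    extend delegations   # the delegation methods become an ancestor
  end
end

# Later behavior that wants to build on the delegated method.
module Shouting
  def upcase
    "#{super}!"
  end
end

ModuleProxy.new("hi", :upcase).extend(Shouting).upcase    # => "HI!"
MetaclassProxy.new("hi", :upcase).extend(Shouting).upcase # => "HI"
```

In the metaclass variant the method on the metaclass is found before Shouting, so the new behavior is silently ignored. In the module variant Shouting is checked first and reaches the delegation through super.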

Include modules instead of reopening classes
In January of 2007 I wrote a blog entry titled Class Reopening Hints. I didn't write it because I thought it was very valuable; I wrote it so developers afraid of open classes could get some sleep at night. Those guys think we are going to bring the end of the world with our crazy open classes, and I wanted to let them know we'd at least thought about the situation.

I'm really not kidding: I thought the entry was a puff piece, but it made some people feel better about Ruby so I put it together. I never followed the Use modules instead of adding behavior directly advice though, and I don't think many other Rubyists did either. It was extra effort, and I didn't see the benefit. In over two and a half years working with Ruby I've never once had a problem finding where behavior was defined. With that kind of experience, I couldn't justify the extra effort of defining a module -- until the day I wanted to change the behavior of Object#expects (defined by Mocha). I was able to work around the fact that Mocha defines the expects method directly on Object, but the solution was anything but pretty.

It turns out, using modules instead of adding behavior directly to a reopened class has one large benefit: I can easily define new behavior on a class by including a new module. If you only need new behavior, then defining a new method on the class would be fine. But, if you want to preserve the original behavior, having it as an ancestor is much better.

Take the following example. This example assumes that a method hello has been defined on Object. Your task is to change the hello method to include the original behavior and add a name.

# original hello definition
class Object
  def hello
    "hello"
  end
end

# your version with additional behavior
class Object
  alias old_hello hello
  def hello(name)
    "#{old_hello} #{name}"
  end
end

hello("Ali") # => "hello Ali"

That code isn't terrible. In fact, there are a few different ways to redefine methods and access the original behavior, but none of them look as nice as the following example.

# original hello definition
class Object
  module Hello
    def hello
      "hello"
    end
  end
  include Hello
end

# your version with additional behavior
class Object
  module HelloName
    def hello(name)
      "#{super()} #{name}"
    end
  end
  include HelloName
end

hello("Ali") # => "hello Ali"

When you have an ancestor, the behavior is only a super call away.
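The ancestor chain itself can be inspected to see why super works. The sketch below uses a hypothetical Greeter class rather than Object, only to keep the example from touching every object in the process.

```ruby
# Hypothetical Greeter class, standing in for Object in the example above.
class Greeter
  module Hello
    def hello
      "hello"
    end
  end
  include Hello
end

class Greeter
  module HelloName
    def hello(name)
      "#{super()} #{name}"
    end
  end
  include HelloName
end

# The most recently included module is checked first, then the earlier one.
Greeter.ancestors.take(3) # => [Greeter, Greeter::HelloName, Greeter::Hello]
Greeter.new.hello("Ali")  # => "hello Ali"
```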

Note: Yes, I've also reopened the class to include the module, but when I talk about reopening the class I'm talking about defining the behavior directly on the reopened class. I could have also included the module by using Object.send :include, HelloName. Do whichever you like; it's not pertinent to this discussion.

Prefer Modules to metaclasses and reopened classes
The previous example illustrates why you should prefer modules: they give simple access to your method behavior to anyone who wishes to alter but reuse the original behavior. This fact applies to both classes that include modules and instances that extend modules.

So why didn't Matz give us first class access to the metaclass? Who cares. He probably knew extending modules was a better solution, but even if he didn't -- it is. He didn't give you a method to access the metaclass, and whether he knew it or not, you don't need it.

Monday, April 07, 2008

Alternatives for redefining methods

Ruby's open classes allow you to define and redefine behavior pretty much at will; unfortunately, almost every option comes with caveats.

The example below is a gateway class that defines a process method. For the purposes of the example, assume that we need to redefine the process method on Gateway itself and call the original process method.* Also, assume that Gateway is not our class, so we cannot easily alter the original definition of process.

class Gateway
  def process(document)
    p "gateway processed document: #{document}"
  end
end

Gateway.new.process("hello world") 
# >> "gateway processed document: hello world"

Solution 1: alias

The following example uses alias to redefine the process method.

class Gateway
  alias old_process process
  def process(document)
    p "do something else"
    old_process(document)
  end
end

Gateway.new.process("hello world") 
# >> "do something else"
# >> "gateway processed document: hello world"

The example above creates an alias (old_process) for the process method. With an alias in place you can redefine the process method to anything you want and call the old_process method using the alias. This is probably the easiest solution and the most commonly used solution.

Unfortunately, it's not without problems. First of all, if someone redefines old_process you will get unexpected behavior. Second of all, the old_process method is really nothing more than an artifact of the fact that you have no other way to refer to the original method definition. Lastly, if the code is loaded twice, an infinite loop is created that causes the always-painful-to-see 'stack level too deep' error.
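The double-load problem can be reproduced with a small sketch. load_alias_patch is a hypothetical helper that stands in for loading the file containing the alias code, and process returns a string here instead of printing.

```ruby
class Gateway
  def process(document)
    "gateway processed document: #{document}"
  end
end

# Hypothetical helper: each call re-runs the alias patch, as a second load would.
def load_alias_patch
  Gateway.class_eval do
    alias_method :old_process, :process
    def process(document)
      old_process(document)
    end
  end
end

load_alias_patch
first = Gateway.new.process("hello world")
# first is "gateway processed document: hello world"

# Second load: old_process now points at the patched process,
# which calls old_process, which is itself.
load_alias_patch
second = begin
  Gateway.new.process("hello world")
rescue SystemStackError => e
  e.class
end
# second is SystemStackError ("stack level too deep")
```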

Solution 2: alias_method_chain
Like I said, solution 1 is the most popular way to redefine a method. In fact, it's so popular that Rails defines the alias_method_chain method to encapsulate the pattern. From the comment above alias_method_chain in the Rails source:

# Encapsulates the common pattern of:
#
# alias_method :foo_without_feature, :foo
# alias_method :foo, :foo_with_feature
#
# With this, you simply do:
#
# alias_method_chain :foo, :feature
#
# And both aliases are set up for you.
#
# Query and bang methods (foo?, foo!) keep the same punctuation:
#
# alias_method_chain :foo?, :feature
#
# is equivalent to
#
# alias_method :foo_without_feature?, :foo?
# alias_method :foo?, :foo_with_feature?
#
# so you can safely chain foo, foo?, and foo! with the same feature.

Using alias_method_chain we can define our Gateway as the example below.

class Gateway
  def process_with_logging(document)
    p "do something else"
    process_without_logging(document)
  end
  alias_method_chain :process, :logging
end

Gateway.new.process("hello world") 
# >> "do something else"
# >> "gateway processed document: hello world"

Using alias_method_chain is nice because it's something familiar to many Rails developers. Unfortunately, it also suffers from the same problems as using alias on your own.
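For readers without Rails at hand, the pattern from the quoted comment can be sketched in plain Ruby. simple_alias_chain below is a hypothetical stand-in, not the Rails implementation. It skips the ?/! punctuation handling, and process returns an array of messages instead of printing.

```ruby
# Hypothetical stand-in for alias_method_chain; not the Rails implementation.
module SimpleAliasChain
  def simple_alias_chain(target, feature)
    alias_method :"#{target}_without_#{feature}", target
    alias_method target, :"#{target}_with_#{feature}"
  end
end

class Gateway
  extend SimpleAliasChain

  def process(document)
    ["gateway processed document: #{document}"]
  end

  def process_with_logging(document)
    ["do something else"] + process_without_logging(document)
  end

  simple_alias_chain :process, :logging
end

Gateway.new.process("hello world")
# => ["do something else", "gateway processed document: hello world"]
```

The two artifact methods (process_with_logging and process_without_logging) remain on the class, which is why the caveats from solution 1 still apply.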

Solution 3: Close on an unbound method
The following code uses the class method "instance_method" to assign the "process" method (as an unbound method) to a local variable. The "process_method" local variable is in scope of the closure used to define the new process method, so it can be used within the process definition. Calling an unbound method is as simple as binding it to any instance of the class that it was unbound from and then using the call method.

class Gateway
  process_method = instance_method(:process)
  define_method :process do |document|
    p "do something else"
    process_method.bind(self).call(document)
  end
end

Gateway.new.process("hello world") 
# >> "do something else"
# >> "gateway processed document: hello world"

I've always preferred this solution because it doesn't rely on artifact methods that may or may not collide with other method definitions. Also, if the code is loaded multiple times the behavior is altered multiple times, but I find that easier to diagnose than when my only clue is "stack level too deep".

Unfortunately, this solution is not without flaws. Firstly, it relies on the fact that define_method uses a closure and has access to the unbound method. Of course this also implies that you have a handle on anything else defined in the same context. As with any closure, it's possible to accidentally create a memory leak. Also, (in MRI) I'm told that define_method takes 3 times as long to execute when compared to def.
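The multiple-load behavior mentioned above can be sketched like this. load_closure_patch is a hypothetical helper standing in for loading the file twice, and process returns an array of messages instead of printing.

```ruby
class Gateway
  def process(document)
    ["gateway processed document: #{document}"]
  end
end

# Hypothetical helper: each call re-applies the closure-based redefinition.
def load_closure_patch
  Gateway.class_eval do
    process_method = instance_method(:process)   # captures whatever process is right now
    define_method(:process) do |document|
      ["do something else"] + process_method.bind(self).call(document)
    end
  end
end

load_closure_patch
load_closure_patch

result = Gateway.new.process("hello world")
# => ["do something else", "do something else", "gateway processed document: hello world"]
```

Each load wraps the previous definition once more, so the extra behavior runs twice, but the call still terminates.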

Solution 4: Extend a module that redefines the method and uses super
This solution relies on creating a module with the new behavior and extending an instance with the module. Since the instance extends the module, the module is checked first for the method definition when "process" is called (because it's the first ancestor). Since the module is the first ancestor it can use super to execute the process method defined in Gateway (the second ancestor).

module ProcessLogging
  def process(document)
    p "do something else"
    super
  end
end

Gateway.new.extend(ProcessLogging).instance_eval("class << self; self; end").ancestors 
# => [ProcessLogging, Gateway, Object, Kernel]
Gateway.new.extend(ProcessLogging).process("hello world")
# >> "do something else"
# >> "gateway processed document: hello world"

This solution is my favorite because I can use def and super and never worry about creating any memory leaks or artifact methods.

Of course, it assumes that you get the opportunity to extend instances of the class. However, I haven't found that to be a problematic requirement.

* There are generally other options such as delegation, defining hooks, etc. Often I find these to be cleaner solutions and try that route first. But, sometimes redefining a method cannot be avoided.