Monday, August 06, 2007

Ruby: Lazily Initialized Attributes

Update: This is probably going to be included in Refactoring, Ruby Edition, so I took another pass through it following all of your comments. Thanks for the feedback.

Initialize an attribute on access instead of at construction time.

class Employee
  def initialize
    @emails = []
  end
end

==>

class Employee
  def emails
    @emails ||= []
  end
end

Motivation
The motivation for converting attributes to be lazily initialized is for code readability purposes. While the above example is trivial, when the Employee class has multiple attributes that need to be initialized, the constructor has to contain all of the initialization logic. Classes that initialize instance variables in the constructor need to worry about both attributes and instance variables. The procedural work of initializing each attribute in a constructor is unnecessary, and it is less maintainable than a class that deals exclusively with attributes. Lazily Initialized Attributes can encapsulate all their initialization logic within the methods themselves.

Mechanics
  • Move the initialization logic to the attribute getter.
  • Test

Example using ||=
The code below is an Employee class with the emails and voice_mails attributes initialized in the constructor.

class Employee
  attr_reader :emails, :voice_mails

  def initialize
    @emails = []
    @voice_mails = []
  end
end

Moving to a Lazily Initialized Attribute generally means moving the initialization logic to the getter method and initializing on the first access.

class Employee
  def emails
    @emails ||= []
  end

  def voice_mails
    @voice_mails ||= []
  end
end
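
As a rough sketch of what "initializing on the first access" means in practice, the snippet below checks the instance variable before and after the getter is called. The variable names (defined_before, defined_after, same_object) and the example address are only for illustration and are not part of the refactoring itself.

```ruby
class Employee
  def emails
    @emails ||= []
  end
end

employee = Employee.new

# Nothing has been allocated yet: the instance variable does not exist.
defined_before = employee.instance_variable_defined?(:@emails)

# The first call to the getter creates the array.
employee.emails << "jay@example.com"
defined_after = employee.instance_variable_defined?(:@emails)

# Later calls hand back the same array rather than building a new one.
same_object = employee.emails.equal?(employee.emails)
```

Here defined_before is false, defined_after is true, and same_object is true, which is why code inside the class should go through the getter rather than reading @emails directly.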

Example using instance_variable_defined?
Using ||= for Lazily Initialized Attributes is a common idiom; however, this idiom falls down when nil or false are valid values for the attribute.

class Employee...
  def initialize
    @assistant = Employee.find_by_boss_id(self.id)
  end

In the above example it's not practical to use the ||= operator for a Lazily Initialized Attribute because find_by_boss_id might return nil. When nil is returned, @assistant stays nil, so every access of the assistant attribute makes another database trip. A better solution is code like the example below, which uses the instance_variable_defined? method that was introduced in Ruby 1.8.6.

class Employee...
  def assistant
    @assistant = Employee.find_by_boss_id(self.id) unless instance_variable_defined? :@assistant
    @assistant
  end
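
To make the difference concrete, here is a minimal sketch that counts lookups. FakeDirectory and its find_assistant method are hypothetical stand-ins for the database call (they are not ActiveRecord), and the two getter names exist only so that both styles can sit side by side in one class.

```ruby
# Hypothetical stand-in for a database: counts lookups and never finds anyone.
class FakeDirectory
  attr_reader :lookups

  def initialize
    @lookups = 0
  end

  def find_assistant
    @lookups += 1
    nil
  end
end

class Employee
  def initialize(directory)
    @directory = directory
  end

  # ||= sees the cached nil as "not set yet" and repeats the lookup.
  def assistant_with_or_equals
    @assistant_a ||= @directory.find_assistant
  end

  # instance_variable_defined? remembers that the lookup already ran,
  # even though the result was nil.
  def assistant_with_defined_check
    @assistant_b = @directory.find_assistant unless instance_variable_defined?(:@assistant_b)
    @assistant_b
  end
end

or_equals_directory = FakeDirectory.new
or_equals_employee = Employee.new(or_equals_directory)
3.times { or_equals_employee.assistant_with_or_equals }

defined_directory = FakeDirectory.new
defined_employee = Employee.new(defined_directory)
3.times { defined_employee.assistant_with_defined_check }
```

After three accesses each, or_equals_directory.lookups is 3 while defined_directory.lookups is 1.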

16 comments:

  1. Martin Plöger, 4:20 PM

    I really like that idea. But when accessing the attributes you'll have to use the getter. Otherwise the attribute might still not be initialized. (I always use the getter. Seems good style to me.)
    So: great idea!

  2. That's not quite bullet proof though. It breaks where false is a legal value, but isn't the default, and when nil is a legal (but not default) value. The bulletproof approach is:

    def emails
      unless instance_variable_defined? :@emails
        @emails = []
      end
      return @emails
    end

    It'd be very handy if you could write:

    attr_accessor :emails {|instance| ... }

    which would use the block to generate the default value for the attribute. It's a simple matter of programming I suppose...

  3. Piers,
    Thanks for the tip on instance_variable_defined?. I'm currently still on 1.8.5, but I went ahead and updated the example since I expect many people have moved on to 1.8.6

    As far as creating an attr_accessor with a default value: I'd suggest solving this RubyQuiz (http://rubyquiz.com/quiz67.html) if you want a really bullet proof version.

    I was looking for the simple version to include in Refactoring, Ruby Edition, which was my actual motivation for this post.

    Cheers, Jay

  4. Nice quiz. It's quite good fun solving it so that you don't need any conditional code in the body of your generated methods...

  5. I am sure I am missing something... Why can't you just say:

    class Employee
      def emails
        @emails ||= []
        @emails
      end
    end

  6. Michael, 12:39 AM

    "The motivation for converting attributes to be lazily initialized is for code readability purposes."

    I thought the purpose of Lazy Initialization was to make the code more flexible while sacrificing some readability. Since with lazy initialization there is no longer a single spot to see how all variables are initialized. Also, getters take on two roles (getting and initializing) instead of just one.

  7. > But when accessing the attributes you'll have to use the getter.

    That's just a bonus :) Always use the getter...it's an attribute, leave the variable alone.

    > Since with lazy initialization there is no longer a single spot to see how all variables are initialized.

    Another place where really thinking of our_obj.emails as an attribute instead of a variable helps. It happens to use an instance variable to store itself, but that really is an implementation detail. Ideally, the only code touching @emails is in emails()

  8. Mario, you can if the variable cannot be false or nil (as in the example). However, to accommodate all situations, the implementation provided by Piers is perhaps more appropriate.

    Cheers, Jay

  9. I tend to agree that the motivation's a bit dodgy, especially where you're hand rolling the accessors. A RubyQuiz67 solution that meant you could write:

    attribute :emails {[]}
    attribute :content => "Replace this"

    Or, (harder to implement, but revealing more in the way of intent.)

    attributes {
      emails.default {[]}
      content.default "Replace this"
    }

    then you get declarative code that reveals your intent clearly.

    Without a class method to build the accessors for you, you're arguably better with a composed initialize along the lines of:

    def initialize(args = {})
      initialize_defaults
      initialize_attributes(args)
      yield self if block_given?
    end

    def initialize_defaults
      @emails = []
      @content = "Replace this"
    end

    The big win for lazy initialization comes when the code for generating the default is expensive and/or you expect that you won't be falling back on the default very often.

  10. Oh yes, I forgot to mention, another win for the attributes(&block) style of definition is that it would be relatively easy to extend along the lines of:

    attributes {
      emails.default(&{[]}).type(Array)
      contents.valid do |instance,newval|
        ! newval.blank?
      end
      whatever do
        type(Array)
        valid {...}
      end
    }

    I'm becoming more and more convinced that, if you want to see how to write effective DSL type code in Ruby you should be looking far more closely at RSpec than at ActiveRecord's class methods.

  11. why not just:

    class Employee
      def emails
        @emails ||= []
      end
    end

  12. Someone's probably posted this already, but everything in Ruby is an expression, so get rid of your unnecessary code.

    In other words, this:

    def emails
      @emails = [] unless instance_variable_defined? :@emails
      return @emails
    end

    is *exactly* the same as this

    def emails
      @emails = [] unless instance_variable_defined? :@emails
    end

    Also, given that @emails is supposed to be an array, and as such there is no sane reason for it to be false, you could just use the guard operator, and do this:

    def emails
      @emails ||= []
    end

    Shorter code has less opportunity for bugs, and is easier to maintain

    You can do yourself a massive favour by learning the ins and outs of whatever programming language you are using so that you too can write shorter code. Your QA department, boss, and customers will thank you for it

  13. Orion,

    def emails
      @emails = [] unless instance_variable_defined? :@emails
      return @emails
    end

    is *not* exactly the same as this

    def emails
      @emails = [] unless instance_variable_defined? :@emails
    end

    The 2nd example returns nil when @emails is defined.

  14. Anonymous, 5:42 AM

    damn, this is extremely obvious and insignificant. what's next, an entry on conditional statements? oh wait...

    seriously, it's no wonder why ruby programmers have a bad rep.

  15. I would say that using instance_variable_defined? is overly verbose in this case. Why not just use

    defined? @emails

    ... But just as I write this, I realize you can't dynamically generate a statement using defined? in this way. So instance_variable_defined? is probably a better solution after all.

  16. Steven, 1:36 PM

    require 'traits'
    class Employee
       has :emails => []
       has :assistant {
          Employee.find_by_boss_id(self.id)
       }
    end

