Jay Fields' Thoughts: April 2007

Friday, April 27, 2007

Ruby: Validatable 1.2.2 released

The Validatable gem received a fair amount of attention in the past few weeks. Thanks to Ali Aghareza, Jason Miller, Xavier Shay, and Anonymous Z for their contributions.

Validation Groups
On my previous project we found that our object.valid? method needed to depend on the role that an object was currently playing. This led to the introduction of groups.

Validation groups can be used to validate an object when it can be valid in various states. For example, a mortgage application may be valid for saving (saving a partial application), but that same mortgage application would not be valid for underwriting. In our example an application can be saved as long as a Social Security Number is present; however, an application cannot be underwritten unless the name attribute contains a value.

  class MortgageApplication
    include Validatable
    validates_presence_of :ssn, :groups => [:saving, :underwriting]
    validates_presence_of :name, :groups => :underwriting
    attr_accessor :name, :ssn
  end

  application = MortgageApplication.new
  application.ssn = 377990118
  application.valid_for_saving? #=> true
  application.valid_for_underwriting? #=> false

As you can see, you can use an array if the validation needs to be part of various groups. However, if the validation only applies to one group you can simply use a symbol for the group name.
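To make the mechanics concrete, here is a minimal plain-Ruby sketch of group-scoped validation. It is not Validatable's implementation; the GroupedPresence module, the presence_required and valid_for? methods, and the LoanApplication class are hypothetical names invented for illustration.

```ruby
# Hypothetical sketch: not Validatable's code, only the idea behind groups.
module GroupedPresence
  def self.included(base)
    base.extend(ClassMethods)
  end

  module ClassMethods
    # Each rule is [attribute, groups].
    def presence_rules
      @presence_rules ||= []
    end

    # Array() wraps a single symbol, so :groups accepts a symbol or an array.
    def presence_required(attribute, options = {})
      presence_rules << [attribute, Array(options[:groups])]
    end
  end

  # Only the rules registered for the given group are checked.
  def valid_for?(group)
    self.class.presence_rules.all? do |attribute, groups|
      next true unless groups.include?(group)
      value = send(attribute)
      !(value.nil? || value.to_s.empty?)
    end
  end
end

class LoanApplication
  include GroupedPresence
  attr_accessor :name, :ssn
  presence_required :ssn, :groups => [:saving, :underwriting]
  presence_required :name, :groups => :underwriting
end

loan = LoanApplication.new
loan.ssn = 377990118
puts loan.valid_for?(:saving)        # ssn is present
puts loan.valid_for?(:underwriting)  # name is still missing
```

The same rule list serves every group; the group name only decides which rules are skipped.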

The inspiration for adding this functionality came from the ContextualValidation entry by Martin Fowler.

validates_true_for
The validates_true_for method was added to allow for custom validations.

The validates_true_for method can be used to specify a proc, and add an error unless the evaluation of that proc returns true.

  class Person
    include Validatable
    validates_true_for :first_name, :logic => lambda { first_name == 'Book' }
    attr_accessor :first_name
  end

  person = Person.new
  person.valid? #=> false
  person.first_name = 'Book'
  person.valid? #=> true

The logic option is required.
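The lambda above calls first_name without a receiver, so it has to be evaluated in the context of the instance being validated. The following sketch shows one way that can be done, using instance_exec. How Validatable does this internally is an assumption on my part, and the Guest class, LOGIC constant, and logic_true? method are made up for illustration.

```ruby
# Sketch only: shows how a stored proc can be run against an instance.
class Guest
  attr_accessor :first_name

  # Hypothetical stand-in for the :logic option.
  LOGIC = lambda { first_name == 'Book' }

  def logic_true?
    # instance_exec runs the proc with self set to this instance,
    # so the bare first_name call resolves to the accessor.
    instance_exec(&LOGIC) == true
  end
end

guest = Guest.new
puts guest.logic_true?   # first_name is nil
guest.first_name = 'Book'
puts guest.logic_true?   # first_name matches
```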

validates_numericality_of
This release also adds the validates_numericality_of method. The validates_numericality_of method takes all of the standard parameters that the other validations take: message, times, level, if, group.

Validates that the specified attribute is numeric.

  class Person
    include Validatable
    validates_numericality_of :age
    attr_accessor :age
  end

after_validate hook method
Another new feature of this release is the after_validate hook. This feature allows you to manipulate the instance after a validation has been run. For example, perhaps you are happy with the default messages; however, you also want the attribute to be appended to the message. The following code uses that example and shows how the after_validate hook can be used to achieve the desired behavior.

class Person
  include Validatable
  validates_presence_of :name
  attr_accessor :name
end
 
class ValidatesPresenceOf
  after_validate do |result, instance, attribute|
    instance.errors.add("#{attribute} can't be blank") unless result
  end
end

person = Person.new
person.valid? #=> false
person.errors.on(:name) #=> "name can't be blank"

The after_validate hook yields the result of the validation being run, the instance the validation was run on, and the attribute that was validated.

include_validations_for takes options
The include_validations_for method was changed to accept options. The currently supported options for include_validations_for are :map and :if.

  class Person
    include Validatable
    validates_presence_of :name
    attr_accessor :name
  end

  class PersonPresenter
    include Validatable
    include_validations_for :person, :map => { :name => :namen }, 
                            :if => lambda { not person.nil? }
    attr_accessor :person

    def initialize(person)
      @person = person
    end
  end

  presenter = PersonPresenter.new(Person.new)
  presenter.valid? #=> false
  presenter.errors.on(:namen) #=> "can't be blank"

The person attribute will be validated. If person is invalid the errors will be added to the PersonPresenter errors collection. The :map option is used to map errors on attributes of person to attributes of PersonPresenter. Also, the :if option ensures that the person attribute will only be validated if it is not nil.

validates_confirmation_of
The validates_confirmation_of method now takes the :case_sensitive option. If :case_sensitive is set to false, the confirmation will validate the strings based on a case-insensitive comparison.

validates_length_of
The validates_length_of method now takes the :is option. If the :is option is specified, the length will be required to be equal to the value given to :is.

Bugs
The comparison in validates_format_of was changed to call to_s on the object before execution. The new version allows you to validate the format of any object that implements to_s.
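As a rough illustration of why the to_s call matters, here is a small sketch. The format_matches? helper is a hypothetical name and is not part of Validatable.

```ruby
# Sketch of the idea only; the helper name is hypothetical.
def format_matches?(value, pattern)
  # Calling to_s first lets non-String objects (Integer, Symbol, nil) be checked.
  !(value.to_s =~ pattern).nil?
end

puts format_matches?(12345, /\A\d{5}\z/)  # an Integer is converted, then matched
puts format_matches?(:abc, /\A\d{5}\z/)   # a Symbol that does not match
```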

Thursday, April 26, 2007

DSL: Expressive Software

Lately I've been thinking about DSLs that are designed for and by programmers. After some recent work and discussions I formed the following idea.

Expressive Software is a style of software development inspired by Domain Specific Languages. In fact, Expressive Software can be considered a type of Domain Specific Language. Specifically, Domain Specific Languages designed for programmers can be categorized as Expressive Software.

The term Domain Specific Language is overloaded and suffers from lack of clarity*. Because of the lack of clarity some consider Rails to be a Domain Specific Language and others are vehemently against the idea. However, it's hard to argue against Rails being considered expressive software.

The expressiveness of Rails can be seen in its many class methods. A good example of a class method designed to be expressive is the has_many method of ActiveRecord::Base. The has_many method and all the other expressive methods of Rails make it easy to work with, both when developing new code and when maintaining old code.

An example of creating your own Expressive Software could be a class that defines a workflow path. The class could do this via the following syntax:

class StandardPath
  pages :select_product, :contact_information, :summary

  path select_product >> contact_information >> summary
end

The same application may have a path for existing users:

class ExistingUserPath < StandardPath
  path select_product >> summary
end

The above example utilizes an expressive class method, similar to Rails. In addition, it shows an example of using a Fluent Interface (the path method). Utilizing a Fluent Interface helps create software that expresses intent within the code. This is one of the goals of Expressive Software: to easily convey the intent within the code itself.
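For the curious, here is one way the pages and path methods could be made to work. This is only a sketch under my own assumptions: the Page, Route, and Workflow classes and the StandardFlow and ExistingUserFlow names are hypothetical and exist purely to show how the >> chaining might be supported.

```ruby
# Hypothetical sketch of one way to support `pages` and `path a >> b >> c`.
class Page
  attr_reader :name

  def initialize(name)
    @name = name
  end

  # page >> page starts a route; the result keeps accepting >> calls.
  def >>(other)
    Route.new([self]) >> other
  end
end

class Route
  attr_reader :steps

  def initialize(steps)
    @steps = steps
  end

  # Returns a new Route so the chain stays free of side effects.
  def >>(page)
    Route.new(steps + [page])
  end
end

class Workflow
  # Defines one class-level reader per page name, e.g. StandardFlow.summary.
  def self.pages(*names)
    names.each do |name|
      page = Page.new(name)
      define_singleton_method(name) { page }
    end
  end

  # With an argument, records the route; without one, reads it back.
  def self.path(route = nil)
    @path = route if route
    @path
  end
end

class StandardFlow < Workflow
  pages :select_product, :contact_information, :summary
  path select_product >> contact_information >> summary
end

class ExistingUserFlow < StandardFlow
  path select_product >> summary
end

puts StandardFlow.path.steps.map(&:name).inspect
puts ExistingUserFlow.path.steps.map(&:name).inspect
```

Because the page readers are class methods, the subclass inherits them and can describe its own path without declaring the pages again.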

Software designed with an Expressive Software mentality is easier to read and maintain. This style of software development can lead to productivity gains. While it's hard to quantify the level of productivity gains, Rails (and various other examples such as Rake and Builder) prove that a higher level of productivity can and should be achieved.

* This isn't the first time I've mentioned that I believe "DSL" is too vague. In the Business Natural Language material I've also discussed why I think there's a need to create more descriptive phrases for more focused ideas.

Friday, April 20, 2007

Ruby: Class Methods

It's very common when moving from Java or C# to be wary of Ruby's class methods. Many people have written about how static methods are evil in C# and Java, but do those arguments apply to Ruby?

Complaints

Static methods cannot participate in an Interface. Ruby: No interfaces, thus irrelevant.
Static methods cannot be abstract. Ruby: No concept of abstract, also irrelevant.
Static methods are not polymorphic. Ruby: doesn't seem like it applies since you don't declare type in Ruby. Also, you don't access class methods directly from an instance.
Static methods are hard to test. Ruby: Class methods are as easy to test and mock as any instance method. Perhaps this is because they are simply instance methods of the singleton class.
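To illustrate the testing point, here is a sketch that swaps out a class method without any mocking library. The Clock and Greeter classes are made up for the example.

```ruby
# Sketch without a mocking library; Clock and Greeter are made-up classes.
class Clock
  def self.hour
    Time.now.hour
  end
end

class Greeter
  def self.greeting
    Clock.hour < 12 ? 'Good morning' : 'Good afternoon'
  end
end

# A class method is an instance method of the singleton class,
# so it can be swapped out like any other method.
original = Clock.method(:hour)
Clock.define_singleton_method(:hour) { 9 }
morning = Greeter.greeting
Clock.define_singleton_method(:hour) { 15 }
afternoon = Greeter.greeting
Clock.define_singleton_method(:hour, original)  # restore the real method

puts morning
puts afternoon
```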

At this point I ran out of complaints, but it's been over a year since I've touched C#. If you have complaints of your own that support or oppose this entry, please drop them in the comments.

Another thought about static methods is that they tend not to encourage proper object design. For example, a method that removes characters from a string is probably best served on the string itself.

class String
  # an example class method
  def self.delete(string, characters)
    # ...
  end

  # a better instance method
  def delete(characters)
    # ...
  end
end

You see this type of class method much less in Ruby since Ruby has open classes. But, it's good to remember to put methods on the object whose data is being manipulated when possible.

So what exactly is a class method?

class Parser
  def self.process(script)
    # ...
  end
end

Parser.singleton_methods.inspect #=> ["process"]

From the example, you could say a class method is a singleton_method. Where do singleton methods live?


class Parser
  def self.process(script)
    # ...
  end
  
  def self.singleton
    class << self; self; end
  end
end

Parser.singleton.instance_methods(false).inspect 
  #=> ["singleton", "new", "superclass", "allocate", "process"]

Singleton (class) methods are actually all instance methods of the singleton class.
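Ruby 1.9.2 and later provide Object#singleton_class, so the hand-written singleton helper is no longer needed there. Those versions also return symbols rather than strings from the reflection methods. A short sketch, assuming a newer Ruby:

```ruby
class Parser
  def self.process(script)
    # ...
  end
end

# Object#singleton_class (Ruby 1.9.2+) replaces the hand-rolled helper.
meta = Parser.singleton_class

puts meta.instance_methods(false).include?(:process)
puts Parser.singleton_methods.include?(:process)
# The method object reports the singleton class as its owner.
puts Parser.method(:process).owner == meta
```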

In fact, you can define a class method various ways, but the result is always an instance method on the singleton class.

class Parser
  def self.self_process(script)
    # ...
  end
  
  def Parser.parser_process(script)
    # ...
  end
  
  class << self
    def singleton_process(script)
      # ...
    end
  end
  
  def self.singleton
    class << self; self; end
  end
end

Parser.singleton.instance_methods(false).inspect
  #=> ["singleton_process", "parser_process", "new", 
       "superclass", "allocate", "singleton", "self_process"]

Of course, the process method is also inherited by subclasses and the singleton classes of subclasses.

class Parser
  def self.process(script)
    # ...
  end
end  

class Object
  def self.singleton
    class << self; self; end
  end
end

class HtmlParser < Parser
  
end

HtmlParser.singleton_methods.inspect
  #=> ["singleton", "process"]
HtmlParser.singleton.instance_methods(false).inspect
  #=> ["singleton", "new", "superclass", "allocate", "process"]
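On Ruby 1.9 and later the inheritance can also be checked directly, because a subclass's singleton class has the parent's singleton class as its superclass. As far as I know, instance_methods(false) on those versions no longer lists the inherited methods, so the output differs from the listing above. A sketch, assuming a newer Ruby:

```ruby
class Parser
  def self.process(script)
    "processed #{script}"
  end
end

class HtmlParser < Parser
end

# The subclass's singleton class inherits from the parent's singleton class,
# which is why the method lookup succeeds.
puts HtmlParser.singleton_class.superclass == Parser.singleton_class
puts HtmlParser.process('<p>')
# The method itself is still defined only on Parser's singleton class.
puts HtmlParser.method(:process).owner == Parser.singleton_class
puts HtmlParser.singleton_class.instance_methods(false).include?(:process)
```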

In a later post I'll talk about where those methods are stored in the underlying C code.

Thanks to James Lewis, Matt Deiters, and Mike Ward for help on this entry.

Thursday, April 19, 2007

Ruby: Assigning instance variables in a constructor

Every time I assign an instance variable in a constructor I remember that I've been meaning to write something that takes care of it for me.

class DomainObject
  attr_reader :arg1, :arg2

  def initialize(arg1, arg2)
    @arg1, @arg2 = arg1, arg2
  end
end

So, without further ado.

class Module
  def initializer(*args, &block)
    define_method :initialize do |*ctor_args|
      ctor_named_args = (ctor_args.last.is_a?(Hash) ? ctor_args.pop : {})
      # Exclusive range: one positional value per declared name.
      (0...args.size).each do |index|
        instance_variable_set("@#{args[index]}", ctor_args[index])
      end
      ctor_named_args.each_pair do |param_name, param_value|
        instance_variable_set("@#{param_name}", param_value)
      end
      initialize_behavior
    end

    # Fall back to an empty block so initializer also works without one.
    define_method :initialize_behavior, &(block || proc {})
  end
end

The above code allows you to create a constructor that takes arguments in order or from a hash. Here are the tests that demonstrate the behavior of the initializer method.

class ModuleExtensionTest < Test::Unit::TestCase
  def test_1st_argument
    klass = Class.new do
      attr_reader :foo

      initializer :foo
    end
    foo = klass.new('foo')
    assert_equal 'foo', foo.foo
  end

  def test_block_executed
    klass = Class.new do
      attr_reader :bar

      initializer do
        @bar = 1
      end
    end
    foo = klass.new
    assert_equal 1, foo.bar
  end

  def test_2nd_argument
    klass = Class.new do
      attr_reader :foo, :baz

      initializer :foo, :baz
    end
    foo = klass.new('foo', 'baz')
    assert_equal 'baz', foo.baz
  end

  def test_used_hash_to_initialize_attrs
    klass = Class.new do
      attr_reader :foo, :baz, :cat

      initializer :foo, :baz, :cat
    end
    foo = klass.new(:cat => 'cat', :baz => 2, :foo => 'foo')
    assert_equal 'foo', foo.foo
    assert_equal 2, foo.baz
    assert_equal 'cat', foo.cat
  end
end

There are limitations, such as not being able to use default values. But 80% of the time, this is exactly what I need.

Wednesday, April 18, 2007

Ruby: Mocks and Stubs using Mocha

Update: Older versions of Mocha didn't warn when a stub was never called. The newer versions will; therefore, it makes more sense to prefer stub since it's less fragile and more intention revealing. I no longer feel as I did when I wrote this entry, but I've left it for historical reasons. Related reading here.

I've previously written about using Mocks and Stubs to convey intent. When I was using C#, I believed this was the best solution for creating robust tests. These days, all my code is written in Ruby. Making the switch to Ruby provided another example that reinforces an assertion I've heard before: Best Practices are so context dependent it's dangerous to use the term.

Here's the reason I no longer feel as I did when I created the above entry: When using Mocha to mock or stub behavior, I can't think of a reason I would ever want to use SomeObject.stubs(..) instead of SomeObject.expects(..). The closest I could come to a reason was that stubs will allow an arbitrary number of calls to the same method. However, I don't believe that's a good enough reason since I can also use SomeObject.expects(:a_method).at_least_once.

The problem with using SomeObject.stubs is that it behaves almost the same as SomeObject.expects, except that a stub that is no longer necessary doesn't cause a test to fail. This can lead to tests that unnecessarily stub methods as the application's implementation changes. And the more methods that require stubbing, the less concisely the test can convey intent.

I'm not asserting that you should never use SomeObject.stubs, but I do believe it's smarter to favor SomeObject.expects if you are concerned with keeping your test suite well maintained.
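The difference can be shown with a hand-rolled sketch. This is not Mocha and makes no claim about how Mocha works internally; the Double class and its stubs, expects, and unmet methods are hypothetical names chosen to mirror the discussion.

```ruby
# Hand-rolled sketch, not Mocha: Double is a hypothetical class.
class Double
  def initialize
    @calls = Hash.new(0)
    @expected = []
  end

  # Like a stub: returns a canned value and never complains if unused.
  def stubs(name, value)
    define_singleton_method(name) { |*| @calls[name] += 1; value }
    self
  end

  # Like an expectation: same canned value, but remembered for verification.
  def expects(name, value)
    @expected << name
    stubs(name, value)
  end

  # Names of expected methods that were never called.
  def unmet
    @expected.select { |name| @calls[name].zero? }
  end
end

order = Double.new
order.stubs(:discount, 5)   # never called, and nothing will report it
order.expects(:total, 50)   # never called, and unmet reports it
order.expects(:id, 1)
order.id

puts order.unmet.inspect
```

The unused discount stub stays in the test unnoticed, while the unused total expectation shows up as soon as the expectations are verified.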

Of course, none of this is related to creating actual stubs for your tests. I still believe it's wise to create stub object instances for classes that are required for a unit test, but are not the class under test. For example, if I were testing a mapper class, I would use a stub for the object being mapped.

test "an order's id is mapped to the request object" do
  request = OrderServiceMapper.create_request(stub(:id=>1, ...))
  assert_equal 1, request.order_id
end

I use a stub in the above example because I don't want to couple the test to the implementation of an order. I believe this creates a more robust, maintainable test.

Saturday, April 14, 2007

Blogging: Blog as a skills assessment

When I joined ThoughtWorks, almost 2.5 years ago, I had a conversation with Paul Hammant where he said he always googles someone before he interviews them. At the time I thought it was a good idea, but now I think it's a great idea.

In my experience interviewing TW candidates, it's fairly common for a resume to list every technology the person has ever touched. If they've done a 'hello world' in IO, you'll find IO on their resume.

Contrast the above situation with what you find on a person's blog or in their emails to mailing lists. For example, I have done 'hello world' in IO, but there's not a single entry on the web that links me to IO. Conversely, a glance at my blog shows I've been doing Ruby/Rails full time for more than a year at this point and Agile for almost 3.

A quick search can save you from a lot of tracer bullet questions. Your first question in an interview usually needs to be very high level, to see where the candidate's skills are. The next question may be very detailed, to see how deep the candidate's knowledge goes. But, a simple search could have revealed the same information without the common interview dance: If you ask a high level question, you usually get a high level answer, despite the fact that the candidate may have a very deep understanding of the topic.

Of course, this won't work on all occasions. Searching for Michael Johnson isn't likely to produce relevant information, nor will searching for someone who never publishes anything on the web. But, even if it works 50% of the time, it's better than having to ask the same broad questions at the beginning of every interview.

Also, if this practice were more widely adopted, having a blog may become an advantage when searching for a job in the future. Personally, I'd prefer someone be able to size me up before an interview. I don't enjoy tracer bullet interview questions any more than the person asking them.

Wednesday, April 11, 2007

Ruby: Default method arguments to instance variables

Several projects ago I worked with Brent Cryder. One night over dinner, he told me that on occasion he would pass instance variables to instance methods of the same class to improve testability. His assertion was that by passing in the variable you could test the method without depending on the state of the object (instance variables). Below is a contrived example to demonstrate the idea.

class Radio
  def add_battery(battery)
    @battery = battery
  end
  def on
    @battery.on
  end
end

class Battery
  def on
    @on = true
  end
  def on?
    @on
  end
end

require 'test/unit'
class RadioTest < Test::Unit::TestCase
  def test_on_turns_battery_on
    battery = Battery.new
    radio = Radio.new
    radio.add_battery(battery)
    radio.on
    assert_equal true, battery.on?
  end
end

In the above test, you are required to add the battery before you can test that the on method delegates to the battery. The following example demonstrates how you can test the same thing without requiring a call to the add_battery method.

class Radio
  # ...
  def on(battery=@battery)
    battery.on
  end
end

class Battery
  def on
    @on = true
  end
  def on?
    @on
  end
end

require 'test/unit'
class RadioTest < Test::Unit::TestCase
  def test_on_turns_battery_on
    battery = Battery.new
    radio = Radio.new
    radio.on(battery)
    assert_equal true, battery.on?
  end
end

This is a fairly common idea; however, I like that Ruby allows me to specify a value for the test, but in the production code I would still call radio.on with no parameters and it would use the battery instance variable.
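Here is a complete, runnable version that shows both call styles side by side. It reuses the Radio and Battery classes from above; only the variable names are new.

```ruby
class Battery
  def on
    @on = true
  end

  def on?
    @on
  end
end

class Radio
  def add_battery(battery)
    @battery = battery
  end

  # The default is evaluated at call time, in the instance's context.
  def on(battery = @battery)
    battery.on
  end
end

installed = Battery.new
radio = Radio.new
radio.add_battery(installed)
radio.on                       # production-style call, uses @battery
puts installed.on?

spare = Battery.new
Radio.new.on(spare)            # test-style call, no add_battery needed
puts spare.on?
```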

Tuesday, April 10, 2007

Rails: Use Ruby Schema syntax without using Migrations

I've touched on this topic a few times. I previously listed the pains of using Migrations. And, in my last post I gave an example of our solution. But, I never went further into why we made the choice.

For my current team, the choice was a fairly easy one. It's a large team and we were churning out migrations very quickly. Of course, the numbering became an issue, but keeping a consistent view of the database became a larger issue. There was no one place to get a clear view of what the table looked like (in code).

Another issue with Migrations is that the self.down methods are never really tested. Sure, you might go down then up with each build, but do you test the application at every step down and up? Without running all the tests at each down, it's impossible to know that the self.down did exactly what you expected.

Also, how often do you need to step up or down? We found that we wanted to drop the entire db with each build to ensure that we didn't have any data dependencies. Therefore, going back a version or two never seemed valuable.

Most importantly, we were eventually going to need to generate SQL and send it to the database group. This meant that migrations were only ever going to be run by our team. And, if we needed a view of the database at any given date, we could just look in Subversion for that copy of our schema.

Given the above thoughts, we decided to create one schema file per release. The schema file uses the same syntax as migrations and even allows us to specify the schema version (very much like the idea of using one migration per release).

ActiveRecord::Schema.define(:version => 1) do

  create_table :accounts, :force => true do |t|
    t.column :first_name, :string
    t.column :last_name, :string
    t.column :username, :string
    t.column :password, :string
    t.column :email, :string
    t.column :company, :string
  end

  ...
end

And, in case you missed it in the previous post, the task to run it is very simple.

task :migrate => :environment do
  ActiveRecord::Base.establish_connection(environment)
  require File.dirname(__FILE__) + '/../../db/schema/release_1.rb'
end

Migrations are fantastic, but if you don't need them you shouldn't live with the overhead.

Rails: Generating an Oracle DDL without Oracle installed

On my current project we are developing on Mac Minis and deploying to Linux boxes. A fairly large problem with developing on Mac Minis is that there's no Oracle driver that runs on the Intel Mac Minis.

We (painfully at times) address this problem by running Postgres locally and Oracle on our CI boxes. This works for us, but recently we ran into some pain. We needed to create SQL scripts from our schema definitions.

We store our schema definitions in the same format as a migration, but we put them all in one block, similar to the code below.

ActiveRecord::Schema.define(:version => 1) do

  create_table :accounts, :force => true do |t|
    t.column :first_name, :string
    t.column :last_name, :string
    t.column :username, :string
    t.column :password, :string
    t.column :email, :string
    t.column :company, :string
  end

  ...
end

And, we define our db:migrate to simply require this file.

task :migrate => :environment do
  ActiveRecord::Base.establish_connection(environment)
  require File.dirname(__FILE__) + '/../../db/schema/release_1.rb'
end

Since we run against Oracle on our CI boxes we could generate the DDL as a build artifact, but each time we made a change to the schema we would need to check in to see the resulting changes for Oracle. This wasn't the most efficient use of our time, so we decided to get the OracleAdapter working on our Mac Minis, despite not having the OCI8 drivers or Oracle installed.

The code isn't the cleanest I've written, but it's also not as bad as I expected. It works for generating DDLs; however, it won't work if you have any code that requires a trip to the database to get table information (SomeActiveRecordSubClass.create is a good example of code that requires a trip).

namespace :db do
  namespace :generate do
    namespace :oracle do
      desc "generate sql ddl"
      task :ddl do
        $:.unshift File.dirname(__FILE__) + "../../../tools/fake_oci8"
        Rake::Task[:environment].execute
        file = File.open(RAILS_ROOT + "/db/oracle_ddl.sql", 'w')

        # Hand every caller an OracleAdapter that has no real connection.
        ActiveRecord::Base.instance_eval do
          def connection
            ActiveRecord::ConnectionAdapters::OracleAdapter.new nil
          end
        end

        # Skip the schema_info update that the original define performs.
        class ActiveRecord::Schema < ActiveRecord::Migration
          def self.define(info={}, &block)
            instance_eval(&block)
          end
        end

        # Write each generated statement to the file instead of running it.
        ActiveRecord::Base.connection.class.class_eval do
          define_method :execute do |*args|
            file << "#{args.first};\n"
          end
        end

        Rake::Task[:"db:migrate"].execute
        file.close
      end
    end
  end
end

Much like a previous entry, I'm stealing the execute method to get the generated SQL. This time, I've also stolen the AR::Base.connection method and put an OracleAdapter in there. The change to AR::Schema is required because the original method updates the schema_info table at the end. Since I'm not using migrations, the update is unnecessary.
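The trick of stealing execute can be tried outside of Rails. The sketch below uses a made-up FakeAdapter class in place of a real connection adapter and a StringIO in place of the output file; none of it is ActiveRecord code.

```ruby
require 'stringio'

# Hypothetical stand-in for a connection adapter; not ActiveRecord.
class FakeAdapter
  def execute(sql)
    raise 'no database available'
  end

  def create_table(name)
    execute "CREATE TABLE #{name} (id INTEGER)"
  end
end

out = StringIO.new

# Replace execute so each statement is written out instead of being run.
# define_method takes a block, so the local variable out stays in scope.
FakeAdapter.class_eval do
  define_method :execute do |*args|
    out << "#{args.first};\n"
  end
end

FakeAdapter.new.create_table(:accounts)
puts out.string
```

Using define_method rather than def is what lets the replacement method see the local variable, which is the same reason the task above can write to file.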

The result: An Oracle-specific DDL generated without actually connecting to an Oracle database.

Sunday, April 01, 2007

Ruby: [Anti-pattern] Extract Module to shorten Class Definition

A Ruby module can be used to encapsulate a role that an object might play. For example, if an object wants to delegate, it can extend Forwardable.

Unfortunately, more than once I've seen an (anti)pattern of methods being moved into a module to shorten a class definition. This anti-pattern can be fairly easy to spot.

Do you have a module that only one class includes (or extends)?
Do you have a module that requires other methods to be defined on the class that includes it?

I understand the motivation: when the class definition is getting a bit long and hard to follow, it might make sense to take a group of similar methods and move them to a module. The problem with this refactoring is that it is possibly making the situation worse. If a class definition is already beginning to run long, it's possible that the class may be doing too much. Simply moving behavior into a module will not address this problem, and now if someone wants to see what an object's responsibilities are, they have to look in more than one place.

A better solution could be to extract a class that encapsulates a subset of the responsibilities. If the behavior is required in various classes then it's probably time to look at converting the class to a module.
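As a sketch of what extracting a class might look like, the example below moves address formatting into its own class and delegates to it with Forwardable. The Customer and MailingAddress classes are hypothetical and exist only for illustration.

```ruby
require 'forwardable'

# Hypothetical example: address formatting pulled out of Customer
# into its own class instead of a single-use module.
class MailingAddress
  def initialize(street, city)
    @street = street
    @city = city
  end

  def label
    "#{@street}, #{@city}"
  end
end

class Customer
  extend Forwardable

  # Customer keeps a small public API and hands the work to the new class.
  def_delegator :@address, :label, :address_label

  def initialize(name, address)
    @name = name
    @address = address
  end
end

customer = Customer.new('Ada', MailingAddress.new('1 Main St', 'Chicago'))
puts customer.address_label
```

The extracted class can be tested on its own, and anyone reading Customer can see from the delegation line where the behavior lives.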