Jay Fields' Thoughts: February 2006

Thursday, February 16, 2006

Add dependency support to custom rake tasks

I previously wrote about creating an NUnitTask for Rake. The NUnitTask is based on the PackageTask that comes standard with Rake. I assumed that the PackageTask would support adding dependencies at construction time. This turned out to be a false assumption. To give a PackageTask a dependency, the task and the dependency must be declared separately.

Rake::PackageTask.new "foo","1.2.3" do |pkg|
  pkg.need_zip = true
  pkg.package_files.include("build_helpers/*.rb")
end
task :package => :test

I had expected it to work much like a standard task and dependency.

#this will not work
Rake::PackageTask.new "foo","1.2.3" => :init do |pkg|
  pkg.need_zip = true
  pkg.package_files.include("build_helpers/*.rb")
end

Upon discovery, I decided to alter NUnitTask to take a dependency in the expected manner. The only change required is to alter the assignment of the name variable.

def initialize(name=:nunit)
  @name = name
  ..
end

Becomes

def initialize(params=:nunit)
  @name = case params
    when Hash
      task params.keys[0] => params[params.keys[0]]
      params.keys[0]
    else
      params
  end
  ..
end

Hopefully, Jim Weirich can add the same capability to the next version of the Rake Tasks.

Rake Tutorial

If you are tired of reading my entries about Rake and not knowing what I'm talking about, I suggest:

The Rake Tutorial
Martin's entry on Rake
Jim Weirich's blog
Attending Stuart Halloway's NFJS session

Friday, February 10, 2006

Initialization Chain

Initialization Chain allows you to chain additional property setters to an object's initialization.

Column column = new Column().WithHeader("ID").WithType(typeof(int));

Initialization Chain is a special case of Fluent Interface. It's most often seen during testing, but it also has a place in production code.

Initialization Chain can be used to reduce the number of constructors of a class. In the above example a constructor of Column could be created to take both a header and column type. However, neither of these properties is a dependency of the class. Creating constructors that initialize properties that are not dependencies can lead to constructor bloat.

Initialization Chain is more expressive than using a constructor to initialize properties. When maintaining an existing codebase you will need to understand which properties are being set by which constructor arguments. Additionally, you will need to know which constructor arguments are dependencies and thus required across all constructors of the class. With Initialization Chain, every constructor argument is explicitly a dependency, and any additional properties are set in the chain by methods whose names express which property is being set.

Initialization Chain can remove the need for a temporary variable. In the above example an instance of the Column class is being initialized. This type of operation usually occurs in a loop: you create each Column instance, set its properties, and add the new instance to a collection. Initialization Chain allows you to set the properties following construction and immediately pass the instance to the collection without the need for the temporary variable.

There are also negatives to Initialization Chain. Readonly fields cannot be initialized from an Initialization Chain. There is overhead in creating and maintaining the Initialization Chain methods. They can clutter the API by adding an additional method for each property. And nothing stops someone from calling one of these methods long after initialization.

Implementing Initialization Chain is quite simple. For each property that you wish to be able to initialize in an Initialization Chain you must simply add a method that sets the property and then returns the instance.

class Foo...
  public string Bar
  {
    get { return bar; }
    set { bar = value; }
  }

  public Foo WithBar(string value)
  {
    Bar = value;
    return this;
  }
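Applied to the Column example from the top of this entry, a minimal sketch might look like the following. Only WithHeader and WithType appear above; the Header and ColumnType properties, the ColumnDemo class, and the BuildColumns helper are names assumed for illustration.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical Column class; only WithHeader and WithType come from the entry.
public class Column
{
    public string Header { get; set; }
    public Type ColumnType { get; set; }

    public Column WithHeader(string value)
    {
        Header = value;
        return this; // returning the instance is what allows chaining
    }

    public Column WithType(Type value)
    {
        ColumnType = value;
        return this;
    }
}

public static class ColumnDemo
{
    // Builds one int column per header (an assumed use case).
    public static List<Column> BuildColumns(string[] headers)
    {
        List<Column> columns = new List<Column>();
        foreach (string header in headers)
        {
            // No temporary variable: construct, configure, and add in one expression.
            columns.Add(new Column().WithHeader(header).WithType(typeof(int)));
        }
        return columns;
    }

    public static void Main()
    {
        List<Column> columns = BuildColumns(new[] { "ID", "Age" });
        Console.WriteLine(columns[0].Header + " " + columns[0].ColumnType.Name);
    }
}
```

Without the chain methods, the loop body would need a local Column variable, two property assignments, and then the Add call.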

Creating an interface for testing

When following some common unit testing guidelines you will often need to create interfaces for the purpose of testing.

A common example of this occurs when you use constructor dependency injection. For example, suppose you have a Foo class that takes a Bar as a constructor argument.

class Foo...
  public Foo(Bar bar)
  {
    this.bar = bar;
  }
  ..
}

In the above example Foo cannot be tested without creating an instance of the Bar class. However, the dependency could be mocked or stubbed (depending upon intent) if Foo instead depended upon an interface.

class Foo...
  public Foo(IBar bar)
  {
    this.bar = bar;
  }
  ..
}

Unfortunately, this approach is often met with criticism.

Common complaints include:

Complaint: Creating extra code to support testing. Response: Code written to support testing should be minimized. However, any code that allows for better testing is well worth adding. It will need to be maintained, but this level of effort should be far less than the level of effort required when working with detestable code.
Complaint: The overhead of maintaining the class and the interface. Response: Statically typed languages are great at alerting you when your class/interface relationship has fallen out of sync. Additionally, refactoring tools such as ReSharper ensure that little effort will be necessary to maintain a good relationship.

Using an interface has additional positives.

Easy mocking or stubbing. Interface dependencies allow you to create mocks or stubs that behave specifically as your test requires. This improves the durability of the tests you write.
Truly unit testable classes. Classes that have constructor arguments that are not interfaces require instances of concrete types with defined behavior. When this dependency exists any change to the behavior of a dependency can cause false test failures. Classes that have interfaces as dependencies can have their behavior tested in isolation.
Using other types (as long as they implement the interface) as dependencies. While developing applications, requirements change. Using the example above, today it may appear that Foo will always depend on Bar. However, in the future it may turn out that Foo depends on BarNorth and BarSouth. If BarNorth and BarSouth are nothing alike they could share a base class, but will more likely simply implement the same interface. When this occurs, if Foo depends on Bar, the constructor (and likely every place that uses type Bar) will need to be changed to use the interface instead. If an interface were used from the beginning, no changes to the existing codebase would be required.

This is a simple concept, but it's often a tripping point. Don't be afraid of the overhead required to use an interface when it will improve your ability to write quality software.
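As a rough sketch of the stubbing described above, here is Foo tested in isolation with a hand-rolled stub. The Baz member on IBar, the DoubleBaz method on Foo, and the StubBar and FooTest classes are all hypothetical, invented only to give the test something to check.

```csharp
using System;

public interface IBar
{
    int Baz(); // hypothetical member, for illustration only
}

public class Foo
{
    private readonly IBar bar;

    public Foo(IBar bar)
    {
        this.bar = bar;
    }

    // Hypothetical behavior under test.
    public int DoubleBaz()
    {
        return bar.Baz() * 2;
    }
}

// Hand-rolled stub: returns a canned value so the test controls Foo's input.
public class StubBar : IBar
{
    private readonly int value;

    public StubBar(int value)
    {
        this.value = value;
    }

    public int Baz()
    {
        return value;
    }
}

public static class FooTest
{
    public static void Main()
    {
        Foo foo = new Foo(new StubBar(21));
        if (foo.DoubleBaz() != 42) throw new Exception("expected 42");
        Console.WriteLine("ok");
    }
}
```

No concrete Bar is constructed anywhere in the test, so a change in Bar's behavior cannot cause this test to fail.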

Wednesday, February 08, 2006

.norm find collection usage

In my last entry I detailed how to use .norm to return an instance of an object from the database. However, often you are looking to return more than one instance. Enter DataGateway.FindCollection.

DataGateway.FindCollection is overloaded. You can call FindCollection with only a type and return an array of instances that represent each record in the database. Or, you can execute FindCollection with a type and an IWhere instance. The IWhere is used to specify which rows to return.

ConnectionInfo info = new ConnectionInfo("fieldsj2","norm","jay","jay");
DataGateway gateway = new DataGateway(info);
Bar[] results = (Bar[]) gateway.FindCollection(typeof(Bar));

The above code will return all instances of Bar stored in the database. If you only needed the instances of Bar where Bar.Baz is equal to 1 you could specify that in an IWhere parameter.

ConnectionInfo info = new ConnectionInfo("fieldsj2","norm","jay","jay");
DataGateway gateway = new DataGateway(info);
IWhere where = new Where().Column("Baz").IsEqualTo(1);
Bar[] results = (Bar[]) gateway.FindCollection(typeof(Bar),where);

.norm object graph future

.norm was created to be a simple object mapping layer. It's not a silver bullet.

It does have benefits:

Persist simple data types with little (attributes) or no work.
Create objects that rely on Constructor Initialization.

But, it also has limitations:

It heavily uses reflection which may be a problem if you require very high performance.
It ignores fields or properties which are user defined types.

These limitations are by design. By heavily using reflection it is possible to allow usage with little or no set up. This can add velocity to the start of a project. Later in the project, if performance is critical you can look for optimization opportunities.

.norm could address the issue of working with graphs of objects; however, the complexity may not be worth the payoff. Consider, for example, a Foo class that has a Bar property. Persisting Foo and Bar is simple. But, if Foo takes a Bar as a constructor argument things become much more complicated. Now, to return an instance of Foo you need an instance of Bar registered to DataGateway.

There are ways to work around this issue.

1. Simply add a default constructor.
2. Before returning an object (Foo) you could register all its properties that are user defined types (Bar). However, there is no way to ensure that the object that was persisted as the property of type Bar was also the constructor argument of type Bar.
3. Change Foo's dependency from a Bar to a BarFactory. But, now each user defined type will also need a factory. Register the BarFactory to satisfy Foo's constructor at creation time. This could clutter your domain. .norm is not supposed to be intrusive enough to force design issues.
4. Add a boolean to the ColumnAttribute that specifies whether the decorated property (or field) should be registered as a dependency.

If .norm does become object graph friendly, option 2 will likely not be implemented because it would complicate usage without clear benefit. Option 4 is quite possible, but it is just one of the issues related to object graph support. Cascading deletes are another obvious issue that will need to be dealt with. Cascading deletes can also be handled via attributes, but at some point the domain objects become too cluttered with ORM code.

Remaining unfriendly to object graphs may be a superior alternative. .norm can be used as a base for a custom data mapping layer. Again, .norm isn't a silver bullet. .norm is a simple framework that automates simple tasks. When situations become more complex, classes can wrap .norm functionality and enforce the persistence/finding logic.

Consider this applied to the above Foo and Bar problem. Foo and Bar can both be persisted by .norm. But, Foo needs a Bar instance as a constructor argument. If you wrote a FooMapper class it could handle persisting Foo and all of the child objects of Foo. Additionally, FooMapper could create (via .norm) or get from a factory the Bar dependency, register it with DataGateway and then return the Foo instance correctly hydrated. Finally, using FooMapper to delete could ensure that the necessary child objects are also removed.

Regardless of implementation, abstracting the dependency resolution to another class clearly has benefits. The alternative relies on .norm to make complex decisions (which increases the likelihood of mistakes).

Does this mean that .norm will not support object graphs? Absolutely not. If object graph compatibility is required to increase adoption it will clearly be added. But, not until it's clear this is a widely requested feature.

Friday, February 03, 2006

.norm find usage

.norm allows you to easily return instances of any type from the database.

DataGateway.Find takes a type and an IWhere as parameters. The type is the type of the instance you are returning, and the IWhere specifies which record to return. If IWhere does not limit the matching records to only one, the first record returned will be used to create the result.

ConnectionInfo info = new ConnectionInfo("source","catalog","user","pass");
DataGateway gateway = new DataGateway(info);
IWhere where = new Where().Column("bar").IsEqualTo(1);
Foo foo = (Foo) gateway.Find(typeof(Foo),where);

The result will be a fully hydrated instance of the Foo class.

If the Foo class has any dependencies, those dependencies can be registered with the DataGateway.* Registering a dependency will allow .norm to use constructors that contain dependencies. The code is basically the same except for the dependency registration.

ConnectionInfo info = new ConnectionInfo("fieldsj2","norm","jay","jay");
DataGateway gateway = new DataGateway(info);
gateway.RegisterComponentImplementation(typeof(MappedFactory));
IWhere where = new Where().Column("some_field").IsEqualTo(1);
MappedFind result = (MappedFind) gateway.Find(typeof(MappedFind),where);

The above code will return an instance of MappedFind with its MappedFactory dependency already injected.

[RelationalMapped("mapped")]
private class MappedFind
{
 private readonly MappedFactory factory;

 public MappedFind(MappedFactory factory)
 {
  this.factory = factory;
 }

 [Column("some_field")]
 public int Field = 0;
 private string prop;

 [Column("some_prop")]
 public string Property
 {
  get { return prop; }
  set { prop = value; }
 }

 public int FactoryGet()
 {
  return factory.ReturnSix();
 }
}

private class MappedFactory
{
 public int ReturnSix()
 {
  return 6;
 }
}

*DataGateway is actually delegating the registration to a PicoContainer instance. This instance is also used for creating the objects that are returned from the database.

.norm version 0.2

After some good feedback on version 0.1 another release of the .norm framework has been prepared.

Changes in 0.2

ConnectionInfo class now allows either Windows Authentication or a specified user and password.
WhereBuilder has become the Where fluent interface.
Types byte, short, int, long, Single, double, decimal, bool, DateTime, Guid, and byte[] are all supported.
Unsupported types are ignored.

Thursday, February 02, 2006

.norm delete usage

Using .norm to delete database records is very similar to updating records. The only difference between update and delete is that delete doesn't require an instance of a type. Instead, Delete takes a type as a parameter. Delete uses this type to remove all the records of that type that match the WhereBuilder clause.

ConnectionInfo info = new ConnectionInfo("data_source","my_catalog","user_name","password");
DataGateway gateway = new DataGateway(info);
WhereBuilder where = new WhereBuilder("ID",0);
gateway.Delete(typeof(Foo),where);

Following the execution of this code, all records in the database where the ID column is equal to 0 will be deleted. As expected, .norm will use type information or attributes to map the type to a table. IWhereBuilder usage is exactly the same as previously explained.

Where Fluent Interface part II

Yesterday I detailed my experiences creating my first fluent interface. The implementation of the first attempt at a Where fluent interface can be downloaded here.

However, I wasn't satisfied with leaving the user the ability to do something such as Where().And() or Where().Column("foo").IsEqualTo(0).Column("bar"). Though both of these usages would throw an exception, I prefer a compile time check instead. The solution I was looking for would only allow access to the methods in the correct order. Finding this solution would also remove the need for maintaining the next expected method call.

To achieve this the Where class needed to lose the majority of its behavior. The Where class now has only a Column method that returns a new WhereColumn. Additionally, Where no longer implements IWhere. The WhereColumn class still only has an IsEqualTo method; however, this method now returns a CompleteWhere instance.

By adding the CompleteWhere class I was able to limit usage to only correct methods. CompleteWhere implements (the now slimmer) IWhere interface. The only property of the IWhere interface is And. The And property returns a Where instance, which allows you to begin a new column/value condition. CompleteWhere has a few other public methods that are used internally in the .norm framework to build SQL statements. However, correct usage should never return you an instance of CompleteWhere.*

Overall, I am happy with the evolution of the Where fluent interface. Version 2 is as easy to use, harder to misuse, and the code behind it is simpler. You can download v2 here.
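The class layout described above can be sketched roughly as follows. This is not the downloadable v2 code: the shared condition list, the unquoted string formatting, and the ToSql helper are assumptions standing in for whatever .norm does internally.

```csharp
using System;
using System.Collections.Generic;

public interface IWhere
{
    Where And { get; }
}

// Entry point: exposes only Column, so new Where().And does not compile.
public class Where
{
    private readonly List<string> conditions;

    public Where()
    {
        conditions = new List<string>();
    }

    // Continues an existing clause; reached through CompleteWhere.And.
    public Where(CompleteWhere previous)
    {
        conditions = previous.Conditions;
    }

    public WhereColumn Column(string name)
    {
        return new WhereColumn(conditions, name);
    }
}

// After Column, the only available call is IsEqualTo.
public class WhereColumn
{
    private readonly List<string> conditions;
    private readonly string name;

    public WhereColumn(List<string> conditions, string name)
    {
        this.conditions = conditions;
        this.name = name;
    }

    public CompleteWhere IsEqualTo(object value)
    {
        // Assumption: plain text, no quoting or parameters, to keep the sketch short.
        conditions.Add(name + " = " + value);
        return new CompleteWhere(conditions);
    }
}

// After IsEqualTo, the only fluent member is And, which starts a new condition.
public class CompleteWhere : IWhere
{
    public readonly List<string> Conditions;

    public CompleteWhere(List<string> conditions)
    {
        Conditions = conditions;
    }

    public Where And
    {
        get { return new Where(this); }
    }

    // Hypothetical stand-in for the framework's internal SQL-building methods.
    public string ToSql()
    {
        return "WHERE " + string.Join(" AND ", Conditions);
    }
}

public static class WhereDemo
{
    public static void Main()
    {
        IWhere where = new Where().Column("foo").IsEqualTo(0).And.Column("bar").IsEqualTo(1);
        Console.WriteLine(((CompleteWhere) where).ToSql());
        // Neither misuse compiles, because the member does not exist on that type:
        //   new Where().And
        //   new Where().Column("foo").IsEqualTo(0).Column("bar")
    }
}
```

Each step returns a different type, so the compiler enforces the call order and no "next expected call" state needs to be tracked.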

*In the past I would have made the constructor of Where that takes a CompleteWhere instance internal. Additionally, the CompleteWhere class should be internal to ensure that it isn't misused. However, I chose not to take this approach because Zak Tamsen convinced me that I should always provide the rope.

Wednesday, February 01, 2006

Using Fluent Interface

While working on .norm I needed to create a class that could be used to specify criteria. Specifically, I needed a class that represented a SQL where clause.

I considered using a comparison object array, but this would only allow for "and" comparisons (and would exclude "or"). I finally decided that it was important to maintain the order of the comparison objects in some way. I decided on a builder. The builder would produce a string based on the order that the user added "and" or "or" comparisons.

Then I read Martin Fowler's bliki entry, Fluent Interface, and it seemed a better fit.

In the 0.1 version of .norm the WhereBuilder class can be used like the example from .norm update usage. However, in the next version, where usage will look more like a fluent interface. A where clause with one equal comparison will look like:
IWhere where = new Where().Column("foo").IsEqualTo("bar");

A where clause with 2 equal comparisons will look like:
IWhere where = new Where().Column("foo").IsEqualTo(0).And.Column("bar").IsEqualTo(1);

The new implementation is more verbose than the previous version of WhereBuilder. But, I think it is valuable because it better conveys what the code is accomplishing.

The most interesting thing I learned while developing the interface was how to guard against improper usage. For example, users should include the And property instead of skipping it. Therefore, some type of state in the object must be kept that determines what the next expected method (or property) call will be. If the next method called is not the expected call, an exception is thrown specifying which method call was expected.

Requiring the users to use the fluent interface correctly is important because much of the value of fluent interface is in conveying a message in the code. If the interface isn't used in a fluent manner then there's no point in creating the fluent interface.
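A minimal sketch of the state-tracking guard described above might look like this. It is not the .norm implementation: the Step enum, the Require helper, the exception type, and the ToString output are all assumptions made for illustration.

```csharp
using System;
using System.Collections.Generic;

public class Where
{
    // The call that is allowed next; named after the members for readable errors.
    private enum Step { Column, IsEqualTo, And }

    private readonly List<string> parts = new List<string>();
    private Step expected = Step.Column;

    public Where Column(string name)
    {
        Require(Step.Column);
        parts.Add(name);
        expected = Step.IsEqualTo;
        return this;
    }

    public Where IsEqualTo(object value)
    {
        Require(Step.IsEqualTo);
        parts.Add("= " + value);
        expected = Step.And;
        return this;
    }

    public Where And
    {
        get
        {
            Require(Step.And);
            parts.Add("AND");
            expected = Step.Column;
            return this;
        }
    }

    // Throws when the call just made is not the one the chain expects next.
    private void Require(Step called)
    {
        if (called != expected)
        {
            throw new InvalidOperationException(
                called + " was called, but " + expected + " was expected next.");
        }
    }

    public override string ToString()
    {
        return "WHERE " + string.Join(" ", parts);
    }
}

public static class GuardDemo
{
    public static void Main()
    {
        Where ok = new Where().Column("foo").IsEqualTo(0).And.Column("bar").IsEqualTo(1);
        Console.WriteLine(ok);

        try
        {
            // Skips the And property, so the guard throws at runtime.
            new Where().Column("foo").IsEqualTo(0).Column("bar");
        }
        catch (InvalidOperationException e)
        {
            Console.WriteLine(e.Message);
        }
    }
}
```

The misuse still compiles; it is only caught when the code runs, and the exception message names the call that was expected.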

.norm update usage

Using .norm to update database records is almost as easy as inserting new records.

Foo user = new Foo();
user.FirstName = "Jay";
user.lastName = "Fields";
user.ID = 0;

ConnectionInfo info = new ConnectionInfo("data_source","my_catalog","user_name","password");
DataGateway gateway = new DataGateway(info);
WhereBuilder where = new WhereBuilder("ID",user.ID);
gateway.Update(user,where);

.norm will use type information or attributes to map your class to a table exactly the same way I previously documented.

The Update method takes an object and a WhereBuilder instance. The object is the object whose data will be used to update the database. The WhereBuilder instance is used to limit which records are updated. In the example above, only the rows in the database where the ID column is equal to 0 (the user.ID) will be updated. WhereBuilder has an And method that allows you to chain multiple equality tests.