How do I begin to write about refactoring? The traditional way to begin talking about something is to outline the history, broad principles, and the like. When someone does that at a conference, I get slightly sleepy. My mind starts wandering with a low-priority background process that polls the speaker until he or she gives an example. The examples wake me up because it is with examples that I can see what is going on. With principles it is too easy to make generalizations, too hard to figure out how to apply things. An example helps make things clear.
So I'm going to start this book with an example of refactoring. During the process I'll tell you a lot about how refactoring works and give you a sense of the process of refactoring. I can then provide the usual principles-style introduction.
With an introductory example, however, I run into a big problem. If I pick a large program, describing it and how it is refactored is too complicated for any reader to work through. (I tried and even a slightly complicated example runs to more than a hundred pages.) However, if I pick a program that is small enough to be comprehensible, refactoring does not look like it is worthwhile.
Thus I'm in the classic bind of anyone who wants to describe techniques that are useful for real-world programs. Frankly it is not worth the effort to do the refactoring that I'm going to show you on a small program like the one I'm going to use. But if the code I'm showing you is part of a larger system, then the refactoring soon becomes important. So I have to ask you to look at this and imagine it in the context of a much larger system.
The Starting Point
The sample program is very simple. It is a program to calculate and print a statement of a customer's charges at a video store. The program is told which movies a customer rented and for how long. It then calculates the charges, which depend on how long the movie is rented, and identifies the type movie. There are three kinds of movies: regular, children's, and new releases. In addition to calculating charges, the statement also computes frequent renter points, which vary depending on whether the film is a new release.
Several classes represent various video elements. Here's a class diagram to show them (Figure 1.1).
I'll show the code for each of these classes in turn.
Movie
Movie is just a simple data class.
class Movie
CHILDRENS = 2
REGULAR = 0
NEW_RELEASE = 1
attr_reader :title
attr_accessor :price_code
def initialize(title, price_code)
@title, @price_code = title, price_code
end
end
Figure 1.1. Class diagram of the starting-point classes. Only the most important features are shown. The notation is Unified Modeling Language UML [Fowler, UML].
Rental
The rental class represents a customer renting a movie.
class Rental
attr_reader :movie, :days_rented
def initialize(movie, days_rented)
@movie, @days_rented = movie, days_rented
end
end
Customer
The customer class represents the customer of the store. Like the other classes it has data and accessors:
class Customer
attr_reader :name
def initialize(name)
@name = name
@rentals = []
end
def add_rental(arg)
@rentals << arg
end
def statement
total_amount, frequent_renter_points = 0, 0
result = "Rental Record for #{@name}\n"
@rentals.each do |element|
this_amount = 0
# determine amounts for each line
case element.movie.price_code
when Movie::REGULAR
this_amount += 2
this_amount += (element.days_rented - 2) * 1.5 if element.days_rented > 2
when Movie::NEW_RELEASE
this_amount += element.days_rented * 3
when Movie::CHILDRENS
this_amount += 1.5
this_amount += (element.days_rented - 3) * 1.5 if element.days_rented > 3
end
# add frequent renter points
frequent_renter_points += 1
# add bonus for a two day new release rental
frequent_renter_points += 1 if element.movie.price_code == Movie.NEW_RELEASE && element.days_rented > 1
# show figures for this rental
result += "\t" + element.movie.title + "\t" + this_amount.to_s + "\n"
total_amount += this_amount
end
# add footer lines
result += "Amount owed is #{total_amount.to_s}\n"
result += "You earned #{frequent_renter_points.to_s} frequent renter points"
result
end
end
Customer also has the method that produces a statement. Figure 1.2 shows the interactions for this method. The body for this method is on the facing page.
Figure 1.2. Interactions for the statement method
Comments on the Starting Program
What are your impressions about the design of this program? I would describe it as not well designed and certainly not object oriented. For a simple program like this, that does not really matter. There's nothing wrong with a quick and dirty simple program. But if this is a representative fragment of a more complex system, then I have some real problems with this program. That long statement routine in the Customer class does far too much. Many of the things that it does should really be done by the other classes.
Even so the program works. Is this not just an aesthetic judgment, a dislike of ugly code? It is until we want to change the system. The compiler doesn't care whether the code is ugly or clean. But when we change the system, there is a human involved, and humans do care. A poorly designed system is hard to change. Hard because it is hard to figure out where the changes are needed. If it is hard to figure out what to change, there is a strong chance that the programmer will make a mistake and introduce bugs.
In this case we have a change that the users would like to make. First they want a statement printed in HTML so that the statement can be Web enabled and fully buzzword compliant. Consider the impact this change would have. As you look at the code you can see that it is impossible to reuse any of the behavior of the current statement method for an HTML statement. Your only recourse is to write a whole new method that duplicates much of the behavior of statement. Now, of course, this is not too onerous. You can just copy the statement method and make whatever changes you need.
But what happens when the charging rules change? You have to fix both statement and html_statement and ensure the fixes are consistent. The problem with copying and pasting code comes when you have to change it later. If you are writing a program that you don't expect to change, then cut and paste is fine. If the program is long lived and likely to change, then cut and paste is a menace.
This brings me to a second change. The users want to make changes to the way they classify movies, but they haven't yet decided on the change they are going to make. They have a number of changes in mind. These changes will affect both the way renters are charged for movies and the way that frequent renter points are calculated. As an experienced developer you are sure that whatever scheme users come up with, the only guarantee you're going to have is that they will change it again within six months.
The statement method is where the changes have to be made to deal with changes in classification and charging rules. If, however, we copy the statement to an HTML statement, we need to ensure that any changes are completely consistent. Furthermore, as the rules grow in complexity it's going to be harder to figure out where to make the changes and harder to make them without making a mistake.
You may be tempted to make the fewest possible changes to the program; after all, it works fine. Remember the old engineering adage: "if it ain't broke, don't fix it." The program may not be broken, but it does hurt. It is making your life more difficult because you find it hard to make the changes your users want. This is where refactoring comes in.
Tip
When you find you have to add a feature to a program, and the program's code is not structured in a convenient way to add the feature, first refactor the program to make it easy to add the feature, then add the feature.
Jay, I think your example is great. Sure, refactoring become more necessary as the code base crows, but your example shows up one common mistake in a very condensed way and the reasons for refactoring the code should be plausible to programmers who haven't worked on huge code bases. Looking forward to your book!
ReplyDeleteBTW., the getPriceCode call in the interaction diagram is a wonderful visualization of a violation of the law of demeter...
Glad to see a Ruby port of this book. However, I think this post should be more clear that the original content is by Martin Fowler. You did mention that it's a port in another post, but this post doesn't mention him.
ReplyDeleteLooking forward to seeing the refactored version in Ruby.
Looking forward to this. I'm particularly wondering how much programming language affects best refactorings and if refactoring will be tough to do in Ruby without the great IDE support that java has?
ReplyDeleteDid Fowler gave his authorization to copy the whole chapter of his book? Is it legal?
ReplyDeleteThis comment has been removed by the author.
ReplyDelete"Did Fowler gave his authorization to copy the whole chapter of his book? Is it legal?"
ReplyDeleteUhm, yeah... I seriously doubt Jay is stupid enough to plagiarize the work of the Chief Scientist at his own company and then post it on a weblog that is aggregated by the same company.
Jay -- looking forward to reading more.
It's great that there's a Ruby refactoring book coming out. It's one of the large holes in the Ruby library - that and patterns. I'm looking forward to the finished version.
ReplyDeleteMeanwhile, I have a question: Is there any reason why you chose the undescriptive names arg and element instead of something more descriptive such as rental:
def add_rental(arg)
@rentals.each do |element|
samaaron,
ReplyDeleteI followed the examples that Martin originally used. Later in the examples he changes the name to a more descriptive name and explains why it is important. We'll be following his lead in this area.
Ah wonderful :-) I think that changing variable names is one of those subtle refactoring techniques that people don't even realise is part of the refactoring process.
ReplyDeleteI bought the Fowler book, but put it on the post PhD 'must-read' pile. I'm thinking I should just ignore it now, and follow your progress.
I am very happy that you are doing this. I don't want to subject myself to any more Java code than is necessary after I submit my thesis :-)
Thanks for showing us the work in progress. I haven't picked up Fowler's tome yet. I think it's the $60 price point that scares me away, and the lack of a PDF version that would cut the costs, too. But having the work in Ruby format would lower the barrier of entry to me, and I'm looking forward to it.
ReplyDelete