Chapter 1. Refactoring, a First Example
How do I begin to write about refactoring? The traditional way to begin talking about something is to outline the history, broad principles, and the like. When someone does that at a conference, I get slightly sleepy. My mind starts wandering with a low-priority background process that polls the speaker until he or she gives an example. The examples wake me up because it is with examples that I can see what is going on. With principles it is too easy to make generalizations, too hard to figure out how to apply things. An example helps make things clear.
So I'm going to start this book with an example of refactoring. During the process I'll tell you a lot about how refactoring works and give you a sense of the process of refactoring. I can then provide the usual principles-style introduction.
With an introductory example, however, I run into a big problem. If I pick a large program, describing it and how it is refactored is too complicated for any reader to work through. (I tried and even a slightly complicated example runs to more than a hundred pages.) However, if I pick a program that is small enough to be comprehensible, refactoring does not look like it is worthwhile.
Thus I'm in the classic bind of anyone who wants to describe techniques that are useful for real-world programs. Frankly it is not worth the effort to do the refactoring that I'm going to show you on a small program like the one I'm going to use. But if the code I'm showing you is part of a larger system, then the refactoring soon becomes important. So I have to ask you to look at this and imagine it in the context of a much larger system.
The Starting Point
The sample program is very simple. It is a program to calculate and print a statement of a customer's charges at a video store. The program is told which movies a customer rented and for how long. It then calculates the charges, which depend on how long the movie is rented, and identifies the type movie. There are three kinds of movies: regular, children's, and new releases. In addition to calculating charges, the statement also computes frequent renter points, which vary depending on whether the film is a new release.
Several classes represent various video elements. Here's a class diagram to show them (Figure 1.1).
I'll show the code for each of these classes in turn.
Movie
Movie is just a simple data class.
class Movie
CHILDRENS = 2
REGULAR = 0
NEW_RELEASE = 1
attr_reader :title
attr_accessor :price_code
def initialize(title, price_code)
@title, @price_code = title, price_code
end
end
Figure 1.1. Class diagram of the starting-point classes. Only the most important features are shown. The notation is Unified Modeling Language UML [Fowler, UML].
Rental
The rental class represents a customer renting a movie.
class Rental
attr_reader :movie, :days_rented
def initialize(movie, days_rented)
@movie, @days_rented = movie, days_rented
end
end
Customer
The customer class represents the customer of the store. Like the other classes it has data and accessors:
class Customer
attr_reader :name
def initialize(name)
@name = name
@rentals = []
end
def add_rental(arg)
@rentals << arg
end
def statement
total_amount, frequent_renter_points = 0, 0
result = "Rental Record for #{@name}\n"
@rentals.each do |element|
this_amount = 0
# determine amounts for each line
case element.movie.price_code
when Movie::REGULAR
this_amount += 2
this_amount += (element.days_rented - 2) * 1.5 if element.days_rented > 2
when Movie::NEW_RELEASE
this_amount += element.days_rented * 3
when Movie::CHILDRENS
this_amount += 1.5
this_amount += (element.days_rented - 3) * 1.5 if element.days_rented > 3
end
# add frequent renter points
frequent_renter_points += 1
# add bonus for a two day new release rental
frequent_renter_points += 1 if element.movie.price_code == Movie.NEW_RELEASE && element.days_rented > 1
# show figures for this rental
result += "\t" + element.movie.title + "\t" + this_amount.to_s + "\n"
total_amount += this_amount
end
# add footer lines
result += "Amount owed is #{total_amount.to_s}\n"
result += "You earned #{frequent_renter_points.to_s} frequent renter points"
result
end
end
Customer also has the method that produces a statement. Figure 1.2 shows the interactions for this method. The body for this method is on the facing page.
Figure 1.2. Interactions for the statement method
Comments on the Starting Program
What are your impressions about the design of this program? I would describe it as not well designed and certainly not object oriented. For a simple program like this, that does not really matter. There's nothing wrong with a quick and dirty simple program. But if this is a representative fragment of a more complex system, then I have some real problems with this program. That long statement routine in the Customer class does far too much. Many of the things that it does should really be done by the other classes.
Even so the program works. Is this not just an aesthetic judgment, a dislike of ugly code? It is until we want to change the system. The compiler doesn't care whether the code is ugly or clean. But when we change the system, there is a human involved, and humans do care. A poorly designed system is hard to change. Hard because it is hard to figure out where the changes are needed. If it is hard to figure out what to change, there is a strong chance that the programmer will make a mistake and introduce bugs.
In this case we have a change that the users would like to make. First they want a statement printed in HTML so that the statement can be Web enabled and fully buzzword compliant. Consider the impact this change would have. As you look at the code you can see that it is impossible to reuse any of the behavior of the current statement method for an HTML statement. Your only recourse is to write a whole new method that duplicates much of the behavior of statement. Now, of course, this is not too onerous. You can just copy the statement method and make whatever changes you need.
But what happens when the charging rules change? You have to fix both statement and html_statement and ensure the fixes are consistent. The problem with copying and pasting code comes when you have to change it later. If you are writing a program that you don't expect to change, then cut and paste is fine. If the program is long lived and likely to change, then cut and paste is a menace.
This brings me to a second change. The users want to make changes to the way they classify movies, but they haven't yet decided on the change they are going to make. They have a number of changes in mind. These changes will affect both the way renters are charged for movies and the way that frequent renter points are calculated. As an experienced developer you are sure that whatever scheme users come up with, the only guarantee you're going to have is that they will change it again within six months.
The statement method is where the changes have to be made to deal with changes in classification and charging rules. If, however, we copy the statement to an HTML statement, we need to ensure that any changes are completely consistent. Furthermore, as the rules grow in complexity it's going to be harder to figure out where to make the changes and harder to make them without making a mistake.
You may be tempted to make the fewest possible changes to the program; after all, it works fine. Remember the old engineering adage: "if it ain't broke, don't fix it." The program may not be broken, but it does hurt. It is making your life more difficult because you find it hard to make the changes your users want. This is where refactoring comes in.
Tip
When you find you have to add a feature to a program, and the program's code is not structured in a convenient way to add the feature, first refactor the program to make it easy to add the feature, then add the feature.