Sunday, June 24, 2012

Design by Contract and Refactoring

I am a great believer both in refactoring and in design by contract.  How do these two work together?

First, contracts are a great help when refactoring.  The first thing you need to know when you refactor a piece of code is what it is supposed to do.  This tells you what you can and can't do with it, and, no less important, what would be useful to do with it.  If it's an arbitrary set of statements, say part of an existing method you are trying to extract into a new method, there's little to help beyond reading the code and trying to figure out what it does and how it fits with the rest of the code.  (Even then, there are tools that try to deduce a contract for an arbitrary piece of code; more on that in a later post).  But if it's a method, you should have more help.

If you have a good suite of unit tests, you can try to look at the tests of that method to figure out what it's supposed to do.  But it's much easier if you have a contract attached to the method; the contract should give you a lot of the information you need.  For example, suppose you want to move one or more methods and fields from one class to another (perhaps you are using Pull Up Method, Push Down Method, or just Move Method).  In that case, you should check the contracts of moved methods to see how they fit in their new class.  Are the invariants of the new class maintained by the moved methods?  Are the contracts of the methods still valid, taking inheritance into account?  Nasty bugs can result if the answers are negative.

On the other hand, refactoring may require corresponding changes in existing contracts, as well as the creation of brand-new contracts.  In other words, contracts need to be refactored together with the associated code.  This is an added burden, which may well be an obstacle to the use of the design-by-contract methodology; not only do I have to invest effort in creating the contracts in the first place, I also have to refactor them later.  Why should I bother?

There are several answers to this complaint.  First, consider the alternatives.  You may be flying blind, relying only on your undocumented understanding of the code.  In that case, good luck to you; you'll need it.  If you follow an agile methodology, such as Extreme Programming, you should have an extensive suite of unit tests to help you refactor.  But unit tests are very vulnerable to changes in the code, since they are attached to small units of code.  This means that any shift in responsibilities is likely to invalidate some tests, which will then have to be refactored or completely rewritten.  So, with or without contracts, you always need to refactor associated artifacts when you refactor the code.

Having contracts can significantly reduce the amount of detail in unit tests, and even the total number of unit tests.  This is because a major part of the responsibility usually given to unit tests is now taken by the contracts.  All that the tests need to do is exercise the system; correctness checking (or a large part of it) is now done by checking the contracts.  So now you can have higher-level tests that exercise the system, instead of unit tests for each class and method.  For example, suppose you are implementing a cryptographic algorithm such as RSA.  A test that creates a random key, then encrypts a random data buffer, decrypts it, and checks that the result is the original data, is guaranteed to exercise almost all of your cryptographic code.  Moreover, this test can be used on many different implementations.  In contrast, the internals of the implementation can vary widely, since there are many ways to implement the large-number arithmetic operations required for RSA, and their efficiency will be different under different circumstances.  By having an application-level test augmented with contracts, the burden of refactoring tests is reduced to almost nil.
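As a rough illustration of this idea, here is a minimal Python sketch of a round-trip test sitting on top of code that carries its own contracts.  Everything in it is an assumption made for the example: the names mod_pow, encrypt, decrypt and round_trip_test are hypothetical, contracts are written as plain assert statements, and the key is a fixed textbook toy (p = 61, q = 53) rather than a random one, far too small for real use.

```python
import random

def mod_pow(base, exponent, modulus):
    """Square-and-multiply exponentiation; one of many possible internals."""
    # Precondition.
    assert exponent >= 0 and modulus > 1, "precondition violated"
    result = 1
    base %= modulus
    while exponent > 0:
        if exponent & 1:
            result = (result * base) % modulus
        base = (base * base) % modulus
        exponent >>= 1
    # Postcondition (partial: it only checks the range of the result).
    assert 0 <= result < modulus, "postcondition violated"
    return result

# Toy textbook key, for illustration only.
P, Q = 61, 53
N = P * Q
PHI = (P - 1) * (Q - 1)
E = 17
D = pow(E, -1, PHI)  # modular inverse; needs Python 3.8 or later

def encrypt(message):
    assert 0 <= message < N, "precondition violated: message out of range"
    cipher = mod_pow(message, E, N)
    assert 0 <= cipher < N, "postcondition violated"
    return cipher

def decrypt(cipher):
    assert 0 <= cipher < N, "precondition violated: cipher out of range"
    message = mod_pow(cipher, D, N)
    assert 0 <= message < N, "postcondition violated"
    return message

def round_trip_test(trials=200, seed=0):
    """Application-level test: it never looks inside mod_pow."""
    rng = random.Random(seed)
    for _ in range(trials):
        message = rng.randrange(N)
        assert decrypt(encrypt(message)) == message
    return True
```

If mod_pow were replaced by a different implementation, round_trip_test would stay exactly as it is; only the contracts attached to the replaced code would need attention.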

This still leaves the requirement of refactoring contracts, and the question of how much refactoring tools can help.  To understand that, it is necessary to examine the relationships between code refactorings and contracts.  On the simplest level, contracts are treated just like code.  For example, when renaming a method, all references to it in the code must be appropriately modified; so must all references in assertions.  Similarly, a method may be eliminated only when it is not used anywhere, including in assertions.  Contract-aware tools should find these kinds of changes easy to perform automatically.  Of the refactorings listed in Fowler's Refactoring book, 32% do not interact with contracts except possibly in this way.  These are mostly syntactic refactorings such as Remove Parameter or Rename Method, or those that eliminate classes or methods, such as Inline Class and Inline Method.
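To make the "contracts are treated just like code" case concrete, here is a small Python sketch in which assertions call the class's own query methods.  The Stack class and its method names are hypothetical, and plain assert statements stand in for a real contract notation.

```python
class Stack:
    """A stack whose contracts are assertions that call its own queries."""

    def __init__(self):
        self._items = []

    def is_empty(self):
        return len(self._items) == 0

    def size(self):
        return len(self._items)

    def push(self, item):
        old_size = self.size()
        self._items.append(item)
        # Postcondition: refers to is_empty() and size() by name.
        assert not self.is_empty() and self.size() == old_size + 1

    def pop(self):
        # Precondition: a rename of is_empty() has to be applied here too.
        assert not self.is_empty(), "precondition violated: pop on empty stack"
        old_size = self.size()
        item = self._items.pop()
        # Postcondition.
        assert self.size() == old_size - 1
        return item
```

Renaming is_empty would have to touch the assertions in push and pop as well as ordinary callers, and size could not be removed as unused while the postconditions still refer to it.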

Some contracts affect the applicability of certain refactorings.  As mentioned above, a method should only be moved if it won't violate the invariant of its new class.  (Of course, it might be possible to modify that invariant to accommodate the moved method; this can be done in a separate step, before moving the method.)  13% of Fowler's refactorings have this nature.
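The following Python sketch shows the kind of problem that checking the invariant before a move is meant to catch.  The Account class, its invariant, and the moved method apply_adjustment are all made up for the example; the invariant is checked by hand at the end of each method.

```python
class Account:
    """Target class of a hypothetical Move Method; invariant: balance >= 0."""

    def __init__(self, balance=0):
        self.balance = balance
        self._check_invariant()

    def _check_invariant(self):
        assert self.balance >= 0, "invariant violated: negative balance"

    def deposit(self, amount):
        assert amount > 0, "precondition violated: amount must be positive"
        self.balance += amount
        self._check_invariant()

    # Assume this method was moved in from a class that allowed negative
    # totals.  Nothing in its own contract keeps the balance non-negative.
    def apply_adjustment(self, delta):
        self.balance += delta
        self._check_invariant()
```

Calling apply_adjustment(-500) on an account holding 150 fails the invariant check.  Either the method needs a precondition such as self.balance + delta >= 0 before it is moved, or the invariant of the target class has to be revisited first.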

Some refactorings require the creation of new or modified contracts.  The most extreme case is Extract Method, which creates a new method out of arbitrary code.  Discovering the contract for arbitrary code is impossible in general, although some tools can discover partial contracts.  Another example is Create Superclass, which can create contracts for new methods based on existing contracts in subclasses.  59% of Fowler's refactorings fall into this category (which includes 4% that are also included in the previous one).
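Here is a small Python sketch of what Extract Method can look like once a contract is added.  The functions order_total_before, order_total_after and subtotal are hypothetical, and the contract on subtotal was written by hand for the example rather than discovered by any tool.

```python
def order_total_before(prices, quantities, discount):
    """Original code: the loop below is the fragment to be extracted."""
    total = 0.0
    for price, quantity in zip(prices, quantities):
        total += price * quantity
    return total * (1.0 - discount)

def subtotal(prices, quantities):
    """Extracted method, with a new contract of its own."""
    # Preconditions: these were only implicit in the original fragment.
    assert len(prices) == len(quantities), "precondition violated: lengths"
    assert all(p >= 0 for p in prices), "precondition violated: prices"
    assert all(q >= 0 for q in quantities), "precondition violated: quantities"
    total = 0.0
    for price, quantity in zip(prices, quantities):
        total += price * quantity
    # Postcondition.
    assert total >= 0, "postcondition violated"
    return total

def order_total_after(prices, quantities, discount):
    assert 0 <= discount <= 1, "precondition violated: discount"
    result = subtotal(prices, quantities) * (1.0 - discount)
    assert result >= 0, "postcondition violated"
    return result
```

Note that zip silently drops extra elements when the two lists differ in length, so the original code hid an assumption that the new precondition states explicitly.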

Finally, some new refactorings can be defined to deal specifically with contracts; these include Pull Up Contract, Push Down Contract, Create Abstract Precondition, and Simplify Assertion.
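The post does not define these refactorings, so the following Python sketch is only one plausible reading of Pull Up Contract: a postcondition that every subclass would otherwise repeat is stated once in the superclass.  The Shape, Circle and Rectangle classes and the _compute_area hook are assumptions made for the example.

```python
import math

class Shape:
    def area(self):
        result = self._compute_area()
        # Postcondition pulled up from the subclasses: it now lives in one
        # place and applies to every current and future subclass.
        assert result >= 0, "postcondition violated: negative area"
        return result

    def _compute_area(self):
        raise NotImplementedError

class Circle(Shape):
    def __init__(self, radius):
        self.radius = radius

    def _compute_area(self):
        return math.pi * self.radius ** 2

class Rectangle(Shape):
    def __init__(self, width, height):
        self.width = width
        self.height = height

    def _compute_area(self):
        return self.width * self.height
```

With this arrangement, a Rectangle built with a negative width fails the inherited postcondition when area is called, even though Rectangle itself states no contract.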

In a future post I will discuss automation of contract-related refactorings and how refactoring tools can take some of the burden of the contracts.