Wednesday, January 19, 2011

Training Wheels

Parents put training wheels on bicycles when children first start learning to ride.  Once the children have enough experience and confidence, however, the training wheels get in the way and are removed.  You will never see an experienced cyclist using training wheels.  But you will often see experienced computer users, including software developers, using features meant for novices.

There is certainly a lot to learn about using computers, especially with complex applications such as development environments.  It makes sense to help people learn and adjust to the proper use of these applications.  But as they acquire experience, they should be able to remove the training wheels, lest they become a burden.  And giving people help on the learning curve should not mean dumbing down interfaces!

The mouse is a case in point.  I believe that taking the mouse away from developers' workstations will improve developer productivity more than any other action, possibly after a small adjustment period.  I keep seeing developers waste their time using the mouse to hunt for a specific menu entry, which may be hidden under one or more menu levels.  This is becoming more and more difficult as screen resolutions get higher and the clickable area gets proportionately smaller.  Instead, a short sequence of keys (two to five keystrokes) will achieve the same result in less than a second.

Of course, in order to take advantage of the keyboard you need to memorize quite a few such sequences.  This is impossible for a casual user of the application, but happens easily and automatically for an experienced user, given some learning time.  It's just like riding a bicycle: at first you need to think explicitly about every action you take, but after a while it becomes automatic.  And this is true for any repetitive activity, such as walking, swimming, driving, or touch typing: if you practice enough, it becomes automatic to the point where trying to think about it is actually confusing.  Try to think about the mechanics of walking while you do it, and you will lose your stride.  I touch-type, and when I'm asked where some key is, I have to mimic the typing movement to recall its location through muscle memory.  Similarly, I know a great many keyboard shortcuts for the applications I use frequently, but I find it hard to articulate the key sequences.

I'm an avid Emacs user.  Emacs (The One True Editor) has many excellent qualities, but it has a steep learning curve because of its many keyboard shortcuts, which can combine the Ctrl and Alt modifiers with any other key and can also involve multi-key sequences.  (The Lisp Machine keyboard, also called the space-cadet keyboard, had two additional modifiers, called Super and Hyper.)  Once you learn these sequences, however, you can make text fly on the screen.  I remember sitting next to one expert when I first got to MIT: I stared at the screen and watched the program transform itself, and I couldn't relate what I was seeing to the keystrokes used to create that effect.  I have since had the same effect on other people.

For many years I taught courses based on the excellent book Structure and Interpretation of Computer Programs, which uses the Scheme programming language.  My interpreter of choice for that course was MIT Scheme, whose editor component is based on Emacs.  However, students preferred to use DrScheme, simply because it has a more conventional editor.  I can understand their reluctance to learn the Emacs keystrokes, since they thought this skill would be useful only in that single course.  But there is no justification for professional developers to avoid learning skills that will make their work easier and faster.

Unfortunately, the tendency among tool developers is just the opposite.  They target their interfaces at the lowest common denominator, thus making the tools attractive to the novice user but hampering the experienced user.  Mouse-based interfaces are now heavily favored over keyboard shortcuts, to the extent that Microsoft has eliminated the underlined characters that indicate which keyboard shortcuts are available; these can be restored through the "accessibility options" dialog, as though you are somehow mouse-challenged if you want to use the keyboard.  (Hint: look for "Show extra keyboard help in programs".)

Because of this behavior of tool writers, users have no chance to learn the keyboard shortcuts and make their use automatic.  Thus, they are stuck with the laborious use of the mouse and never outgrow their training wheels.  As always, you should use the right tool for the job.  The mouse is good for applications you only use occasionally, and is indispensable for some graphics applications.  But keep your hands on the keyboard for software development!

Monday, January 10, 2011

Pointers: Two Views on Software Development

I recommend reading the article Risks of Undisciplined Development by David Parnas in the Inside Risks column of the October 2010 issue of Communications of the ACM.  David Parnas is famous (among other things) for opposing Reagan's Strategic Defense Initiative (SDI, nicknamed "Star Wars") on the grounds that the state of the art in software engineering was not sufficient to build such a system (it still isn't).  The article urges using, and teaching, disciplined development practices.

On a lighter note, here is another view of a good software development process:

[xkcd comic]

This is taken from xkcd, which I follow regularly (together with Dilbert).

Sunday, January 2, 2011

It Looked Good When We Started

"Well, in our country," said Alice, still panting a little, "you'd generally get to somewhere else — if you run very fast for a long time, as we've been doing."
"A slow sort of country!" said the Queen. "Now, here, you see, it takes all the running you can do, to keep in the same place. If you want to get somewhere else, you must run at least twice as fast as that!"
— Lewis Carroll, Through the Looking-Glass, and What Alice Found There

The Red Queen's race is often quoted in discussions about the increasing rate of change in our lives, which demands running just to stay in the same place.  This is particularly true in software development, where change is the only constant.  Any successful software system will need to be changed, and, if it can't be, will fall by the roadside.  Change can be due to many factors, such as new legislation, new standards, new business opportunities, the need to support a larger volume of business, or the need to move to a new platform (for example, when an operating system is no longer supported by the vendor).  The introduction of a software system into an environment that didn't have one before changes that environment, and that change ripples outward to create new requirements.  Perhaps initially the system was only meant to replace manual work, as when using computers to track customer orders instead of doing it on paper.  But once the system is in place, users realize that it can do much more; for example, letting customers enter their own orders through the internet.  Adding that functionality may trigger the idea that customers should also be able to track the order fulfillment process through the internet.  Every change is the catalyst for another one.

There is a common misconception that software is easy to change.  It is certainly true that distributing software updates is much easier (and cheaper) than changing hardware.  However, making changes in software correctly while preserving previous functionality and without introducing new bugs is very difficult, as all experienced software developers know.  (In spite of that, they are often content to get their software more-or-less working, and rely on the users to discover problems and on future releases to fix them; see Programmers as Children.)

Obviously, it makes sense to prepare for future changes, so as to make the inevitable ones as easy as possible.  Some requirement changes, however, are beyond prediction, and many predictions that seem obvious at the time turn out to be wrong.  These unfortunate facts have far-reaching implications for all aspects of software development.  They mean that it is impossible to capture the precise system requirements, since these will change unpredictably.  They also mean that it is impossible to prepare for all future changes, since every design decision makes some future modifications easier while making others more difficult.  But they do not mean that we should just build for known requirements and ignore the possibility (or rather, inevitability) of future changes.

The Agile Development community takes an extreme view of our inability to predict future changes, and uses the rule "do the simplest thing that could possibly work" (abbreviated DTSTTCPW).  This rule is based on the assumption that keeping the code as simple as possible will make future changes easier; it also assumes the meticulous use of refactoring to keep the code as clear as possible while making changes.  (I have a lot to say about refactoring, but that will have to wait.)  However, some decisions are very difficult to back out of, and I believe that most agilists will agree that planning ahead for such decisions does not violate DTSTTCPW.

One of the earliest decisions that must be made, and one that has a profound influence on all aspects of the development process, is about tooling, including the programming languages to be used.  Changing the language after a considerable amount of code has been written is very difficult.  Changing languages in a system that has been in operation for over twenty years and contains several million lines of code is almost impossible; this almost amounts to a complete rewrite of the system.  Much of the knowledge about many such systems has been lost, and the original developers are probably doing other things, possibly in other companies, and may even be retired.  Yet there are many large-scale critical systems that are over twenty and even thirty years old.

This received world attention in the 1990s, with the famous Year 2000 (or Y2K) Problem.  Programmers in the 1960s were worried about the size of their data on disk, since storage was much, much smaller in those days.  It seems incredible today that you can get a terabyte disk for well under $100; in those days you would be paying by the kilobyte.  So the decision to store only the last two digits of the year was very natural at the time.  In fact, it can be argued that it was the right decision for the time.  It was inconceivable that these systems would last until the year 2000; it was obvious that they would have to be replaced well before then.  Programmers kept to this design decision during the 1970s, 1980s, and, incredibly, sometimes even in the 1990s, although by then storage sizes had grown, prices had come way down, and 2000 was no longer the far future.
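
To make the pitfall concrete, here is a minimal sketch in Python (the function and values are invented for illustration, not taken from any real system) of how a two-digit year comparison goes wrong:

    # Two-digit years make 2000 look like 00, which compares as earlier than 99.
    def card_expired(expiry_yy, current_yy):
        # Compare years the way a record that stores only "YY" must.
        return expiry_yy < current_yy

    # A card expiring in 2000 ("00"), checked in 1998 ("98"):
    print(card_expired(0, 98))   # True -- the card is rejected as long expired
    # With full four-digit years the comparison behaves as intended:
    print(2000 < 1998)           # False -- the card is still valid

A common remediation, besides widening the stored field, was "windowing": treating years below some cutoff as 20xx and the rest as 19xx, which only postpones the problem.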

To the amazement of the 1960s programmers, some of the systems they wrote then are still in operation today.  Naturally, these survivors contain some of the most critical infrastructure used by banks, insurance companies, and government bodies.  All of these had to be revamped before 2000 (actually, some even before 1998; for example, credit cards issued in 1998 expired in 2000).  This cost an incredible amount of money, and quite a few companies made their fortunes from Y2K remediation.  The fact that this effort was largely successful, and that none of the doomsday predictions made before 2000 came to pass, should not detract from the importance of the problem and the necessity of fixing it.

Now switch points of view, and think twenty years into the future.  The code we are writing today is the legacy of the future; what kind of problems are we preparing for ourselves, or for the people who will be saddled with these systems after us?  I claim that we are being as inconsiderate as the programmers of the 1960s, and with less justification.  After all, we have their lesson to learn from, whereas they were pioneers and had no past experience to draw on.

These days, we are building systems using a hodgepodge of languages and technologies, each of which is rapidly changing.  A typical large-scale project will mix Java for the business logic with JavaScript for the browser front end, SQL for the back-end database, and one or more scripting languages for small tasks.  A legacy Cobol system may be in the mix as well.  The Java code will be built on top of a number of different frameworks for security, persistence, presentation, web services, and so on.  Each of these will change at its own pace, necessitating constant updates to keep up with moving technologies.  How will this be manageable in twenty or thirty years?

Think of a brand new application written by a small startup company.  They are very concerned about time to market, but not really thinking twenty years into the future (after all, they want to make their exit in five years at the most).  And their application is quite small, really, just perfect for a scripting language such as Perl or Python, or a framework like Ruby on Rails.

If they fail, like most startups, no harm is done.  But suppose their idea catches on, and they really make it.  They keep adding more and more features, and get more and more users.  Suddenly they have many millions of users, and their system doesn't scale any more.  Users can't connect to their site, transactions get dropped, and revenue is falling.  Something must be done immediately!  The dynamic scripting language they used was very convenient for getting something running quickly, because it didn't require static type declarations, classes could be created on the fly, and so on.  But now that the application consists of 20 million lines of code, it is very difficult to understand without that missing information.
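
To illustrate the kind of missing information I mean, here is a short Python sketch (the names and fields are invented for illustration; this is not code from any real startup):

    class Order(object):
        pass

    def load_order(record):
        # Fields are attached only at run time; nothing declares what an Order holds.
        order = Order()
        for key, value in record.items():
            setattr(order, key, value)   # e.g. customer_id, total, ship_date
        return order

    def needs_review(order):
        # Is total a number or a string?  Can ship_date be missing?
        # The answers live wherever `record` was built, not in any declaration.
        return order.total > 1000 and order.ship_date is None

    print(needs_review(load_order({"customer_id": 7, "total": 2500, "ship_date": None})))

This kind of flexibility is wonderful in a prototype; spread over millions of lines and hundreds of developers, it means every reader has to reconstruct the data model in their head.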

This scenario is not fictional; many successful internet companies went through such a phase.  In fact, even infrastructure tools went through such changes.  For example, consider PHP, one of the most successful web scripting languages today.  PHP 4 remained backward compatible with PHP 3, carrying over a number of serious problems, because it was deemed inappropriate to disrupt the work of tens of thousands of developers.  PHP 4 was wildly successful, with the result that when incompatible changes had to be made in PHP 5, hundreds of thousands of developers were affected, and adoption of PHP 5 was delayed.

I think that developers often use the wrong tool for the job (see The Right Tool for the Job) because it looks like the right tool when development starts.  Changing circumstances then make it the wrong tool, but switching tools can be extremely costly.  With hindsight, it was obviously a mistake to base the whole application on a scripting language in the scenario described above, even though it looked like a good idea at the time.  Can we use foresight to avoid such mistakes in the future?