Monday, December 27, 2010

The Right Tool for the Job

No professional plumber would think of using a wrench instead of a hammer (although quite a few amateurs wouldn't think twice before doing so).  In fact, professionals avoid adjustable wrenches, using the appropriate fixed wrench for each task.

Developers are much less picky about the tools they use.  Take, for example, the C programming language.  It was designed for the development of the Unix operating system, and is well suited to that task.  It allows low-level access to the computer's resources, it can be compiled into very efficient code, and it requires very little run-time support while not being tied to any specific hardware platform.  In essence, it is a high-level, machine-independent assembly language.  Of course, these properties, so desirable for writing system code, required the language designers to make choices that make C extremely unsuitable for most other programming tasks.  For example, the desire for efficiency and free access to all memory led to unconstrained array access, without any runtime bounds checking.  This has enabled the infamous buffer-overrun attacks, so favored by writers of viruses and other malware.  In turn, these require extra care from application developers who use C or C++ (which, unfortunately, preserved this and many other undesirable properties of C), and have necessitated the development of various tools for checking C/C++ programs for these kinds of errors.
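
To make this concrete, here is a minimal C sketch of the problem (the function and buffer size are invented for illustration):

    #include <stdio.h>
    #include <string.h>

    /* Hypothetical example: copy a caller-supplied name into a
       fixed-size buffer.  C performs no bounds checking, so strcpy
       happily writes past the end of buf. */
    static void greet(const char *name)
    {
        char buf[8];
        strcpy(buf, name);             /* no check that name fits in 8 bytes */
        printf("Hello, %s\n", buf);
    }

    int main(void)
    {
        greet("Ada");                                  /* fine */
        greet("a-name-much-longer-than-eight-bytes");  /* overruns buf */
        return 0;
    }

The second call silently overwrites whatever lies beyond buf on the stack; carefully crafted input of this kind is exactly how buffer-overrun exploits hijack a program's control flow.  A language with runtime bounds checking would stop at the offending write instead.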

Why did this happen?  There are several reasons why programmers don't always use the right tool for the job.  One is the generality of our tools.  Usually, the same application can be written in any general-purpose language.  One language may be particularly well suited to the task, but, with more effort, others can be used.  Another reason is the steep learning curve associated with our tools.  It takes a lot of time to learn all the intricacies of a programming language (or other tool), and a programmer who has mastered one will be reluctant to start from scratch with another just to perform a single task.  Obviously, the developers of the Unix kernel were masters of the C language (and influenced or even participated in its evolution).  It was natural for them to write all other parts of the system, including user applications, in the same language.  The great success of Unix and C created a large body of programmers whose hammer was C, and all applications their nail.  As a result, the first really popular object-oriented language, C++, was built as an extension of C, inheriting all the properties that make C unsuitable for user-application development.

It took a long time until a successor appeared, in the form of Java, which rejected many of the assumptions underlying C and C++, and removed many of their undesirable properties.  (Of course, we are still stuck with the awful syntax, which is responsible for a whole set of problems by itself.)  First and foremost, Java challenged the assumption that the language must be compilable into extremely efficient code; this assumption is in fact wrong for most application development work.  Yes, inner loops in database management systems, communications, cryptography, and similar software must be heavily optimized.  But it is much more important for business (and desktop) applications to be free of security vulnerabilities than to be a little faster.  In addition, development expenses, which include developer time as well as the cost of tools that attempt to find security vulnerabilities and other problems, often outweigh the benefits of application speed.  And of course, the loss of customer trust following a security breach can be devastating.

We saw two reasons why developers don't always use the right tool for the job: the overlap between tools due to their generality, and the difficulty of learning a new one.  These have a corollary, resulting from the nature of organizations.  While it is good to have experts in various technologies and tools, it is important for a cohesive organization to have a common set of tools, a lingua franca that everybody is familiar with, for the main part of the development work.  This is important for good communication, and also makes integration easier.  But it also leads to the use of the least common denominator: the language and tools that can support most of the work, even if they are not the best for any single purpose.  Thus I find myself these days programming in Java, although it may not be my first choice for the kinds of systems I develop.

There are more reasons why developers don't use the best tool for the job, and many other tools worth discussing.  But these will have to wait for later posts.

P.S. See the excellent book, C Traps and Pitfalls, for a list of syntactic and semantic issues you must be aware of if you develop in C or a derivative language.

Tuesday, December 21, 2010

Children Programmers

No high-school graduate expects to get a job as a physicist, even after getting top grades in the Advanced Placement physics exams.  But I know of many high-school students who make money programming, and who think of themselves as professional programmers.  No large or knowledgeable company will use such child labor, but there are still many small businesses and other clients who don't understand what really makes a professional developer, and are glad to save some money this way.

If this is a small one-time job that doesn't require maintenance, it may even be successful.  But if it becomes necessary to make changes, especially after the programmer has graduated from high school and moved elsewhere, the client often needs to scrap the application and start from scratch, having learned something the hard way.

This is not the children's fault.  They really don't understand what being a professional programmer is about.  In fact, this is not always discussed explicitly in the university computer-science curriculum.  Yes, there are courses on software engineering as well as project-based courses, and conscientious professors require extensive tests and documentation even in other courses.  But the curriculum is divided into many courses, each with its own specific requirements, and it is very difficult to fit into the undergraduate program a truly large project, one that really requires good software-engineering practices.

Without such projects, the exercise is self-defeating.  Students will do whatever is required to get a grade, which often means creating an impression of using the practices required of them.  But they will realize that this isn't really necessary for the project itself.  As a result, they will feel that software engineering is another of those things best forgotten after the exam.  Hopefully, they will recall what they learned when they start working for real, and encounter the problems that make disciplined development a necessity.  I still occasionally meet former undergraduate students who tell me that now that they work in industry they finally understand what I was talking about.

Being professional is actually much more than following some software engineering process.  It is taking responsibility for your work: pursuing quality, avoiding questionable practices, refusing to release dangerous products.  It is taking responsibility for yourself: keeping yourself educated, continuously learning, being honest with your co-workers, employer, and the public in general.  All these things can't be taught in one course, in the same way that children can't be educated to be responsible citizens in one year.  Both require lifelong learning, and teaching is best done by example.  And with so many adults behaving like children (see "Programmers as Children"), how can we expect the children to behave like adults?

Thursday, December 16, 2010

Programmers as Children

Freud is quoted as saying that the postponement of gratification is the hallmark of maturity.  Developers are put to the test very often, since the computer can supply instant feedback on our programs: just run the program and see what happens.  And, unfortunately, developers often fail this test.  It is much easier to try it out than to think it out!

From the time I started programming until the beginning of my student days, we could only use punched cards for our programs.  (Professors and other privileged people could use "terminals"!)  You had to punch your cards, then put the shiny deck on the dispatcher's table.  This high priest of the computer would be reading a newspaper, and although he would notice you staring at him, would ignore you completely.  Trying to hint to him that he should put your deck in the card reader would only make him slower.  Eventually, he would perform this operation, and then you had to wait for him to fetch the fan-folded paper from the big noisy line printer, separate it into individual printouts, then put it in your box.  And then you would find out that you forgot a comma.

The whole process would take about half an hour of turnaround time (for a few seconds of actual computation).  This made us very careful when submitting a job.  And, since there are only so many times you can go for coffee in one day, we would spend the turnaround time running the program in our heads: trying to figure out what could go wrong with it and fixing it on paper.

I will be the first to admit that checking for syntax errors is best left to the computer.  It is a waste of a developer's time to have to scan his code for them.  But time devoted to thinking about a program is time well spent.  As an undergraduate student, I wasn't thinking about formally proving my programs correct, and I didn't have the formal training for it, but in essence that is what I was trying to do.  And the result was that I found many logic problems before ever submitting the deck.

In my last year of undergraduate studies at Tel Aviv University, I took a compilation course and had to write a small compiler as a course project.  I decided to write it in a language made for the purpose: SNOBOL4.  Unfortunately, Tel Aviv U didn't have a SNOBOL4 compiler (probably because there wasn't one for the CDC 6600).  But I was taking a graduate course at the Weizmann Institute in Rehovot, and had an account on their IBM machine, which had the SPITBOL compiler for SNOBOL4.  Being limited to public transportation, I only visited Rehovot once a week, to attend my course.  So even though I had access to a 3270 terminal there, with almost instant turnaround time, I had to spend a week just staring at my listings between sessions.  I used that time to think about my compiler, and as a result finished debugging the whole thing in just two sessions at the terminal.  All told, I spent less time on the project than I would have if I had had daily access to a terminal.  (Having a computer all my own to do my debugging on was totally unthinkable at the time.)

I'm ashamed to admit that I now find myself hitting "run" to see what happens, without thinking things through.  If it fails, I have learned something with little investment.  But what if it "works"?  Can I be convinced that my last change really eliminated the last bug, and that I can go on to work on other things?  I still need to think carefully and convince myself that it will work in all other cases.
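
Here is a minimal illustration, invented for the purpose, of how a change can "work" on the run you try without being right:

    #include <assert.h>
    #include <limits.h>

    /* Hypothetical example: the midpoint of two non-negative ints.
       It passes the quick check below, so a single run looks like
       success -- but (lo + hi) overflows when both are large. */
    static int midpoint(int lo, int hi)
    {
        return (lo + hi) / 2;          /* overflows for large lo and hi */
    }

    int main(void)
    {
        assert(midpoint(2, 6) == 4);   /* passes; looks done */
        /* midpoint(INT_MAX - 1, INT_MAX) silently overflows;
           the safe form is lo + (hi - lo) / 2. */
        return 0;
    }

Only reasoning about the possible ranges of lo and hi, not another quick run, exposes the problem.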

Testing is a useful way of increasing your confidence in your code.  But testing is not a replacement for thinking about the logic of your program!  What's more, writing good tests is a process that requires time and thought on its own.  And, like mental reasoning about programs, it is a process that many developers neglect.  But, more on that some other time.  I've got some code I need to think about.

Wednesday, December 15, 2010

My Dusty Deck

I started programming when I was about twelve, and I have enjoyed it ever since.  I was lucky to have spent many years in academia, so I was my own master.  Even when I consulted for a data-security company, I was responsible for security- and performance-critical code, and was never pressured to compromise on quality.

Sadly, most developers aren't that fortunate.  The results are all around us, from the daily annoyances on personal computers to lost lives and property.  (For some non-bedtime reading about the latter, see Computer-Related Risks by Peter Neumann.)

Can we do something about this?  There are many factors involved in producing high-quality maintainable software.  There are technical, psychological, and educational issues, and no one can claim to cover them all.  The Dusty Deck is my attempt to present my own point of view, shaped by my own experience in developing software and teaching computer science.

I am fascinated by good tools.  I want to have the best tools possible to do my job, and I get very frustrated when a tool doesn't understand what I want to do and gets in the way instead of helping me do it faster and better.  Because I know what tools I want to have for my own use, most of my research is focused on the creation of tools I can believe in.  In pursuing this goal, I spent four years at the Programmer's Apprentice project in MIT's AI lab, where I learned many things that I used in later research, most notably the plan calculus representation of programs.  (For more information, see The Programmer's Apprentice by Rich and Waters.)  Work with students and colleagues over the years led me into the field of static analysis of programs, which I am continuing now at IBM Research.

Over the years, I have been exposed to many languages, development environments, platforms, operating systems, and frameworks.  I started with Fortran on punched cards on a CDC 6600, went on to Pascal in the Introduction to Computer Science course at Tel Aviv University, played with many other languages (Prolog, APL, various assembly languages), and used Lisp on the Lisp Machine at MIT (happy days!).  C became my language for low-level work, Java for higher-level programming, and various scripting languages for small tasks.  During all this work I developed some strong views, likes and dislikes, which I will share here.

I have taught core courses based on the books Structure and Interpretation of Computer Programs by Abelson and Sussman, and Object-Oriented Software Construction by Meyer, using Scheme, Eiffel, and Java (although the specific language was never a goal in itself).  For several years I have also been serving on the Israel Ministry of Education's Computer Science Curriculum Committee.  These experiences shaped my thinking about education.

To start with something controversial, in the next post I'll discuss the relationships between programmers and children.  Let me know what you think!

P.S. If you aren't familiar with the term "dusty deck," see its definition in the Jargon File.