Author : John Morris
Robust programs check themselves: the programmer builds into them test code that verifies that conditions which should exist at various points in the program do actually exist. Programming errors that cause a program to enter erroneous states trigger this test code and emit warning messages. The programmer can then correct the error and re-run the program. If - as often happens - the 'fix' is not the correct one, the message will appear again. The programmer continues correcting the code until the error disappears. One advantage of this test code is that it can detect erroneous states of the program well before any output is generated: thus the path backwards from the error to its source is shorter and easier and quicker to trace.
A Very Common Problem
Failure to initialise pointers is an extremely common error in C programs. In the OO design strategy we've been using here, failure to call an object's constructor before trying to operate on ('use') the object will result in passing an uninitialised pointer to a class method. Many operating systems clear memory to zeros before running a program, so an uninitialised pointer will often be NULL (the zero address), although C itself does not guarantee this. What we do know is that when any method (other than the constructor) of a class is called, it should be passed a pointer to an object which is not NULL. A robust program will have code inserted to detect this:
int some_method( some_class c, int p1, int p2 ) {
if ( c == NULL ) {
printf("some_method: error null object\n");
return .. ; /* an appropriate error value */
}
... /* perform some_method on c */
}
While this achieves the desired aim - the program itself detects some errors - it is not very efficient. The program is loaded down with this self-checking code which increases its size and slows it down. A slightly better alternative is:
int some_method( some_class c, int p1, int p2 ) {
#ifdef TEST_MODE
if ( c == NULL ) {
printf("some_method: error null object\n");
return .. ; /* an appropriate error value */
}
#endif
... /* perform some_method on c */
}
With this approach, when the program is being tested, the programmer inserts
#define TEST_MODE
somewhere in the code (or uses the compiler options to 'pre-define' it) and removes the #define when testing is complete and the checking code is believed to be redundant. This results in considerable self-checking while the program is being tested and a lean, efficient, fast production program.
Assertions
While this strategy will work well, C has an in-built mechanism which is neater. Because it is part of the C standard, all compilers provide the mechanism and any software engineer using C will be familiar with it.
At various points in the program, we can make assertions about the state of the program. In the example above, we assert that on entry to the function, the pointer should not be NULL. We then add to the program a call to the assert routine:
#include <assert.h>
int some_method( some_class c, int p1, int p2 ) {
assert( c != NULL );
... /* perform some_method on c */
}
If the assertion is not true, then the program will print an error message and exit. Implementations of assert vary in the details of the information that they print when an assertion fails, but a typical message will look like this:
Assertion failure at line 34 in file some_class.c: c != NULL
As you can see, this results in much simpler robust code. We simply add the assert calls to our programs at points at which we can make assertions about the state of the program. assert takes as its argument any boolean expression. Like any other C expression, this may be arbitrarily complex. If the argument is true, then assert simply returns. However, if it is false, it prints a message specifying exactly where in your program the error occurred and the condition causing the error. To use assert, you must include its specification, <assert.h> at the head of the program.
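As a slightly fuller illustration, here is a minimal sketch of a class with an asserted pre-condition. The counter type, its constructor cons_counter and the method counter_add are hypothetical names invented for this example; they are not part of any library.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical class: a counter object with a constructor and one method. */
typedef struct counter_t { int total; } *counter;

/* Constructor: returns NULL if memory cannot be allocated. */
counter cons_counter(void) {
    counter c = malloc(sizeof(struct counter_t));
    if (c != NULL) c->total = 0;
    return c;
}

/* Method: add n to the counter and return the new total. */
int counter_add(counter c, int n) {
    assert(c != NULL);   /* pre-condition: the object has been constructed */
    c->total += n;
    return c->total;
}
```

A caller that forgets the constructor and passes a NULL counter to counter_add is stopped at the assert, with a message naming the file, the line and the failed condition, rather than crashing somewhere later.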
Multiple assertions
One of the most common places that asserts should be inserted is at the beginning of functions. At this point, you usually know quite a bit about the values of the various parameters to the function. These will be set out clearly in the pre-conditions for the method. In our example above, not only do we know that the object c should have been constructed properly, but we will generally know something about the legal or sensible values for the parameters, p1 and p2. This state information should be added to your program as an assert statement, e.g.
#include <assert.h>
int some_method( some_class c, int p1, int p2 ) {
assert( (c != NULL) && (p1 > 0) && ( (p2 >= 0) && (p2 <= MAX_P) ) );
... /* perform some_method on c */
}
While this makes a robust program, there is a slight problem: if any one of the conditions fails, the message printed will be:
Assertion failure at line 34 in file some_class.c:
(c != NULL) && (p1 > 0) && ( (p2 >= 0) && (p2 <= MAX_P) )
Although the program still checks itself, debugging is difficult: we don't know which condition caused the assertion to fail. Thus a better strategy is to insert multiple asserts:
#include <assert.h>
int some_method( some_class c, int p1, int p2 ) {
assert( c != NULL );
assert( p1 > 0 );
assert( (p2 >= 0) && (p2 <= MAX_P) );
... /* perform some_method on c */
}
Now when an assertion is raised, the error message will clearly indicate which parameter has the illegal value.
Of course, we could take this philosophy to its extreme and split the last assert into two, so that we know whether p2 exceeded its lower or upper bound. Because I'm lazy, I'm generally content to get a flag that p2 was wrong: usually that is enough to rapidly identify the source of the problem, so that I can keep my code a little more compact. However, the thorough approach that has only one condition to each assert should probably be preferred: I occasionally find that I have to split the compound condition into two, re-compile and re-run the program, which means that it would have been more efficient to type two asserts in the first place!
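For completeness, the thorough one-condition-per-assert version might look like the sketch below. The struct behind some_class, the value chosen for MAX_P and the body of the method are assumptions made only so that the example compiles.

```c
#include <assert.h>
#include <stddef.h>

#define MAX_P 100   /* hypothetical upper bound for p2 */

/* Hypothetical object type standing in for some_class */
typedef struct some_class_t { int value; } *some_class;

int some_method(some_class c, int p1, int p2) {
    assert(c != NULL);     /* object was constructed */
    assert(p1 > 0);        /* p1 must be positive */
    assert(p2 >= 0);       /* p2 lower bound */
    assert(p2 <= MAX_P);   /* p2 upper bound */
    c->value = p1 + p2;    /* placeholder for the real work */
    return c->value;
}
```

With this layout, the failure message quotes a single condition, so it identifies both the parameter and the bound that was violated.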
One of the benefits of the standard assert mechanism is that it's also efficient for production programs. To see why, let's look at the implementation of assert.
Implementation of assert
It is instructive to have a look in the file assert.h on your system to see how assert is implemented. You'll probably find something like this:
#ifndef NDEBUG
#define assert(x) {if(!(x)) { \
printf("Assertion failure at line %d in file %s: %s\n", \
__LINE__,__FILE__,#x); exit(1); } }
#else
#define assert(x)
#endif
This can be read as:
"If the symbol NDEBUG is not defined, then expand the text
assert(x)
into
{if(!(x)) {
printf("Assertion failure at line %d in file %s: %s\n",
__LINE__,__FILE__,#x); exit(1); } }
substituting the actual expression for x wherever it occurs.
However, if NDEBUG has already been defined, substitute nothing for assert(x) wherever it occurs, ie remove all asserts from the program."
Note the inner braces: without them, only the printf would be controlled by the if, and exit(1) would be executed every time an assert was reached.
The default situation is for the symbol NDEBUG to be undefined, which means that all asserts in your program are expanded into
{if(!( .. condition .. )) { printf( ... ); exit(1); } }
- exactly what you want when debugging and verifying the program. However, when you are satisfied that the program is running correctly, you arrange for NDEBUG to be defined, re-compile and re-link the program. On Unix and most other command-line driven compilers, you can define a symbol externally when the compiler is invoked:
cc -DNDEBUG -DX=3 prog.c
has the effect of pre-defining the symbols NDEBUG and X, giving X a value of 3. This is exactly the same as inserting a prologue containing:
#define NDEBUG
#define X 3
at the head of every program source file. In fact one GUI-based compiler (Metrowerks C/C++) simply allows you to specify a file which is compiled at the head of every file. Other GUI systems have compiler option windows which allow you to set the pre-defined symbols.
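One consequence of asserts disappearing when NDEBUG is defined is that the expression inside an assert is no longer evaluated at all, so it should not do any real work. The sketch below shows one way to keep a call and its check separate; store_item, store_checked and items_stored are hypothetical names used only for this illustration.

```c
#include <assert.h>

/* Hypothetical state and helper: store_item returns non-zero on success. */
static int items_stored = 0;

static int store_item(int value) {
    (void)value;          /* a real version would save the value somewhere */
    items_stored++;
    return 1;
}

/* Stores a value and returns the number of items stored so far. */
int store_checked(int value) {
    int ok = store_item(value);   /* always executed, with or without NDEBUG */
    assert(ok);                   /* this line vanishes when NDEBUG is defined */
    (void)ok;                     /* avoids an 'unused variable' warning then */
    return items_stored;
}
```

Had the call been written as assert( store_item(value) ), the item would be stored in the debugging build but silently skipped in the production build.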
Thus by using the assert mechanism, you can build programs which