Rice PLT TeachScheme Project: Slides: TCEA State 1998: AP Without C++

TCEA State 1998 Slides

Problems with Traditional Languages

Note that the problems described earlier are not exclusive to C: most of them exist in all traditional languages (such as Pascal, C, and C++). Nor is that list exhaustive. Here are some other problems with such languages.

Here's a simple Pascal program:

program Overflow;
var x : integer;

begin
  x := 65530;
  x := x + 5;
  writeln (x);
end.

What output does this program produce?

Depending on the compiler and hardware architecture being used (in particular, on how many bits an integer occupies), this program can produce one of several outputs:

A positive number (the correct answer, 65535)
A negative number
A positive number (the wrong answer!), due to double-overflow

DrScheme allows numbers to be as large as necessary; the only restriction is the amount of memory in the computer. In contrast, a language like Pascal restricts the user to about 32 bits, even if there are 4 megabytes of memory going unused!
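To make these outcomes concrete, here is a minimal C++ sketch that simulates running the Pascal program above with a 16-bit variable. The 16-bit width is an assumption chosen for illustration, and the helper names (`store_unsigned16`, `store_signed16`, and the two `pascal_program_*` functions) are hypothetical, not part of the original slides.

```cpp
#include <cstdint>

// Hypothetical helpers (not from the original slides). They simulate what
// happens when a value is stored in a 16-bit variable.

// Keep only the low 16 bits, read as an unsigned number (0..65535).
constexpr long store_unsigned16(long v) {
    return static_cast<std::uint16_t>(v);  // conversion to unsigned is modular
}

// Keep only the low 16 bits, read as two's complement (-32768..32767).
constexpr long store_signed16(long v) {
    return store_unsigned16(v) >= 32768 ? store_unsigned16(v) - 65536
                                        : store_unsigned16(v);
}

// x := 65530; x := x + 5;  with x as a 16-bit signed variable.
constexpr long pascal_program_signed16() {
    return store_signed16(store_signed16(65530) + 5);
}

// The same two statements with x as a 16-bit unsigned variable.
constexpr long pascal_program_unsigned16() {
    return store_unsigned16(store_unsigned16(65530) + 5);
}
```

Under these assumptions, `pascal_program_signed16()` yields -1 while `pascal_program_unsigned16()` yields 65535. Only the second matches the arithmetic a student would do on paper.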

Here's a C program with confusing behavior:

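#include <stdio.h>

struct PairCell { int x; int y; };
typedef struct PairCell * Pair;

Pair readPair () {
  struct PairCell p;   /* p lives on readPair's stack frame */

  scanf ("%d %d", &(p.x), &(p.y));
  return (&p);         /* the bug: returns the address of a local variable */
}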

The function actually follows a principle dictated by top-down programming style: it isolates the program's I/O from other processing.

Unfortunately, the pair allocated for p lives on the stack. Once the function returns and another function is called, the contents of that location will almost certainly be overwritten by other values, which may be meaningless in this context (or, more insidiously, may be close to the actual values).

In fact, this represents a fairly simple kind of memory error. In general, memory leaks and dangling pointers are some of the worst errors that arise from using these languages.

In reality, students shouldn't have to bother with these details. Clearly, they are details that never arise on paper. In addition, programs called garbage collectors usually do much better than humans at dealing with these problems. Systems like DrScheme come with built-in garbage collectors. (In the Scheme equivalent of the above example, the pair is allocated on the heap, not the stack, and the garbage collector returns its memory to the system once the pair is no longer needed. All this is completely transparent to the programmer.)
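For comparison, here is a rough sketch of what it takes in C++ to keep the pair alive after the reading function returns: the programmer has to ask for heap storage explicitly and decide who owns it. The function name `readPairFrom` and its string parameter are hypothetical (they replace `scanf` so the example runs without keyboard input), and `std::unique_ptr` is a standard-library facility that postdates these slides.

```cpp
#include <cassert>
#include <cstdio>
#include <memory>

struct PairCell {
    int x;
    int y;
};

// Hypothetical variant of readPair: parses two integers from `text`
// instead of standard input and returns a heap-allocated pair.
// Returns an empty pointer if `text` does not contain two integers.
std::unique_ptr<PairCell> readPairFrom(const char *text) {
    assert(text != nullptr);
    // Heap allocation: the pair outlives this function call.
    std::unique_ptr<PairCell> p(new PairCell{0, 0});
    if (std::sscanf(text, "%d %d", &p->x, &p->y) != 2) {
        return nullptr;
    }
    return p;  // ownership moves to the caller; freed when the owner is destroyed
}
```

The allocation, the ownership transfer, and the failure case are all things the programmer must write and get right by hand. That is exactly the bookkeeping a garbage-collected system takes over.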

All of these problems can and do occur in C++. The ETS has muted criticism by instead shifting focus to its much-vaunted AP classes, which offer beneficial features like bounds-checking on vector accesses. Unfortunately, the advantages of these classes are easily compromised by the rest of the language, which is not under the ETS's control. Here's an example involving an AP class.

vector<int> Bar(20);
Bar = 100;
Bar[20] = 2;

In the first statement, the programmer allocates a vector, Bar, of twenty elements. (Recall that the legal indices for this vector are in the range [0,19].) The third statement is clearly erroneous, since it accesses an element beyond the end of the vector. Since this is an AP vector, the error should be caught. But what's the second statement? Perhaps the student left out a subscript (and meant to assign the value to Bar[0]). Or maybe the student thought this statement initializes all the elements of the vector. Hopefully the compiler will flag it as an error. Either way, the error in the third statement will be caught. Right?

Wrong. The second statement redefines Bar to be a 100-element vector! The compiler converts the integer to an invocation of the vector constructor (whatever that means to the beginning student). Index 20 is now legal, so the third statement goes through without complaint. The bounds-checking of the AP classes has been rendered useless.
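The mechanism can be reproduced with a small stand-in class. The sketch below is not the actual AP vector class: `CheckedVector`, `length`, and `replaySlideExample` are hypothetical names, and the class is assumed to have a one-argument size constructor that is not marked `explicit`.

```cpp
#include <cassert>
#include <cstddef>
#include <stdexcept>
#include <vector>

// Hypothetical stand-in for a bounds-checked vector class (not the actual
// AP class). The one-argument constructor is deliberately NOT marked
// `explicit`, so the compiler may use it as an implicit conversion from int.
class CheckedVector {
public:
    CheckedVector(int size) : items_(checkedSize(size), 0) {}

    int length() const { return static_cast<int>(items_.size()); }

    // Bounds-checked element access.
    int &operator[](int index) {
        if (index < 0 || index >= length()) {
            throw std::out_of_range("CheckedVector index out of range");
        }
        return items_[static_cast<std::size_t>(index)];
    }

private:
    static std::size_t checkedSize(int size) {
        assert(size >= 0);
        return static_cast<std::size_t>(size);
    }

    std::vector<int> items_;
};

// Replays the three statements from the slide and reports the final length.
int replaySlideExample() {
    CheckedVector Bar(20);
    Bar = 100;    // compiles: treated as Bar = CheckedVector(100)
    Bar[20] = 2;  // no error: index 20 is legal in a 100-element vector
    return Bar.length();
}
```

In this sketch `replaySlideExample()` returns 100, and the bounds check never fires. Declaring the constructor `explicit` would turn `Bar = 100;` into a compile-time error, but that is a decision the class author has to make, and one a beginning student cannot be expected to know about.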

PLT / scheme@cs.rice.edu

Last modified at 15:07:16 CST on Sunday, February 08, 1998