Simply Scheme:To the Instructor Simply Scheme: Introducing Computer Science 2/e Copyright (C) 1999 MIT

To the Instructor

cover photo
Brian Harvey
University of California, Berkeley
Matthew Wright
University of California, Santa Barbara

Download PDF version
Back to Table of Contents
BACK chapter thread NEXT
MIT Press web page for Simply Scheme

The language that we use in this book isn't exactly standard Scheme. We've provided several extensions that may seem unusual to an experienced Scheme programmer. This may make the book feel weird at first, but there's a pedagogic reason for each extension.

Along with our slightly strange version of Scheme, our book has a slightly unusual order of topics. Several ideas that are introduced very early in the typical Scheme-based text are delayed in ours, most notably recursion. Quite a few people have looked at our table of contents, noted some particular big idea of computer science, and remarked, "I can't believe you wait so long before getting to such and such!"

In this preface for instructors, we describe and explain the unusual elements of our approach. Other teaching issues, including the timing and ordering of topics, are discussed in the Instructor's Manual.

Lists and Sentences

The chapter named "Lists" in this book is Chapter 17, about halfway through the book. But really we use lists much earlier than that, almost from the beginning.

Teachers of Lisp have always had trouble deciding when and how to introduce lists. The advantage of an early introduction is that students can then write interesting symbolic programs instead of boring numeric ones. The disadvantage is that students must struggle with the complexity of the implementation, such as the asymmetry between the two ends of a list, while still also struggling with the idea of composition of functions and Lisp's prefix notation.

We prefer to have it both ways. We want to spare beginning students the risk of accidentally constructing ill-formed lists such as

((((() . D) . C) . B) . A)

but we also want to write natural-language programs from the beginning of the book. Our solution is to borrow from Logo the idea of a sentence abstract data type.[1] Sentences are guaranteed to be flat, proper lists, and they appear to be symmetrical to the user of the abstraction. (That is, it's as easy to ask for the last word of a sentence as to ask for the first word.) The sentence constructor accepts either a word or a sentence in any argument position.

We defer structured lists until we have higher-order functions and recursion, the tools we need to be able to use the structure effectively.[2] A structured list can be understood as a tree, and Lisp programmers generally use that understanding implicitly. After introducing car-cdr recursion, we present an explicit abstract data type for trees, without reference to its implementation. Then we make the connection between these formal trees and the name "tree recursion" used for structured lists generally. But Chapter 18 can be omitted, if the instructor finds the tree ADT unnecessary, and the reader of Chapter 17 will still be able to use structured lists.

Sentences and Words

We haven't said what a word is. Scheme includes separate data types for characters, symbols, strings, and numbers. We want to be able to dissect words into letters, just as we can dissect sentences into words, so that we can write programs like plural and pig-latin. Orthodox Scheme style would use strings for such purposes, but we want a sentence to look (like this) and not ("like" "this"). We've arranged that in most contexts symbols, strings, and numbers can be used interchangeably; our readers never see Scheme characters at all.[3] Although a word made of letters is represented internally as a symbol, while a word made of digits is represented as a number, above the abstraction line they're both words. (A word that standard Scheme won't accept as a symbol nor as a number is represented as a string.)

There is an efficiency cost to treating both words and sentences as abstract aggregates, since it's slow to disassemble a sentence from right to left and slow to disassemble a word in either direction. Many simple procedures that seem linear actually behave quadratically. Luckily, words aren't usually very long, and the applications we undertake in the early chapters don't use large amounts of data in any form. We write our large projects as efficiently as we can without making the programs unreadable, but we generally don't make a fuss about it. Near the end of the book we discuss explicitly the efficient use of data structures.

Overloading in the Text Abstraction

Even though computers represent numbers internally in many different ways (fixed point, bignum, floating point, exact rational, complex), when people visit mathland, they expect to meet numbers there, and they expect that all the numbers will understand how to add, subtract, multiply, and divide with each other. (The exception is dividing by zero, but that's because of the inherent rules of mathematics, not because of the separation of numbers into categories by representation format.)

We feel the same way about visiting textland. We expect to meet English text there. It takes the form of words and sentences. The operations that text understands include first, last, butfirst, and butlast to divide the text into its component parts. You can't divide an empty word or sentence into parts, but it's just as natural to divide a word into letters as to divide a sentence into words. (The ideas of mathland and textland, as well as the details of the word and sentence procedures, come from Logo.)

Some people who are accustomed to Scheme's view of data types consider first to be badly "overloaded"; they feel that a procedure that selects an element from a list shouldn't also extract a letter from a symbol. Some of them would prefer that we use car for lists, use substring for strings, and not disassemble symbols at all. Others want us to define word-first and sentence-first.

To us, word-first and sentence-first sound no less awkward than fixnum-+ and bignum-+. Everyone agrees that it's reasonable to overload the name + because the purposes are so similar. Our students find it just as reasonable that first works for words as well as for sentences; they don't get confused by this.

As for the inviolability of symbols—the wall between names and data—we are following an older Lisp tradition, in which it was commonplace to explode symbols and to construct new names within a program. Practically speaking, all that prevents us from representing words as strings is that Scheme requires quotation marks around them. But in any case, the abstraction we're presenting is that the data we're dissecting are neither strings nor symbols, but words.

Higher-Order Procedures, Lambda, and Recursion

Scheme relies on procedure invocation as virtually its only control mechanism. In order to write interesting programs, a Scheme user must understand at least one of two hard ideas: recursion or procedure as object (in order to use higher-order procedures). We believe that higher-order procedures are easier to learn, especially because we begin in Chapter 8 by applying them only to named procedures. Using a named procedure as an argument to another procedure is the way to use procedures as objects that's least upsetting to a beginner. After the reader is comfortable with higher-order procedures, we introduce lambda; after that we introduce recursion. We do the tic-tac-toe example with higher-order procedures and lambda, but not recursion.

In this edition, however, we have made the necessary minor revisions so that an instructor who prefers to begin with recursion can assign Part IV before Part III.

When we get to recursion, we begin with an example of embedded recursion. Many books begin with the simplest possible recursive procedure, which turns out to be a simple sequential recursion, or even a tail recursion. We feel that starting with such examples allows students to invent the "go back" model of recursion as looping.

Mutators and Environments

One of the most unusual characteristics of this book is that there is no assignment to variables in it. The reason we avoid set! is that the environment model of evaluation is very hard for most students. We use a pure substitution model throughout most of the book. (With the background they get from this book, students should be ready for the environment model when they see a rigorous presentation, as they will, for example, in Chapter 3 of SICP.)

As the last topic in the book, we do introduce a form of mutation, namely vector-set!. Mutation of vectors is less problematic than mutation of lists, because lists naturally share storage. You really have to go out of your way to get two pointers to the same vector.[4] Mutation of data structures is less problematic than assignment to variables because it separates the issue of mutation from the issues of binding and scope. Using vectors raises no new questions about the evaluation process, so we present mutation without reference to any formal model of evaluation. We acknowledge that we're on thin ice here, but it seems to work for our students.

In effect, our model of mutation is the "shoebox" model that you'd find in a mainstream programming language text. Before we get to mutation, we use input/output programming to introduce the ideas of effect and sequence; assigning a value to a vector element introduces the important idea of state. We use the sequential model to write two more or less practical programs, a spreadsheet and a database system. A more traditional approach to assignment in Scheme would be to build an object-oriented language extension, but the use of local state variables would definitely force us to pay attention to environments.

[1] Speaking of abstraction, even though that's the name of Part V, we do make an occasion in each of the earlier parts to talk about abstraction as examples come up.

[2] Even then, we take lists as a primitive data type. We don't teach about pairs or improper lists, except as a potential pitfall.

[3] Scheme's primitive I/O facility gives you the choice of expressions or characters. Instead of using read-char, we invent read-line, which reads a line as a sentence, and read-string, which returns the line as one long word.

[4] We don't talk about eq? at all. We're careful to write our programs in such a way that the issue of identity doesn't arise for the reader.

(back to Table of Contents)

BACK chapter thread NEXT

Brian Harvey, bh@cs.berkeley.edu