Data Definitions for Lists of Arbitrary Length

Suppose we wish to represent the inventory of a toy store that sells such things as dolls, make-up sets, clowns, bows, arrows, and soccer balls. To make an inventory, a store owner would start with an empty sheet of paper and slowly write down the names of the toys on the various shelves. Representing a list of toys in Scheme is straightforward. We can simply use Scheme's symbols for toys and then construct lists from them with cons. Here are a few short samples:
empty
(cons 'ball empty)
(cons 'arrow (cons 'ball empty))
(cons 'clown empty)
(cons 'bow (cons 'arrow (cons 'ball empty)))
(cons 'clown (cons 'bow (cons 'arrow (cons 'ball empty))))
For a real store, the list will contain many more items, and the list will grow and shrink over time. In any case, we cannot say in advance how many items these inventory lists will contain. Hence, if we wish to develop a function that consumes such lists, we cannot simply say that the input is a list with either one, two, three, or four items. We must be prepared to think about lists of arbitrary length.

In other words, we need a data definition that precisely describes the class of lists that contain an arbitrary number of symbols. Unfortunately, the data definitions we have seen so far can only describe classes of data where each item is of a fixed size, such as a structure with a specific number of components or a list with a specific number of items. So how can we describe a class of lists of arbitrary size?

Looking back, we see that all our examples fall into one of two categories. The store owner starts with an empty list and constructs longer and longer lists. The construction proceeds by consing together a toy and another list of toys. Here is a data definition that reflects this process:
A list of symbols (list-of-symbols) is either
  1. the empty list, empty, or
  2. (cons s los) where s is a symbol and los is a list of symbols.
This definition is unlike any of the definitions we have seen so far or that we encounter in high school English or mathematics. Those definitions explain a new idea in terms of old, well-understood concepts. In contrast, this definition refers to itself in the item labeled 2, which implies that it explains what a list of symbols is in terms of lists of symbols. We call such definitions self-referential or recursive.

At first glance, a definition that explains or specifies something in terms of itself does not seem to make much sense. This first impression, however, is wrong. A recursive definition, like the one above, makes sense as long as we can construct some elements from it; the definition is correct if we can construct all intended elements.

Let's check whether our specific data definition makes sense and contains all the elements we are interested in. From the first clause we immediately know that empty is a list of symbols. From the second clause we know that we can create larger lists with cons from a symbol and a list of symbols. Thus (cons 'ball empty) is a list of symbols because we just determined that empty is one and we know that 'ball is a symbol. There is nothing special about 'ball. Any other symbol could serve equally well to form a number of one-item lists of symbols:
(cons 'make-up-set empty)
(cons 'water-gun empty)
...
Once we have lists that contain one symbol, we can use the same method to build lists with two items:
(cons 'Barbie (cons 'robot empty))
(cons 'make-up-set (cons 'water-gun empty))
(cons 'ball (cons 'arrow empty))
...
From here, it is easy to see how we can form lists that contain an arbitrary number of symbols. More importantly still for our problem, all possible inventories are adequately described by our data definition.
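The clause-by-clause argument above can also be carried out mechanically. What follows is only a minimal sketch, assuming one of the HtDP teaching languages (or Racket) in which empty?, cons?, first, rest, and symbol? are available. The name list-of-symbols? is hypothetical and does not appear in the text. The function has one cond clause per clause of the data definition and, like the definition, refers to itself in the second clause.

```scheme
;; list-of-symbols? : any-value -> boolean
;; Hypothetical helper (an illustration, not part of the text):
;; determines whether v belongs to the class list-of-symbols.
(define (list-of-symbols? v)
  (cond
    ;; Clause 1: the empty list is a list of symbols.
    [(empty? v) true]
    ;; Clause 2: (cons s los), where s must be a symbol and
    ;; los must itself be a list of symbols.
    [(cons? v) (and (symbol? (first v))
                    (list-of-symbols? (rest v)))]
    ;; Any other value lies outside the class.
    [else false]))

(list-of-symbols? empty)                                        ; true
(list-of-symbols? (cons 'bow (cons 'arrow (cons 'ball empty)))) ; true
(list-of-symbols? (cons 'ball (cons 5 empty)))                  ; false: 5 is not a symbol
```

Each application of list-of-symbols? peels off one cons and asks the same question of a shorter list, so the check for a constructed list eventually reaches empty, just as the hand argument does.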
Exercise 9.2.1 Show that all the inventory lists discussed at the beginning of this section belong to the class list-of-symbols.

Exercise 9.2.2 Do all lists of two symbols also belong to the class list-of-symbols? Provide a concise argument.

Exercise 9.2.3 Provide a data definition for the class of list of booleans. The class contains all arbitrarily large lists of booleans.