Designing Functions for Self-Referential Data Definitions

At first glance, self-referential data definitions seem to be far more complex than those for compound or mixed data. But, as the example in the preceding subsection shows, our design recipes still work. Nevertheless, in this section we discuss a new design recipe that works better for self-referential data definitions. As implied by the preceding section, the new recipe generalizes those for compound and mixed data. The new parts concern the process of discovering when a self-referential data definition is needed, deriving a template, and defining the function body:
Data Analysis & Design:
If a problem statement discusses compound information of arbitrary size, we need a recursive or self-referential data definition. At this point, we have only seen one such class, list-of-symbols, but it is easy to imagine other, similar classes of lists. We will get to know many other examples in this and the following part. For a recursive data definition to be valid, it must satisfy two conditions. First, it must contain at least two clauses. Second, at least one of the clauses must not refer back to the definition. It is good practice to identify the self-references explicitly with arrows from the references in the data definition back to its beginning. The running example for this section is the class of functions that consume lists of symbols:

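[Figure: the data definition of list-of-symbols, with an arrow from its self-reference back to the beginning of the definition]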
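A textual sketch of such a data definition may help. It assumes the usual two-clause form for lists of symbols and marks the self-reference in a comment instead of with an arrow. The example names (no-symbols, one-symbol, three-symbols) are hypothetical and serve only as illustrations.

```scheme
;; A list-of-symbols is either
;;   1. the empty list, empty, or                 (no self-reference)
;;   2. (cons s los), where s is a symbol and
;;      los is a list-of-symbols.                 (self-reference)
;;
;; The definition has two clauses, and the first clause does not
;; refer back to the definition, so both validity conditions hold.

;; Example data; the names are hypothetical:
(define no-symbols empty)
(define one-symbol (cons 'a empty))
(define three-symbols (cons 'a (cons 'b (cons 'c empty))))
```

Each example is built by applying the second clause zero or more times and finishing with the first clause.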

Template:
A self-referential data definition specifies a mixed class of data, and one of the clauses should specify a subclass of compound data. Hence the design of the template can proceed according to the earlier recipes for compound data and for mixed data. Specifically, we formulate a cond-expression with as many cond-clauses as there are clauses in the data definition, match each recognizing condition to the corresponding clause in the data definition, and write down appropriate selector expressions in all cond-lines that process compound values. In addition, we inspect each selector expression. For each one that extracts a value of the same class of data as the input, we draw an arrow back to the function parameter. At the end, we must have as many arrows as we have in the data definition. Let's return to the running example. The template for a list-processing function contains a cond-expression with two clauses and one arrow:

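[Figure: the template for a list-processing function, with one arrow from the selector expression that extracts the rest of the list back to the function parameter]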

For simplicity, this book will use a textual alternative to arrows. Instead of drawing an arrow, the templates contain self-applications of the function to the selector expression(s):

(define (fun-for-los a-list-of-symbols)
  (cond
    [(empty? a-list-of-symbols) ...]
    [else ... (first a-list-of-symbols) ...
      ... (fun-for-los (rest a-list-of-symbols)) ...]))
We refer to these self-applications as NATURAL RECURSIONS.
Body:
For the design of the body we start with those cond-lines that do not contain natural recursions. They are called BASE CASES. The corresponding answers are typically easy to formulate or are already given by the examples. Then we deal with the self-referential cases. We start by reminding ourselves what each of the expressions in the template line computes. For the recursive application we assume that the function already works as specified in our purpose statement. The rest is then a matter of combining the various values. Suppose we wish to define the function how-many, which determines how many symbols are on a list of symbols. Assuming we have followed the design recipe, we have the following:
;; how-many : list-of-symbols -> number
;; to determine how many symbols are on a-list-of-symbols
(define (how-many a-list-of-symbols)
  (cond
    [(empty? a-list-of-symbols) ...]
    [else ... (first a-list-of-symbols) ...
      ... (how-many (rest a-list-of-symbols)) ...]))
The answer for the base case is 0 because the empty list contains nothing. The two expressions in the second clause compute the first item and the number of symbols on (rest a-list-of-symbols). To compute how many symbols there are on all of a-list-of-symbols, we just need to add 1 to the value of the latter expression:
(define (how-many a-list-of-symbols)
  (cond
    [(empty? a-list-of-symbols) 0]
    [else (+ (how-many (rest a-list-of-symbols)) 1)]))
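To see how the base case and the natural recursion cooperate, it can help to evaluate a small example by hand. The following sketch repeats the definition of how-many only so that the block runs on its own. The evaluation steps in the comments are an illustration and not part of the recipe.

```scheme
;; how-many : list-of-symbols -> number
;; to determine how many symbols are on a-list-of-symbols
;; (repeated from above so that this block is self-contained)
(define (how-many a-list-of-symbols)
  (cond
    [(empty? a-list-of-symbols) 0]
    [else (+ (how-many (rest a-list-of-symbols)) 1)]))

;; Hand-evaluation of (how-many (cons 'a (cons 'b empty))):
;;   (+ (how-many (cons 'b empty)) 1)
;; = (+ (+ (how-many empty) 1) 1)
;; = (+ (+ 0 1) 1)
;; = 2

(how-many empty)                      ; base case: 0
(how-many (cons 'a (cons 'b empty)))  ; two natural recursions: 2
```

Each natural recursion consumes a shorter list, so the evaluation eventually reaches the base case.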
Combining Values:
In many cases, the combination step can be expressed with Scheme's primitives, for example, +, cons, or the and form. If the problem statement suggests that we ask questions about the first item, we may need a nested cond-expression. Finally, in some cases, we may have to define auxiliary functions.
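The three sketches below illustrate these ways of combining values on lists of symbols. The function names only?, contains?, and replace-all are hypothetical and chosen for this illustration. The code assumes a dialect that provides symbol=?, true, and false, as the teaching languages do.

```scheme
;; only? : symbol list-of-symbols -> boolean
;; to determine whether every item on a-list-of-symbols is s
;; (the combination step uses and)
(define (only? s a-list-of-symbols)
  (cond
    [(empty? a-list-of-symbols) true]
    [else (and (symbol=? (first a-list-of-symbols) s)
               (only? s (rest a-list-of-symbols)))]))

;; contains? : symbol list-of-symbols -> boolean
;; to determine whether s occurs on a-list-of-symbols
;; (a nested cond asks a question about the first item)
(define (contains? s a-list-of-symbols)
  (cond
    [(empty? a-list-of-symbols) false]
    [else (cond
            [(symbol=? (first a-list-of-symbols) s) true]
            [else (contains? s (rest a-list-of-symbols))])]))

;; replace-all : symbol symbol list-of-symbols -> list-of-symbols
;; to produce a list like a-list-of-symbols with every old replaced by new
;; (the combination step uses cons)
(define (replace-all old new a-list-of-symbols)
  (cond
    [(empty? a-list-of-symbols) empty]
    [else (cons (cond
                  [(symbol=? (first a-list-of-symbols) old) new]
                  [else (first a-list-of-symbols)])
                (replace-all old new (rest a-list-of-symbols)))]))
```

All three functions instantiate the same template. They differ only in the base-case answer and in how the first item is combined with the result of the natural recursion.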
The figure below summarizes this discussion in the usual format; those design steps that we didn't discuss are performed as before. The following section discusses several examples in detail.

Figure: The revised design recipe for recursive data

(Refines the design recipes given in three earlier figures.)