Lists in Lists

The World Wide Web, or just "the Web," has become the most interesting and most rapidly expanding part of the Internet, a global network of computers. Roughly speaking, the Web is a collection of Web pages. Each Web page is a sequence of words, pictures, movies, audio messages, and many more things. Most importantly, Web pages also contain links to other Web pages.

A Web browser enables people to view Web pages. It presents a Web page as a sequence of words, images, and so on. Some of the words on a page may be underlined. Clicking on underlined words leads to a new Web page. Most modern browsers also provide a Web page composer, a tool that helps people create collections of Web pages. A composer can, among other things, search for words or replace one word by another. In short, Web pages are things that we should be able to represent on computers, and there are many functions that process Web pages.

To simplify our problem, we consider only Web pages of words and nested Web pages. One way of understanding such a page is as a sequence of words and Web pages. This informal description suggests a natural representation of Web pages as lists whose items are symbols, which represent words, and Web pages, which represent nested Web pages. After all, we have emphasized before that a list may contain different kinds of things. Still, when we spell this idea out as a data definition, we get something rather unusual:
A Web page (WP) is either
  1. empty;
  2. (cons s wp)
    where s is a symbol and wp is a Web page; or
  3. (cons ewp wp)
    where both ewp and wp are Web pages.
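To make the three clauses concrete, the following sketch builds one page per clause with cons and checks that a quoted list denotes the same data. The names page-empty, page-symbol-first, and page-embedded-first are hypothetical, chosen only for this illustration, and the code assumes a Scheme such as DrScheme or Racket in which empty, cons, and equal? are available.

```scheme
;; clause 1: the empty page
(define page-empty empty)

;; clause 2: a symbol followed by a Web page
(define page-symbol-first (cons 'Hello page-empty))

;; clause 3: an immediately embedded Web page followed by a Web page
(define page-embedded-first
  (cons page-symbol-first (cons 'World empty)))

;; quoted lists abbreviate the same cons structure
(equal? page-symbol-first '(Hello))
(equal? page-embedded-first '((Hello) World))
```

Both comparisons should produce true, which suggests that the quoted notation is merely a shorthand for nested uses of cons.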
This data definition differs from that of a list of symbols in that it has three clauses instead of two and that it has three self-references instead of one. Of these self-references, the one at the beginning of a constructed list is the most unusual one. We refer to such Web pages as immediately embedded Web pages.

Because the data definition is unusual, we construct some examples of Web pages before we continue. Here is a plain page:
'(The TeachScheme! Project aims
  to improve students' problem-solving
  and organization skills. It provides
  software and lecture notes as well as
  exercises and solutions for teachers.)
It contains nothing but words. Here is a complex page:
'(The TeachScheme Web Page
  Here you can find:
  (LectureNotes for Teachers)
  (Guidance for (DrScheme: a Scheme programming environment))
  (Exercise Sets)
  (Solutions for Exercises)
  For further information, write to scheme@cs)
The immediately embedded pages start with parentheses and the symbols 'LectureNotes, 'Guidance, 'Exercise, and 'Solutions. The second embedded Web page contains another embedded page, which starts with the word 'DrScheme. We say this page is embedded with respect to the entire page.

Let's develop the function size, which consumes a Web page and produces the number of words that it and all of its embedded pages contain:
;; size : WP -> number
;; to count the number of symbols that occur in a-wp
(define (size a-wp) ...)
The two Web pages above suggest two good examples, but they are too complex. Here are three simpler examples, one per clause in the data definition:
  (size empty)
= 0

  (size (cons 'One empty))
= 1

  (size (cons (cons 'One empty) empty))
= 1
The first two examples are obvious. The third one deserves a short explanation. It is a Web page that contains one immediately embedded Web page, and nothing else. The embedded Web page is the one of the second example, and it contains the one and only symbol of the third example.

To develop the template for size, let's carefully step through the design recipe. The shape of the data definition suggests that we need three cond-clauses: one for the empty page, one for a page that starts with a symbol, and one for a page that starts with an embedded Web page. While the first condition is the familiar test for empty, the second and third need closer inspection because both clauses in the data definition use cons and a simple cons? won't distinguish between the two forms of data.

If the page is not empty, it is certainly constructed, and the distinguishing feature is the first item on the list. In other words, the second condition must use a predicate that tests the first item on a-wp:
;; size : WP -> number
;; to count the number of symbols that occur in a-wp
(define (size a-wp)
  (cond
    [(empty? a-wp) ...]
    [(symbol? (first a-wp)) ... (first a-wp) ... (size (rest a-wp)) ...]
    [else ... (size (first a-wp)) ... (size (rest a-wp)) ...]))
The rest of the template is as usual. The second and third cond-clauses contain selector expressions for the first item and the rest of the list. Because (rest a-wp) is always a Web page and because (first a-wp) is one in the third case, we also add recursive calls to size for these selector expressions.

Using the examples and the template, we are ready to design size: see the figure below. The differences between the definition and the template are minimal, which shows again how much of a function we can design by merely thinking systematically about the data definition for its inputs.
;; size : WP -> number
;; to count the number of symbols that occur in a-wp
(define (size a-wp)
  (cond
    [(empty? a-wp) 0]
    [(symbol? (first a-wp)) (+ 1 (size (rest a-wp)))]
    [else (+ (size (first a-wp)) (size (rest a-wp)))]))
Figure: The definition of size for Web pages
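As a hedged check of the definition in the figure, the sketch below applies size to the third small example and to a shortened variant of the complex page shown earlier. The name sample-page is hypothetical and its contents are abbreviated for illustration; the code assumes a Scheme such as DrScheme or Racket that provides empty, first, rest, and quoted lists.

```scheme
;; size : WP -> number
;; to count the number of symbols that occur in a-wp
(define (size a-wp)
  (cond
    [(empty? a-wp) 0]
    [(symbol? (first a-wp)) (+ 1 (size (rest a-wp)))]
    [else (+ (size (first a-wp)) (size (rest a-wp)))]))

;; hypothetical sample page: four symbols at the top level,
;; three in the first embedded page, and four in the second,
;; two of which sit one level deeper
(define sample-page
  '(The TeachScheme Web Page
    (LectureNotes for Teachers)
    (Guidance for (DrScheme environment))))

(size (cons (cons 'One empty) empty)) ; 1
(size sample-page)                    ; 4 + 3 + 4 = 11
```

Evaluating the first expression by hand follows the third cond-clause: it becomes (+ (size (cons 'One empty)) (size empty)), then (+ (+ 1 0) 0), which is 1.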

Exercise 14.3.1 Briefly explain how to define size using its template and the examples. Test size using the examples from above.

Exercise 14.3.2 Develop the function occurs1. The function consumes a Web page and a symbol. It produces the number of times the symbol occurs in the Web page, ignoring the nested Web pages. Develop the function occurs2. It is like occurs1, but counts all occurrences of the symbol, including in embedded Web pages.

Exercise 14.3.3 Develop the function replace. The function consumes two symbols, new and old, and a Web page, a-wp. It produces a page that is structurally identical to a-wp but with all occurrences of old replaced by new.

Exercise 14.3.4 People do not like deep Web trees because they require too many page switches to reach useful information. For that reason a Web page designer may also want to measure the depth of a page. A page containing only symbols has depth 0. A page with an immediately embedded page has the depth of the embedded page plus 1. If a page has several immediately embedded Web pages, its depth is the maximum of the depths of embedded Web pages plus 1. Develop depth, which consumes a Web page and computes its depth.