Sorting Quickly

Hoare's quicksort algorithm is the classic example of generative recursion in computing. Like sort in the earlier section on sorting, quick-sort is a function that consumes a list of numbers and produces a version that contains the same numbers in ascending order. The difference between the two functions is that sort is based on structural recursion and quick-sort is based on generative recursion.

The underlying idea of the generative step is a time-honored strategy: divide and conquer. That is, we divide the non-trivial instances of the problem into two smaller, related problems, solve those smaller problems, and combine their solutions into a solution for the original problem. In the case of quick-sort, the intermediate goal is to divide the list of numbers into two lists: one that contains all the items that are strictly smaller than the first item, and another one with all those items that are strictly larger than the first item. Then the two smaller lists are sorted using the same procedure. Once the two lists are sorted, we simply juxtapose the pieces. Due to its special role, the first item on the list is often called the pivot item.
Figure: A tabular illustration of quick-sort


To develop a better understanding of the process, let's walk through one step of the evaluation by hand. Suppose the input is
(list 11 8 14 7)
The pivot item is 11. Partitioning the list into items smaller and larger than 11 produces two lists:
(list 8 7)
and
(list 14)
The second one is already sorted in ascending order; sorting the first one produces (list 7 8). This leaves us with three pieces from the original list:
  1. (list 7 8), the sorted version of the list with the smaller numbers;
  2. 11; and
  3. (list 14), the sorted version of the list with the larger numbers.
To produce a sorted version of the original list, we concatenate the three pieces, which yields the desired result: (list 7 8 11 14).

Our illustration leaves open how quick-sort knows when to stop. Since it is a function based on generative recursion, the general answer is that it stops when the sorting problem has become trivial. Clearly, empty is one trivial input for quick-sort, because the only sorted version of it is empty. For now, this answer suffices; we will return to this question in the next section.

The figure "A tabular illustration of quick-sort" provides a tabular overview of the entire sorting process for (list 11 8 14 7). Each box has three compartments. The top compartment shows the list that we wish to sort; the bottommost contains the result. The three columns in the middle display the sorting process for the two partitions and the pivot item.
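The concatenation step of the walk-through above can be tried directly. The following is a minimal sketch, assuming Racket with `#lang racket`; the names pivot, sorted-smaller, sorted-larger, and result are introduced here for illustration only, and the two sorted partitions are written out by hand rather than computed.

```scheme
#lang racket
;; One quick-sort step for (list 11 8 14 7), with the pieces written out by hand.
;; All names below are illustrative assumptions; none of them come from the text.
(define pivot 11)
(define sorted-smaller (list 7 8)) ; the sorted version of (list 8 7)
(define sorted-larger (list 14))   ; (list 14) is already sorted

;; append accepts any number of lists and joins them from left to right
(define result (append sorted-smaller (list pivot) sorted-larger))

result ; the list with the items 7 8 11 14
```

Wrapping the pivot in (list pivot) matters: append joins lists, so the single number must first be turned into a one-item list.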
Exercise 25.2.1 Simulate all quicksort steps for (list 11 9 2 18 12 14 4 1).
Now that we have a good understanding of the generative step, we can translate the process description into Scheme. The description suggests that quick-sort distinguishes two cases. If the input is empty, it produces empty. Otherwise, it performs a generative recursion. This case-split suggests a cond-expression:
;; quick-sort : (listof number) -> (listof number)
;; to create a list of numbers with the same numbers as
;; alon sorted in ascending order
(define (quick-sort alon)
  (cond
    [(empty? alon) empty]
    [else ...]))
The answer for the first case is given. For the second case, when quick-sort's input is non-empty, the algorithm uses the first item to partition the rest of the list into two sublists: a list with all items smaller than the pivot item and another one with those larger than the pivot item. Since the rest of the list is of unknown size, we leave the task of partitioning the list to two auxiliary functions: smaller-items and larger-items. They process the list and filter out those items that are smaller and larger, respectively, than the first one. Hence, each auxiliary function accepts two arguments, namely, a list of numbers and a number. Developing these functions is, of course, an exercise in structural recursion; their definitions are shown in the figure "The quick-sort algorithm".
;; quick-sort : (listof number) -> (listof number)
;; to create a list of numbers with the same numbers as
;; alon sorted in ascending order
(define (quick-sort alon)
  (cond
    [(empty? alon) empty]
    [else (append
            (quick-sort (smaller-items alon (first alon)))
            (list (first alon))
            (quick-sort (larger-items alon (first alon))))]))

;; larger-items : (listof number) number -> (listof number)
;; to create a list with all those numbers on alon
;; that are larger than threshold
(define (larger-items alon threshold)
  (cond
    [(empty? alon) empty]
    [else (if (> (first alon) threshold)
              (cons (first alon) (larger-items (rest alon) threshold))
              (larger-items (rest alon) threshold))]))

;; smaller-items : (listof number) number -> (listof number)
;; to create a list with all those numbers on alon
;; that are smaller than threshold
(define (smaller-items alon threshold)
  (cond
    [(empty? alon) empty]
    [else (if (< (first alon) threshold)
              (cons (first alon) (smaller-items (rest alon) threshold))
              (smaller-items (rest alon) threshold))]))

Figure: The quick-sort algorithm
Each sublist is sorted separately, using quick-sort. This implies the use of recursion and, more specifically, the following two expressions:
  1. (quick-sort (smaller-items alon (first alon))), which sorts the list of items smaller than the pivot; and
  2. (quick-sort (larger-items alon (first alon))), which sorts the list of items larger than the pivot.
Once we get the sorted versions of the two lists, we need a function that combines the two lists and the pivot item. Scheme's append function accomplishes this:
(append (quick-sort (smaller-items alon (first alon)))
        (list (first alon))
        (quick-sort (larger-items alon (first alon))))
Clearly, all items in list 1 are smaller than the pivot and the pivot is smaller than all items in list 2, so the result is a sorted list. The figure "The quick-sort algorithm" contains the full function. It includes the definition of quick-sort, smaller-items, and larger-items. Let's take a look at the beginning of a sample hand evaluation:
  (quick-sort (list 11 8 14 7))
= (append (quick-sort (list 8 7))
          (list 11)
          (quick-sort (list 14)))
= (append (append (quick-sort (list 7))
                  (list 8)
                  (quick-sort empty))
          (list 11)
          (quick-sort (list 14)))
= (append (append (append (quick-sort empty)
                          (list 7)
                          (quick-sort empty))
                  (list 8)
                  (quick-sort empty))
          (list 11)
          (quick-sort (list 14)))
= (append (append (append empty
                          (list 7)
                          empty)
                  (list 8)
                  empty)
          (list 11)
          (quick-sort (list 14)))
= (append (append (list 7)
                  (list 8)
                  empty)
          (list 11)
          (quick-sort (list 14)))
= ...
The calculation shows the essential steps of the sorting process, that is, the partitioning steps, the recursive sorting steps, and the concatenation of the three parts. From this calculation, we can see that quick-sort implements the process illustrated in the figure "A tabular illustration of quick-sort".
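The first partitioning step and the overall result of the hand evaluation can also be cross-checked by running the program. The sketch below assumes Racket with `#lang racket` and repeats the three definitions from the figure "The quick-sort algorithm" so that it runs on its own. The function sorted? is a hypothetical helper introduced here for the check; it is not part of the text's program.

```scheme
#lang racket
;; Definitions repeated from the figure "The quick-sort algorithm"
(define (quick-sort alon)
  (cond
    [(empty? alon) empty]
    [else (append
            (quick-sort (smaller-items alon (first alon)))
            (list (first alon))
            (quick-sort (larger-items alon (first alon))))]))

(define (larger-items alon threshold)
  (cond
    [(empty? alon) empty]
    [else (if (> (first alon) threshold)
              (cons (first alon) (larger-items (rest alon) threshold))
              (larger-items (rest alon) threshold))]))

(define (smaller-items alon threshold)
  (cond
    [(empty? alon) empty]
    [else (if (< (first alon) threshold)
              (cons (first alon) (smaller-items (rest alon) threshold))
              (smaller-items (rest alon) threshold))]))

;; sorted? : (listof number) -> boolean
;; hypothetical helper (not from the text): is alon in ascending order?
(define (sorted? alon)
  (cond
    [(empty? alon) true]
    [(empty? (rest alon)) true]
    [else (and (<= (first alon) (first (rest alon)))
               (sorted? (rest alon)))]))

;; Each of the following expressions should evaluate to true.
;; The first partitioning step of the hand evaluation:
(equal? (smaller-items (list 11 8 14 7) 11) (list 8 7))
(equal? (larger-items (list 11 8 14 7) 11) (list 14))
;; The final result and its order:
(equal? (quick-sort (list 11 8 14 7)) (list 7 8 11 14))
(sorted? (quick-sort (list 11 8 14 7)))
```

Because both auxiliary functions compare strictly, the pivot itself appears in neither partition, even though they receive the whole list alon rather than (rest alon).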
Exercise 25.2.2 Complete the above hand-evaluation. The hand-evaluation of (quick-sort (list 11 8 14 7)) suggests an additional trivial case for quick-sort. Every time quick-sort consumes a list of one item, it produces the very same list. After all, the sorted version of a list of one item is the list itself. Modify the definition of quick-sort to take advantage of this observation. Hand-evaluate the same example again. How many steps does the extended algorithm save?

Exercise 25.2.3 While quick-sort quickly reduces the size of the problem in many cases, it is inappropriately slow for small problems. Hence, people often use quick-sort to reduce the size of the problem and switch to a different sort function when the list is small enough. Develop a version of quick-sort that uses sort from the earlier section on sorting if the length of the input is below some threshold.

Exercise 25.2.4 If the input to quick-sort contains the same number several times, the algorithm returns a list that is strictly shorter than the input. Why? Fix the problem so that the output is as long as the input.

Exercise 25.2.5 Use the filter function to define smaller-items and larger-items as one-liners.

Exercise 25.2.6 Develop a variant of quick-sort that uses only one comparison function, say, <. Its partitioning step divides the given list alon into a list that contains the items of alon smaller than (first alon) and another one with those that are not smaller. Use local to combine the functions into a single function. Then abstract the new version to consume a list and a comparison function:
;; general-quick-sort : (X X -> bool) (list X) -> (list X)
(define (general-quick-sort a-predicate a-list) ...)