We have seen how situations involving deceptive idleness can arise for various reasons. However, the question remains as to how it actually affects the behaviour of different disk schedulers. In this chapter, we examine how various manifestations of deceptive idleness cause fundamentally different kinds of problems with disk schedulers. In all forthcoming examples, the scheduler is forced to multiplex between request streams from different processes, at every decision point.
Consider an operating system equipped with any seek reducing scheduler,
like Shortest positioning-time first (SPTF) or
C-LOOK [WGP94]. These generally improve
throughput by reordering available requests to minimize expensive head
repositioning operations. Imagine an application base consisting of two
processes
and
, each issuing sequentially positioned 64 KB
requests in different parts of the disk (as depicted in
figure 4.1).
![]() |
In one case, assume that both processes have initially issued all their
requests. At almost every decision point, there are suitable requests
pending for both processes. In the interest of seek reduction, the
scheduler then services several requests for
, moves the head across
the disk, and services several requests for
. This yields a
throughput of 21 MB/s on our disk.
Instead, suppose both process issue a single synchronous stream of disk
requests each, i.e. they issue a request only after the previous one has
completed. The first request from
is serviced, after which the
work-conserving scheduler does not wait for
to generate its
subsequent request. It assumes
to have become idle, seeks across the
disk to the pending request issued by
, and services that instead. A
few hundred microseconds after this happens, the next request from
arrives. It is now too late to service
's nearby followup request,
since disk scheduling is non-preemptible. The scheduler is thus forced
to alternate between servicing requests issued by the two processes,
thereby degenerating to FCFS and achieving a throughput of just 5 MB/s
(due to 9ms of average positioning time for every 3ms of data transfer
time for 64 KB).
If we had more than two processes issuing requests, then C-LOOK would behave like a round-robin scheduler, servicing just the one available request from each process before moving on to the next. On the other hand, SPTF would converge to the two closest requests on disk and alternate between them, starving the third process. Deceptive idleness does not always cause the scheduler to degenerate to a FCFS scheduler; the effect is often more subtle.
If the disk scheduler receives just one stream of sequentially positioned requests, then despite deceptive idleness, there would be no reason to reposition the head between requests. Any scheduler (even FCFS) would yield high throughput. However, with concurrent, synchronous request streams, we would experience large head seeks for every request serviced - regardless of the seek reduction policy. This is the essence of how deceptive idleness causes seek-reducing schedulers to yield bad performance.
Deceptive idleness can affect schedulers in ways other than just
degrading performance. Consider a proportional-share scheduler like
Stride [WW95] (with or without simultaneous seek
reduction), assigned two resource containers
and
and assigned
shares of 2:1. Assume disk-intensive applications are bound to these
containers. These applications would thus expect the service that a busy
request issuer deserves of the proportional-share scheduler. Assume for
purposes of discussion that all requests target random blocks and are
equally large, thus taking equal amounts of time to service.
![]() |
A typical implementation of this scheduler may service a few requests
for container
, and correspondingly, a larger number of requests for
container
. However, if processes in these containers collectively
maintain only one outstanding request at any time, then a
work-conserving scheduler becomes incapable of meeting the desired
proportions. Like before, it becomes forced to alternately service
requests from the two containers, achieving a service ratio closer to
1:1 (see figure 4.2).
We now examine a set of experiments that illustrate the deviation incurred on the stride scheduler (with two resource containers), corresponding to different sources of deceptive idleness.
Figure 4.3 depicts the desired
behaviour of the scheduler, without the presence of deceptive idleness.
Proportions of 2:1 are assigned to the two resource containers
and
, and each container is associated with
two processes. These two processes produce enough requests between them
to avoid any delusions of idleness. Consequently, we observe container
always receiving 66% of disk service. For example, after
60 seconds into the experiment, the two containers have received 40 and
20 seconds of service time on the disk respectively.
With only one process bound to each resource container, we experience explicit request dependencies as shown in figure 4.4. The number of active resource containers is almost always seen to be just 1, and we achieve close to 1:1 service allocated to the two containers. Application-level requirements of proportional allocation are not satisfied.
As we have seen earlier, sharing metadata gives rise to deceptive idleness. Interspersed with file I/O, this leads to an average of about 1.5 resource containers that seem busy to the scheduler at decision points. We thus experience similar proportions being achieved by the stride scheduler (figure 4.5).
As we have seen earlier, a heavily thrashing system can contract deceptive idleness due to one process needing memory that another process is trying to write out to swap. Figure 4.6 demonstrates the behaviour of a proportional-share scheduler under such circumstances - proportions are slightly violated on different occasions, and the stride scheduler tries to eventually compensate for such skew. However, the stride scheduler is expected to deliver accurate proportions on a fine timescale, so such temporary violation is not allowed. Many runs of such experiments produce markedly different graphs, due to unpredictable pageout daemon behaviour.
This class of problems, though not very pronounced in its effect, actually illustrates the much bigger problem of global resource management. How much CPU or memory resources should the system allocate to a particular process, in order to enable it to issue its intended disk requests? How much CPU or memory blockage should a process tolerate before being declared really idle? These and other questions demand a deeper analysis and greater solutions that we don't attempt in this thesis.
One tentative approach towards solution for memory contention causing deceptive idleness, is that of self-paging as implemented in the Nemesis [Han99] operating system. When each process needs memory, the system writes out pages to swap from its own address space. Thus, blockage of a process for pageout I/O would affect the issue of its own requests only, and the problem of deceptive idleness should at least partially get solved. We have not explored this approach any further in this thesis.
Some of the above situations may look as though the scheduler has degenerated into a FCFS scheduler, due to complete lack of choice at every decision point. However, this is not always true; when scheduling from more than two resource containers, the effect of deceptive idleness can depend on specific idiosyncrasies of the scheduling algorithm. We examine this behaviour for two interesting and fundamentally different kinds of proportional-share schedulers.
Assume all requests take approximately the same amount of time to
service. Consider three active resource containers
and
,
assigned proportions of 1:2:3 respectively. Deceptive idleness
prevents two requests from any container being serviced
consecutively. However, this restriction still allows the scheduler
to meet the desired proportions. The schedule
, etc.
(with any permutations of
and
's) suffices for this purpose.
Moreover, this schedule will actually be achieved by Stride, since
virtual clocks ``remember'' deserved service and compensate if
possible.
In contrast, consider proportions of 1:1:3 assigned to the
containers. The constraint imposed by deceptive idleness now becomes
too restrictive to meet these proportions. Requests are scheduled
from
etc. and the skewed proportions of 1:1:2 are
achieved instead.
Generalizing from these two examples, the Stride scheduler is capable of meeting proportions for a set of resource containers, if and only if the share on any one container does not exceed the sum of shares on the remaining containers. This condition is similar to the triangle inequality extended to polygons, where the length of a side never exceeds the sum of lengths of the remaining sides.
A lottery scheduler [WW94] is a randomized proportional-share scheduling algorithm that operates by selecting a request from all runnable resource containers, with probabilities proportional to their assigned shares. On a somewhat coarse timescale, it intends to regulate the relative resource consumption rates of disk-intensive processes according to the assigned proportions. Though more meaningful for CPU scheduling, we analyze the effect of deceptive idleness on a lottery disk scheduler.
Let proportions of
be assigned to resource containers
, with sum of shares
. Let the scheduler actually achieve proportions
of
, averaged over reasonable periods of time. Ideally we
would expect
, thus justifying correct
behaviour. This does happen when all applications provide enough
requests to the disk scheduler at the critical decision points.
Figure 4.7 depicts an
experiment wherein three resource containers are assigned
proportions of
1:2:3 (i.e.
), and
each has two processes bound to it, where each process issues
randomly positioned back-to-back requests. Allowing for vagaries
imposed by the inherent randomness, we observe
.
However, this stops being true if deceptive idleness is caused by
having only one process bound to each resource container. After each
request completes, explicit request dependencies in request
generation result in the scheduler falsely assuming that the
container that generated the request has momentarily become idle.
Consequently the scheduler achieves the distorted proportions of
1:1.6:1.8 (i.e.
) instead of
1:2:3, as shown in figure 4.8.
This strange behaviour deserves a mathematical explanation in the
general case. With probability
, a request from
gets
serviced at every decision point. At the next decision point, this
has no pending requests due to deceptive idleness, and is
therefore not chosen. So for
, the conditional probability
that a request from
gets chosen now is
. Summing up these conditional probabilities, we get
| (1) |
Thus, for assigned proportions of
1:2:3, we get
, which gives us
5:8:9 =
1:1.6:1.8, thus tallying quite well with the above experiment. To
show that achieved proportions can be very different from assigned
ones,
1:10:100 gives us
1:9.1:10. When we introduce a very large number of resource
containers,
grows large, so achieved proportions tend
to approach the corresponding assigned proportions. These have been
verified through direct experiment.
Thus the impact of deceptive idleness on the lottery scheduler diminishes when we have many resource containers at any time. But even when this is the case, it is common in practice to observe only a small number of resource containers in a system that are actually busy at any time. Secondly, it may seem possible to synthesize a set of assigned proportions for which the achieved proportions are just what one wants: however, this would typically not be possible, since the scheduler becomes sensitive to the number of busy resource containers at any time. Workarounds of this kind are thus difficult, if at all possible.
Practical deployment of a proportional-share disk scheduler would typically incorporate seek reduction as described in section 2.1.3. Deceptive idleness may allow such a scheduler to deliver the desired proportions, but may prevent the seek reducing aspect from working as desired. This is because requests for sequential blocks are generated often not at the level of resource containers, but by some entity more fine-grained, like a process or a thread. When this happens, binding multiple processes to each container may prevent deceptive idleness at the level of a resource container, but may not be enough to prevent deceptive idleness for each process. Proportional-share resource allocation typically cares about the former, and seek reduction, if any, is dependent on the latter.