Distributed Program Construction

Fall 1999

Lecture 8: Data Replication and Coherence

Reading:
 
 
 


COMP 413
        Sharing Replicated Data

  Problem: If objects (or data) are shared, we need to do something about concurrent accesses to guarantee state consistency.
 
 





Basic issue: If the shared object resides in one address space, controlling concurrency is easy:
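For instance, within one address space a single lock can serialize all accesses to the object. The following is a minimal sketch in Python, assuming threads play the role of the concurrent clients; the `SharedCounter` class and the `run_workers` driver are illustrative names, not part of the course material.

```python
import threading


class SharedCounter:
    """Shared object living in a single address space."""

    def __init__(self):
        self._value = 0
        self._lock = threading.Lock()  # one lock guards all state

    def increment(self):
        # The read-modify-write is atomic with respect to other lock holders.
        with self._lock:
            self._value += 1

    def value(self):
        with self._lock:
            return self._value


def run_workers(counter, n_threads=4, n_increments=1000):
    """Hypothetical driver: several threads update the same object."""
    def work():
        for _ in range(n_increments):
            counter.increment()

    threads = [threading.Thread(target=work) for _ in range(n_threads)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return counter.value()
```

Because every access goes through the same lock, no update is lost regardless of how the threads are scheduled.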


 
 


Problem: Keeping copies of shared objects (data) while allowing concurrent updates can lead to serious consistency problems


 



Performance and Scalability

Main issue: To keep replicas consistent, we need to ensure that all conflicting operations are done in the same order everywhere
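Why the order matters can be seen with two updates that do not commute. The sketch below is only an illustration: the operations `add_100` and `double` and the helper `apply_all` are made-up names, and each replica is modeled as a single integer.

```python
def apply_all(initial, operations):
    """Apply update operations to a replica's value, in the given order."""
    value = initial
    for op in operations:
        value = op(value)
    return value


def add_100(value):
    """Hypothetical update: add a fixed amount."""
    return value + 100


def double(value):
    """Hypothetical update: double the current value."""
    return value * 2


# Two replicas receive the same two updates in different orders.
replica_a = apply_all(1000, [add_100, double])  # (1000 + 100) * 2
replica_b = apply_all(1000, [double, add_100])  # 1000 * 2 + 100
```

Here `replica_a` ends at 2200 and `replica_b` at 2100: the replicas have diverged although both applied every update. Applying the updates in one agreed order at both replicas removes the divergence.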

Conflicting operations: From the world of transactions:

Guaranteeing a global ordering on conflicting operations is costly and reduces scalability

Solution: weaken consistency requirements so that, hopefully, global synchronization can be avoided
 
 


Strong consistency models: Operations on shared data are synchronized:

Weak consistency models: Synchronization occurs only when shared data is locked and unlocked:

Observation: The weaker the consistency model, the easier it is to build a scalable solution.



Strict Consistency

Any read to a shared data item X returns the value stored by the most recent write operation on X.

Observation: It doesn’t make sense to talk about “the most recent” in a distributed environment.


 

Note: Strict consistency is what you get in the normal sequential case, where your program does not interfere with any other program.



Sequential Consistency

The result of any execution is the same as if the operations of all processes were executed in some sequential order, and the operations of each individual process appear in this sequence in the order specified by its program.

Note: We’re talking about interleaved executions: there is some total ordering for all operations taken together.
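The definition can be checked by brute force for small examples: search for one interleaving that respects every process's program order and in which each read returns the most recently written value. The sketch below assumes operations are written as `(kind, variable, value)` tuples with kind `"W"` or `"R"`, and that unwritten variables read as `initial`; the function name and the encoding are illustrative choices.

```python
def is_sequentially_consistent(programs, initial=0):
    """Return True if some interleaving explains all observed reads.

    programs maps a process name to its list of operations in program
    order. Each operation is (kind, variable, value), kind "W" or "R".
    """
    procs = list(programs)

    def search(positions, store):
        if all(positions[p] == len(programs[p]) for p in procs):
            return True  # every operation has been placed
        for p in procs:
            i = positions[p]
            if i == len(programs[p]):
                continue
            kind, var, value = programs[p][i]
            # A read is only legal if it sees the latest write so far.
            if kind == "R" and store.get(var, initial) != value:
                continue
            new_store = dict(store)
            if kind == "W":
                new_store[var] = value
            new_positions = dict(positions)
            new_positions[p] = i + 1
            if search(new_positions, new_store):
                return True
        return False

    return search({p: 0 for p in procs}, {})
```

For example, if P1 writes x=1, P2 writes x=2, and two readers both see 2 and then 1, an interleaving exists. If one reader sees 2 then 1 while the other sees 1 then 2, no single total order can explain both, and the function returns False. The search is exponential and meant only for tiny histories.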


 

r + w >= t, where


Causal Consistency

Writes that are potentially causally related must be seen by all processes in the same order. Concurrent writes may be seen in a different order by different processes.
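One common way to tell causally related writes from concurrent ones is to tag each write with a vector clock. This is an assumption made for illustration; the notes do not prescribe a mechanism. In the sketch, a clock is a tuple with one counter per process, and the function names are hypothetical.

```python
def happened_before(clock_a, clock_b):
    """True if the write tagged clock_a causally precedes clock_b."""
    no_entry_larger = all(a <= b for a, b in zip(clock_a, clock_b))
    return no_entry_larger and clock_a != clock_b


def are_concurrent(clock_a, clock_b):
    """True if neither write causally precedes the other."""
    return (
        clock_a != clock_b
        and not happened_before(clock_a, clock_b)
        and not happened_before(clock_b, clock_a)
    )
```

With three processes, a write tagged `(1, 0, 0)` precedes one tagged `(1, 1, 0)`, so every process must see them in that order. Writes tagged `(1, 0, 0)` and `(0, 0, 1)` are concurrent, so different processes may see them in different orders.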


 


FIFO Consistency

Writes done by a single process are seen by all other processes in the order in which they were issued, but writes from different processes may be seen in a different order by different processes.
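A receiver can enforce the per-sender order with sequence numbers: each process numbers its own writes, and the receiver holds back a write until all earlier writes from the same sender have been applied. The following sketch assumes sequence numbers start at 0; the class `FifoReceiver` and its fields are illustrative names.

```python
class FifoReceiver:
    """Applies writes from each sender in the order they were issued."""

    def __init__(self):
        self.next_seq = {}  # sender -> next expected sequence number
        self.pending = {}   # sender -> {sequence number: write}
        self.applied = []   # (sender, write) pairs in application order

    def receive(self, sender, seq, write):
        """Buffer the write, then apply everything that is now in order."""
        self.pending.setdefault(sender, {})[seq] = write
        expected = self.next_seq.get(sender, 0)
        while expected in self.pending[sender]:
            self.applied.append((sender, self.pending[sender].pop(expected)))
            expected += 1
        self.next_seq[sender] = expected
```

No ordering is imposed across senders, which is exactly the freedom the model allows: two receivers may interleave the writes of P1 and P2 differently.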



Weak Consistency

Basic idea: You don’t care whether the individual reads and writes in a series of operations are immediately known to other processes. You just want the effect of the series as a whole to be known.

Observation: Weak consistency implies that we need to lock and unlock data (implicitly or not).
 
 



Release Consistency

Idea: Divide access to a synchronization variable into two parts: an acquire and a release phase. Acquire forces a requester to wait until the shared data can be accessed; release sends the requester’s local value to shared memory.
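The acquire and release phases can be sketched with a lock and a local working copy. This is a simplified single-machine model, assuming shared memory is a dictionary; the class `ReleaseConsistentStore` and its method names are hypothetical.

```python
import threading


class ReleaseConsistentStore:
    """Shared data guarded by one synchronization variable."""

    def __init__(self):
        self._shared = {}
        self._lock = threading.Lock()

    def acquire(self):
        """Wait until the data can be accessed, then hand out a local copy."""
        self._lock.acquire()
        return dict(self._shared)

    def release(self, local):
        """Send the requester's local values to shared memory, then unlock."""
        self._shared.update(local)
        self._lock.release()

    def snapshot(self):
        # Unsynchronized peek, used here only to show what others would see.
        return dict(self._shared)
```

Updates made to the local copy between `acquire` and `release` are invisible to others; only at `release` does the effect of the whole series become known.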


Note: Whereas release consistency affects all shared variables, entry consistency affects only those shared variables associated with a synchronization variable.

Question: What would be a convenient way of making entry consistency more or less transparent to programmers?
 
 



Client-Centric Coherence Models



Goal: Show how we can perhaps avoid system-wide consistency, by concentrating on what specific clients want, instead of what should be maintained by servers.

Background: Most large-scale distributed systems (e.g., databases) apply replication for scalability, but can support only weak consistency:




Consistency for Mobile Users

Example: Consider a distributed database to which you have access through your notebook. Assume your notebook acts as a front end to the database.
 

Note: The only thing you really want is that the entries you updated and/or read at A, are in B the way you left them in A. In that case, the database will appear to be consistent to you.



Basic Architecture


 
 
 



System Model




    Read Your Writes

    Definition: if read R follows write W in a session, and R is performed on DB(S,t), then W should have been in DB(S,t):

    Note: There is no guarantee that R returns the value written by W: there may have been writes from other clients at S between W and R.

    Example: Updating your Web page and guaranteeing that your Web browser shows the newest version instead of its cached copy.
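    The Read Your Writes definition can be sketched by letting the session remember its own writes and checking them against the server it wants to read from. The model below is an assumption for illustration: each write has an identifier, DB(S,t) is represented as the set of write identifiers applied at server S, and the `Server` and `Session` classes are hypothetical.

```python
class Server:
    """A replica, modeled as the set of write identifiers it has applied."""

    def __init__(self, name):
        self.name = name
        self.applied = set()


class Session:
    """Tracks the writes issued by one client."""

    def __init__(self):
        self.my_writes = set()

    def write(self, server, write_id):
        server.applied.add(write_id)
        self.my_writes.add(write_id)

    def can_read(self, server):
        """Read Your Writes holds if the server has all session writes."""
        return self.my_writes <= server.applied
```

    A read at a server that fails `can_read` would have to wait until the missing writes have been propagated, or be redirected to another server.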



    Monotonic Reads

    Intuitively: Ensure (within a session) that each read operation is made only at a server containing all the writes that were seen by previous reads in that session.

    Notation: A set WS of writes is complete for read R and DB(S,t) (Complete(WS,S,t,R)) if and only if:

    Notation: WS=RelevantWrites(S,t,R) iff:


    Example: Automatically reading your personal calendar updates from different servers. Monotonic Reads guarantees that the user sees all updates, no matter from which server the automatic reading takes place.

    Example: Reading (not modifying) incoming mail while you are on the move. Each time you connect to a different e-mail server, that server fetches (at least) all the updates from the server you previously visited.
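    Monotonic Reads can be sketched as a server-selection rule: the session records the writes its earlier reads have seen and only reads from a server that holds all of them. The sketch assumes servers are given as a dictionary from name to the set of applied write identifiers, and it conservatively treats every write at a server as relevant to the read; the function names are hypothetical.

```python
def pick_server(servers, seen_writes):
    """Return the first server holding every write seen so far, or None."""
    for name, applied in servers.items():
        if seen_writes <= applied:
            return name
    return None


def read_at(servers, name, seen_writes):
    """Read at the named server and return the enlarged set of seen writes."""
    return seen_writes | servers[name]
```

    In the mail example, a session that has read at a server holding `w1` and `w2` may not continue at a server that holds only `w1`; it must pick one that has caught up.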


    Writes Follow Reads

    Intuitively: If a read precedes a write (in a session), then that write is performed after all writes that preceded the read.

    Definition: if a read R precedes a write W, and R is performed at server S, then if W is performed at server S*, all relevant writes for R are also performed at S*, and before W:
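    The two parts of this definition (the relevant writes are present at S*, and they come before W) can be checked against a server's write log. The sketch assumes the log is a list of write identifiers in the order the server performed them; the function name is hypothetical.

```python
def satisfies_writes_follow_reads(write_log, relevant_writes, w):
    """Check that all relevant writes appear in the log before write w."""
    if w not in write_log:
        return False
    position = write_log.index(w)
    performed_before_w = set(write_log[:position])
    return set(relevant_writes) <= performed_before_w
```

    A log `["w1", "w2", "w3"]` satisfies the guarantee for `w3` with relevant writes `w1` and `w2`; a log `["w1", "w3", "w2"]` does not, because `w2` was performed after `w3`.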

    Note: We are imposing two conditions which need not always be relevant:


    Monotonic Writes


    Replica Coherence Protocols





    Algorithms for Sequential Consistency

    Observation: In the end we always want sequential consistency, whether or not it is implemented using weak consistency in combination with synchronization variables (locks).

    Observation: If we disregard whether all data is globally consistent (as is the case with release consistency), or only specific data is consistent (entry consistency), we need to distinguish three access methods.


     



    Read Remote, Write Remote


    Read Migrate, Write Migrate

    Read Local, Write Remote


    Read Local, Write Migrate



    Read Replicate, Write Replicate



    Replication Strategies (1/3)

    Observation: Choosing one of the basic algorithms for (sequential) consistency is only the first step. There are many more refinements to make!

    Model: We consider objects (and don’t worry whether they contain just data or code, or both)

    Distinguish different stores: A store is capable of hosting a replica of an object:



    Replication Strategies (2/3)



    Replication Strategies (3/3)

    Change distribution: What is distributed between the replicas:

    Store responsiveness: When does a replica take action when it notices inconsistency:

    Store reaction: What does a replica actually do:

    Example Consistency Protocol


     

    Observation: Basically, all stores in the Web apply the same replication strategy regardless of the content of the Web document (there are a few exceptions)

    Observation: Caching is becoming more and more problematic in the Web, as the number of documents grows exponentially.

    Again: It is the consistency requirements that determine the applicability of scaling techniques.

