Distributed Program Construction
Fall 1999
Lecture 7: Naming
Reading: Cheriton,
Ch7 (take this chapter with a grain of salt)
or, Chapter 9 in Coulouris et al.
COMP 413
Naming
Model (1/2)
Basics: We consider a name space consisting of
nodes that contain information on entities . Nodes can be named.
Context: A function that maps names to nodes
-
Bind: create a name for a node
in a context:
bind(context,name,node) ->
void
-
Resolve: determine the node
to which a name in a given context is mapped:
resolve(context,name) -> node
-
Contex node (directory): a node
containing a context
context(node) -> context
Each edge from node M to node N is
labeled with the name N has in the context stored in M.
COMP 413
Naming
Model (2/2)
Compound name : a nonempty sequence of names [name1,...,nameK]
Compound names are resolved recursively:
-
resolve(context,[name]) = resolve(context,name)
-
resolve(context,[name1,...,nameK]) = resolve(context(resolve(context,name1)),[name2,...,nameK])

COMP 413
Meaning of a Name
Pure name: A name that has no
meaning at all; it is just a "randomly" chosen string. Pure names can be
used for comparison only.
Opposite of pure name is a description:
List of properties that an object satisfies (most names are somewhere in
between)
Identifier: A name having
the following properties:
P1 Each identifier is bound
to at most one entity
P2 Each entity is named
by at most one identifier
P3 An identifier remains
bound to the same entity (prohibits reusing an identifier)
P4 An entity is never assigned
a new identifier (prohibits renaming)
Note: P3 <-> P4
An identifier need not necessarly
be a pure name, i.e., it may have content. However, that content is never
allowed to change, as this would violate
P4.
COMP 413
Observation: We can easily store all kinds of attributes
in a node, describing aspects of the entity the node represents:
-
Type of the entity
-
An identifier for that entity
-
Address of the entity?s location
-
Nicknames
-
...
Observation: Context nodes can also have attributes,
besides just storing a context
COMP 413
Problem: To resolve a name we
need a context. How do we actually find that (initial) context?
Closure mechanism: The mechanism
to select the implicit context from which to start name resolution:
-
cs.rice.edu: contact local DNS name
server
-
/home/druschel/myFile: start at local
filesystem's root directory
-
+1 713 527 4664: dial the number at
a telephone
Question: Why are closure mechanisms
always implicit ?
Observation: A closure mechanism
may also determine how name resolution should proceed
COMP 413
Name Linking (1/2)
Hard link: What we have described
so far as a name-to-entity binding
Soft link: Allow a node O
to
contain a name of another node:
-
First resolve O?s name (leading
to O)
-
Read the content of O, yielding
name
-
Name resolution continues with
name
Observations:
-
The name resolution process determines
that we read the content of a node, in particular, the name in the node
-
One way or the other, we know where
and how to start name resolution given name
COMP 413
COMP 413
Name Space Implementation
Basic issue: Distribute the name resolution process
as well as name space management across multiple machines, by distributing
nodes of the naming graph.
Consider a hierarchical naming graph and distinguish three
levels:
Global level: Consists of the high-level context
objects. Main aspect is that these context objects have to be jointly managed
by different administrations
Administrational level: Contains mid-level context
objects which can be grouped in such a way that each group can be assigned
to a separate administration.
Local level: Consists of low-level context objects
within a single administration. Main issue is effectively mapping context
objects to local name servers.
COMP 413
Iterative Name Resolution
-
resolve(context,[name2,...,nameK]) is sent to Server0
responsible for context
-
Server0 resolves resolve(context,name1) -> context1,
returning the identification (address) of Server1, which stores
context1
-
Client sends resolve(context1,[name2,...,nameK]) to
Server1
-
etc.
COMP 413
Recursive Name Resolution
-
resolve(context,[name2,...,nameK]) is sent to Server0
responsible for context
-
Server0 resolves resolve(context,name1) -> context1,
and sends resolve(context1,[name2,...,nameK]) to Server1,
which stores context1
-
Server0 waits for the result from Server1,
and returns it to the client
COMP 413
Merging Name Spaces (1/3)
Problem: We have different name
spaces that we wish to access from any given name space.
Solution 1: Introduce a naming
scheme by which pathnames of different name spaces are simply concatenated
(URLs).

COMP 413
Merging Name Spaces (2/3)
Solution 2: Introduce nodes that contain the name
of a node in a ?foreign? name space , along with the information how to
select the initial context in that foreign name space (Jade).

COMP 413
Merging Name Spaces (3/3)
Solution 3: Use only full pathnames, in which the
starting context is explicitly identified, and merge by adding a new root
node (DEC? s Global Name Space).

Note: In principle, you always have to start in
the new root
COMP 413
Size scalability: We need to
ensure that servers can handle a large number of requests per time unit
->
high-level servers are in big trouble.
Solution: Assume (at least
at global and administrative level) that content of nodes hardly ever changes.
In that case , we can apply extensive replication by mapping nodes to multiple
servers, and start name resolution at the nearest server.
Observation: An important
attribute of many nodes is the address where the represented entity can
be contacted. Replicating nodes makes large-scale traditional name servers
unsuited for locating mobile entitities.
COMP 413
Scalability Issues (2/2)
Geographical scalability: We need to ensure that
the name resolution process scales across large geographical distances.
Problem: By mapping nodes to servers that may,
in principle, be located anywhere, we introduce an implicit location dependency
in our naming scheme.
Solution: No general one available yet.
COMP 413
Domain
Name System (DNS)
Internet name service that translates "domain names" to
IP addresses. See Notes.
COMP 413
X.500
World-wide, attribute based directory service, originally
designed to supersede telephone directories (yellow and white pages).
-
committee design, ignores implementation issues
-
scalability properties still unknown
-
hierarchical namespace (similar to DNS)
Directory Information Base (DIB) entry contains:
-
domain name
-
set of attributes
Two types of lookups:
-
read(domain-name,{set of attributes}) -> {set of attribute
values}
returns the requested set of attribute values, corresponding
to the requested domain-name
Example: read(us.rice.cs.druschel,{office-phone#,email-addr})
->
{+1 713 527 4664, druschel@cs.rice.edu}
-
search(name, {filter-expression}) -> {set of domain-names}
returns the set of domain names in the directory tree
rooted at domain-name that
satisfy the filter-expression. The
filter-expression is a boolean expression that
specifies a search criterion: a logical combination of test of any of the
attributes in the directory entry.
Example: search(us.rice.cs, {office=DH3013} ->
{us.rice.cs.gradstudent.ssiyer, us.rice.cs.gradstudent.sameh}
COMP 413
WWW URLs
Merges namespaces:
-
protocol (defined by WWW)
-
DNS domain name
-
server filesystem namespace
COMP 413
Traditional Name Service Summary
Hierarchical Namespaces
-
useful for namespace organization, easy to ensure name uniqueness
-
can be exploited to guide lookup in distributed nameservers
-
server hierarchy resembles namespace hierarchy
-
case 1: namespace hierarchy reflects object proximity
-
(e.g., www.ai.mit.edu/pub/GNU/emacs)
-
(e.g., www.cs.rice.edu/courses/comp413)
-
can exploit lookup locality
-
natural distributed administration of name bindings
-
case 2: weak correlation between namespace hierarchy and
object location
-
(e.g., computers/software/free/GNU/emacs/)
-
(e.g., /education/comp-sci/dist-systems/courses/Rice-comp413)
-
lookup locality less likely (at least with human users)
-
complex administration of name bindings
Caching
-
critical for scalability (reduce non-local lookup frequency)
-
weak consistency
COMP 413
Naming & Locating Objects (1/2)
Location service: Solely aimed at providing the
addresses of the current locations of entities.
Assumption: Entities are mobile, so that their
current address may change frequently .
Naming service: Aimed at providing the content
of nodes in a name space, given a (compound) name. Content comprises different
(attribute,value) pairs.
Assumption: Node contents at global and administrative
level is relatively stable for scalability reasons.
Observation: If a traditional naming service is
used to locate entities, we also have to assume that node contents at the
local level is stable, as we can use only names as identifiers (think of
Web pages).
COMP 413
Naming & Locating Objects (2/2)
Problem: It is not realistic to assume stable node contents
down to the local naming level
Solution: Decouple naming from locating entities

Name: Any name in a traditional naming space
Object handle: An object identifier (P1, P3)
Contact address: An address providing all information
necessary to contact an object
Observation: An object? s name is now completely
independent of its location.
COMP 413
Simple Solutions for Locating Objects
Broadcasting: Simply broadcast the OID, requesting
the object to return its current address.
-
Can never scale beyond local-area networks
-
Requires all processes to listen to incoming location requests
Forwarding pointers: Each time an object moves, it
leaves behind a pointer telling where it has gone to.
-
Dereferencing can be made entirely transparent to clients
by simply following the chain of pointers
-
Update a client? s reference as soon as present location
has been found
-
Geographical scalability problems:
-
Long chains are not fault tolerant
-
Increased network latency at derefencing
-
Essential to have separate chain reduction mechanisms
COMP 413
Home-Based Approaches (1/2)
Single-tiered scheme: Let a home keep track of
where the object is:
-
An object? s home address is registered at a naming service
-
The home registers the foreign address of the object
-
Clients always contact the home first, and then continue
with the foreign location
COMP 413
Home-Based Appr oac hes (2/2)
Two-tiered scheme: Keep track of visiting objects
-
Check local visitor register first
-
Fall back to home location if local look-up fails
Problems with home-based approaches:
-
The home address has to be suppoted as long as the object
lives.
-
The home address is fixed, which means an unnecessary burden
when the object permanently moves another location
-
Poor geographical scalability (the object may be next to
the client)
COMP 413
Hierarchical Location Services (HLS)
Basic idea: Build a large-scale search tree for which
the under lying network is divided into hierarchical domains. Each domain
is represented by a separate directory node.
COMP 413
HLS: Tree Organization
-
The address of an object is stored in a leaf node, or in
an intermediate node
-
Intermediate nodes contain a pointer to a child if and only
if the subtree rooted at the child stores an address of the object
-
The root knows about all objects
COMP 413
HLS: Look-up Operation
Basic principles:
-
Start look-up at local leaf node
-
If node knows about the object, follow downward pointer,
otherwise go one level up
-
Upward look-up always stops at root
COMP 413
HLS: Insert Operation

COMP 413
HLS: Record Placement
Observation: If an object O moves regularly between
leaf domains D1 and D2, it may be more efficient to
store O? s contact record at the least common
ancestor LCA of D1 and D2.
-
Look-up operations from either D1 or D2 are
on aver age cheaper
-
Update operations (i.e ., changing the current address) can
be done directly at LCA
-
Note: assuming that O generally stays below LCA,
it does make sense to cache a pointer to LCA
COMP 413
HLS: Scalability Issues
Size scalability: Again, we have a problem of overloading
higher-level nodes:
-
Only solution is to partition a node into a number of subnodes
and evenly assign objects to subnodes
-
Naive partitioning may introduce a node management problem,
as a subnode may have to know how its parent and children are partitioned.
Geographical scalability: We have to ensure that look-up
operations generally proceed monotonically in
the direction of where we?ll find an address:
-
If object O generally resides in California, we should not
let a root subnode located in France store O? s contact record.
-
Unfortunately, subnode placement is not that easy, and only
few tentative solutions are known
COMP 413