Naming

Names in computer systems are used to share resources, to uniquely identify entities, to refer to locations, and so on. An important issue with naming is that a name can be resolved to the entity it refers to. To resolve names, it is necessary to implement a naming system. In a distributed system, the implementation of a naming system is itself often distributed across multiple machines. Two things to consider for such a naming system are efficiency and scalability.

Contents for this section:
1. Some general issues with respect to naming
2. Organization and implementation of human-friendly names, for example DNS
3. Four basic approaches to locating a mobile entity

Naming Entities

• Names, Identifiers, and Addresses
– A name in a distributed system is a string of bits or characters that is used to refer to an entity.
– An entity can be practically anything: a process, printer, mailbox, web page, host, disk, and so on. An entity can be operated on.
– To operate on an entity, it is accessed at an access point. The name of an access point is called an address.
– An identifier is a name that has the following properties:
1. An identifier refers to at most one entity.
2. Each entity is referred to by at most one identifier.
3. An identifier always refers to the same entity (i.e., it is never reused).

Name space

• Names in a distributed system are organized into a name space. A name space can be represented as a labelled, directed graph with two types of nodes:
– A leaf node represents a named entity and has no outgoing edges.
– A directory node has a number of outgoing edges, each labelled with a name. A directory node stores a directory table in which each outgoing edge is represented as a pair (edge label, node identifier).
• Each path in a naming graph can be referred to by the sequence of labels corresponding to the edges in that path, such as N:<label-1, label-2, …, label-n>, where N is the first node of the path.
– If N is the root of the naming graph, the path is called an absolute path name. Otherwise, it is called a relative path name.
• Global name and local name: a global name denotes the same entity wherever it is used, whereas the interpretation of a local name depends on where it is used.

Name Spaces (1)

A general naming graph with a single root node.

Name Spaces (2)

The general organization of the UNIX file system implementation on a logical disk of contiguous disk blocks.

Name Resolution

• The process of looking up a name is called name resolution.

• To explain how name resolution works, consider a path name such as N:<label-1, label-2, …, label-n>. Resolution of this name starts at node N of the naming graph, where the name label-1 is looked up in the directory table, which returns the identifier of the node to which label-1 refers. Resolution then continues at that node with label-2, and so on, until label-n has been resolved and the content of the node it refers to is returned.

• Name resolution includes two further topics:
– Closure mechanism
– Linking and mounting
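
The resolution steps described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the node identifiers (n0, n1, ...), the labels, and the leaf contents are hypothetical, and each directory table is modelled as a plain dictionary of (edge label, node identifier) pairs.

```python
# Hypothetical naming graph. A directory node maps edge labels to node identifiers;
# a leaf node has no directory table and holds some content instead.
DIRECTORIES = {
    "n0": {"home": "n1", "keys": "n5"},
    "n1": {"steen": "n2"},
    "n2": {"mbox": "n3"},
}
LEAVES = {"n3": "mailbox of steen", "n5": "key data"}


def resolve(start, labels):
    """Resolve start:<label-1, ..., label-n> and return the content of the final node."""
    node = start
    for label in labels:
        table = DIRECTORIES.get(node)
        if table is None:
            raise LookupError(f"{node} is a leaf node, cannot look up {label!r}")
        if label not in table:
            raise LookupError(f"{label!r} is not in the directory table of {node}")
        node = table[label]  # identifier of the node this label refers to
    # A leaf returns its content; a directory node returns its directory table.
    return LEAVES.get(node, DIRECTORIES.get(node))


# Absolute path name (starts at the root n0) and relative path name (starts at n1).
print(resolve("n0", ["home", "steen", "mbox"]))
print(resolve("n1", ["steen", "mbox"]))
```

Both calls end at the same leaf node, which shows that an absolute and a relative path name can refer to the same entity.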

Closure Mechanism

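• Knowing how and where to start name resolution is generally referred to as a closure mechanism. Essentially, a closure mechanism deals with selecting the initial node in a name space from which name resolution is to start.

• Example: a string such as 00442078156340 is only useful once one knows that it is a telephone number and must be dialed on a telephone.

• Example: the HOME environment variable in UNIX, which refers to a user's home directory as a starting point for resolution.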
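
A closure mechanism can be illustrated with a small Python sketch that only decides where resolution starts. The function name `starting_node`, the "~" convention, and the example paths are assumptions made for this illustration; they mimic common UNIX shell behaviour rather than any particular implementation.

```python
def starting_node(path, env, cwd):
    """Return (initial node, remaining labels) for a UNIX-like path string.

    Assumed rules: an absolute path starts at the root, a path beginning with "~"
    starts at the directory named by HOME, and anything else starts at the
    current working directory.
    """
    if path.startswith("/"):
        return "/", [p for p in path.split("/") if p]
    if path == "~" or path.startswith("~/"):
        return env["HOME"], [p for p in path[2:].split("/") if p]
    return cwd, [p for p in path.split("/") if p]


env = {"HOME": "/home/steen"}  # hypothetical environment
print(starting_node("/etc/passwd", env, "/tmp"))
print(starting_node("~/mbox", env, "/tmp"))
print(starting_node("notes/a.txt", env, "/tmp"))
```

The actual lookup of the remaining labels is a separate step; the closure mechanism only supplies the node at which that lookup begins.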

Linking and Mounting

• Strongly related to name resolution is the use of aliases. An alias is another name for the same entity.
• Two approaches to implementing an alias:
– The first approach is simply to allow multiple absolute path names to refer to the same node in a naming graph (Fig 4.1). These are hard links.
– The second approach is to represent an entity by a leaf node, say N, but instead of storing the address or state of that entity, the node stores an absolute path name (Fig 4.3). For example, the path name /home/steen/keys, which refers to a node containing the absolute path name /keys, is a symbolic link to node n5.
• Mounting is one way to merge different name spaces.
• Mount point and mounting point:
– The directory node storing the node identifier of a directory node in a foreign name space is called a mount point.
– The directory node in the foreign name space is called a mounting point.
• Mounting a foreign name space in a distributed system requires at least the following information:
– The name of an access protocol.
– The name of the server.
– The name of the mounting point in the foreign name space.

Linking and Mounting (1)

• The concept of a symbolic link explained in a naming graph.
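
The difference between the two kinds of alias can be sketched as follows. The graph reuses the names /keys, /home/steen/keys, and n5 from the slides; the other node identifiers and the hard-link label keys-hard are hypothetical and exist only for this illustration.

```python
ROOT = "n0"
DIRECTORIES = {
    "n0": {"home": "n1", "keys": "n5"},
    "n1": {"steen": "n2"},
    # "keys" leads to n6, a leaf that stores a path name (symbolic link).
    # "keys-hard" is a hypothetical hard link: a second edge straight to n5.
    "n2": {"keys": "n6", "keys-hard": "n5"},
}
SYMLINKS = {"n6": ["keys"]}  # n6 stores the absolute path name /keys


def resolve(labels, node=ROOT, depth=0):
    """Return the identifier of the node a path refers to, following symbolic links."""
    if depth > 8:
        raise RecursionError("too many symbolic links")
    for label in labels:
        node = DIRECTORIES[node][label]
        if node in SYMLINKS:
            # The stored path name is absolute, so resolution restarts at the root.
            node = resolve(SYMLINKS[node], ROOT, depth + 1)
    return node


print(resolve(["keys"]))
print(resolve(["home", "steen", "keys"]))
print(resolve(["home", "steen", "keys-hard"]))
```

All three path names end at n5, but only the symbolic link needs a second resolution pass.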

Linking and Mounting (2)

• Mounting remote name spaces through a specific access protocol.
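
The three pieces of information needed to mount a foreign name space are often packed into a single URL-like string. The sketch below assumes that convention; the URL nfs://fileserver.example.org/home/steen and the names `MountInfo` and `parse_mount_url` are hypothetical, and only the standard library function `urllib.parse.urlsplit` is used.

```python
from typing import NamedTuple
from urllib.parse import urlsplit


class MountInfo(NamedTuple):
    protocol: str        # name of the access protocol
    server: str          # name of the server
    mounting_point: str  # name of the mounting point in the foreign name space


def parse_mount_url(url):
    """Split a URL-like mount specification into the three required parts."""
    parts = urlsplit(url)
    if not (parts.scheme and parts.hostname and parts.path):
        raise ValueError(f"incomplete mount specification: {url!r}")
    return MountInfo(parts.scheme, parts.hostname, parts.path)


print(parse_mount_url("nfs://fileserver.example.org/home/steen"))
```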

The implementation of a Name Space

• A name space forms the heart of a naming service, that is, a service that allows users and processes to add, remove, and look up names. A naming service is implemented by name servers.

• The contents of this part include:
– Name Space Distribution
– Implementation of Name Resolution

Name Space Distribution

• Why should name spaces be arranged hierarchically?
– To decrease the possibility of name conflicts, reduce the size of naming contexts, make name bindings more meaningful, make lookups more efficient, and enable federation of name servers.

Name Space Distribution

• Name spaces for a large-scale, possibly worldwide, distributed system are usually organized hierarchically. The name space is partitioned into three logical layers:
– The global layer is formed by the highest-level nodes. This layer is often characterized by its stability; the directory tables in this layer are rarely changed.
– The administrational layer is formed by directory nodes that together are managed within a single organization. A characteristic feature of the directory nodes in the administrational layer is that they represent groups of entities that belong to the same organization or administrational unit.
– The managerial layer consists of nodes that may typically change regularly. The nodes in this layer are maintained not only by system administrators, but also by individual end users of a distributed system.

Name Space Distribution

• The name space is divided into nonoverlapping parts, called zones in DNS. A zone is a part of the name space that is implemented by a separate name server.

• Name servers in each layer have to meet different requirements
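
As a rough illustration of nonoverlapping zones, the sketch below assigns each path name to the name server of the most specific zone that contains it. The zone boundaries and server names are hypothetical; real DNS delegation is configured per zone and may be cut differently.

```python
# Hypothetical zones, keyed by the path of the zone's top node.
ZONES = {
    (): "root-server",
    ("nl",): "nl-server",
    ("nl", "vu"): "vu-server",
    ("nl", "vu", "cs"): "cs-server",
}


def responsible_server(path):
    """Return the server of the most specific zone whose top node is a prefix of path."""
    for n in range(len(path), -1, -1):
        server = ZONES.get(tuple(path[:n]))
        if server is not None:
            return server


print(responsible_server(["nl", "vu", "cs", "ftp"]))
print(responsible_server(["nl", "vu", "ee"]))
```

Because the empty prefix always matches the root zone, every name is implemented by exactly one server in this model.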

Name Space Distribution (1)

• An example partitioning of the DNS name space, including Internet-accessible files, into three layers.

Name Space Distribution (2)

• A comparison between name servers for implementing nodes from a large-scale name space partitioned into a global layer, an administrational layer, and a managerial layer.

Item                              Global      Administrational   Managerial
Geographical scale of network     Worldwide   Organization       Department
Total number of nodes             Few         Many               Vast numbers
Responsiveness to lookups         Seconds     Milliseconds       Immediate
Update propagation                Lazy        Immediate          Immediate
Number of replicas                Many        None or few        None
Is client-side caching applied?   Yes         Yes                Sometimes

Implementation of Name Resolution

• Each client has access to a local name resolver, which is responsible for ensuring that the name resolution process is carried out.

• Assume the (absolute) path name root:<nl,vu,cs,ftp,pub,globe,index.txt> is to be resolved. In URL notation, this path name corresponds to ftp://ftp.cs.vu.nl/pub/globe/index.txt. There are two ways to implement name resolution:
– In iterative name resolution, the name resolver hands over the complete name to the root name server, which resolves the name as far as it can and returns the address of the next name server together with the unresolved remainder. The resolver then contacts that server itself, and so on.
– With recursive name resolution, a name server does not return each intermediate result to the resolver but passes it to the next name server it finds.

• The drawback of recursive name resolution is that it puts a higher performance demand on each name server. Its two important advantages are:
– caching of results is more effective than with iterative name resolution;
– communication costs may be reduced.

Implementation of Name Resolution (1)

• The principle of iterative name resolution.
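
Iterative resolution can be simulated with a toy set of name servers. The sketch below is an assumption-laden model: the server tables, the placeholder address 192.0.2.10, and the function names are hypothetical, and each server resolves exactly one label before referring the resolver to the next server.

```python
# Hypothetical directory tables: label -> ("server", next server) or ("address", value).
SERVERS = {
    "root": {"nl": ("server", "nl")},
    "nl": {"vu": ("server", "vu")},
    "vu": {"cs": ("server", "cs")},
    "cs": {"ftp": ("address", "192.0.2.10")},
}


def server_lookup(server, labels):
    """A server resolves the first label only and hands back the unresolved rest."""
    kind, value = SERVERS[server][labels[0]]
    return kind, value, labels[1:]


def iterative_resolve(labels):
    """The client's resolver contacts every server itself, one after another."""
    server, contacted = "root", []
    while labels:
        contacted.append(server)
        kind, value, labels = server_lookup(server, labels)
        if kind == "address":
            return value, contacted
        server = value  # referral: the resolver contacts this server next
    raise LookupError("name ended at a directory node")


print(iterative_resolve(["nl", "vu", "cs", "ftp"]))
```

The list of contacted servers makes the cost visible: every referral is a separate round trip between the client's resolver and a name server.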

Implementation of Name Resolution (2)

• The principle of recursive name resolution.
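
Recursive resolution, including the caching of intermediate results at each server, might be modelled like this. The tables, the placeholder address strings, and the assumption that the server for a node carries the node's label as its name are all hypothetical simplifications.

```python
# Hypothetical directory tables: label -> address of the node that label refers to.
TABLES = {
    "root": {"nl": "addr-of-nl-server"},
    "nl": {"vu": "addr-of-vu-server"},
    "vu": {"cs": "addr-of-cs-server"},
    "cs": {"ftp": "addr-of-ftp-host"},
}
CACHES = {name: {} for name in TABLES}  # per-server cache of received results


def recursive_resolve(server, labels):
    """Return {path: address} for every prefix of labels, as seen from this server."""
    head, rest = labels[0], labels[1:]
    results = {(head,): TABLES[server][head]}     # looks up the first label itself
    if rest:
        received = recursive_resolve(head, rest)  # passes the remainder to the child
        CACHES[server].update(received)           # receives and caches
        for path, address in received.items():
            results[(head,) + path] = address
    return results                                # returns to the requester


answer = recursive_resolve("root", ["nl", "vu", "cs", "ftp"])
print(answer[("nl", "vu", "cs", "ftp")])
print(sorted(CACHES["nl"]))
```

After one lookup, every server on the path has cached the results for the names below it, which is why caching is more effective here than with iterative resolution.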

Implementation of Name Resolution (3)

• Recursive name resolution of <nl, vu, cs, ftp>. Name servers cache intermediate results for subsequent lookups.

Server for node   Should resolve   Looks up   Passes to child   Receives and caches             Returns to requester
cs                <ftp>            #<ftp>     --                --                              #<ftp>
vu                <cs,ftp>         #<cs>      <ftp>             #<ftp>                          #<cs>, #<cs,ftp>
nl                <vu,cs,ftp>      #<vu>      <cs,ftp>          #<cs>, #<cs,ftp>                #<vu>, #<vu,cs>, #<vu,cs,ftp>
root              <nl,vu,cs,ftp>   #<nl>      <vu,cs,ftp>       #<vu>, #<vu,cs>, #<vu,cs,ftp>   #<nl>, #<nl,vu>, #<nl,vu,cs>, #<nl,vu,cs,ftp>

Example: The Domain Name System

• The DNS Name Space
– The DNS name space is hierarchically organized as a rooted tree. A label is a case-insensitive string made up of alphanumeric characters. A label has a maximum length of 63 characters; the length of a complete path name is restricted to 255 characters.
– The label attached to a node's incoming edge is also used as the name for that node. A subtree is called a domain; a path name to its root node is called a domain name.
– The contents of a node are formed by a collection of resource records.
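
The label rules above translate directly into a small validation routine. This is a minimal sketch that follows the slide's rules only: it treats labels as purely alphanumeric (real DNS host names also permit hyphens, which are not modelled here) and measures the 255-character limit on the dotted text form. The function name is hypothetical.

```python
def normalize_domain_name(name):
    """Validate a dotted domain name and return it in lower case."""
    if name.endswith("."):
        name = name[:-1]  # drop the trailing dot that denotes the root
    if len(name) > 255:
        raise ValueError("complete name exceeds 255 characters")
    labels = name.split(".")
    for label in labels:
        if not 1 <= len(label) <= 63:
            raise ValueError(f"label length must be 1..63: {label!r}")
        if not (label.isascii() and label.isalnum()):
            raise ValueError(f"label is not alphanumeric: {label!r}")
    # Labels are case-insensitive, so compare and store them in one case.
    return ".".join(label.lower() for label in labels)


print(normalize_domain_name("FTP.CS.VU.NL"))
```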

The DNS Name Space

The most important types of resource records forming the contents of nodes in the DNS name space.

Type of record   Associated entity   Description
SOA              Zone                Holds information on the represented zone
A                Host                Contains an IP address of the host this node represents
MX               Domain              Refers to a mail server to handle mail addressed to this node
SRV              Domain              Refers to a server handling a specific service
NS               Zone                Refers to a name server that implements the represented zone
CNAME            Node                Symbolic link with the primary name of the represented node
PTR              Host                Contains the canonical name of a host
HINFO            Host                Holds information on the host this node represents
TXT              Any kind            Contains any entity-specific information considered useful

DNS Implementation (1)

An excerpt from the DNS database for the zone cs.vu.nl.

LOCATING MOBILE ENTITIES

• Naming versus Locating Entities
– Example: ftp.cs.vu.nl
• Assuming the global and administrational layers are stable, a client can find the name server for the cs.vu.nl domain from its local cache. If the FTP server moves to another machine within the domain, only the name server for cs.vu.nl has to be updated, and lookups remain efficient.

• Suppose instead that the FTP server is moved to a machine named ftp.cs.unisa.edu.au, which lies in a completely different domain. There are basically two solutions:
– One solution is to record the address of the new machine in the DNS database for cs.vu.nl.
– An alternative solution is to record the name of the new machine, instead of its address, effectively turning ftp.cs.vu.nl into a symbolic link.

• The drawback of the first approach is that every later change of the new machine's address must also be applied to the cs.vu.nl database, so updates are no longer local. This violates the assumption that operations on nodes in the managerial layer are efficient.

• The drawback of using a symbolic link is that lookup operations become less efficient. In effect, each lookup is split into two steps:
– Find the name of the new machine.
– Look up the address associated with that name.
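
The two-step lookup caused by a symbolic link can be made concrete with a toy record table. The record layout and the address 192.0.2.20 are hypothetical; the sketch simply counts how many lookups are needed before an address is found.

```python
# Hypothetical records: name -> (record type, value).
RECORDS = {
    "ftp.cs.vu.nl": ("CNAME", "ftp.cs.unisa.edu.au"),  # symbolic link to the new name
    "ftp.cs.unisa.edu.au": ("A", "192.0.2.20"),        # address of the new machine
}


def lookup_address(name, max_links=8):
    """Follow symbolic links until an address is found; return (address, lookups)."""
    for steps in range(1, max_links + 2):
        rtype, value = RECORDS[name]
        if rtype == "A":
            return value, steps
        name = value  # step 1 found a name, so another lookup is needed
    raise LookupError("too many symbolic links")


print(lookup_address("ftp.cs.vu.nl"))
print(lookup_address("ftp.cs.unisa.edu.au"))
```

If the entity moves again and each move adds another link, the lookup count grows with the length of the chain.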

Naming versus Locating Entities

a) Direct, single-level mapping between names and addresses.
b) Two-level mapping using identifiers.

Simple Solutions

• Broadcasting and Multicasting (for LANs)
– Broadcasting: every host receives the request.
– Multicasting: only a restricted group of hosts receives the request.

Forwarding Pointers

• Principle: When an entity moves from A to B, it leaves behind a reference to its new location at B.

• Drawbacks of this approach:
– If no special measures are taken, a chain can become so long that locating an entity is prohibitively expensive.
– All intermediate locations in a chain have to maintain their part of the chain of forwarding pointers as long as needed.
– As soon as a forwarding pointer is lost for whatever reason, the entity can no longer be reached.

• Whenever an object moves from address space A to B, it leaves behind a proxy in its place in A and installs a skeleton that refers to it in B.

Forwarding Pointers (1)

• The principle of forwarding pointers using (proxy, skeleton) pairs.

Forwarding Pointers (2)

• Redirecting a forwarding pointer, by storing a shortcut in a proxy.
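
A chain of forwarding pointers, and the effect of shortcutting it, can be sketched as follows. This is a simplified model and not the (proxy, skeleton) mechanism itself: the `Location` class and the rule that every visited location is redirected to the final one are assumptions for illustration.

```python
class Location:
    def __init__(self, name):
        self.name = name
        self.forward = None      # forwarding pointer left behind after a move
        self.has_entity = False


def move(src, dst):
    """Move the entity from src to dst, leaving a forwarding pointer at src."""
    src.has_entity, dst.has_entity = False, True
    src.forward = dst


def locate(start):
    """Follow the chain from start; return (final location, pointers followed)."""
    visited, here = [], start
    while not here.has_entity:
        if here.forward is None:
            raise LookupError("chain is broken, entity cannot be reached")
        visited.append(here)
        here = here.forward
    for loc in visited:
        loc.forward = here       # shortcut: point straight at the current location
    return here, len(visited)


a, b, c, d = (Location(n) for n in "ABCD")
a.has_entity = True
move(a, b)
move(b, c)
move(c, d)
print(locate(a)[1])  # hops needed on the first lookup
print(locate(a)[1])  # hops needed after the shortcut was stored
```

The `LookupError` branch corresponds to the third drawback listed earlier: once a pointer is lost, the entity is unreachable from that point.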

Home-Based Approaches

• A popular approach to supporting mobile entities in large-scale networks is to introduce a home location, which keeps track of the current location of an entity. In practice, the home location is often chosen to be the place where an entity was created.

Home-Based Approaches

• The principle of Mobile IP.
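
The home-based idea can be reduced to a very small sketch: a home location that always knows where its entity currently is. The class and method names and the example addresses are hypothetical; "care-of address" is the Mobile IP term for the address a host uses while away from home.

```python
class HomeLocation:
    """Keeps track of the current location of one mobile entity."""

    def __init__(self, home_address):
        self.home_address = home_address
        self.care_of_address = None  # set while the entity is away from home

    def move_to(self, address):
        self.care_of_address = address  # the entity registers its new location

    def return_home(self):
        self.care_of_address = None

    def current_address(self):
        # Clients always contact the home location first and are pointed onward.
        return self.care_of_address or self.home_address


home = HomeLocation("192.0.2.1")
print(home.current_address())
home.move_to("198.51.100.7")
print(home.current_address())
```

Every lookup goes through the fixed home address, which keeps lookups simple but makes the home location a permanent point of contact even when the entity is far away.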

Hierarchical Approaches

• In a hierarchical scheme, a network is divided into a collection of domains. There is a single top-level domain that spans the entire network. Each domain can be subdivided into multiple, smaller subdomains. A lowest-level domain, called a leaf domain, typically corresponds to a local-area network in a computer network or a cell in a mobile telephone network.

• Multi-tiered lookup operations
• Multi-tiered update operations
• Multi-tiered delete operations

Hierarchical Approaches (1)

Hierarchical organization of a location service into domains, each having an associated directory node.

Hierarchical Approaches (2)

An example of storing information of an entity having two addresses in different leaf domains.

Hierarchical Approaches (3)

Looking up a location in a hierarchically organized location service.

Hierarchical Approaches (4)

An insert request is forwarded to the first node that knows about entity E.

A chain of forwarding pointers to the leaf node is created.
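
Insert and lookup in a hierarchical location service can be sketched with a small tree of directory nodes. The domain names, the `DirNode` class, and the rule that a lookup follows the first pointer it finds are assumptions for this illustration; delete operations are omitted.

```python
class DirNode:
    def __init__(self, name, parent=None):
        self.name, self.parent = name, parent
        self.addresses = {}  # entity -> address (only filled in at leaf domains)
        self.pointers = {}   # entity -> list of child nodes that know the entity


def insert(leaf, entity, address):
    """Store the address at the leaf, then build pointers upward until a node knows E."""
    leaf.addresses[entity] = address
    child, node = leaf, leaf.parent
    while node is not None:
        known = entity in node.pointers
        children = node.pointers.setdefault(entity, [])
        if child not in children:
            children.append(child)
        if known:
            break  # first node that already knows about the entity
        child, node = node, node.parent


def lookup(leaf, entity):
    """Go up until some node knows the entity, then follow pointers down to a leaf."""
    node = leaf
    while entity not in node.addresses and entity not in node.pointers:
        if node.parent is None:
            raise LookupError(f"{entity!r} is unknown")
        node = node.parent
    while entity not in node.addresses:
        node = node.pointers[entity][0]
    return node.addresses[entity]


root = DirNode("root")
eu, au = DirNode("eu", root), DirNode("au", root)
ams, syd, mel = DirNode("ams", eu), DirNode("syd", au), DirNode("mel", au)

insert(ams, "E", "addr-in-ams")
print(lookup(mel, "E"))  # must climb to the root and descend again
insert(syd, "E", "addr-in-syd")
print(lookup(mel, "E"))  # now resolved within the smaller domain au
```

The second lookup illustrates locality: once the entity has an address in a nearby leaf domain, the search no longer needs to reach the root.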

Pointer Caches (1)

Caching a reference to a directory node of the lowest-level domain in which an entity will reside most of the time.

Pointer Caches (2)

A cache entry that needs to be invalidated because it returns a nonlocal address, while such an address is available.

Scalability Issues

The scalability issues related to uniformly placing subnodes of a partitioned root node across the network covered by a location service.