Transparency in Distributed Operating Systems Vijay Akkineni

Operating Systems Generations

• Centralized Operating Systems
• Network Operating Systems
• Distributed Operating Systems
• Cooperative Autonomous Systems
• Cloud Computing

Partitioning of COS (Centralized Operating System)

Applications: Finance, Word Processing, Web Application

Subsystems: Programming Environment, Database Systems

Utilities: Compiler, Command Interpreter, Library

System Services: File System, Memory Manager, Scheduler

Kernel: CPU Multiplexing, Interrupt Handling, Device Drivers, Synchronization Primitives, Interprocess Communication

Network Operating System

• Peer-to-peer communications.
• Seven-layer OSI architecture.
• Examples: Remote Login, File Transfer, Messaging, Network Browsing, Remote Execution.

Distributed Operating System

• Loosely coupled systems.
• Sharing of resources and coordination of resources.
• Transparency: the key difference between NOS and DOS.
• Distributed resources and activities are to be managed and controlled.

Distributed Operating Systems

Cooperative Autonomous Systems

• Characterized by service integration.
• Middleware: CORBA, JMS, RMI.

Transparency

• Hide irrelevant system-dependent details from the users.
• Higher implementation complexities.
• Single system image.
• Minimal knowledge.

Location Transparency

The user has no awareness of object locations; objects are mapped and referred to by logical names.

Examples: Web Services (UDDI), federated services (SOA).
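Referring to objects by logical name can be sketched as a small name service. This is only an illustration, assuming a hypothetical in-memory `NameService` class; the logical name and the addresses are made up, and this is not the UDDI API.

```python
class NameService:
    """Hypothetical in-memory registry mapping logical names to locations."""

    def __init__(self):
        self._bindings = {}

    def bind(self, logical_name, location):
        # Register or update the physical location behind a logical name.
        self._bindings[logical_name] = location

    def resolve(self, logical_name):
        # Clients look up by logical name only; raises KeyError if unbound.
        return self._bindings[logical_name]


names = NameService()
names.bind("payroll-db", ("10.0.0.5", 5432))  # made-up address
location_before = names.resolve("payroll-db")

# The object moves; only the binding changes, the logical name does not.
names.bind("payroll-db", ("10.0.1.9", 5432))
location_after = names.resolve("payroll-db")
```

Client code keeps using the name "payroll-db" before and after the rebinding, which is the property location transparency asks for.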

Access Transparency

The ability to access local and remote system objects in a uniform way.

The physical separation of system objects is concealed from the user.

Example: accessing a file from the local file system and from a cloud drive.
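One way to picture uniform access is a single interface with a local and a remote implementation. The sketch below assumes hypothetical `Store`, `LocalStore`, and `FakeRemoteStore` classes; the remote store is an in-memory stand-in, where a real cloud drive client would issue network requests.

```python
import os
import tempfile
from abc import ABC, abstractmethod


class Store(ABC):
    """Hypothetical uniform interface for reading named objects."""

    @abstractmethod
    def read(self, name: str) -> bytes: ...


class LocalStore(Store):
    def __init__(self, root):
        self.root = root

    def read(self, name):
        # Reads from the local file system.
        with open(os.path.join(self.root, name), "rb") as f:
            return f.read()


class FakeRemoteStore(Store):
    """In-memory stand-in for a cloud drive (assumption for illustration)."""

    def __init__(self, objects):
        self._objects = dict(objects)

    def read(self, name):
        return self._objects[name]


def first_line(store: Store, name: str) -> bytes:
    # Caller code is identical for local and remote objects.
    return store.read(name).splitlines()[0]


root = tempfile.mkdtemp()
with open(os.path.join(root, "notes.txt"), "wb") as f:
    f.write(b"local copy\nmore\n")

local_line = first_line(LocalStore(root), "notes.txt")
remote_line = first_line(FakeRemoteStore({"notes.txt": b"remote copy\nmore\n"}), "notes.txt")
```

The function `first_line` never learns which kind of store it was given, so the physical separation stays concealed from the caller.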

Migration Transparency

Logical resources and physical processes may be migrated by the system from one location to another in an attempt to maximize efficiency, reliability, availability, or security. Such migration should be controlled automatically by the system.

Example: application servers using JNDI.

Replication Transparency

Multiple instances of files and data should exhibit consistency.

System elements are copied to remote points in the system in an effort to increase efficiency through better proximity or to provide increased reliability through duplication.

Examples: Google's Bigtable, HDFS.
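Keeping copies consistent while hiding them from the caller can be sketched with a write-all, read-any store. This is a deliberately simple scheme chosen for illustration, with a hypothetical `ReplicatedStore` class; it is not how Bigtable or HDFS implement replication.

```python
import random


class ReplicatedStore:
    """Hypothetical store that hides its replicas behind put/get."""

    def __init__(self, replica_count=3):
        # Each dict stands in for a copy held on a different node.
        self._replicas = [{} for _ in range(replica_count)]

    def put(self, key, value):
        # Write-all: every replica receives the update, keeping copies consistent.
        for replica in self._replicas:
            replica[key] = value

    def get(self, key):
        # Read-any: the caller cannot tell which copy served the read.
        return random.choice(self._replicas)[key]


store = ReplicatedStore()
store.put("block-1", b"data")
reads = {store.get("block-1") for _ in range(10)}
```

Because every write reaches all copies, repeated reads return the same value no matter which replica is picked.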

Concurrency Transparency

• Sharing of objects without interference.
• Similar to time sharing.
• An important challenge when designing distributed systems is how to deal with concurrent accesses.

Example: an important design goal for distributed databases is transactional integrity and the ACID properties while multiple transactions happen concurrently.
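Sharing without interference can be shown on a single machine with a lock-protected object. The sketch assumes a hypothetical `Account` class and uses threads in place of distributed clients; a distributed database would rely on transactions rather than an in-process lock.

```python
import threading


class Account:
    """Hypothetical shared object whose updates are serialized by a lock."""

    def __init__(self, balance=0):
        self.balance = balance
        self._lock = threading.Lock()

    def deposit(self, amount):
        # The lock makes the read-modify-write step atomic, so concurrent
        # deposits cannot overwrite each other.
        with self._lock:
            self.balance += amount


account = Account()


def worker():
    for _ in range(1000):
        account.deposit(1)


threads = [threading.Thread(target=worker) for _ in range(8)]
for t in threads:
    t.start()
for t in threads:
    t.join()
```

Each caller simply invokes `deposit`; the coordination needed to keep the balance correct is hidden inside the object.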

Failure Transparency

Failure transparency tries to mask failures so that they are not seen or noticed by the users.

It is difficult to distinguish between a resource that has failed and a resource that is performing badly (slowly).

Consider opening a webpage: is it dead or painfully slow, and how long should the browser wait?

Examples: MapReduce frameworks, DFS replication on data nodes.
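Masking a failed replica can be sketched as a read that fails over to the next copy. The names `ReplicaDown`, `read_with_failover`, `dead_replica`, and `healthy_replica` are hypothetical, and the sketch does not model timeouts, which a real system would need in order to handle the slow-versus-dead ambiguity described above.

```python
class ReplicaDown(Exception):
    """Raised by a replica that cannot serve the request (hypothetical)."""


def read_with_failover(replicas, key):
    # Try each replica in turn; the caller only sees an error if all fail.
    last_error = None
    for replica in replicas:
        try:
            return replica(key)
        except ReplicaDown as err:
            last_error = err
    raise ReplicaDown(f"all replicas failed for {key!r}") from last_error


def dead_replica(key):
    # Simulates a node that is unreachable.
    raise ReplicaDown("node unreachable")


def healthy_replica(key):
    # Simulates a node that answers normally.
    return f"value-of-{key}"


result = read_with_failover([dead_replica, healthy_replica], "block-7")
```

The caller receives a value even though the first replica failed, so the failure is not visible at the call site.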

Performance Transparency

Attempt to achieve a consistent and predictable performance level even with changes to system structure or load distribution.

When parts of the system experience significant delay or load imbalance, the system is responsible for the automatic, rapid, and accurate detection and orchestration of a remedy.

Examples: load balancing, speculative execution in MapReduce.
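Load balancing, the first example above, can be sketched as a dispatcher that always picks the least-loaded node. The `LeastLoadedBalancer` class and node names are hypothetical, and load is tracked as a simple counter rather than measured from real machines.

```python
class LeastLoadedBalancer:
    """Hypothetical dispatcher that sends each task to the least-loaded node."""

    def __init__(self, nodes):
        self.load = {node: 0 for node in nodes}

    def assign(self, cost=1):
        # Pick the node with the smallest recorded load; ties go to the
        # node that was registered first.
        node = min(self.load, key=self.load.get)
        self.load[node] += cost
        return node


balancer = LeastLoadedBalancer(["node-a", "node-b", "node-c"])
assignments = [balancer.assign() for _ in range(6)]
```

With six equal-cost tasks and three nodes, the load ends up evenly spread, and the submitter never chooses a node itself.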

Size/Scale Transparency

A system's geographic reach, number of nodes, level of node capability, or any changes therein should exist without any required user knowledge or interaction.

Research area: there is currently a lot of ongoing research on running MapReduce jobs across data centers and on partitioning compute jobs based on geographical locality.

Revision Transparency

Systems occasionally need system-software version changes and changes to the internal implementation of system infrastructure.

Example: Linux kernel upgrades, which should not affect the existing software applications on the OS.

Parallelism Transparency

The most difficult aspect of transparency, the "Holy Grail" of distributed system designers.

Parallel execution of processes throughout the system should occur without any required user knowledge.

Examples: parallel algorithms on multicore processors and MapReduce tasks on multiple systems.
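A MapReduce-style word count gives a feel for hiding parallel execution behind an ordinary function call. The sketch assumes hypothetical `count_words` and `parallel_word_count` functions and uses a thread pool from the Python standard library as a stand-in for multiple cores or machines.

```python
from collections import Counter
from concurrent.futures import ThreadPoolExecutor


def count_words(chunk):
    # Map step: count the words in one chunk of text.
    return Counter(chunk.split())


def parallel_word_count(chunks, workers=4):
    # The pool hands chunks to workers; the caller never manages them.
    with ThreadPoolExecutor(max_workers=workers) as pool:
        partials = pool.map(count_words, chunks)
        # Reduce step: merge the per-chunk counts into one total.
        total = Counter()
        for partial in partials:
            total += partial
    return total


chunks = ["a b a", "b c", "a c c"]
totals = parallel_word_count(chunks)
```

From the caller's side, `parallel_word_count` looks like a sequential function; how the work is split and scheduled stays internal.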

Transparency Summarized

Access: Hide differences in data representation and how a resource is accessed

Location: Hide where a resource is located

Migration: Hide that a resource may move to another location

Relocation: Hide that a resource may be moved to another location while in use

Replication: Hide that a resource is replicated

Concurrency: Hide that a resource may be shared by several competitive users

Failure: Hide the failure and recovery of a resource

Persistence: Hide whether a (software) resource is in memory or on disk

Major Research Areas

Areas and the transparencies they support:

Interaction and Control Transparency: Communication Networks, Synchronization, Distributed Algorithms

Performance Transparency: Process Scheduling, Deadlock Handling, Load Balancing

Resource Transparency: Resource Scheduling, File Sharing, Concurrency Control

Failure Transparency: Failure Handling, Configuration, Redundancy

Towards Cloud OS

• The Cloud OS must permit autonomous management of its resources on behalf of its users and applications.

• Cloud OS operations must continue despite loss of nodes, entire clusters, and network partitioning.

• The Cloud OS must be operating system and architecture agnostic.

• The Cloud must support multiple types of applications, including legacy.

• The Cloud OS management system must be decentralized and scalable, have little overhead per user and per machine, and be cost effective.

Logical Model of Cloud OS

Cloud Middleware

The heterogeneous nature of the Cloud is hindering adoption of cloud technologies.

IBM is researching Altocumulus middleware, which offers a uniform API for using Amazon EC2, Eucalyptus, Google AppEngine, and IBM HiPODS Cloud, aiming to provide an API which is cloud agnostic.

http://www.almaden.ibm.com/asr/projects/cloud/

References

"HP performance-optimized datacenter (POD)." Data Sheet, 2008.

"Amazon EC2." [Online] http://aws.amazon.com/ec2.

Apache Hadoop MapReduce.
