
Dependable Distributed Applications

Dependable Systems 2014

Lena Herscheid, Dr. Peter Tröger


Frameworks + Programming Models

Hanmer, Robert. Patterns for Fault Tolerant Software. John Wiley & Sons, 2013.

Introduction to Fault Tolerant CORBA. http://cnb.ociweb.com/cnb/CORBANewsBrief-200301.html

Erlang/OTP http://www.erlang.org/doc/


FT-CORBA

• Extension of the CORBA standard with commonly used fault tolerance patterns

• Fault model: node crash faults

• Replication
  • Object level: ReplicationManager + ReplicaFactory
  • Logical singletons: a group of replicas (object group) appears as a single object
  • warm / cold passive → high recovery time
  • active / active_with_voting → high multicast time

• Fault detection
  • FaultDetector + FaultNotifier
  • Are assumed to be inherently fault tolerant

• Failure recovery
  • Apply the log of updates to the replica, depending on the replica type (see the sketch after this list)

• Implementations
  • Replication in the ORB: Electra, TAO, Orbix+Isis, …
  • Replication through CORBA objects: DOORS, AQuA, OGS, …
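The recovery step above can be pictured as a primary that logs updates and a backup that replays that log when it takes over (warm/cold passive). A minimal Python sketch of the idea; the class and method names are illustrative, not part of the FT-CORBA interfaces:

```python
class Replica:
    """Toy passive replica: applies logged updates to its state."""
    def __init__(self):
        self.state = {}

    def apply(self, update):
        key, value = update
        self.state[key] = value


class Primary(Replica):
    """The primary executes updates and keeps a log for recovery."""
    def __init__(self):
        super().__init__()
        self.log = []

    def execute(self, update):
        self.apply(update)
        self.log.append(update)   # shipped to backups periodically (warm passive)


def recover(backup, log):
    """On primary failure, replay the update log on the promoted backup."""
    for update in log:
        backup.apply(update)
    return backup


# Usage sketch
primary = Primary()
primary.execute(("x", 1))
primary.execute(("y", 2))
backup = recover(Replica(), primary.log)
assert backup.state == {"x": 1, "y": 2}
```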


Erlang/OTP

• Erlang programming language: fault tolerance as a design principle
  • Isolated, lightweight processes (managed by the VM)

• Programming model: asynchronous message passing

• “Let it crash” policy
  • Processes terminate with error codes

• Monitoring processes are expected to do recovery

• Transparent distribution of processes (by VM)

• Open Telecom Platform (OTP) framework
  • Common patterns in concurrent, distributed Erlang programs

• Modules can implement behaviours (gen_server, gen_fsm, supervisor, …)


Erlang/OTP

[Figure: example supervision tree; one_for_one restart strategy]
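The one_for_one strategy restarts only the child that crashed and leaves its siblings running. A rough Python model of that behaviour (not Erlang; the Supervisor class and the restart limit are assumptions for illustration):

```python
import traceback

class Supervisor:
    """Toy one_for_one supervisor: restart only the child that failed."""
    def __init__(self, child_specs, max_restarts=3):
        self.child_specs = child_specs      # name -> zero-argument callable
        self.max_restarts = max_restarts
        self.restart_counts = {name: 0 for name in child_specs}

    def run_child(self, name):
        try:
            self.child_specs[name]()
        except Exception:
            traceback.print_exc()
            self.handle_exit(name)

    def handle_exit(self, name):
        if self.restart_counts[name] < self.max_restarts:
            self.restart_counts[name] += 1
            print(f"one_for_one: restarting only {name}")
            self.run_child(name)            # siblings are not touched
        else:
            raise RuntimeError(f"{name} exceeded restart intensity; supervisor gives up")


# Usage sketch: a flaky worker next to a healthy one
attempts = {"count": 0}
def flaky_worker():
    attempts["count"] += 1
    if attempts["count"] < 3:
        raise ValueError("let it crash")

sup = Supervisor({"flaky": flaky_worker, "steady": lambda: None})
sup.run_child("steady")
sup.run_child("flaky")   # crashes twice, is restarted each time, then succeeds
```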


Fault Tolerant Coordination Services

Burrows, Mike. "The Chubby lock service for loosely-coupled distributed systems." Proceedings of the 7th Symposium on Operating Systems Design and Implementation. USENIX Association, 2006.

Hunt, Patrick, et al. "ZooKeeper: Wait-free Coordination for Internet-scale Systems." USENIX Annual Technical Conference. Vol. 8. 2010.


Motivation

• Distributed algorithms are notoriously hard to implement correctly

• Leader election / consensus need to be inherently fault tolerant

• Decoupling of algorithmic and data redundancy
  • Storage nodes usually need a higher degree of replication
  • Consistency constraints
  • High recovery costs

• Decision making should be lightweight
  • Fast recovery
  • Low latency requirement


Ongaro, Diego, and John Ousterhout. "In search of an understandable consensus algorithm." USENIX Annual Technical Conference. 2014.

Chubby

• Google’s distributed lock service

• Goal: easily add consensus / leader election to existing applications (see the sketch after this list)

• Lock service: simple interface for distributed decision making
  • “a generic electorate that allows a client system to make decisions correctly when less than a majority of its own members are up”

• Serves small files so elected primaries can easily distribute parameters

• Client notification on events (such as lock expiry → new leader election)

• A Chubby cell consists of 5 replicas running Paxos
  • Automatic failover within a configured machine pool
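Leader election on top of a lock service boils down to: whoever holds the lock is primary, and it writes its identity into a small file that everyone else reads. A hedged sketch against a hypothetical client API (LockServiceClient, acquire, write_file and watch are invented names, not Chubby's actual interface):

```python
def try_become_primary(client, my_address):
    """Toy leader election over a Chubby-like lock service.

    `client` is a hypothetical LockServiceClient; its methods are assumptions
    used only to illustrate the pattern.
    """
    lock = client.acquire("/service/leader_lock", blocking=False)
    if lock is not None:
        # We won the election: advertise ourselves in a small file,
        # which replicas and clients read to find the current primary.
        client.write_file("/service/primary_address", my_address)
        return True
    # Someone else is primary; watch the lock to retry on expiry.
    client.watch("/service/leader_lock",
                 callback=lambda event: try_become_primary(client, my_address))
    return False
```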


Zookeeper

• “Because Coordinating Distributed Systems is a Zoo”

• Distributed configuration + coordination service
  • Used for leader election, message queuing, synchronization (see the sketch after this list)

• Provides a file-system-like namespace for coordination data (<= 1 MB per znode)
  • Kept in memory
  • State-based service: no change history

• Guaranteed absolute order of updates
  • Client watch events are triggered in the same order as Zookeeper sees the updates

• Throughput of read requests scales with #servers

• Throughput of write requests decreases with #servers
  • Consensus on all updates
  • ~50k updates per second
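A minimal sketch of how coordination data and a classic leader-election recipe look from a client, using the third-party kazoo Python client (assumed to be installed, with a ZooKeeper ensemble assumed at 127.0.0.1:2181):

```python
# Minimal sketch using the third-party `kazoo` ZooKeeper client (an assumption;
# any ZooKeeper client exposing create/get_children would do).
from kazoo.client import KazooClient

zk = KazooClient(hosts="127.0.0.1:2181")
zk.start()

# Coordination data lives in a file-system-like namespace of small znodes.
zk.ensure_path("/app/config")
zk.set("/app/config", b"feature_x=on")

# Classic leader-election recipe: ephemeral, sequential znodes.
me = zk.create("/app/election/candidate-", b"node-a",
               ephemeral=True, sequence=True, makepath=True)
candidates = sorted(zk.get_children("/app/election"))
is_leader = me.split("/")[-1] == candidates[0]
print("I am the leader" if is_leader else "I am a follower")

zk.stop()
```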


Distributed Storage

Chang, Fay, et al. "Bigtable: A distributed storage system for structured data." ACM Transactions on Computer Systems (TOCS) 26.2 (2008): 4.

Corbett, James C., et al. "Spanner: Google's globally distributed database." ACM Transactions on Computer Systems (TOCS) 31.3 (2013): 8.

HDFS architecture guide. http://hadoop.apache.org/common/docs/current/hdfs_design.pdf (2008).

DeCandia, Giuseppe, et al. "Dynamo: amazon's highly available key-value store." ACM SIGOPS Operating Systems Review. Vol. 41. No. 6. ACM, 2007.


Design Choices

• When to resolve conflicts?
  • On read
  • On write

• Who resolves conflicts?
  • Application: data-model-aware resolution policies possible
  • Storage system: application transparency, but less powerful

• ACID vs BASE

• PACELC trade-offs (if Partitioned, trade Availability vs. Consistency; Else, Latency vs. Consistency)

• Data partitioning algorithm


ACID vs BASE (Brewer. PODC keynote. 2000)

ACID: Atomic, Consistent, Isolated, Durable

• Transactions

• Strong consistency

• Pessimistic/conservative replication

BASE: Basically Available, Soft-state, Eventual consistency

• Best Effort

• Weak consistency

• Optimistic replication


Modern distributed storage systems

• Geo-replication
  • Latency issues
  • Consistency models need to take locality into account

• Shift towards tuneable, relaxed consistency models
  • Application-specific configuration
  • Fault tolerance increasingly also a DevOps problem

• Always available, low latency, partition tolerance, scalability (ALPS)
  • Availability before consistency
  • Most ALPS systems offer eventual consistency

• NoSQL movement
  • Relational DBMS are hard to make consistent and available
  • Denormalized data is easier to replicate


Self-Healing

• How (and when) to handle diverging replicas with eventual consistency?

• Read repair
  • Quorum met, but not all replicas agreed → inconsistency detected! (see the sketch after this list)
  • Force the minority to update their copy

• Active Anti-Entropy (AAE)
  • Continuously running background process
  • Difference detection using hash trees
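A minimal sketch of the read-repair idea above, assuming a toy versioned key-value replica interface (the Replica class and its methods are illustrative, not any particular system's API):

```python
def read_with_repair(replicas, key, read_quorum):
    """Toy read repair: read a quorum, detect stale copies, push the newest value back."""
    # Collect (version, value) answers from enough replicas to meet the quorum.
    answers = {r: r.get(key) for r in replicas[:read_quorum]}

    # The highest version wins; lower versions are stale.
    newest_version, newest_value = max(answers.values(), key=lambda vv: vv[0])

    # Repair: force the minority holding stale copies to update.
    for r, (version, _) in answers.items():
        if version < newest_version:
            r.put(key, newest_version, newest_value)

    return newest_value


class Replica:
    """Illustrative in-memory replica with versioned values."""
    def __init__(self):
        self.store = {}
    def get(self, key):
        return self.store.get(key, (0, None))
    def put(self, key, version, value):
        self.store[key] = (version, value)


# Usage sketch: one replica missed the latest write.
a, b, c = Replica(), Replica(), Replica()
for r in (a, b):
    r.put("cart", 2, ["book", "pen"])
c.put("cart", 1, ["book"])
assert read_with_repair([a, b, c], "cart", read_quorum=3) == ["book", "pen"]
assert c.get("cart")[0] == 2   # the stale replica was repaired
```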


BigTable

• Google’s distributed database

• Designed to handle petabytes of distributed data

• Non-relational data model: “multi-dimensional sparse maps”

• GQL: subset of SQL

• Building Blocks
  • Google File System (GFS) for raw storage
  • Chubby for master election
  • Custom MapReduce implementation for writing data


Google Spanner

• Data is replicated across spanservers in different data centres

• Data model: semi-relational

• “Externally consistent” transactions (linearizable consistency for R/W transactions)

• Timestamped transactions, using Paxos


[Figure: effect of killing the Paxos leader]

Google Spanner / TrueTime

• Instead of relying on NTP, data centres have their own atomic clocks

• GPS-based time negotiation
  • Periodic consensus on time → reliable, but uncertain, global clock

• Interval-based time (uncertainty representation)
  • The longer since the last synchronization point, the higher the uncertainty (see the sketch below)
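A toy model of the interval clock and of Spanner's commit-wait rule (a commit timestamp is only exposed once it is guaranteed to lie in the past on every clock). The class below illustrates the idea only; it is not Google's TrueTime API, and the epsilon value is an arbitrary assumption:

```python
import time

class ToyTrueTime:
    """Illustrative TrueTime-style clock: now() returns an uncertainty interval."""
    def __init__(self, epsilon_seconds=0.007):
        self.epsilon = epsilon_seconds                  # assumed clock uncertainty bound

    def now(self):
        t = time.time()
        return (t - self.epsilon, t + self.epsilon)     # [earliest, latest]

    def after(self, t):
        """True once t is guaranteed to have passed on every clock."""
        earliest, _ = self.now()
        return earliest > t


def commit_wait(tt, commit_timestamp):
    """Spanner-style commit wait: delay visibility until the timestamp is surely in the past."""
    while not tt.after(commit_timestamp):
        time.sleep(0.001)


# Usage sketch: take the latest possible current time as the commit timestamp,
# then wait roughly 2 * epsilon before acknowledging the commit.
tt = ToyTrueTime()
_, commit_ts = tt.now()
commit_wait(tt, commit_ts)
print("transaction visible at a timestamp that is now strictly in the past")
```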


HDFS


• Standard storage system behind Hadoop

• Replication of equal size file blocks on DataNodes

• Central coordinating NameNode
  • Maintains metadata: namespace tree, mapping of blocks to DataNodes
  • Metadata kept in memory
  • Monitors DataNodes by receiving heartbeats

• DataNode failure → NameNode detects it and re-replicates the affected blocks on another node (see the sketch below)

• NameNode is a single point of failure (before Hadoop 2.0.0)
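A rough sketch of the heartbeat-driven failure handling described above; the timeout, class name and data structures are assumptions for illustration, not Hadoop's actual implementation:

```python
import time

class ToyNameNode:
    """Illustrative NameNode-like metadata service: tracks heartbeats and block locations."""
    def __init__(self, heartbeat_timeout=30.0, replication_factor=3):
        self.heartbeat_timeout = heartbeat_timeout
        self.replication_factor = replication_factor
        self.last_heartbeat = {}          # datanode -> timestamp of last heartbeat
        self.block_locations = {}         # block id -> set of datanodes holding it

    def heartbeat(self, datanode):
        self.last_heartbeat[datanode] = time.time()

    def check_datanodes(self):
        """Declare silent DataNodes dead and re-replicate their blocks elsewhere."""
        now = time.time()
        dead = {dn for dn, ts in self.last_heartbeat.items()
                if now - ts > self.heartbeat_timeout}
        live = set(self.last_heartbeat) - dead
        for block, holders in self.block_locations.items():
            holders -= dead
            while len(holders) < self.replication_factor and live - holders:
                target = (live - holders).pop()   # placement policy omitted in this sketch
                holders.add(target)               # i.e. schedule a copy of `block` to `target`
        return dead
```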

High Availability HDFS

• HBase runs on top of HDFS: open source BigTable implementation

Dynamo

• Amazon’s distributed key-value store

• Designed for scalability and high availability

• Assumptions
  • Most operations do not span multiple data items → no need for a fully relational DBMS
  • Poor write availability is worse than inconsistency

• Always writeable
  • Conflict resolution happens upon reads (see the sketch after this list)
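Dynamo tracks causality between object versions with vector clocks and hands genuinely concurrent versions back to the application on read (e.g. merging shopping carts). A small sketch of that comparison and merge; all names and the merge policy are illustrative:

```python
def descends(clock_a, clock_b):
    """True if the version with clock_a causally follows (or equals) clock_b."""
    return all(clock_a.get(node, 0) >= count for node, count in clock_b.items())


def resolve_on_read(versions):
    """Drop versions that are causally dominated; any concurrent leftovers are a real
    conflict that the application must merge (here: union, like merging shopping carts)."""
    survivors = [
        (clock, value) for clock, value in versions
        if not any(other is not clock and descends(other, clock)
                   for other, _ in versions)
    ]
    if len(survivors) == 1:
        return survivors[0][1]
    return sorted(set().union(*(set(v) for _, v in survivors)))   # app-level merge


# Usage sketch: two clients updated the same cart via different coordinator nodes.
v1 = ({"A": 2, "B": 1}, ["book"])            # dominated by v2
v2 = ({"A": 3, "B": 1}, ["book", "pen"])
v3 = ({"A": 2, "B": 2}, ["book", "lamp"])    # concurrent with v2
print(resolve_on_read([v1, v2, v3]))          # -> ['book', 'lamp', 'pen']
```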


Riak

• Distributed key-value store programmed in Erlang

• Design based on the Dynamo paper (ring-based key placement sketched below)


[Figures: replication configuration, ring-based consistent hashing, Erlang supervision tree]
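Like Dynamo, Riak places keys on a hash ring and stores each key on the next N distinct nodes clockwise. A compact sketch of that placement rule; the number of virtual nodes and the use of SHA-1 follow the Dynamo/Riak design, but the code itself is illustrative:

```python
import bisect
import hashlib

def ring_position(value):
    """Map a string onto the hash ring (a 160-bit space, as with a SHA-1 ring)."""
    return int(hashlib.sha1(value.encode()).hexdigest(), 16)

def build_ring(nodes, vnodes_per_node=8):
    """Assign several virtual nodes per physical node to smooth the key distribution."""
    return sorted((ring_position(f"{node}-{i}"), node)
                  for node in nodes for i in range(vnodes_per_node))

def preference_list(ring, key, n_replicas=3):
    """First vnode clockwise from the key coordinates it; replicas follow on distinct nodes."""
    positions = [pos for pos, _ in ring]
    start = bisect.bisect(positions, ring_position(key)) % len(ring)
    chosen = []
    for i in range(len(ring)):
        node = ring[(start + i) % len(ring)][1]
        if node not in chosen:
            chosen.append(node)
        if len(chosen) == n_replicas:
            break
    return chosen

# Usage sketch
ring = build_ring(["riak1", "riak2", "riak3", "riak4"])
print(preference_list(ring, "user:42"))   # e.g. ['riak3', 'riak1', 'riak4']
```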

Cassandra

• Distributed NoSQL DBMS

• Designed for performance and scalability

• Eventual consistency, configurable
  • Hinted handoff for availability

• Gossip protocol for failure detection

• Configurable replication + partitioning
  • NetworkTopologyStrategy: data-centre aware


N/R/W consistency calculator: http://www.ecyrd.com/cassandracalculator/
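The calculator linked above explores the classic N/R/W relationship: with N replicas, a read of R copies and a write of W copies overlap in at least one up-to-date replica whenever R + W > N. A tiny illustration of that arithmetic:

```python
def quorum_properties(n, r, w):
    """Classic replica-quorum arithmetic: overlapping read/write sets give strong consistency."""
    return {
        "strongly_consistent_reads": r + w > n,   # every read set intersects the latest write set
        "no_conflicting_writes": w + w > n,       # any two write sets intersect
        "reads_survive_failures": n - r,          # replicas that may be down and reads still succeed
        "writes_survive_failures": n - w,
    }

# Usage sketch: Cassandra-style QUORUM reads and writes with 3 replicas.
print(quorum_properties(n=3, r=2, w=2))
# {'strongly_consistent_reads': True, 'no_conflicting_writes': True,
#  'reads_survive_failures': 1, 'writes_survive_failures': 1}
```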

The Reality of Distributed Failures…

• human operation mistakes
• data corruption is rarely part of the failure model
• unforeseen (hence unmodelled) error propagation chains
• dynamically changing failure probabilities
• nested failures during recovery routines

