www.opendaylight.org
My Collaborators
▪ Abhishek Kumar
▪ Basheeruddin Ahmed
▪ Colin Dixon
▪ Harman Singh
▪ Kamal Rameshan
▪ Robert Varga
▪ Tony Tkacik
▪ Tom Pantelis
▪ Luis Gomez
▪ Phillip Shea
▪ Radhika Hirannaiah
▪ and many more…
Subsystems
[Diagram: Distributed Data Store across member-1, member-2 and member-3; Remote RPC Connector between member-1 and member-2]
High Level Architecture
Distributed Data Store Remote RPC Connector
Persistence
Remoting
Clustering
Actor Systems
Distributed Data Store Remote RPC Connector
Actor Hierarchy
Configuration
Dispatchers
Data Synchronization
▪ Data store: synchronized data tree, using Raft for distributed consensus
▪ Remote RPC: synchronized RPC registry, using Gossip for data distribution
Data Distribution
[Diagram labels: Client, DistributedDataStore, member-1, member-2, member-3; shards: topology, inventory]
HA
[Diagram labels: Client, DistributedDataStore, member-1, member-2, member-3; inventory – leader, inventory – follower-1, inventory – follower-2]
Raft Distributed Consensus
Election (states: Follower, Candidate, Leader)
▪ starts up/recovers → Follower
▪ Follower times out, starts elections → Candidate
▪ Candidate receives votes from majority of nodes → Leader
▪ Candidate times out, restarts elections → Candidate
▪ discovers node with higher term → Follower
Replication/Consensus
[Diagram: leader, follower-1, follower-2]
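The election transitions on this slide can be sketched as a small state machine. This is only an illustration of the diagram, not the sal-akka-raft code; `Role`, `Event` and `next_role` are hypothetical names introduced here.

```python
from enum import Enum


class Role(Enum):
    FOLLOWER = "Follower"
    CANDIDATE = "Candidate"
    LEADER = "Leader"


class Event(Enum):
    ELECTION_TIMEOUT = "times out"
    MAJORITY_VOTES = "receives votes from majority of nodes"
    HIGHER_TERM_SEEN = "discovers node with higher term"


def next_role(role: Role, event: Event) -> Role:
    """Return the role a node moves to after an event (hypothetical helper)."""
    if event is Event.HIGHER_TERM_SEEN:
        # Seeing a higher term always sends the node back to Follower.
        return Role.FOLLOWER
    if event is Event.ELECTION_TIMEOUT and role in (Role.FOLLOWER, Role.CANDIDATE):
        # A Follower starts an election; a Candidate restarts it.
        return Role.CANDIDATE
    if event is Event.MAJORITY_VOTES and role is Role.CANDIDATE:
        return Role.LEADER
    return role  # every other combination leaves the role unchanged


role = Role.FOLLOWER  # a node starts up or recovers as a Follower
for event in (Event.ELECTION_TIMEOUT, Event.MAJORITY_VOTES, Event.HIGHER_TERM_SEEN):
    role = next_role(role, event)
    print(event.name, "->", role.value)
```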
Journal replication
[Diagram: leader, follower-1 and follower-2 each hold the same journal: transaction-1, transaction-2, transaction-3, transaction-4]
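In Raft, a journal entry counts as committed once a majority of members store it. A minimal sketch of that rule follows; `commit_index` and the example indexes are illustrative assumptions, and the extra Raft check that the entry belongs to the leader's current term is left out.

```python
from typing import Dict


def commit_index(match_index: Dict[str, int], leader_last_index: int) -> int:
    """Highest journal index stored on a majority of members, leader included."""
    indexes = sorted([leader_last_index, *match_index.values()], reverse=True)
    majority = len(indexes) // 2 + 1
    # The entry at position majority-1 is held by at least `majority` members.
    return indexes[majority - 1]


journal = ["transaction-1", "transaction-2", "transaction-3", "transaction-4"]
# Hypothetical state: follower-2 has only received the first two entries.
followers = {"follower-1": 4, "follower-2": 2}
print(commit_index(followers, len(journal)))
```

With three members, one lagging follower does not hold back the commit index, because the leader and the other follower already form a majority.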
Location Transparency
[Diagram labels: Consumer, Provider, member-1, member-2, RpcProviderProxy, RemoteRpcBroker]
RPC Registry Replication - Gossip
▪ Local bucket updates change the bucket version (for example, modify: version=1 → version=2)
▪ All buckets and their versions are known to all members (for example, m1,v1; m2,v5; m3,v7)
▪ Every 1 second, members send all known bucket versions (status) to any one peer
▪ On receiving a status: local versions higher, send update; local versions lower, send status to sender
[Diagram: m1, m2 and m3 exchanging status and update messages]
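The version comparison a member performs when it receives a gossip status can be sketched as below. `handle_status` and the version maps are hypothetical; the real BucketStore messages are not shown on the slide.

```python
from typing import Dict, List, Tuple


def handle_status(local: Dict[str, int], remote: Dict[str, int]) -> Tuple[List[str], bool]:
    """Compare bucket versions after receiving a peer's status.

    Returns the buckets to send as an update (local version higher) and
    whether to send our own status back (some local version lower).
    """
    send_update_for = sorted(
        member for member, version in local.items() if version > remote.get(member, -1)
    )
    reply_with_status = any(
        version > local.get(member, -1) for member, version in remote.items()
    )
    return send_update_for, reply_with_status


# Local view matches the slide (m1,v1; m2,v5; m3,v7); the peer's view is made up.
local_versions = {"m1": 1, "m2": 5, "m3": 7}
peer_versions = {"m1": 2, "m2": 5, "m3": 6}
print(handle_status(local_versions, peer_versions))
```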
Modules
▪ sal-clustering-commons
▪ sal-akka-raft
▪ sal-remoterpc-connector
▪ sal-distributed-datastore
▪ sal-clustering-config
▪ sal-akka-raft-example
▪ sal-dummy-distributed-datastore
▪ clustering-test-app
sal-clustering-commons
▪ Some common messages
▪ Actor base classes
▪ The Protobuf messages used in Helium
▪ The Protobuf NormalizedNode serialization code
▪ The NormalizedNode streaming code
▪ Other miscellaneous utility classes
sal-akka-raft
▪ Implementation of the Raft algorithm on top of Akka
▪ Uses akka-persistence for durability
▪ Provides a base class called RaftActor, which can be extended by anyone who wants to replicate state
▪ See sal-akka-raft-example, which provides a simple implementation of a replicated HashMap
sal-distributed-datastore
▪ ConcurrentDOMDataBroker
▪ DistributedDataStore
▪ Implementation of the DOMStore SPI
▪ Shard built on top of RaftActor
▪ Creates Shards based on Sharding strategy
▪ Code for a client to interact with the Shard Leader
sal-remoterpc-connector
▪ RemoteRpcProvider
▪ Default RPC Provider. Invoked when an RPC is not found in the local MD-SAL registry.
▪ Code for BucketStore, which provides a mechanism to replicate state based on Gossip
▪ Code for RpcBroker, which allows invoking a remote RPC
Startup
[Diagram labels: DistributedConfigDataStoreProviderModule (createInstance), DistributedDataStore, ActorContext, ShardManager, Shard1, Shard, Shard3, Shard4, waitTillReadyLatch (create & waitTillReady)]
Recovery
[Diagram labels: Shard1, Shard, Shard3, Shard4 (read last known state from disk), ready, ShardManager, waitTillReadyLatch (countDown)]
Waiting for Ready
▪ Recovery must be complete
▪ All Shard Leaders must be known
▪ Three messages are monitored by ShardManager
  ▪ Cluster.MemberStatusUp: used to figure out the address of a cluster member
  ▪ LeaderStateChanged: used to figure out if a Follower has a different Leader
  ▪ ShardRoleChanged: used to figure out any changes in a Shard's Role
▪ Waiting is not infinite: by default it lasts only 90 seconds, but this is configurable
▪ Will block the config subsystem
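The bounded wait can be sketched with a count-down latch. Python has no built-in latch, so a small `CountDownLatch` is written here. `ShardReadiness`, the shard names, and the choice to count down once when every shard is ready are assumptions made for illustration, not the ShardManager implementation.

```python
import threading

READY_TIMEOUT_SECONDS = 90  # default from the slide; configurable


class CountDownLatch:
    """Minimal latch: wait() returns True once the count reaches zero."""

    def __init__(self, count: int) -> None:
        self._count = count
        self._cond = threading.Condition()

    def count_down(self) -> None:
        with self._cond:
            if self._count > 0:
                self._count -= 1
            if self._count == 0:
                self._cond.notify_all()

    def wait(self, timeout: float) -> bool:
        with self._cond:
            # wait_for returns the predicate's value, so False means a timeout.
            return self._cond.wait_for(lambda: self._count == 0, timeout)


class ShardReadiness:
    """Hypothetical tracker: releases the latch once every shard reports ready."""

    def __init__(self, shard_names) -> None:
        self._pending = set(shard_names)
        self._lock = threading.Lock()
        self.wait_till_ready_latch = CountDownLatch(1)

    def shard_ready(self, name: str) -> None:
        with self._lock:
            self._pending.discard(name)
            all_ready = not self._pending
        if all_ready:
            self.wait_till_ready_latch.count_down()


readiness = ShardReadiness(["default", "inventory", "topology"])
for shard in ("default", "inventory", "topology"):
    threading.Thread(target=readiness.shard_ready, args=(shard,)).start()
print(readiness.wait_till_ready_latch.wait(READY_TIMEOUT_SECONDS))
```

A caller that gets `False` back has hit the timeout and can continue startup instead of blocking forever.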
Creating a Transaction
[Diagram: DistributedDataStore newReadWriteTransaction; create TransactionProxy]
First Operation
▪ TransactionProxy write("inventory", node)
▪ ActorContext.findPrimary
▪ PrimaryCache.lookup/ShardManager.findPrimary
▪ Found? N: NoOpTransactionContext
▪ Found? Y, Local? Y: LocalTransactionContext
▪ Found? Y, Local? N: RemoteTransactionContext
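The decision made on the first operation can be sketched as a lookup followed by two checks. The function below only returns the name of the context that would be chosen; `choose_context` and the `primaries` map (shard name to leader member) are hypothetical stand-ins for the PrimaryCache/ShardManager lookup.

```python
from typing import Dict


def choose_context(shard: str, primaries: Dict[str, str], local_member: str) -> str:
    """Pick a transaction context for the shard touched by the first operation."""
    leader = primaries.get(shard)  # stands in for findPrimary
    if leader is None:
        return "NoOpTransactionContext"    # primary not found
    if leader == local_member:
        return "LocalTransactionContext"   # leader is on this member
    return "RemoteTransactionContext"      # leader is on another member


primaries = {"inventory": "member-1", "topology": "member-2"}
for shard in ("inventory", "topology", "unknown-shard"):
    print(shard, "->", choose_context(shard, primaries, "member-1"))
```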
Transactions
[Diagram, Local Transaction: Client, DistributedDataStore, inventory – leader; member-1]
[Diagram, Remote Transaction: Client, DistributedDataStore, inventory – leader; member-1, member-2]
Local Transaction Optimization
[Diagram: LocalTransactionContext and Shard - Leader on member-1; operations: write, merge, delete, ready]
Remote Transaction Optimization
[Diagram: RemoteTransactionContext (member-1) and Shard Leader (member-2); operations: write, merge, delete, ready; messages: write mod, merge mod, delete mod]
Transaction Rate Limiting
▪ rate-limit = 100 Tx/sec
▪ After rate-limit/2 transactions are done, new-rate-limit = 25 Tx/sec
[Diagram: three Tx Cohorts and the Shard Leader on member-2, labelled 20 ms, 50 ms and 15 ms]
Operation Limiting
[Diagram: RemoteTransactionContext (member-1) and Shard Leader (member-2); operations: write, merge, delete, …; messages: write mod, merge mod, delete mod, …; block]
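One common way to block a caller once too many operations are outstanding is a counting semaphore. The sketch below assumes that approach; the slide only shows that the sender blocks, so the class, its method names and the limit of 2 are illustrative.

```python
import threading


class OperationLimiter:
    """Hypothetical limiter: one permit per operation awaiting a reply."""

    def __init__(self, max_outstanding: int) -> None:
        self._permits = threading.BoundedSemaphore(max_outstanding)

    def acquire(self, timeout: float = None) -> bool:
        # Blocks until a permit is free; returns False if the timeout expires.
        return self._permits.acquire(timeout=timeout)

    def release(self) -> None:
        # Called when the Shard Leader has replied to an operation.
        self._permits.release()


limiter = OperationLimiter(max_outstanding=2)
print(limiter.acquire())              # write mod sent
print(limiter.acquire())              # merge mod sent
print(limiter.acquire(timeout=0.05))  # delete mod must wait: no permit left
limiter.release()                     # a reply arrives
print(limiter.acquire(timeout=0.05))  # delete mod can now be sent
```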
Commit Coordination
[Diagram: Shard CommitCoordinator in the Shard Leader on member-2; queue: Tx1, Tx2, Tx3; messages: Tx1 - ready, Tx2 - ready, Tx3 - ready, Tx1 - commit, Tx3 - commit, Tx3 - abort, Tx2 - commit]
Managing the in-memory journal: Replicated To All
[Diagram: Client, commit transaction, leader, follower-1, follower-2; each member holds one txn]
Managing the in-memory journal: Cluster member unavailable
[Diagram: Client, commit transaction, leader, follower-1, follower-2; multiple txn entries held in the journal]
Data Change Notifications
[Diagram: Client, commit transaction, leader, follower-1, follower-2; each member holds one txn; notify]
Startup
[Diagram labels: RemoteRpcBrokerModule (createInstance), RemoteRpcProvider, RpcManager, RpcBroker, RpcRegistry, RemoteRpcImpl, RpcListener]
Default RPC Delegate
[Diagram: RpcManager, SchemaContext (read all rpc definitions), DOMRpcProviderService (registerImplementation(remoteRpcImpl))]
RPC Registered
[Diagram labels: RpcProviderRegistry (addRoutedRpcImpl), RoutedRpcRegistration (registerPath), RpcListener, RpcRegistry]
Invoking a Remote RPC
▪ RemoteRpcImpl invokeRpc
▪ RpcRegistry: route found?
▪ If found: RpcBroker, ExecuteRpc, FooService
▪ If not found: throw Exception
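The route lookup and its two outcomes can be sketched as follows. Everything here is a simplified stand-in: `invoke_rpc`, `NoRouteError`, the `routes` map and the per-member handler functions are hypothetical, and no network call is made.

```python
from typing import Callable, Dict


class NoRouteError(Exception):
    """Raised when the registry has no route for the requested RPC."""


def invoke_rpc(
    rpc: str,
    payload: dict,
    routes: Dict[str, str],
    brokers: Dict[str, Callable[[str, dict], dict]],
) -> dict:
    member = routes.get(rpc)  # stands in for the RpcRegistry lookup
    if member is None:
        raise NoRouteError(f"no route found for {rpc}")
    # Stands in for sending ExecuteRpc to the RpcBroker on that member.
    return brokers[member](rpc, payload)


def foo_service(rpc: str, payload: dict) -> dict:
    """Hypothetical provider implementation running on member-2."""
    return {"rpc": rpc, "echo": payload}


routes = {"foo": "member-2"}
brokers = {"member-2": foo_service}
print(invoke_rpc("foo", {"x": 1}, routes, brokers))
```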
Invoking a Remote RPC
[Diagram: Consumer (member-1) and Provider (member-2); components: RemoteRpcImpl, RpcRegistry, RpcBroker; messages: invokeRpc, findRoute, ExecuteRpc]
Transaction Tracing
Client
Created txn member-2-txn-9400 of type READ_WRITE on chain member-2-txn-chain-13
Tx member-2-txn-9400 read /(urn:opendaylight:inventory?...
Tx member-2-txn-9400 Readying 1 transactions for commit
Tx member-2-txn-9400 commit
Tx member-2-txn-9400: commit succeeded
Server
member-3-shard-inventory-operational: Creating transaction : shard-member-2-txn-9400
member-3-shard-inventory-operational: Readying transaction member-2-txn-9400
member-3-shard-inventory-operational: Committing transaction member-2-txn-9400
Callouts: Cluster Member Initiator, Counter, Transaction Type, Module, Data store type
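When following a transaction through these logs, it helps to split the identifiers into the parts the callouts name. The patterns below are inferred only from the sample log lines on this slide, so treat them as an assumption rather than a specification of the ID format.

```python
import re
from typing import Tuple

# Patterns inferred from "member-2-txn-9400" and
# "member-3-shard-inventory-operational" in the sample logs.
TXN_ID = re.compile(r"(member-\d+)-txn-(\d+)")
SHARD_ID = re.compile(r"(member-\d+)-shard-(.+)-(operational|config)")


def parse_txn_id(txn_id: str) -> Tuple[str, int]:
    """Return (initiating cluster member, counter)."""
    match = TXN_ID.fullmatch(txn_id)
    if match is None:
        raise ValueError(f"unrecognised transaction id: {txn_id}")
    return match.group(1), int(match.group(2))


def parse_shard_id(shard_id: str) -> Tuple[str, str, str]:
    """Return (cluster member, module, data store type)."""
    match = SHARD_ID.fullmatch(shard_id)
    if match is None:
        raise ValueError(f"unrecognised shard id: {shard_id}")
    return match.group(1), match.group(2), match.group(3)


print(parse_txn_id("member-2-txn-9400"))
print(parse_shard_id("member-3-shard-inventory-operational"))
```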
Replication Tracing
Leader
Sending AppendEntries to follower member-2-shard-topology-operational: AppendEntries [term=2, leaderId=member-1-shard-topology-operational, prevLogIndex=520, prevLogTerm=2, entries=[Entry{index=521, term=2}], leaderCommit=520, replicatedToAllIndex=-1]
Follower
handleAppendEntries: AppendEntries [term=2, leaderId=member-2-shard-topology-operational, prevLogIndex=520, prevLogTerm=2, entries=[Entry{index=521, term=2}], leaderCommit=520, replicatedToAllIndex=-1]
handleAppendEntries returning : AppendEntriesReply [term=2, success=true, logLastIndex=521, logLastTerm=2, followerId=member-1-shard-topology-operational]
Leader
handleAppendEntriesReply from member-2-shard-topology-operational: applying to log – commitIndex: 521, lastAppliedIndex: 520
handleAppendEntriesReply - FollowerLogInformation for member-2-shard-topology-operational updated: matchIndex: 521, nextIndex: 522
Shard MBean
org.opendaylight.controller:type=DistributedOperationalDataStore,Category=Shards,name=member-1-shard-inventory-operational
▪ type: Operational or Config
▪ name, member: member-1, member-2, member-3
▪ name, shard: default, inventory, topology
▪ name, data store type: operational or config
Attributes
• AbortTransactionsCount
• CommitIndex
• CommittedTransactionsCount
• CurrentTerm
• FailedTransactionsCount
• FollowerInfo
• FollowerInitialSyncStatus
• InMemoryJournalDataSize
• InMemoryJournalLogSize
• LastApplied
• LastCommittedTransactionTime
• LastIndex
• LastTerm
• Leader
• RaftState
• ReadOnlyTransactionCount
• ReadWriteTransactionCount
• WriteOnlyTransactionCount
• VotedFor
• and more…
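A JMX object name has the form `domain:key=value,key=value`. The sketch below splits the Shard MBean name from this slide into its parts; `parse_object_name` is a hypothetical helper and ignores quoted values and wildcard patterns, which full object names may contain.

```python
from typing import Dict, Tuple


def parse_object_name(object_name: str) -> Tuple[str, Dict[str, str]]:
    """Split 'domain:k1=v1,k2=v2' into the domain and a key/value map."""
    domain, _, properties = object_name.partition(":")
    pairs = dict(item.split("=", 1) for item in properties.split(","))
    return domain, pairs


name = (
    "org.opendaylight.controller:type=DistributedOperationalDataStore,"
    "Category=Shards,name=member-1-shard-inventory-operational"
)
domain, keys = parse_object_name(name)
print(domain)
print(keys["Category"], keys["name"])
```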
ShardManager MBean
org.opendaylight.controller:type=DistributedOperationalDataStore,Category=ShardManager,name=shard-manager-operational
▪ type: Operational or Config
▪ name suffix: operational or config
Attributes
• LocalShards
• SyncStatus
Data store GeneralRuntimeInfo MBean
org.opendaylight.controller:type=DistributedConfigDatastore,name=GeneralRuntimeInfo
▪ type: Operational or Config
Attributes
• TransactionCreationRateLimit
Transaction Commit Rate MBean
org.opendaylight.controller.cluster.datastore:name=distributed-data-store.config.commit.rate
▪ data store type in name: operational or config
Attributes
• 50thPercentile
• 75thPercentile
• 90thPercentile
• and so on…
• Count
• Min
• Max
• StdDev
Message Statistics MBean
org.opendaylight.controller.actor.metric:name=/user/shardmanager-config.msg-rate.ActorInitialized
▪ data store type in name: operational or config
▪ last segment of name: Message Name
Attributes
• 50thPercentile
• 75thPercentile
• 90thPercentile
• and so on…
• Count
• Min
• Max
• StdDev
RemoteRpcBroker MBean
org.opendaylight.controller:type=RemoteRpcBroker,name=RemoteRpcRegistry
Attributes
• BucketVersions
• GlobalRpc
• LocalRegisteredRoutedRpc
Operations
• findRpcByName
• findRpcByRoute
Message Statistics MBean
org.opendaylight.controller.actor.metric:name=/user/rpc/registry.msg-rate.AddOrUpdateRoutes
▪ last segment of name: Message Name
Attributes
• 50thPercentile
• 75thPercentile
• 90thPercentile
• and so on…
• Count
• Min
• Max
• StdDev
Suggested Next Steps…
▪ Deploy a cluster
▪ Run clustering integration tests
▪ Write an application that works in the cluster
▪ File bugs to report features that you find missing
▪ Try running dsBenchMark on a cluster
▪ Test out replication using the dummy data store
▪ Check out the code
▪ Send email to controller-[email protected] with questions