RINA Workshop
The Pouzin Society
RINA Detailed Overview and Implementation Discussions
RINA Workshop. Barcelona, January 22-24 2013
Overview
• Distributed Applications
  – Naming, Flows, Application API
  – Common Application Connection Establishment and CDAP
  – SDU Protection
• The DIF and the IPC Process
  – Block Diagram (Reference vs. Implementation Architecture)
  – RIB and RIB operations
  – Enrollment
  – Flow Allocation
  – Transport protocol: EFCP, DTP, DTCP
  – Relaying and Multiplexing: RMT
  – Routing and Forwarding
  – Resource Allocation
• Shim DIF
  – Internet/IP/TCP/UDP use with RINA
• IDD
• Misc. topics: Network Mgt., Security, …
• Demo storyboard
• RINAband
DISTRIBUTED APPLICATIONS
Naming, Flows and Application API
Distributed applications
• For A and B to communicate, they need:
  – A means to identify each other -> Application process naming
  – A medium that provides a communication service -> Flows
  – A way to tell the communication medium that they want resources allocated for a particular communication, with certain quality requirements -> Communication medium API
  – A shared domain of discourse -> Objects
  – Optionally, a way to verify who they are talking to (authenticate), and to negotiate which protocol will carry the data they exchange and which concrete encoding will be used -> Application connection
  – A method to carry their discourse (objects) -> Application protocol
[Figure: Appl. Process A and Appl. Process B communicate through a medium that enables applications to communicate. Each process holds a local handle to a particular instance of a communication; a flow joins the two handles, and the application connection and application protocol run over the flow.]
Distributed applications in the Internet
• For A and B to communicate, they need:
  – Application process naming: No names for applications; IP addresses and ports are all we have (URLs are pathnames to the applications)
  – Flows: Only 2 types, TCP (with some variants) or UDP, each with fixed characteristics
  – Communication medium API: An application needs to know the PoA and port the destination application is attached to in order to allocate a flow; there is no means to express the desired properties of the flow
  – Objects: Vary depending on the application protocol used
  – Application connection: Applications have to know in advance which application protocol is going to be used; authentication is done through separate protocols
  – Application protocol: Many protocols and encodings, tailored to different purposes
[Figure: the same diagram for the Internet. Application names do not exist, so an IP address or domain name is used; the handle is a well-known port and is no longer local; the flow is a TCP connection or UDP flow; the application connection is not a generic mechanism and is partially provided through different protocols; the application protocols are many: HTTP, SMTP, FTP, Telnet, RTP, SNMP, SSH, XMPP, …]
Distributed applications in RINA
• For A and B to communicate, they need:
  – Application process naming: Complete application naming theory; no addresses internal to the communication medium are exposed to apps
  – Flows: Flows can have a myriad of characteristics, tailored to different application requirements
  – Communication medium API: Request allocation of flows to other applications by name; request desired properties for each flow
  – Objects: Each application decides on their contents and encoding
  – Application connection: Generic application connection establishment procedure, where different authentication policies can be plugged in
  – Application protocol: A single application protocol that can have multiple encodings: CDAP
[Figure: the same diagram for RINA. Both ends are identified by application names; each end holds a local handle (portId); flows can have different QoS characteristics; the application connection is a generic mechanism with different authentication policies; the application protocol is CDAP.]
RINA API
• The RINA API needs to differ in many ways from conventional Internet operations (usually “sockets”)
  – Application flows connect named Applications, not addresses/ports
  – RINA takes responsibility for locating the destination of a flow request, whereas current practice is to use a macro-for-the-address mapping (e.g., DNS) and then use the absolute address returned plus a “known port”
    • RINA Applications are “registered” in order to be found by their name
  – Applications can be reached at multiple points (AEs, more later)
  – Applications can reject a request for a flow before it is created, and can then authenticate the requestor before establishing a connection
  – RINA allows a flexible specification of the requested quality/properties of a flow
  – RINA transport occurs in application-defined units (“Service Data Units” or SDUs), rather than as a stream of unstructured bytes that forces applications to do their own delimiting into meaningful units
• Current APIs don’t provide access to the full set of RINA benefits
  – Though it is not strictly necessary to have a common cross-system API for RINA, it would still be a Good Thing to have, as sockets was for IP
Naming – Points to Remember
• The Internet does not name applications
  – The Internet doesn’t really name nodes/applications: everything addressable is reached using “absolute addresses” (IP addresses, ports)
  – DNS is a name-to-number mapping, but applications contact other applications using the absolute addresses returned
    • There is no virtual addressing (NAT is arguably a step toward it, but interacts in complex ways with DNS)
• RINA is different
  – Applications are named
    • There can be multiple simultaneously executing instances of an application, so they must be distinguished. The “application instance” is an integral part of the application name.
    • An application may contain multiple “Application Entities” (next slide)
  – Applications do not know each other’s “addresses”, only names
Application naming
• The Application Process Model
  – Application Process Name: the name of the app
  – Application Process Instance: differentiates specific instances of the same app
  – Application Entity Name: the part of the application concerned with communication. Associated with a subset of all the existing application objects
  – Application Entity Instance: refers to a particular instantiation of an application entity (that is, a particular instance of a communication associated with a concrete instantiation of an application protocol and a set of objects)
[Figure: an Application containing two Application Entities]
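The four naming components listed above can be pictured as a simple record. The sketch below is illustrative only: the class name `ApplicationName`, its field names, the helper `same_application` and the example values are all assumptions made for this example, not definitions taken from a RINA specification.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class ApplicationName:
    """Hypothetical record holding the four naming components."""
    process_name: str                       # Application Process Name
    process_instance: Optional[str] = None  # Application Process Instance
    entity_name: Optional[str] = None       # Application Entity Name
    entity_instance: Optional[str] = None   # Application Entity Instance

    def same_application(self, other: "ApplicationName") -> bool:
        # Two names denote the same application when the process names match,
        # whichever instance or AE is being addressed.
        return self.process_name == other.process_name


# Example values loosely modeled on the Gmail server figure (assumed, not exact).
server_http = ApplicationName("gmail-server", "1", "http", "1")
server_drda = ApplicationName("gmail-server", "1", "drda", "4")
other_instance = ApplicationName("gmail-server", "2", "http", "2")
```

The two AEs of instance 1 compare as different names but as the same application, which is the distinction the model is meant to capture.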
[Figure: application naming example. On the public Internet, a Browser Application (Instance 1) running the Gmail app (Instance 1) uses HTTP AE instances connected by TCP connections to HTTP AE instances of Gmail Server Application Instances 1 and 2. In the private network, DRDA AE instances of the Gmail Server application instances are connected by TCP connections to DRDA AE instances of DB Server application instance 1.]
Flows
• Instantiation of a communication service between applications
  – A flow is locally identified by an app through the use of a port-id
  – Flows transport well-defined units of application data (SDUs, Service Data Units)
• A flow has some externally visible properties:
  – Bandwidth related
    • Average bandwidth
    • Average SDU bandwidth
    • Peak BW duration
    • Peak SDU BW duration
  – Undetected Bit Error Rate
  – Partial delivery of SDUs allowed?
  – In-order delivery of SDUs required?
  – Maximum allowable gap between SDUs?
  – Maximum delay
  – Maximum jitter
The IPC API
• Presents the service provided by a DIF: a communication flow between applications, with certain quality attributes.
• 6 operations:
  • portId _allocateFlow(destAppName, List<QoSParams>)
  • void _write(portId, sdu)
  • sdu _read(portId)
  • void _deallocate(portId)
  • void _registerApp(appName, List<difName>)
  • void _unregisterApp(appName, List<difName>)
• QoSParams are defined in a technology-agnostic way
  • Bandwidth-related, delay, jitter, in-order-delivery, loss rates, …
• Aid to adoption: faux sockets API.
  • Presents the sockets API to the applications, but internally maps the calls to the IPC API
  • Current applications can be deployed in RINA networks untouched, but won’t enjoy all RINA features
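To make the shape of the six operations concrete, here is a minimal in-memory sketch. It is a toy under stated assumptions, not either prototype: the class `ToyDif` is hypothetical, a single object stands in for a whole DIF, QoS parameters are accepted but ignored, and `allocate_flow` returns both port-ids only so that the example can play both ends of the flow.

```python
from collections import deque


class ToyDif:
    """In-memory stand-in for a DIF (hypothetical; for illustration only)."""

    def __init__(self, dif_name):
        self.dif_name = dif_name
        self._registered = set()  # names of registered applications
        self._flows = {}          # port_id -> (peer port_id, inbound SDU queue)
        self._next_port = 1

    def register_app(self, app_name):
        self._registered.add(app_name)

    def unregister_app(self, app_name):
        self._registered.discard(app_name)

    def allocate_flow(self, dest_app_name, qos_params=None):
        # Flows are requested by destination application name, never by address.
        if dest_app_name not in self._registered:
            raise LookupError(f"{dest_app_name} is not registered in {self.dif_name}")
        local, remote = self._next_port, self._next_port + 1
        self._next_port += 2
        self._flows[local] = (remote, deque())
        self._flows[remote] = (local, deque())
        return local, remote  # assumption: a real API returns only the local port-id

    def write(self, port_id, sdu):
        peer, _ = self._flows[port_id]
        self._flows[peer][1].append(bytes(sdu))  # each SDU stays a separate unit

    def read(self, port_id):
        _, inbox = self._flows[port_id]
        return inbox.popleft() if inbox else None

    def deallocate(self, port_id):
        peer, _ = self._flows.pop(port_id)
        self._flows.pop(peer, None)


dif = ToyDif("demo.DIF")
dif.register_app("echo-server")
client_port, server_port = dif.allocate_flow("echo-server", {"delay_ms": 0})
dif.write(client_port, b"hello")
dif.write(client_port, b"world")
first = dif.read(server_port)
second = dif.read(server_port)
```

Note that the reader gets back the two SDUs as written, not a merged byte stream, which is the main behavioral difference from a TCP socket.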
IPC API Implementation
i2CAT/TSSG Prototype: General design
• Design goal: Portability to multiple operating systems (take advantage of Java)
• The RINA Library is part of the application, and provides both a Sockets and a Native RINA API (these can be part of the same library or split into two packages).
• The IPC Manager is the point of entry to the “RINA stack” running on the computer. It hosts the IPC Processes, manages their lifecycle (creation, deletion) and acts as a broker between the RINA Library and the IPC Processes.
• Local TCP connections are the means of communication between Apps (running the RINA Library) and the IPC Manager.
• Use of blocking I/O: one thread per TCP connection
[Figure: a Java Application containing the RINA App Library (Sockets API and Native RINA API) talks over local TCP connections to the IPC Manager, which hosts IPC Process 1 and IPC Process 2.]
IPC API Implementation
i2CAT/TSSG Prototype: behavior of a “client” RINA application
[Sequence diagram between the RINA App Library and the IPC Manager, reconstructed by phase]
Flow Allocation
  – RINA App Library: Open a new socket; send CDAP M_CREATE message with a FlowService object
  – IPC Manager: Map App Name to DIF, find an IPC Process that is a member of the DIF, invoke allocateFlow; send CDAP M_CREATE_R message with the FlowService object
Data transfer
  – RINA App Library: Send delimited SDU (byte[]) to deliver data
  – IPC Manager: Cause the IPC Process to transfer the data over the flow
  – IPC Manager: Send delimited SDU (byte[]) to deliver data
  – RINA App Library: Keep data in buffer until read by the app, or notify app
Flow Deallocation
  – RINA App Library: Close socket
  – IPC Manager: Cause the IPC Process to unallocate the flow; close socket
  – IPC Manager: Received delete flow request; close socket
  – RINA App Library: Close socket (on response message or timer)
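Because the library and the IPC Manager exchange SDUs over a local TCP connection, which is a byte stream, each SDU has to be delimited. The slides do not give the delimiting format, so the sketch below assumes a 4-byte big-endian length prefix purely for illustration; `delimit` and `Undelimiter` are hypothetical names.

```python
import struct


def delimit(sdu: bytes) -> bytes:
    """Prefix an SDU with its length (4-byte big-endian; an assumed format)."""
    return struct.pack(">I", len(sdu)) + sdu


class Undelimiter:
    """Rebuilds whole SDUs from arbitrary chunks of a byte stream."""

    def __init__(self):
        self._buffer = bytearray()

    def feed(self, chunk: bytes) -> list:
        """Add received bytes; return every SDU completed by this chunk."""
        self._buffer.extend(chunk)
        sdus = []
        while len(self._buffer) >= 4:
            (length,) = struct.unpack(">I", self._buffer[:4])
            if len(self._buffer) < 4 + length:
                break  # the rest of this SDU has not arrived yet
            sdus.append(bytes(self._buffer[4:4 + length]))
            del self._buffer[:4 + length]
        return sdus


# Two SDUs sent back to back, received in two chunks that split the first SDU.
stream = delimit(b"first") + delimit(b"second SDU")
receiver = Undelimiter()
received = receiver.feed(stream[:7]) + receiver.feed(stream[7:])
```

The receiver yields nothing for the first chunk and both SDUs for the second, so the application sees SDU boundaries regardless of how TCP splits the bytes.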
IPC API Implementation
i2CAT/TSSG Prototype: behavior of a “server” RINA application (I)
[Sequence diagram between the RINA App Library and the IPC Manager, reconstructed by phase]
App registration
  – RINA App Library: Open a new socket; send CDAP M_START message with an AppRegistration object
  – IPC Manager: Update IDD table and related flow allocator directory(ies). RegisterApp contains source app naming info, an optional list of DIF names and the socket number
  – IPC Manager: Send CDAP M_START_R message (success or not, reason)
  – RINA App Library: Start a new server socket at port X; listen for incoming requests
Flow allocation (for each new incoming flow request)
  – IPC Manager: On an incoming flow allocation request, if the destination app is registered, open a new socket to port X; send CDAP M_CREATE message with a FlowService object
  – RINA App Library: Start a new thread for the flow; decide whether to accept the connection; send CDAP M_CREATE_R message with the FlowService object
  – IPC Manager: Cause the IPC Process to send the allocate response back
IPC API Implementation
i2CAT/TSSG Prototype: behavior of a “server” RINA application (II)
[Sequence diagram between the RINA App Library and the IPC Manager, reconstructed by phase]
Data transfer
  – RINA App Library: Delimited SDU to write data (byte[])
  – IPC Manager: Cause the IPC Process to transfer the data over the flow
  – IPC Manager: Delimited SDU to read data (byte[])
  – RINA App Library: Keep data in buffer until read by the app, or notify app
Flow Deallocation
  – IPC Manager: Incoming flow deallocation request; socket close
  – RINA App Library: Socket closed; close socket, stop thread
  – IPC Manager: Cause the IPC Process to send the deallocate response back
App unregistration
  – RINA App Library: Close socket
  – IPC Manager: Update IDD table and related IPC Process directory(ies); close socket
  – RINA App Library: On timer or directly: close socket, close server socket
  – IPC Manager: On timer, if not already closed, close socket
DISTRIBUTED APPLICATIONS
Common Application Establishment Phase and CDAP
CACEP: Common Application Connection Establishment Phase
• Once application processes have a communication flow between them, they have to set up an application connection before they can exchange any further information.
• The application connection allows the two communicating apps to:
  – Exchange naming information with their apposite, optionally authenticating it
  – Agree on an application protocol and/or syntax version for the application data exchange phase
[Figure: four panels showing Appl. Process A and Appl. Process B joined by a flow through a DIF]
1) Message 1: M_CONNECT (srcName, destName, credentials, proto, syntax version)
2) Messages 2 to N: optional messages exchanging authentication information
3) Message N+1: M_CONNECT_R (result, reason, options)
4) Application data transfer phase: the processes exchange data using an application protocol
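The four steps above can be read as a small state machine on the initiating side. The sketch below is an illustration under assumptions, not the CDAP specification: the class `CacepInitiator`, the dictionary message layout, the convention that `result == 0` means success, and the `AUTH_CHALLENGE` message name are all hypothetical.

```python
from enum import Enum, auto


class State(Enum):
    IDLE = auto()
    AWAIT_REPLY = auto()  # M_CONNECT sent, waiting for M_CONNECT_R
    CONNECTED = auto()    # application data transfer phase
    FAILED = auto()


class CacepInitiator:
    """Hypothetical initiator-side state machine for the exchange above."""

    def __init__(self, src_name, dest_name):
        self.src_name, self.dest_name = src_name, dest_name
        self.state = State.IDLE

    def connect(self, credentials=None, proto="CDAP", syntax_version=1):
        """Step 1: build the M_CONNECT message and start waiting."""
        if self.state is not State.IDLE:
            raise RuntimeError("connection already started")
        self.state = State.AWAIT_REPLY
        return {"op": "M_CONNECT", "srcName": self.src_name,
                "destName": self.dest_name, "credentials": credentials,
                "proto": proto, "syntaxVersion": syntax_version}

    def on_message(self, msg):
        """Steps 2 and 3: absorb optional auth messages until M_CONNECT_R."""
        if self.state is not State.AWAIT_REPLY:
            raise RuntimeError("unexpected message in state " + self.state.name)
        if msg["op"] == "M_CONNECT_R":
            # Assumption: result 0 means the connection was accepted.
            self.state = State.CONNECTED if msg["result"] == 0 else State.FAILED
        # Any other message is treated as part of the authentication exchange.
        return self.state


a = CacepInitiator("A", "B")
request = a.connect(credentials="secret")
after_auth = a.on_message({"op": "AUTH_CHALLENGE"})
after_reply = a.on_message({"op": "M_CONNECT_R", "result": 0, "reason": ""})
```

Only after reaching `CONNECTED` would the application start step 4, the data transfer phase.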
CDAP
• The Common Distributed Application Protocol (CDAP) is the application protocol used by IPC Processes to exchange shared state (IPC Processes are Application Processes)
• It is also recommended that all RINA applications use it for exchanging shared state (when anything but an amorphous flow of bytes is needed); legacy apps can use whatever they want
• The CDAP Specification defines the complete set of operations and messages, as well as their fields
  – Connection establishment (Connect, Disconnect, authentication)
  – Object operations: create, delete, read, write, start, stop
• The set of objects and the meaning of operations are not dictated by CDAP proper; that is an application concern
  – IPC Processes are applications, and manipulate a set of objects, but none of them are dictated by CDAP
• Messages can be encoded in any agreed-upon way
  – As long as the applications agree, e.g., via the CACEP exchange
  – We currently use GPB, and have experimented with JSON
CDAP operates on objects
• All objects CDAP operates on have the following attributes:
  – ObjectClass
    • Class (data type and representation) of an object
  – ObjectName
    • An identifier of an object, unique within the objects of the same class
  – ObjectInstance
    • An alias of objectClass + objectName; uniquely identifies an object
  – ObjectValue
    • The actual value of the object, encoded as desired by the application
• All CDAP operations accept two modifiers, scope and filter, which enable a single message to affect multiple objects that form a hierarchy:
  – Scope: An integer indicating how many levels below the selected object the operation has to be applied.
  – Filter: A predicate function that evaluates whether the operation should be applied to each individual object within the scope.
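A short sketch may help show how scope and filter combine. It assumes objects are kept in a simple in-memory tree; `RibObject` and `select` are hypothetical names, and the choice to keep descending below an object that fails the filter is an assumption of this example rather than something the slides specify.

```python
class RibObject:
    """Minimal object with a class, a name, a value and child objects."""

    def __init__(self, object_class, object_name, value=None):
        self.object_class = object_class
        self.object_name = object_name
        self.value = value
        self.children = []

    def add(self, child):
        self.children.append(child)
        return child


def select(root, scope=0, filter_fn=None):
    """Return the objects an operation applies to: the selected object and
    its descendants up to `scope` levels below it that pass the filter."""
    selected = []

    def walk(node, depth):
        if filter_fn is None or filter_fn(node):
            selected.append(node)
        if depth < scope:
            for child in node.children:
                walk(child, depth + 1)

    walk(root, 0)
    return selected


flows = RibObject("flow set", "flows")
f1 = flows.add(RibObject("flow", "flow-1", {"state": "active"}))
f2 = flows.add(RibObject("flow", "flow-2", {"state": "idle"}))
f1.add(RibObject("counter", "flow-1-sdus", 10))


def is_active_flow(obj):
    return obj.object_class == "flow" and obj.value["state"] == "active"


scope0 = [o.object_name for o in select(flows)]
scope1 = [o.object_name for o in select(flows, scope=1)]
active = [o.object_name for o in select(flows, scope=1, filter_fn=is_active_flow)]
```

With scope 0 only the selected object is affected; scope 1 adds its direct children; the filter then narrows that set to the active flow.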
CDAP, AEs and the OIB/RIB
• All the objects an Application Process knows about are locally “stored” in the Object Information Base/Resource Information Base (OIB/RIB).
  – The RIB may be an actual database, or just a logical representation of all the information known by an application process.
• In RINA there’s only a single application protocol: CDAP. Then why are there different AEs?
  – Each AE is able to operate on a subset of the RIB
[Figure: Application Processes 1, 2 and 3, each with its own OIB/RIB and AE instances (AE type 1 Instance 1, AE type 2 Instance 1); AE instances in different processes communicate using CDAP.]
CDAP Implementation
• CDAP messages comprise a sequence of 1 or more fields
  – The one always-present field is the message type
• Each field has an identifying name or numeric tag (depending on encoding), and a value
• The field names or tag values (for GPB encoding), value types, and presence/absence of particular fields in CDAP messages of each type are defined in the CDAP Specification
• One supported type for an object value is an embedded message, not understood by CDAP itself, but transported unchanged to the apposite
  – The message declarations for IPC Process object values are not part of CDAP, but part of the IPC Process Object Dictionary definition. Other applications define their own object types
CDAP Implementation (cont.)
• For Google Protocol Buffers (GPB) syntax, a freely-available compiler can produce code in several languages to construct a valid CDAP message and to access the fields of one: https://developers.google.com/protocol-buffers/
  – .proto files describe the field values and types
  – GPB is being used because of its simplicity, compact representation, support (Google uses it heavily), a freely-available high-quality tool, a simple definition language, and general acceptance by the developer community
  – XML, ASN.1, JSON, or other representations would also work as a concrete syntax to encode CDAP messages
• i2CAT’s implementation uses Java code produced by the Google protoc GPB compiler
• TRIA’s implementation uses a table-driven parser/generator that also accepts/generates JSON
Example Message Definition
message qosCube_t { // a QoS cube specification
    required uint32 qosId = 1;                    // identifies the QoS cube
    optional string name = 2;                     // a human-readable name for the QoS cube
    optional uint64 averageBandwidth = 3;         // in bytes/s, a value of 0 indicates 'don't care'
    optional uint64 averageSDUBandwidth = 4;      // in bytes/s, a value of 0 indicates 'don't care'
    optional uint32 peakBandwidthDuration = 5;    // in ms, a value of 0 indicates 'don't care'
    optional uint32 peakSDUBandwidthDuration = 6; // in ms, a value of 0 indicates 'don't care'
    optional double undetectedBitErrorRate = 7;   // a value of 0 indicates 'don't care'
    optional bool partialDelivery = 8;            // indicates if partial delivery of SDUs is allowed or not
    optional bool order = 9;                      // indicates if SDUs have to be delivered in order
    optional int32 maxAllowableGapSdu = 10;       // indicates the maximum gap allowed in SDUs, a gap of N SDUs is considered the same as all SDUs delivered. A value of -1 indicates 'Any'
    optional uint32 delay = 11;                   // in milliseconds, indicates the maximum delay allowed in this flow. A value of 0 indicates 'don't care'
    optional uint32 jitter = 12;                  // in milliseconds, indicates the maximum jitter allowed in this flow. A value of 0 indicates 'don't care'
}
DISTRIBUTED APPLICATIONS
SDU Protection
SDU Protection
• Applications may have different levels of trust in the communication media they use
  – They need a way to protect the SDUs they send through the flows
• The SDU Protection module protects outgoing SDUs and unprotects incoming SDUs
• It can perform the following functions (configurable through policies)
  • Encryption (integrity and confidentiality)
  • Compression
  • Error detection (CRCs, FECs)
  • Time To Live
[Figure: App A and App B exchange SDUs over a flow provided by the medium that enables applications to communicate. Inside App A, an SDU Protection module sits between the application and the flow: unprotected SDUs on the application side, protected SDUs on the flow side, for both outbound and inbound SDUs.]
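As a rough illustration of a protect/unprotect pair, the sketch below combines three of the listed functions: compression, error detection and a time to live. Everything about the layout is an assumption made for this example (a 2-byte header with TTL and a compression flag, then the body, then a CRC-32 trailer), `SduProtection` is a hypothetical name, and encryption is left out because the Python standard library provides no cipher.

```python
import struct
import zlib


class SduProtection:
    """Toy SDU Protection module: optional compression, CRC-32 and a TTL."""

    def __init__(self, compress=True, initial_ttl=16):
        self.compress = compress        # policy: compress outgoing SDUs?
        self.initial_ttl = initial_ttl  # policy: TTL stamped on outgoing SDUs

    def protect(self, sdu: bytes) -> bytes:
        body = zlib.compress(sdu) if self.compress else sdu
        header = struct.pack(">BB", self.initial_ttl, int(self.compress))
        crc = zlib.crc32(header + body)
        return header + body + struct.pack(">I", crc)

    def unprotect(self, pdu: bytes) -> bytes:
        if len(pdu) < 6:
            raise ValueError("protected SDU too short")
        header, body, trailer = pdu[:2], pdu[2:-4], pdu[-4:]
        (crc,) = struct.unpack(">I", trailer)
        if zlib.crc32(header + body) != crc:
            raise ValueError("CRC mismatch: SDU corrupted")
        ttl, compressed = struct.unpack(">BB", header)
        if ttl == 0:
            raise ValueError("TTL expired")
        return zlib.decompress(body) if compressed else body


module = SduProtection(compress=True, initial_ttl=8)
original = b"hello " * 10
protected = module.protect(original)
recovered = module.unprotect(protected)
```

Each function is switched by a constructor argument, mirroring the idea that the behavior of the module is set through policies rather than fixed.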
IPC PROCESS
Block diagram, architecture reference vs. implementation
Levels of Abstraction (Abstraction is Invariance)
[Figure: levels listed in order of decreasing abstraction, ranging from invariant at the top to variable at the bottom]
Reference Model
Service Definitions
Protocols
Procedures
Policies
Implementation
IPC Process
• The IPC Process is an entity that provides IPC services for applications running on the same system
  – It is an application, and uses RINA application operations to do everything it does
  – It may or may not be an “OS Process”
  – There is no set model for how to implement it, and there can be very different implementations, based on OS, scale, and many other concerns
  – In some implementations, it will become part of the OS, just as IP networking is now
  – In some implementations, it will operate as “middleware”, atop the OS and its normal networking layer
• All IPC Processes do similar things. WHAT they do is described in the Reference Architecture, but there are many feasible Implementation Architectures for HOW those functions get done. We’ll examine a few today
Block Diagram (Reference Arch.)
• You’ve seen the RINA Reference Architecture (RA) partitioning of the IPC Process
  – This describes the basic mechanisms and what they communicate among themselves to perform the total functionality ascribed to an IPC Process
• Implementers use the RA as a guide to create an Implementation Architecture
  – Driven by their particular requirements, implementation target, and end-use
    • Language, use of OS features, flow of control approach, etc., can all be different, but they all need to implement the RA
  – Modules may be different, but ALL RA functions will be present in a complete implementation, and will communicate with the same functions as they do in the RA
  – There can also be multiple different implementations of the same Implementation Architecture (e.g., ports to different OSs)
Block Diagram (Reference Arch.)
[Figure: three systems (Host, Router, Host), each containing IPC Processes and a Mgmt Agent, joined by DIFs; Appl. Processes on the hosts use the top DIF through the IPC API. One IPC Process is expanded into three groups of functions, ordered by increasing timescale (functions performed less often) and complexity:
  – Data Transfer: SDU Delimiting, Data Transfer, Relaying and Multiplexing, SDU Protection
  – Data Transfer Control: Transmission Control, Retransmission Control, Flow Control (one set per flow, coupled to Data Transfer through a State Vector)
  – Layer Management: RIB Daemon, RIB, CDAP Parser/Generator, CACEP, Enrollment, Flow Allocation, Resource Allocation, Forwarding Table Generator (Routing), Authentication
Other labels: IPC Resource Mgt., Inter DIF Directory, SDU Protection, Multiplexing, IPC Mgt. Tasks, Other Mgt. Tasks, Application Specific Tasks.]
TRIA IMPLEMENTATION
Overall Goals and Approach
• Provide a framework to test and debug the new protocols
  – Use a single-threaded state machine model to simplify locking and increase repeatability
  – Operate entirely at user (application) level for easier debugging
• Anticipate the desire to move some portions (which ones were as yet unknown) into the OS kernel eventually
  – Coded in C
  – Memory/buffering/time-management operations similar to those available inside the UNIX/Linux OS
• Anticipate future porting to multiple targets
  – Use standard POSIX/UNIX capabilities common on all or most platforms, avoid extensions that impair portability
  – Test on MacOS (Mach-based UNIX) and Linux
  – Test on large and small systems (Intel and ARM-based)
Major Parts of the Implementation
• Infrastructure
  – Main program, select (event) loop, state machine framework, file management, non-blocking I/O, delimiting, pseudo-files (internal IPC, Shim DIF), memory and message pools, timers, startup/shutdown, configuration parsing, logging and debug utilities, GPB and JSON utilities
• CDAP
  – Table-driven CDAP msg. parse/build, connection state machine
• RIB
  – Node allocation, lookup, RIB Daemon operations on nodes
  – Object Manager mechanism for operations on objects
• IPC Process
  – Per-DIF management (RIB Daemon, enrollment, startup), FA, FAI, DTP/DTCP, Network Management client interface, Shim DIF, routing, IPC Process-specific Object Managers
• RINA native API Library
• Tests, including RINABAND
High-Level Block Diagram
[Figure: the IPCMGR Process (a UNIX/Linux process) contains the Per-DIF Manager, Flow Allocator and Flow Allocator Instances, EFCP Instances, RMT, Routing Computation, the RIB database, the SHIM DIF and a Logger; it reaches (N-1) DIF flows through a device driver file, an (N-1) FAI socket, an I/O device or a RINA DIF. The User Application, NetMgr App. and IDD Application are separate UNIX/Linux processes that attach through the RINA API. Also shown: NetMgr Agent/Directory Server and an Authentication Database.]
I2CAT/TSSG IMPLEMENTATION
Overall Goals and Approach
• Provide an open source initial RINA implementation that can be used for education and quick prototyping, as well as to exercise and improve the RINA specs.
  – Easy to develop, OS-independent language: Java
  – Code structured to be modular and extensible: use OSGi as a component framework (Eclipse Virgo implementation)
  – Portable to different operating systems: only use OS-dependent Java features available in most OSs (sockets)
• Make it possible to set up relatively complex scenarios with few hardware resources
  – Use the TINOS protocol experimentation framework (developed by TSSG) in order to emulate multiple pieces of “hardware” within the same Java process.
• i2CAT/TSSG’s RINA implementation is part of the TINOS project, as one of the “protocol stacks” available.
  – Reuse of TINOS compile/build infrastructure
  – Maximize synergies between both projects: single development community (hosted at github)
  – Integration with TINOS will be easier (not done yet)
Major Parts of the Implementation
• Infrastructure
  – Virgo OSGi core (handles the lifecycle of the different components, called bundles in OSGi parlance), single thread pool, blocking I/O, configuration parsing (JSON library), sockets, Google Guava library (Java has no unsigned types, thanks!), Google Java GPB implementation, Java timers, delimiting, object encoding/decoding
• IPC Manager
  – RINA side of the IPC API, IPC Process lifecycle management, will host the management agent and IDD (not implemented yet), console service (local administration)
• “Normal IPC Process”
  – RIB, RIB Daemon, CDAP Parser/Generator, Enrollment task, Flow Allocator, Resource Allocator, EFCP, RMT, SDU Protection
• Shim IPC Process for IP Layers
  – Setup and management of TCP and UDP flows as per the shim DIF spec
• RINA Application Library
  – Native RINA API and faux sockets API
• Test applications
  – RINABand, Echo server & client, simple chat application
General design (I)
[Figure: one OS process (a Java VM instantiation) runs the Virgo OSGi Kernel, which hosts:
  – the IPC Manager, with a Console Service (local administration), an Application Service, IPC Process Lifecycle Management (“IRM”) and the IDD; it listens for local TCP connections at ports 32771 and 32766
  – the Normal IPC Process components: IPC Service, Delimiter, RIB Daemon, RMT, Encoder, CDAP Session Manager, Flow Allocator, Enrollment Task, GPB parser, Resource Allocator, EFCP, SDU Protection
  – the Shim IPC Process for IP: IPC Service and Flow Allocator; it listens for TCP connections and UDP datagrams at IPa:portb and carries flows to/from other shim IPC Processes
Client Application 1 (with the RINA Lib) opens a local TCP connection to port 32771 for each flow. Server Application 1 (with the RINA Lib) opens a local TCP connection to port 32771 for the registration and listens for local TCP connections at port X (dynamically assigned); each flow to server application 1 is a local TCP connection to port X. Applications run in their own OS process (Java VM instantiation). NOTE: there could be multiple “systems” within the same Java VM once fully integrated with TINOS.]
General design (II)
Why TINOS?
Larger experimentation scenarios with less infrastructure
• With TINOS, multiple nodes can be created within the same JVM, with different network connectivity to each other and to other JVMs (TINOS uses an adapted IP stack from JNode, and XMPP, for this)
[Figure: several Java Virtual Machines, some hosting multiple emulated nodes. Each node stacks Data Link, IP (JNode or the OS stack), a Shim DIF and a DIF; the JVMs are interconnected through a LAN, the public Internet and an XMPP network.]
ALTERNATIVE IMPLEMENTATIONS
Some Implementation Architectures with Interesting Properties
RINA in the OS Kernel
• Make RINA a “native” networking API
  – New/Extended OS system calls provide full RINA capability
  – Move (at least) DTP/DTCP into the OS kernel for speed
[Figure: Apps and the IPC Process run in application space and reach the OS kernel through new/extended OS API calls; the kernel holds the DTP/DTCP flow state, the RMT and the forwarding table above Network Devices 1 and 2. A “Network Device” might be a Shim DIF or a RINA DIF.]
RINA Split Between H/W and S/W
• RINA RMT/DTP performed in hardware
  – Software still does DTCP and the remainder of the IPC Process functions
  – Transiting PDUs need not be processed by software
[Figure: Apps and the IPC Process run in application space and use new/extended OS API calls; DTCP sits in the OS kernel; the DTP flow state, the RMT and the forwarding table sit in hardware/firmware, next to Network Interfaces 1 and 2.]
IPC PROCESS (CONTINUED)
RIB and RIB operations
RIB and RIB Operations
• The Resource Information Base (RIB) is a virtual object database
  – Each AE projects a view over the underlying objects
  – The RIB holds the shared state of the communication instances between applications
• The IPC Processes communicate by exchanging operations on RIB objects
  – The only operations are: create, delete, read, write, start, and stop
  – These operations cause objects to perform appropriate actions (defined in an object dictionary)
  – There is a particular tree of RIB objects defined for IPC Process use (any other application can define its own tree)
A Few Thoughts on the RIB Daemon
• A generalization of Event Management and Routing Update
  – Elsewhere (circa 1988) I said Event Management is the hypothalamus of network management and looks like this:
[Figure: Event Management block diagram. Received events (RcvEvents) feed Logging and a Subscription Service driven by a subscription definition file (Subscript Def File); Add/Delete Subscription and Filter control inputs adjust it; output goes to other management applications.]
A Few Thoughts on the RIB Daemon
• Generalizing routing update adds a capability for managing a periodic and/or event-driven data distribution and replication strategy.
[Figure: the same Event Management diagram, extended with a Replication Optimizer and Write Subscriptions. Annotation: does this imply an opportunity for a journaling RIB for some data?]
A Few Thoughts on the RIB Daemon
• So, re-arranging and re-labeling for our current problem:
[Figure: RIB Daemon block diagram. Incoming CDAP PDUs enter CDAP Processing, which feeds Logging and the Subscription Service (with its Subscript Def File); tasks add or delete subscriptions; Event Subscriptions and Write Subscriptions pass through a Replication Optimizer; results go to the requesting tasks. Reads and writes go to an actual store or to other tasks or task data structures, e.g. the DT state vector. Annotation: an opportunity for a journaling RIB for some data?]
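The subscription idea in these three slides can be sketched as a tiny publish/subscribe service. This is only an illustration of the pattern: the class `ToyRibDaemon`, its method names and the tuple-based log are hypothetical, and the replication optimizer and filter control from the figures are left out.

```python
from collections import defaultdict


class ToyRibDaemon:
    """Toy subscription service: tasks subscribe to object names and are
    notified when an operation on that object is processed."""

    def __init__(self):
        self._subscribers = defaultdict(list)  # object name -> callbacks
        self.log = []                          # stands in for the Logging box

    def subscribe(self, object_name, callback):
        self._subscribers[object_name].append(callback)

    def unsubscribe(self, object_name, callback):
        self._subscribers[object_name].remove(callback)

    def process(self, operation, object_name, value=None):
        """Handle one operation, as if decoded from an incoming CDAP PDU."""
        self.log.append((operation, object_name))
        # Iterate over a copy so callbacks may unsubscribe while being notified.
        for callback in list(self._subscribers[object_name]):
            callback(operation, object_name, value)


daemon = ToyRibDaemon()
seen = []


def on_neighbor_update(operation, object_name, value):
    seen.append((operation, value))


daemon.subscribe("/daf/management/neighbors", on_neighbor_update)
daemon.process("write", "/daf/management/neighbors", ["B"])
daemon.process("write", "/daf/management/naming/address", 17)
```

Only the first operation reaches the subscriber, while both are logged, which mirrors the split between the Logging and Subscription Service boxes.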
RIB Implementation
• Our protocol exchanges refer to objects by name and/or object-id (a number)
  – We haven’t started using object-ids yet, but the intent was to make the protocol exchanges more compact
  – We will standardize the object names/ids that need to be the same for consistent RINA implementations through PSOC
• The RIB appears as a tree-structured database with objects at its leaf nodes. Leaves are named with the full absolute pathname from the root to the leaf.
• We operate on an object by sending the operation and the operand object’s name/id (and a value, if appropriate)
  – The reference model has a “RIB Daemon” that performs the operation; in practice, this may be subsumed into other entities
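A minimal sketch of a RIB addressed by absolute pathnames follows. It assumes a flat dictionary keyed by the full pathname, which is one simple way to get tree-like behavior; the class `ToyRib` and its `children` helper are hypothetical, the start and stop operations are omitted, and the object names are borrowed from the object tree shown elsewhere in these slides.

```python
class ToyRib:
    """Toy RIB: objects live at leaves and are keyed by absolute pathname."""

    def __init__(self):
        self._objects = {}

    def create(self, name, value=None):
        if name in self._objects:
            raise KeyError(f"{name} already exists")
        self._objects[name] = value

    def delete(self, name):
        del self._objects[name]

    def read(self, name):
        return self._objects[name]

    def write(self, name, value):
        if name not in self._objects:
            raise KeyError(f"{name} does not exist")
        self._objects[name] = value

    def children(self, prefix):
        """List the names of all objects stored below an interior node."""
        prefix = prefix.rstrip("/") + "/"
        return sorted(n for n in self._objects if n.startswith(prefix))


rib = ToyRib()
rib.create("/daf/management/naming/address", 16)
rib.create("/daf/management/operationalstatus", "started")
rib.write("/daf/management/naming/address", 17)
address = rib.read("/daf/management/naming/address")
under_naming = rib.children("/daf/management/naming")
```

Keying by full pathname keeps lookups trivial; a real implementation that needs scoped operations would more likely keep explicit parent and child links.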
Naming conventions for IPC Processes
• Application names:
  – Can be anything; it would probably be useful to give some indication of physical location (to facilitate management, for no other purpose).
• Application instances:
  – Not used in principle, since in normal operation there should be no need to connect to a concrete instance of an application process (default to 1).
• Two Application Entities:
  – Management AE: Flows established to/from here are used to establish application connections to neighboring IPC Processes and exchange layer management information using CDAP.
  – Data Transfer AE: Flows established to/from here are used by the RMT to transport “data transfer SDUs”.
Current tree of objects
/
/daf
/daf/management
/daf/management/operationalstatus
/daf/management/naming
/daf/management/naming/applicationprocessname
/daf/management/naming/address
/daf/management/naming/whatevercastnames
/daf/management/neighbors
/dif
/dif/ipc
/dif/ipc/datatransfer
/dif/ipc/datatransfer/constants
/dif/management
/dif/management/flowallocator
/dif/management/flowallocator/qoscubes
/dif/management/flowallocator/directoryforwardingtableentries
/dif/resourceallocation
/dif/resourceallocation/flowallocator
/dif/resourceallocation/flowallocator/flows
/dif/resourceallocation/nminus1flowmanager
/dif/resourceallocation/nminus1flowmanager/nminus1flows
/dif/resourceallocation/pduforwardingtable
IPC PROCESS: Enrollment
Enrollment
• Enrollment is the process by which an IPC Process communicates with another IPC Process to join a DIF
– And acquires enough information to start operating as a member of the DIF
– After enrollment, the newly enrolled IPC Process is able to create and accept flows between it and other IPC Processes in the DIF
• Enrollment on the Internet
– For TCP/IP, mostly nonexistent or done by ad hoc/manual means (DHCP provides a bit of the required functionality)
– In IEEE 802.11 the procedure for joining a network is almost identical to what RINA predicts. The BSSID is a DIF-name.
– Similarly, there is enrollment in 802.1q (VLANs).
– These were developed independently, which is a confirmation of the theory.
Start at the Beginning: Joining a DIF
• A wants to join DIF beta, of which B is a member. First it needs to establish communication with beta, so A’s DIF Management task, using A’s IPC Manager (not shown), does an allocate(beta, as good a QoS as it can get). The name beta is a whatevercast name for the set containing the addresses of all members of beta, with a rule that returns the address of an IPC Process with a common (N-1)-DIF. The whatevercast name is resolved by the (N-1)-DIF.
• The allocate creates a flow between A and B. They exchange CDAP connect requests, followed by whatever authentication is required to establish an application connection between A and B (actually between A and beta: B is acting as an agent or representative for beta).
• Then A and B exchange initialization information. Primarily, B tells A what its DIF-internal name (address) is and populates A’s RIB with the current information on the DIF. We will come back to this.
[Figure: IPC Process A (wants to join the DIF of which B is a member) and IPC Process B (a member of DIF beta), each with a DIF Management task, connected over a common (N-1)-DIF. Exchanges shown: establish connection/authenticate, then initialization information.]
A is now a member of beta
• There is now an application connection between the IPC management components of A and B.
– All connections between members of a DAF are managed by their IPC management component.
– Any management component can send on the flows managed by IPC.
– All incoming PDUs are delivered to the RIB Daemon.
– The RIB Daemon is a subscription service, essentially a generalization of both routing update and event management. When any CDAP PDU arrives, it is logged and distributed to the tasks that have subscribed to be notified.
– The Flow Allocator subscribes to Create/Delete Flow Req. (The Flow Allocator will update the RIB after processing the request.)
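The subscription behaviour of the RIB Daemon can be illustrated with a small publish/subscribe sketch. The `RibDaemon` class, its method names and the `(opcode, object class)` subscription key are assumptions made for the example; the slides only say that incoming CDAP PDUs are logged and distributed to subscribed tasks.

```python
from collections import defaultdict


class RibDaemon:
    """Toy subscription service (hypothetical): logs each CDAP message, notifies subscribers."""

    def __init__(self):
        self._subscribers = defaultdict(list)  # (opcode, object class) -> callbacks
        self.log = []

    def subscribe(self, opcode, obj_class, callback):
        self._subscribers[(opcode, obj_class)].append(callback)

    def deliver(self, opcode, obj_class, obj_name, value=None):
        """Called for every incoming CDAP PDU."""
        self.log.append((opcode, obj_name))
        for callback in self._subscribers[(opcode, obj_class)]:
            callback(obj_name, value)


daemon = RibDaemon()
flow_requests = []
# The Flow Allocator subscribes to create/delete requests on flow objects.
daemon.subscribe("M_CREATE", "flow", lambda name, value: flow_requests.append(("create", name)))
daemon.subscribe("M_DELETE", "flow", lambda name, value: flow_requests.append(("delete", name)))

daemon.deliver("M_CREATE", "flow", "/dif/resourceallocation/flowallocator/flows/1", {"dest": "B"})
daemon.deliver("M_READ", "neighbors", "/daf/management/neighbors")
```

Both messages end up in the log, but only the flow creation reaches the Flow Allocator's callback.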
[Figure: A and B, each with an IPC Management component and a RIB Daemon, joined by an application connection over the (N-1)-DIF.]
Enrollment Exchange
• There are several enrollment situations that IPC Processes encounter when connecting, for example:
– An IPC Process that is not enrolled connects to an IPC Process that is not enrolled in a DIF: the two form a DIF
– An IPC Process that is not enrolled connects to an IPC Process that is already enrolled in a DIF: it joins the DIF
– An IPC Process that is enrolled makes a connection to a neighbor that is enrolled: they now have a new route for flows
• An IPC Process can be in either role, as initiator or target
• The information exchanged in some cases can be reduced to minimize enrollment time
Enrollment Procedure I
• When the New Member receives the M_Connect Response, it copies Current_Address to Saved_Address and sends
– M_Start Enrollment(address, Address_expiration_time, other data about New Member)
• /* The New Member is telling the Existing Member what it knows. Primarily this is derived from the address (NULL or not) and the expiration lifetime of the address if non-NULL. Since addresses are generally assigned for hours or minutes, tight time synchronization is not required. (Even for DIFs with fast turnover, fairly long assignment times are still prudent.) */
• The Existing Member sends
– M_Start_R Enrollment(address (potentially different), Application Process Name, Current_Address, Address_Expiration).
Enrollment Procedure II
• Using the information provided by the New Member, the Existing Member sends
– M_Create (zero or more) to initialize the Static and Near Static information required.
• When finished, and once the New Member has sent all necessary M_Create_Rs, the Existing Member sends
– M_Stop Enrollment (Immediate: Boolean)
• The New Member may Read any additional information not provided by the Existing Member.
– M_Read (zero or more)
– M_Stop_R Enrollment
• If the Immediate Boolean is True, the New Member is free to transition to the Operational state.
• If the Immediate Boolean is False, the New Member cannot transition to the Operational state until an M_Start Operation is received.
Enrollment Procedure III
• The New Member is free to Read any information not provided by the Existing Member. Once these are completed, the Existing Member sends:
– M_Start Operation
• The New Member sends
– M_Start_R Operation
• Invoke RIB Update of dynamic information, which will cause others to send data to the New Member.
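The three-part procedure can be read as a small state machine on the New Member's side. The sketch below is an illustration under stated assumptions: the class, the state names and the payload shapes (dictionaries and tuples) are hypothetical, and the optional M_Read exchange is left out. Only the message names and their order follow the slides.

```python
class NewMember:
    """Toy state machine (hypothetical) for the joining side of enrollment."""

    def __init__(self, address=None):
        self.address = address
        self.state = "CONNECTED"  # application connection (CACEP) already established
        self.rib = {}

    def start(self):
        self.state = "WAIT_START_R"
        return ("M_START", {"address": self.address})

    def on_message(self, opcode, payload=None):
        if opcode == "M_START_R" and self.state == "WAIT_START_R":
            self.address = payload["address"]  # may differ from the one we sent
            self.state = "WAIT_STOP"
        elif opcode == "M_CREATE" and self.state == "WAIT_STOP":
            name, value = payload
            self.rib[name] = value
            return ("M_CREATE_R", name)
        elif opcode == "M_STOP" and self.state == "WAIT_STOP":
            # Immediate=True: free to go operational without waiting for M_START.
            self.state = "OPERATIONAL" if payload["immediate"] else "WAIT_START_OPERATION"
            return ("M_STOP_R", None)
        elif opcode == "M_START" and self.state == "WAIT_START_OPERATION":
            self.state = "OPERATIONAL"
            return ("M_START_R", None)
        elif opcode == "M_START" and self.state == "OPERATIONAL":
            return None  # already started early: ignore
        else:
            raise ValueError(f"unexpected {opcode} in state {self.state}")
        return None


member = NewMember()
sent = member.start()
member.on_message("M_START_R", {"address": 25})
# The object value here is a made-up placeholder, not a real constant of the DIF.
member.on_message("M_CREATE", ("/dif/ipc/datatransfer/constants", {"example": 1}))
reply = member.on_message("M_STOP", {"immediate": False})
member.on_message("M_START")
```

A joining process with no address sends `address=None`, is assigned one in the M_START_R, and only becomes operational once M_START arrives because `immediate` was false.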
Example Message Sequence (skipping Application connection setup, CACEP)
• One IPC Process is a member of a DIF, another one is not
[Figure: message sequence between the Joining IPC Process and the Member IPC Process; the numbered steps are listed below.]
1 Joining → Member: M_START (Enrollment_Info_object{address=null})
2 Member: The joining IPC Process has no address, so it is not a member of the DIF. Assign a valid address and reply.
3 Member → Joining: M_START_R (ok, Enrollment_Info_object{address=25})
4 Joining: Got a positive response and an address. Wait for the M_STOP enrollment request; the RIB Daemon processes the M_CREATE messages.
4 Member: Send DIF static info (whatevercast names, data transfer constants, QoS cubes, supported policy sets) and dynamic info (neighbours, directory forwarding table entries) through a series of M_CREATE messages.
5 Member → Joining: M_CREATE (DIF_info1) … M_CREATE (DIF_infoN)
6 Member: Once all the information is sent, send the stop enrollment request (informing the enrollee whether it has to wait for the START operation request) and wait for the response.
7 Member → Joining: M_STOP (Enrollment{allowed_to_start_early=true})
8 Joining: Check if I got enough data to start. If more info is required, send M_READ requests on specific objects (not the case). I’m enrolled! Now, if I have a DIF in common with one or more of the neighbors (I’m multihomed), I could enroll with them as well (next slide).
9 Joining → Member: M_STOP_R (ok)
10 Member: Got STOP response. He’s enrolled! Send M_START message (no answer required).
11 Member → Joining: M_START (operationalStatus)
12 Joining: Ignore if started earlier, or start now (consider enrolled now).
Example Message Sequence (skipping Application connection setup, CACEP)
• Both IPC Processes are members of the same DIF
[Figure: message sequence between the Joining IPC Process (also a member) and the Member IPC Process; the numbered steps are listed below.]
1 Joining → Member: M_START (Enrollment_Info_object{address=25})
2 Member: The joining IPC Process has a valid address, so it is a member of the DIF. Reply.
3 Member → Joining: M_START_R (ok, Enrollment_Info_object{address=25})
4 Joining: Got a positive response and my address is still valid. Wait for the M_STOP enrollment request; the RIB Daemon processes the M_CREATE messages.
4 Member: Send the DIF’s dynamic info only (neighbours, directory forwarding table entries) through a series of M_CREATE messages.
5 Member → Joining: M_CREATE (DIF_info1) … M_CREATE (DIF_infoN)
6 Member: Once all the information is sent, send the stop enrollment request.
7 Member → Joining: M_STOP (Enrollment{allowed_to_start_early=true})
8 Joining: Check if I got enough data to start. If more info is required, send M_READ requests on specific objects (not the case). The member I’ve talked to is now my neighbor!
9 Joining → Member: M_STOP_R (ok)
10 Member: Got STOP response. He’s my neighbor! Send start message, no response required.
11 Member → Joining: M_START (operationalStatus)
12 Joining: Ignore if started earlier, or start now (consider enrolled now).
IPC PROCESS: Flow Allocation
Flow Allocator
• When an Application Process generates an Allocate request, the Flow Allocator creates a Flow Allocator Instance (FAI) to manage each new flow.
• The instance is responsible for managing the flow and deallocating the ports
– DTP/DTCP instances are deleted automatically after 2MPL with no traffic.
• When it is given an Allocate Request, it does the following:
[Figure: Allocate(Dest-Appl-Name, QoS parameters) arriving at the Flow Allocator, which has a Local Dir Cache and a Dir Forwarding Table.]
Details of the Allocation Data Flow: I
• Upon initialization, the FA subscribes to create/delete flow objects.
• The FAI is handed an allocate request. After determining that it is well formed, it must find the destination application.
• It consults the Directory Forwarding Table (dotted arrow). The table maps the dest-appl in the request to a “Next Place” to look for it (IPC Process @)
• That points to either a nearest neighbor management flow (if it is multihomed there will be more than one) or a connection allocated that does not go to a nearest neighbor, but uses the data transfer AE. This connection was created by the management task and is available to all tasks within the IPC Process.
[Figure: Allocate(dest-appl, desired_flow_properties) is handed to the FAI, which consults the Directory Forwarding Table (Appl-names → Next Place) and sends Create Flow(dest-appl, stuff) through IPC/RMT. The RIB Daemon holds the subscription to create/delete Flow objects; EFCP is shown alongside.]
Details of the Allocation Data Flow: II
• When a Create Flow Request arrives, the RIB Daemon forwards it to the FAI for inspection.
• If the FA determines that dest-appl is not here, then it consults the Directory Forwarding Table as before to determine where to send it next.
• If dest-appl is here, then . . .
[Figure: Create Flow(dest-appl, stuff) arrives through IPC/RMT and the RIB Daemon at the FA, which consults the Directory Forwarding Table (Appl-names → Next Place) and sends Create Flow(dest-appl, stuff) onward. EFCP is shown alongside.]
Details of the Allocation Data Flow: III
• When the Create Flow Req arrives, it is passed to the Flow Allocator.
• The Flow Allocator looks it up in the table and determines that dest-appl is here. It determines whether or not the requestor has access to dest-appl.
• If it does, then dest-appl is instantiated if necessary, and given an allocate indicate.
• It responds with an allocate confirm; if positive, then data transfer can commence.
• In either case, or earlier if access was denied, a Create Flow Resp is sent back with the appropriate response.
[Figure: Create Flow Req(dest-appl, stuff) arrives through IPC/RMT and the RIB Daemon at the FAI, which consults the Directory Forwarding Table (Appl-names → Next Place), gives the Dest Appl an Allocate Indicate, receives an Allocate Confirm and returns a Create Flow Resp. The Dest Appl then reads/writes through EFCP.]
Implementation of the Flow Allocator: Application registration and DirectoryForwardingTable
• The DirectoryForwardingTable maps ApNames to the addresses (@) of the IPC Processes where they are currently registered.
– Updated by local application registration events (through the IPC API)
– Updated by remote application registration events (through remote CDAP messages processed by the RIB Daemon)
– Updated by timers (to discard stale entries)
• Distributed database, with several strategies for implementation (the larger the DIF, the more complex it becomes)
– Compromise between the load of messages to update the database vs. the timeliness of the data in each DB
– Fully replicated vs. partially replicated
• Current implementation: simple, only for small DIFs.
– Fully replicated database (all the IPC Processes know about all the registered applications in the DIF)
– Each time a local application registers/unregisters, the FA sends a CDAP M_CREATE message to all its nearest neighbors
– Each time a new mapping is learned (from a remote update), if the value of that mapping changed, the FA sends a CDAP M_CREATE message to all its nearest neighbors, except for the one that notified the update
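The fully replicated strategy amounts to flooding with a "did the mapping change?" check that stops loops. The sketch below assumes in-process objects instead of CDAP messages and covers registration only (unregistration and timers are omitted); the class and method names are hypothetical.

```python
class FlowAllocator:
    """Toy fully replicated directory (hypothetical): ApName -> IPC Process address."""

    def __init__(self, address):
        self.address = address
        self.neighbors = []     # nearest-neighbor FlowAllocator objects
        self.directory = {}     # application name -> IPC Process address
        self.messages_sent = 0

    def register_local(self, ap_name):
        self._update(ap_name, self.address, learned_from=None)

    def on_remote_update(self, ap_name, address, sender):
        self._update(ap_name, address, learned_from=sender)

    def _update(self, ap_name, address, learned_from):
        if self.directory.get(ap_name) == address:
            return  # mapping unchanged: do not propagate again
        self.directory[ap_name] = address
        for neighbor in self.neighbors:
            if neighbor is not learned_from:
                self.messages_sent += 1  # stands in for one CDAP M_CREATE
                neighbor.on_remote_update(ap_name, address, sender=self)


def connect(a, b):
    a.neighbors.append(b)
    b.neighbors.append(a)


a, b, c = FlowAllocator(1), FlowAllocator(2), FlowAllocator(3)
connect(a, b)
connect(b, c)
connect(c, a)  # a triangle, so an update can come back around
a.register_local("RINABand")
```

After the registration, all three directories map `RINABand` to address 1, and the update that loops back to `a` is dropped because the mapping did not change.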
Implementation of the Flow Allocator (i2CAT): Management of flows and interaction with EFCP
[Figure: message sequence inside a DIF between the source Appl. Process, the Flow Allocators of the two IPC Processes, and the destination Appl. Process; the numbered steps are listed below.]
1 Source Appl. Process → source Flow Allocator: Allocate Request (destAPName, QoS Params)
2 Source Flow Allocator: map the request into policies and see if it is feasible; search for the destination application in the directory; create FAI; create DTP/DTCP instance
3 Source FAI → destination Flow Allocator: M_CREATE(Flow object)
4 Destination Flow Allocator: check access control and policies to see if the flow is feasible; create FAI
5 Destination FAI → destination Appl. Process: allocation_requested(srcApName)
6 Destination Appl. Process → destination FAI: allocation_response(result)
7 Destination FAI: create DTP/DTCP instance
8 Destination FAI → source FAI: M_CREATE_R(Flow object)
9 Source FAI → source Appl. Process: Allocate Response(result)
• When the flow has been established, 1 incoming and 1 outgoing queue are allocated at the layer boundary by the FAI
• Also, a new EFCP StateVector for the connection (1 per flow right now) is instantiated at the DataTransferAE, as well as 2 queues for queuing PDUs to/from the RMT
IPC PROCESS: Transport protocol: EFCP
EFCP: Error and Flow Control Protocol
• Based on delta-t with mechanism and policy separated.
– Naturally cleaves into Data Transfer and Data Transfer Control
• Data Transfer consists of tightly bound mechanisms
– Roughly similar to IP+UDP
• Data Transfer Control, if present, consists of loosely bound mechanisms.
– Flow control and retransmission (ack) control
• One or more instances per flow; policies driven by the QoS parameters.
– The Flow Allocator translates the QoS parameters into suitable policies.
– In parallel, might be used for things like p2p [sic] do.
– Used serially, avoids the need for a separate security connection as in IPsec.
• Comes in several syntactic flavors based on the length of (address, connection-endpoint-id and sequence number)
– Addresses: 8, 16, 32, 64, 128, variable.
– CEP-id: 8, 16, 32, 64
– Sequence: 4, 8, 16, 32, 64
[Figure: Data Transfer Protocol and Data Transfer Control sharing a State Vector.]
EFCP: separation of port allocation from synchronization
[Figure: a Connection between Connection Endpoints. Port Allocation (FA dialogue, IPC Process management) is tied to the Port-id; Synchronization (EFCP state machines, data transfer) is tied to the Connection Endpoint.]
• Separating port allocation from synchronization, unlike TCP, has interesting security implications (more on this later).
• Port Allocation state is created/deleted based on explicit requests
– Local applications through the IPC API (allocate/deallocate flows)
– Remote CREATE/DELETE Flow requests from other IPC Processes
• Synchronization state is refreshed every time a DTP/DTCP packet is sent/received
– If no packet is received after a certain amount of time, the state is discarded
Intro to Delta-T: Timer-based connection management
• All connections exist all the time; the protocol just needs to keep caches of the state for those that have carried traffic recently
– When a PDU is received for a certain connectionId, the state of the connection is refreshed
– After a certain amount of time with no traffic, the state is discarded
• What amount of time with no traffic is necessary to be able to safely discard the send/receive state and ensure that:
– No packets from a previous connection are accepted in a new connection
– The receiving side doesn’t close until it has received all the retransmissions of the sending side and can unambiguously respond to them
– A sending side must not close until it has received an Ack for all its transmitted data or allowed time for an Ack of its final retransmission to return before reporting a giveup failure.
Intro to Delta-T (II): Timer-based connection management
• MPL: Max time to traverse a network
• A: Max time the receiver will wait before sending an acknowledgement
• R: Max time a sender will keep retransmitting a packet
• deltaT = MPL + A + R
• Watson showed that send state can be safely discarded after a period of 3deltaT with no traffic, and receive state can be discarded after a period of 2deltaT with no traffic
[Figure: Sender/Receiver timeline illustrating MPL, A and R. PDU 1 is sent and acknowledged (PDU 1 ACK); PDU 2 is sent and then retransmitted twice.]
• No SYNs or FINs are necessary (compared to TCP) -> simpler, more robust
• Implication of Watson’s results:
– If MPL cannot be bounded, then there is no way to have reliable data transport; therefore it cannot be IPC
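As a worked example of the timer arithmetic, the sketch below computes deltaT and the two state-discard periods. The function names and the numeric values are illustrative assumptions; only the formulas (deltaT = MPL + A + R, 3deltaT for send state, 2deltaT for receive state) come from the slide.

```python
def delta_t(mpl, a, r):
    """deltaT = MPL + A + R (all values in seconds)."""
    return mpl + a + r


def state_timeouts(mpl, a, r):
    """Idle periods after which send and receive state may be discarded."""
    dt = delta_t(mpl, a, r)
    return {"sender": 3 * dt, "receiver": 2 * dt}


def is_expired(last_activity, now, timeout):
    """True once a connection has carried no traffic for at least `timeout` seconds."""
    return now - last_activity >= timeout


# Illustrative values only: MPL = 2 s, A = 0.5 s, R = 1.5 s.
timeouts = state_timeouts(mpl=2.0, a=0.5, r=1.5)
```

With these values deltaT is 4 s, so send state is kept for 12 s of silence and receive state for 8 s.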
Data Transfer Protocol (DTP)
• Notice that the flow is a straight shot: very little processing, and if there is anything to do, it is moved to the side. The most complex thing DTP does is reassembly and ordering.
• If there is a DTCP instance for this flow:
– If the flow control window closes, PDUs are shunted to the flow control Q.
– If the flow does retransmission, a copy of the PDU is put on the rexmsnQ.
• These PDUs are now DTCP’s responsibility to send when appropriate.
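The "straight shot with side queues" behaviour can be sketched as below. This is a simplified illustration, not the prototype: the class, a credit counter standing in for the flow control window, and the in-memory lists standing in for the RMT are all assumptions.

```python
from collections import deque


class DtpSender:
    """Toy outbound path (hypothetical): PDUs go straight to the RMT unless DTCP intervenes."""

    def __init__(self, credit=None, retransmit=False):
        self.credit = credit             # None: no flow control on this flow
        self.retransmit = retransmit
        self.next_seq = 0
        self.to_rmt = []                 # PDUs handed to the RMT
        self.closed_window_q = deque()   # PDUs waiting for the window to reopen
        self.rexmsn_q = {}               # seq -> copy of PDU awaiting an ack

    def send(self, sdu):
        pdu = (self.next_seq, sdu)
        self.next_seq += 1
        if self.credit is not None and self.credit == 0:
            self.closed_window_q.append(pdu)  # window closed: now DTCP's responsibility
            return
        self._post(pdu)

    def _post(self, pdu):
        if self.credit is not None:
            self.credit -= 1
        if self.retransmit:
            self.rexmsn_q[pdu[0]] = pdu       # keep a copy until it is acked
        self.to_rmt.append(pdu)

    def add_credit(self, n):
        """More credit arrived: drain the closed-window queue in order."""
        self.credit += n
        while self.closed_window_q and self.credit > 0:
            self._post(self.closed_window_q.popleft())

    def ack(self, seq):
        self.rexmsn_q.pop(seq, None)


sender = DtpSender(credit=2, retransmit=True)
for sdu in (b"a", b"b", b"c"):
    sender.send(sdu)
```

With a credit of 2, the first two PDUs go to the RMT (with copies on the retransmission queue) and the third waits on the closed-window queue until `add_credit` is called.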
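[Figure: DTP data path next to the RMT. Outbound stages shown: Delimit SDU, Fragment/Concatenate, Sequence/Address, CRC, with the RexmsnQ and ClsdWinQ to the side and DTCP PDUs joining the path. Inbound stages shown: CRC, Sequencing/Strip, Reassembly/Separation, Delimiting, with the Reassmb/SeqQ and InboundQ.]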
Data Transfer PDU Contents
• Version: 8 bits (optionally used, absent in current prototypes)
• Destination-Address: Addr-Length
• Source-Address: Addr-Length
• Flow-id: Struct
– QoS-id: 8 bits
– Destination-CEP-id: Port-id-Length
– Source-CEP-id: Port-id-Length
• PDUType: 8 bits
• Flags: 8 bits
• PDU-Length: LengthLength
• SequenceNumber: SequenceNumberLength
• Sequence User-Data{DelimitedSDU* | SDUFrag}
DTP PDU Parsing Example (DEMO)
int policy = dtc.EFCPEncodingPolicyType;
switch ( policy ) {
    case PDUVERSION_DEMOPROFILE:
        /* demo profile: 16-bit addresses and CEP-ids, 8-bit QoS-id, type and flags, 32-bit sequence number */
        NEXT16(destAddr);
        NEXT16(srcAddr);
        NEXT16(destCEPID);
        NEXT16(srcCEPID);
        NEXT8(qosID);
        NEXT8(pduType);
        NEXT8(flags);
        NEXT32(pduSeqNumber);
        break;
}
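The same field layout can be expressed with a fixed-size binary format. The sketch below follows the field order and widths of the demo profile above; the big-endian byte order, the Python names and the sample field values (including the PDU type) are assumptions, since the slide does not state them.

```python
import struct
from collections import namedtuple

# Assumption: network (big-endian) byte order; the slide does not say.
# H = 16-bit, B = 8-bit, I = 32-bit, in the order the demo profile reads them.
DEMO_HEADER = struct.Struct(">HHHHBBBI")

DtpHeader = namedtuple(
    "DtpHeader",
    "dest_addr src_addr dest_cep_id src_cep_id qos_id pdu_type flags seq_num",
)


def parse_dtp_pdu(data):
    """Split a raw PDU into its header fields and the user data that follows."""
    header = DtpHeader._make(DEMO_HEADER.unpack_from(data))
    return header, data[DEMO_HEADER.size:]


# Arbitrary sample values, chosen only to exercise the parser.
raw = DEMO_HEADER.pack(25, 26, 1, 8, 0, 0x81, 0, 42) + b"hello"
header, user_data = parse_dtp_pdu(raw)
```

The header occupies 15 bytes in this profile; everything after it is treated as user data.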
DTP Policies
• UnknownFlowPolicy – When a PDU arrives for a Data Transfer Flow terminating in this IPC-Process and there is no active DTSV, this policy consults the ResourceAllocator to determine what to do.
• SDUReassemblyTimer Policy – this policy is used when fragments of an SDU are being reassembled and not all of the fragments needed to complete the SDU have arrived. Typical behavior would be to discard all PDUs associated with the SDU being reassembled.
• SDUGapTimer Policy – this policy is used when the SDUGapTimer expires and the PDUs needed to complete a sequence of SDUs with no gaps greater than MaxGapAllowed have not been received. Typically, the action would be to signal an error or abort the flow.
• ClsdWindPolicy – this policy determines what to do if the PDU should not be passed to the RMT.
• MaxPDUSize – The maximum size in bytes of a PDU in this DIF.
• MaxFlowPDUSize – The maximum size in bytes of a PDU on this Flow.
• SeqRollOverThres – The value at which a new flow is created and assigned to this Port-id to support data integrity.
• MaxGapAllowed – The maximum gap in SDUs that can be delivered to the (N)-DIF port without compromising the requested QoS.
Data Transfer Control Protocol
• For flows with retransmission (acks) and/or flow control, a DTP flow requires a DTCP companion.
• DTCP controls flow volume; the RMT controls the combined flow rate of (N-1)-flows.
– Congestion Control is provided by (N-1)-flows
• Notice there is no explicit synchronization. Synchronization is enforced by the bounds on the 3 timers Watson found to be necessary. Retransmission Control bounds two of them (RTT and retries); Max PDU Lifetime is bounded by PDUProtection (TTL) or by the propagation time on an (N-1)-DIF that does not relay, e.g. a wire.
[Figure: DTP passes data to the RMT (data flow); DTCP, with its Rexmsn Ctl and Flow Ctl components, manages the Re-xmsn Q and Flow Control Q (control flow); both share the DT-SV.]
DTCP Policies: General policies & parameters
• TA – Maximum time an ack is delayed before sending
• TG – Maximum time to exhaust retries.
• TimeUnit – for rate based flow control, i.e. # of PDUs sent per TimeUnit
• FlowInitPolicy – Data Transfer Control initialization policy
• SVUpdatePolicy – Updates the State Vector on arrival of a TransferPDU
• LostControlPDUPolicy – What to do if a Control PDU is lost?
DTCP Policies: Retransmission control
• RTTEstimator Policy – the algorithm for estimating RTT
• RetransmissionTimerExpiryPolicy - what to do when a Retransmission Timer expires, if the action is not to retransmit all PDUs with sequence numbers less than this one.
• ReceiverRetransmission Policy - This policy is executed by the receiver to determine when to positively or negatively ack PDUs.
• SenderAck Policy - provides some discretion on when PDUs may be deleted from the ReTransmissionQ. This is useful for multicast and similar situations where one might want to delay discarding PDUs from the retransmission queue.
• SenderAckList Policy - similar to the previous one for selective ack
DTCP Policies: Flow control
• InitialCredit Policy - sets the initial amount of credit on the flow.
• InitialRate Policy - sets the initial sending rate to be allowed on the flow.
• ReceivingFlowControlPolicy - on receipt of a Transfer PDU can update the flow control allocations.
• UpdateCredit Policy – determines how to update the Credit field, i.e. whether the value is absolute or relative to the sequence number.
• FlowControlOverrun Policy - what action to take if the credit or rate has been exceeded.
• ReconcileFlowConflict Policy - invoked when both Credit and Rate based flow control are in use and they disagree on whether the PM can send or receive data.
IPC PROCESS: Relaying and Multiplexing Task
Relaying and Multiplexing Task
• Outbound, this is the first queuing we hit, and even here it may not be necessary (see below).
• DTP flows are classed by the QoS-id part of the connection-id. RMT policy determines the servicing of the queues, for each PDU consulting the forwarding table and posting it to the proper (N-1)-port.
– Because PDUs are completely formed, RMT need not distinguish locally generated PDUs from those that arrived on an (N-1)-port.
• The natural structure of the 3 kinds of “boxes” is such as to limit the number of (N-1)-ports.
[Figure: RMT with Queues and Ports over (N-1)-DIF A and (N-1)-DIF B; PDUs from EFCP and (N-1)-DIF flows are dispatched using the Forwarding Table.]
RMT Implementation Issues
• RMT is the main place where QoS policies operate
• Multiplexing requires flow control, buffering, and policies for how to manage queue space and I/O bandwidth
– RMT may discard inbound PDUs it has no place for
– RMT uses a policy to decide which outbound flow’s data will be sent when it can next send a PDU to an (N-1)Flow
  • A primary input to this decision is the QoS cube of the flow
– Various flow control methods to push back to the application may be used to prevent having to discard outbound data
  • For example, don’t take an outbound PDU from a flow until the destination (N-1)Flow is known to be able to accept it
• RMT accesses the Forwarding Table to choose the (N-1)Flow to send a PDU over
• The RMT also may receive a PDU from an (N-1)Flow that needs forwarding, refer to the Forwarding Table, and place it on an outbound (N-1)Flow
– This also has flow control/resource management implications
IPC PROCESS: Routing and forwarding
Routing and Forwarding
• When a local application generates an outbound PDU for a remote application, RMT locates the appropriate outbound (N-1)DIF flow by using the last-known address for the destination application
– This uses the “forwarding table”, a mapping of address+QoS to a specific (N-1)DIF flow. This is in general a many-to-many mapping.
• The forwarding table is also used to determine which outbound flow to use to forward a PDU going to a destination other than the current IPC Process
• The forwarding table can be computed in the same way it’s usually done – periodic recomputation, based on neighbor and link performance updates
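A forwarding-table lookup keyed on address and QoS can be sketched as follows. The class, the dictionary-based PDU and the "pick the first port" choice are assumptions for illustration; a real RMT would apply a policy when several (N-1) flows match.

```python
class ForwardingTable:
    """Toy PDU forwarding table (hypothetical): (dest address, QoS-id) -> (N-1) port-ids."""

    def __init__(self):
        self._entries = {}

    def add(self, dest_addr, qos_id, port_id):
        # Several ports may serve the same (address, QoS) pair: many-to-many mapping.
        self._entries.setdefault((dest_addr, qos_id), []).append(port_id)

    def lookup(self, dest_addr, qos_id):
        return self._entries.get((dest_addr, qos_id), [])


def rmt_dispatch(table, local_addr, pdu):
    """Decide what to do with a PDU: deliver locally, forward, or drop."""
    if pdu["dest_addr"] == local_addr:
        return ("deliver", pdu["dest_cep_id"])
    ports = table.lookup(pdu["dest_addr"], pdu["qos_id"])
    if not ports:
        return ("drop", None)
    return ("forward", ports[0])  # simplest possible choice among candidate ports


table = ForwardingTable()
table.add(dest_addr=64, qos_id=1, port_id=3)
table.add(dest_addr=64, qos_id=1, port_id=4)

local = rmt_dispatch(table, 25, {"dest_addr": 25, "qos_id": 1, "dest_cep_id": 8})
transit = rmt_dispatch(table, 25, {"dest_addr": 64, "qos_id": 1, "dest_cep_id": 2})
unknown = rmt_dispatch(table, 25, {"dest_addr": 99, "qos_id": 1, "dest_cep_id": 2})
```

The same lookup serves both locally generated PDUs and PDUs in transit, since the RMT does not need to distinguish between them.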
Routing and Forwarding
• DTP PDUs with non-local destination transit through the RMT
• Route update messages maintain the forwarding table
[Figure: RMT Queues and Ports over (N-1)-DIF A and (N-1)-DIF B; PDUs from EFCP and (N-1)-DIF flows, including transit DTP PDUs, are forwarded using the Forwarding Table, which is computed from Route Update Messages.]
Complications in Implementing
• Protection on PDUs must be checked and potentially removed before RMT can examine the PCI
– E.g., the PDU could be encrypted or could be coded with redundant coding to reduce error rates
• RMT uses decoded PCI to determine if the PDU is for a local destination, and if so for which flow
• If the PDU is determined to be transiting the IPC Process, the exiting (N-1)Flow must be identified and appropriate protection re-computed if anything has changed
– Protection needs to be recomputed only if the PDU changes
  • E.g., hop count, if present, would be decremented by RMT
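The check, modify and re-protect cycle for a transit PDU can be sketched as below. CRC-32 from Python's `zlib` stands in for whatever SDU protection the DIF uses, and placing the hop count in the first byte is purely an assumption for the example.

```python
import zlib


def protect(body):
    """Append a CRC-32 trailer (a stand-in for the DIF's protection policy)."""
    return body + zlib.crc32(body).to_bytes(4, "big")


def unprotect(pdu):
    """Verify and strip the CRC-32 trailer; raise if the PDU is corrupted."""
    body, trailer = pdu[:-4], pdu[-4:]
    if zlib.crc32(body).to_bytes(4, "big") != trailer:
        raise ValueError("CRC mismatch")
    return body


def relay(pdu):
    """Check protection, decrement the hop count, recompute protection."""
    body = bytearray(unprotect(pdu))
    if body[0] == 0:
        return None      # hop count exhausted: discard
    body[0] -= 1         # assumption: hop count is carried in the first byte
    return protect(bytes(body))


incoming = protect(bytes([3]) + b"payload")
outgoing = relay(incoming)
```

Because the hop count changed, the trailer of `outgoing` differs from that of `incoming`; a PDU relayed without any change could reuse its existing protection.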
IPC PROCESS: Resource Allocator
Resource Allocator
• The resource allocator is the core of management in the IPC Process. The degree of decentralization depends on the policies and how it is used.
• The RA has a set of meters and dials that it can manipulate. The meters fall into 3 categories:
– Traffic characteristics from the user of the DIF
– Traffic characteristics of incoming and outgoing flows
– Information from other members of the DIF
• The Dials
– Creation/Deletion of QoS Classes
– Data Transfer QoS Sets
– Modifying Data Transfer Policy Parameters
– Creation/Deletion of RMT Queues
– Modify RMT Queue Servicing
– Creation/Deletion of (N-1)-flows
– Assignment of RMT Queues to (N-1)-flows
– Forwarding Table Generator Output
Resource Allocator: i2CAT Implementation
• Only a small subset of the RA functionality has been implemented: management of N-1 flows.
– Request flows to one or more N-1 DIFs
– Register the IPC Process in underlying N-1 DIFs. Process flow requests that have the IPC Process as a target (accept/deny them)
• Current policy:
– Before initiating CACEP and enrollment with a neighbor, allocate a management N-1 flow to its Management AE. If enrollment is successful, allocate a data transfer flow to the Data Transfer AE of this neighbor.
[Figure: IPC Process containing an N-1 Flow Manager, with flows to N-1 DIF A and N-1 DIF B.]
SHIM DIF
The Shim DIF
• Sits above a non-RINA transport (e.g., wire, Internet, LAN) and presents enough of the RINA API to allow an application to treat the transport as a RINA DIF
• Some transports present a poor match to RINA
• Luckily, the IPC Process is an undemanding RINA application; it needs only unreliable flows to neighbors to operate
• The non-RINA transport configuration information needed may be configured statically, or by using some non-RINA method (e.g., DNS)
• To date, we have created Shim DIFs over IP
– There are many practical issues with this mapping
The Shim DIF
[Figure: Appl. Processes communicate over a DIF whose IPC Processes sit on top of “Shim IPC Processes”; one Shim DIF wraps the public Internet and another wraps a private IP layer, with UDP and TCP flows underneath.]
• The “shim IPC Process” for IP layers is not a “real IPC Process”. It just presents an IP layer as if it were a regular DIF
– Wraps the IP layer with the DIF interface.
– Maps the names of the IPC Processes of the layer above to IP addresses in the IP layer.
– Creates TCP and/or UDP flows based on the QoS requested by an “allocate request”.
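The name mapping and transport selection done by a shim IPC Process can be sketched in a few lines. Everything concrete here is hypothetical: the directory contents use addresses from the 192.0.2.0/24 documentation range rather than the demo's hosts, and a single `reliable` flag stands in for the requested QoS.

```python
# Hypothetical directory: name of the IPC Process above -> (IP address, port).
SHIM_DIRECTORY = {
    "ipcp-a": ("192.0.2.10", 32769),
    "ipcp-b": ("192.0.2.20", 32769),
}


def allocate_request(dest_name, reliable):
    """Resolve the destination name and pick the transport from the requested QoS."""
    if dest_name not in SHIM_DIRECTORY:
        return None  # unknown name: the allocate request fails
    host, port = SHIM_DIRECTORY[dest_name]
    return {"transport": "tcp" if reliable else "udp", "host": host, "port": port}


flow = allocate_request("ipcp-b", reliable=False)
```

No sockets are opened in this sketch; it only shows the decision that precedes creating the TCP or UDP flow.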
IP/TCP/UDP Practical Issues
• DNS does not provide “application names”
– An IP address (or FQDN) plus port is closer, but not exact
• NAT blocks incoming traffic unless a port is opened
– Outgoing traffic generally opens a (high) port for incoming
– Manually opening ports for incoming flow requests takes administration/configuration effort, so it needs to be minimized
• Sharing a TCP flow among multiple RINA flows creates flow control and starvation potential, so only UDP flows are usable for DTP traffic
– This forces one unique port number per application
• Incoming TCP connections from an IP address are not self-identifying w.r.t. the originating application (in general, outgoing ports are not well-known ports)
– We introduced a Shim-DIF-specific PDU type to handle this
MISC. TOPICS
INTER-DIF DIRECTORY (IDD)
IDD
• IDD instances will communicate with IPC Processes
– Local communication (same node) may be OS’s IPC or RINA flows, depending on implementation architecture
  • The Reference Architecture does not mandate a method
• IDD instances will communicate with one another
– Since instances are (generally) on different nodes, standard RINA application flows will be used
– This is an area where standard approaches (protocols, AE’s, …) can be adopted, but there is no requirement to adopt a single model for all DIFs and sets of DIFs
Network Management
• Network management operations work the same way the IPC Process works in general: operations on objects
• There is additional security (mostly OS provided) on access to the IPC Process by a local Network Management client application
• There will be RIB objects that Network Management has visibility to and rights to operate on that remote IPC Processes do not have
• Network Management can cross DIF boundaries, since multiple DIFs may belong in the same management domain
Security
• All applications have the option to identify and accept/reject incoming flow requests from other applications
• An OS may choose to limit what applications have the right to access a particular DIF/IPC Process
• Encryption is per-DIF; if an application wants its own SDUs hidden from the IPC Process it’s using to communicate via a DIF, it can encrypt them (the IPC Process never looks inside of them)
– Since the IPC Process is an application, this goes for its PDUs as well.
CONCLUSIONS
Conclusions
• The RINA Architecture is implementable
– Three implementations are in various stages of completion, pushing one another
– The size and complexity of implementation is modest (we are currently using simple policies)
• There are many reasonable implementation approaches
– Different requirements and OS’s may lead to different partitioning, language, and overall approach
– What would we do differently in our next implementation? Discussion to follow!
• With working implementations in place, bringing up a new one is much less difficult than the first ones
– Most problems will be with the new implementation
– It is also beneficial to the existing ones: new implementations can cause an existing one to follow new paths and uncover latent defects
• We welcome new partners and new implementations!
DISCUSSION
DEMO STORYBOARD
RINABand Test Application
• Client specifies test parameters
– Num_flows, SDU size, SDUs per flow, who sends data, reliable/unreliable flows
• Client sets up a number of flows and, when setup, the test starts
– Client, server or both send the agreed number of SDUs over the flows
• Test ends when all the SDUs have been received at the receiving side(s) or a timer fires (counting time without receiving SDUs)
• Client displays stats of the test
– SDUs sent/received (number, Mbps), % of lost SDUs
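The statistics the client displays reduce to two short formulas. The function name, its parameters and the sample numbers below are illustrative assumptions, not taken from the RINABand code.

```python
def rinaband_stats(sdus_sent, sdus_received, sdu_size_bytes, duration_s):
    """Throughput of received data in Mbps and percentage of lost SDUs."""
    lost = sdus_sent - sdus_received
    return {
        "mbps": sdus_received * sdu_size_bytes * 8 / duration_s / 1e6,
        "lost_percent": 100.0 * lost / sdus_sent if sdus_sent else 0.0,
    }


# Illustrative run: 10000 SDUs of 1250 bytes sent, 9900 received in 2 seconds.
stats = rinaband_stats(sdus_sent=10000, sdus_received=9900, sdu_size_bytes=1250, duration_s=2.0)
```

For this run the received throughput is 49.5 Mbps with 1% of SDUs lost.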
[Figure: within a DIF, the RINABand Client connects to RINABand server instance 1 with 1 flow for test control (Control AE) and N flows for test data (Data AE, instance 7).]
Demo scenario
• The public Internet shim DIF provides direct connectivity to all the IPC Processes in the RINA-Demo.DIF
– This doesn’t necessarily need to be the case; it depends on how the public Internet shim DIF “directory” is populated
[Figure: RINA-Demo.DIF on top of the public Internet shim DIF on top of the “public Internet layer”, spanning Florida, Barcelona and Castelldefels. Labels shown: IPC Processes Barcelona.i2CAT, Castefa.i2CAT, bigslug.TRIA-Fl, radio.TRIA-Fl, azathoth.TRIA-Fl; addresses 64 and 65; hosts 84.88.40.71, 84.88.40.70, 147.83.207.208 and Tria-fl.dyndns.org; ports 1720, 32769, 32770, 32792, 32793; applications RINABand 1, RINABand 4 and RINABand 6.]
Demo scenario (near future)
• Missing a bit of functionality to reach this
– Routing computation
– Flow Allocator should do relaying of M_CREATE, M_CREATE_R and M_DELETE Flow requests
[Figure: RINA-Demo.DIF on top of the public Internet shim DIF (over the “public Internet layer”) and a LAN shim DIF (over a WiFi LAN).]
Demo storyboard
• IPC Process Creation
– As a member of a DIF (show RIB)
– Not a member of any DIF (show RIB)
• Enrollment (show RIB after joining)
– Unenrolled member contacts enrolled member
– Enrolled member contacts enrolled member
– Member goes away and joins again
• Application registration
– RINABand application(s) registering at DIF (show FA directory update)
– RINABand applications unregistering (show FA directory update)
• Flow allocation
– Establish flows and send data with the RINABand client. Show throughput, stats…