Self-healing networks
When the going gets tough, the tough get going
L. Spaanenburg, Groningen University, Department of Computing Science, P.O. Box 800, 9700 AV Groningen. Mail: [email protected], http://www.cs.rug.nl/~ben
2001 IPA Spring Days on Security
April 2001 IPA Spring Days - Security 2
Motivation
Security involves the guaranteed access to all resources at all times with top quality
Threats:
- from outside
- from inside
Here: internal diseases only
What is security?
Agenda
• The nature of the net
• Disasters with central control
• The nature of self-healing
• In-line monitoring
• A hardware / software perspective
• Research view
What we need and what we can’t
The weak spot
• A network is billions of tightly connected distributed heterogeneous components
• Things happen on a wide time/spatial scale with massive interaction
• A local disturbance can spread widely in zero time
• Relationships and interdependencies are too complex for mathematical theories
It is the small dog that bites!
User’s perspective on networks
An integrated Power Information Communication technology
Telephone network
A network can be a tree with central control
[Figure: tree of exchanges. Labels: connection, local exchange, 2nd-order exchange, 1st-order exchange; short distance, medium distance, long distance]
Data Network
Connectionless communication by broadcast
[Figure labels: Subnet, LAN, Host, Router]
Means of Communication
• Synchronous
  PDH: Plesiochronous Digital Hierarchy
  SDH: Synchronous Digital Hierarchy
  ISDN: Integrated Services Digital Network
• Asynchronous
  FDDI: Fiber Distributed Data Interface
  FR: Frame Relay
  ATM: Asynchronous Transfer Mode
Sigh, there are so many ways to communicate
Sources of Abnormality
• Attacks from the outside world (service attack)
• Hiccups in the network communication
• Failures on the network nodes
It’s a detection problem!
What can go wrong, will go wrong
The Keeler-Allston disaster
• On 10 August 1996, the Keeler-Allston 500 kV power line tripped, creating a voltage depression, and the McNary Dam went to maximum
• The Ross-Lexington 230 kV line also tripped and pushed the McNary Dam over the edge
• The McNary Dam set off oscillations that went to 500 MW within 1.5 minutes
• The North-South Pacific Intertie isolated 11 US states and 2 Canadian provinces
The network is vulnerable to local abnormalities
The 1998 Galactic page out
• In May 1998, the Galaxy-IV satellite was disabled by unknown causes
• US National Public Radio and 40M pagers went out, airline flights delayed and data networks had to be manually reconfigured
• Many geo-stationary satellites are 800 – 1400 km; 13 (60-), 35 (70-), 69 (80-) and 250 (90-)
• 10 million pieces of debris > 1 mm
The weak belly of the Earth
Other fault cascades
Finagle’s Law
“Anything that can go wrong, will”
Antibiotics cause resistance (DDT)
Code replication also works for errors
Cause/effect relations occur frequently
Self-healing in history
• 1993: AT&T announced the self-healing wireless network
• 1998: SUN bought the RedCape Policy Framework for self-healing software
• 1998: HP released the self-healing version of OpenView Network Node Manager
• 2001: Concord Com. announced self-healing for the home
The name has been used before
Self-Healing ingredients
• Application: handling the communication
• Presentation: message formatting
• Session: controls traffic between parties
• Transport: converts packets into frames and vice versa
• Network: controls frame routing
• Data Link: frames of bit sequences
• Physical: relays physical quantities
Self-healing = Detection + Diagnosis + Self-Repair
[Figure labels: Network Test, Node Test, Reconfigure]
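The three terms of "Self-healing = Detection + Diagnosis + Self-Repair" can be read as a three-stage loop. The following is only a minimal sketch under stated assumptions: the network is a set of links (pairs of node names), each node reports one numeric reading, and a reading above a limit counts as abnormal. All function names, the data layout and the threshold rule are hypothetical and not taken from the talk.

```python
def detect(readings, limit):
    """Detection (network test): names of nodes whose reading exceeds the limit."""
    return sorted(name for name, value in readings.items() if value > limit)


def diagnose(suspects, links):
    """Diagnosis (node test): map each suspect node to the links that touch it."""
    return {s: sorted(link for link in links if s in link) for s in suspects}


def repair(links, diagnosis):
    """Self-repair (reconfigure): drop every link that touches a faulty node."""
    bad_links = {link for touched in diagnosis.values() for link in touched}
    return links - bad_links


def self_heal(readings, links, limit):
    """Run detection, diagnosis and repair in sequence; return the healed link set."""
    return repair(links, diagnose(detect(readings, limit), links))


if __name__ == "__main__":
    readings = {"a": 0.1, "b": 0.9, "c": 0.2}          # assumed sensor values
    links = {("a", "b"), ("b", "c"), ("a", "c")}       # assumed topology
    print(self_heal(readings, links, limit=0.5))
```

In this toy setting node "b" is flagged, both of its links are removed, and the remaining nodes stay connected through the surviving link.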
An Initiative in Self-Healing
• The CIN/SI is funded by the Electric Power Research Institute and the US Dept. of Defense as part of the Government-Industry Collaborative University Research program
• 28 universities in 6 consortia started in Spring 1999 to spend $30 M in 5 years
• The approach is multi-agent technology
The Complex Interactive Networks/Systems Initiative
CIN/SI consortia
• [CalTech] CIN Mathematical Foundation
• [CMU] Context-dependent Agents
• [Cornell] Failure Minimization
• [Harvard] Modeling and Diagnosis
• [Purdue] Intelligent Management
• [Washington] Defense to Attacks
The different aspects of self-healing
Key issues
• Pre-programming misses the target for lack of context dependence
• No damage would have occurred if the load on the McNary Dam had been decreased by 0.4% during the next 30 minutes
• Local agents making real-time decisions would have prevented the Keeler-Allston disaster.
Central control comes too late by definition
Basic agent types
• Agents are called cognitive or rational when equipped with clear rules and algorithms
• Agents are called reactive when their functioning depends on the interrogation of the environment
Both types of agents are required on the decision-making layers handling, respectively, reaction, coordination and deliberation
What are agents?
CIN/SI architecture (1)
Operational control of the power plant
[Figure labels: Power System; Protection Agents; Generation Agents; Controls; Faults Isolation Agents; Frequency Stability Agents; Events/alarm Filtering Agents; Model update Agents; Command Agents; Events/alarms; Triggering events; Plans/Decisions]
CIN/SI architecture (2)
Strategic management of the power grid
[Figure labels: Events/alarm Filtering Agents; Model update Agents; Command Agents; Triggering events; Plans/Decisions; Events Identification Agents; Planning Agents; Restoration Agents; Vulnerability Assessment Agents; Hidden Failure Monitoring Agents; Reconfiguration Agents]
Monitoring the process
Strategic decisions on tactical control
[Figure labels: Monitor, Process, Control, Sensor, Actuator]
The network emphasis
The network glues the agents together
[Figure: a Network connecting six Agents]
Defect loses all
But what we need is:
• Mutual observation between nodes
• Group decision of testing agents
• Implied reconfiguration of the network
How can we facilitate testing with agent properties?
Majority voting is a centralized consensus scheme
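The three requirements above (mutual observation, group decision, implied reconfiguration) can be sketched in a few lines. This is a hedged illustration only: it assumes each node has already tested its direct neighbours and recorded a healthy/faulty verdict, and it decides per target among that target's own observers rather than through one central voter. The data layout and the names `group_decision` and `reconfigure` are assumptions, not part of the talk.

```python
from collections import defaultdict


def group_decision(verdicts):
    """verdicts maps (observer, target) to True when the observer found the target healthy.

    Returns the set of targets that more observers judged faulty than healthy.
    """
    tally = defaultdict(lambda: [0, 0])  # target -> [healthy votes, faulty votes]
    for (_observer, target), healthy in verdicts.items():
        tally[target][0 if healthy else 1] += 1
    return {target for target, (ok, bad) in tally.items() if bad > ok}


def reconfigure(neighbours, faulty):
    """Remove faulty nodes and every edge that leads to them."""
    return {
        node: sorted(set(peers) - faulty)
        for node, peers in neighbours.items()
        if node not in faulty
    }


if __name__ == "__main__":
    neighbours = {"a": ["b", "c"], "b": ["a", "c"], "c": ["a", "b"]}
    verdicts = {
        ("a", "b"): True, ("a", "c"): False,
        ("b", "a"): True, ("b", "c"): False,
        ("c", "a"): True, ("c", "b"): True,
    }
    faulty = group_decision(verdicts)
    print(faulty, reconfigure(neighbours, faulty))
```

The sketch trusts every observer equally; a faulty observer that reports wrong verdicts would need additional handling.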
Agent characteristics
What is security?
[Figure: an agent with sensors (mouse, messages, ..., other agents), Behaviour, and effectors (messages, move, change appearance, speak)]
Independent, Reactive, Proactive, Social
Built-in Block Observation
Testing complex systems requires autonomy
[Figure: generator, process, verifier]
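The generator / process / verifier chain can be illustrated with a small self-test harness. This is a sketch under assumptions, not the scheme used in the talk: the generator is a simple arithmetic pattern source standing in for a hardware pattern generator, the verifier compacts all responses into one signature by rotate-and-XOR, and the reference ("golden") signature is taken from a known-good run. All names and constants are hypothetical.

```python
def generator(count, width=8):
    """Produce a repeatable stream of distinct test patterns (assumed stand-in source)."""
    return [(k * 37 + 11) % (1 << width) for k in range(count)]


def verifier(responses, width=8):
    """Compact all responses into one signature: rotate left by one bit, then XOR."""
    mask = (1 << width) - 1
    signature = 0
    for response in responses:
        signature = ((signature << 1) | (signature >> (width - 1))) & mask
        signature ^= response & mask
    return signature


def self_test(process, golden, count=32):
    """Drive the process with generated patterns and compare signatures."""
    return verifier([process(p) for p in generator(count)]) == golden


if __name__ == "__main__":
    def good(p):
        return (3 * p + 1) & 0xFF                    # assumed fault-free block

    def bad(p):
        return good(p) ^ (1 if p == 11 else 0)       # same block with one wrong response

    golden = verifier([good(p) for p in generator(32)])
    print(self_test(good, golden), self_test(bad, golden))
```

Only the signature has to be stored and compared, which is what lets the test run autonomously next to the block it observes. Compaction can in principle mask some multi-error patterns, so a matching signature is strong evidence rather than proof.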
Linear Feedback Shift-register
When data flows over identical nodes, the typical function can be characterized by the feedback polynomial
Generation of ordered bit strings by EXORs
$x^6 + x^1 + x^0$
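A linear feedback shift register for a degree-6 feedback polynomial can be sketched as follows. The sketch assumes the polynomial on the slide is x^6 + x + 1 (read from the garbled exponents 0, 1, 6) and uses a Fibonacci-style register in which the new bit is the EXOR of the two oldest bits; the function names are illustrative. Because x^6 + x + 1 is primitive over GF(2), every non-zero seed walks through all 63 non-zero states before repeating.

```python
def lfsr_states(seed=0b000001, nbits=6):
    """Yield successive states of a Fibonacci LFSR for x^6 + x + 1.

    Bit 0 holds the oldest bit s[n]; the recurrence is s[n+6] = s[n+1] XOR s[n].
    """
    state = seed
    while True:
        yield state
        feedback = (state ^ (state >> 1)) & 1             # EXOR of bits 0 and 1
        state = (state >> 1) | (feedback << (nbits - 1))  # shift, insert new bit on top


def lfsr_period(seed=0b000001):
    """Count the steps until the register returns to its seed state."""
    steps = 0
    for state in lfsr_states(seed):
        if steps and state == seed:
            return steps
        steps += 1


if __name__ == "__main__":
    print(lfsr_period())  # 63
```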
Friedmann model
The aim is for a locally compacted set of patterns
[Figure: Process with input I and output O]
A basic function
• A simple low-pass filter
• Takes a data sampling routine, a multiplying adder and a final function 1/N.
Prototypical software on a small PIC controller
$z = \frac{1}{N} \sum_{i=0}^{N-1} c_i \, x_{t-i}$
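The filter combines a sampling routine, a multiplying adder and a final scaling by 1/N. A minimal Python sketch follows, assuming the last N samples are kept in a ring buffer and the coefficients are supplied by the caller; the class name and layout are hypothetical, and the original ran on a PIC controller rather than in Python.

```python
class LowPassFilter:
    """Weighted moving average over the last N samples, scaled by 1/N."""

    def __init__(self, coefficients):
        self.c = list(coefficients)
        self.n = len(self.c)
        self.buffer = [0.0] * self.n  # ring buffer, initially all zeros
        self.head = 0                 # index of the most recent sample

    def sample(self, x):
        """Store a new sample x_t and return the filtered value."""
        self.head = (self.head + 1) % self.n
        self.buffer[self.head] = x
        acc = 0.0
        for i in range(self.n):
            # buffer[(head - i) % n] holds x_{t-i}
            acc += self.c[i] * self.buffer[(self.head - i) % self.n]
        return acc / self.n


if __name__ == "__main__":
    f = LowPassFilter([1, 1, 1, 1])
    print([f.sample(4.0) for _ in range(4)])  # ramps up as the buffer fills
```

With all coefficients equal to 1 the output is the plain average of the last N samples, so a step input ramps up over N samples instead of jumping.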
A neuron
• A simple neuron
• Is similar to the low-pass filter except for the incoming data. Operates from the same input data ring-buffer.
Intelligence can be built from filtering
$z_j = f\left( \sum_{i=0}^{N-1} w_{ij} \, x_i \right)$
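A single neuron is a weighted sum followed by a transfer function f. The sketch below is a hedged illustration: the choice of tanh for f is an assumption (the slide does not name the function), and the function name is hypothetical.

```python
import math


def neuron(weights, inputs, f=math.tanh):
    """Return f(sum_i w_i * x_i) for one neuron; f defaults to tanh (an assumption)."""
    return f(sum(w * x for w, x in zip(weights, inputs)))


if __name__ == "__main__":
    print(neuron([0.5, -0.5], [1.0, 1.0]))               # weighted sum is 0
    print(neuron([1, 1], [1, 2], f=lambda a: a))         # identity f shows the raw sum
```

Compared with the low-pass filter, the multiply-and-add loop is the same; the scaling by 1/N is replaced by the function f.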
A neural network
• A feed-forward network
• Differs only in the layer-by-layer switching of the I/O-blocks
Where there is one neuron, there can be more
$z = f\left( \sum_{j=0}^{M-1} w_j \, f\left( \sum_{i=0}^{N-1} w_{ij} \, x_i \right) \right)$
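A two-layer feed-forward network applies the neuron computation twice: once per hidden neuron, then once more on the hidden outputs. The sketch below assumes one hidden layer of M neurons, a single output and tanh as the transfer function; the names and the weight layout (`hidden_weights[j][i]` connects input i to hidden neuron j) are illustrative assumptions.

```python
import math


def feed_forward(hidden_weights, output_weights, inputs, f=math.tanh):
    """Return f(sum_j w_j * f(sum_i w_ij * x_i)) for one hidden layer and one output."""
    # First pass: the output of the hidden layer becomes the input of the next layer.
    hidden = [f(sum(w * x for w, x in zip(row, inputs))) for row in hidden_weights]
    # Second pass: the same multiply-add-transfer step on the hidden outputs.
    return f(sum(w * h for w, h in zip(output_weights, hidden)))


if __name__ == "__main__":
    identity = lambda a: a
    print(feed_forward([[1, 0], [0, 1]], [1, 1], [2, 3], f=identity))  # 5
```

The same routine serves both layers; only the buffers used as input and output are switched between passes.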
Non-Linear Feedback SR
When data flows over identical nodes, the typical function can be characterized by the globally recurrent neural network
Generation of ordered patterns by correlators
[Figure labels: w, x, t]
Neural Observation
• Analog correlation is about finding the functional similarity
• Digital correlation is the same except for the effect of crisping
• Random access storage is always larger than storage of an ordered function
• The neurally approximated function allows for a dense storage of ordered I/O-pairs
Analog correlation looks like digital EXOR
Data-Flow Architecture
• When data flows over identical nodes, the typical function can be characterized
• Built-In Logic Block Observation
• The BIFBO can also be shared with neighboring nodes
• Built-In Function Block Observation
• The local test does not differentiate between hardware and software
Data discrepancy is low-level abnormal behavior
Question 1
• If you cannot test it, it is not worth designing.
• Hierarchical design needs a hierarchical test.
• Abstraction gives a condensed view of reality.
• Abstraction provides for scalability.
Is there an abstractional test?
Question 2
• Interaction is good, conflicts less so
• If resources have a state, access should be bounded by state
• Conflicting services basically pose a scheduling problem
• It’s hard to schedule over an arbitrary network
Is feature interaction really a static problem?
Question 3
• Design should be scalable; test is no exception.
• Detection can do without diagnosis; diagnosis cannot do without detection.
• Testing can be based on area (coverage) or on frontier (sensitivity)
• The boundary between software and hardware is still moving
Do neural networks provide for a built-in test?