
1

Qualification and Reliability of Complex Electronic Rotorcraft Systems

by Alex Boydston & Dr. William Lewis, AMRDEC, for the AFRL Safe and Secure Symposium, 15-17 June 2010

UH-60M Blackhawk Upgrade (Picture by U.S. Army)

CH-47F Chinook CAAS Glass Cockpit (Picture by U.S. Army)

DISCLAIMER: Presented at the Safe and Secure Systems and Software Symposium (S5) on 15-17 June 2010. This material is declared a work of the U.S. Government and is not subject to copyright protection. Approved for public release; distribution unlimited. Review completed by the AMRDEC Public Affairs Office 21 Sep 2009; FN4208. Reference herein to any specific commercial, private or public products, process, or service by trade name, trademark, manufacturer, or otherwise does not constitute or imply its endorsement, recommendation, or favoring by the United States Government. The views and opinions expressed herein are strictly those of the authors and do not represent or reflect those of the United States Government.

Presentation Notes
Executive Summary
There is a need to develop an industry standard method for qualifying complex integrated systems to a specified reliability. The goal of this paper is not to resolve the problems or evaluate past or existing designs, but to catch the attention of the government and commercial aviation industry and solicit input into new processes and standards to correct the longstanding issues in the development of complex avionics systems. It will encourage committee formation to address these issues. The United States (US) Army Aviation Engineering Directorate (AED) requests the input of all technical stakeholders and desires not to dictate these requirements, but to establish requirements, tools and guidelines for reliability and qualification as a unified community.

Abstract
Military, civil and commercial rotary wing and fixed wing aircraft rely on complex and highly integrated hardware and software systems for safe operation and successful execution of missions. These complex systems are now classified as "cyber-physical systems" per the National Science Foundation (NSF) [38]. Whatever the architecture chosen, be it federated, integrated or some hybrid, complex avionics systems must be robust, reliable and safe. The architecture should meet the functional requirements for timeliness, predictability and controllability in order to satisfy the needs to safely and effectively aviate, navigate, communicate and execute a mission.

Traditionally, avionics systems were federated but have evolved into highly integrated systems of computer hardware and software. In an integrated system, if the architecture has flaws, faults can couple between systems and propagate, leading to unpredictable and unreliable behavior unless proper partitioning is accomplished. Qualification and reliability assessment of such systems is challenging within schedule and budget constraints, despite using accepted engineering practices. Given this, there is a need to develop an industry standard method for qualifying complex integrated systems to a specified reliability.

Avionics systems require deterministic performance for critical functions. Integrated Modular Avionics (IMA) systems introduce new issues with data processing, error checking and error handling beyond their federated counterparts. Federated systems were not tightly coupled and seemed easier to test to some degree, since the functions were encapsulated in dedicated and separate units. Now avionics systems rely on computer hardware running multiple processes on an integrated system, which hopefully is implemented with a partitioned operating system. Whether federated, integrated or some hybrid cyber-physical system, it is crucial for reliability, safety, verification and validation to be embedded in the life-cycle of a system. Dealing with this complexity is challenging because of the intricacies present in the design of such systems. Stand-alone federated systems have reliability targets based upon their functions. Complex cyber-physical systems may have both distributed and integrated functions and multithreaded processes.

Program management, systems engineers, developers, systems integration engineers, human factors engineers, test engineers and other disciplines must be cognizant of the need for design for testability and high reliability of such systems early in the life-cycle. Waiting until Preliminary Design Review (PDR) is too late to start addressing these considerations and may require redesign later, or onerous testing during qualification that is costly and time consuming. If complete and correct requirements are not given emphasis at the high level, then problems with poor reliability and inadequate qualification are found after the systems are acquired by the user. This may result in low performance and possibly risk of injury or loss of life.

Processes for project management and engineering are fairly well understood and promoted through the defense acquisition guidelines and by various industry standards such as the Capability Maturity Model Integration (CMMI), MIL-STD-882, SAE ARP4754, SAE ARP4761, DO-178B, DO-254, ARINC 653 and others. Taken together, these standards provide guidance that, if followed, likely will result in safe, highly reliable and cost-effective systems over the life-cycle of the system.

Typically, as complexity and redundancy of systems increase, the reliability also increases to a break-even point and then begins to decrease at some level of complexity. Redundant systems by their nature increase complexity of the system. They are designed with the goal of providing fault tolerance, which ultimately protects the crew, passengers, test personnel, ground personnel, innocent bystanders and expensive equipment. Correct decisions must be made on good data with proper timing, data type and data exchange in mind. As the complexity and integration of the system increase, costs in design and qualification and schedule sometimes increase.

The current guidelines indicate how bad a design is, but do not define how to measure the expected reliability of such complex systems. Additionally, these guidelines fall short in defining how reliability is measured throughout the life-cycle of a program. The current system design guidance often does not yield the desired reliability of complex electronic aircraft or meet performance specifications. This must be introduced at higher levels like the Program Executive Office (PEO), because they control overall cost and schedule. Presently, ease of test, reliability and safety are by-products of the designers' and implementers' diligence in meeting purposeful requirements. There must be a concerted effort from government agencies, such as the Federal Aviation Administration (FAA) and Department of Defense (DOD), and the commercial aviation industry to define better design guidance. These proposed guidelines should not constrain creativity in design, but must set the boundaries necessary for achieving advances in design with performance, reliability and safety in mind.

2

Agenda

• Objective
• Defense Acquisition Approach to Systems Development and Test
• US Army Airworthiness
• AED and Qualification
• Evolution of Helicopter Systems
• Present Approach to Testing
• Development Challenges
• Complexity Issues
• Complexity vs. Reliability, Cost and Schedule
• Complex System Examples and Failures
• Lessons Learned from Failures
• Current Guidelines and Certification Assessment Considerations
• Definition of Complexity and Reliability Needed
• Analytical Modeling and Analysis
• Call for System Reliability Standard Establishment

Presentation Notes
In summary, this paper will:
- Identify development challenges with complex systems
- Present the current US Army approach to airworthiness and testing
- Discuss the challenges facing reliability and safety
- Survey complexity issues
- Cover the historical aspect of the reliability problem and identify that this is not a new problem
- Itemize some current design guidelines for modeling systems and identify deficiencies
- Address certification assessment considerations
- Encourage development and refinement of standard modeling and analysis tools to place power in the hands of the designer to mitigate issues with system reliability early and throughout the project life-cycle, and
- Request input from the development and test community to establish standard processes and requirements for qualifying complex systems

The ultimate goal of this paper is not to resolve the problems or evaluate past or existing designs, but to catch the attention of the government and commercial aviation industry and solicit input into new processes and standards to correct the longstanding issues in the development of complex avionics systems. It will encourage committee formation to address these issues. The United States Army Aviation Engineering Directorate (AED) requests the input of all technical stakeholders and desires not to dictate these requirements, but to establish requirements, tools and guidelines for reliability and qualification as a unified community.

3

Objective

"Develop an industry standard method for qualifying complex integrated systems to a specified reliability"

RAH-66 Comanche (Picture by U.S. Army)

Presentation Notes
Currently there is no method to address qualifying complex integrated systems to a specified reliability. Later discussions in this presentation will point to the fact that hardware reliability has been established and recognized; however, software reliability hasn't reached a consensus on approach. Both hardware and software are major contributors to the overall reliability of a system.

4

Defense Acquisition Approach to Systems Development and Test

System development V-model: Requirements Establishment → Analysis → High Level Design → Detailed Specifications → Implementation/Coding, then up the right leg: Development Testing → Verification → Operational Testing & Validation → Deployed System

Presentation Notes
The general recommended system development V-curve, as shown, is not always followed in a strict sense, although it should be the goal. Such negligence of the proper process makes the establishment of certification very difficult. For new system development and existing system upgrades, requirements must be clear, complete and testable. The certification requirements must be made obvious in the requirements establishment phase, with the goal of being fully identified during requirements development.

5

US Army Airworthiness

Airworthiness Qualification means:
• The system is safe and reliable to operate and will perform the mission when delivered
• It will continue to safely perform the mission if maintained/operated per the manual
• Parts and overhaul work must be high quality to maintain airworthiness
• Flight control systems have high reliability requirements (illustrated in the sketch below):
– 10^-9 for civil airspace critical IFR functions [35]
– 10^-6 for tactical airspace [36]
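To make the two reliability targets above concrete, here is a minimal sketch that converts a per-flight-hour failure probability into a mission reliability, assuming a constant failure rate (exponential model); the 3-hour sortie and the function name are our own illustrative choices, not from the source.

```python
import math

def mission_reliability(failure_rate_per_hour: float, mission_hours: float) -> float:
    """R(t) = exp(-lambda * t), under a constant-failure-rate (exponential) assumption."""
    return math.exp(-failure_rate_per_hour * mission_hours)

# Illustrative 3-hour sortie against the two requirement levels cited above.
for name, lam in [("civil critical IFR, 1E-9/hr", 1e-9),
                  ("tactical airspace, 1E-6/hr", 1e-6)]:
    r = mission_reliability(lam, 3.0)
    print(f"{name}: R = {r:.12f}, P(failure) = {1 - r:.1e}")
```

Failure rates this small cannot be demonstrated by accumulating flight hours alone, which is part of why the presentation argues for analysis and modeling alongside test.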

Presentation Notes
The US Army has been involved in flight since the early Wright-B flight of 1909. Initially, fixed wing and rotary wing aircraft in early aviation remained simple and federated. The basics to aviate, navigate and communicate were handled by dedicated gauges, compasses, gyroscopes and mechanical linkages to fly the aircraft. With the dawn of the Space Age in the 1960s, the rise of more complex electronic control in aviation appeared with the Apollo/Saturn projects, the NASA Dryden F-8 project and others. As time progressed, more integrated modular avionics (IMA) and digital fly-by-wire (DFBW) came into military and commercial aircraft to reduce weight, increase functionality and provide redundancy. With this advancement of complex flight control technology, testing, verification and validation became problematic. To this day, complex systems have been challenging to qualify for the Federal Aviation Administration (FAA) and military airworthiness commands.

The US Army AED holds the responsibility for airworthiness approval. This includes design approval, production approval and continued airworthiness. In the design approval, AED must determine if the system design or modification meets airworthiness standards. In the milestone decisions, it must determine whether the system is produced or modified in accordance with the approved design. For continued airworthiness, a judgment must be made that the system is operated and maintained in a way that keeps it compliant with the approved design. The bottom line is that the qualified system meets its performance requirements and is reliable when delivered.

Establishing approval for an AWR has become increasingly difficult with the evolution of complex avionic control systems from federated architectures to IMA architectures that rely heavily on complex software systems (i.e., now identified as cyber-physical systems). As stated earlier, AED has the mission to ensure airworthiness qualification for aircraft and subsystems used in the US Army aviation fleet, including helicopters, airplanes and Unmanned Aircraft Systems (UAS).

6

AED and Qualification

• AED's mission is to ensure airworthiness qualification for aircraft and subsystems used in the US Army fleet

• Airworthiness Qualification is:
– Demonstration of an aircraft or aircraft subsystem or component, including modifications, to function safely, meeting performance specifications when used and maintained within prescribed limits (AR 70-62)

• Traditionally qualified systems by:
– Similarity
– Analysis
– Test
– Demonstration
– Examination

Presentation Notes
Airworthiness qualification is the demonstration of an aircraft or aircraft subsystem or component, including modifications, to function satisfactorily when used and maintained within prescribed limits [25]. The US Army has traditionally qualified systems and components by physical testing, analysis, demonstration, or by similarity.

7

Evolution of Helicopter Systems

• Past systems historically federated:
– Distributed functionality
– Hardware based
– Simple, easier to test

• Now systems are becoming integrated:
– Combined functionality
– More software intensive
– Complex, more difficult to test

Chief Warrant Officer Jim Beaty (back row, far left) and crew of the Vietnam UH-1 Flying Bikinis (friend of Alex Boydston)

UH-1 Cockpit (US Army)

Chinook CAAS Cockpit (US Army)

CH-47 Chinook (US Army)

Presentation Notes
The US Army has been involved in flight since the early Wright-B flight of 1909. Initially, fixed wing and rotary wing aircraft in early aviation remained simple and federated. The basics to aviate, navigate and communicate were handled by dedicated gauges, compasses, gyroscopes and mechanical linkages to fly the aircraft. With the dawn of the Space Age in the 1960s, the rise of more complex electronic control in aviation appeared with the Apollo/Saturn projects, the NASA Dryden F-8 project and others. As time progressed, more integrated modular avionics (IMA) and digital fly-by-wire (DFBW) came into military and commercial aircraft to reduce weight, increase functionality and provide redundancy. With this advancement of complex flight control technology, testing, verification and validation became problematic. To this day, complex systems have been challenging to qualify for the Federal Aviation Administration (FAA) and military airworthiness commands.

8

Present Approach to Testing

• Several disciplines weigh in, such as software, avionics, GN&C, environmental, E3, electrical, human factors, structures, aerodynamics, etc.

• Current test methodology per older federated systems:
– Hardware: MIL-STD-810, MIL-STD-461
– Requirements Analysis (Traceability)

• Test at different levels:
– Individual software module testing
– Black box testing
– System integration testing
• Hot bench
• System Integration Lab (SIL)
– Aircraft level testing:
• Ground
• Flight

Aviation Flight Test Directorate (AFTD) Testing (US Army Photo)

Aviation System Integration Facility (ASIF) (US Army Photo)

Presentation Notes
Airworthiness qualification is the demonstration of an aircraft or aircraft subsystem or component, including modifications, to function satisfactorily when used and maintained within prescribed limits [25]. The US Army has traditionally qualified systems and components by physical testing, analysis, demonstration, or by similarity. Historically, most systems were federated. They were hardware based, simple and distributed. Now they have become more integrated. They are more software intensive, complex, and have combined functionality contained in one or more computers. With this evolution from simple to more complex, the Army is finding it more difficult to execute an AWR. As systems evolve to more complex systems of systems, this problem is only growing worse.

The current test approach to achieving confidence in systems for an AWR for the US Army is based more on traditional federated avionics systems. Experienced personnel in Software, Vehicle Management Systems, Avionics, Communications, Navigation & Guidance Control, Electrical, Environmental, Human Factors, Electrical and Electro-magnetic Effects (E3), Structures, Thermal, Safety, Integration Testing and Flight Test, along with test pilots, all play important roles in accomplishing test and review of new and existing systems. While some may not consider areas such as thermal or E3 important to software development, they are crucial, since the software is running on physical systems that are affected by heat and susceptibility to electromagnetic radiation, which can cause abnormal operation or bit errors. Current test methodology for hardware relies on MIL-STD-810, MIL-STD-461 and requirements analysis such as traceability. MIL-STD-810 is the Department of Defense Test Method Standard for Environmental Engineering Considerations and Laboratory Tests.

It focuses on equipment environmental design and test limits to the conditions that it will experience throughout its service life, and on establishing chamber test methods that replicate the effects of environments on the equipment rather than imitating the environments themselves. MIL-STD-461 includes the "Requirements for the Control of Electromagnetic Interference Characteristics of Subsystems and Equipment." These test standards are geared for hardware and physical effects and do not address software operation. Current methods for software testing include testing the individual software modules, black box (or interface) level, system integration level and aircraft level testing. System integration testing can include hot bench, system integration labs (SILs) (such as that shown in Figure 4) and aircraft ground level testing as conducted by the Aviation Flight Test Directorate (AFTD) of Redstone Test Center (RTC) (see Figure 5).

Neglecting complete and good requirements promotes risk. It is common practice by Program Managers (PMs) to accept risks. As issues are found in systems, it is on AED to issue an Army Safety Action Memorandum (ASAM) that identifies a deficiency to the field pilots. Concurrently, an Airworthiness Impact Statement (AWIS) is issued to the PMs, which contains a probabilistic analysis of how the identified shortcoming will affect risk. The AMCOM Safety Office will produce its own calculations. The PMs can either accept or reject this assessment. Regardless, cautions and warnings are placed in the AWR to keep the entire program and flight crew aware of issues.

9

Development Challenges

• Legacy aircraft often upgraded in a piecemeal fashion:
– Makes certification difficult
– Desire to increase to modern requirements based on size of upgrade and what it includes – hard to scope

• New system requirements must be clear, complete and testable:
– Certification requirements must be obvious

• Orchestrating agreement between stakeholders is necessary to mitigate:
– Juggling of multiple software builds
– End system that is difficult to test, certify and deploy
– Escalating costs
– System safety from being poorly understood
– Design iterations

Presentation Notes
It would be wonderful if all systems were straightforward in design, easily testable and simple to write an Airworthiness Release (AWR) for; however, that is not the case. Legacy aircraft such as the Chinook have been upgraded in a piecemeal fashion, acquiring much needed improvements in aviation, navigation and communication. The general recommended system development V-curve, as shown earlier, is not always followed in a strict sense, although it should be the goal. Negligence of the proper process makes the establishment of certification very difficult. For new system development and existing system upgrades, requirements must be clear, complete and testable. The certification requirements must be made obvious in the requirements establishment phase, with the goal of being fully identified during requirements development. Orchestrating agreement among all stakeholders (e.g., the program manager, systems engineers, human factors engineers, integrators, test engineers, manufacturers, users and certifiers) is necessary to mitigate problems such as:
- juggling multiple software builds,
- producing a difficult-to-test, difficult-to-certify and difficult-to-deploy system,
- misunderstanding system safety, and
- requiring design iterations that impact schedules and costs.

10

Complexity Issues

• System development costs and schedule increase with complexity:
– Existing lack of schedule and funding resources

• Keeps systems from achieving full compliance with specifications and requirements

• Garbage in → garbage out … poor requirements → poor system:
– Finding problems in new designs at PDR is too late
– Difficult to correct existing poorly designed, fielded complex systems

• Complexity & reliability of complex systems is not fully understood:
– How do we accurately assess operating risk, performance and reliability of complex systems based on limited testing and analysis?
– How do we know when a system design is good enough?
– Latent defects occur in supposedly well-tested, mature systems

• Avionics parts and software change constantly:
– Spiral development → new software/hardware qualification required frequently
– How do we streamline the process (partition the system) so the need for complete re-qualification after changes is lessened?

Presentation Notes
It is well known from an old computer development lesson that if you put "garbage in, you will get garbage out." Likewise, in a system, if you have poor requirements then you will end up with a poor system. Finding problems in new designs at Preliminary Design Review (PDR) is too late. It has been shown that discovering issues at this stage in the life-cycle will cause a reiteration on the design, which costs time and money. Furthermore, dealing with existing fielded complex systems that have not gone through the rigors of proper systems engineering quite often results in a complicated conglomeration of functionality to test, verify, validate, qualify, certify and maintain. This is indicative of where complexity has exceeded our understanding of how to certify a system. We still do not fully understand complexity and how to address reliability of complex systems. How do we accurately assess operating risks, performance and reliability of complex systems based on limited testing and analysis? How do we know when a system design is good enough? How do we modularize spirally developed systems to minimize the need for re-qualification of unchanging portions of the system? We are 30-plus years into this technology and we still deal with systems with latent defects occurring in supposedly well-tested and mature systems. To further exacerbate the problem, we are now dealing with complex systems of systems (i.e., cyber-physical systems).

It is a given that you can keep adding redundancy and complexity to a problem to attain a desired level of reliability, but at some point the reliability will taper off. At best, we sometimes must settle for an optimum point before digressing in reliability. In the same vein, system development costs and schedule increase with complexity too (see Figures 7 and 8).

Avionics parts and software constantly change over the life of a program. Typically, a spiral development program occurs with complex software development, which means that qualification is required frequently. This begs the question of how to streamline the process so that the need to conduct a complete requalification is avoided.

With these complex systems there are other hurdles to cross, such as fully characterizing and conducting the Functional Hazard Assessments (FHAs) and Failure Mode, Effects and Criticality Analyses (FMECAs). It is crucial for the safety assessment that these are conducted correctly to fully understand the risks and later perform the correct tests at the right level. Additionally, once the complex component hardware and software are integrated, yet other problems appear. It is at that time that the disparity of teams crossing multiple contractors and development groups becomes obvious and, if not properly coordinated, could cause impacts to the schedule. Other programmatic problems affect complex system development and qualification. For instance, lack of schedule and funding resources causes a shortcoming in adequately providing for proper compliance with specifications and requirements, short-circuiting the systems engineering process. An ever decreasing availability of trained engineers to support the development and test of such systems exists. Non-technical political influences sometimes affect the reliability of complex systems and are difficult to avoid. Lastly, there is a lack of a centralized database that captures the various families of systems that have been built, along with their characterization of successes and failures. Such a database from all past and present government complex systems could be valuable in establishing a reliability basis for future models.

11

Complexity Issues (continued)

• Functional Hazard Assessments and related documentation are crucial:
– Understanding risks
– Performing the correct tests at the right level: lab test vs. flight test
• Saves flight time and money

• Systems integration for complex systems is a schedule driver

• Need experienced personnel to work complex systems

• Need a centralized database – one just doesn't exist:
– Determine data needed for quantifying reliability of complex systems
– Capture the pertinent data on previous or existing complex systems
– Understand successes and failures of previous and present complex systems
– Establish baseline reliability figures for known architectures

• Complex system of systems exacerbates the problem

12

Reliability vs. Complexity & Cost vs. Complexity

Notional Graphs

• Reliability vs. Complexity (notional): reliability rises with complexity to an optimum point that falls short of the desired level, then declines
• Cost & Schedule vs. Complexity (notional): cost and schedule grow as complexity grows

Aggregation of part reliability feeds into overall system reliability. (The redundancy trade behind the optimum point is illustrated in the sketch below.)
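The break-even behavior in the notional reliability curve can be illustrated with majority-voted redundancy: each added channel improves the vote but adds hardware that can itself fail. A minimal sketch, with illustrative channel and overhead reliabilities that are not from the source:

```python
from math import comb

def majority_vote_reliability(r: float, n: int) -> float:
    """Probability that a strict majority of n identical channels (each reliability r) works."""
    k = n // 2 + 1
    return sum(comb(n, i) * r**i * (1 - r)**(n - i) for i in range(k, n + 1))

# Each channel adds voting/interconnect hardware, modeled as a per-channel penalty.
r_channel, r_overhead = 0.995, 0.9995
for n in (1, 3, 5, 7, 9):
    r_sys = majority_vote_reliability(r_channel, n) * r_overhead**n
    print(f"{n} channels: system reliability = {r_sys:.5f}")
```

With these numbers the system peaks at three channels and then declines, mirroring the optimum point in the graph above.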

Presentation Notes
Avionics parts and software constantly change over the life of a program. Typically, a spiral development program occurs with complex software development, which means that qualification is required frequently. This begs the question of how to streamline the process so that the need to conduct a complete re-qualification is avoided.

With these complex systems there are other hurdles to cross, such as fully characterizing and conducting the Functional Hazard Assessments (FHAs) and Failure Mode, Effects and Criticality Analyses (FMECAs). It is crucial for the safety assessment that these are conducted correctly to fully understand the risks and later perform the correct tests at the right level.

Additionally, once the complex component hardware and software are integrated, yet other problems crop up. It is at that time that the disparity of teams crossing multiple contractors and development groups becomes obvious and, if not properly coordinated, could cause impacts to the schedule.

Other programmatic problems affect complex system development and qualification. For instance, lack of schedule and funding resources causes a shortcoming in adequately providing for proper compliance with specifications and requirements, short-circuiting the systems engineering process. An ever decreasing availability of trained engineers to support the development and test of such systems exists. Non-technical political influences sometimes affect the reliability of complex systems and are difficult to avoid. Lastly, there is a lack of a centralized database that captures the various families of systems that have been built, along with their characterization of successes and failures. Such a database from all past and present government complex systems could be valuable in establishing a reliability basis for future models.

13

A Few Examples of Complex Systems

• This is not a new problem. Others have struggled with the challenges of establishing confidence in complex systems:

– NASA:
• Apollo Guidance Computer
• Dryden F8 Crusader
• Space Shuttle
• International Space Station

– Commercial airliners:
• Airbus A320 and higher
• Boeing B777, B787

– Military:
• Ships and submarines
• Jets (F14, F15, F16, F18, F22, F35, etc.)
• Cargo planes (C130J Hercules, C17 Globemaster, etc.)
• Helicopters (Chinook, Blackhawk, Sea Stallion, etc.)
• Rockets
• Unmanned aerial systems
• Unmanned ground systems
• Unmanned submarine systems

Photos by US Army, NASA, US Navy and US Air Force

Presentation Notes
As mentioned earlier, complex avionics systems are not a new idea. Since the early 1960s, complex avionic architectures have existed, beginning with the Apollo/Saturn program. The Massachusetts Institute of Technology (MIT) Instrumentation Lab (IL), which is now Draper Laboratory, and International Business Machines (IBM) led the way with the MIT/IL Apollo Guidance Computer (AGC) and the Saturn V IBM triple modular redundant (TMR) voting guidance computer system. The word software was not even coined at the time, but engineers such as Margaret Hamilton, MIT/IL Director of Apollo On-board Software, can attest to the fact that some of the same issues with creating reliable software then still exist today [5]. A large majority of the issues then dealt with the communication between systems engineers and the programmers. Requirements were thrown over the wall without confirmation that the requirements were complete, and a lot of the issues cropped up as interface problems. Identifying these issues prompted Hamilton to create her own company and create a modeling language called Universal Systems Language (USL) to head off the problems experienced with Apollo [11]. Some 200-plus modeling programs have been developed since Apollo and used to mitigate issues and increase confidence in systems of varying complexity.

As time progressed, other systems came along. The NASA Dryden F8 Crusader was the first digital fly-by-wire (DFBW) jet aircraft that relied heavily on complex IMA and software for flight control. The Space Transportation System (STS) shuttle includes a Quad Modular Redundant (QMR) system with a fifth backup flight computer containing uncommon code. US Air Force and Naval airplanes that have possessed complex or redundant IMA configurations include the F14 Tomcat, F15 Eagle, F16 Falcon, F18 Hornet, F22 Raptor, F35 Joint Strike Fighter, F117 Nighthawk, V22 Osprey, C17 Globemaster and many more, along with recent Unmanned Air Vehicle Systems (UAVS). The US Army complex systems on helicopters include:
- the RAH-66 Comanche DFBW Triple Modular Redundant (TMR) architecture,
- glass cockpit avionics on the UH-60M Blackhawk baseline,
- the Common Avionics Architecture System (CAAS) glass cockpit on the UH-60M Blackhawk modernization and CH-47F Chinooks,
- and other aircraft.

Additionally, there are many self-checking pair engine controller systems, along with the system-of-systems Future Combat Systems (FCS) and Unmanned Air Vehicle Systems (UAVS). This has also permeated the commercial airliner market with the Airbus 320 and higher Airbus models, Boeing 777 and Boeing 787 aircraft. With this ever increasing technology, something must be done about the reliability issue. With such a wealth of data on aviation and non-aviation cyber-physical systems, such as submarine, ship, nuclear, medical, locomotive and automotive systems, there should be adequate information to get a start on modeling systems correctly for reliability. Therefore, this is not a problem isolated to avionics, and other disciplines should aid in resolving it.

14

Some Complex System Failures

• V-22 Osprey crashes
• Mars Climate Orbiter crash
• Mars Pathfinder software reset
• USS Vincennes downing an Airbus 320
• Therac-25 software radiation treatment failure
• 1989 Airbus A320 air show crash
• China Airlines Airbus Industries A300 crash
• Ariane 5 satellite launcher malfunction
• Failure of the primary flight system to sync with the backup during prelaunch of STS-1
• Mexicana Airlines Boeing 727 airliner crashed into a mountain due to the software not correctly determining the mountain position
• Loss of the first American probe to Venus
• Korean Airlines KAL 901 accident
• Soviet Phobos I Mars probe lost
• Three Mile Island
• F-18 fighter plane crash due to bad exception
• F-14 fighter plane lost to uncontrollable spin
• Swedish Gripen prototype crash
• Swedish Gripen air-show crash
• F-22 failure crossing the IDL
• 2006 German-Spanish Barracuda UAV crash
• 2004 F/A-22 Raptor stealth fighter jet crash
• F/A-22 Raptor navigation system software error at Nellis AFB
• 50 cockpit blackouts on A320
• A320 multiple avionics and electrical failures at Newark, NJ
• Boeing 777 Malaysian Airlines jetliner's nightmarish 3,000-foot autopilot rollercoaster ride
• US Army and Air Force UAV crashes

• … and many more …

Presentation Notes
Multiple crashes have occurred with the V-22 Osprey [41].
In 1999, the Mars Climate Orbiter crashed because of incorrect units in a program, caused by poor systems engineering practices [42, 44].
In 1988, an Airbus 320 was shot down by the USS Vincennes because of cryptic and misleading output displayed by the tracking software [3].
In 1989, an Airbus A320 crashed at an air show due to altitude indication and software handling [3].
In 1994, a China Airlines Airbus Industries A300 crashed, killing 264, from faulty software [3].
In 1996, the first Ariane 5 satellite launcher destruction mishap was caused by a faulty software design error, with a few lines of Ada code containing unprotected variables. Horizontal velocity of the Ariane 5 exceeded that of the Ariane 4, resulting in the guidance system veering the rocket off course. Insufficient testing did not catch this error, which was a carry-over from Ariane 4 [3, 39].
In 1986, a Mexicana Airlines Boeing 727 airliner crashed into a mountain due to the software not correctly determining the mountain position [39].
In 1986, the Therac-25 radiation therapy machines overdosed cancer patients due to a flaw in the computer program controlling the highly automated devices [3, 39, 45].
During the maiden space shuttle launch (STS-1) in 1981, there was a failure of the primary flight control computer system to establish sync with the backup during prelaunch [43].
On December 10, 1990, the Space Shuttle Columbia had to land early due to computer software problems [39].
In 1997, the Mars Pathfinder software reset problem occurred due to latent task execution caused by priority inversion with a mutex [3, 44].
An error in a single FORTRAN statement resulted in the loss of the first American probe to Venus. From G. J. Myers, Software Reliability: Principles & Practice, p. 25 [3].
In September 1999, the Korean Airlines KAL 901 accident in Guam killed 225 out of 254 aboard. A worldwide bug was discovered in barometric altimetry in the Ground Proximity Warning System (GPWS). From ACM SIGSOFT Software Engineering Notes, vol. 23, no. 1 [3].
The Soviet Phobos I Mars probe was lost due to a faulty software update at a cost of 300 million rubles. Its disorientation broke the radio link and the solar batteries discharged before reacquisition. From Aviation Week, 13 Feb 1989 [3].
An F-18 fighter plane crashed due to a missing exception condition. From ACM SIGSOFT Software Engineering Notes, vol. 6, no. 2 [3].
An F-14 fighter plane was lost to an uncontrollable spin, traced to tactical software. From ACM SIGSOFT Software Engineering Notes, vol. 9, no. 5 [3].
In 1989, the Swedish Gripen prototype crashed due to software in its digital fly-by-wire system [3, 46].
In 1995, another Gripen fighter plane crashed during an air show, caused by a software issue [3, 46].
On February 11, 2007, twelve F/A-22 Raptors were forced to head back to Hawaii when a software bug caused a computer crash as they were crossing the International Date Line. In the middle of the ocean, all their systems comprising navigation, fuel and part of the communications systems dumped. All attempts to reboot failed [47].
In February 2006, the German-Spanish Unmanned Combat Air Vehicle Barracuda crashed due to software failure [4].
In December 2004, a glitch in the software for controlling flight probably caused an F/A-22 Raptor stealth fighter jet to crash on takeoff at Nellis Air Force Base [4].
In 2008, a United Airbus A320, registration N462UA, experienced multiple avionics and electrical failures, including loss of all communications, shortly after rotation while departing Newark Liberty International Airport in Newark, New Jersey [NTSB Report Identification DCA08IA033].
In 2006, a Boeing 777 Malaysian Airlines jetliner's autopilot caused a stall to occur by climbing 3,000 feet. Pilots struggled to nose down the plane, but it plunged into a steep dive. After pulling back up, the pilots regained control. The cause was defective flight software providing incorrect data for airspeed and acceleration, confusing the flight computers and initially ignoring the pilot's commands [49].
US Army and Air Force UAV crashes have resulted from control system or human error.

15

Lessons Learned from Failures

• From Nancy Leveson's paper "The Role of Software in Spacecraft Accidents":
– "Flaws in the safety culture, diffusion of responsibility and authority
– Limited communication channels and poor information flow
– Inadequate system and software engineering
– Poor or missing specifications
– Unnecessary complexity and software functionality
– Software reuse or changes without appropriate safety analysis
– [Shortcomings] in safety engineering practices
– Flaws in test and simulation environments
– Inadequate human factors design for software"

Presentation Notes
In Dr. Nancy Leveson's paper [36], "The Role of Software in Spacecraft Accidents," she cited problems with software development issues within NASA on various projects. According to Dr. Leveson, there were "flaws in the safety culture, diffusion of responsibility and authority, limited communication channels and poor information flow, inadequate system and software engineering, poor or missing specifications, unnecessary complexity and software functionality, software reuse or changes without appropriate safety analysis, violation of basic safety engineering practices, inadequate system safety engineering, flaws in test and simulation environments, and inadequate human factors design for software." While these problems were identified for spacecraft development within NASA and corrected, aviation in general could learn from these lessons to mitigate issues with complex systems development.

16

Some Current Guidelines

• DO-178B – Software Considerations in Airborne Systems and Equipment Certification
• DO-248B – Final Report for the Clarification of DO-178B
• DO-278 – Guidelines for Communications, Navigation, Surveillance and Air Traffic Management (CNS/ATM) Systems Software Integrity Assurance
• DO-254 – Design Assurance Guidance for Airborne Electronic Hardware
• DO-297 – Integrated Modular Avionics (IMA) Development Guidance and Certification Considerations
• SAE-ARP4754 – Certification Consideration for Highly Integrated or Complex Aircraft Systems
• SAE-ARP4761 – Guidelines and Methods for Conducting the Safety Assessment Process on Airborne Systems and Equipment
• FAA Advisory Circular AC27-1B – Certification of Normal Category Rotorcraft
• FAA Advisory Circular AC29-2C – Certification of Transport Category Rotorcraft
• ISO/IEC 12207 – Software Life Cycle Processes
• ARINC 653 – Specification Standard for Time and Space Partitioning
• MIL-STD-882D – DoD System Safety
• ADS-51-HDBK – Rotorcraft and Aircraft Qualification Handbook
• AR 70-62 – Airworthiness Release Standard
• SED-SES-PMHFSA 001 – Software Engineering Directorate (SED) Software Engineering Evaluation System (SEES) Program Manager Handbook for Flight Software Airworthiness
• SED-SES-PMHSS 001 – SED SEES Program Manager Handbook for Software Safety

WHAT'S MISSING – Reliability Standard for Complex Systems

Presentation Notes
The problems previously stated drove the development of these guidelines; however, there is no standard for system reliability that includes software. Other standards and circulars pertain to complex systems, but a reliability standard is missing for complex systems, one which would outline the process for establishing cyber-physical system reliability. This standard should indicate how to model, analyze and ascertain the projected level of reliability.

17

Certification Assessment Considerations

• Sufficient data and time must be available for airworthiness evaluation

• Certification process:
– Currently lengthy
– Depends much on human interpretation, trade-offs and risk mitigation
– Overwhelming for complex integrated systems (FHAs, FTAs, FMECAs, risk mitigation, etc.)

• Consistent industry-wide method needed to assess a system at any stage of the life-cycle to allow a tradeoff of design alternatives

• Certification tasks outlined in DO-297 should be considered:
– Task 1: Module acceptance
– Task 2: Application software/hardware acceptance
– Task 3: IMA system acceptance
– Task 4: Aircraft integration of IMA system, including V&V
– Task 5: Change of modules or applications
– Task 6: Reuse of modules or applications

Presentation Notes
In order to execute an AWR, sufficient data must be provided to the appropriate airworthiness authorities to demonstrate that safety requirements are met to the levels per the System Safety Assessment (SSA) or equivalent. The certification process is currently lengthy and depends much on human interpretation of the myriad of complex architecture functions.

The current guidelines, such as DO-178B, DO-254, DO-297, SAE-ARP4754 and SAE-ARP4761, along with many other guidelines, outline the proper steps that should be taken. System safety management's military standard is MIL-STD-882 and has been in use for decades. Civilian safety standards for the aviation industry include SAE ARP4754, which shows the incorporation of system safety activities into the design process and provides guidance on techniques to use to ensure a safe design. SAE ARP4761 contains significant guidance on how to perform the system safety activities spoken about in SAE ARP4754. DO-178B outlines development assurance guidance for aviation software based on the failure condition categorization of the functionality provided by the software. DO-254 embodies similar guidance for aviation hardware. ARINC 653 is a widely accepted standard to ensure time and space partitioning for software. DO-297 does an excellent job of describing the certification tasks to take for an IMA system, which include:
Task 1: Module acceptance
Task 2: Application software/hardware acceptance
Task 3: IMA system acceptance
Task 4: Aircraft integration of IMA systems, including verification and validation
Task 5: Change of modules or applications
Task 6: Reuse of modules or applications

Taken together, these standards provide guidance that, if followed, likely will result in safe, highly reliable and cost-effective systems over the life-cycle of the system. Yet while these guidelines exist, there is not a consistent industry-wide method to assess a system at any stage of the life-cycle to allow a tradeoff of design alternatives. Also, there is not a standard outlining overall reliability for a system to include hardware and software reliability. In order to achieve this level of reliability, a standard should be developed to define the process and method to achieve a quantifiable reliability number that would in turn lead to acceptance.

18

Definition of Complexity and Reliability is Needed

(Diagram) Components 1-4 (Complexity Fundamentals, Reliability Parametrics) at TRL 3 or 4 integrate into Subsystems 1 and 2 (System Integration of Components, Reliability Dependencies) at TRL 6 or 7, which in turn integrate into the Realized System (Reliability Sensitivities) at TRL 8 or 9, yielding a highly reliable complex system backed by a certificate (e.g., AWR). A sketch of this roll-up follows.
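Here is a minimal sketch of the component-to-subsystem-to-system reliability roll-up the diagram implies, assuming independent failures; the component values and the series/parallel structure are hypothetical, not from the source.

```python
from math import prod

def series(rels):
    """All parts must work: R = product of part reliabilities."""
    return prod(rels)

def parallel(rels):
    """Redundant parts: the block works if any one part works."""
    return 1 - prod(1 - r for r in rels)

subsystem1 = series([0.999, 0.998])                    # components 1 and 2
subsystem2 = series([0.997, parallel([0.99, 0.99])])   # component 3 plus a redundant pair
system = series([subsystem1, subsystem2])
print(f"subsystem 1 = {subsystem1:.5f}, subsystem 2 = {subsystem2:.5f}, system = {system:.5f}")
```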

Presentation Notes
As mentioned in the introduction, the overall goal of this paper is to state the need for an industry standard method for qualifying complex integrated systems to a specified reliability. In order to achieve this, the systems must be broken down to component levels and built up to subsystem and system levels. An overall aggregated system reliability value should result. The goal should be to establish the ability to assess the reliability from a component, subsystem and then a system level, with each phase working toward a higher Technology Readiness Level (TRL). The end result would be fed into the accepted Type Certificate (TC) and/or AWR.

19

Analytical Models and Reliability

• Analytical models of hardware reliability are well understood

• Architecture modeling and software reliability modeling is not a novel idea, but is highly debated:
– There are many approaches and little consensus as to the best way
– Many models (Jelinski-Moranda, Littlewood-Verrall, Musa-Okumoto, etc.) [1] (see the sketch after this list)
– Many tools (over 200 tools have been built since the 1970s) [2]

• Predictability of software reliability is of great concern because software is a major contributor to unreliability [2]

• Software reliability is the probability of error-free operation of a computer program in a specified environment for a specified time [1]

• Need a basis for setting reliability figures based on previous systems, and iteratively refine those figures in the future

• NOT A REPLACEMENT FOR TESTING AND VERIFICATION
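As a concrete instance of one of the models named above, here is a minimal sketch of the Musa-Okumoto logarithmic Poisson model; the parameter values are illustrative only, not from the source.

```python
import math

def mo_mean_failures(t: float, lam0: float, theta: float) -> float:
    """Musa-Okumoto expected cumulative failures: mu(t) = ln(lam0*theta*t + 1) / theta."""
    return math.log(lam0 * theta * t + 1) / theta

def mo_intensity(t: float, lam0: float, theta: float) -> float:
    """Current failure intensity: lambda(t) = lam0 / (lam0*theta*t + 1)."""
    return lam0 / (lam0 * theta * t + 1)

lam0, theta = 0.05, 0.02  # initial intensity (failures/hr) and intensity decay per failure
for t in (100, 1000, 10000):
    print(f"t={t:>6} hr: failures so far ~ {mo_mean_failures(t, lam0, theta):5.1f}, "
          f"intensity = {mo_intensity(t, lam0, theta):.5f}/hr")
```

Fitting lam0 and theta to observed failure data is where the debate cited above lies; the model arithmetic itself is straightforward.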

Presentation Notes
As mentioned in the introduction, the overall goal of this paper is to state the need for an industry standard method for qualifying complex integrated systems to a specified reliability. While not suggesting a total solution, better modeling practices should be considered as part of a solution to bridge the gap between design, test and implementation. A method to model the architecture early in the requirements establishment phase, follow the detailed design and coding, and then verify the system along with the model may be a path to greater confidence in the system and reduce the risks, warnings and cautions that must be issued.

In order to achieve this, the systems must be broken down to component levels and built up to subsystem and system levels. An overall aggregated system reliability value should result (see Figure 9). The goal should be to establish the ability to assess the reliability from a component, subsystem and then a system level, with each phase working toward a higher Technology Readiness Level (TRL). The end result would be fed into the accepted Type Certificate (TC) or AWR.

To achieve this goal, modeling and analysis tools that follow a standard for modeling reliability should exist. As previously stated, over 200 tools have been created since the 1970s. Here is a list of a few of the current tools:
Universal Systems Language (USL)
Unified Modeling Language (UML)
Systems Modeling Language (SysML)
MATLAB/Simulink
Telelogic Rhapsody
MathCAD
Colored Petri Nets
Rate Monotonic Analysis (RMA)
STATEMATE (used by Airbus)
Standard for the Development of Safety-Critical Embedded Software (SCADE)
OPNET
Embedded Systems Modeling Language (ESML)
Component Synthesis using Model-Integrated Computing (CoSMIC)
Architecture Analysis and Design Language (AADL)

By no means is this list complete. Typically, different companies and projects address this challenge and choose unique tools to perform the upfront analysis and modeling, not following a standard approach. Multiple tools need to converge or be compatible with the framework of a common tool such as AADL. The resultant tool(s) could be used to verify the requirements up front to mitigate reliability issues down the life-cycle.

20

Tools for Modeling and Analysis

• Universal Systems Language (USL)
• Unified Modeling Language (UML)
• Systems Modeling Language (SysML)
• MATLAB/Simulink
• Telelogic Rhapsody
• MathCAD
• Colored Petri Nets
• Rate Monotonic Analysis (RMA)
• STATEMATE (used by Airbus)
• SCADE
• OPNET
• Embedded Systems Modeling Language (ESML)
• Component Synthesis using Model-Integrated Computing (CoSMIC)
• Architecture Analysis and Design Language (AADL)
• At least 200+ more packages since the '70s
• Certified tools need to converge to an accepted standard modeling/analysis method for complex system reliability

Presentation Notes
Typically, different companies and projects address the challenge and choose unique tools to perform the upfront analysis and modeling, not following a guideline approach. Multiple tools need to converge or be compatible with a set modeling standard for complex systems. The compliant tools could be used to verify the requirements up front to mitigate reliability issues down the life-cycle. A notional approach would follow that shown in Figure 8, where the system V is followed but architectural modeling and analysis go parallel with the real development effort. This would allow reliability to be measured during the design phase and measured during the implementation, test and verification phases using the model. The downside to modeling, in certain circles, is getting people to believe those models. How do you certify a modeling tool and the actual models within the tools? Those issues should be addressed going forward.

21

Modification to Acquisition Model

(Diagram) The development V (Requirements Establishment → High Level Design → Detailed Specifications → Implementation/Coding → Verification → Development Testing → Operational Testing & Validation → Deployed System) with an Architectural Model & Analysis activity running in parallel across all phases: reliability is allocated during design and measured during test. A sketch of this allocate-and-measure loop follows.

Propose standard modeling methodology to be applied at different phases of development to enhance requirements development, reliability allocation, reliability measurement and testing (DISCLAIMER: DOES NOT REPLACE TESTING)
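One way the "reliability allocated / reliability measured" loop could work in practice is a top-down failure-rate budget checked against test results. A minimal sketch, with hypothetical subsystem names, weights and measured rates that are not from the source:

```python
SYSTEM_BUDGET = 1e-6  # failures per flight hour (the tactical target cited earlier)

# Allocate the system budget across subsystems by weight (weights are hypothetical).
weights = {"flight_controls": 0.5, "displays": 0.2, "navigation": 0.3}
allocated = {name: SYSTEM_BUDGET * w for name, w in weights.items()}

# Rates "measured" during development testing (hypothetical numbers).
measured = {"flight_controls": 3.0e-7, "displays": 2.5e-7, "navigation": 4.0e-7}

for name, budget in allocated.items():
    verdict = "OK" if measured[name] <= budget else "EXCEEDS BUDGET"
    print(f"{name:15s} allocated {budget:.1e}/hr, measured {measured[name]:.1e}/hr -> {verdict}")
```

The point of the parallel model is exactly this comparison: the allocation comes out of the architectural model during design, and the measurement comes back from development testing against the same structure.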

Presentation Notes
A notional approach would follow that shown in Figure 10, where the system V is followed but architectural modeling and analysis go parallel with the real development effort. This would allow reliability to be allocated during the design phase and measured during the implementation, test and verification phases using the model. This approach could bridge the design and test phases together. It is emphasized here that the architectural model would not replace critical testing, but augments the process to allow for better requirement identification and verification. Thorough ground and flight tests should never be replaced by modeling. Modeling would only allow for more robust requirements and a higher level of confidence in the requirements and design. The model could be used in conjunction with the testing to confirm the design. Proper modeling and analysis would reduce total program costs by enforcing more complete and correct initial requirements, reducing issues discovered down the road in testing that are expensive or impossible to fix and that force the acceptance of high risks. Additionally, if the model is maintained and optimized, then it could possibly be used after system deployment to analyze impacts of upgrades or changes to the system, allowing for more complete analysis and reduced overall system redesign costs.

A hurdle to cross with modeling and analysis is convincing people to believe those models. Some method to certify these models and modeling tools should be addressed in the future. Standards should be set in place for correct modeling techniques for complex systems. Lastly, consideration of standard verification checking tools should be made, such as the use of the Motor Industry Software Reliability Association (MISRA) compliance verification tool for the use of C in safety critical systems.

22

Systems Reliability Standard Establishment

• Establish a working group to define this standard:
– Need a technical society to lead the charge on this

• Collaborate with industry, academia, military and societies:
– Focus on development of a reliability standard with AWR safety in mind
– Draw upon the experiences to feed into this standard

• Study existing and previous complex systems:
– Shuttle, Space Station, missile systems, nuclear submarine and ship systems, nuclear control systems, military and commercial jet systems
– Obtain software reliability information from these existing and previous systems
– Build a database which would serve as the basis for future reliability

• Research prior efforts in complex systems analysis

• Establish consensus-based modeling and analysis method

Presentation Notes
In conclusion, methods for achieving a design for complex systems do exist; however, achieving reliability and attaining a level of qualification that would permit better AWRs does not. There needs to be a standard developed to tackle this issue, rather than relying on current methods to ascertain airworthiness of complex systems. This will not occur overnight. An orchestrated collaboration among industry, academia, military labs and technical professional societies to focus on development of this standard should allow us to draw upon the experiences to feed into this reliability standard with AWR safety in mind. We have a long-living experiment with complex software systems on the Space Transportation System (STS), International Space Station (ISS), missile systems, nuclear submarine and ship systems, nuclear control systems, and military and commercial jet systems, from which we should be able to obtain at least inferred software reliability information from the architecture and run time that the systems use. We should look at the lessons learned from these systems to see what could have been done to improve, and what was done right that should be carried forward. The challenge is collecting this information into a central database and arriving at some figure of reliability from previous systems where data exist. This would at least provide a starting point to allow initial assessments and could be optimized in the future. Also, this is not the only study for establishing reliability metrics for complex software systems. There have been research projects similar to this effort that have risen and fallen. The data from those projects should not be wasted, but studied to feed into whatever standard is developed. While historical information would be useful, each design is unique and requires tools to accomplish the design. Architectural modeling constructs should be further investigated as a possible augmentation to the design and test process. We need to determine which forum is best to conduct this effort (e.g., SAE, IEEE, AIAA, ACM, AHS, INCOSE or other). As stated in the paper "Space Shuttle Avionics" [31], "The designers, the flight crew and other operational users of the system often have a mindset established in a previous program or experience which results in a bias against new or different 'unconventional' approaches." If nothing is done to address this problem, it will only get worse over time. It is past time to address the issue of reliability of complex systems and software.

23

BACKUP SLIDES

24

PEO Aviation System Safety Management Decision Authority Matrix

[Matrix: severity of the most credible outcome (columns 1 Catastrophic, 2 Critical, 3 Marginal, 4 Negligible) versus probability (rows A Frequent through E Improbable); each cell assigns the risk-acceptance decision authority: Program Management, PEO Aviation, or Army Acquisition.]

Hazard Category / Description:
1 Catastrophic – Death or permanent total disability; system loss
2 Critical – Severe injury or minor occupational illness (no permanent effect); minor system or environmental damage
3 Marginal – Minor injury or minor occupational illness (no permanent effect); minor system or environmental damage
4 Negligible – Less than minor injury or occupational illness (no lost workdays), or less than minor environmental damage

Risk Level / Description / Probability (frequency per 100,000 flight hours):
A Frequent – > 100 (P > 1E-3)
B Probable – <= 100 and > 10 (1E-4 < P <= 1E-3)
C Occasional – <= 10 and > 1 (1E-5 < P <= 1E-4)
D Remote – <= 1 and > 0.1 (1E-6 < P <= 1E-5)
E Improbable – <= 0.1 and > 0.01 (1E-7 < P <= 1E-6)

Presenter
Presentation Notes
As already mentioned, it is the goal of the US Army Aviation Engineering Directorate that the system is safe and reliable to operate and will perform the mission when delivered. Another goal is that the released system continues to safely perform the mission if maintained and operated per the operator's manual. Replacement parts and overhaul work must be high quality to support continued airworthiness. Per the Program Executive Office Memorandum 08-03 Risk Matrix, US Army flight control systems are to achieve 1E-9 (probability of catastrophic failure per flight hour) for flight-critical functions per civil airspace regulations [35] and 1E-6 for tactical use [36]. Quantifying these numbers is established for component hardware but not for software. Just as hardware should have quantifiable reliability, so should software.
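For scale, and assuming a constant failure rate (an illustrative assumption, not something stated in the memo), these per-flight-hour figures translate into mission-level failure probabilities through the exponential model:

$$P_{fail}(t) = 1 - e^{-\lambda t} \approx \lambda t \qquad (\lambda t \ll 1)$$

so a flight-critical function with $\lambda = 10^{-9}$ per flight hour accumulates roughly a $5 \times 10^{-9}$ failure probability over a five-hour mission, while a $10^{-6}$ tactical function accumulates roughly $5 \times 10^{-6}$.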

25

Reliability Defined

• Software Reliability – the probability that a given piece of software will execute without failure in a given environment for a given time [40]
  – Often debated as to how to measure

• Hardware Reliability – the probability that a hardware component performs without failure over time
  – Well defined and established

• System Reliability – the probability of success, or the probability that the system will perform its intended function under specified design limits [over a given amount of time] [39]
  – A combination of software and hardware reliability
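A minimal way to write that combination, assuming hardware and software fail independently (an assumption; common-mode effects can couple them):

$$R_{sys}(t) = R_{hw}(t) \cdot R_{sw}(t), \qquad \text{e.g. } R_{hw} = 0.9999,\ R_{sw} = 0.999 \Rightarrow R_{sys} \approx 0.9989$$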

Presenter
Presentation Notes
"Reliability is defined as the probability of success or the probability that the system will perform its intended function under specified design limits." [39] "Software reliability is defined as the probability that a given piece of software will execute without failure in a given environment for a given time." [40] Systems rely on both and thus must have a combination of the two to formulate an overall reliability.

26

Hardware vs Software Reliability

Hardware: Failure rate follows a bathtub curve; the burn-in state is similar to the software debugging state.
Software: Without considering program evolution, the failure rate is statistically non-increasing.

Hardware: Material deterioration can cause failures even though the system is not used.
Software: Failures never occur if the software is not used.

Hardware: Failure data are fitted to some distributions; the selection of the underlying distribution is based on the analysis of failure data and experience. Emphasis is placed on analyzing failure data.
Software: Most models are analytically derived from assumptions. Emphasis is on developing the model, the interpretation of the model assumptions, and the physical meaning of the parameters.

Hardware: Failures are caused by material deterioration, design errors, misuse, and environment.
Software: Failures are caused by incorrect logic, incorrect statements, or incorrect input data.

Hardware: Can be improved by better design, better materials, redundancy, and accelerated life-cycle testing.
Software: Can be improved by increasing the testing effort and correcting discovered faults; reliability tends to change continuously during testing due to the addition of problems in new code or the removal of problems by debugging.

Hardware: Repairs restore the original condition.
Software: Repairs establish a new piece of software.

Hardware: Failures are usually preceded by warnings.
Software: Failures are rarely preceded by warnings.

Hardware: Components can be standardized.
Software: Components have rarely been standardized.

Hardware: Can usually be tested exhaustively.
Software: Essentially requires infinite testing for completeness.

Reference: [39] Hoang Pham, "Software Reliability", Springer, 2000

Presenter
Presentation Notes
"Reliability is defined as the probability of success or the probability that the system will perform its intended function under specified design limits." [39] "Software reliability is defined as the probability that a given piece of software will execute without failure in a given environment for a given time." [40] Hoang Pham compares software versus hardware reliability with the information shown in Table 2 [39]. Systems rely on both and thus must have a combination of the two to formulate an overall reliability.

27

Acronym List

AADL – Architecture Analysis & Design Language
AC – Advisory Circular (FAA)
ACM – Association for Computing Machinery
AED – Aviation Engineering Directorate (AMRDEC)
AFTD – Aviation Flight Test Directorate (US Army)
AGC – Apollo Guidance Computer
AHS – American Helicopter Society
AIAA – American Institute of Aeronautics and Astronautics, Inc.
AMCOM – Aviation and Missile Command (US Army)
AMRDEC – Aviation and Missile Research, Development and Engineering Center (US Army)
AR – Army Regulation
ARINC – Aeronautical Radio, Inc.
ARP – Aerospace Recommended Practice
ASIF – Avionics Software Integration Facility
ATAM – Architecture Tradeoff Analysis Method
ATM – Air Traffic Management
AWR – Airworthiness Release
CAAS – Common Avionics Architecture System
CH-47 – Cargo Helicopter, Chinook
CMM – Capability Maturity Model
CMMI – Capability Maturity Model Integration
CMU – Carnegie Mellon University
CNS – Communications, Navigation, Surveillance
CoSMIC – Component Synthesis using Model-Integrated Computing
CPS – Cyber-Physical System
CRC – Chemical Rubber Company (i.e., CRC Press)
DFBW – Digital Fly-By-Wire
DoD – Department of Defense
E3 – Electrical and Electromagnetic Effects
ESML – Embedded System Modeling Language
FAA – Federal Aviation Administration
FCS – Future Combat Systems
FHA – Functional Hazard Assessment
FMEA – Failure Modes Effects Analysis

28

Acronym List (concluded)

GPWS – Ground Proximity Warning System
IBM – International Business Machines
IEC – International Electrotechnical Commission
IL – Instrumentation Lab (now Draper Laboratory)
IMA – Integrated Modular Avionics
INCOSE – International Council on Systems Engineering
ISO – International Organization for Standardization
ISS – International Space Station
KAL – Korean Air Lines
MISRA – Motor Industry Software Reliability Association
MIT – Massachusetts Institute of Technology
NASA – National Aeronautics and Space Administration (USA)
PDR – Preliminary Design Review
PEO – Program Executive Office
PNAS – Proceedings of the National Academy of Sciences
RAQ – Rotorcraft and Aircraft Qualification
RMA – Rate Monotonic Analysis
RTC – Redstone Test Center (US Army)
RTTC – Redstone Technical Test Center (US Army)
RTCA – Radio Technical Commission for Aeronautics
SAE – Society of Automotive Engineers
SED – Software Engineering Directorate (AMRDEC)
SEES – Software Engineering Evaluation System
SEI – Software Engineering Institute (CMU)
SIL – System Integration Laboratory
SSA – System Safety Assessment
STS – Space Transportation System
SysML – Systems Modeling Language
TMR – Triple Modular Redundant
TRL – Technical Readiness Level
UAS – Unmanned Aircraft System
UH-60 – Utility Helicopter, Blackhawk
UML – Unified Modeling Language
US – United States
USL – Universal Systems Language

29

References

• [1] Israel Koren and Mani Krishna, "Fault-Tolerant Systems", Morgan Kaufmann, 2007.
• [2] Jiantao Pan, "Software Reliability", Carnegie Mellon University, Spring 1999.
• [3] Nachum Dershowitz, http://www.cs.tau.ac.il/~nachumd/horror.html
• [4] http://www.air-attack.com
• [5] David A. Mindell, "Digital Apollo: Human and Machine in Spaceflight", The MIT Press, 2008.
• [6] Software Engineering Directorate, Software Engineering Evaluation System (SEES), "Program Manager Handbook for Flight Software Airworthiness", SED-SES-PMHFSA 001, December 2003.
• [7] Software Engineering Directorate, Software Engineering Evaluation System (SEES), "Program Manager Handbook for Software Safety", SED-SES-PMHSSA 001, February 2006.
• [8] AMCOM Regulation 385-17, "Software System Safety Policy", 15 March 2008.
• [9] "NASA Software Safety Guidebook", NASA-GB-8719.13, 31 March 2004.
• [10] Margaret Hamilton and William Hackler, "Universal Systems Language: Lessons Learned from Apollo", IEEE Computer Society, 2008.
• [11] Margaret Hamilton, "Full Life Cycle Systems Engineering and Software Development Environment: Development Before The Fact In Action", http://www.htius.com/Articles/Full_Life_Cycle.htm
• [12] Peter Feiler, David Gluch, and John Hudak, "The Architecture Analysis & Design Language (AADL): An Introduction", CMU/SEI-2006-TN-011, February 2006.
• [13] Peter Feiler and John Hudak, "Developing AADL Models for Control Systems: A Practitioner's Guide", CMU/SEI-2007-TR-014, July 2007.
• [14] Bruce Lewis, "Using the Architecture Analysis and Design Language for System Verification and Validation", SEI presentation, 2006.
• [15] Feiler, Gluch, Hudak, and Lewis, "Embedded System Architecture Analysis Using SAE AADL", CMU/SEI-2004-TN-004, June 2004.
• [16] Charles Pecheur and Stacy Nelson, "V&V of Advanced Systems at NASA", NASA/CR-2002-211402, April 2002.
• [17] Systems Integration Requirements Task Group, "ARP 4754: Certification Considerations for Highly-Integrated or Complex Aircraft Systems", SAE Aerospace, 10 April 1996.
• [18] SAE, "ARP 4761: Guidelines and Methods for Conducting the Safety Assessment Process on Civil Airborne Systems and Equipment", December 1996.
• [19] Aeronautical Radio, Inc. (ARINC), "ARINC Specification 653P1-2: Avionics Application Software Standard Interface, Part 1 – Required Services", 7 March 2006.

30

References (Continued)

• [20] Department of Defense, "MIL-STD-882D: Standard Practice for System Safety", 10 February 2000.
• [21] RTCA, Inc., "DO-178B: Software Considerations in Airborne Systems and Equipment Certification", 1 December 1992.
• [22] RTCA, Inc., "DO-254: Design Assurance Guidance for Airborne Electronic Hardware", 19 April 2000.
• [23] US Army, "Aeronautical Design Standard Handbook: Rotorcraft and Aircraft Qualification (RAQ) Handbook", 21 October 1996.
• [24] Cary R. Spitzer (editor), "Avionics: Elements, Software and Functions", CRC Press, 2007.
• [25] US Army, "Army Regulation 70-62: Airworthiness Qualification of Aircraft Systems", 21 May 2007.
• [26] US Army, "Army Regulation 95-1: Aviation Flight Regulations", 3 February 2006.
• [27] Barbacci, Clements, Lattanze, Northrop, and Wood, "Using the Architecture Tradeoff Analysis Method (ATAM) to Evaluate the Software Architecture for a Product Line of Avionics Systems: A Case Study", CMU/SEI-2003-TN-012, July 2003.
• [28] Peter Feiler, "All in the Family: CAAS & AADL", CMU/SEI-2008-SR-021, August 2008.
• [29] Chrissis, Konrad, and Shrum, "CMMI: Guidelines for Process Integration and Product Improvement", Pearson Education, 2007.
• [30] Brendan O'Connell, "Model Driven Performance Analysis for Avionics Systems", Draper Laboratory, January 2006.
• [31] John F. Hanaway and Robert W. Moorehead, "Space Shuttle Avionics Systems", NASA SP-504, 1989.
• [32] Lui Sha, "The Complexity Challenge in Modern Avionics Software", 14 August 2006.
• [33] "Incidents Prompt New Scrutiny of Airplane Software Glitches", Wall Street Journal, 30 May 2006.
• [34] Eyal Ophir, Clifford Nass, and Anthony Wagner, "Cognitive Control in Media Multitaskers", PNAS, 20 July 2009.
• [35] Federal Aviation Administration, "Advisory Circular AC 25.1309-1A: System Design and Analysis", 21 June 1988.
• [36] Program Executive Office Policy Memorandum 08-03.
• [38] National Science Foundation web page on Cyber-Physical Systems, http://www.nsf.gov/pubs/2008/nsf08611/nsf08611.htm
• [39] Hoang Pham, "Software Reliability", Springer, 2000.
• [40] Paul Rook (editor), "Software Reliability Handbook", Elsevier Science Publishers Ltd., 1990.

31

References (Concluded)

• [41] http://www.navair.navy.mil/v22/index.cfm?fuseaction=news.detail&id=128
• [42] http://mars8.jpl.nasa.gov/msp98/news/mco990930.html
• [43] John Garman, "The Bug Heard Around the World", ACM SIGSOFT Software Engineering Notes, October 1981.
• [44] "Mars Pathfinder Mission Status", 15 July 1997, http://marsprogram.jpl.nasa.gov/MPF/newspio/mpf/status/pf970715.html
• [45] Nancy Leveson, "Safeware: System Safety and Computers", Addison-Wesley Publishing Company, 1995.
• [46] http://www.electronicaviation.com/aircraft/JAS-39_Gripen/810
• [47] Brandon Hill, "Lockheed's F-22 Raptor Gets Zapped by International Date Line", DailyTech LLC, 26 February 2007, http://www.freerepublic.com/focus/f-news/1791574/posts
• [48] http://www.military.com/news/article/human-error-cited-in-most-uav-crashes.html
• [49] Daniel Michaels and Andy Pasztor, "Incidents Prompt New Scrutiny of Airplane Software Glitches: As Programs Grow Complex, Bugs Are Hard to Detect; A Jet's Roller-Coaster Ride; Teaching Pilots to Get Control", Wall Street Journal, 30 May 2006.


2

Agenda

• Objective
• Defense Acquisition Approach to Systems Development and Test
• US Army Airworthiness
• AED and Qualification
• Evolution of Helicopter Systems
• Present Approach to Testing
• Development Challenges
• Complexity Issues
• Complexity vs Reliability, Cost and Schedule
• Complex System Examples and Failures
• Lessons Learned from Failures
• Current Guidelines and Certification Assessment Considerations
• Definition of Complexity and Reliability Needed
• Analytical Modeling and Analysis
• Call for System Reliability Standard Establishment

Presenter
Presentation Notes
In summary, this paper will:
– Identify development challenges with complex systems
– Present the current US Army approach to airworthiness and testing
– Discuss the challenges facing reliability and safety
– Survey complexity issues
– Cover the historical aspect of the reliability problem and identify that this is not a new problem
– Itemize some current design guidelines for modeling systems and identify deficiencies
– Address certification assessment considerations
– Encourage development and refinement of standard modeling and analysis tools to place power in the hands of the designer to mitigate issues with system reliability early and throughout the project life-cycle
– Request input from the development and test community to establish standard processes and requirements for qualifying complex systems

The ultimate goal of this paper is not to resolve the problems or evaluate past or existing designs; it is to catch the attention of the government and commercial aviation industry and solicit input into new processes and standards to correct the longstanding issues in the development of complex avionics systems. It will encourage committee formation to address these issues. The United States Army Aviation Engineering Directorate (AED) requests the input of all technical stakeholders and desires not to dictate these requirements but to establish requirements, tools, and guidelines for reliability and qualification as a unified community.

3

Objective

ldquoDevelop an industry standard method for qualifying complex integrated systems to a specified reliabilityrdquo

RAH-66 Comanche Picture by US Army

Presenter
Presentation Notes
Currently there is no method to address qualifying complex integrated systems to a specified reliability. Later discussion in this presentation will point to the fact that hardware reliability has been established and recognized; however, software reliability hasn't reached a consensus on approach. Both hardware and software are major contributors to the overall reliability of a system.

4

Defense Acquisition Approach to Systems Development and Test

Requirements Establishment

Analysis

High Level Design

Detailed Specifications

Implementation Coding

Operational Testing amp Validation

Verification

Development Testing

Deployed System

Presenter
Presentation Notes
The general recommended system development V-curve shown here is not always followed in a strict sense, although it should be the goal. Such neglect of the proper process makes the establishment of certification very difficult. For new system development and existing system upgrades, requirements must be clear, complete, and testable. The certification requirements must be made obvious during the requirements establishment phase, with the goal of being fully identified during requirements development.

5

US Army Airworthiness

Airworthiness Qualification means:
• The system is safe and reliable to operate and will perform the mission when delivered
• It will continue to safely perform the mission if maintained/operated per the manual
• Parts and overhaul work must be high quality to maintain airworthiness
• Flight control systems have high reliability requirements:
  – 1E-9 for civil airspace critical IFR functions [35]
  – 1E-6 for tactical airspace [36]

Presenter
Presentation Notes
The US Army has been involved in flight since the early Wright B flights of 1909. Initially, fixed-wing and rotary-wing aircraft in early aviation remained simple and federated. The basics to aviate, navigate, and communicate were handled by dedicated gauges, compasses, gyroscopes, and mechanical linkages to fly the aircraft. With the dawn of the Space Age in the 1960s, more complex electronic control appeared in aviation with the Apollo/Saturn projects, the NASA Dryden F-8 project, and others. As time progressed, more integrated modular avionics (IMA) and digital fly-by-wire (DFBW) came into military and commercial aircraft to reduce weight, increase functionality, and provide redundancy. With this advancement of complex flight control technology, testing, verification, and validation became problematic. To this day, complex systems have been challenging to qualify for the Federal Aviation Administration (FAA) and military airworthiness commands.

The US Army AED holds the responsibility for airworthiness approval. This includes design approval, production approval, and continued airworthiness. In design approval, AED must determine if the system design or modification meets airworthiness standards. In the milestone decisions, it must determine whether the system is produced or modified in accordance with the approved design. For continued airworthiness, a judgment must be made that the system is operated and maintained in a way that keeps it compliant with the approved design. The bottom line is that the qualified system meets its performance requirements and is reliable when delivered.

Establishing approval for an AWR has become increasingly difficult with the evolution of complex avionic control systems from federated architectures to IMA architectures that rely heavily on complex software systems (i.e., now identified as cyber-physical systems). As stated earlier, AED has the mission to ensure airworthiness qualification for aircraft and subsystems used in the US Army aviation fleet, including helicopters, airplanes, and Unmanned Aircraft Systems (UAS).

6

AED and Qualification

• AED's mission is to ensure airworthiness qualification for aircraft and subsystems used in the US Army fleet

• Airworthiness Qualification is:
  – Demonstration of an aircraft, or aircraft subsystem or component, including modifications, to function safely, meeting performance specifications when used and maintained within prescribed limits (AR 70-62)

• Traditionally qualified systems by:
  – Similarity
  – Analysis
  – Test
  – Demonstration
  – Examination

Presenter
Presentation Notes
Airworthiness qualification is the demonstration of an aircraft, or aircraft subsystem or component, including modifications, to function satisfactorily when used and maintained within prescribed limits [25]. The US Army has traditionally qualified systems and components by physical testing, analysis, demonstration, or similarity.

7

Evolution of Helicopter Systems

• Past systems historically federated
  – Distributed functionality
  – Hardware based
  – Simple, easier to test

• Now systems are becoming integrated
  – Combined functionality
  – More software intensive
  – Complex, more difficult to test

Chief Warrant Officer Jim Beaty (back row, far left) and crew of the Vietnam UH-1 Flying Bikinis (friend of Alex Boydston)

UH-1 Cockpit (US Army)

Chinook CAAS Cockpit (US Army)

CH-47 Chinook (US Army)

Presenter
Presentation Notes
The US Army has been involved in flight since the early Wright B flights of 1909. Initially, fixed-wing and rotary-wing aircraft in early aviation remained simple and federated. The basics to aviate, navigate, and communicate were handled by dedicated gauges, compasses, gyroscopes, and mechanical linkages to fly the aircraft. With the dawn of the Space Age in the 1960s, more complex electronic control appeared in aviation with the Apollo/Saturn projects, the NASA Dryden F-8 project, and others. As time progressed, more integrated modular avionics (IMA) and digital fly-by-wire (DFBW) came into military and commercial aircraft to reduce weight, increase functionality, and provide redundancy. With this advancement of complex flight control technology, testing, verification, and validation became problematic. To this day, complex systems have been challenging to qualify for the Federal Aviation Administration (FAA) and military airworthiness commands.

8

Present Approach to Testing

• Several disciplines weigh in, such as software, avionics, GN&C, environmental, E3, electrical, human factors, structures, aerodynamics, etc.

• Current test methodology per older federated systems:
  – Hardware: MIL-STD-810, MIL-STD-461
  – Requirements Analysis (Traceability)

• Test at different levels:
  – Individual software module testing
  – Black box testing
  – System integration testing
    • Hot bench
    • System Integration Lab (SIL)
  – Aircraft-level testing
    • Ground
    • Flight: Aviation Flight Test Directorate (AFTD) Testing (US Army Photo)

Aviation System Integration Facility (ASIF) (US Army Photo)
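At the lowest of these levels, individual module testing can be as simple as exercising one function against nominal and boundary cases before black-box and integration testing. The module and limits below are hypothetical, sketched only to show the idea:

/* Hypothetical module-level test; the module under test and its
 * limits are invented for illustration. */
#include <assert.h>
#include <stdint.h>

/* Module under test: clamp a reported altitude to a ceiling. */
static uint16_t clamp_alt(uint16_t alt_ft)
{
    return (alt_ft > 60000U) ? 60000U : alt_ft;
}

int main(void)
{
    assert(clamp_alt(0U)     == 0U);      /* lower bound          */
    assert(clamp_alt(60000U) == 60000U);  /* boundary value       */
    assert(clamp_alt(65535U) == 60000U);  /* out-of-range clamps  */
    return 0;
}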

Presenter
Presentation Notes
Airworthiness qualification is the demonstration of an aircraft, or aircraft subsystem or component, including modifications, to function satisfactorily when used and maintained within prescribed limits [25]. The US Army has traditionally qualified systems and components by physical testing, analysis, demonstration, or similarity. Historically, most systems were federated: they were hardware based, simple, and distributed. Now they have become more integrated: they are more software intensive, complex, and have combined functionality contained in one or more computers. With this evolution from simple to more complex, the Army is finding it more difficult to execute an AWR, and as systems evolve into complex systems of systems this problem is only growing worse.

The current test approach for achieving confidence in systems for an AWR is based largely on traditional federated avionics systems. Experienced personnel in software, vehicle management systems, avionics, communications, navigation and guidance control, electrical, environmental, human factors, electrical and electromagnetic effects (E3), structures, thermal, safety, and integration testing, along with flight test personnel and test pilots, all play important roles in accomplishing test and review of new and existing systems. While some may not consider areas such as thermal or E3 important to software development, they are crucial, since the software runs on physical systems that are affected by heat and susceptible to electromagnetic radiation, which can cause abnormal operation or bit errors. Current test methodology for hardware relies on MIL-STD-810, MIL-STD-461, and requirements analysis such as traceability. MIL-STD-810 is the Department of Defense Test Method Standard for Environmental Engineering Considerations and Laboratory Tests; it focuses on equipment environmental design and test limits matched to the conditions the equipment will experience throughout its service life, and on chamber test methods that replicate the effects of environments on the equipment rather than imitating the environments themselves. MIL-STD-461 contains the Requirements for the Control of Electromagnetic Interference Characteristics of Subsystems and Equipment. These test standards are geared toward hardware and physical effects and do not address software operation. Current methods for software testing include testing of individual software modules, black box (interface) level testing, system integration level testing, and aircraft level testing. System integration testing can include hot benches, system integration labs (SILs) such as the one shown in Figure 4, and aircraft ground level testing as conducted by the Aviation Flight Test Directorate (AFTD) of Redstone Test Center (RTC) (see Figure 5).

Neglecting complete and good requirements promotes risk, and it is common practice for Program Managers (PMs) to accept risks. As issues are found in systems, it falls to AED to issue an Army Safety Action Memorandum (ASAM) that identifies a deficiency to the field pilots. Concurrently, an Airworthiness Impact Statement (AWIS) is issued to the PMs containing a probabilistic analysis of how the identified shortcoming will affect risk. The AMCOM Safety Office produces its own calculations. The PMs can either accept or reject this assessment. Regardless, cautions and warnings are placed in the AWR to keep the entire program and flight crew aware of issues.

9

Development Challenges

• Legacy aircraft often upgraded in a piecemeal fashion
  – Makes certification difficult
  – Desire to increase to modern requirements based on size of upgrade and what it includes; hard to scope

• New system requirements must be clear, complete, and testable
  – Certification requirements must be obvious

• Orchestrating agreement between stakeholders is necessary to mitigate:
  – Juggling of multiple software builds
  – An end system that is difficult to test, certify, and deploy
  – Escalating costs
  – System safety being poorly understood
  – Design iterations

Presenter
Presentation Notes
It would be wonderful if all systems were straightforward in design, easily testable, and simple to write an Airworthiness Release (AWR) for; however, that is not the case. Legacy aircraft such as the Chinook have been upgraded in a piecemeal fashion, acquiring much-needed improvements in aviation, navigation, and communication. The general recommended system development V-curve shown earlier is not always followed in a strict sense, although it should be the goal. Neglect of the proper process makes the establishment of certification very difficult. For new system development and existing system upgrades, requirements must be clear, complete, and testable, and the certification requirements must be made obvious during the requirements establishment phase, with the goal of being fully identified during requirements development. Orchestrating agreement among all stakeholders (e.g., the program manager, systems engineers, human factors engineers, integrators, test engineers, manufacturers, users, and certifiers) is necessary to mitigate problems such as juggling multiple software builds, producing a difficult-to-test, difficult-to-certify, and difficult-to-deploy system, misunderstanding system safety, and requiring design iterations that impact schedules and costs.

10

Complexity Issues

• System development costs and schedule increase with complexity
  – Existing lack of schedule and funding resources

• Keeps systems from achieving full compliance with specifications and requirements

• Garbage in -> garbage out… poor requirements -> poor system
  – Finding problems in new designs at PDR is too late
  – Difficult to correct existing poorly designed, fielded complex systems

• Complexity and reliability of complex systems are not fully understood
  – How do we accurately assess operating risk, performance, and reliability of complex systems based on limited testing and analysis?
  – How do we know when a system design is good enough?
  – Latent defects occur in supposedly well-tested, mature systems

• Avionics parts and software change constantly
  – Spiral development -> new software/hardware qualification required frequently
  – How do we streamline the process (partition the system) so the need for complete re-qualification after changes is lessened?

Presenter
Presentation Notes
From an old computing lesson, it is well known that if you put garbage in, you will get garbage out. Likewise, if a system has poor requirements, you will end up with a poor system. Finding problems in new designs at Preliminary Design Review (PDR) is too late; it has been shown that discovering issues at this stage in the life-cycle forces a reiteration of the design, which costs time and money. Furthermore, dealing with existing fielded complex systems that have not gone through the rigors of proper systems engineering quite often results in a complicated conglomeration of functionality to test, verify, validate, qualify, certify, and maintain. This is indicative of complexity having exceeded our understanding of how to certify a system. We still do not fully understand complexity and how to address reliability of complex systems. How do we accurately assess operating risks, performance, and reliability of complex systems based on limited testing and analysis? How do we know when a system design is good enough? How do we modularize spirally developed systems to minimize the need for re-qualification of unchanging portions of the system? We are thirty-plus years into this technology, and we still deal with latent defects occurring in supposedly well-tested and mature systems. To further exacerbate the problem, we are now dealing with complex systems of systems (i.e., cyber-physical systems).

It is a given that you can keep adding redundancy and complexity to attain a desired level of reliability, but at some point the reliability gains taper off; at best we must sometimes settle for an optimum point before reliability digresses. In the same vein, system development costs and schedule increase with complexity too (see Figures 7 and 8). Avionics parts and software constantly change over the life of a program. Typically, complex software development follows a spiral model, which means that qualification is required frequently; this begs the question of how to streamline the process so that complete requalification is avoided.

With these complex systems there are other hurdles to cross, such as fully characterizing and conducting the Functional Hazard Assessments (FHAs) and Failure Mode, Effects, and Criticality Analyses (FMECAs). It is crucial for the safety assessment that these are conducted correctly, to fully understand the risks and later perform the correct tests at the right level. Additionally, once the complex component hardware and software are integrated, yet other problems appear: the disparity of teams crossing multiple contractors and development groups becomes obvious and, if not properly coordinated, could impact the schedule. Other programmatic problems affect complex system development and qualification. For instance, lack of schedule and funding resources undermines proper compliance with specifications and requirements, short-circuiting the systems engineering process. There is an ever-decreasing availability of trained engineers to support the development and test of such systems. Non-technical political influences sometimes affect the reliability of complex systems and are difficult to avoid. Lastly, there is no centralized database that captures the various families of systems that have been built along with their successes and failures; such a database covering past and present government complex systems could be valuable in establishing a reliability basis for future models.

11

Complexity Issues (continued)

• Functional Hazard Assessments and related documentation are crucial for:
  – Understanding risks
  – Performing the correct tests at the right level: lab test vs. flight test
    • Saves flight time and money

• Systems integration for complex systems is a schedule driver

• Need experienced personnel to work complex systems

• Need a centralized database; it just doesn't exist
  – Determine data needed for quantifying reliability of complex systems
  – Capture the pertinent data on previous or existing complex systems
  – Understand successes and failures of previous and present complex systems
  – Establish baseline reliability figures for known architectures

• Complex system of systems exacerbates the problem
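A sketch of what one record in such a database might capture follows; the schema and every field name are invented for illustration, since no such database exists:

/* Hypothetical record for the proposed central reliability database;
 * the schema and field names are invented. */
typedef struct {
    char     system_name[64];   /* e.g. "CAAS cockpit", "STS GPC"     */
    char     architecture[32];  /* "federated", "IMA", or "hybrid"    */
    double   operating_hours;   /* cumulative fleet exposure          */
    unsigned failures;          /* recorded software-induced failures */
    double   observed_rate;     /* failures / operating_hours         */
} reliability_record_t;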

12

Reliability vs Complexity & Cost vs Complexity

Notional Graphs

[Two notional graphs: Reliability vs Complexity, with the desired reliability level marked, and Cost & Schedule vs Complexity, with the optimum point marked. Aggregation of part reliability feeds into overall system reliability.]
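The aggregation of part reliability into an overall system figure can be sketched numerically. Assuming independent parts and invented reliability values, series parts multiply their reliabilities while redundant (parallel) parts multiply their failure probabilities:

/* Notional aggregation of part reliabilities, assuming independence;
 * all numeric values are invented for illustration. */
#include <stdio.h>

static double series(const double r[], int n)    /* all must work */
{
    double out = 1.0;
    for (int i = 0; i < n; i++) out *= r[i];
    return out;
}

static double parallel(const double r[], int n)  /* any may work  */
{
    double fail = 1.0;
    for (int i = 0; i < n; i++) fail *= (1.0 - r[i]);
    return 1.0 - fail;
}

int main(void)
{
    const double computers[] = {0.999, 0.999, 0.999};  /* TMR set */
    const double chain[] = {parallel(computers, 3), 0.9995, 0.998};
    printf("system reliability = %.6f\n", series(chain, 3));
    return 0;
}

Note how redundancy raises the computer set well above any single unit, while the series sensor and actuator terms still cap the overall figure; that is the taper-off effect the notional curves describe.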

Presenter
Presentation Notes
Avionics parts and software constantly change over the life of a program. Typically, complex software development follows a spiral model, which means that qualification is required frequently; this begs the question of how to streamline the process so that complete re-qualification is avoided. With these complex systems there are other hurdles to cross, such as fully characterizing and conducting the Functional Hazard Assessments (FHAs) and Failure Mode, Effects, and Criticality Analyses (FMECAs). It is crucial for the safety assessment that these are conducted correctly, to fully understand the risks and later perform the correct tests at the right level. Additionally, once the complex component hardware and software are integrated, yet other problems crop up: the disparity of teams crossing multiple contractors and development groups becomes obvious and, if not properly coordinated, could impact the schedule. Other programmatic problems affect complex system development and qualification. For instance, lack of schedule and funding resources undermines proper compliance with specifications and requirements, short-circuiting the systems engineering process. There is an ever-decreasing availability of trained engineers to support the development and test of such systems. Non-technical political influences sometimes affect the reliability of complex systems and are difficult to avoid. Lastly, there is no centralized database that captures the various families of systems that have been built along with their successes and failures; such a database covering past and present government complex systems could be valuable in establishing a reliability basis for future models.

13

A Few Examples of Complex Systems

• This is not a new problem. Others have struggled with the challenges of establishing confidence in complex systems:
  – NASA
    • Apollo Guidance Computer
    • Dryden F-8 Crusader
    • Space Shuttle
    • International Space Station
  – Commercial airliners
    • Airbus A320 and later models
    • Boeing 777, 787
  – Military
    • Ships and submarines
    • Jets (F-14, F-15, F-16, F-18, F-22, F-35, etc.)
    • Cargo planes (C-130J Hercules, C-17 Globemaster, etc.)
    • Helicopters (Chinook, Blackhawk, Sea Stallion, etc.)
    • Rockets
    • Unmanned aerial systems
    • Unmanned ground systems
    • Unmanned submarine systems

Photos by US Army, NASA, US Navy, and US Air Force

Presenter
Presentation Notes
As mentioned earlier, complex avionics systems are not a new idea. Complex avionic architectures have existed since the early 1960s, beginning with the Apollo/Saturn program. The Massachusetts Institute of Technology (MIT) Instrumentation Lab (IL), now Draper Laboratory, and International Business Machines (IBM) led the way with the MIT/IL Apollo Guidance Computer (AGC) and the Saturn V IBM triple modular redundant (TMR) voting guidance computer system. The word software had not even been coined at the time, but engineers such as Margaret Hamilton, MIT/IL Director of Apollo On-board Software, can attest that some of the same issues with creating reliable software then still exist today [5]. A large majority of the issues then dealt with the communication between systems engineers and the programmers: requirements were thrown over the wall without confirmation that they were complete, and many of the resulting issues surfaced as interface problems. Identifying these issues prompted Hamilton to create her own company and a modeling language called Universal Systems Language (USL) to head off the problems experienced with Apollo [11]. Some 200-plus modeling programs have been developed since Apollo and used to mitigate issues and increase confidence in systems of varying complexity.

As time progressed, other systems came along. The NASA Dryden F-8 Crusader was the first digital fly-by-wire (DFBW) jet aircraft that relied heavily on complex IMA and software for flight control. The Space Transportation System (STS) shuttle includes a quad modular redundant (QMR) system with a fifth backup flight computer containing uncommon code. US Air Force and Navy airplanes that have possessed complex or redundant IMA configurations include the F-14 Tomcat, F-15 Eagle, F-16 Falcon, F-18 Hornet, F-22 Raptor, F-35 Joint Strike Fighter, F-117 Nighthawk, V-22 Osprey, C-17 Globemaster, and many more, along with recent Unmanned Air Vehicle Systems (UAVS). US Army complex systems on helicopters include the RAH-66 Comanche DFBW triple modular redundant (TMR) architecture, the glass cockpit avionics on the UH-60M Blackhawk baseline, the Common Avionics Architecture System (CAAS) glass cockpit on the UH-60M Blackhawk modernization and CH-47F Chinooks, and other aircraft. Additionally, there are many self-checking-pair engine controller systems, along with system-of-systems Future Combat Systems (FCS) and UAVS. This has also permeated the commercial airliner market with the Airbus A320 and later Airbus models, the Boeing 777, and the Boeing 787. With this ever-increasing technology, something must be done about the reliability issue. With such a wealth of data on aviation and non-aviation cyber-physical systems, such as submarine, ship, nuclear, medical, locomotive, and automotive systems, there should be adequate information to get a start on modeling systems correctly for reliability. Therefore this is not a problem isolated to avionics, and other disciplines should aid in resolving it.

14

Some Complex System Failures

• V-22 Osprey crashes
• Mars Climate Orbiter crash
• Mars Pathfinder software reset
• USS Vincennes downing of an Airbus A300
• Therac-25 software radiation treatment failure
• 1988 Airbus A320 air show crash
• China Airlines Airbus Industrie A300 crash
• Ariane 5 satellite launcher malfunction
• Failure of the primary flight system to sync with the backup during the STS-1 prelaunch
• Mexicana Airlines Boeing 727 crash into a mountain due to the software not correctly determining the mountain's position
• Loss of the first American probe to Venus
• Korean Air Lines KAL 801 accident
• Soviet Phobos I Mars probe lost
• Three Mile Island
• F-18 fighter plane crash due to a bad exception
• F-14 fighter plane lost to an uncontrollable spin
• Swedish Gripen prototype crash
• Swedish Gripen air-show crash
• F-22 failure crossing the IDL
• 2006 German-Spanish Barracuda UAV crash
• 2004 F/A-22 Raptor stealth fighter jet crash
• F/A-22 Raptor navigation system software error at Nellis AFB
• 50 cockpit blackouts on the A320
• A320 multiple avionics and electrical failures at Newark, NJ
• Boeing 777 Malaysian Airlines jetliner's nightmarish 3,000-foot autopilot rollercoaster ride
• US Army and Air Force UAV crashes

• … and many more …

Presenter
Presentation Notes
Multiple crashes have occurred with the V-22 Osprey [41]. In 1999 the Mars Climate Orbiter crashed because of incorrect units in a program, caused by poor systems engineering practices [42, 44]. In 1988 an Iran Air Airbus A300 was shot down by the USS Vincennes, in part because of cryptic and misleading output displayed by the tracking software [3]. In 1988 an Airbus A320 crashed at an air show due to altitude indication and software handling [3]. In 1994 a China Airlines Airbus Industrie A300 crashed, killing 264, from faulty software [3]. In 1996 the first Ariane 5 satellite launcher was destroyed by a faulty software design error: a few lines of Ada code contained unprotected variables, the horizontal velocity of the Ariane 5 exceeded that of the Ariane 4, and the guidance system veered the rocket off course; insufficient testing did not catch this error, which was a carry-over from Ariane 4 [3, 39]. In 1986 a Mexicana Airlines Boeing 727 crashed into a mountain due to the software not correctly determining the mountain's position [39]. In 1986 the Therac-25 radiation therapy machines overdosed cancer patients due to a flaw in the computer program controlling the highly automated devices [3, 39, 45]. During the first shuttle launch attempt in 1981, the primary flight control computer system failed to establish sync with the backup during prelaunch [43]. On December 10, 1990, the Space Shuttle Columbia had to land early due to computer software problems [39]. In 1997 the Mars Pathfinder software reset problem occurred due to latent task execution caused by priority inversion on a mutex [3, 44]. An error in a single FORTRAN statement resulted in the loss of the first American probe to Venus (from G. J. Myers, Software Reliability: Principles & Practice, p. 25) [3]. The Korean Air Lines KAL 801 accident in Guam killed 225 of 254 aboard; a worldwide bug was discovered in barometric altimetry in the Ground Proximity Warning System (GPWS) (from ACM SIGSOFT Software Engineering Notes, vol. 23, no. 1) [3]. The Soviet Phobos I Mars probe was lost due to a faulty software update, at a cost of 300 million rubles; its disorientation broke the radio link, and the solar batteries discharged before reacquisition (from Aviation Week, 13 Feb 1989) [3]. An F-18 crashed due to a missing exception condition (from ACM SIGSOFT Software Engineering Notes, vol. 6, no. 2) [3]. An F-14 was lost to an uncontrollable spin traced to tactical software (from ACM SIGSOFT Software Engineering Notes, vol. 9, no. 5) [3]. In 1989 the Swedish Gripen prototype crashed due to software in its digital fly-by-wire system [3, 46]. In 1995 another Gripen crashed during an air show, caused by a software issue [3, 46]. On February 11, 2007, twelve F/A-22 Raptors were forced to head back to Hawaii when a software bug caused a computer crash as they were crossing the International Date Line; in the middle of the ocean, their navigation, fuel, and part of their communications systems dumped, and all attempts to reboot failed [47]. In February 2006 the German-Spanish unmanned combat air vehicle Barracuda crashed due to a software failure [4]. In December 2004 a glitch in the flight control software probably caused an F/A-22 Raptor stealth fighter to crash on takeoff at Nellis Air Force Base [4]. In 2008 a United Airbus A320, registration N462UA, experienced multiple avionics and electrical failures, including loss of all communications, shortly after rotation while departing Newark Liberty International Airport in Newark, New Jersey [NTSB Report Identification DCA08IA033]. In 2006 a Boeing 777 Malaysian Airlines jetliner's autopilot caused a stall by climbing 3,000 feet; the pilots struggled to nose the plane down, plunged into a steep dive, and regained control after pulling back up; the cause was defective flight software providing incorrect airspeed and acceleration data, confusing the flight computers and initially ignoring the pilots' commands [49]. US Army and Air Force UAVs have crashed from control system or human error [48].

15

Lessons Learned from Failures

• From Nancy Leveson's paper "The Role of Software in Spacecraft Accidents":
  – "Flaws in the safety culture, diffusion of responsibility and authority
  – Limited communication channels and poor information flow
  – Inadequate system and software engineering
  – Poor or missing specifications
  – Unnecessary complexity and software functionality
  – Software reuse or changes without appropriate safety analysis
  – [Shortcomings] in safety engineering practices
  – Flaws in test and simulation environments
  – Inadequate human factors design for software"

Presenter
Presentation Notes
In Dr. Nancy Leveson's paper "The Role of Software in Spacecraft Accidents", she cited problems with software development on various NASA projects. According to Dr. Leveson, there were "flaws in the safety culture, diffusion of responsibility and authority, limited communication channels and poor information flow, inadequate system and software engineering, poor or missing specifications, unnecessary complexity and software functionality, software reuse or changes without appropriate safety analysis, violation of basic safety engineering practices, inadequate system safety engineering, flaws in test and simulation environments, and inadequate human factors design for software." While these problems were identified for spacecraft development within NASA and corrected, aviation in general could learn from these lessons to mitigate issues with complex systems development.

16

Some Current Guidelines

• DO-178B – Software Considerations in Airborne Systems and Equipment Certification
• DO-248B – Final Report for the Clarification of DO-178B
• DO-278 – Guidelines for Communications, Navigation, Surveillance and Air Traffic Management (CNS/ATM) Systems Software Integrity Assurance
• DO-254 – Design Assurance Guidance for Airborne Electronic Hardware
• DO-297 – Integrated Modular Avionics (IMA) Development Guidance and Certification Considerations
• SAE ARP4754 – Certification Considerations for Highly-Integrated or Complex Aircraft Systems
• SAE ARP4761 – Guidelines and Methods for Conducting the Safety Assessment Process on Airborne Systems and Equipment
• FAA Advisory Circular AC 27-1B – Certification of Normal Category Rotorcraft
• FAA Advisory Circular AC 29-2C – Certification of Transport Category Rotorcraft
• ISO/IEC 12207 – Software Life Cycle Processes
• ARINC 653 – Specification Standard for Time and Space Partitioning
• MIL-STD-882D – DoD System Safety
• ADS-51-HDBK – Rotorcraft and Aircraft Qualification Handbook
• AR 70-62 – Airworthiness Release Standard
• SED-SES-PMHFSA 001 – Software Engineering Directorate (SED) Software Engineering Evaluation System (SEES) Program Manager Handbook for Flight Software Airworthiness
• SED-SES-PMHSS 001 – SED SEES Program Manager Handbook for Software Safety

WHAT'S MISSING: A Reliability Standard for Complex Systems

Presenter
Presentation Notes
The problems previously stated drove the development of these guidelines; however, there is no standard for system reliability that includes software. Other standards and circulars pertain to complex systems, but a reliability standard is missing, one that would outline the process for establishing cyber-physical system reliability. This standard should indicate how to model, analyze, and ascertain the projected level of reliability.

17

Certification Assessment Considerations

• Sufficient data and time must be available for airworthiness evaluation

• Certification process:
  – Currently lengthy
  – Depends much on human interpretation, trade-offs, and risk mitigation
  – Overwhelming for complex integrated systems (FHAs, FTAs, FMECAs, risk mitigation, etc.)

• Need a consistent industry-wide method to assess a system at any stage of the life-cycle to allow a tradeoff of design alternatives

• Certification tasks outlined in DO-297 should be considered:
  – Task 1: Module acceptance
  – Task 2: Application software/hardware acceptance
  – Task 3: IMA system acceptance
  – Task 4: Aircraft integration of IMA system, including V&V
  – Task 5: Change of modules or applications
  – Task 6: Reuse of modules or applications

Presenter
Presentation Notes
In order to execute an AWR, sufficient data must be provided to the appropriate airworthiness authorities to demonstrate that safety requirements are met to the levels per the System Safety Assessment (SSA) or equivalent. The certification process is currently lengthy and depends on much human interpretation of the myriad of complex architecture functions. The current guidelines, such as DO-178B, DO-254, DO-297, SAE ARP4754, and SAE ARP4761, along with many others, outline the proper steps that should be taken. System safety management's military standard is MIL-STD-882, which has been in use for decades. Civilian safety standards for the aviation industry include SAE ARP4754, which shows the incorporation of system safety activities into the design process and provides guidance on techniques to ensure a safe design; SAE ARP4761 contains significant guidance on how to perform the system safety activities described in SAE ARP4754. DO-178B outlines development assurance guidance for aviation software based on the failure condition categorization of the functionality provided by the software, and DO-254 embodies similar guidance for aviation hardware. ARINC 653 is a widely accepted standard to ensure time and space partitioning for software. DO-297 does an excellent job of describing the certification tasks for an IMA system, which include: Task 1, module acceptance; Task 2, application software/hardware acceptance; Task 3, IMA system acceptance; Task 4, aircraft integration of IMA systems, including verification and validation; Task 5, change of modules or applications; and Task 6, reuse of modules or applications. Taken together, these standards provide guidance that, if followed, will likely result in safe, highly reliable, and cost-effective systems over the life-cycle. Yet while these guidelines exist, there is no consistent industry-wide method to assess a system at any stage of the life-cycle to allow a tradeoff of design alternatives, nor is there a standard outlining overall reliability for a system that includes both hardware and software reliability. To achieve this level of reliability, a standard should be developed to define the process and method for arriving at a quantifiable reliability number that would in turn lead to acceptance.

18

Definition of Complexity and Reliability is Needed

[Diagram: Components 1 through 4, each characterized by complexity fundamentals and reliability parametrics (TRL 3 or 4), integrate into Subsystems 1 and 2, each characterized by system integration of components and reliability dependencies (TRL 6 or 7); these in turn integrate into the realized system, characterized by reliability sensitivities (TRL 8 or 9), yielding a highly reliable complex system and a certificate (e.g., AWR).]

Presenter
Presentation Notes
As mentioned in the introduction, the overall goal of this paper is to state the need for an industry standard method for qualifying complex integrated systems to a specified reliability. To achieve this, systems must be broken down to component levels and built up to subsystem and system levels, with an overall aggregated system reliability value as the result. The goal should be to establish the ability to assess reliability at the component, subsystem, and then system level, with each phase working toward a higher Technical Readiness Level (TRL). The end result would feed into the accepted Type Certificate (TC) and/or AWR.

19

Analytical Models and Reliability

• Analytical models of hardware reliability are well understood

• Architecture modeling and software reliability modeling is not a novel idea, but it is highly debated
  – There are many approaches and little consensus as to the best way
  – Many models (Jelinski-Moranda, Littlewood-Verrall, Musa-Okumoto, etc.) [1]
  – Many tools (over 200 tools built since the 1970s) [2]

• Predictability of software reliability is of great concern because software is a major contributor to unreliability [2]

• Software reliability is the probability of error-free operation of a computer program in a specified environment for a specified time [1]

• Need a basis for setting reliability figures based on previous systems, iteratively refined in the future

• NOT A REPLACEMENT FOR TESTING AND VERIFICATION
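As one concrete instance of the models named above, the sketch below implements the Musa-Okumoto logarithmic Poisson model, chosen arbitrarily from the list; the parameter values are invented for illustration.

/* Musa-Okumoto logarithmic Poisson software reliability growth model;
 * parameter values are invented for illustration. */
#include <math.h>
#include <stdio.h>

/* Expected cumulative failures observed by execution time t. */
static double mu(double t, double lambda0, double theta)
{
    return (1.0 / theta) * log(lambda0 * theta * t + 1.0);
}

/* Failure intensity (failures per hour) remaining at time t. */
static double intensity(double t, double lambda0, double theta)
{
    return lambda0 / (lambda0 * theta * t + 1.0);
}

int main(void)
{
    const double lambda0 = 0.05; /* initial failures per test hour (assumed) */
    const double theta   = 0.02; /* intensity decay per failure (assumed)    */
    for (double t = 0.0; t <= 1000.0; t += 250.0)
        printf("t = %6.0f h   mu = %6.2f   intensity = %.5f /h\n",
               t, mu(t, lambda0, theta), intensity(t, lambda0, theta));
    return 0;
}

Fitting lambda0 and theta to observed failure logs, and deciding whether the fit extrapolates, is precisely where the lack of consensus noted above shows up.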

Presenter
Presentation Notes
As mentioned in the introduction, the overall goal of this paper is to state the need for an industry standard method for qualifying complex integrated systems to a specified reliability. While not suggesting a total solution, better modeling practices should be considered as a way to bridge the gap between design, test, and implementation. A method to model the architecture early in the requirements establishment phase, follow the detailed design and coding, and then verify the system along with the model may be a path to greater confidence in the system, reducing the risks, warnings, and cautions that must be issued. To achieve this, systems must be broken down to component levels and built up to subsystem and system levels, with an overall aggregated system reliability value as the result (see Figure 9). The goal should be to establish the ability to assess reliability at the component, subsystem, and then system level, with each phase working toward a higher Technical Readiness Level (TRL); the end result would feed into the accepted Type Certificate (TC) or AWR. To reach this goal, modeling and analysis tools that follow a standard for modeling reliability should exist. As previously stated, over 200 tools have been created since the 1970s; this slide lists a few of the current ones, and by no means is the list complete. Typically, different companies and projects address this challenge by choosing individual tools to perform the upfront analysis and modeling, not following a standard approach. Multiple tools need to converge, or be compatible within the framework of a common tool such as AADL. The resultant tool(s) could be used to verify the requirements up front to mitigate reliability issues down the life-cycle.

20

Tools for Modeling and Analysis

• Universal Systems Language (USL)
• Unified Modeling Language (UML)
• Systems Modeling Language (SysML)
• MATLAB/Simulink
• Telelogic Rhapsody
• MathCAD
• Colored Petri Nets
• Rate Monotonic Analysis (RMA)
• STATEMATE (used by Airbus)
• SCADE
• OPNET
• Embedded System Modeling Language (ESML)
• Component Synthesis using Model-Integrated Computing (CoSMIC)
• Architectural Analysis and Design Language (AADL)
• At least 200 more packages since the '70s
• Certified tools need to converge to an accepted standard modeling/analysis method for complex system reliability

Presenter
Presentation Notes
Typically, different companies and projects address the challenge and choose individual tools to perform the upfront analysis and modeling, not following a guideline approach. Multiple tools need to converge or be compatible with a set modeling standard for complex systems. The compliant tools could be used to verify the requirements up front to mitigate reliability issues down the life-cycle. A notional approach would follow that shown in Figure 8, where the system V is followed but Architectural Modeling and Analysis runs parallel with the real development effort. This would allow reliability to be allocated during the design phase and measured during the implementation, test, and verification phase using the model. The downside to modeling, in certain circles, is getting people to believe those models. How do you certify a modeling tool, and the actual models within the tool? Those issues should be addressed going forward.

21

Modification to Acquisition Model

[Diagram: the development V — Requirements Establishment → High Level Design → Detailed Specifications → Implementation / Coding → Verification → Development Testing → Operational Testing & Validation → Deployed System — with an "Architectural Model & Analysis" activity running in parallel alongside the V. Reliability is allocated on the design leg and measured on the test leg.]

Propose a standard modeling methodology to be applied at different phases of development to enhance requirements development, reliability allocation, reliability measurement, and testing (DISCLAIMER: DOES NOT REPLACE TESTING).

Presenter
Presentation Notes
A notional approach would follow that shown in Figure 10, where the system V is followed but Architectural Modeling and Analysis runs parallel with the real development effort. This would allow reliability to be allocated during the design phase and measured during the implementation, test, and verification phase using the model. This approach could bridge the design and test phases together. It is emphasized here that the architectural model would not replace critical testing but augments the process to allow for better requirement identification and verification. Thorough ground and flight tests should never be replaced by modeling. Modeling would only allow for more robust requirements and design and a higher level of confidence in them. The model could be used in conjunction with the testing to confirm the design. Proper modeling and analysis would reduce total program costs by enforcing more complete and correct initial requirements, reducing the issues discovered later in testing that are expensive or impossible to fix and that force programs to accept high risks. Additionally, if the model is maintained and optimized, it could possibly be used after system deployment to analyze the impacts of upgrades or changes to the system, allowing for more complete analysis and reducing overall system redesign costs.
A hurdle to cross with modeling and analysis is convincing people to believe those models. Some method to certify these models and modeling tools should be addressed in the future. Standards should be set in place for correct modeling techniques for complex systems. Lastly, consideration should be given to standard verification checking tools, such as the Motor Industry Software Reliability Association (MISRA) compliance verification tooling for the use of C in safety-critical systems.
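A minimal sketch of the allocate-then-measure idea on the two legs of the V (the budget split, subsystem names, and measured values are all hypothetical; the 1E-9 figure for flight-critical functions comes from [35]):

FLIGHT_CRITICAL_BUDGET = 1e-9  # failures per flight hour for critical functions [35]

# Design leg: apportion the system budget across subsystems by assessed weight.
weights = {"flight_controls": 0.5, "cockpit_displays": 0.3, "data_bus": 0.2}
allocated = {name: FLIGHT_CRITICAL_BUDGET * w for name, w in weights.items()}

# Test leg: compare measured point estimates (from the model, lab, or flight test)
# against the design allocation.
measured = {"flight_controls": 4.0e-10, "cockpit_displays": 3.5e-10, "data_bus": 1.0e-10}
for name, budget in allocated.items():
    verdict = "meets allocation" if measured[name] <= budget else "EXCEEDS allocation"
    print(f"{name}: allocated {budget:.1e}, measured {measured[name]:.1e} -> {verdict}")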

22

Systems Reliability Standard Establishment

• Establish a working group to define this standard
  – Need a technical society to lead the charge on this

• Collaborate with industry, academia, military, and societies
  – Focus on development of a reliability standard with AWR safety in mind
  – Draw upon the experiences to feed into this standard

• Study existing and previous complex systems
  – Shuttle, Space Station, missile systems, nuclear submarine and ship systems, nuclear control systems, military and commercial jet systems
  – Obtain software reliability information from existing and previous systems
  – Build a database which would serve as a basis for future reliability figures (a record sketch follows below)

• Research prior efforts in complex systems analysis

• Establish a consensus-based modeling and analysis method
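As a purely illustrative sketch of what one record in such a database might capture (every field name here is invented; a real standard would define the schema and the statistics):

from dataclasses import dataclass

@dataclass
class ComplexSystemRecord:
    system: str             # e.g., a shuttle, submarine, or rotorcraft program
    architecture: str       # "federated", "IMA", or "hybrid"
    operating_hours: float  # accumulated fleet or run-time hours
    failures: int           # relevant software-attributed failures observed
    source: str             # program office, SIL, or fielded data

    def failure_rate(self):
        # Naive failures-per-hour point estimate for seeding future baselines;
        # a real standard would define the estimators and confidence bounds.
        return self.failures / self.operating_hours if self.operating_hours else None

record = ComplexSystemRecord("hypothetical fly-by-wire program", "IMA", 2.5e5, 3, "SIL")
print(record.failure_rate())  # -> 1.2e-05 failures per operating hour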

Presenter
Presentation Notes
In conclusion, methods for achieving a design for complex systems do exist; however, achieving reliability and attaining a level of qualification that would permit better AWRs does not. There needs to be a standard developed to tackle this issue rather than relying on current methods to ascertain airworthiness of complex systems. This will not occur overnight. An orchestrated collaboration among industry, academia, military labs, and technical professional societies to focus on development of this standard should allow us to draw upon the experiences to feed into this reliability standard with AWR safety in mind. We have a long-living experiment with complex software systems on the Space Transportation System (STS), International Space Station (ISS), missile systems, nuclear submarine and ship systems, nuclear control systems, and military and commercial jet systems, from which we should be able to obtain at least inferred software reliability information from the architecture and run time that the systems use. We should look at the lessons learned from these systems to see what could have been done to improve and what was done right that should be carried forward. The challenge is collecting this information into a central database and arriving at some figure of reliability from previous systems where data exists. This would at least provide a starting point to allow initial assessments and could be refined in the future. Also, this is not the only study for establishing reliability metrics for complex software systems. There have been similar research projects that have risen and fallen; the data from those projects should not be wasted but studied to feed into whatever standard is developed. While historical information would be useful, each design is unique and requires tools to accomplish the design. Architectural modeling constructs should be further investigated as a possible augmentation to the design and test process. We need to determine which forum is best to conduct this effort (e.g., SAE, IEEE, AIAA, ACM, AHS, INCOSE, or other). As stated in the paper "Space Shuttle Avionics" [31]: "The designers, the flight crew and other operational users of the system often have a mindset established in a previous program or experience which results in a bias against new or different 'unconventional' approaches." If nothing is done to address this problem, it will only get worse over time. It is past time to address the issue of reliability of complex systems and software.

23

BACKUP SLIDES

24

PEO Aviation System Safety Management Decision Authority Matrix

[Matrix: severity columns — 1 Catastrophic, 2 Critical, 3 Marginal, 4 Negligible — against frequency rows A–E below; each cell assigns the risk-acceptance decision authority among Army Acquisition, PEO Aviation, and Program Management.]

Severity (Most Credible) vs. frequency bands (per flight hour):
A – Frequent: P > 1E-3
B – Probable: 1E-4 < P <= 1E-3
C – Occasional: 1E-5 < P <= 1E-4
D – Remote: 1E-6 < P <= 1E-5
E – Improbable: 1E-7 < P <= 1E-6

Hazard Category – Description:
1 Catastrophic – Death or permanent total disability, system loss
2 Critical – Severe injury or minor occupational illness (no permanent effect), minor system or environmental damage
3 Marginal – Minor injury or minor occupational illness (no permanent effect), minor system or environmental damage
4 Negligible – Less than minor injury or occupational illness (no lost workdays), or less than minor environmental damage

Risk Level – Description – Probability (frequency per 100,000 flight hours):
A – Frequent – > 100 (P > 1E-3)
B – Probable – <= 100 and > 10 (1E-4 < P <= 1E-3)
C – Occasional – <= 10 and > 1 (1E-5 < P <= 1E-4)
D – Remote – <= 1 and > 0.1 (1E-6 < P <= 1E-5)
E – Improbable – <= 0.1 and > 0.01 (1E-7 < P <= 1E-6)
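The probability bands above map mechanically onto the risk-level letters; a small sketch of that lookup (thresholds copied from the table, per flight hour):

def risk_level(p):
    # Map a per-flight-hour hazard probability to the matrix frequency letter.
    if p > 1e-3:
        return "A"  # Frequent
    if p > 1e-4:
        return "B"  # Probable
    if p > 1e-5:
        return "C"  # Occasional
    if p > 1e-6:
        return "D"  # Remote
    if p > 1e-7:
        return "E"  # Improbable
    return "below matrix floor (P <= 1E-7)"

print(risk_level(2e-5))  # -> "C" (Occasional)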

Presenter
Presentation Notes
As already mentioned, it is the goal of the US Army Aviation Engineering Directorate that the system is safe and reliable to operate and will perform the mission when delivered. Another goal is that the released system continues to safely perform the mission if maintained and operated per the operator's manual. Replacement parts and overhaul work must be high quality to support continued airworthiness. Per the Program Executive Office Memorandum 08-03 Risk Matrix, US Army flight control systems are to achieve 1E-9 reliability for flight-critical functions per civil airspace regulations [35] and 1E-6 reliability for tactical use [36]. Quantifying these numbers is established for component hardware but not for software. Just as hardware should have quantifiable reliability, so should software.

25

Reliability Defined

• Software Reliability – the probability that a given piece of software will execute without failure in a given environment for a given time [40]
  – Often debated as to how to measure

• Hardware Reliability – the probability that a hardware component operates without failure over time
  – Well defined and established

• System Reliability – the probability of success, or the probability that the system will perform its intended function under specified design limits [over a given amount of time] [39]
  – A combination of software and hardware reliability (see the formula below)
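Expressed as a worked formula (assuming, for illustration only, that hardware and software failure processes are independent and in series, so that either failing fails the system — an assumption this slide does not itself state):

% System reliability as the product of hardware and software reliability:
R_{\mathrm{sys}}(t) = R_{\mathrm{hw}}(t) \cdot R_{\mathrm{sw}}(t)
% With constant failure rates, R(t) = e^{-\lambda t}, so the rates add:
\lambda_{\mathrm{sys}} = \lambda_{\mathrm{hw}} + \lambda_{\mathrm{sw}}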

Presenter
Presentation Notes
"Reliability is defined as the probability of success or the probability that the system will perform its intended function under specified design limits" [39]. "Software reliability is defined as the probability that a given piece of software will execute without failure in a given environment for a given time" [40]. Systems rely on both and thus must have a combination of the two to formulate an overall reliability.

26

Hardware vs Software Reliability

• Hardware: Failure rate has a bathtub curve; the burn-in state is similar to the software debugging state.
  Software: Without considering program evolution, failure rate is statistically non-increasing.

• Hardware: Material deterioration can cause failures even though the system is not used.
  Software: Failures never occur if the software is not used.

• Hardware: Failure data are fitted to some distributions; the selection of the underlying distribution is based on the analysis of failure data and experience. Emphasis is placed on analyzing failure data.
  Software: Most models are analytically derived from assumptions; emphasis is on developing the model, the interpretation of the model assumptions, and the physical meaning of the parameters.

• Hardware: Failures are caused by material deterioration, design errors, misuse, and environment.
  Software: Failures are caused by incorrect logic, incorrect statements, or incorrect input data.

• Hardware: Can be improved by better design, better material, applying redundancy, and accelerated life-cycle testing.
  Software: Can be improved by increasing testing effort and correcting discovered faults. Reliability tends to change continuously during testing due to the addition of problems in new code or the removal of problems by debugging.

• Hardware: Repairs restore the original condition.
  Software: Repairs establish a new piece of software.

• Hardware: Failures are usually preceded by warnings.
  Software: Failures are rarely preceded by warnings.

• Hardware: Components can be standardized.
  Software: Components have rarely been standardized.

• Hardware: Can usually be tested exhaustively.
  Software: Essentially requires infinite testing for completeness.

Reference: [39] Hoang Pham, "Software Reliability", Springer, 2000.

Presenter
Presentation Notes
"Reliability is defined as the probability of success or the probability that the system will perform its intended function under specified design limits" [39]. "Software reliability is defined as the probability that a given piece of software will execute without failure in a given environment for a given time" [40]. Hoang Pham compares software versus hardware reliability as shown in Table 2 [39]. Systems rely on both and thus must have a combination of the two to formulate an overall reliability.

27

Acronym List

AADL – Architectural Analysis and Design Language
AC – Advisory Circular (FAA)
ACM – Association for Computing Machinery
AED – Aviation Engineering Directorate (AMRDEC)
AFTD – Aviation Flight Test Directorate (US Army)
AGC – Apollo Guidance Computer
AHS – American Helicopter Society
AIAA – American Institute of Aeronautics and Astronautics (Inc.)
AMCOM – Aviation and Missile Command (US Army)
AMRDEC – Aviation and Missile Research, Development and Engineering Center (US Army)
AR – Army Regulation
ARINC – Aeronautical Radio, Inc.
ARP – Aerospace Recommended Practice
ASIF – Avionics Software Integration Facility
ATAM – Architecture Tradeoff Analysis Method
ATM – Air Traffic Management
AWR – Airworthiness Release
CAAS – Common Avionics Architecture System
CH-47 – Cargo Helicopter, Chinook
CMM – Capability Maturity Model
CMMI – Capability Maturity Model Integration
CMU – Carnegie Mellon University
CNS – Communications, Navigation, Surveillance
CoSMIC – Component Synthesis using Model-Integrated Computing
CPS – Cyber-Physical System
CRC – Chemical Rubber Company (i.e., CRC Press)
DFBW – Digital Fly-By-Wire
DoD – Department of Defense
E3 – Electrical and Electromagnetic Effects
ESML – Embedded System Modeling Language
FAA – Federal Aviation Administration
FCS – Future Combat Systems
FHA – Functional Hazard Assessment
FMEA – Failure Modes Effects Analysis

28

Acronym List (concluded)

GPWS – Ground Proximity Warning System
IBM – International Business Machines
IEC – International Engineering Consortium
IL – Instrumentation Lab (now Draper Laboratory)
IMA – Integrated Modular Avionics
INCOSE – International Council On Systems Engineering
ISO – International Organization for Standardization
ISS – International Space Station
KAL – Korean Airlines
MISRA – Motor Industry Software Reliability Association
MIT – Massachusetts Institute of Technology
NASA – National Aeronautics and Space Administration (USA)
PDR – Preliminary Design Review
PEO – Program Executive Office
PNAS – Proceedings of the National Academy of Sciences
RAQ – Rotorcraft and Aircraft Qualification
RMA – Rate Monotonic Analysis
RTC – Redstone Test Center (US Army)
RTTC – Redstone Technical Test Center (US Army)
RTCA – Radio Technical Commission for Aeronautics
SAE – Society of Automotive Engineers
SED – Software Engineering Directorate (AMRDEC)
SEES – Software Engineering Evaluation System
SEI – Software Engineering Institute (CMU)
SIL – System Integration Laboratory
SSA – System Safety Assessment
STS – Space Transportation System
SysML – Systems Modeling Language
TMR – Triple Modular Redundant
TRL – Technical Readiness Level
UAS – Unmanned Aircraft System
UH-60 – Utility Helicopter, Blackhawk
UML – Unified Modeling Language
US – United States
USL – Universal Systems Language

29

References

• [1] Israel Koren and Mani Krishna, "Fault-Tolerant Systems", Morgan Kaufmann, 2007.
• [2] Jiantao Pan, "Software Reliability", Carnegie Mellon University, Spring 1999.
• [3] Nachum Dershowitz, http://www.cs.tau.ac.il/~nachumd/horror.html
• [4] http://www.air-attack.com
• [5] David A. Mindell, "Digital Apollo: Human and Machine in Spaceflight", The MIT Press, 2008.
• [6] Software Engineering Directorate, Software Engineering Evaluation System (SEES), "Program Manager Handbook for Flight Software Airworthiness", SED-SES-PMHFSA 001, December 2003.
• [7] Software Engineering Directorate, Software Engineering Evaluation System (SEES), "Program Manager Handbook for Software Safety", SED-SES-PMHSSA 001, February 2006.
• [8] AMCOM Regulation 385-17, "Software System Safety Policy", 15 March 2008.
• [9] "NASA Software Safety Guidebook", NASA-GB-8719.13, 31 March 2004.
• [10] Margaret Hamilton and William Hackler, "Universal Systems Language: Lessons Learned from Apollo", IEEE Computer Society, 2008.
• [11] Margaret Hamilton, "Full Life Cycle Systems Engineering and Software Development Environment: Development Before The Fact In Action", http://www.htius.com/Articles/Full_Life_Cycle.htm
• [12] Peter Feiler, David Gluch, and John Hudak, "The Architecture Analysis & Design Language (AADL): An Introduction", CMU/SEI-2006-TN-011, February 2006.
• [13] Peter Feiler and John Hudak, "Developing AADL Models for Control Systems: A Practitioner's Guide", CMU/SEI-2007-TR-014, July 2007.
• [14] Bruce Lewis, "Using the Architecture Analysis and Design Language for System Verification and Validation", SEI presentation, 2006.
• [15] Feiler, Gluch, Hudak, and Lewis, "Embedded System Architecture Analysis Using SAE AADL", CMU/SEI-2004-TN-004, June 2004.
• [16] Charles Pecheur and Stacy Nelson, "V&V of Advanced Systems at NASA", NASA/CR-2002-211402, April 2002.
• [17] Systems Integration Requirements Task Group, "ARP 4754: Certification Considerations for Highly-Integrated or Complex Aircraft Systems", SAE Aerospace, 10 April 1996.
• [18] SAE, "ARP 4761: Guidelines and Methods for Conducting the Safety Assessment Process on Civil Airborne Systems and Equipment", December 1996.
• [19] Aeronautical Radio, Inc. (ARINC), "ARINC Specification 653P1-2: Avionics Application Software Standard Interface, Part 1 – Required Services", 7 March 2006.

30

References (Continued)

• [20] Department of Defense, "MIL-STD-882D: Standard Practice for System Safety", 19 January 1993.
• [21] RTCA, Inc., "DO-178B: Software Considerations in Airborne Systems and Equipment Certification", 1 December 1992.
• [22] RTCA, Inc., "DO-254: Design Assurance Guidance for Airborne Electronic Hardware", 19 April 2000.
• [23] US Army, "Aeronautical Design Standard Handbook: Rotorcraft and Aircraft Qualification (RAQ) Handbook", 21 October 1996.
• [24] Cary R. Spitzer (editor), "Avionics: Elements, Software and Functions", CRC Press, 2007.
• [25] US Army, "Army Regulation 70-62: Airworthiness Qualification of Aircraft Systems", 21 May 2007.
• [26] US Army, "Army Regulation 95-1: Aviation Flight Regulations", 3 February 2006.
• [27] Barbacci, Clements, Lattanze, Northrop, and Wood, "Using the Architecture Tradeoff Analysis Method (ATAM) to Evaluate the Software Architecture for a Product Line of Avionics Systems: A Case Study", CMU/SEI-2003-TN-012, July 2003.
• [28] Peter Feiler, "All in the Family: CAAS & AADL", CMU/SEI-2008-SR-021, August 2008.
• [29] Chrissis, Konrad, and Shrum, "CMMI: Guidelines for Process Integration and Product Improvement", Pearson Education, 2007.
• [30] Brendan O'Connell, "Model Driven Performance Analysis for Avionics Systems", Draper Laboratory, January 2006.
• [31] John F. Hanaway and Robert W. Moorehead, "Space Shuttle Avionics Systems", NASA SP-504, 1989.
• [32] Lui Sha, "The Complexity Challenge in Modern Avionics Software", 14 August 2006.
• [33] "Incidents Prompt New Scrutiny of Airplane Software Glitches", Wall Street Journal, 30 May 2006.
• [34] Eyal Ophir, Clifford Nass, and Anthony Wagner, "Cognitive Control in Media Multitaskers", PNAS, 20 July 2009.
• [35] Federal Aviation Administration, "Advisory Circular AC 25.1309-1A: System Design and Analysis", 21 June 1988.
• [36] Program Executive Office Policy Memorandum 08-03.
• [38] National Science Foundation webpage on Cyber-Physical Systems, http://www.nsf.gov/pubs/2008/nsf08611/nsf08611.htm
• [39] Hoang Pham, "Software Reliability", Springer, 2000.
• [40] Paul Rook (editor), "Software Reliability Handbook", Elsevier Science Publishers Ltd., 1990.

31

References (Concluded)

• [41] http://www.navair.navy.mil/v22/index.cfm?fuseaction=news.detail&id=128
• [42] http://mars8.jpl.nasa.gov/msp98/news/mco990930.html
• [43] John Garman, "The Bug Heard 'Round the World", ACM SIGSOFT, October 1981.
• [44] "Mars Pathfinder Mission Status", 15 July 1997, http://marsprogram.jpl.nasa.gov/MPF/newspio/mpf/status/pf970715.html
• [45] Nancy Leveson, "Safeware: System Safety and Computers", Addison-Wesley Publishing Company, 1995.
• [46] http://www.electronicaviation.com/aircraft/JAS-39_Gripen/810
• [47] Brandon Hill, "Lockheed's F-22 Raptor Gets Zapped by International Date Line", DailyTech LLC, 26 February 2007, http://www.freerepublic.com/focus/f-news/1791574/posts
• [48] http://www.military.com/news/article/human-error-cited-in-most-uav-crashes.html
• [49] Daniel Michaels and Andy Pasztor, "Incidents Prompt New Scrutiny Of Airplane Software Glitches: As Programs Grow Complex, Bugs Are Hard to Detect; A Jet's Roller-Coaster Ride; Teaching Pilots to Get Control", Wall Street Journal, 30 May 2006.


3

Objective

ldquoDevelop an industry standard method for qualifying complex integrated systems to a specified reliabilityrdquo

RA-66 Comanche Picture by US Army

Presenter
Presentation Notes
Currently there is no method to address qualifying complex integrated systems to a specified reliability Later discussions in this presentation will point to the fact that hardware reliability has been established and recognized however software reliability hasnrsquot reached a consensus on approach Both hardware and software are major contributions to the overall reliability of a system

4

Defense Acquisition Approach to Systems Development and Test

Requirements Establishment

Analysis

High Level Design

Detailed Specifications

Implementation Coding

Operational Testing amp Validation

Verification

Development Testing

Deployed System

Presenter
Presentation Notes
The general recommended system development V-curve as shown in is not always followed in a strict sense although it should be the goal Such negligence to the proper process makes the establishment of certification very difficult For new system development and existing system upgrades requirements must be clear complete and testable The certification requirements must be made obvious in the development of the requirements establishment phase with the goal of being fully identified during the requirements development

5

US Army Airworthiness

Airworthiness Qualification meansThe system is safe and reliable to operate and willperform the mission when deliveredIt will continue to safely perform the mission ifmaintainedoperated per the manualParts and overhaul work must be high quality to maintain airworthinessFlight control systems have high reliability requirements - 10-9 for civil airspace critical IFR functions [35]- 10-6 for tactical airspace [36]

Presenter
Presentation Notes
The US Army has been involved in flight since the early Wright-B flight of 1909 Initially fixed wing and rotary wing aircraft in early aviation remained simple and federated The basics to aviate navigate and communicate were handled by dedicated gauges compasses gyroscopes and mechanical linkages to fly the aircraft With the dawn of the Space Age in the 1960s the rise of more complex electronic control in aviation appeared with the ApolloSaturn projects NASA Dryden F-8 project and others As time progressed more integrated modular avionics (IMA) and digital fly by wire (DFBW) came into military and commercial aircraft to reduce weight increase functionality and provide redundancy With this advancement of complex flight control technology testing verification and validation became problematic To this day complex systems have been challenging to qualify for the Federal Aviation Administration (FAA) and military airworthiness commands13The US Army AED holds the responsibility for airworthiness approval This includes the design approval production approval and continued airworthiness In the design approval AED must determine if the system design or modification meets airworthiness standards In the milestone decisions it must determine whether the system is produced or modified in accordance with the approved design For continued airworthiness a judgment must be made for the system operated and maintained to keep it compliant with the approved design The bottom line is that the qualified system meets its performance requirements and is reliable when delivered1313Establishing approval for AWR has become increasingly difficult with the evolution of complex avionic control systems from federated architectures to IMA architectures that rely heavily on complex software systems (ie now identified as cyber-physical systems) As stated earlier AED has the mission to ensure airworthiness qualification for aircraft and subsystems used in the US Army aviation fleet including helicopters airplanes and Unmanned Aircraft Systems (UAS) 13

6

AED and Qualification

bull AEDrsquos mission is to ensure airworthiness qualification for aircraft and subsystems used in the US Army fleet

bull Airworthiness Qualification isndash Demonstration of an aircraft or aircraft subsystem or

component including modifications to function safely meeting performance specifications when used and maintained within prescribed limits (AR 70-62)

bull Traditionally qualified systems by ndash Similarity ndash Analysisndash Testndash Demonstrationndash Examination

Presenter
Presentation Notes
Airworthiness qualification is the demonstration of an aircraft or aircraft subsystem or component including modifications to function satisfactorily when used and maintained within prescribed limits [25] The US Army has traditionally qualified systems and components by physical testing analysis demonstration or by similarity

7

Evolution of Helicopter Systems

bull Past systems historically federated ndash Distributed Functionality ndash Hardware basedndash Simple easier to test

bull Now systems are becoming integratedndash Combined Functionality ndash More Software Intensivendash Complex more difficult to test

Chief Warrant Office Jim Beaty (back row far left) and crew of the of the Vietnam UH-1 Flying Bikinis (friend of Alex Boydston)

UH-1 Cockpit (US Army)

Chinook CAAS Cockpit (US Army)

CH-47 Chinook (US Army)

Presenter
Presentation Notes
The US Army has been involved in flight since the early Wright-B flight of 1909 Initially fixed wing and rotary wing aircraft in early aviation remained simple and federated The basics to aviate navigate and communicate were handled by dedicated gauges compasses gyroscopes and mechanical linkages to fly the aircraft With the dawn of the Space Age in the 1960s the rise of more complex electronic control in aviation appeared with the ApolloSaturn projects NASA Dryden F-8 project and others As time progressed more integrated modular avionics (IMA) and digital fly by wire (DFBW) came into military and commercial aircraft to reduce weight increase functionality and provide redundancy With this advancement of complex flight control technology testing verification and validation became problematic To this day complex systems have been challenging to qualify for the Federal Aviation Administration (FAA) and military airworthiness commands13

8

Present Approach to Testing

bull Several disciplines weigh in such as software avionics GNampCenvironmental E3 electrical human factors structures aerodynamics etc

bull Current test methodology per older federated systemsndash Hardware Mil-Std 810 Mil-Std 461ndash Requirements Analysis (Traceability)

bull Test at different levelsndash Individual software module testingndash Black box testingndash System Integration testing

bull Hot benchbull System Integration Lab (SIL)

ndash Aircraft level Testing bull Groundbull Flight Aviation Flight Test Directorate (AFTD) Testing (US Army Photo)

Aviation System Integration Facility (ASIF) (US Army Photo)

Presenter
Presentation Notes
Airworthiness qualification is the demonstration of an aircraft or aircraft subsystem or component including modifications to function satisfactorily when used and maintained within prescribed limits [25] The US Army has traditionally qualified systems and components by physical testing analysis demonstration or by similarity Historically most systems were federated They were hardware based simple and distributed Now they have become more integrated They are more software intensive complex and have combined functionality contained in one or more computers With this evolution from simple to more complex the Army is finding it more difficult to execute an AWR As systems evolve to more complex systems of systems this problem is only growing worse13The current test approach to achieving confidence in systems for an AWR for the US Army is based more on traditional federated avionics systems Experienced personnel in Software Vehicle Management Systems Avionic Communications Navigation amp Guidance Control Electrical Environmental Human Factors Electrical and Electro-magnetic Effects (E3) Structures Thermal Safety Integration Testing and Flight Test personnel and test pilots all play important roles in accomplishing test and review of new and existing systems While some may not consider areas such as thermal or EEE important to software development they are crucial since the software is running on physical systems that are affected by heat and susceptibility to electromagnetic radiation which can cause abnormal operation or bit errors Current test methodology for hardware relies on MIL-STD-810 MIL-STD-461 and requirements analysis such as traceability MIL-STD-810 is the Department of Defense Test Method Standard for Environmental Engineering Considerations and Laboratory Tests1313It focuses on equipment environmental design and test limits to the conditions that it will experience throughout its service life and establishing chamber test methods that replicate the effects of environments on the equipment rather than imitating the environments themselves MIL-STD-461 includes the ldquoRequirements for the Control of Electromagnetic Interference Characteristics of Subsystems and Equipmentrdquo These test standards are geared for hardware and physical effects and do not address software operation Current methods for software testing include testing the individual software modules black box (or interface) level system integration level and aircraft level testing System integration testing can include hot bench system integration labs (SILs) (such as the shown in Figure 4) and aircraft ground level testing as conducted by the Aviation Flight Test Directorate (AFTD) of Redstone Test Center (RTC) (see Figure 5) 13Neglecting complete and good requirements promotes risk It is common practice by Program Managers (PMs) to accept risks As issues are found in systems it is on AED to issue an Army Safety Action Memorandum (ASAM) that identifies a deficiency to the field pilots Concurrently an Airworthiness Impact Statement (AWIS) is issued to the PMs which contain a probabilistic analysis of how the identified shortcoming will affect risk The AMCOM Safety Office will produce its own calculations The PMs can either accept or reject this assessment Regardless cautions and warnings are placed in the AWR to keep the entire program and flight crew aware of issues13

9

Development Challenges

bull Legacy Aircraft often upgraded in a piecemeal fashionndash Makes certification difficultndash Desire to increase to modern requirements based on size of upgrade and

what it includes ndash hard to scope

bull New system requirements must be clear complete and testable ndash Certification requirements must be obvious

bull Orchestrating agreement between stakeholders is necessary to mitigatendash Juggling of multiple software buildsndash End system that is difficult to test certify and deployndash Escalating Costsndash System Safety from being poorly understoodndash Design iterations

Presenter
Presentation Notes
It would be wonderful if all systems were straightforward in design easily testable and simple to write an Airworthiness Release (AWR) for however that is not the case Legacy aircraft such as the Chinook have been upgraded in a piecemeal fashion acquiring much needed improvements in aviation navigation and communication The general recommended system development V-curve as shown earlier is not always followed in a strict sense although it should be the goal process Negligence to the proper process makes the establishment of certification very difficult For new system development and existing system upgrades requirements must be clear complete and testable The certification requirements must be made obvious in the development of the requirements establishment phase with the goal of being fully identified during the requirements development Orchestrating agreement among all stakeholders (eg the program manager systems engineers human factors engineers integrators test engineers manufacturers users and certifiers) is necessary to mitigate problems such as13juggling multiple software builds13producing a difficult-to-test difficult-to-certify and difficult-to-deploy systems13misunderstanding system safety and13requiring design iterations that impact schedules and costs

10

Complexity Issues

bull System Development costs and schedule increase with complexityndash Existing lack of schedule and funding resources

bull Keeps systems from achieving full compliance with specifications and requirements

bull Garbage in -gt Garbage OuthellipPoor requirements -gt Poor Systemndash Finding problems in new designs at PDR is too latendash Difficult to correct existing poorly designed fielded complex systems

bull Complexity amp reliability of complex systems is not fully understoodndash How do we accurately assess operating risk performance reliability of

complex systems based on limited testing and analysisndash How do we know when system design is good enoughndash Latent defects occur in supposedly well-tested mature systems

bull Avionics parts and software change constantlyndash Spiral development -gt new softwarehardware qualification required frequently ndash How do we streamline the process (partition the system) so the need for

complete re-qualification after changes is lessened

Presenter
Presentation Notes
In an old computer development lesson it is well known that if you put ldquogarbage in you will get garbage outrdquo Likewise in a system if you have poor requirements then you will end up with a poor system Finding problems in new designs at Preliminary Design Review (PDR) is too late It has been shown that discovering issues at this stage in the game will cause a reiteration on the design which costs time and money Furthermore dealing with existing fielded complex systems that have not gone through the rigors of proper systems engineering results quite often with an overwhelming mess of functionality to test verify validate qualify and certifydesigns at PDR is too late It has been shown that discovering issues at this stage in the life-cycle will cause a reiteration on the design which costs time and money Furthermore dealing with existing fielded complex systems that have not gone through the rigors of proper systems engineering results quite often with a complicated conglomeration of functionality to test verify validate qualify certify and maintain This is indicative where complexity has exceeded our understanding in how to certify a system We still do not fully understand complexity and how to address reliability of complex systems How do we accurately assess operating risks performance and reliability of complex systems based on limited testing and analysis How do we know when a system design is good enough How do we modularize spirally developed systems to minimize the need for re-qualification of unchanging portions of the system We are 30 plus years into this technology and we still deal with systems with latent defects that are occurring in supposedly well-tested and mature systems To further exacerbate the problem we are now dealing with complex system of systems (ie cyber-physical systems)13It is a given that you can keep on adding redundancy and complexity to a problem to attain a desired level of reliability but at some point in time the reliability will taper off At best we sometimes must satisfy for an optimum point before digressing in reliability In the same vein system development costs and schedule increase with complexity too (see Figure 7 and 8)13Avionics parts and software constantly change over the life of a program Typically a spiral development program occurs with complex software development which means that qualification is required frequently This begs the question of how to streamline the process so that the need to conduct a complete requalification is avoided13With these complex systems there are other hurdles to cross such as fully characterizing and conducting the Functional Hazard Assessments (FHAs) and Failure Mode Effects and Criticality Analyses (FMECAs) It is crucial for the safety assessment that these are conducted correctly to fully understand the risks and later perform the correct tests at the right level Additionally once the complex component hardware and software are integrated then yet other problems appear It is at that time that the disparity of teams crossing multiple contractors and development groups becomes obvious and if not properly coordinated could cause impacts to the schedule Other programmatic problems affect complex system development and qualification For instance lack of schedule and funding resources causes a shortcoming to adequately provide for the proper compliance with specification and requirements short-circuiting the systems engineering process An ever decreasing availability of trained engineers to support the 
development and test of such systems exists Non-technical political influences sometimes affect the reliability of complex systems and are difficult to avoid Lastly there is a lack of a centralized database that captures the various families of systems that have been built along with their characterization of success and failures Such a database from all past and present government complex systems could be valuable in establishing reliability basis for future models

11

Complexity Issues (continued)

bull Functional Hazard Assessments and related documentation are crucial

ndash Understanding risks ndash Performing the correct tests at the right level Lab test vs Flight Test

bull Saves flight time and money

bull Systems Integration for complex systems is a schedule driver

bull Need experienced personnel to work complex systems

bull Need a centralized database - just doesnrsquot existndash Determine data needed for quantifying reliability of complex systemsndash Capture the pertinent data on previous or existing complex systemsndash Understand successes and failures of previous and present complex systemsndash Establish baseline reliability figures for known architectures

bull Complex System of Systems exacerbates problem

12

Reliability vs Complexity ampCost vs Complexity

Notional Graphs

bull Reliability vs Complexity bull Cost amp Schedule vs Complexity

Rel

iabi

lity

Complexity Complexity

Cos

t amp S

ched

uleOptimum

Aggregation of part reliability

feeds into overall system reliability

Desired

Presenter
Presentation Notes
Avionics parts and software constantly change over the life of a program Typically a spiral development program occurs with complex software development which means that qualification is required frequently This begs the question of how to streamline the process so that the need to conduct a complete re-qualification is avoided13With these complex systems there are other hurdles to cross such as fully characterizing and conducting the Functional Hazard Assessments (FHAs) and Failure Mode Effects and Criticality Analyses (FMECAs) It is crucial for the safety assessment that these are conducted correctly to fully understand the risks and later perform the correct tests at the right level13Additionally once the complex component hardware and software are integrated then yet other problems crop up It is at that time that the disparity of teams crossing multiple contractors and development groups becomes obvious and if not properly coordinated could cause impacts to the schedule 13Other programmatic problems affect complex system development and qualification For instance lack of schedule and funding resources causes a shortcoming to adequately provide for the proper compliance with specification and requirements short-circuiting the systems engineering process An ever decreasing availability of trained engineers to support the development and test of such systems exists Non-technical political influences sometimes affect the reliability of complex systems and are difficult to avoid Lastly there is a lack of a centralized database that captures the various families of systems that have been built along with their characterization of success and failures Such a database from all past and present government complex systems could be valuable in establishing reliability basis for future models

13

A Few Examples of Complex Systems

bull This is not a new problem Other have struggled with the challenges of establishing confidence in complex systems

ndash NASAbull Apollo Guidance Computerbull Dryden F8 Crusaderbull Space Shuttlebull International Space Station

ndash Commercial Airlinersbull Airbus A320 and higherbull Boeing B777 B787

ndash Militarybull Ships and Submarinesbull Jets (F14F15 F16 F18 F22 F35 etc)bull Cargo Planes (C130J Hercules C17 Globemaster etc)bull Helicopters (Chinook Blackhawk Sea Stallion etc)bull Rocketsbull Unmanned Aerial Systemsbull Unmanned Ground Systemsbull Unmanned Submarine Systems Photos by US Army NASA US Navy and US Air Force

Presenter
Presentation Notes
As mentioned earlier complex avionics systems are not a new idea Since the early 1960rsquos complex avionic architectures have existed beginning with the ApolloSaturn program Massachusetts Institute of Technology (MIT) Instrumentation Lab (IL) which is now Draper Laboratory and International Business Machines (IBM) led the way with the MITIL Apollo Guidance Computer (AGC) and the Saturn V IBM triple modular redundant (TMR) voting guidance computer system The word software was not even coined at the time but engineers such as Margaret Hamilton MITIL Director of Apollo On-board Software can attest to the fact that some the same issues with creating reliable software then still exists today [5] A large majority of the issues then dealt with the communication between systems engineers and the programmers Requirements were thrown over the wall without the confirmation that the requirements were complete and a lot of the issues cropped up as interface problems Identifying these issues prompted Hamilton to create her own company and create a modeling language called Universal Systems Language (USL) to head off the problems experienced with Apollo [11] Some 200 plus modeling programs have been developed since Apollo and used to mitigate issues and increase confidence in systems of varying complexity13As time progressed other systems came along The NASA Dryden F8 Crusader was the first digital fly by wire (DFBW) jet aircraft that relied heavily on complex IMA and software for flight control The Space Transportation System (STS) shuttle includes a Quad Modular Redundant (QMR) system with a fifth backup flight computer containing uncommon code US Air Force and Naval airplanes that have possessed complex or redundant IMA configurations include the F14 Tomcat F15 Eagle F16 Falcon F18 Hornet F22 Raptor F35 Joint Strike Fighter F117 Nighthawk V22 Osprey C17 Globemaster and many more along with recent Unmanned Air Vehicle Systems (UAVS) The US Army complex systems on helicopters include the13RAH-66 Comanche DFBW Triple Modular Redundant (TMR) architecture13glass cockpit avionics on the UH-60M Blackhawk baseline 13Common Avionics Architecture System (CAAS) glass cockpit on the UH-60M Blackhawk modernization and CH-47F Chinooks13and other aircraft13Additionally there are many self-checking pair engine controller systems along with system of system Future Combat Systems (FCS) and Unmanned Air Vehicle Systems (UAVS) This has also permeated the commercial airliner market with the Airbus 320 and higher Airbus models Boeing 777 and Boeing 787 aircraft With this ever increasing technology something must be done about the reliability issue With such a wealth of data on aviation and non-aviation cyber-physical systems such as submarine ship nuclear medical locomotive and automotive systems there should be adequate information to get a start on modeling systems correctly for reliability Therefore this is not an isolated problem to avionics and other disciplines should aide in resolving this problem13

14

Some Complex System Failures

bull V-22 Osprey crashesbull Mars Climate Orbiter crashbull Mars Pathfinder software resetbull USS Vincennes downing an Airbus 320bull Therac-25 software radiation treatment

failurebull 1989 Airbus A320 air show crashbull China Airlines Airbus Industries A300

crashbull Ariane 5 satellite launcher malfunctionbull Failure of the primary flight system to

sync with the backup during prelaunch of STS-1

bull Mexicana Airlines Boeing 727 airliner crashed into a mountain due to the software not correctly determining the mountain position

bull Loss of the first American probe to Venus

bull Korean Airlines KAL 901 accidentbull Soviet Phobos I Mars probe lost

bull Three Mile Islandbull F-18 fighter plane crash due to bad

exceptionbull F-14 fighter plane lost to

uncontrollable spinbull Swedish Gripen prototype crashedbull Swedish Gripen air-show crashbull F-22 failure crossing the IDLbull 2006 German-Spanish Barracuda UAVbull 2004 FA-22 Raptor stealth fighter jet

crash bull FA-22 Raptor navigation system

software error at Nellis AFBbull 50 cockpit blackouts on A320bull A320 multiple avionics and electrical

failures at Newark NJbull Boeing 777 Malaysian Airlines jetlinerrsquos

nightmarish autopilot rollercoaster ridebull 3000 feet US Army and Air Force UAV

Crashes

bull hellip And Many Morehellip

Presenter
Presentation Notes
Multiple crashes have occurred with the V-22 Osprey [41]13In 1999 the Mars Climate Orbiter crashed because of incorrect units in a program caused by poor systems engineering practices [42 44]13In 1988 an Airbus 320 was shot down by the USS Vincennes because of cryptic and misleading output displayed by the tracking software [3]13In 1989 an Airbus A320 crashed at an air show due to altitude indication and software handling [3]13In 1994 a China Airlines Airbus Industries A300 crash on killing 264 from faulty software [3]13In 1996 the first Ariane 5 satellite launcher destruction mishap was caused by a faulty software design error with a few lines of ADA code containing unprotected variables Horizontal velocity of the Ariane 5 exceeded the Arian 4 resulting in the guidance system veering the rocket off course Insufficient testing did not catch this error which was a carry-over from Ariane 4[3 39]13In 1986 a Mexicana Airlines Boeing 727 airliner crashed into a mountain due to the software not correctly determining the mountain position [39]13In 1986 the Therac-25 radiation therapy machines overdosed cancer patients due to flaw in the computer program controlling the highly automated devices [3 39 45]13During the maiden launch in 1981 of the Discovery space shuttle a failure of the primary flight control computer system to establish sync with the backup during prelaunch [43]13On December 10 1990 the Space Shuttle Columbia had to land early due to computer software problems [39]13In 1997 The Mars Pathfinder software reset problem due to latent task execution caused by priority inversion with a mutex [3 44]13An error in a single FORTRAN statement resulted in the loss of the first American probe to Venus From G J Myers Software Reliability Principles amp Practice p 25 [3]13In September 1999 the Korean Airlines KAL 901 accident in Guam killed 225 out of 254 aboard A worldwide bug was discovered in barometric altimetry in Ground Proximity Warning System (GPWS) From ACM SIGSOFT Software Engineering Notes vol 23 no 1 [3]13The Soviet Phobos I Mars probe was lost due to a faulty software update at a cost of 300 million rubles Its disorientation broke the radio link and the solar batteries discharged before reacquisition From Aviation Week 13 Feb 1989 [3]13An F-18 fighter plane crash due to a missing exception condition From ACM SIGSOFT Software Engineering Notes vol 6 no 2 [3]13An F-14 fighter plane was lost to uncontrollable spin traced to tactical software From ACM SIGSOFT Software Engineering vol9 no5 [3]13In 1989 Swedish Gripen prototype crashed due to software in their digital fly-by-wire system [3 46]13In 1995 another Gripen fighter plane crashed during air-show caused by a software issue [3 46]13On February 11 2007 twelve FA-22 Raptors were forced to head back to the Hawaii when a software bug caused a computer crash as they were crossing International Date Line In the middle of the ocean all their systems comprising navigation fuel and part of the communications systems dumped All the attempts to reboot failed[47]13February 2006 German-Spanish Unmanned Combat Air Vehicle Barracuda crash due to software failure [4]13December 2004 a glitch in the software for controlling flight probably caused an FA-22 Raptor stealth fighter jet to crash on takeoff at Nellis Air Force [4]13In 2008 a United Airbus A320 registration N462UA experienced multiple avionics and electrical failures including loss of all communications shortly after rotation while departing Newark Liberty International Airport in Newark 
New Jersey [NTSB Report Identification DCA08IA033] 13In 2006 a Boeing 777 Malaysian Airlines jetlinerrsquos autopilot caused a stall to occur by climbing 3000 feet Pilots struggled to nose down the plane but plunged into a steep dive After pulling back up the pilots regained control Cause was defective flight software providing incorrect data for airspeed and acceleration confusing the flight computers and initially ignoring the pilotrsquos commands[49]13US Army and Air Force UAV crashes from control system or human error13

15

Lessons Learned from Failures

bull From Nancy Levesonrsquos paper ldquoThe Role of Software in Spacecraft Accidentsrdquondash ldquoFlaws in the safety culture diffusion of responsibility and

authorityndash Limited communication channels and poor information flowndash Inadequate system and software engineering ndash Poor or missing specifications ndash Unnecessary complexity and software functionality ndash Software reuse or changes without appropriate safety analysisndash [Shortcomings] in safety engineering practices ndash Flaws in test and simulation environments ndash Inadequate human factors design for softwarerdquo

Presenter
Presentation Notes
In Dr Nancy Levesonrsquos paper [36] ldquoThe Role of Software in Spacecraft Accidentsrdquo she cited problems with software development issues within NASA on various projects According to Dr Leveson there were ldquoflaws in the safety culture diffusion of responsibility and authority limited communication channels and poor information flow inadequate system and software engineering poor or missing specifications unnecessary complexity and software functionality software reuse or changes without appropriate safety analysis violation of basic safety engineering practices inadequate system safety engineering flaws in test and simulation environments and inadequate human factors design for softwarerdquo While these problems were identified for spacecraft development within NASA and corrected aviation in general could learn from these lessons to mitigate issues with complex systems development

16

Some Current Guidelines

bull DO-178B - Software Considerations in Airborne Systems and Equipment Certification bull DO-248B ndash Final Report for the Clarification of DO-178Bbull DO-278 - Guidelines for Communications Navigation Surveillance and Air Traffic Management

(CNSATM) Systems Software Integrity Assurancebull DO-254 - Design Assurance Guidance for Airborne Electronic Hardware bull DO-297 ndash Integrated Modular Avionics (IMA) Development Guidance and Certification

Considerationsbull SAE-ARP4754 ndash Certification Consideration for Highly Integrated or Complex Aircraft Systemsbull SAE-ARP4671- Guidelines and Methods for Conducting the Safety Assessment Process on

Airborne Systems and Equipmentbull FAA Advisory Circular AC27-1B - Certification of Normal Category Rotorcraftbull FAA Advisory Circular AC29-2C - Certification of Transport Category Rotorcraftbull ISOIEC 12207 - Software Life Cycle Processesbull ARINC 653 - Specification Standard for Time and System Partitionbull MIL-STD-882D - DoD System Safetybull ADS-51-HDBK - Rotorcraft and Aircraft Qualification Handbookbull AR-70-62 - Airworthiness Release Standardbull SED-SES-PMHFSA 001 - Software Engineering Directorate (SED) Software Engineering

Evaluation System (SEES) Program Manager Handbook for Flight Software Airworthinessbull SED-SES-PMHSS 001 - SED SEES Program Manager Handbook for Software Safety

WHATrsquoS MISSING - Reliability Standard for Complex Systems

Presenter
Presentation Notes
These problems previously stated drove the development of these guidelines however there is no standard for system reliability that includes software There are other standards and circulars that pertain to complex systems but a reliability standard for complex systems but a reliability standard is missing for complex systems which would outline the process for establishing cyber-physical systems reliability This standard should indicate how to model and analyze and ascertain the projected level of reliability

17

Certification Assessment Considerations

bull Sufficient data and time must be available for air worthiness evaluation

bull Certification processndash Currently lengthy ndash Depends much on human interpretation trade offs and risk mitigation ndash Overwhelming for complex integrated systems (FHAs FTAs FMECAs

risk mitigation etc)

bull Consistent industry-wide method to assess a system at any stage of the life-cycle to allow a tradeoff of design alternatives

bull Certification Tasks outlined in DO-297 should be consideredndash Task 1 Module Acceptancendash Task 2 Application softwarehardware acceptancendash Task 3 IMA system acceptancendash Task 4 Aircraft integration of IMA system ndash including VampVndash Task 5 Change of modules or applicationsndash Task 6 Reused of modules or applications

Presenter
Presentation Notes
In order to execute an AWR sufficient data must be provided to the appropriate airworthiness authorities to demonstrate that safety requirements are met to the levels per the System Safety Assessment (SSA) or equivalent The certification process is currently lengthy and depends on much human interpretation of the myriad of complex architecture functions 13The current guidelines such as DO-178B DO-254 DO-297 SAE-ARP-4754 and SAE-ARP-4671 along with many other guidelines outline the proper steps that should be taken System safety managementrsquos military standard is MIL-STD-882 and has been in use for decades Civilian safety standards for the aviation industry include SAE ARP4754 which shows the incorporation of system safety activities into the design process and provides guidance on techniques to use to ensure a safe design SAE ARP4761 contains significant guidance on how to perform the system safety activities spoken about in SAE ARP4754 DO-178B outlines development assurance guidance for aviation software based on the failure condition categorization of the functionality provided by the software DO-254 embodies similar guidance for aviation hardware ARINC 653 is a widely accepted standard to ensure time and space partitioning for software DO-297 does an excellent job of describing the certification tasks to take for an IMA system which include13Task 1 Module acceptance13Task 2 Application softwarehardware acceptance13Task 3 IMA system acceptance13Task4 Aircraft integration of IMA systems including verification and validation13Task 5 Change of modules or applications13Task 6 Reuse of modules or applications13Taken together these standards provide guidance that if followed likely will result in safe highly reliable and cost-effective systems over the life-cycle of the system Yet while these guidelines exist there is not a consistent industry-wide method to assess a system at any stage of the life-cycle to allow a tradeoff of design alternatives Also there is not a standard outlining overall reliability for a system to include hardware and software reliability In order to achieve this level of reliability a standard should be developed to define the process and method to achieve a quantifiable reliability number that would in turn lead to acceptance13

18

Definition of Complexity and Reliability is Needed

[Figure: notional build-up of a highly reliable complex system. Components 1 through 4, each characterized by complexity fundamentals and reliability parametrics (TRL 3 or 4), are integrated into Subsystems 1 and 2, each with reliability dependencies (TRL 6 or 7); these are integrated into the realized system, with reliability sensitivities (TRL 8 or 9), leading to a certificate (e.g., AWR).]

Presenter
Presentation Notes
As mentioned in the introduction, the overall goal of this paper is to state the need for an industry standard method for qualifying complex integrated systems to a specified reliability. To achieve this, systems must be broken down to the component level and built back up to the subsystem and system levels, yielding an overall aggregated system reliability value. The goal should be the ability to assess reliability at the component, subsystem, and then system level, with each phase working toward a higher Technical Readiness Level (TRL). The end result would feed into the accepted Type Certificate (TC) and/or AWR.
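As a minimal numeric illustration of that build-up, the sketch below rolls assumed component reliabilities up into subsystem and system figures. The series/parallel structure and all values are hypothetical, chosen only to mirror the figure's component-to-subsystem-to-system flow.

    def series(*rs):
        """Reliability of parts that must all work (multiply)."""
        p = 1.0
        for r in rs:
            p *= r
        return p

    def parallel(*rs):
        """Reliability of redundant parts where any one suffices."""
        q = 1.0
        for r in rs:
            q *= 1.0 - r
        return 1.0 - q

    # Assumed per-mission component reliabilities (hypothetical values).
    c1, c2, c3, c4 = 0.999, 0.995, 0.990, 0.999

    sub1 = series(c1, c2)                # Subsystem 1: two components in series
    sub2 = series(c3, parallel(c4, c4))  # Subsystem 2: one component plus a redundant pair
    system = series(sub1, sub2)          # Realized system
    print(f"sub1={sub1:.6f}  sub2={sub2:.6f}  system={system:.6f}")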

19

Analytical Models and Reliability

• Analytical models of hardware reliability are well understood

• Architecture modeling and software reliability modeling are not novel ideas, but they are highly debated
  – Many approaches and little consensus on the best way
  – Many models (Jelinski-Moranda, Littlewood-Verrall, Musa-Okumoto, etc.) [1]
  – Many tools (over 200 have been built since the 1970s) [2]

• Predictability of software reliability is of great concern because software is a major contributor to unreliability [2]

• Software reliability is the probability of error-free operation of a computer program in a specified environment for a specified time [1]

• Need a basis for setting reliability figures from previous systems, iteratively refined in the future (see the sketch after the notes below)

• NOT A REPLACEMENT FOR TESTING AND VERIFICATION

Presenter
Presentation Notes
As mentioned in the introduction, the overall goal of this paper is to state the need for an industry standard method for qualifying complex integrated systems to a specified reliability. While not a total solution, better modeling practices should be considered as part of a solution to bridge the gap between design, test, and implementation. A method to model the architecture early in the requirements establishment phase, follow the detailed design and coding, and then verify the system along with the model may be a path to greater confidence in the system and reduce the risks, warnings, and cautions that must be issued.
To achieve this, systems must be broken down to the component level and built up to the subsystem and system levels; an overall aggregated system reliability value should result (see Figure 9). The goal should be the ability to assess reliability at the component, subsystem, and then system level, with each phase working toward a higher Technical Readiness Level (TRL). The end result would feed into the accepted Type Certificate (TC) or AWR.
To achieve this goal, modeling and analysis tools that follow a standard for modeling reliability should exist. As previously stated, over 200 tools have been created since the 1970s. A few of the current tools:
Universal Systems Language (USL)
Unified Modeling Language (UML)
Systems Modeling Language (SysML)
MATLAB/Simulink
Telelogic Rhapsody
MathCAD
Colored Petri Nets
Rate Monotonic Analysis (RMA)
STATEMATE (used by Airbus)
Standard for the Development of Safety-Critical Embedded Software (SCADE)
OPNET
Embedded System Modeling Language (ESML)
Component Synthesis using Model-Integrated Computing (CoSMIC)
Architecture Analysis & Design Language (AADL)
By no means is this list complete. Typically, different companies and projects address this challenge by choosing their own tools to perform the up-front analysis and modeling, not following a standard approach. Multiple tools need to converge or be compatible with the framework of a common tool such as AADL. The resultant tool(s) could be used to verify the requirements up front to mitigate reliability issues down the life-cycle.
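To give a flavor of the models named above, the sketch below evaluates the Musa-Okumoto logarithmic Poisson model [1]: expected cumulative failures by execution time t is mu(t) = (1/theta) ln(lambda0 theta t + 1), and the probability of failure-free operation over a further interval follows from the difference in mu. The parameter values are illustrative assumptions, not figures from any fielded system.

    import math

    def mu(t, lam0, theta):
        """Expected cumulative failures by execution time t (hours)
        under the Musa-Okumoto logarithmic Poisson model."""
        return (1.0 / theta) * math.log(lam0 * theta * t + 1.0)

    def reliability(tau, t, lam0, theta):
        """Probability of failure-free operation for a further interval
        tau, given the software has accumulated execution time t."""
        return math.exp(-(mu(t + tau, lam0, theta) - mu(t, lam0, theta)))

    # Illustrative (assumed) parameters: initial failure intensity lam0
    # in failures/hour and failure-intensity decay parameter theta.
    lam0, theta = 10.0, 0.05

    for hours_tested in (10, 100, 1000):
        r = reliability(1.0, hours_tested, lam0, theta)
        print(f"after {hours_tested:4d} h of test, R(1 h) = {r:.3f}")

The qualitative behavior matches the slide's point: predicted reliability grows with test exposure, but the prediction is only as good as the model and data behind it, which is why it cannot replace testing and verification.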

20

Tools for Modeling and Analysis

• Universal Systems Language (USL)
• Unified Modeling Language (UML)
• Systems Modeling Language (SysML)
• MATLAB/Simulink
• Telelogic Rhapsody
• MathCAD
• Colored Petri Nets
• Rate Monotonic Analysis (RMA)
• STATEMATE (used by Airbus)
• SCADE
• OPNET
• Embedded System Modeling Language (ESML)
• Component Synthesis using Model-Integrated Computing (CoSMIC)
• Architecture Analysis & Design Language (AADL)
• At least 200 more packages since the 1970s
• Certified tools need to converge to an accepted standard modeling/analysis method for complex system reliability

Presenter
Presentation Notes
Typically, different companies and projects address the challenge by choosing their own tools to perform the up-front analysis and modeling, not following a guideline approach. Multiple tools need to converge or be compatible with a set modeling standard for complex systems. The compliant tools could be used to verify the requirements up front to mitigate reliability issues down the life-cycle. A notional approach would follow that shown in Figure 8, where the system V is followed but architectural modeling and analysis run parallel with the real development effort. This would allow reliability to be allocated during the design phase and measured during the implementation, test, and verification phases using the model. The downside to modeling, in certain circles, is getting people to believe the models. How do you certify a modeling tool, and the actual models within the tool? Those issues should be addressed going forward.

21

Modification to Acquisition Model

[Figure: the development V, running from Requirements Establishment, High Level Design, Detailed Specifications, and Implementation Coding down the left side to Verification, Development Testing, and Operational Testing & Validation up the right side, ending with the Deployed System. An "Architectural Model & Analysis" track runs parallel to the whole V: reliability is allocated on the design side and measured on the test side.]

Propose a standard modeling methodology to be applied at different phases of development to enhance requirements development, reliability allocation, reliability measurement, and testing (DISCLAIMER: DOES NOT REPLACE TESTING)

Presenter
Presentation Notes
A notional approach would follow that shown in Figure 10, where the system V is followed but architectural modeling and analysis run parallel with the real development effort. This would allow reliability to be allocated during the design phase and measured during the implementation, test, and verification phases using the model. This approach could bridge the design and test phases together. It is emphasized here that the architectural model would not replace critical testing; it augments the process to allow better requirement identification and verification. Thorough ground and flight tests should never be replaced by modeling. Modeling would only allow for more robust requirements and design and a higher level of confidence in them. The model could be used in conjunction with the testing to confirm the design. Proper modeling and analysis would reduce total program costs by enforcing more complete and correct initial requirements, which reduces issues discovered later in testing that are expensive or impossible to fix and that force the acceptance of high risks. Additionally, if the model is maintained and optimized, it could be used after system deployment to analyze the impacts of upgrades or changes to the system, allowing for more complete analysis and reducing overall system redesign costs.

A hurdle to cross with modeling and analysis is convincing people to believe the models. Some method to certify these models and modeling tools should be addressed in the future. Standards should be set in place for correct modeling techniques for complex systems. Lastly, standard verification checking tools should be considered, such as the Motor Industry Software Reliability Association (MISRA) compliance verification tools for the use of C in safety-critical systems.
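One way to make "reliability allocated" versus "reliability measured" concrete is to treat the system target as a budget apportioned across subsystems at design time and then checked against model or test results. The sketch below uses the simplest equal apportionment for a series system; the target and the measured values are assumed for illustration.

    # Equal apportionment of a series-system reliability target: each of
    # n subsystems must meet R_target ** (1/n).
    R_target, n = 0.999, 4
    allocated = R_target ** (1.0 / n)

    measured = [0.99985, 0.99970, 0.99990, 0.99975]  # model/test results (assumed)
    for i, m in enumerate(measured, start=1):
        status = "meets" if m >= allocated else "MISSES"
        print(f"subsystem {i}: measured {m:.5f} vs allocated {allocated:.5f} -> {status}")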

22

Systems Reliability Standard Establishment

• Establish a working group to define this standard
  – Need a technical society to lead the charge

• Collaborate with industry, academia, military, and technical societies
  – Focus on development of a reliability standard with AWR safety in mind
  – Draw upon their experiences to feed into this standard

• Study existing and previous complex systems
  – Shuttle, Space Station, missile systems, nuclear submarine and ship systems, nuclear control systems, military and commercial jet systems
  – Obtain software reliability information from these existing and previous systems
  – Build a database to serve as the basis for future reliability figures (a notional record sketch follows the notes below)

• Research prior efforts in complex systems analysis

• Establish a consensus-based modeling and analysis method

Presenter
Presentation Notes
In conclusion, methods for achieving a design for complex systems do exist; however, achieving reliability and attaining a level of qualification that would permit better AWRs does not. A standard needs to be developed to tackle this issue rather than relying on current methods to ascertain airworthiness of complex systems. This will not occur overnight. An orchestrated collaboration among industry, academia, military labs, and technical professional societies to focus on development of this standard should allow us to draw upon their experiences to feed into this reliability standard with AWR safety in mind. We have a long-running experiment with complex software systems in the Space Transportation System (STS), International Space Station (ISS), missile systems, nuclear submarine and ship systems, nuclear control systems, and military and commercial jet systems, from which we should be able to obtain at least inferred software reliability information from the architectures and run time that the systems use. We should look at the lessons learned from these systems to see what could have been improved and what was done right that should be carried forward. The challenge is collecting this information into a central database and arriving at some figure of reliability from previous systems where data exists. This would at least provide a starting point to allow initial assessments and could be refined in the future. Also, this is not the only study for establishing reliability metrics for complex software systems. Similar research projects have risen and fallen; the data from those projects should not be wasted but studied to feed into whatever standard is developed. While historical information would be useful, each design is unique and requires tools to accomplish the design. Architectural modeling constructs should be further investigated as a possible augmentation to the design and test process. We need to determine which forum is best to conduct this effort (e.g., SAE, IEEE, AIAA, ACM, AHS, INCOSE, or other). As stated in the paper "Space Shuttle Avionics" [31]: "The designers, the flight crew, and other operational users of the system often have a mindset established in a previous program or experience which results in a bias against new or different 'unconventional' approaches." If nothing is done to address this problem, it will only get worse over time. It is past time to address the issue of reliability of complex systems and software.
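Because the central database called for here does not yet exist, any schema is necessarily speculative; the sketch below only suggests the kind of record (architecture, accumulated run time, observed failures) from which baseline reliability figures could be inferred for future systems. All names and numbers are hypothetical.

    from dataclasses import dataclass

    @dataclass
    class SystemReliabilityRecord:
        """Hypothetical database record for a past or fielded complex system."""
        system_name: str         # e.g., "STS" or "ISS"
        architecture: str        # e.g., "federated", "IMA", "hybrid"
        software_size_sloc: int  # software size, in source lines of code
        operating_hours: float   # accumulated run time across the fleet
        observed_failures: int   # failures attributed to the system

        def observed_failure_rate(self) -> float:
            """Point estimate of failures per operating hour."""
            return self.observed_failures / self.operating_hours

    # Illustrative entry with made-up numbers.
    rec = SystemReliabilityRecord("example-program", "IMA", 1_200_000, 5.0e5, 3)
    print(f"{rec.system_name}: {rec.observed_failure_rate():.1e} failures/hour")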

23

BACKUP SLIDES


24

PEO Aviation System Safety Management Decision Authority Matrix

Severity (Most Credible): Catastrophic (1), Critical (2), Marginal (3), Negligible (4)
Frequency: Frequent (A, P > 1E-3), Probable (B, 1E-4 < P <= 1E-3), Occasional (C, 1E-5 < P <= 1E-4), Remote (D, 1E-6 < P <= 1E-5), Improbable (E, 1E-7 < P <= 1E-6)

[Matrix: each severity/frequency cell is assigned to a decision authority: Army Acquisition, PEO Aviation, or Program Management.]

Hazard Category – Description
1 Catastrophic – Death or permanent total disability; system loss
2 Critical – Severe injury or severe occupational illness; major system or environmental damage
3 Marginal – Minor injury or minor occupational illness (no permanent effect); minor system or environmental damage
4 Negligible – Less than minor injury or occupational illness (no lost workdays), or less than minor environmental damage

Risk Level – Description – Probability (frequency per 100,000 flight hours)
A – Frequent – > 100 (P > 1E-3)
B – Probable – <= 100 and > 10 (1E-4 < P <= 1E-3)
C – Occasional – <= 10 and > 1 (1E-5 < P <= 1E-4)
D – Remote – <= 1 and > 0.1 (1E-6 < P <= 1E-5)
E – Improbable – <= 0.1 and > 0.01 (1E-7 < P <= 1E-6)

Presenter
Presentation Notes
As already mentioned, it is the goal of the US Army Aviation Engineering Directorate that the system is safe and reliable to operate and will perform the mission when delivered. Another goal is that the released system continues to safely perform the mission if maintained and operated per the operator's manual. Replacement parts and overhaul work must be high quality to support continued airworthiness. Per the Program Executive Office (PEO) Memorandum 08-03 risk matrix, US Army flight control systems are to achieve a 1E-9 requirement for flight-critical functions per civil airspace regulations [35] and 1E-6 for tactical use [36]. Quantifying these numbers is established practice for component hardware but not for software. Just as hardware has quantifiable reliability, so should software.
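Because the matrix's frequency bands are pure thresholds, they reduce to a lookup. The function below is a minimal sketch of that classification (per-flight-hour probabilities, thresholds copied from the table above); it is an illustration, not an approved safety tool.

    def risk_level(p_per_flight_hour: float) -> str:
        """Map a hazard probability per flight hour to the matrix's
        frequency levels; thresholds copied from the slide above."""
        if p_per_flight_hour > 1e-3:
            return "A (Frequent)"
        if p_per_flight_hour > 1e-4:
            return "B (Probable)"
        if p_per_flight_hour > 1e-5:
            return "C (Occasional)"
        if p_per_flight_hour > 1e-6:
            return "D (Remote)"
        if p_per_flight_hour > 1e-7:
            return "E (Improbable)"
        return "below E (P <= 1E-7)"

    print(risk_level(2e-5))  # -> C (Occasional)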

25

Reliability Defined

• Software Reliability – the probability that a given piece of software will execute without failure in a given environment for a given time [40]
  – Often debated as to how to measure

• Hardware Reliability – the probability that a hardware component operates without failure over time
  – Well defined and established

• System Reliability – the probability of success, or the probability that the system will perform its intended function under specified design limits [over a given amount of time] [39]
  – A combination of software and hardware reliability (a minimal worked combination follows the notes below)

Presenter
Presentation Notes
"Reliability is defined as the probability of success, or the probability that the system will perform its intended function under specified design limits" [39]. "Software reliability is defined as the probability that a given piece of software will execute without failure in a given environment for a given time" [40]. Systems rely on both, and thus must combine the two to formulate an overall reliability.
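As a minimal worked combination, assuming independent hardware and software failure processes with constant failure rates (an illustrative assumption, not part of the cited definitions):

$$R_{\mathrm{sys}}(t) = R_{\mathrm{hw}}(t)\,R_{\mathrm{sw}}(t) = e^{-\lambda_{\mathrm{hw}} t}\, e^{-\lambda_{\mathrm{sw}} t} = e^{-(\lambda_{\mathrm{hw}} + \lambda_{\mathrm{sw}})\, t}$$

For example, assumed rates of lambda_hw = 1E-6 per hour and lambda_sw = 1E-5 per hour give R_sys(10 h) = e^(-1.1E-4), roughly 0.99989.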

26

Hardware vs Software Reliability

Hardware Reliability | Software Reliability

• Failure rate has a bathtub curve; the burn-in state is similar to the software debugging state. | Without considering program evolution, failure rate is statistically non-increasing.

• Material deterioration can cause failures even though the system is not used. | Failures never occur if the software is not used.

• Failure data are fitted to some distributions; the selection of the underlying distribution is based on the analysis of failure data and experience. Emphasis is placed on analyzing failure data. | Most models are analytically derived from assumptions; emphasis is on developing the model, the interpretation of the model assumptions, and the physical meaning of the parameters.

• Failures are caused by material deterioration, design errors, misuse, and environment. | Failures are caused by incorrect logic, incorrect statements, or incorrect input data.

• Can be improved by better design, better materials, applying redundancy, and accelerated life-cycle testing. | Can be improved by increasing testing effort and correcting discovered faults; reliability tends to change continuously during testing due to the addition of problems in new code or the removal of problems by debugging.

• Hardware repairs restore the original condition. | Software repairs establish a new piece of software.

• Hardware failures are usually preceded by warnings. | Software failures are rarely preceded by warnings.

• Hardware components can be standardized. | Software components have rarely been standardized.

• Hardware can usually be tested exhaustively. | Software essentially requires infinite testing for completeness.

Reference: [39] Hoang Pham, "Software Reliability," Springer, 2000.

Presenter
Presentation Notes
"Reliability is defined as the probability of success, or the probability that the system will perform its intended function under specified design limits" [39]. "Software reliability is defined as the probability that a given piece of software will execute without failure in a given environment for a given time" [40]. Hoang Pham compares software versus hardware reliability as shown in Table 2 [39]. Systems rely on both, and thus must combine the two to formulate an overall reliability.
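To illustrate the first row of the comparison, a hardware hazard rate is often sketched as a bathtub curve: a decreasing infant-mortality term, a constant useful-life term, and an increasing wear-out term. The functional form and constants below are assumed for illustration only, not taken from any component database.

    import math

    def bathtub_hazard(t_hours: float) -> float:
        """Assumed hardware bathtub curve: infant mortality decays,
        a constant useful-life rate remains, and wear-out grows."""
        infant = 1e-4 * math.exp(-t_hours / 500.0)
        useful_life = 1e-6
        wear_out = 1e-16 * t_hours ** 2
        return infant + useful_life + wear_out

    for t in (0, 1_000, 10_000, 100_000):
        print(f"t = {t:>7,} h  hazard = {bathtub_hazard(t):.3e} per hour")

By contrast, the table notes that a software failure intensity, absent program evolution, only stays level or decreases as faults are found and removed.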

27

Acronym List

AADL – Architecture Analysis and Design Language
AC – Advisory Circular (FAA)
ACM – Association for Computing Machinery
AED – Aviation Engineering Directorate (AMRDEC)
AFTD – Aviation Flight Test Directorate (US Army)
AGC – Apollo Guidance Computer
AHS – American Helicopter Society
AIAA – American Institute of Aeronautics and Astronautics (Inc.)
AMCOM – Aviation and Missile Command (US Army)
AMRDEC – Aviation and Missile Research, Development and Engineering Center (US Army)
AR – Army Regulation
ARINC – Aeronautical Radio, Inc.
ARP – Aerospace Recommended Practice
ASIF – Avionics Software Integration Facility
ATAM – Architecture Tradeoff Analysis Method
ATM – Air Traffic Management
AWR – Airworthiness Release
CAAS – Common Avionics Architecture System
CH-47 – Cargo Helicopter, Chinook
CMM – Capability Maturity Model
CMMI – Capability Maturity Model Integration
CMU – Carnegie Mellon University
CNS – Communications, Navigation, Surveillance
CoSMIC – Component Synthesis using Model-Integrated Computing
CPS – Cyber-Physical System
CRC – Chemical Rubber Company (i.e., CRC Press)
DFBW – Digital Fly-By-Wire
DoD – Department of Defense
E3 – Electrical and Electromagnetic Effects
ESML – Embedded System Modeling Language
FAA – Federal Aviation Administration
FCS – Future Combat Systems
FHA – Functional Hazard Assessment
FMEA – Failure Modes Effects Analysis

28

Acronym List (concluded)

GPWS – Ground Proximity Warning System
IBM – International Business Machines
IEC – International Engineering Consortium
IL – Instrumentation Lab (now Draper Laboratory)
IMA – Integrated Modular Avionics
INCOSE – International Council on Systems Engineering
ISO – International Organization for Standardization
ISS – International Space Station
KAL – Korean Airlines
MISRA – Motor Industry Software Reliability Association
MIT – Massachusetts Institute of Technology
NASA – National Aeronautics and Space Administration (USA)
PDR – Preliminary Design Review
PEO – Program Executive Office
PNAS – Proceedings of the National Academy of Sciences
RAQ – Rotorcraft and Aircraft Qualification
RMA – Rate Monotonic Analysis
RTC – Redstone Test Center (US Army)
RTTC – Redstone Technical Test Center (US Army)
RTCA – Radio Technical Commission for Aeronautics
SAE – Society of Automotive Engineers
SED – Software Engineering Directorate (AMRDEC)
SEES – Software Engineering Evaluation System
SEI – Software Engineering Institute (CMU)
SIL – System Integration Laboratory
SSA – System Safety Assessment
STS – Space Transportation System
SysML – Systems Modeling Language
TMR – Triple Modular Redundant
TRL – Technical Readiness Level
UAS – Unmanned Aircraft System
UH-60 – Utility Helicopter, Blackhawk
UML – Unified Modeling Language
US – United States
USL – Universal Systems Language

29

References

[1] Israel Koren and Mani Krishna, "Fault-Tolerant Systems," Morgan Kaufmann, 2007.
[2] Jiantao Pan, "Software Reliability," Carnegie Mellon University, Spring 1999.
[3] Nachum Dershowitz, http://www.cs.tau.ac.il/~nachumd/horror.html
[4] http://www.air-attack.com
[5] David A. Mindell, "Digital Apollo: Human and Machine in Spaceflight," The MIT Press, 2008.
[6] Software Engineering Directorate, Software Engineering Evaluation System (SEES), "Program Manager Handbook for Flight Software Airworthiness," SED-SES-PMHFSA 001, December 2003.
[7] Software Engineering Directorate, Software Engineering Evaluation System (SEES), "Program Manager Handbook for Software Safety," SED-SES-PMHSSA 001, February 2006.
[8] AMCOM Regulation 385-17, "Software System Safety Policy," 15 March 2008.
[9] "NASA Software Safety Guidebook," NASA-GB-8719.13, 31 March 2004.
[10] Margaret Hamilton and William Hackler, "Universal Systems Language: Lessons Learned from Apollo," IEEE Computer Society, 2008.
[11] Margaret Hamilton, "Full Life Cycle Systems Engineering and Software Development Environment: Development Before The Fact In Action," http://www.htius.com/Articles/Full_Life_Cycle.htm
[12] Peter Feiler, David Gluch, and John Hudak, "The Architecture Analysis & Design Language (AADL): An Introduction," CMU/SEI-2006-TN-011, February 2006.
[13] Peter Feiler and John Hudak, "Developing AADL Models for Control Systems: A Practitioner's Guide," CMU/SEI-2007-TR-014, July 2007.
[14] Bruce Lewis, "Using the Architecture Analysis and Design Language for System Verification and Validation," SEI presentation, 2006.
[15] Feiler, Gluch, Hudak, and Lewis, "Embedded System Architecture Analysis Using SAE AADL," CMU/SEI-2004-TN-004, June 2004.
[16] Charles Pecheur and Stacy Nelson, "V&V of Advanced Systems at NASA," NASA/CR-2002-211402, April 2002.
[17] Systems Integration Requirements Task Group, "ARP 4754: Certification Considerations for Highly-Integrated or Complex Aircraft Systems," SAE Aerospace, 10 April 1996.
[18] SAE, "ARP 4761: Guidelines and Methods for Conducting the Safety Assessment Process on Civil Airborne Systems and Equipment," December 1996.
[19] Aeronautical Radio, Inc. (ARINC), "ARINC Specification 653P1-2: Avionics Application Software Standard Interface, Part 1 - Required Services," 7 March 2006.

30

References (Continued)

[20] Department of Defense, "MIL-STD-882D, Standard Practice for System Safety," 19 January 1993.
[21] RTCA, Inc., "DO-178B: Software Considerations in Airborne Systems and Equipment Certification," 1 December 1992.
[22] RTCA, Inc., "DO-254: Design Assurance Guidance for Airborne Electronic Hardware," 19 April 2000.
[23] U.S. Army, "Aeronautical Design Standard Handbook: Rotorcraft and Aircraft Qualification (RAQ) Handbook," 21 October 1996.
[24] Cary R. Spitzer (editor), "Avionics: Elements, Software and Functions," CRC Press, 2007.
[25] U.S. Army, "Army Regulation 70-62: Airworthiness Qualification of Aircraft Systems," 21 May 2007.
[26] U.S. Army, "Army Regulation 95-1: Aviation Flight Regulations," 3 February 2006.
[27] Barbacci, Clements, Lattanze, Northrop, and Wood, "Using the Architecture Tradeoff Analysis Method (ATAM) to Evaluate the Software Architecture for a Product Line of Avionics Systems: A Case Study," CMU/SEI-2003-TN-012, July 2003.
[28] Peter Feiler, "All in the Family: CAAS & AADL," CMU/SEI-2008-SR-021, August 2008.
[29] Chrissis, Konrad, and Shrum, "CMMI: Guidelines for Process Integration and Product Improvement," Pearson Education, 2007.
[30] Brendan O'Connell, "Model Driven Performance Analysis for Avionics Systems," Draper Laboratory, January 2006.
[31] John F. Hanaway and Robert W. Moorehead, "Space Shuttle Avionics Systems," NASA SP-504, 1989.
[32] Lui Sha, "The Complexity Challenge in Modern Avionics Software," August 14, 2006.
[33] "Incidents Prompt New Scrutiny of Airplane Software Glitches," Wall Street Journal, 30 May 2006.
[34] Eyal Ophir, Clifford Nass, and Anthony Wagner, "Cognitive Control in Media Multitaskers," PNAS, 20 July 2009.
[35] Federal Aviation Administration, "Advisory Circular AC 25.1309-1A: System Design and Analysis," 21 June 1988.
[36] Program Executive Office Policy Memorandum 08-03.
[38] National Science Foundation webpage on Cyber-Physical Systems, http://www.nsf.gov/pubs/2008/nsf08611/nsf08611.htm
[39] Hoang Pham, "Software Reliability," Springer, 2000.
[40] Paul Rook (editor), "Software Reliability Handbook," Elsevier Science Publishers Ltd., 1990.

31

References (Concluded)

[41] http://www.navair.navy.mil/v22/index.cfm?fuseaction=news.detail&id=128
[42] http://mars8.jpl.nasa.gov/msp98/news/mco990930.html
[43] John Garman, "The Bug Heard Around the World," ACM SIGSOFT, October 1981.
[44] "Mars Pathfinder Mission Status," July 15, 1997, http://marsprogram.jpl.nasa.gov/MPF/newspio/mpfstatus/pf970715.html
[45] Nancy Leveson, "Safeware: System Safety and Computers," Addison-Wesley Publishing Company, 1995.
[46] http://www.electronicaviation.com/aircraft/JAS-39_Gripen/810
[47] Brandon Hill, "Lockheed's F-22 Raptor Gets Zapped by International Date Line," DailyTech LLC, February 26, 2007, http://www.freerepublic.com/focus/f-news/1791574/posts
[48] http://www.military.com/news/article/human-error-cited-in-most-uav-crashes.html
[49] Daniel Michaels and Andy Pasztor, "Incidents Prompt New Scrutiny of Airplane Software Glitches: As Programs Grow Complex, Bugs Are Hard to Detect; A Jet's Roller-Coaster Ride; Teaching Pilots to Get Control," Wall Street Journal, May 30, 2006.


4

Defense Acquisition Approach to Systems Development and Test

Requirements Establishment

Analysis

High Level Design

Detailed Specifications

Implementation Coding

Operational Testing amp Validation

Verification

Development Testing

Deployed System

Presenter
Presentation Notes
The general recommended system development V-curve as shown in is not always followed in a strict sense although it should be the goal Such negligence to the proper process makes the establishment of certification very difficult For new system development and existing system upgrades requirements must be clear complete and testable The certification requirements must be made obvious in the development of the requirements establishment phase with the goal of being fully identified during the requirements development

5

US Army Airworthiness

Airworthiness Qualification meansThe system is safe and reliable to operate and willperform the mission when deliveredIt will continue to safely perform the mission ifmaintainedoperated per the manualParts and overhaul work must be high quality to maintain airworthinessFlight control systems have high reliability requirements - 10-9 for civil airspace critical IFR functions [35]- 10-6 for tactical airspace [36]

Presenter
Presentation Notes
The US Army has been involved in flight since the early Wright-B flight of 1909 Initially fixed wing and rotary wing aircraft in early aviation remained simple and federated The basics to aviate navigate and communicate were handled by dedicated gauges compasses gyroscopes and mechanical linkages to fly the aircraft With the dawn of the Space Age in the 1960s the rise of more complex electronic control in aviation appeared with the ApolloSaturn projects NASA Dryden F-8 project and others As time progressed more integrated modular avionics (IMA) and digital fly by wire (DFBW) came into military and commercial aircraft to reduce weight increase functionality and provide redundancy With this advancement of complex flight control technology testing verification and validation became problematic To this day complex systems have been challenging to qualify for the Federal Aviation Administration (FAA) and military airworthiness commands13The US Army AED holds the responsibility for airworthiness approval This includes the design approval production approval and continued airworthiness In the design approval AED must determine if the system design or modification meets airworthiness standards In the milestone decisions it must determine whether the system is produced or modified in accordance with the approved design For continued airworthiness a judgment must be made for the system operated and maintained to keep it compliant with the approved design The bottom line is that the qualified system meets its performance requirements and is reliable when delivered1313Establishing approval for AWR has become increasingly difficult with the evolution of complex avionic control systems from federated architectures to IMA architectures that rely heavily on complex software systems (ie now identified as cyber-physical systems) As stated earlier AED has the mission to ensure airworthiness qualification for aircraft and subsystems used in the US Army aviation fleet including helicopters airplanes and Unmanned Aircraft Systems (UAS) 13

6

AED and Qualification

bull AEDrsquos mission is to ensure airworthiness qualification for aircraft and subsystems used in the US Army fleet

bull Airworthiness Qualification isndash Demonstration of an aircraft or aircraft subsystem or

component including modifications to function safely meeting performance specifications when used and maintained within prescribed limits (AR 70-62)

bull Traditionally qualified systems by ndash Similarity ndash Analysisndash Testndash Demonstrationndash Examination

Presenter
Presentation Notes
Airworthiness qualification is the demonstration of an aircraft or aircraft subsystem or component including modifications to function satisfactorily when used and maintained within prescribed limits [25] The US Army has traditionally qualified systems and components by physical testing analysis demonstration or by similarity

7

Evolution of Helicopter Systems

bull Past systems historically federated ndash Distributed Functionality ndash Hardware basedndash Simple easier to test

bull Now systems are becoming integratedndash Combined Functionality ndash More Software Intensivendash Complex more difficult to test

Chief Warrant Office Jim Beaty (back row far left) and crew of the of the Vietnam UH-1 Flying Bikinis (friend of Alex Boydston)

UH-1 Cockpit (US Army)

Chinook CAAS Cockpit (US Army)

CH-47 Chinook (US Army)

Presenter
Presentation Notes
The US Army has been involved in flight since the early Wright-B flight of 1909 Initially fixed wing and rotary wing aircraft in early aviation remained simple and federated The basics to aviate navigate and communicate were handled by dedicated gauges compasses gyroscopes and mechanical linkages to fly the aircraft With the dawn of the Space Age in the 1960s the rise of more complex electronic control in aviation appeared with the ApolloSaturn projects NASA Dryden F-8 project and others As time progressed more integrated modular avionics (IMA) and digital fly by wire (DFBW) came into military and commercial aircraft to reduce weight increase functionality and provide redundancy With this advancement of complex flight control technology testing verification and validation became problematic To this day complex systems have been challenging to qualify for the Federal Aviation Administration (FAA) and military airworthiness commands13

8

Present Approach to Testing

bull Several disciplines weigh in such as software avionics GNampCenvironmental E3 electrical human factors structures aerodynamics etc

bull Current test methodology per older federated systemsndash Hardware Mil-Std 810 Mil-Std 461ndash Requirements Analysis (Traceability)

bull Test at different levelsndash Individual software module testingndash Black box testingndash System Integration testing

bull Hot benchbull System Integration Lab (SIL)

ndash Aircraft level Testing bull Groundbull Flight Aviation Flight Test Directorate (AFTD) Testing (US Army Photo)

Aviation System Integration Facility (ASIF) (US Army Photo)

Presenter
Presentation Notes
Airworthiness qualification is the demonstration of an aircraft or aircraft subsystem or component including modifications to function satisfactorily when used and maintained within prescribed limits [25] The US Army has traditionally qualified systems and components by physical testing analysis demonstration or by similarity Historically most systems were federated They were hardware based simple and distributed Now they have become more integrated They are more software intensive complex and have combined functionality contained in one or more computers With this evolution from simple to more complex the Army is finding it more difficult to execute an AWR As systems evolve to more complex systems of systems this problem is only growing worse13The current test approach to achieving confidence in systems for an AWR for the US Army is based more on traditional federated avionics systems Experienced personnel in Software Vehicle Management Systems Avionic Communications Navigation amp Guidance Control Electrical Environmental Human Factors Electrical and Electro-magnetic Effects (E3) Structures Thermal Safety Integration Testing and Flight Test personnel and test pilots all play important roles in accomplishing test and review of new and existing systems While some may not consider areas such as thermal or EEE important to software development they are crucial since the software is running on physical systems that are affected by heat and susceptibility to electromagnetic radiation which can cause abnormal operation or bit errors Current test methodology for hardware relies on MIL-STD-810 MIL-STD-461 and requirements analysis such as traceability MIL-STD-810 is the Department of Defense Test Method Standard for Environmental Engineering Considerations and Laboratory Tests1313It focuses on equipment environmental design and test limits to the conditions that it will experience throughout its service life and establishing chamber test methods that replicate the effects of environments on the equipment rather than imitating the environments themselves MIL-STD-461 includes the ldquoRequirements for the Control of Electromagnetic Interference Characteristics of Subsystems and Equipmentrdquo These test standards are geared for hardware and physical effects and do not address software operation Current methods for software testing include testing the individual software modules black box (or interface) level system integration level and aircraft level testing System integration testing can include hot bench system integration labs (SILs) (such as the shown in Figure 4) and aircraft ground level testing as conducted by the Aviation Flight Test Directorate (AFTD) of Redstone Test Center (RTC) (see Figure 5) 13Neglecting complete and good requirements promotes risk It is common practice by Program Managers (PMs) to accept risks As issues are found in systems it is on AED to issue an Army Safety Action Memorandum (ASAM) that identifies a deficiency to the field pilots Concurrently an Airworthiness Impact Statement (AWIS) is issued to the PMs which contain a probabilistic analysis of how the identified shortcoming will affect risk The AMCOM Safety Office will produce its own calculations The PMs can either accept or reject this assessment Regardless cautions and warnings are placed in the AWR to keep the entire program and flight crew aware of issues13

9

Development Challenges

bull Legacy Aircraft often upgraded in a piecemeal fashionndash Makes certification difficultndash Desire to increase to modern requirements based on size of upgrade and

what it includes ndash hard to scope

bull New system requirements must be clear complete and testable ndash Certification requirements must be obvious

bull Orchestrating agreement between stakeholders is necessary to mitigatendash Juggling of multiple software buildsndash End system that is difficult to test certify and deployndash Escalating Costsndash System Safety from being poorly understoodndash Design iterations

Presenter
Presentation Notes
It would be wonderful if all systems were straightforward in design easily testable and simple to write an Airworthiness Release (AWR) for however that is not the case Legacy aircraft such as the Chinook have been upgraded in a piecemeal fashion acquiring much needed improvements in aviation navigation and communication The general recommended system development V-curve as shown earlier is not always followed in a strict sense although it should be the goal process Negligence to the proper process makes the establishment of certification very difficult For new system development and existing system upgrades requirements must be clear complete and testable The certification requirements must be made obvious in the development of the requirements establishment phase with the goal of being fully identified during the requirements development Orchestrating agreement among all stakeholders (eg the program manager systems engineers human factors engineers integrators test engineers manufacturers users and certifiers) is necessary to mitigate problems such as13juggling multiple software builds13producing a difficult-to-test difficult-to-certify and difficult-to-deploy systems13misunderstanding system safety and13requiring design iterations that impact schedules and costs

10

Complexity Issues

bull System Development costs and schedule increase with complexityndash Existing lack of schedule and funding resources

bull Keeps systems from achieving full compliance with specifications and requirements

bull Garbage in -gt Garbage OuthellipPoor requirements -gt Poor Systemndash Finding problems in new designs at PDR is too latendash Difficult to correct existing poorly designed fielded complex systems

bull Complexity amp reliability of complex systems is not fully understoodndash How do we accurately assess operating risk performance reliability of

complex systems based on limited testing and analysisndash How do we know when system design is good enoughndash Latent defects occur in supposedly well-tested mature systems

bull Avionics parts and software change constantlyndash Spiral development -gt new softwarehardware qualification required frequently ndash How do we streamline the process (partition the system) so the need for

complete re-qualification after changes is lessened

Presenter
Presentation Notes
In an old computer development lesson it is well known that if you put ldquogarbage in you will get garbage outrdquo Likewise in a system if you have poor requirements then you will end up with a poor system Finding problems in new designs at Preliminary Design Review (PDR) is too late It has been shown that discovering issues at this stage in the game will cause a reiteration on the design which costs time and money Furthermore dealing with existing fielded complex systems that have not gone through the rigors of proper systems engineering results quite often with an overwhelming mess of functionality to test verify validate qualify and certifydesigns at PDR is too late It has been shown that discovering issues at this stage in the life-cycle will cause a reiteration on the design which costs time and money Furthermore dealing with existing fielded complex systems that have not gone through the rigors of proper systems engineering results quite often with a complicated conglomeration of functionality to test verify validate qualify certify and maintain This is indicative where complexity has exceeded our understanding in how to certify a system We still do not fully understand complexity and how to address reliability of complex systems How do we accurately assess operating risks performance and reliability of complex systems based on limited testing and analysis How do we know when a system design is good enough How do we modularize spirally developed systems to minimize the need for re-qualification of unchanging portions of the system We are 30 plus years into this technology and we still deal with systems with latent defects that are occurring in supposedly well-tested and mature systems To further exacerbate the problem we are now dealing with complex system of systems (ie cyber-physical systems)13It is a given that you can keep on adding redundancy and complexity to a problem to attain a desired level of reliability but at some point in time the reliability will taper off At best we sometimes must satisfy for an optimum point before digressing in reliability In the same vein system development costs and schedule increase with complexity too (see Figure 7 and 8)13Avionics parts and software constantly change over the life of a program Typically a spiral development program occurs with complex software development which means that qualification is required frequently This begs the question of how to streamline the process so that the need to conduct a complete requalification is avoided13With these complex systems there are other hurdles to cross such as fully characterizing and conducting the Functional Hazard Assessments (FHAs) and Failure Mode Effects and Criticality Analyses (FMECAs) It is crucial for the safety assessment that these are conducted correctly to fully understand the risks and later perform the correct tests at the right level Additionally once the complex component hardware and software are integrated then yet other problems appear It is at that time that the disparity of teams crossing multiple contractors and development groups becomes obvious and if not properly coordinated could cause impacts to the schedule Other programmatic problems affect complex system development and qualification For instance lack of schedule and funding resources causes a shortcoming to adequately provide for the proper compliance with specification and requirements short-circuiting the systems engineering process An ever decreasing availability of trained engineers to support the 
development and test of such systems exists Non-technical political influences sometimes affect the reliability of complex systems and are difficult to avoid Lastly there is a lack of a centralized database that captures the various families of systems that have been built along with their characterization of success and failures Such a database from all past and present government complex systems could be valuable in establishing reliability basis for future models

11

Complexity Issues (continued)

bull Functional Hazard Assessments and related documentation are crucial

ndash Understanding risks ndash Performing the correct tests at the right level Lab test vs Flight Test

bull Saves flight time and money

bull Systems Integration for complex systems is a schedule driver

bull Need experienced personnel to work complex systems

bull Need a centralized database - just doesnrsquot existndash Determine data needed for quantifying reliability of complex systemsndash Capture the pertinent data on previous or existing complex systemsndash Understand successes and failures of previous and present complex systemsndash Establish baseline reliability figures for known architectures

bull Complex System of Systems exacerbates problem

12

Reliability vs Complexity ampCost vs Complexity

Notional Graphs

bull Reliability vs Complexity bull Cost amp Schedule vs Complexity

Rel

iabi

lity

Complexity Complexity

Cos

t amp S

ched

uleOptimum

Aggregation of part reliability

feeds into overall system reliability

Desired

Presenter
Presentation Notes
Avionics parts and software constantly change over the life of a program Typically a spiral development program occurs with complex software development which means that qualification is required frequently This begs the question of how to streamline the process so that the need to conduct a complete re-qualification is avoided13With these complex systems there are other hurdles to cross such as fully characterizing and conducting the Functional Hazard Assessments (FHAs) and Failure Mode Effects and Criticality Analyses (FMECAs) It is crucial for the safety assessment that these are conducted correctly to fully understand the risks and later perform the correct tests at the right level13Additionally once the complex component hardware and software are integrated then yet other problems crop up It is at that time that the disparity of teams crossing multiple contractors and development groups becomes obvious and if not properly coordinated could cause impacts to the schedule 13Other programmatic problems affect complex system development and qualification For instance lack of schedule and funding resources causes a shortcoming to adequately provide for the proper compliance with specification and requirements short-circuiting the systems engineering process An ever decreasing availability of trained engineers to support the development and test of such systems exists Non-technical political influences sometimes affect the reliability of complex systems and are difficult to avoid Lastly there is a lack of a centralized database that captures the various families of systems that have been built along with their characterization of success and failures Such a database from all past and present government complex systems could be valuable in establishing reliability basis for future models

13

A Few Examples of Complex Systems

bull This is not a new problem Other have struggled with the challenges of establishing confidence in complex systems

ndash NASAbull Apollo Guidance Computerbull Dryden F8 Crusaderbull Space Shuttlebull International Space Station

ndash Commercial Airlinersbull Airbus A320 and higherbull Boeing B777 B787

ndash Militarybull Ships and Submarinesbull Jets (F14F15 F16 F18 F22 F35 etc)bull Cargo Planes (C130J Hercules C17 Globemaster etc)bull Helicopters (Chinook Blackhawk Sea Stallion etc)bull Rocketsbull Unmanned Aerial Systemsbull Unmanned Ground Systemsbull Unmanned Submarine Systems Photos by US Army NASA US Navy and US Air Force

Presenter
Presentation Notes
As mentioned earlier complex avionics systems are not a new idea Since the early 1960rsquos complex avionic architectures have existed beginning with the ApolloSaturn program Massachusetts Institute of Technology (MIT) Instrumentation Lab (IL) which is now Draper Laboratory and International Business Machines (IBM) led the way with the MITIL Apollo Guidance Computer (AGC) and the Saturn V IBM triple modular redundant (TMR) voting guidance computer system The word software was not even coined at the time but engineers such as Margaret Hamilton MITIL Director of Apollo On-board Software can attest to the fact that some the same issues with creating reliable software then still exists today [5] A large majority of the issues then dealt with the communication between systems engineers and the programmers Requirements were thrown over the wall without the confirmation that the requirements were complete and a lot of the issues cropped up as interface problems Identifying these issues prompted Hamilton to create her own company and create a modeling language called Universal Systems Language (USL) to head off the problems experienced with Apollo [11] Some 200 plus modeling programs have been developed since Apollo and used to mitigate issues and increase confidence in systems of varying complexity13As time progressed other systems came along The NASA Dryden F8 Crusader was the first digital fly by wire (DFBW) jet aircraft that relied heavily on complex IMA and software for flight control The Space Transportation System (STS) shuttle includes a Quad Modular Redundant (QMR) system with a fifth backup flight computer containing uncommon code US Air Force and Naval airplanes that have possessed complex or redundant IMA configurations include the F14 Tomcat F15 Eagle F16 Falcon F18 Hornet F22 Raptor F35 Joint Strike Fighter F117 Nighthawk V22 Osprey C17 Globemaster and many more along with recent Unmanned Air Vehicle Systems (UAVS) The US Army complex systems on helicopters include the13RAH-66 Comanche DFBW Triple Modular Redundant (TMR) architecture13glass cockpit avionics on the UH-60M Blackhawk baseline 13Common Avionics Architecture System (CAAS) glass cockpit on the UH-60M Blackhawk modernization and CH-47F Chinooks13and other aircraft13Additionally there are many self-checking pair engine controller systems along with system of system Future Combat Systems (FCS) and Unmanned Air Vehicle Systems (UAVS) This has also permeated the commercial airliner market with the Airbus 320 and higher Airbus models Boeing 777 and Boeing 787 aircraft With this ever increasing technology something must be done about the reliability issue With such a wealth of data on aviation and non-aviation cyber-physical systems such as submarine ship nuclear medical locomotive and automotive systems there should be adequate information to get a start on modeling systems correctly for reliability Therefore this is not an isolated problem to avionics and other disciplines should aide in resolving this problem13

14

Some Complex System Failures

bull V-22 Osprey crashesbull Mars Climate Orbiter crashbull Mars Pathfinder software resetbull USS Vincennes downing an Airbus 320bull Therac-25 software radiation treatment

failurebull 1989 Airbus A320 air show crashbull China Airlines Airbus Industries A300

crashbull Ariane 5 satellite launcher malfunctionbull Failure of the primary flight system to

sync with the backup during prelaunch of STS-1

bull Mexicana Airlines Boeing 727 airliner crashed into a mountain due to the software not correctly determining the mountain position

bull Loss of the first American probe to Venus

bull Korean Airlines KAL 901 accidentbull Soviet Phobos I Mars probe lost

bull Three Mile Islandbull F-18 fighter plane crash due to bad

exceptionbull F-14 fighter plane lost to

uncontrollable spinbull Swedish Gripen prototype crashedbull Swedish Gripen air-show crashbull F-22 failure crossing the IDLbull 2006 German-Spanish Barracuda UAVbull 2004 FA-22 Raptor stealth fighter jet

crash bull FA-22 Raptor navigation system

software error at Nellis AFBbull 50 cockpit blackouts on A320bull A320 multiple avionics and electrical

failures at Newark NJbull Boeing 777 Malaysian Airlines jetlinerrsquos

nightmarish autopilot rollercoaster ridebull 3000 feet US Army and Air Force UAV

Crashes

bull hellip And Many Morehellip

Presenter
Presentation Notes
Multiple crashes have occurred with the V-22 Osprey [41]13In 1999 the Mars Climate Orbiter crashed because of incorrect units in a program caused by poor systems engineering practices [42 44]13In 1988 an Airbus 320 was shot down by the USS Vincennes because of cryptic and misleading output displayed by the tracking software [3]13In 1989 an Airbus A320 crashed at an air show due to altitude indication and software handling [3]13In 1994 a China Airlines Airbus Industries A300 crash on killing 264 from faulty software [3]13In 1996 the first Ariane 5 satellite launcher destruction mishap was caused by a faulty software design error with a few lines of ADA code containing unprotected variables Horizontal velocity of the Ariane 5 exceeded the Arian 4 resulting in the guidance system veering the rocket off course Insufficient testing did not catch this error which was a carry-over from Ariane 4[3 39]13In 1986 a Mexicana Airlines Boeing 727 airliner crashed into a mountain due to the software not correctly determining the mountain position [39]13In 1986 the Therac-25 radiation therapy machines overdosed cancer patients due to flaw in the computer program controlling the highly automated devices [3 39 45]13During the maiden launch in 1981 of the Discovery space shuttle a failure of the primary flight control computer system to establish sync with the backup during prelaunch [43]13On December 10 1990 the Space Shuttle Columbia had to land early due to computer software problems [39]13In 1997 The Mars Pathfinder software reset problem due to latent task execution caused by priority inversion with a mutex [3 44]13An error in a single FORTRAN statement resulted in the loss of the first American probe to Venus From G J Myers Software Reliability Principles amp Practice p 25 [3]13In September 1999 the Korean Airlines KAL 901 accident in Guam killed 225 out of 254 aboard A worldwide bug was discovered in barometric altimetry in Ground Proximity Warning System (GPWS) From ACM SIGSOFT Software Engineering Notes vol 23 no 1 [3]13The Soviet Phobos I Mars probe was lost due to a faulty software update at a cost of 300 million rubles Its disorientation broke the radio link and the solar batteries discharged before reacquisition From Aviation Week 13 Feb 1989 [3]13An F-18 fighter plane crash due to a missing exception condition From ACM SIGSOFT Software Engineering Notes vol 6 no 2 [3]13An F-14 fighter plane was lost to uncontrollable spin traced to tactical software From ACM SIGSOFT Software Engineering vol9 no5 [3]13In 1989 Swedish Gripen prototype crashed due to software in their digital fly-by-wire system [3 46]13In 1995 another Gripen fighter plane crashed during air-show caused by a software issue [3 46]13On February 11 2007 twelve FA-22 Raptors were forced to head back to the Hawaii when a software bug caused a computer crash as they were crossing International Date Line In the middle of the ocean all their systems comprising navigation fuel and part of the communications systems dumped All the attempts to reboot failed[47]13February 2006 German-Spanish Unmanned Combat Air Vehicle Barracuda crash due to software failure [4]13December 2004 a glitch in the software for controlling flight probably caused an FA-22 Raptor stealth fighter jet to crash on takeoff at Nellis Air Force [4]13In 2008 a United Airbus A320 registration N462UA experienced multiple avionics and electrical failures including loss of all communications shortly after rotation while departing Newark Liberty International Airport in Newark 
New Jersey [NTSB Report Identification DCA08IA033] 13In 2006 a Boeing 777 Malaysian Airlines jetlinerrsquos autopilot caused a stall to occur by climbing 3000 feet Pilots struggled to nose down the plane but plunged into a steep dive After pulling back up the pilots regained control Cause was defective flight software providing incorrect data for airspeed and acceleration confusing the flight computers and initially ignoring the pilotrsquos commands[49]13US Army and Air Force UAV crashes from control system or human error13

15

Lessons Learned from Failures

bull From Nancy Levesonrsquos paper ldquoThe Role of Software in Spacecraft Accidentsrdquondash ldquoFlaws in the safety culture diffusion of responsibility and

authorityndash Limited communication channels and poor information flowndash Inadequate system and software engineering ndash Poor or missing specifications ndash Unnecessary complexity and software functionality ndash Software reuse or changes without appropriate safety analysisndash [Shortcomings] in safety engineering practices ndash Flaws in test and simulation environments ndash Inadequate human factors design for softwarerdquo

Presenter
Presentation Notes
In Dr Nancy Levesonrsquos paper [36] ldquoThe Role of Software in Spacecraft Accidentsrdquo she cited problems with software development issues within NASA on various projects According to Dr Leveson there were ldquoflaws in the safety culture diffusion of responsibility and authority limited communication channels and poor information flow inadequate system and software engineering poor or missing specifications unnecessary complexity and software functionality software reuse or changes without appropriate safety analysis violation of basic safety engineering practices inadequate system safety engineering flaws in test and simulation environments and inadequate human factors design for softwarerdquo While these problems were identified for spacecraft development within NASA and corrected aviation in general could learn from these lessons to mitigate issues with complex systems development

16

Some Current Guidelines

• DO-178B - Software Considerations in Airborne Systems and Equipment Certification
• DO-248B - Final Report for the Clarification of DO-178B
• DO-278 - Guidelines for Communications, Navigation, Surveillance and Air Traffic Management (CNS/ATM) Systems Software Integrity Assurance
• DO-254 - Design Assurance Guidance for Airborne Electronic Hardware
• DO-297 - Integrated Modular Avionics (IMA) Development Guidance and Certification Considerations
• SAE-ARP4754 - Certification Considerations for Highly Integrated or Complex Aircraft Systems
• SAE-ARP4761 - Guidelines and Methods for Conducting the Safety Assessment Process on Airborne Systems and Equipment
• FAA Advisory Circular AC27-1B - Certification of Normal Category Rotorcraft
• FAA Advisory Circular AC29-2C - Certification of Transport Category Rotorcraft
• ISO/IEC 12207 - Software Life Cycle Processes
• ARINC 653 - Specification Standard for Time and Space Partitioning
• MIL-STD-882D - DoD System Safety
• ADS-51-HDBK - Rotorcraft and Aircraft Qualification Handbook
• AR 70-62 - Airworthiness Release Standard
• SED-SES-PMHFSA 001 - Software Engineering Directorate (SED) Software Engineering Evaluation System (SEES) Program Manager Handbook for Flight Software Airworthiness
• SED-SES-PMHSS 001 - SED SEES Program Manager Handbook for Software Safety

WHAT'S MISSING - a Reliability Standard for Complex Systems

Presenter
Presentation Notes
The problems previously stated drove the development of these guidelines; however, there is no standard for system reliability that includes software. There are other standards and circulars that pertain to complex systems, but a reliability standard is missing for complex systems, one which would outline the process for establishing cyber-physical system reliability. This standard should indicate how to model, analyze, and ascertain the projected level of reliability.

17

Certification Assessment Considerations

• Sufficient data and time must be available for airworthiness evaluation

• Certification process
– Currently lengthy
– Depends heavily on human interpretation, trade-offs, and risk mitigation
– Overwhelming for complex integrated systems (FHAs, FTAs, FMECAs, risk mitigation, etc.)

• A consistent industry-wide method is needed to assess a system at any stage of the life-cycle to allow a tradeoff of design alternatives

• Certification tasks outlined in DO-297 should be considered
– Task 1: Module acceptance
– Task 2: Application software/hardware acceptance
– Task 3: IMA system acceptance
– Task 4: Aircraft integration of IMA system, including V&V
– Task 5: Change of modules or applications
– Task 6: Reuse of modules or applications

Presenter
Presentation Notes
In order to execute an AWR, sufficient data must be provided to the appropriate airworthiness authorities to demonstrate that safety requirements are met to the levels per the System Safety Assessment (SSA) or equivalent. The certification process is currently lengthy and depends on much human interpretation of the myriad of complex architecture functions. The current guidelines, such as DO-178B, DO-254, DO-297, SAE-ARP4754, and SAE-ARP4761, along with many other guidelines, outline the proper steps that should be taken. System safety management's military standard is MIL-STD-882, which has been in use for decades. Civilian safety standards for the aviation industry include SAE ARP4754, which shows the incorporation of system safety activities into the design process and provides guidance on techniques to ensure a safe design. SAE ARP4761 contains significant guidance on how to perform the system safety activities described in SAE ARP4754. DO-178B outlines development assurance guidance for aviation software based on the failure condition categorization of the functionality provided by the software. DO-254 embodies similar guidance for aviation hardware. ARINC 653 is a widely accepted standard to ensure time and space partitioning for software. DO-297 does an excellent job of describing the certification tasks for an IMA system, which include:
Task 1: Module acceptance
Task 2: Application software/hardware acceptance
Task 3: IMA system acceptance
Task 4: Aircraft integration of IMA systems, including verification and validation
Task 5: Change of modules or applications
Task 6: Reuse of modules or applications
Taken together, these standards provide guidance that, if followed, will likely result in safe, highly reliable, and cost-effective systems over the life-cycle of the system. Yet while these guidelines exist, there is no consistent industry-wide method to assess a system at any stage of the life-cycle to allow a tradeoff of design alternatives. Also, there is no standard outlining overall reliability for a system that includes both hardware and software reliability. In order to achieve this level of reliability, a standard should be developed to define the process and method to achieve a quantifiable reliability number that would in turn lead to acceptance.

18

Definition of Complexity and Reliability is Needed

[Figure: Components 1-4 (each characterized by complexity fundamentals and reliability parametrics) integrate into Subsystems 1 and 2 (system integration of components, with reliability dependencies), which in turn integrate into the Realized System (with reliability sensitivities), yielding a highly reliable complex system and a certificate (e.g., AWR). Maturity progresses from TRL 3 or 4 at the component level, through TRL 6 or 7 at the subsystem level, to TRL 8 or 9 at the integrated system level.]
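To make the roll-up in the figure concrete, here is a minimal sketch in C that aggregates component reliability parametrics into subsystem and system figures under a simple series-independence assumption; the numbers are illustrative placeholders, and a real standard would also have to capture the reliability dependencies and sensitivities the figure calls out.

```c
#include <stdio.h>

/* Series-independence roll-up: the reliability of an assembly is the
 * product of its parts' reliabilities.  All values below are assumed,
 * illustrative placeholders, not measured data. */
static double series(const double *r, int n)
{
    double total = 1.0;
    for (int i = 0; i < n; i++)
        total *= r[i];
    return total;
}

int main(void)
{
    double sub1_components[] = {0.99999, 0.99997};  /* components 1-2 (assumed) */
    double sub2_components[] = {0.99998, 0.99996};  /* components 3-4 (assumed) */

    double r_sub1 = series(sub1_components, 2);
    double r_sub2 = series(sub2_components, 2);
    double subsystems[] = {r_sub1, r_sub2};
    double r_system = series(subsystems, 2);

    printf("subsystem 1: %.6f\nsubsystem 2: %.6f\nsystem: %.6f\n",
           r_sub1, r_sub2, r_system);
    return 0;
}
```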

Presenter
Presentation Notes
As mentioned in the introduction, the overall goal of this paper is to state the need for an industry standard method for qualifying complex integrated systems to a specified reliability. In order to achieve this, the systems must be broken down to component levels and built up to subsystem and system levels. An overall aggregated system reliability value should result. The goal should be to establish the ability to assess reliability at the component, subsystem, and then system level, with each phase working toward a higher Technology Readiness Level (TRL). The end result would be fed into the accepted Type Certificate (TC) and/or AWR.

19

Analytical Models and Reliability

• Analytical models of hardware reliability are well understood

• Architecture modeling and software reliability modeling are not novel ideas, but they are highly debated
– There are many approaches and little consensus as to the best way
– Many models (Jelinski-Moranda, Littlewood-Verrall, Musa-Okumoto, etc.) [1]; a minimal sketch of one follows below
– Many tools (over 200 tools have been built since the 1970s) [2]

• Predictability of software reliability is of great concern because software is a major contributor to unreliability [2]

• Software reliability is the probability of error-free operation of a computer program in a specified environment for a specified time [1]

• Need a basis for setting reliability figures based on previous systems, and a way to iteratively refine those figures in the future

• NOT A REPLACEMENT FOR TESTING AND VERIFICATION
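To make one of the model families named above concrete, here is a minimal sketch of the Musa-Okumoto logarithmic Poisson model in C. The parameter values are assumptions chosen purely for illustration, not figures from any fielded system.

```c
#include <math.h>
#include <stdio.h>

/* Musa-Okumoto logarithmic Poisson execution-time model (one of the
 * software reliability growth models cited above). */

/* Expected cumulative failures after t hours of execution. */
static double mu(double lambda0, double theta, double t)
{
    return (1.0 / theta) * log(1.0 + lambda0 * theta * t);
}

/* Reliability: probability of failure-free operation for the next
 * 'mission' hours, given t hours of prior test exposure (NHPP form). */
static double reliability(double lambda0, double theta, double t, double mission)
{
    return exp(-(mu(lambda0, theta, t + mission) - mu(lambda0, theta, t)));
}

int main(void)
{
    double lambda0 = 0.05; /* initial failure intensity, failures/hour (assumed) */
    double theta   = 0.02; /* failure intensity decay parameter (assumed)        */

    for (double t = 0.0; t <= 10000.0; t += 2500.0)
        printf("test hours %7.0f -> R(2-hour mission) = %.6f\n",
               t, reliability(lambda0, theta, t, 2.0));
    return 0;
}
```

Estimating lambda0 and theta from observed failure data, and defending the extrapolation, is precisely where the debate over these models lives.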

Presenter
Presentation Notes
As mentioned in the introduction, the overall goal of this paper is to state the need for an industry standard method for qualifying complex integrated systems to a specified reliability. While not suggesting a total solution, better modeling practices should be considered as part of a solution to bridge the gap between design, test, and implementation. A method to model the architecture early in the requirements establishment phase, follow the detailed design and coding, and then verify the system along with the model may be a path to greater confidence in the system and may reduce the risks, warnings, and cautions that must be issued. In order to achieve this, the systems must be broken down to component levels and built up to subsystem and system levels. An overall aggregated system reliability value should result (see Figure 9). The goal should be to establish the ability to assess reliability at the component, subsystem, and then system level, with each phase working toward a higher Technology Readiness Level (TRL). The end result would be fed into the accepted Type Certificate (TC) or AWR. To achieve this goal, modeling and analysis tools that follow a standard for modeling reliability should exist. As previously stated, over 200 tools have been created since the 1970s. Here is a list of a few of the current tools:
Universal Systems Language (USL)
Unified Modeling Language (UML)
Systems Modeling Language (SysML)
MATLAB/Simulink
Telelogic Rhapsody
MathCAD
Colored Petri Nets
Rate Monotonic Analysis (RMA)
STATEMATE (used by Airbus)
Standard for the Development of Safety-Critical Embedded Software (SCADE)
OPNET
Embedded System Modeling Language (ESML)
Component Synthesis using Model-Integrated Computing (CoSMIC)
Architecture Analysis and Design Language (AADL)
By no means is this list complete. Typically, different companies and projects address this challenge and choose their own tools to perform the upfront analysis and modeling, not following a standard approach. Multiple tools need to converge or be compatible with the framework of a common tool such as AADL. The resultant tool(s) could be used to verify the requirements up front to mitigate reliability issues down the life-cycle.

20

Tools for Modeling and Analysis

• Universal Systems Language (USL)
• Unified Modeling Language (UML)
• Systems Modeling Language (SysML)
• MATLAB/Simulink
• Telelogic Rhapsody
• MathCAD
• Colored Petri Nets
• Rate Monotonic Analysis (RMA)
• STATEMATE (used by Airbus)
• SCADE
• OPNET
• Embedded System Modeling Language (ESML)
• Component Synthesis using Model-Integrated Computing (CoSMIC)
• Architecture Analysis and Design Language (AADL)
• At least 200 more packages since the '70s
• Certified tools need to converge to an accepted standard modeling/analysis method for complex system reliability

Presenter
Presentation Notes
Typically, different companies and projects address the challenge and choose their own tools to perform the upfront analysis and modeling, not following a guideline approach. Multiple tools need to converge or be compatible with a set modeling standard for complex systems. The compliant tools could be used to verify the requirements up front to mitigate reliability issues down the life-cycle. A notional approach would follow that shown in Figure 8, where the system "V" is followed but architectural modeling and analysis proceed in parallel with the real development effort. This would allow reliability to be measured during the design phase and again during the implementation, test, and verification phase using the model. The downside to modeling, in certain circles, is getting people to believe those models. How do you certify a modeling tool, and the actual models within the tool? Those issues should be addressed going forward.

21

Modification to Acquisition Model

[Figure: Development "V" with phases Requirements Establishment → High Level Design → Detailed Specifications → Implementation/Coding → Verification → Development Testing → Operational Testing & Validation → Deployed System, with an Architectural Model & Analysis activity running in parallel across the whole "V". Reliability is allocated on the design leg and measured on the test leg.]

Proposed: a standard modeling methodology to be applied at different phases of development to enhance requirements development, reliability allocation, reliability measurement, and testing (DISCLAIMER: DOES NOT REPLACE TESTING).
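As a worked illustration of "reliability allocated" on the design leg and "reliability measured" on the test leg, here is a minimal sketch in C using equal apportionment across serial subsystems; the target and the measured values are assumptions for illustration only.

```c
#include <math.h>
#include <stdio.h>

/* Equal-apportionment allocation on the design leg of the V: given a
 * system target, each of n serial subsystems receives R_i = R_sys^(1/n).
 * All numbers are illustrative assumptions. */
int main(void)
{
    double r_sys_target = 0.999999;  /* assumed system target for the mission */
    int n = 4;                       /* number of serial subsystems (assumed) */

    double r_alloc = pow(r_sys_target, 1.0 / (double)n);
    printf("each of %d subsystems must meet R >= %.9f\n", n, r_alloc);

    /* On the test leg, measured subsystem values multiply back together: */
    double measured[4] = {0.9999999, 0.9999996, 0.9999998, 0.9999997};
    double r_meas = 1.0;
    for (int i = 0; i < n; i++)
        r_meas *= measured[i];
    printf("measured system reliability = %.9f (target %.9f)\n",
           r_meas, r_sys_target);
    return 0;
}
```

Real allocations would weight subsystems by complexity, criticality, and heritage rather than apportioning equally.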

Presenter
Presentation Notes
A notional approach would follow that shown in Figure 10, where the system "V" is followed but architectural modeling and analysis proceed in parallel with the real development effort. This would allow reliability to be measured during the design phase and again during the implementation, test, and verification phase using the model. This approach could bridge the design and test phases together. It is emphasized here that the architectural model would not replace critical testing but would augment the process to allow for better requirement identification and verification. Thorough ground and flight tests should never be replaced by modeling. Modeling would only allow for more robust requirements and a higher level of confidence in the requirements and design. The model could be used in conjunction with the testing to confirm the design. Proper modeling and analysis would reduce total program costs by enforcing more complete and correct initial requirements, which reduces issues discovered later in testing that are expensive or impossible to fix and that force the acceptance of high risks. Additionally, if the model is maintained and optimized, then it could possibly be used after system deployment to analyze the impacts of upgrades or changes to the system, allowing for more complete analysis and reduced overall system redesign costs.

A hurdle to cross with modeling and analysis is convincing people to believe those models. Some method to certify these models and modeling tools should be addressed in the future. Standards should be set in place for correct modeling techniques for complex systems. Lastly, consideration should be given to standard verification checking tools, such as the Motor Industry Software Reliability Association (MISRA) compliance verification tools for the use of C in safety-critical systems.

22

Systems Reliability Standard Establishment

• Establish a working group to define this standard
– Need a technical society to lead the charge on this

• Collaborate with industry, academia, military, and societies
– Focus on development of a reliability standard with AWR safety in mind
– Draw upon their experiences to feed into this standard

• Study existing and previous complex systems
– Shuttle, Space Station, missile systems, nuclear submarine and ship systems, nuclear control systems, military and commercial jet systems
– Obtain software reliability information from existing and previous systems
– Build a database which would serve as the basis for future reliability figures

• Research prior efforts in complex systems analysis

• Establish a consensus-based modeling and analysis method

Presenter
Presentation Notes
In conclusion, methods for achieving a design for complex systems do exist; however, achieving reliability and attaining a level of qualification that would permit better AWRs does not. There needs to be a standard developed to tackle this issue rather than relying on current methods to ascertain airworthiness of complex systems. This will not occur overnight. An orchestrated collaboration among industry, academia, military labs, and technical professional societies to focus on development of this standard should allow us to draw upon their experiences to feed into this reliability standard with AWR safety in mind. We have a long-lived experiment with complex software systems in the Space Transportation System (STS), International Space Station (ISS), missile systems, nuclear submarine and ship systems, nuclear control systems, and military and commercial jet systems, from which we should be able to obtain at least inferred software reliability information from the architectures and run time that the systems use. We should look at the lessons learned from these systems to see what could have been improved and what was done right that should be carried forward. The challenge is collecting this information into a central database and arriving at some figure of reliability from previous systems where data exists. This would at least provide a starting point to allow initial assessments and could be optimized in the future. Also, this is not the only study for establishing reliability metrics for complex software systems. There have been similar research projects that have risen and fallen. The data from those projects should not be wasted but studied to feed into whatever standard is developed. While historical information would be useful, each design is unique and requires tools to accomplish the design. Architectural modeling constructs should be further investigated as a possible augmentation to the design and test process. We need to determine which forum is best to conduct this effort (e.g., SAE, IEEE, AIAA, ACM, AHS, INCOSE, or other). As stated in the paper "Space Shuttle Avionics" [31], "The designers, the flight crew, and other operational users of the system often have a mindset established in a previous program or experience, which results in a bias against new or different 'unconventional' approaches." If nothing is done to address this problem, it will only get worse over time. It is past time to address the issue of reliability of complex systems and software.

23

BACKUP SLIDES

24

PEO Aviation System Safety Management Decision Authority Matrix

[Matrix: most credible hazard severity (1 Catastrophic, 2 Critical, 3 Marginal, 4 Negligible) versus probability level (A Frequent through E Improbable) determines the decision authority: Army Acquisition, PEO Aviation, or Program Management.]

Hazard Category / Description:
1 Catastrophic - Death or permanent total disability; system loss
2 Critical - Severe injury or minor occupational illness (no permanent effect); minor system or environmental damage
3 Marginal - Minor injury or minor occupational illness (no permanent effect); minor system or environmental damage
4 Negligible - Less than minor injury or occupational illness (no lost workdays), or less than minor environmental damage

Risk Level / Description / Probability (frequency per 100,000 flight hours):
A Frequent - > 100 (P > 1E-3)
B Probable - <= 100 and > 10 (1E-4 < P <= 1E-3)
C Occasional - <= 10 and > 1 (1E-5 < P <= 1E-4)
D Remote - <= 1 and > 0.1 (1E-6 < P <= 1E-5)
E Improbable - <= 0.1 and > 0.01 (1E-7 < P <= 1E-6)
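As a small worked example of the probability bands in the table above, the following sketch maps a per-flight-hour failure probability to its A-E level; it is illustrative only, since the actual decision authority also depends on hazard severity.

```c
#include <stdio.h>

/* Illustrative mapping of a per-flight-hour failure probability P to the
 * probability levels defined above (per 100,000 flight hours, 100 failures
 * corresponds to P = 1E-3 per hour). */
static char probability_level(double p)
{
    if (p > 1e-3) return 'A';  /* Frequent   */
    if (p > 1e-4) return 'B';  /* Probable   */
    if (p > 1e-5) return 'C';  /* Occasional */
    if (p > 1e-6) return 'D';  /* Remote     */
    return 'E';                /* Improbable */
}

int main(void)
{
    printf("P = 5e-5 per flight hour -> level %c\n", probability_level(5e-5));
    return 0;
}
```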

Presenter
Presentation Notes
As already mentioned, it is the goal of the US Army Aviation Engineering Directorate that the system is safe and reliable to operate and will perform the mission when delivered. Another goal is that the released system continues to safely perform the mission if maintained and operated per the operator's manual. Replacement parts and overhaul work must be high quality to support continued airworthiness. Per the Program Executive Office Memorandum 08-03 risk matrix, US Army flight control systems are to achieve 1E-9 reliability for flight-critical functions per civil airspace regulations [35] and 1E-6 reliability for tactical use [36]. Quantifying these numbers is established for component hardware but not for software. Just as hardware should have quantifiable reliability, so should software.

25

Reliability Defined

• Software reliability - the probability that a given piece of software will execute without failure in a given environment for a given time [40]
– Often debated as to how to measure

• Hardware reliability - the probability that a hardware component operates without failure over time
– Well defined and established

• System reliability - the probability of success, or the probability that the system will perform its intended function under specified design limits [over a given amount of time] [39]
– A combination of software and hardware reliability (see the sketch below)
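A minimal sketch of that combination, assuming independent hardware and software failures in series with constant failure rates (the rates below are illustrative assumptions, not requirements):

```c
#include <math.h>
#include <stdio.h>

/* Combining hardware and software reliability for a mission of t hours,
 * assuming independence and a series configuration (either failure fails
 * the system).  Rates are illustrative placeholders. */
int main(void)
{
    double t         = 2.0;   /* mission time, hours                        */
    double lambda_hw = 1e-6;  /* hardware failure rate per hour (assumed)   */
    double lambda_sw = 5e-6;  /* software failure rate per hour (assumed)   */

    double r_hw  = exp(-lambda_hw * t);
    double r_sw  = exp(-lambda_sw * t);
    double r_sys = r_hw * r_sw;  /* = exp(-(lambda_hw + lambda_sw) * t) */

    printf("R_hw = %.8f  R_sw = %.8f  R_system = %.8f\n", r_hw, r_sw, r_sys);
    return 0;
}
```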

Presenter
Presentation Notes
"Reliability is defined as the probability of success, or the probability that the system will perform its intended function under specified design limits" [39]. "Software reliability is defined as the probability that a given piece of software will execute without failure in a given environment for a given time" [40]. Systems rely on both, and thus must combine the two to formulate an overall reliability.

26

Hardware vs Software Reliability

Hardware Reliability vs. Software Reliability:

Hardware: Failure rate has a bathtub curve; the burn-in state is similar to the software debugging state.
Software: Without considering program evolution, failure rate is statistically non-increasing.

Hardware: Material deterioration can cause failures even though the system is not used.
Software: Failures never occur if the software is not used.

Hardware: Failure data are fitted to some distributions. The selection of the underlying distribution is based on the analysis of failure data and experience; emphasis is placed on analyzing failure data.
Software: Most models are analytically derived from assumptions; emphasis is on developing the model, the interpretation of the model assumptions, and the physical meaning of the parameters.

Hardware: Failures are caused by material deterioration, design errors, misuse, and environment.
Software: Failures are caused by incorrect logic, incorrect statements, or incorrect input data.

Hardware: Can be improved by better design, better material, applying redundancy, and accelerated life-cycle testing.
Software: Can be improved by increasing testing effort and correcting discovered faults. Reliability tends to change continuously during testing due to the addition of problems in new code or the removal of problems by debugging.

Hardware: Repairs restore the original condition.
Software: Repairs establish a new piece of software.

Hardware: Failures are usually preceded by warnings.
Software: Failures are rarely preceded by warnings.

Hardware: Components can be standardized.
Software: Components have rarely been standardized.

Hardware: Can usually be tested exhaustively.
Software: Essentially requires infinite testing for completeness.

Reference: [39] Hoang Pham, "Software Reliability", Springer, 2000.

Presenter
Presentation Notes
"Reliability is defined as the probability of success, or the probability that the system will perform its intended function under specified design limits" [39]. "Software reliability is defined as the probability that a given piece of software will execute without failure in a given environment for a given time" [40]. Hoang Pham compares software versus hardware reliability as shown in Table 2 [39]. Systems rely on both, and thus must combine the two to formulate an overall reliability.

27

Acronym List

ACRONYM - DEFINITION
AADL - Architecture Analysis and Design Language
AC - Advisory Circular (FAA)
ACM - Association for Computing Machinery
AED - Aviation Engineering Directorate (AMRDEC)
AFTD - Aviation Flight Test Directorate (US Army)
AGC - Apollo Guidance Computer
AHS - American Helicopter Society
AIAA - American Institute of Aeronautics and Astronautics (Inc.)
AMCOM - Aviation and Missile Command (US Army)
AMRDEC - Aviation and Missile Research, Development and Engineering Center (US Army)
AR - Army Regulation
ARINC - Aeronautical Radio, Inc.
ARP - Aerospace Recommended Practice
ASIF - Avionics Software Integration Facility
ATAM - Architecture Tradeoff Analysis Method
ATM - Air Traffic Management
AWR - Airworthiness Release
CAAS - Common Avionics Architecture System
CH-47 - Cargo Helicopter, Chinook
CMM - Capability Maturity Model
CMMI - Capability Maturity Model Integration
CMU - Carnegie Mellon University
CNS - Communications, Navigation, Surveillance
CoSMIC - Component Synthesis using Model-Integrated Computing
CPS - Cyber-Physical System
CRC - Chemical Rubber Company (i.e., CRC Press)
DFBW - Digital Fly-By-Wire
DoD - Department of Defense
E3 - Electrical and Electromagnetic Effects
ESML - Embedded System Modeling Language
FAA - Federal Aviation Administration
FCS - Future Combat Systems
FHA - Functional Hazard Assessment
FMEA - Failure Modes Effects Analysis

28

Acronym List (concluded)

ACRONYM - DEFINITION
GPWS - Ground Proximity Warning System
IBM - International Business Machines
IEC - International Electrotechnical Commission
IL - Instrumentation Lab (now Draper Laboratory)
IMA - Integrated Modular Avionics
INCOSE - International Council on Systems Engineering
ISO - International Organization for Standardization
ISS - International Space Station
KAL - Korean Airlines
MISRA - Motor Industry Software Reliability Association
MIT - Massachusetts Institute of Technology
NASA - National Aeronautics and Space Administration (USA)
PDR - Preliminary Design Review
PEO - Program Executive Office
PNAS - Proceedings of the National Academy of Sciences
RAQ - Rotorcraft and Aircraft Qualification
RMA - Rate Monotonic Analysis
RTC - Redstone Test Center (US Army)
RTTC - Redstone Technical Test Center (US Army)
RTCA - Radio Technical Commission for Aeronautics
SAE - Society of Automotive Engineers
SED - Software Engineering Directorate (AMRDEC)
SEES - Software Engineering Evaluation System
SEI - Software Engineering Institute (CMU)
SIL - System Integration Laboratory
SSA - System Safety Assessment
STS - Space Transportation System
SysML - Systems Modeling Language
TMR - Triple Modular Redundant
TRL - Technology Readiness Level
UAS - Unmanned Aircraft System
UH-60 - Utility Helicopter, Blackhawk
UML - Unified Modeling Language
US - United States
USL - Universal Systems Language

29

References

• [1] Israel Koren and Mani Krishna, "Fault-Tolerant Systems", Morgan Kaufmann, 2007.
• [2] Jiantao Pan, "Software Reliability", Carnegie Mellon University, Spring 1999.
• [3] Nachum Dershowitz, http://www.cs.tau.ac.il/~nachumd/horror.html
• [4] http://www.air-attack.com
• [5] David A. Mindell, "Digital Apollo: Human and Machine in Spaceflight", The MIT Press, 2008.
• [6] Software Engineering Directorate, Software Engineering Evaluation System (SEES), "Program Manager Handbook for Flight Software Airworthiness", SED-SES-PMHFSA 001, December 2003.
• [7] Software Engineering Directorate, Software Engineering Evaluation System (SEES), "Program Manager Handbook for Software Safety", SED-SES-PMHSSA 001, February 2006.
• [8] AMCOM Regulation 385-17, "Software System Safety Policy", 15 March 2008.
• [9] NASA Software Safety Guidebook, NASA-GB-8719.13, 31 March 2004.
• [10] Margaret Hamilton & William Hackler, "Universal Systems Language: Lessons Learned from Apollo", IEEE Computer Society, 2008.
• [11] Margaret Hamilton, "Full Life Cycle Systems Engineering and Software Development Environment: Development Before The Fact In Action", http://www.htius.com/Articles/Full_Life_Cycle.htm
• [12] Peter Feiler, David Gluch, John Hudak, "The Architecture Analysis & Design Language (AADL): An Introduction", CMU/SEI-2006-TN-011, February 2006.
• [13] Peter Feiler, John Hudak, "Developing AADL Models for Control Systems: A Practitioner's Guide", CMU/SEI-2007-TR-014, July 2007.
• [14] Bruce Lewis, "Using the Architecture Analysis and Design Language for System Verification and Validation", SEI Presentation, 2006.
• [15] Feiler, Gluch, Hudak, Lewis, "Embedded System Architecture Analysis Using SAE AADL", CMU/SEI-2004-TN-004, June 2004.
• [16] Charles Pecheur, Stacy Nelson, "V&V of Advanced Systems at NASA", NASA/CR-2002-211402, April 2002.
• [17] Systems Integration Requirements Task Group, "ARP 4754: Certification Considerations for Highly-Integrated or Complex Aircraft Systems", SAE Aerospace, 10 April 1996.
• [18] SAE, "ARP 4761: Guidelines and Methods for Conducting the Safety Assessment Process on Civil Airborne Systems and Equipment", December 1996.
• [19] Aeronautical Radio, Inc. (ARINC), "ARINC Specification 653P1-2: Avionics Application Software Standard Interface, Part 1 - Required Services", 7 March 2006.

30

References (Continued)

• [20] Department of Defense, "MIL-STD-882D: Standard Practice for System Safety", 19 January 1993.
• [21] RTCA, Inc., "DO-178B: Software Considerations in Airborne Systems and Equipment Certification", 1 December 1992.
• [22] RTCA, Inc., "DO-254: Design Assurance Guidance for Airborne Electronic Hardware", 19 April 2000.
• [23] US Army, "Aeronautical Design Standard Handbook: Rotorcraft and Aircraft Qualification (RAQ) Handbook", 21 October 1996.
• [24] Cary R. Spitzer (editor), "Avionics: Elements, Software and Functions", CRC Press, 2007.
• [25] US Army, "Army Regulation 70-62: Airworthiness Qualification of Aircraft Systems", 21 May 2007.
• [26] US Army, "Army Regulation 95-1: Aviation Flight Regulations", 3 February 2006.
• [27] Barbacci, Clements, Lattanze, Northrop, Wood, "Using the Architecture Tradeoff Analysis Method (ATAM) to Evaluate the Software Architecture for a Product Line of Avionics Systems: A Case Study", CMU/SEI-2003-TN-012, July 2003.
• [28] Peter Feiler, "All in the Family: CAAS & AADL", CMU/SEI-2008-SR-021, August 2008.
• [29] Chrissis, Konrad, Shrum, "CMMI: Guidelines for Process Integration and Product Improvement", Pearson Education, 2007.
• [30] Brendan O'Connell, "Model Driven Performance Analysis for Avionics Systems", Draper Laboratory, January 2006.
• [31] John F. Hanaway, Robert W. Moorehead, "Space Shuttle Avionics Systems", NASA SP-504, 1989.
• [32] Lui Sha, "The Complexity Challenge in Modern Avionics Software", August 14, 2006.
• [33] "Incidents Prompt New Scrutiny of Airplane Software Glitches", Wall Street Journal, 30 May 2006.
• [34] Eyal Ophir, Clifford Nass, and Anthony Wagner, "Cognitive Control in Media Multitaskers", PNAS, 20 July 2009.
• [35] "Advisory Circular AC 25.1309-1A: System Design and Analysis", Federal Aviation Administration, 21 June 1988.
• [36] Program Executive Office Policy Memorandum 08-03.
• [38] National Science Foundation webpage on Cyber-Physical Systems, http://www.nsf.gov/pubs/2008/nsf08611/nsf08611.htm
• [39] Hoang Pham, "Software Reliability", Springer, 2000.
• [40] Paul Rook (editor), "Software Reliability Handbook", Elsevier Science Publishers Ltd., 1990.

31

References (Concluded)

• [41] http://www.navair.navy.mil/v22/index.cfm?fuseaction=news.detail&id=128
• [42] http://mars8.jpl.nasa.gov/msp98/news/mco990930.html
• [43] John Garman, "The Bug Heard Around the World", ACM SIGSOFT Software Engineering Notes, October 1981.
• [44] "Mars Pathfinder Mission Status", July 15, 1997, http://marsprogram.jpl.nasa.gov/MPF/newspio/mpf/status/pf970715.html
• [45] Nancy Leveson, "Safeware: System Safety and Computers", Addison-Wesley Publishing Company, 1995.
• [46] http://www.electronicaviation.com/aircraft/JAS-39_Gripen/810
• [47] Brandon Hill, "Lockheed's F-22 Raptor Gets Zapped by International Date Line", DailyTech LLC, February 26, 2007, http://www.freerepublic.com/focus/f-news/1791574/posts
• [48] http://www.military.com/news/article/human-error-cited-in-most-uav-crashes.html
• [49] Daniel Michaels and Andy Pasztor, "Incidents Prompt New Scrutiny Of Airplane Software Glitches: As Programs Grow Complex, Bugs Are Hard to Detect; A Jet's Roller-Coaster Ride; Teaching Pilots to Get Control", Wall Street Journal, May 30, 2006.

  • Qualification and Reliability of Complex Electronic Rotorcraft Systemsby Alex Boydston amp Dr William Lewis AMRDECfor AFRL Safe and Secure Symposium 15-17 June 2010
  • Agenda
  • Objective
  • Defense Acquisition Approach to Systems Development and Test
  • US Army Airworthiness
  • AED and Qualification
  • Evolution of Helicopter Systems
  • Present Approach to Testing
  • Development Challenges
  • Complexity Issues
  • Complexity Issues (continued)
  • Reliability vs Complexity amp Cost vs Complexity
  • A Few Examples of Complex Systems
  • Some Complex System Failures
  • Lessons Learned from Failures
  • Some Current Guidelines
  • Certification Assessment Considerations
  • Definition of Complexity and Reliability is Needed
  • Analytical Models and Reliability
  • Tools for Modeling and Analysis
  • Modification to Acquisition Model
  • Systems Reliability Standard Establishment
  • BACKUP SLIDES
  • PEO Aviation System Safety Management Decision Authority Matrix
  • Reliability Defined
  • Hardware vs Software Reliability
  • Acronym List
  • Acronym List (concluded)
  • References
  • References (Continued)
  • References (Concluded)

5

US Army Airworthiness

Airworthiness Qualification meansThe system is safe and reliable to operate and willperform the mission when deliveredIt will continue to safely perform the mission ifmaintainedoperated per the manualParts and overhaul work must be high quality to maintain airworthinessFlight control systems have high reliability requirements - 10-9 for civil airspace critical IFR functions [35]- 10-6 for tactical airspace [36]

Presenter
Presentation Notes
The US Army has been involved in flight since the early Wright-B flight of 1909 Initially fixed wing and rotary wing aircraft in early aviation remained simple and federated The basics to aviate navigate and communicate were handled by dedicated gauges compasses gyroscopes and mechanical linkages to fly the aircraft With the dawn of the Space Age in the 1960s the rise of more complex electronic control in aviation appeared with the ApolloSaturn projects NASA Dryden F-8 project and others As time progressed more integrated modular avionics (IMA) and digital fly by wire (DFBW) came into military and commercial aircraft to reduce weight increase functionality and provide redundancy With this advancement of complex flight control technology testing verification and validation became problematic To this day complex systems have been challenging to qualify for the Federal Aviation Administration (FAA) and military airworthiness commands13The US Army AED holds the responsibility for airworthiness approval This includes the design approval production approval and continued airworthiness In the design approval AED must determine if the system design or modification meets airworthiness standards In the milestone decisions it must determine whether the system is produced or modified in accordance with the approved design For continued airworthiness a judgment must be made for the system operated and maintained to keep it compliant with the approved design The bottom line is that the qualified system meets its performance requirements and is reliable when delivered1313Establishing approval for AWR has become increasingly difficult with the evolution of complex avionic control systems from federated architectures to IMA architectures that rely heavily on complex software systems (ie now identified as cyber-physical systems) As stated earlier AED has the mission to ensure airworthiness qualification for aircraft and subsystems used in the US Army aviation fleet including helicopters airplanes and Unmanned Aircraft Systems (UAS) 13

6

AED and Qualification

bull AEDrsquos mission is to ensure airworthiness qualification for aircraft and subsystems used in the US Army fleet

bull Airworthiness Qualification isndash Demonstration of an aircraft or aircraft subsystem or

component including modifications to function safely meeting performance specifications when used and maintained within prescribed limits (AR 70-62)

bull Traditionally qualified systems by ndash Similarity ndash Analysisndash Testndash Demonstrationndash Examination

Presenter
Presentation Notes
Airworthiness qualification is the demonstration of an aircraft or aircraft subsystem or component including modifications to function satisfactorily when used and maintained within prescribed limits [25] The US Army has traditionally qualified systems and components by physical testing analysis demonstration or by similarity

7

Evolution of Helicopter Systems

bull Past systems historically federated ndash Distributed Functionality ndash Hardware basedndash Simple easier to test

bull Now systems are becoming integratedndash Combined Functionality ndash More Software Intensivendash Complex more difficult to test

Chief Warrant Office Jim Beaty (back row far left) and crew of the of the Vietnam UH-1 Flying Bikinis (friend of Alex Boydston)

UH-1 Cockpit (US Army)

Chinook CAAS Cockpit (US Army)

CH-47 Chinook (US Army)

Presenter
Presentation Notes
The US Army has been involved in flight since the early Wright-B flight of 1909 Initially fixed wing and rotary wing aircraft in early aviation remained simple and federated The basics to aviate navigate and communicate were handled by dedicated gauges compasses gyroscopes and mechanical linkages to fly the aircraft With the dawn of the Space Age in the 1960s the rise of more complex electronic control in aviation appeared with the ApolloSaturn projects NASA Dryden F-8 project and others As time progressed more integrated modular avionics (IMA) and digital fly by wire (DFBW) came into military and commercial aircraft to reduce weight increase functionality and provide redundancy With this advancement of complex flight control technology testing verification and validation became problematic To this day complex systems have been challenging to qualify for the Federal Aviation Administration (FAA) and military airworthiness commands13

8

Present Approach to Testing

bull Several disciplines weigh in such as software avionics GNampCenvironmental E3 electrical human factors structures aerodynamics etc

bull Current test methodology per older federated systemsndash Hardware Mil-Std 810 Mil-Std 461ndash Requirements Analysis (Traceability)

bull Test at different levelsndash Individual software module testingndash Black box testingndash System Integration testing

bull Hot benchbull System Integration Lab (SIL)

ndash Aircraft level Testing bull Groundbull Flight Aviation Flight Test Directorate (AFTD) Testing (US Army Photo)

Aviation System Integration Facility (ASIF) (US Army Photo)

Presenter
Presentation Notes
Airworthiness qualification is the demonstration of an aircraft or aircraft subsystem or component including modifications to function satisfactorily when used and maintained within prescribed limits [25] The US Army has traditionally qualified systems and components by physical testing analysis demonstration or by similarity Historically most systems were federated They were hardware based simple and distributed Now they have become more integrated They are more software intensive complex and have combined functionality contained in one or more computers With this evolution from simple to more complex the Army is finding it more difficult to execute an AWR As systems evolve to more complex systems of systems this problem is only growing worse13The current test approach to achieving confidence in systems for an AWR for the US Army is based more on traditional federated avionics systems Experienced personnel in Software Vehicle Management Systems Avionic Communications Navigation amp Guidance Control Electrical Environmental Human Factors Electrical and Electro-magnetic Effects (E3) Structures Thermal Safety Integration Testing and Flight Test personnel and test pilots all play important roles in accomplishing test and review of new and existing systems While some may not consider areas such as thermal or EEE important to software development they are crucial since the software is running on physical systems that are affected by heat and susceptibility to electromagnetic radiation which can cause abnormal operation or bit errors Current test methodology for hardware relies on MIL-STD-810 MIL-STD-461 and requirements analysis such as traceability MIL-STD-810 is the Department of Defense Test Method Standard for Environmental Engineering Considerations and Laboratory Tests1313It focuses on equipment environmental design and test limits to the conditions that it will experience throughout its service life and establishing chamber test methods that replicate the effects of environments on the equipment rather than imitating the environments themselves MIL-STD-461 includes the ldquoRequirements for the Control of Electromagnetic Interference Characteristics of Subsystems and Equipmentrdquo These test standards are geared for hardware and physical effects and do not address software operation Current methods for software testing include testing the individual software modules black box (or interface) level system integration level and aircraft level testing System integration testing can include hot bench system integration labs (SILs) (such as the shown in Figure 4) and aircraft ground level testing as conducted by the Aviation Flight Test Directorate (AFTD) of Redstone Test Center (RTC) (see Figure 5) 13Neglecting complete and good requirements promotes risk It is common practice by Program Managers (PMs) to accept risks As issues are found in systems it is on AED to issue an Army Safety Action Memorandum (ASAM) that identifies a deficiency to the field pilots Concurrently an Airworthiness Impact Statement (AWIS) is issued to the PMs which contain a probabilistic analysis of how the identified shortcoming will affect risk The AMCOM Safety Office will produce its own calculations The PMs can either accept or reject this assessment Regardless cautions and warnings are placed in the AWR to keep the entire program and flight crew aware of issues13

9

Development Challenges

bull Legacy Aircraft often upgraded in a piecemeal fashionndash Makes certification difficultndash Desire to increase to modern requirements based on size of upgrade and

what it includes ndash hard to scope

bull New system requirements must be clear complete and testable ndash Certification requirements must be obvious

bull Orchestrating agreement between stakeholders is necessary to mitigatendash Juggling of multiple software buildsndash End system that is difficult to test certify and deployndash Escalating Costsndash System Safety from being poorly understoodndash Design iterations

Presenter
Presentation Notes
It would be wonderful if all systems were straightforward in design easily testable and simple to write an Airworthiness Release (AWR) for however that is not the case Legacy aircraft such as the Chinook have been upgraded in a piecemeal fashion acquiring much needed improvements in aviation navigation and communication The general recommended system development V-curve as shown earlier is not always followed in a strict sense although it should be the goal process Negligence to the proper process makes the establishment of certification very difficult For new system development and existing system upgrades requirements must be clear complete and testable The certification requirements must be made obvious in the development of the requirements establishment phase with the goal of being fully identified during the requirements development Orchestrating agreement among all stakeholders (eg the program manager systems engineers human factors engineers integrators test engineers manufacturers users and certifiers) is necessary to mitigate problems such as13juggling multiple software builds13producing a difficult-to-test difficult-to-certify and difficult-to-deploy systems13misunderstanding system safety and13requiring design iterations that impact schedules and costs

10

Complexity Issues

bull System Development costs and schedule increase with complexityndash Existing lack of schedule and funding resources

bull Keeps systems from achieving full compliance with specifications and requirements

bull Garbage in -gt Garbage OuthellipPoor requirements -gt Poor Systemndash Finding problems in new designs at PDR is too latendash Difficult to correct existing poorly designed fielded complex systems

bull Complexity amp reliability of complex systems is not fully understoodndash How do we accurately assess operating risk performance reliability of

complex systems based on limited testing and analysisndash How do we know when system design is good enoughndash Latent defects occur in supposedly well-tested mature systems

bull Avionics parts and software change constantlyndash Spiral development -gt new softwarehardware qualification required frequently ndash How do we streamline the process (partition the system) so the need for

complete re-qualification after changes is lessened

Presenter
Presentation Notes
In an old computer development lesson it is well known that if you put ldquogarbage in you will get garbage outrdquo Likewise in a system if you have poor requirements then you will end up with a poor system Finding problems in new designs at Preliminary Design Review (PDR) is too late It has been shown that discovering issues at this stage in the game will cause a reiteration on the design which costs time and money Furthermore dealing with existing fielded complex systems that have not gone through the rigors of proper systems engineering results quite often with an overwhelming mess of functionality to test verify validate qualify and certifydesigns at PDR is too late It has been shown that discovering issues at this stage in the life-cycle will cause a reiteration on the design which costs time and money Furthermore dealing with existing fielded complex systems that have not gone through the rigors of proper systems engineering results quite often with a complicated conglomeration of functionality to test verify validate qualify certify and maintain This is indicative where complexity has exceeded our understanding in how to certify a system We still do not fully understand complexity and how to address reliability of complex systems How do we accurately assess operating risks performance and reliability of complex systems based on limited testing and analysis How do we know when a system design is good enough How do we modularize spirally developed systems to minimize the need for re-qualification of unchanging portions of the system We are 30 plus years into this technology and we still deal with systems with latent defects that are occurring in supposedly well-tested and mature systems To further exacerbate the problem we are now dealing with complex system of systems (ie cyber-physical systems)13It is a given that you can keep on adding redundancy and complexity to a problem to attain a desired level of reliability but at some point in time the reliability will taper off At best we sometimes must satisfy for an optimum point before digressing in reliability In the same vein system development costs and schedule increase with complexity too (see Figure 7 and 8)13Avionics parts and software constantly change over the life of a program Typically a spiral development program occurs with complex software development which means that qualification is required frequently This begs the question of how to streamline the process so that the need to conduct a complete requalification is avoided13With these complex systems there are other hurdles to cross such as fully characterizing and conducting the Functional Hazard Assessments (FHAs) and Failure Mode Effects and Criticality Analyses (FMECAs) It is crucial for the safety assessment that these are conducted correctly to fully understand the risks and later perform the correct tests at the right level Additionally once the complex component hardware and software are integrated then yet other problems appear It is at that time that the disparity of teams crossing multiple contractors and development groups becomes obvious and if not properly coordinated could cause impacts to the schedule Other programmatic problems affect complex system development and qualification For instance lack of schedule and funding resources causes a shortcoming to adequately provide for the proper compliance with specification and requirements short-circuiting the systems engineering process An ever decreasing availability of trained engineers to support the 
development and test of such systems exists Non-technical political influences sometimes affect the reliability of complex systems and are difficult to avoid Lastly there is a lack of a centralized database that captures the various families of systems that have been built along with their characterization of success and failures Such a database from all past and present government complex systems could be valuable in establishing reliability basis for future models

11

Complexity Issues (continued)

bull Functional Hazard Assessments and related documentation are crucial

ndash Understanding risks ndash Performing the correct tests at the right level Lab test vs Flight Test

bull Saves flight time and money

bull Systems Integration for complex systems is a schedule driver

bull Need experienced personnel to work complex systems

bull Need a centralized database - just doesnrsquot existndash Determine data needed for quantifying reliability of complex systemsndash Capture the pertinent data on previous or existing complex systemsndash Understand successes and failures of previous and present complex systemsndash Establish baseline reliability figures for known architectures

bull Complex System of Systems exacerbates problem

12

Reliability vs Complexity ampCost vs Complexity

Notional Graphs

bull Reliability vs Complexity bull Cost amp Schedule vs Complexity

Rel

iabi

lity

Complexity Complexity

Cos

t amp S

ched

uleOptimum

Aggregation of part reliability

feeds into overall system reliability

Desired

Presenter
Presentation Notes
Avionics parts and software constantly change over the life of a program Typically a spiral development program occurs with complex software development which means that qualification is required frequently This begs the question of how to streamline the process so that the need to conduct a complete re-qualification is avoided13With these complex systems there are other hurdles to cross such as fully characterizing and conducting the Functional Hazard Assessments (FHAs) and Failure Mode Effects and Criticality Analyses (FMECAs) It is crucial for the safety assessment that these are conducted correctly to fully understand the risks and later perform the correct tests at the right level13Additionally once the complex component hardware and software are integrated then yet other problems crop up It is at that time that the disparity of teams crossing multiple contractors and development groups becomes obvious and if not properly coordinated could cause impacts to the schedule 13Other programmatic problems affect complex system development and qualification For instance lack of schedule and funding resources causes a shortcoming to adequately provide for the proper compliance with specification and requirements short-circuiting the systems engineering process An ever decreasing availability of trained engineers to support the development and test of such systems exists Non-technical political influences sometimes affect the reliability of complex systems and are difficult to avoid Lastly there is a lack of a centralized database that captures the various families of systems that have been built along with their characterization of success and failures Such a database from all past and present government complex systems could be valuable in establishing reliability basis for future models

13

A Few Examples of Complex Systems

bull This is not a new problem Other have struggled with the challenges of establishing confidence in complex systems

ndash NASAbull Apollo Guidance Computerbull Dryden F8 Crusaderbull Space Shuttlebull International Space Station

ndash Commercial Airlinersbull Airbus A320 and higherbull Boeing B777 B787

ndash Militarybull Ships and Submarinesbull Jets (F14F15 F16 F18 F22 F35 etc)bull Cargo Planes (C130J Hercules C17 Globemaster etc)bull Helicopters (Chinook Blackhawk Sea Stallion etc)bull Rocketsbull Unmanned Aerial Systemsbull Unmanned Ground Systemsbull Unmanned Submarine Systems Photos by US Army NASA US Navy and US Air Force

Presenter
Presentation Notes
As mentioned earlier complex avionics systems are not a new idea Since the early 1960rsquos complex avionic architectures have existed beginning with the ApolloSaturn program Massachusetts Institute of Technology (MIT) Instrumentation Lab (IL) which is now Draper Laboratory and International Business Machines (IBM) led the way with the MITIL Apollo Guidance Computer (AGC) and the Saturn V IBM triple modular redundant (TMR) voting guidance computer system The word software was not even coined at the time but engineers such as Margaret Hamilton MITIL Director of Apollo On-board Software can attest to the fact that some the same issues with creating reliable software then still exists today [5] A large majority of the issues then dealt with the communication between systems engineers and the programmers Requirements were thrown over the wall without the confirmation that the requirements were complete and a lot of the issues cropped up as interface problems Identifying these issues prompted Hamilton to create her own company and create a modeling language called Universal Systems Language (USL) to head off the problems experienced with Apollo [11] Some 200 plus modeling programs have been developed since Apollo and used to mitigate issues and increase confidence in systems of varying complexity13As time progressed other systems came along The NASA Dryden F8 Crusader was the first digital fly by wire (DFBW) jet aircraft that relied heavily on complex IMA and software for flight control The Space Transportation System (STS) shuttle includes a Quad Modular Redundant (QMR) system with a fifth backup flight computer containing uncommon code US Air Force and Naval airplanes that have possessed complex or redundant IMA configurations include the F14 Tomcat F15 Eagle F16 Falcon F18 Hornet F22 Raptor F35 Joint Strike Fighter F117 Nighthawk V22 Osprey C17 Globemaster and many more along with recent Unmanned Air Vehicle Systems (UAVS) The US Army complex systems on helicopters include the13RAH-66 Comanche DFBW Triple Modular Redundant (TMR) architecture13glass cockpit avionics on the UH-60M Blackhawk baseline 13Common Avionics Architecture System (CAAS) glass cockpit on the UH-60M Blackhawk modernization and CH-47F Chinooks13and other aircraft13Additionally there are many self-checking pair engine controller systems along with system of system Future Combat Systems (FCS) and Unmanned Air Vehicle Systems (UAVS) This has also permeated the commercial airliner market with the Airbus 320 and higher Airbus models Boeing 777 and Boeing 787 aircraft With this ever increasing technology something must be done about the reliability issue With such a wealth of data on aviation and non-aviation cyber-physical systems such as submarine ship nuclear medical locomotive and automotive systems there should be adequate information to get a start on modeling systems correctly for reliability Therefore this is not an isolated problem to avionics and other disciplines should aide in resolving this problem13

14

Some Complex System Failures

bull V-22 Osprey crashesbull Mars Climate Orbiter crashbull Mars Pathfinder software resetbull USS Vincennes downing an Airbus 320bull Therac-25 software radiation treatment

failurebull 1989 Airbus A320 air show crashbull China Airlines Airbus Industries A300

crashbull Ariane 5 satellite launcher malfunctionbull Failure of the primary flight system to

sync with the backup during prelaunch of STS-1

bull Mexicana Airlines Boeing 727 airliner crashed into a mountain due to the software not correctly determining the mountain position

bull Loss of the first American probe to Venus

bull Korean Airlines KAL 901 accidentbull Soviet Phobos I Mars probe lost

bull Three Mile Islandbull F-18 fighter plane crash due to bad

exceptionbull F-14 fighter plane lost to

uncontrollable spinbull Swedish Gripen prototype crashedbull Swedish Gripen air-show crashbull F-22 failure crossing the IDLbull 2006 German-Spanish Barracuda UAVbull 2004 FA-22 Raptor stealth fighter jet

crash bull FA-22 Raptor navigation system

software error at Nellis AFBbull 50 cockpit blackouts on A320bull A320 multiple avionics and electrical

failures at Newark NJbull Boeing 777 Malaysian Airlines jetlinerrsquos

nightmarish autopilot rollercoaster ridebull 3000 feet US Army and Air Force UAV

Crashes

bull hellip And Many Morehellip

Presenter
Presentation Notes
Multiple crashes have occurred with the V-22 Osprey [41].
In 1999 the Mars Climate Orbiter crashed because of incorrect units in a program, caused by poor systems engineering practices [42, 44].
In 1988 an Iran Air Airbus A300 was shot down by the USS Vincennes because of cryptic and misleading output displayed by the tracking software [3].
In 1988 an Airbus A320 crashed at an air show due to altitude indication and software handling [3].
In 1994 a China Airlines Airbus Industrie A300 crashed, killing 264, from faulty software [3].
In 1996 the first Ariane 5 satellite launcher was destroyed by a faulty software design: a few lines of Ada code contained unprotected variables. The horizontal velocity of the Ariane 5 exceeded that of the Ariane 4, causing the guidance system to veer the rocket off course. Insufficient testing did not catch this error, which was a carry-over from Ariane 4 [3, 39].
In 1986 a Mexicana Airlines Boeing 727 airliner crashed into a mountain because the software did not correctly determine the mountain's position [39].
In 1986 the Therac-25 radiation therapy machines overdosed cancer patients due to a flaw in the computer program controlling the highly automated devices [3, 39, 45].
During the maiden Space Shuttle launch attempt in 1981 (STS-1, Columbia), the primary flight control computer system failed to establish sync with the backup during prelaunch [43].
On December 10, 1990, the Space Shuttle Columbia had to land early due to computer software problems [39].
In 1997 the Mars Pathfinder suffered repeated software resets due to latent task execution caused by priority inversion on a mutex [3, 44].
An error in a single FORTRAN statement resulted in the loss of the first American probe to Venus. From G. J. Myers, Software Reliability: Principles & Practice, p. 25 [3].
In August 1997 the Korean Air Flight 801 accident in Guam killed 225 of the 254 aboard; a worldwide bug was discovered in the barometric altimetry of the Ground Proximity Warning System (GPWS). From ACM SIGSOFT Software Engineering Notes, vol. 23, no. 1 [3].
The Soviet Phobos I Mars probe was lost due to a faulty software update, at a cost of 300 million rubles. Its disorientation broke the radio link, and the solar batteries discharged before reacquisition. From Aviation Week, 13 Feb 1989 [3].
An F-18 fighter plane crashed due to a missing exception condition. From ACM SIGSOFT Software Engineering Notes, vol. 6, no. 2 [3].
An F-14 fighter plane was lost to an uncontrollable spin traced to tactical software. From ACM SIGSOFT Software Engineering Notes, vol. 9, no. 5 [3].
In 1989 a Swedish Gripen prototype crashed due to software in its digital fly-by-wire system [3, 46].
In 1995 another Gripen fighter plane crashed during an air show, caused by a software issue [3, 46].
On February 11, 2007, twelve F-22 Raptors were forced to head back to Hawaii when a software bug caused a computer crash as they were crossing the International Date Line. In the middle of the ocean, their systems, comprising navigation, fuel, and part of the communications systems, failed, and all attempts to reboot failed [47].
In February 2006 the German-Spanish unmanned combat air vehicle Barracuda crashed due to software failure [4].
In December 2004 a glitch in the flight control software probably caused an F/A-22 Raptor stealth fighter jet to crash on takeoff at Nellis Air Force Base [4].
In 2008 a United Airbus A320, registration N462UA, experienced multiple avionics and electrical failures, including loss of all communications, shortly after rotation while departing Newark Liberty International Airport in Newark, New Jersey [NTSB Report Identification DCA08IA033].
In 2006 a Boeing 777 Malaysian Airlines jetliner's autopilot caused a near-stall by climbing 3,000 feet. The pilots struggled to nose the plane down, and it plunged into a steep dive; after pulling back up, the pilots regained control. The cause was defective flight software providing incorrect data for airspeed and acceleration, confusing the flight computers and initially ignoring the pilot's commands [49].
US Army and Air Force UAVs have crashed from control system or human error [48].

15

Lessons Learned from Failures

• From Nancy Leveson's paper "The Role of Software in Spacecraft Accidents":
  – "Flaws in the safety culture, diffusion of responsibility and authority
  – Limited communication channels and poor information flow
  – Inadequate system and software engineering
  – Poor or missing specifications
  – Unnecessary complexity and software functionality
  – Software reuse or changes without appropriate safety analysis
  – [Shortcomings] in safety engineering practices
  – Flaws in test and simulation environments
  – Inadequate human factors design for software"

Presenter
Presentation Notes
In Dr. Nancy Leveson's paper [36], "The Role of Software in Spacecraft Accidents", she cited problems with software development within NASA on various projects. According to Dr. Leveson, there were "flaws in the safety culture, diffusion of responsibility and authority, limited communication channels and poor information flow, inadequate system and software engineering, poor or missing specifications, unnecessary complexity and software functionality, software reuse or changes without appropriate safety analysis, violation of basic safety engineering practices, inadequate system safety engineering, flaws in test and simulation environments, and inadequate human factors design for software." While these problems were identified, and corrected, for spacecraft development within NASA, aviation in general could learn from these lessons to mitigate issues with complex systems development.

16

Some Current Guidelines

• DO-178B – Software Considerations in Airborne Systems and Equipment Certification
• DO-248B – Final Report for the Clarification of DO-178B
• DO-278 – Guidelines for Communications, Navigation, Surveillance and Air Traffic Management (CNS/ATM) Systems Software Integrity Assurance
• DO-254 – Design Assurance Guidance for Airborne Electronic Hardware
• DO-297 – Integrated Modular Avionics (IMA) Development Guidance and Certification Considerations
• SAE-ARP4754 – Certification Considerations for Highly Integrated or Complex Aircraft Systems
• SAE-ARP4761 – Guidelines and Methods for Conducting the Safety Assessment Process on Airborne Systems and Equipment
• FAA Advisory Circular AC 27-1B – Certification of Normal Category Rotorcraft
• FAA Advisory Circular AC 29-2C – Certification of Transport Category Rotorcraft
• ISO/IEC 12207 – Software Life Cycle Processes
• ARINC 653 – Specification Standard for Time and Space Partitioning
• MIL-STD-882D – DoD System Safety
• ADS-51-HDBK – Rotorcraft and Aircraft Qualification Handbook
• AR 70-62 – Airworthiness Release Standard
• SED-SES-PMHFSA 001 – Software Engineering Directorate (SED) Software Engineering Evaluation System (SEES) Program Manager Handbook for Flight Software Airworthiness
• SED-SES-PMHSS 001 – SED SEES Program Manager Handbook for Software Safety

WHAT'S MISSING – a Reliability Standard for Complex Systems

Presenter
Presentation Notes
The previously stated problems drove the development of these guidelines; however, there is no standard for system reliability that includes software. Other standards and circulars pertain to complex systems, but a reliability standard for complex systems is missing, one that would outline the process for establishing cyber-physical system reliability. This standard should indicate how to model, analyze, and ascertain the projected level of reliability.

17

Certification Assessment Considerations

• Sufficient data and time must be available for airworthiness evaluation
• Certification process
  – Currently lengthy
  – Depends heavily on human interpretation, trade-offs, and risk mitigation
  – Overwhelming for complex integrated systems (FHAs, FTAs, FMECAs, risk mitigation, etc.)
• A consistent industry-wide method to assess a system at any stage of the life-cycle to allow a tradeoff of design alternatives
• Certification tasks outlined in DO-297 should be considered:
  – Task 1: Module acceptance
  – Task 2: Application software/hardware acceptance
  – Task 3: IMA system acceptance
  – Task 4: Aircraft integration of IMA system, including V&V
  – Task 5: Change of modules or applications
  – Task 6: Reuse of modules or applications

Presenter
Presentation Notes
In order to execute an AWR, sufficient data must be provided to the appropriate airworthiness authorities to demonstrate that safety requirements are met to the levels per the System Safety Assessment (SSA) or equivalent. The certification process is currently lengthy and depends on much human interpretation of the myriad of complex architecture functions.
The current guidelines, such as DO-178B, DO-254, DO-297, SAE-ARP4754, and SAE-ARP4761, along with many other guidelines, outline the proper steps that should be taken. System safety management's military standard is MIL-STD-882, which has been in use for decades. Civilian safety standards for the aviation industry include SAE ARP4754, which shows the incorporation of system safety activities into the design process and provides guidance on techniques to ensure a safe design. SAE ARP4761 contains significant guidance on how to perform the system safety activities described in SAE ARP4754. DO-178B outlines development assurance guidance for aviation software based on the failure condition categorization of the functionality provided by the software. DO-254 embodies similar guidance for aviation hardware. ARINC 653 is a widely accepted standard to ensure time and space partitioning for software. DO-297 does an excellent job of describing the certification tasks for an IMA system, which include:
Task 1: Module acceptance
Task 2: Application software/hardware acceptance
Task 3: IMA system acceptance
Task 4: Aircraft integration of IMA systems, including verification and validation
Task 5: Change of modules or applications
Task 6: Reuse of modules or applications
Taken together, these standards provide guidance that, if followed, will likely result in safe, highly reliable, and cost-effective systems over the life-cycle of the system. Yet while these guidelines exist, there is not a consistent industry-wide method to assess a system at any stage of the life-cycle to allow a tradeoff of design alternatives. Also, there is no standard outlining overall reliability for a system, including both hardware and software reliability. To achieve this level of reliability, a standard should be developed to define the process and method for arriving at a quantifiable reliability number that would in turn lead to acceptance.

18

Definition of Complexity and Reliability is Needed

[Figure: components (Complexity Fundamentals, Reliability Parametrics) at TRL 3 or 4 are integrated into subsystems (System Integration of Components, Reliability Dependencies) at TRL 6 or 7, which are integrated into the realized system (Reliability Sensitivities) at TRL 8 or 9, producing a highly reliable complex system and a certificate (e.g., AWR).]

Presenter
Presentation Notes
As mentioned in the introduction, the overall goal of this paper is to state the need for an industry standard method for qualifying complex integrated systems to a specified reliability. To achieve this, systems must be broken down to component levels and built up to subsystem and system levels, and an overall aggregated system reliability value should result. The goal should be to establish the ability to assess reliability at the component, subsystem, and then system level, with each phase working toward a higher Technical Readiness Level (TRL). The end result would be fed into the accepted Type Certificate (TC) and/or AWR.
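To illustrate the bottom-up aggregation in the figure, here is a minimal sketch; the component values and the series/parallel composition rules are illustrative assumptions (independent failures over a fixed mission time), not figures from this briefing:

    from functools import reduce

    def series(rels):
        # Series composition: the assembly works only if every element works.
        # Assumes statistically independent failures.
        return reduce(lambda a, b: a * b, rels, 1.0)

    def parallel(rels):
        # Parallel (redundant) composition: the assembly fails only if all elements fail.
        return 1.0 - reduce(lambda a, b: a * (1.0 - b), rels, 1.0)

    # hypothetical component reliabilities over one mission
    subsystem_1 = series([0.9995, 0.9990])                     # components 1 and 2
    subsystem_2 = series([0.9992, parallel([0.995, 0.995])])   # component 3 plus a redundant pair
    system = series([subsystem_1, subsystem_2])                # realized system
    print(f"subsystem 1: {subsystem_1:.6f}")
    print(f"subsystem 2: {subsystem_2:.6f}")
    print(f"aggregated system reliability: {system:.6f}")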

19

Analytical Models and Reliability

• Analytical models of hardware reliability are well understood
• Architecture modeling and software reliability modeling is not a novel idea, but it is highly debated
  – There are many approaches and little consensus as to the best way
  – Many models (Jelinski-Moranda, Littlewood-Verrall, Musa-Okumoto, etc.) [1]
  – Many tools (over 200 tools have been built since the 1970s) [2]
• Predictability of software reliability is of great concern because software is a major contributor to unreliability [2]
• Software reliability is the probability of error-free operation of a computer program in a specified environment for a specified time [1]
• Need a basis for setting reliability figures based on previous systems, and iteratively refine those figures in the future
• NOT A REPLACEMENT FOR TESTING AND VERIFICATION
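To make one of the cited models concrete, here is a minimal sketch of the Musa-Okumoto logarithmic Poisson execution-time model; the parameter values are illustrative assumptions, not figures from this briefing:

    import math

    def mean_failures(t, lam0, theta):
        # Musa-Okumoto mean-value function: expected failures experienced by
        # execution time t, mu(t) = (1/theta) * ln(lam0 * theta * t + 1)
        return math.log(lam0 * theta * t + 1.0) / theta

    def reliability(t, dt, lam0, theta):
        # Probability of failure-free operation over (t, t + dt], per the
        # nonhomogeneous Poisson process assumption behind the model
        return math.exp(-(mean_failures(t + dt, lam0, theta) - mean_failures(t, lam0, theta)))

    lam0, theta = 0.05, 0.02   # assumed initial failure intensity and decay parameter
    t_tested = 1000.0          # hours of execution already observed
    print(f"expected failures so far: {mean_failures(t_tested, lam0, theta):.1f}")
    print(f"reliability over the next 10 hours: {reliability(t_tested, 10.0, lam0, theta):.4f}")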

Presenter
Presentation Notes
As mentioned in the introduction, the overall goal of this paper is to state the need for an industry standard method for qualifying complex integrated systems to a specified reliability. While not suggesting a total solution, better modeling practices should be considered as a step toward bridging the gap between design, test, and implementation. A method to model the architecture early in the requirements establishment phase, follow the detailed design and coding, and then verify the system along with the model may be a path to greater confidence in the system and may reduce the risks, warnings, and cautions that must be issued.
To achieve this, systems must be broken down to component levels and built up to subsystem and system levels. An overall aggregated system reliability value should result (see Figure 9). The goal should be to establish the ability to assess reliability at the component, subsystem, and then system level, with each phase working toward a higher Technical Readiness Level (TRL). The end result would be fed into the accepted Type Certificate (TC) or AWR.
To achieve this goal, modeling and analysis tools that follow a standard for modeling reliability should exist. As previously stated, over 200 tools have been created since the 1970s. Here is a list of a few of the current tools:
Universal Systems Language (USL)
Unified Modeling Language (UML)
Systems Modeling Language (SysML)
MATLAB/Simulink
Telelogic Rhapsody
MathCAD
Colored Petri Nets
Rate Monotonic Analysis (RMA)
STATEMATE (used by Airbus)
Standard for the Development of Safety-Critical Embedded Software (SCADE)
OPNET
Embedded System Modeling Language (ESML)
Component Synthesis using Model-Integrated Computing (CoSMIC)
Architecture Analysis and Design Language (AADL)
By no means is this list complete. Typically, different companies and projects address this challenge by choosing their own tools to perform the upfront analysis and modeling, not following a standard approach. Multiple tools need to converge or be compatible with the framework of a common tool, such as AADL. The resultant tool(s) could be used to verify the requirements up front to mitigate reliability issues down the life-cycle.

20

Tools for Modeling and Analysis

• Universal Systems Language (USL)
• Unified Modeling Language (UML)
• Systems Modeling Language (SysML)
• MATLAB/Simulink
• Telelogic Rhapsody
• MathCAD
• Colored Petri Nets
• Rate Monotonic Analysis (RMA)
• STATEMATE (used by Airbus)
• SCADE
• OPNET
• Embedded System Modeling Language (ESML)
• Component Synthesis using Model-Integrated Computing (CoSMIC)
• Architecture Analysis and Design Language (AADL)
• At least 200 more packages since the 1970s
• Certified tools need to converge to an accepted standard modeling/analysis method for complex system reliability

Presenter
Presentation Notes
Typically, different companies and projects address this challenge by choosing their own tools to perform the upfront analysis and modeling, not following a guideline approach. Multiple tools need to converge or be compatible with a set modeling standard for complex systems. The compliant tools could be used to verify the requirements up front to mitigate reliability issues down the life-cycle. A notional approach would follow that shown in Figure 8, where the system V is followed but architectural modeling and analysis go parallel with the real development effort. This would allow reliability to be allocated during the design phase and measured during the implementation, test, and verification phase using the model. The downside to modeling, in certain circles, is getting people to believe those models. How do you certify a modeling tool, and the actual models within the tools? Those issues should be addressed going forward.

21

Modification to Acquisition Model

[Figure: the development "V" – Requirements Establishment, High Level Design, Detailed Specifications, Implementation/Coding, Verification, Development Testing, Operational Testing & Validation, Deployed System – with an Architectural Model & Analysis activity running in parallel; reliability is allocated during design and measured during implementation and test.]

Proposed: a standard modeling methodology to be applied at different phases of development to enhance requirements development, reliability allocation, reliability measurement, and testing (DISCLAIMER: DOES NOT REPLACE TESTING).

Presenter
Presentation Notes
A notional approach would follow that shown in Figure 10, where the system V is followed but architectural modeling and analysis go parallel with the real development effort. This would allow reliability to be allocated during the design phase and measured during the implementation, test, and verification phase using the model. This approach could bridge the design and test phases together. It is emphasized here that the architectural model would not replace critical testing but would augment the process to allow for better requirement identification and verification. Thorough ground and flight tests should never be replaced by modeling; modeling would only allow for more robust requirements and a higher level of confidence in the requirements and design. The model could be used in conjunction with the testing to confirm the design. Proper modeling and analysis would reduce total program costs by enforcing more complete and correct initial requirements, reducing the issues discovered later in testing that are expensive or impossible to fix and that force the acceptance of high risks. Additionally, if the model is maintained and optimized, it could possibly be used after system deployment to analyze the impacts of upgrades or changes to the system, allowing for more complete analysis and reducing overall system redesign costs.

A hurdle to cross with modeling and analysis is convincing people to believe those models. Some method to certify these models and modeling tools should be addressed in the future. Standards should be set in place for correct modeling techniques for complex systems. Lastly, consideration should be given to standard verification checking tools, such as the Motor Industry Software Reliability Association (MISRA) compliance verification tools for the use of C in safety-critical systems.
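As a concrete illustration of "reliability allocated" versus "reliability measured", the sketch below allocates a system-level reliability target across subsystems during design and later compares it against a rolled-up measured estimate. The values are hypothetical, and equal apportionment is just one simple allocation scheme, not one prescribed by this briefing:

    def allocate_equal(system_target, n_subsystems):
        # Equal apportionment: each of n series subsystems receives the same
        # target R_i such that R_i ** n == system_target
        return system_target ** (1.0 / n_subsystems)

    def measured_system(subsystem_estimates):
        # Roll measured subsystem estimates back up (series, independence assumed)
        product = 1.0
        for r in subsystem_estimates:
            product *= r
        return product

    target = 0.999   # hypothetical system reliability requirement
    n = 4
    per_subsystem = allocate_equal(target, n)
    print(f"allocated per-subsystem target: {per_subsystem:.6f}")

    # later, during test: measured subsystem estimates (hypothetical)
    estimates = [0.99980, 0.99991, 0.99975, 0.99988]
    achieved = measured_system(estimates)
    print(f"measured system reliability: {achieved:.6f} "
          f"({'meets' if achieved >= target else 'misses'} the target)")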

22

Systems Reliability Standard Establishment

• Establish a working group to define this standard
  – Need a technical society to lead the charge on this
• Collaborate with industry, academia, military, and societies
  – Focus on development of a reliability standard with AWR safety in mind
  – Draw upon their experiences to feed into this standard
• Study existing and previous complex systems
  – Shuttle, Space Station, missile systems, nuclear submarine and ship systems, nuclear control systems, military and commercial jet systems
  – Obtain software reliability information from existing and previous systems
  – Build a database to serve as a basis for future reliability estimates
• Research prior efforts in complex systems analysis
• Establish a consensus-based modeling and analysis method

Presenter
Presentation Notes
In conclusion, methods for achieving a design for complex systems do exist; however, a means of achieving reliability and attaining a level of qualification that would permit better AWRs does not. A standard needs to be developed to tackle this issue, rather than relying on current methods to ascertain airworthiness of complex systems. This will not occur overnight. An orchestrated collaboration among industry, academia, military labs, and technical professional societies, focused on the development of this standard, should allow us to draw upon their experiences to feed into this reliability standard with AWR safety in mind. We have a long-running experiment with complex software systems on the Space Transportation System (STS), International Space Station (ISS), missile systems, nuclear submarine and ship systems, nuclear control systems, and military and commercial jet systems, from which we should be able to obtain at least inferred software reliability information from the architectures and run times that these systems use. We should look at the lessons learned from these systems to see what could have been improved and what was done right that should be carried forward. The challenge is collecting this information into a central database and arriving at some figure of reliability from previous systems where data exists. This would at least provide a starting point for initial assessments and could be refined in the future. Also, this is not the only study for establishing reliability metrics for complex software systems; similar research projects have risen and fallen, and the data from those projects should not be wasted but studied to feed into whatever standard is developed. While historical information would be useful, each design is unique and requires tools to accomplish the design. Architectural modeling constructs should be further investigated as a possible augmentation to the design and test process. We need to determine which forum is best to conduct this effort (e.g., SAE, IEEE, AIAA, ACM, AHS, INCOSE, or other). As stated in the paper "Space Shuttle Avionics Systems" [31], "The designers, the flight crew, and other operational users of the system often have a mindset established in a previous program or experience, which results in a bias against new or different 'unconventional' approaches." If nothing is done to address this problem, it will only get worse over time. It is past time to address the issue of reliability of complex systems and software.

23

BACKUP SLIDES

24

PEO Aviation System Safety Management Decision Authority Matrix

Severity (most credible): Catastrophic (1), Critical (2), Marginal (3), Negligible (4)

Probability (per flight hour): Frequent (A) P > 1E-3; Probable (B) 1E-4 < P <= 1E-3; Occasional (C) 1E-5 < P <= 1E-4; Remote (D) 1E-6 < P <= 1E-5; Improbable (E) 1E-7 < P <= 1E-6

Decision authority levels: Army Acquisition, PEO Aviation, Program Management

[Matrix: each severity/probability combination is assigned to one of the three decision authority levels.]

Hazard Category – Description:
1 – Catastrophic: Death or permanent total disability; system loss
2 – Critical: Severe injury or minor occupational illness (no permanent effect); minor system or environmental damage
3 – Marginal: Minor injury or minor occupational illness (no permanent effect); minor system or environmental damage
4 – Negligible: Less than minor injury or occupational illness (no lost workdays), or less than minor environmental damage

Risk Level – Description – Probability (frequency per 100,000 flight hours):
A – Frequent: > 100 (P > 1E-3)
B – Probable: <= 100 and > 10 (1E-4 < P <= 1E-3)
C – Occasional: <= 10 and > 1 (1E-5 < P <= 1E-4)
D – Remote: <= 1 and > 0.1 (1E-6 < P <= 1E-5)
E – Improbable: <= 0.1 and > 0.01 (1E-7 < P <= 1E-6)
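As a small illustration of how the probability bands above classify a predicted failure probability, the sketch below encodes the slide's frequency bands; the function name and example value are hypothetical:

    def risk_level(p_per_flight_hour):
        # Map a per-flight-hour failure probability to the matrix's
        # frequency letter, using the band edges from the slide
        bands = [
            (1e-3, "A (Frequent)"),     # P > 1E-3
            (1e-4, "B (Probable)"),     # 1E-4 < P <= 1E-3
            (1e-5, "C (Occasional)"),   # 1E-5 < P <= 1E-4
            (1e-6, "D (Remote)"),       # 1E-6 < P <= 1E-5
            (1e-7, "E (Improbable)"),   # 1E-7 < P <= 1E-6
        ]
        for lower, letter in bands:
            if p_per_flight_hour > lower:
                return letter
        return "below E (P <= 1E-7)"

    # example: a hazard predicted at 3 failures per 100,000 flight hours
    print(risk_level(3e-5))   # -> C (Occasional)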

Presenter
Presentation Notes
As already mentioned, the goal of the US Army Aviation Engineering Directorate is that the system is safe and reliable to operate and will perform the mission when delivered. Another goal is that the released system continues to safely perform the mission if maintained and operated per the operator's manual. Replacement parts and overhaul work must be of high quality to support continued airworthiness. Per the Program Executive Office Memorandum 08-03 risk matrix, US Army flight control systems are to achieve 1E-9 reliability for flight-critical functions per civil airspace regulations [35] and 1E-6 reliability for tactical use [36]. Quantifying these numbers is established for component hardware but not for software. Just as hardware should have quantifiable reliability, so should software.

25

Reliability Defined

• Software Reliability – the probability that a given piece of software will execute without failure in a given environment for a given time [40]
  – Often debated as to how to measure
• Hardware Reliability – the probability that a hardware component operates without failure over time
  – Well defined and established
• System Reliability – the probability of success, or the probability that the system will perform its intended function under specified design limits [over a given amount of time] [39]
  – A combination of software and hardware reliability
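Since system reliability combines the hardware and software figures, here is a minimal sketch of one common combination, assuming independent, exponentially distributed failures with illustrative (assumed) failure rates:

    import math

    # illustrative constant failure rates (failures per hour); assumed values
    lam_hw = 1e-6   # hardware
    lam_sw = 5e-6   # software, often the larger contributor

    def r_exp(lam, t_hours):
        # Exponential reliability: probability of no failure within t_hours
        return math.exp(-lam * t_hours)

    t = 4.0  # mission time in hours
    r_system = r_exp(lam_hw, t) * r_exp(lam_sw, t)  # independent series combination
    print(f"R_hw={r_exp(lam_hw, t):.8f}, R_sw={r_exp(lam_sw, t):.8f}, "
          f"R_system={r_system:.8f}")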

Presenter
Presentation Notes
"Reliability is defined as the probability of success, or the probability that the system will perform its intended function under specified design limits." [39] "Software reliability is defined as the probability that a given piece of software will execute without failure in a given environment for a given time." [40] Systems rely on both, and thus must combine the two to formulate an overall reliability.

26

Hardware vs Software Reliability

Hardware: Failure rate has a bathtub curve; the burn-in state is similar to the software debugging state.
Software: Without considering program evolution, failure rate is statistically non-increasing.

Hardware: Material deterioration can cause failures even though the system is not used.
Software: Failures never occur if the software is not used.

Hardware: Failure data are fitted to some distributions; the selection of the underlying distribution is based on the analysis of failure data and experience. Emphasis is placed on analyzing failure data.
Software: Most models are analytically derived from assumptions; emphasis is on developing the model, the interpretation of the model assumptions, and the physical meaning of the parameters.

Hardware: Failures are caused by material deterioration, design errors, misuse, and environment.
Software: Failures are caused by incorrect logic, incorrect statements, or incorrect input data.

Hardware: Can be improved by better design, better material, applying redundancy, and accelerated life-cycle testing.
Software: Can be improved by increasing testing effort and correcting discovered faults; reliability tends to change continuously during testing due to the addition of problems in new code or the removal of problems by debugging.

Hardware: Repairs restore the original condition.
Software: Repairs establish a new piece of software.

Hardware: Failures are usually preceded by warnings.
Software: Failures are rarely preceded by warnings.

Hardware: Components can be standardized.
Software: Components have rarely been standardized.

Hardware: Can usually be tested exhaustively.
Software: Essentially requires infinite testing for completeness.

Reference: [39] Hoang Pham, "Software Reliability", Springer, 2000.
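For the row on fitting failure data to distributions, here is a minimal sketch of the simplest such fit, a maximum-likelihood estimate of a constant failure rate; the times-to-failure are invented for illustration:

    import math

    # invented times-to-failure, in hours, for identical hardware units
    times_to_failure = [1200.0, 950.0, 1430.0, 1100.0, 870.0]

    # MLE for the exponential distribution: lambda_hat = n / sum(t_i)
    lam_hat = len(times_to_failure) / sum(times_to_failure)
    mtbf = 1.0 / lam_hat

    def reliability(t):
        # Fitted reliability: probability a unit survives past t hours
        return math.exp(-lam_hat * t)

    print(f"estimated failure rate: {lam_hat:.6f} per hour (MTBF {mtbf:.0f} h)")
    print(f"P(survive 500 h) = {reliability(500.0):.3f}")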

Presenter
Presentation Notes
"Reliability is defined as the probability of success, or the probability that the system will perform its intended function under specified design limits." [39] "Software reliability is defined as the probability that a given piece of software will execute without failure in a given environment for a given time." [40] Hoang Pham compares software versus hardware reliability as shown in Table 2 [39]. Systems rely on both, and thus must combine the two to formulate an overall reliability.

27

Acronym List

ACRONYM – DEFINITION
AADL – Architecture Analysis and Design Language
AC – Advisory Circular (FAA)
ACM – Association for Computing Machinery
AED – Aviation Engineering Directorate (AMRDEC)
AFTD – Aviation Flight Test Directorate (US Army)
AGC – Apollo Guidance Computer
AHS – American Helicopter Society
AIAA – American Institute of Aeronautics and Astronautics (Inc.)
AMCOM – Aviation and Missile Command (US Army)
AMRDEC – Aviation and Missile Research, Development and Engineering Center (US Army)
AR – Army Regulation
ARINC – Aeronautical Radio, Inc.
ARP – Aerospace Recommended Practice
ASIF – Avionics Software Integration Facility
ATAM – Architecture Tradeoff Analysis Method
ATM – Air Traffic Management
AWR – Airworthiness Release
CAAS – Common Avionics Architecture System
CH-47 – Cargo Helicopter, Chinook
CMM – Capability Maturity Model
CMMI – Capability Maturity Model Integration
CMU – Carnegie Mellon University
CNS – Communications, Navigation, Surveillance
CoSMIC – Component Synthesis using Model-Integrated Computing
CPS – Cyber-Physical System
CRC – Chemical Rubber Company (i.e., CRC Press)
DFBW – Digital Fly-By-Wire
DoD – Department of Defense
E3 – Electrical and Electromagnetic Effects
ESML – Embedded System Modeling Language
FAA – Federal Aviation Administration
FCS – Future Combat Systems
FHA – Functional Hazard Assessment
FMEA – Failure Modes Effects Analysis

28

Acronym List (concluded)

ACRONYM – DEFINITION
GPWS – Ground Proximity Warning System
IBM – International Business Machines
IEC – International Electrotechnical Commission
IL – Instrumentation Lab (now Draper Laboratory)
IMA – Integrated Modular Avionics
INCOSE – International Council on Systems Engineering
ISO – International Organization for Standardization
ISS – International Space Station
KAL – Korean Airlines
MISRA – Motor Industry Software Reliability Association
MIT – Massachusetts Institute of Technology
NASA – National Aeronautics and Space Administration (USA)
PDR – Preliminary Design Review
PEO – Program Executive Office
PNAS – Proceedings of the National Academy of Sciences
RAQ – Rotorcraft and Aircraft Qualification
RMA – Rate Monotonic Analysis
RTC – Redstone Test Center (US Army)
RTTC – Redstone Technical Test Center (US Army)
RTCA – Radio Technical Commission for Aeronautics
SAE – Society of Automotive Engineers
SED – Software Engineering Directorate (AMRDEC)
SEES – Software Engineering Evaluation System
SEI – Software Engineering Institute (CMU)
SIL – System Integration Laboratory
SSA – System Safety Assessment
STS – Space Transportation System
SysML – Systems Modeling Language
TMR – Triple Modular Redundant
TRL – Technical Readiness Level
UAS – Unmanned Aircraft System
UH-60 – Utility Helicopter, Blackhawk
UML – Unified Modeling Language
US – United States
USL – Universal Systems Language

29

References

• [1] Israel Koren and Mani Krishna, "Fault-Tolerant Systems", Morgan Kaufmann, 2007.
• [2] Jiantao Pan, "Software Reliability", Carnegie Mellon University, Spring 1999.
• [3] Nachum Dershowitz, http://www.cs.tau.ac.il/~nachumd/horror.html
• [4] http://www.air-attack.com/
• [5] David A. Mindell, "Digital Apollo: Human and Machine in Spaceflight", The MIT Press, 2008.
• [6] Software Engineering Directorate, Software Engineering Evaluation System (SEES), "Program Manager Handbook for Flight Software Airworthiness", SED-SES-PMHFSA 001, December 2003.
• [7] Software Engineering Directorate, Software Engineering Evaluation System (SEES), "Program Manager Handbook for Software Safety", SED-SES-PMHSSA 001, February 2006.
• [8] AMCOM Regulation 385-17, "Software System Safety Policy", 15 March 2008.
• [9] "NASA Software Safety Guidebook", NASA-GB-8719.13, 31 March 2004.
• [10] Margaret Hamilton and William Hackler, "Universal Systems Language: Lessons Learned from Apollo", IEEE Computer Society, 2008.
• [11] Margaret Hamilton, "Full Life Cycle Systems Engineering and Software Development Environment: Development Before The Fact In Action", http://www.htius.com/Articles/Full_Life_Cycle.htm
• [12] Peter Feiler, David Gluch, and John Hudak, "The Architecture Analysis & Design Language (AADL): An Introduction", CMU/SEI-2006-TN-011, February 2006.
• [13] Peter Feiler and John Hudak, "Developing AADL Models for Control Systems: A Practitioner's Guide", CMU/SEI-2007-TR-014, July 2007.
• [14] Bruce Lewis, "Using the Architecture Analysis and Design Language for System Verification and Validation", SEI presentation, 2006.
• [15] Feiler, Gluch, Hudak, and Lewis, "Embedded System Architecture Analysis Using SAE AADL", CMU/SEI-2004-TN-004, June 2004.
• [16] Charles Pecheur and Stacy Nelson, "V&V of Advanced Systems at NASA", NASA/CR-2002-211402, April 2002.
• [17] Systems Integration Requirements Task Group, "ARP 4754: Certification Considerations for Highly-Integrated or Complex Aircraft Systems", SAE Aerospace, 10 April 1996.
• [18] SAE, "ARP 4761: Guidelines and Methods for Conducting the Safety Assessment Process on Civil Airborne Systems and Equipment", December 1996.
• [19] Aeronautical Radio, Inc. (ARINC), "ARINC Specification 653P1-2: Avionics Application Software Standard Interface, Part 1 – Required Services", 7 March 2006.

30

References (Continued)

• [20] Department of Defense, "MIL-STD-882D: Standard Practice for System Safety", 10 February 2000.
• [21] RTCA, Inc., "DO-178B: Software Considerations in Airborne Systems and Equipment Certification", 1 December 1992.
• [22] RTCA, Inc., "DO-254: Design Assurance Guidance for Airborne Electronic Hardware", 19 April 2000.
• [23] US Army, "Aeronautical Design Standard Handbook: Rotorcraft and Aircraft Qualification (RAQ) Handbook", 21 October 1996.
• [24] Cary R. Spitzer (editor), "Avionics: Elements, Software and Functions", CRC Press, 2007.
• [25] US Army, "Army Regulation 70-62: Airworthiness Qualification of Aircraft Systems", 21 May 2007.
• [26] US Army, "Army Regulation 95-1: Aviation Flight Regulations", 3 February 2006.
• [27] Barbacci, Clements, Lattanze, Northrop, and Wood, "Using the Architecture Tradeoff Analysis Method (ATAM) to Evaluate the Software Architecture for a Product Line of Avionics Systems: A Case Study", CMU/SEI-2003-TN-012, July 2003.
• [28] Peter Feiler, "All in the Family: CAAS & AADL", CMU/SEI-2008-SR-021, August 2008.
• [29] Chrissis, Konrad, and Shrum, "CMMI: Guidelines for Process Integration and Product Improvement", Pearson Education, 2007.
• [30] Brendan O'Connell, "Model Driven Performance Analysis for Avionics Systems", Draper Laboratory, January 2006.
• [31] John F. Hanaway and Robert W. Moorehead, "Space Shuttle Avionics Systems", NASA SP-504, 1989.
• [32] Lui Sha, "The Complexity Challenge in Modern Avionics Software", 14 August 2006.
• [33] "Incidents Prompt New Scrutiny of Airplane Software Glitches", Wall Street Journal, 30 May 2006.
• [34] Eyal Ophir, Clifford Nass, and Anthony Wagner, "Cognitive Control in Media Multitaskers", PNAS, 20 July 2009.
• [35] Federal Aviation Administration, "Advisory Circular AC 25.1309-1A: System Design and Analysis", 21 June 1988.
• [36] Program Executive Office Policy Memorandum 08-03.
• [38] National Science Foundation webpage on Cyber-Physical Systems, http://www.nsf.gov/pubs/2008/nsf08611/nsf08611.htm
• [39] Hoang Pham, "Software Reliability", Springer, 2000.
• [40] Paul Rook (editor), "Software Reliability Handbook", Elsevier Science Publishers Ltd., 1990.

31

References (Concluded)

• [41] http://www.navair.navy.mil/v22/index.cfm?fuseaction=news.detail&id=128
• [42] http://mars8.jpl.nasa.gov/msp98/news/mco990930.html
• [43] John Garman, "The Bug Heard 'Round the World", ACM SIGSOFT Software Engineering Notes, October 1981.
• [44] "Mars Pathfinder Mission Status", 15 July 1997, http://marsprogram.jpl.nasa.gov/MPF/newspio/mpfstatus/pf970715.html
• [45] Nancy Leveson, "Safeware: System Safety and Computers", Addison-Wesley Publishing Company, 1995.
• [46] http://www.electronicaviation.com/aircraft/JAS-39_Gripen/810
• [47] Brandon Hill, "Lockheed's F-22 Raptor Gets Zapped by International Date Line", DailyTech LLC, 26 February 2007, http://www.freerepublic.com/focus/f-news/1791574/posts
• [48] http://www.military.com/news/article/human-error-cited-in-most-uav-crashes.html
• [49] Daniel Michaels and Andy Pasztor, "Incidents Prompt New Scrutiny of Airplane Software Glitches: As Programs Grow Complex, Bugs Are Hard to Detect; A Jet's Roller-Coaster Ride; Teaching Pilots to Get Control", Wall Street Journal, 30 May 2006.


Some Complex System Failures

bull V-22 Osprey crashesbull Mars Climate Orbiter crashbull Mars Pathfinder software resetbull USS Vincennes downing an Airbus 320bull Therac-25 software radiation treatment

failurebull 1989 Airbus A320 air show crashbull China Airlines Airbus Industries A300

crashbull Ariane 5 satellite launcher malfunctionbull Failure of the primary flight system to

sync with the backup during prelaunch of STS-1

bull Mexicana Airlines Boeing 727 airliner crashed into a mountain due to the software not correctly determining the mountain position

bull Loss of the first American probe to Venus

bull Korean Airlines KAL 901 accidentbull Soviet Phobos I Mars probe lost

bull Three Mile Islandbull F-18 fighter plane crash due to bad

exceptionbull F-14 fighter plane lost to

uncontrollable spinbull Swedish Gripen prototype crashedbull Swedish Gripen air-show crashbull F-22 failure crossing the IDLbull 2006 German-Spanish Barracuda UAVbull 2004 FA-22 Raptor stealth fighter jet

crash bull FA-22 Raptor navigation system

software error at Nellis AFBbull 50 cockpit blackouts on A320bull A320 multiple avionics and electrical

failures at Newark NJbull Boeing 777 Malaysian Airlines jetlinerrsquos

nightmarish autopilot rollercoaster ridebull 3000 feet US Army and Air Force UAV

Crashes

bull hellip And Many Morehellip

Presenter
Presentation Notes
• Multiple crashes have occurred with the V-22 Osprey [41].
• In 1999, the Mars Climate Orbiter was lost because of incorrect units in a program, caused by poor systems engineering practices [42, 44].
• In 1988, an Iran Air Airbus A300 was shot down by the USS Vincennes, in part because of cryptic and misleading output displayed by the tracking software [3].
• In 1989, an Airbus A320 crashed at an air show due to altitude indication and software handling [3].
• In 1994, a China Airlines Airbus A300 crashed, killing 264, from faulty software [3].
• In 1996, the first Ariane 5 satellite launcher was destroyed by a faulty software design: a few lines of Ada code contained unprotected variables. The horizontal velocity of the Ariane 5 exceeded that of the Ariane 4, and the resulting overflow caused the guidance system to veer the rocket off course. Insufficient testing did not catch this error, which was a carry-over from Ariane 4 [3, 39].
• In 1986, a Mexicana Airlines Boeing 727 crashed into a mountain because the software did not correctly determine the mountain's position [39].
• In 1986, the Therac-25 radiation therapy machines overdosed cancer patients due to a flaw in the computer program controlling the highly automated devices [3, 39, 45].
• During the maiden launch attempt of the Space Shuttle Columbia (STS-1) in 1981, the primary flight control computer system failed to establish sync with the backup during prelaunch [43].
• On December 10, 1990, the Space Shuttle Columbia had to land early due to computer software problems [39].
• In 1997, the Mars Pathfinder suffered repeated software resets due to latent task execution caused by priority inversion on a mutex [3, 44].
• An error in a single FORTRAN statement resulted in the loss of the first American probe to Venus. From G. J. Myers, Software Reliability: Principles & Practice, p. 25 [3].
• In August 1997, the Korean Air KAL 801 accident in Guam killed 225 of the 254 aboard; a worldwide bug was discovered in the barometric altimetry of the Ground Proximity Warning System (GPWS). From ACM SIGSOFT Software Engineering Notes, vol. 23, no. 1 [3].
• The Soviet Phobos I Mars probe was lost due to a faulty software update, at a cost of 300 million rubles; its disorientation broke the radio link, and the solar batteries discharged before reacquisition. From Aviation Week, 13 Feb 1989 [3].
• An F-18 crashed due to a missing exception condition. From ACM SIGSOFT Software Engineering Notes, vol. 6, no. 2 [3].
• An F-14 was lost to an uncontrollable spin traced to tactical software. From ACM SIGSOFT Software Engineering Notes, vol. 9, no. 5 [3].
• In 1989, a Swedish Gripen prototype crashed due to software in its digital fly-by-wire system [3, 46].
• In 1995, another Gripen fighter crashed during an air show, caused by a software issue [3, 46].
• On February 11, 2007, twelve FA-22 Raptors were forced to head back to Hawaii when a software bug caused a computer crash as they were crossing the International Date Line. In the middle of the ocean, their navigation, fuel, and part of their communications systems failed, and all attempts to reboot failed [47].
• In February 2006, the German-Spanish unmanned combat air vehicle Barracuda crashed due to software failure [4].
• In December 2004, a glitch in the flight control software probably caused an FA-22 Raptor stealth fighter to crash on takeoff at Nellis Air Force Base [4].
• In 2008, a United Airbus A320, registration N462UA, experienced multiple avionics and electrical failures, including loss of all communications, shortly after rotation while departing Newark Liberty International Airport in Newark, New Jersey [NTSB Report Identification DCA08IA033].
• In 2006, a Boeing 777 Malaysia Airlines jetliner's autopilot caused a stall by climbing 3,000 feet; the pilots struggled to nose the plane down, plunged into a steep dive, and regained control only after pulling back up. The cause was defective flight software providing incorrect airspeed and acceleration data, confusing the flight computers and initially ignoring the pilot's commands [49].
• US Army and Air Force UAVs have crashed from control system or human error [48].
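The Ariane 5 entry above is worth making concrete. Below is a minimal Python sketch (the flight code was Ada, and the values here are invented; this only illustrates the failure mode) contrasting an unprotected float-to-16-bit conversion, which silently wraps, with a guarded conversion that raises an explicit exception a handler can act on:

```python
INT16_MIN, INT16_MAX = -32768, 32767

def to_int16_unprotected(x: float) -> int:
    # Emulates an unguarded conversion to a 16-bit integer: the value
    # silently wraps, as the Ariane 5 horizontal-bias conversion did when
    # velocities exceeded the range assumed from Ariane 4 trajectories.
    return ((int(x) - INT16_MIN) % 65536) + INT16_MIN

def to_int16_protected(x: float) -> int:
    # The "protected" form: out-of-range input is detected and reported
    # instead of feeding a garbage operand to the guidance system.
    if not INT16_MIN <= x <= INT16_MAX:
        raise OverflowError(f"value {x} exceeds 16-bit range")
    return int(x)

print(to_int16_unprotected(40000.0))  # -25536: silent nonsense
print(to_int16_protected(40000.0))    # raises OverflowError
```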

15

Lessons Learned from Failures

• From Nancy Leveson's paper "The Role of Software in Spacecraft Accidents":
  – "Flaws in the safety culture, diffusion of responsibility and authority
  – Limited communication channels and poor information flow
  – Inadequate system and software engineering
  – Poor or missing specifications
  – Unnecessary complexity and software functionality
  – Software reuse or changes without appropriate safety analysis
  – [Shortcomings] in safety engineering practices
  – Flaws in test and simulation environments
  – Inadequate human factors design for software"

Presenter
Presentation Notes
In Dr. Nancy Leveson's paper "The Role of Software in Spacecraft Accidents" [36], she cited problems with software development on various projects within NASA. According to Dr. Leveson, there were "flaws in the safety culture, diffusion of responsibility and authority, limited communication channels and poor information flow, inadequate system and software engineering, poor or missing specifications, unnecessary complexity and software functionality, software reuse or changes without appropriate safety analysis, violation of basic safety engineering practices, inadequate system safety engineering, flaws in test and simulation environments, and inadequate human factors design for software." While these problems were identified for spacecraft development within NASA and corrected, aviation in general could learn from these lessons to mitigate issues in complex systems development.

16

Some Current Guidelines

• DO-178B – Software Considerations in Airborne Systems and Equipment Certification
• DO-248B – Final Report for the Clarification of DO-178B
• DO-278 – Guidelines for Communications, Navigation, Surveillance and Air Traffic Management (CNS/ATM) Systems Software Integrity Assurance
• DO-254 – Design Assurance Guidance for Airborne Electronic Hardware
• DO-297 – Integrated Modular Avionics (IMA) Development Guidance and Certification Considerations
• SAE-ARP4754 – Certification Considerations for Highly Integrated or Complex Aircraft Systems
• SAE-ARP4761 – Guidelines and Methods for Conducting the Safety Assessment Process on Airborne Systems and Equipment
• FAA Advisory Circular AC27-1B – Certification of Normal Category Rotorcraft
• FAA Advisory Circular AC29-2C – Certification of Transport Category Rotorcraft
• ISO/IEC 12207 – Software Life Cycle Processes
• ARINC 653 – Specification Standard for Time and Space Partitioning
• MIL-STD-882D – DoD System Safety
• ADS-51-HDBK – Rotorcraft and Aircraft Qualification Handbook
• AR 70-62 – Airworthiness Release Standard
• SED-SES-PMHFSA 001 – Software Engineering Directorate (SED) Software Engineering Evaluation System (SEES) Program Manager Handbook for Flight Software Airworthiness
• SED-SES-PMHSS 001 – SED SEES Program Manager Handbook for Software Safety

WHAT'S MISSING – a Reliability Standard for Complex Systems

Presenter
Presentation Notes
These previously stated problems drove the development of these guidelines; however, there is no standard for system reliability that includes software. Other standards and advisory circulars pertain to complex systems, but a reliability standard for complex systems is missing: one that would outline the process for establishing cyber-physical system reliability and indicate how to model, analyze, and ascertain the projected level of reliability.

17

Certification Assessment Considerations

• Sufficient data and time must be available for airworthiness evaluation
• Certification process
  – Currently lengthy
  – Depends heavily on human interpretation, trade-offs, and risk mitigation
  – Overwhelming for complex integrated systems (FHAs, FTAs, FMECAs, risk mitigation, etc.)
• A consistent industry-wide method is needed to assess a system at any stage of the life-cycle to allow a tradeoff of design alternatives
• Certification tasks outlined in DO-297 should be considered
  – Task 1: Module acceptance
  – Task 2: Application software/hardware acceptance
  – Task 3: IMA system acceptance
  – Task 4: Aircraft integration of IMA system, including V&V
  – Task 5: Change of modules or applications
  – Task 6: Reuse of modules or applications

Presenter
Presentation Notes
In order to execute an AWR, sufficient data must be provided to the appropriate airworthiness authorities to demonstrate that safety requirements are met to the levels set by the System Safety Assessment (SSA) or equivalent. The certification process is currently lengthy and depends heavily on human interpretation of the myriad functions of a complex architecture.

Current guidelines such as DO-178B, DO-254, DO-297, SAE-ARP4754, and SAE-ARP4761, along with many others, outline the proper steps that should be taken. System safety management's military standard, MIL-STD-882, has been in use for decades. Civilian safety standards for the aviation industry include SAE ARP4754, which shows the incorporation of system safety activities into the design process and provides guidance on techniques to ensure a safe design; SAE ARP4761 contains significant guidance on how to perform the system safety activities described in SAE ARP4754. DO-178B outlines development assurance guidance for aviation software based on the failure condition categorization of the functionality the software provides, and DO-254 embodies similar guidance for aviation hardware. ARINC 653 is a widely accepted standard for ensuring time and space partitioning of software. DO-297 does an excellent job of describing the certification tasks for an IMA system, which include:
Task 1: Module acceptance
Task 2: Application software/hardware acceptance
Task 3: IMA system acceptance
Task 4: Aircraft integration of IMA systems, including verification and validation
Task 5: Change of modules or applications
Task 6: Reuse of modules or applications
Taken together, these standards provide guidance that, if followed, will likely result in safe, highly reliable, and cost-effective systems over the life-cycle of the system. Yet while these guidelines exist, there is no consistent industry-wide method to assess a system at any stage of the life-cycle to allow a tradeoff of design alternatives, nor is there a standard outlining overall system reliability that includes both hardware and software. To achieve this level of reliability, a standard should be developed to define the process and method for arriving at a quantifiable reliability number that would in turn lead to acceptance.
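As an aside on ARINC 653, the essence of its time partitioning can be sketched in a few lines. The following notional Python example (partition names and window lengths are invented, and this is not the ARINC 653 API) shows a fixed major frame replayed cyclically, so a misbehaving application can never consume another partition's time budget:

```python
from dataclasses import dataclass

@dataclass
class Window:
    partition: str    # hosted application, e.g. "flight_controls"
    duration_ms: int  # statically configured slice of the major frame

# Notional 100 ms major frame; the schedule is fixed at configuration
# time, never negotiated at run time.
MAJOR_FRAME = [
    Window("flight_controls", 40),
    Window("navigation", 30),
    Window("displays", 30),
]

def run(frames: int) -> None:
    t = 0
    for _ in range(frames):
        for w in MAJOR_FRAME:
            # Only w.partition executes during [t, t + w.duration_ms).
            print(f"t={t:4d} ms: {w.partition} runs for {w.duration_ms} ms")
            t += w.duration_ms

run(2)
```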

18

Definition of Complexity and Reliability is Needed

[Figure 9: Components, each characterized by complexity fundamentals and reliability parametrics, mature at TRL 3 or 4; they integrate into subsystems with reliability dependencies at TRL 6 or 7; subsystems integrate into the realized system, with its reliability sensitivities, at TRL 8 or 9, yielding a highly reliable complex system and a certificate (e.g., AWR).]

Presenter
Presentation Notes
As mentioned in the introduction, the overall goal of this paper is to state the need for an industry standard method for qualifying complex integrated systems to a specified reliability. In order to achieve this, systems must be broken down to the component level and built up to subsystem and system levels, with an overall aggregated system reliability value as the result. The goal should be to establish the ability to assess reliability at the component, subsystem, and then system level, with each phase working toward a higher Technical Readiness Level (TRL). The end result would be fed into the accepted Type Certificate (TC) and/or AWR.
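As an illustration of that roll-up, the sketch below aggregates hypothetical component reliabilities into subsystem and system figures, assuming independent failures, series composition for items that must all work, and parallel composition for redundant units (all names and numbers are invented):

```python
from math import prod

def series(reliabilities):
    # All elements must survive: reliabilities multiply (independence assumed).
    return prod(reliabilities)

def parallel(reliabilities):
    # Redundant elements: the group fails only if every element fails.
    return 1.0 - prod(1.0 - r for r in reliabilities)

# Hypothetical component figures at some assessment point.
display, nav, fcc_a, fcc_b = 0.9990, 0.9995, 0.9992, 0.9992

fcc_pair = parallel([fcc_a, fcc_b])        # redundant flight-control computers
system = series([display, nav, fcc_pair])  # subsystems combine in series
print(f"aggregated system reliability: {system:.6f}")
```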

19

Analytical Models and Reliability

• Analytical models of hardware reliability are well understood

• Architecture modeling and software reliability modeling is not a novel idea, but it is highly debated
  – There are many approaches and little consensus as to the best way
  – Many models (Jelinski-Moranda, Littlewood-Verrall, Musa-Okumoto, etc.) [1]
  – Many tools (over 200 built since the 1970s) [2]

• Predictability of software reliability is of great concern because software is a major contributor to unreliability [2]

• Software reliability is the probability of error-free operation of a computer program in a specified environment for a specified time [1]

• A basis is needed for setting reliability figures from previous systems, with those figures iteratively refined in the future

• NOT A REPLACEMENT FOR TESTING AND VERIFICATION

Presenter
Presentation Notes
As mentioned in the introduction, the overall goal of this paper is to state the need for an industry standard method for qualifying complex integrated systems to a specified reliability. While not suggesting a total solution, better modeling practices should be considered as part of a solution to bridge the gap between design, test, and implementation. A method to model the architecture early in the requirements establishment phase, follow the detailed design and coding, and then verify the system along with the model may be a path to greater confidence in the system and to reducing the risks, warnings, and cautions that must be issued.

In order to achieve this, systems must be broken down to the component level and built up to subsystem and system levels, resulting in an overall aggregated system reliability value (see Figure 9). The goal should be to establish the ability to assess reliability at the component, subsystem, and then system level, with each phase working toward a higher Technical Readiness Level (TRL). The end result would be fed into the accepted Type Certificate (TC) or AWR.

To achieve this goal, modeling and analysis tools that follow a standard for modeling reliability should exist. As previously stated, over 200 tools have been created since the 1970s. A few of the current tools are:
Universal Systems Language (USL)
Unified Modeling Language (UML)
Systems Modeling Language (SysML)
MATLAB/Simulink
Telelogic Rhapsody
MathCAD
Colored Petri Nets
Rate Monotonic Analysis (RMA)
STATEMATE (used by Airbus)
Safety Critical Application Development Environment (SCADE)
OPNET
Embedded System Modeling Language (ESML)
Component Synthesis using Model-Integrated Computing (CoSMIC)
Architectural Analysis and Design Language (AADL)
By no means is this list complete. Typically, different companies and projects address this challenge by choosing their own tools to perform the upfront analysis and modeling, not following a standard approach. Multiple tools need to converge or be compatible with the framework of a common tool such as AADL. The resultant tool(s) could be used to verify the requirements up front to mitigate reliability issues down the life-cycle.
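As a concrete taste of one of the models named above, the Musa-Okumoto logarithmic Poisson model gives expected cumulative failures mu(t) = (1/theta) * ln(lambda0 * theta * t + 1) and a failure intensity that decays as faults are removed. A minimal sketch, with invented parameter values:

```python
import math

def mo_mean_failures(t: float, lam0: float, theta: float) -> float:
    # Expected cumulative failures after t units of execution time.
    return math.log(lam0 * theta * t + 1.0) / theta

def mo_intensity(t: float, lam0: float, theta: float) -> float:
    # Current failure intensity; lam0 is the initial intensity and
    # theta controls how fast it decays as failures are experienced.
    return lam0 / (lam0 * theta * t + 1.0)

# Invented parameters: 0.05 failures/hour initially, decay parameter 0.02.
for hours in (10, 100, 1000):
    print(hours,
          round(mo_mean_failures(hours, 0.05, 0.02), 2),
          round(mo_intensity(hours, 0.05, 0.02), 5))
```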

20

Tools for Modeling and Analysis

• Universal Systems Language (USL)
• Unified Modeling Language (UML)
• Systems Modeling Language (SysML)
• MATLAB/Simulink
• Telelogic Rhapsody
• MathCAD
• Colored Petri Nets
• Rate Monotonic Analysis (RMA)
• STATEMATE (used by Airbus)
• SCADE
• OPNET
• Embedded System Modeling Language (ESML)
• Component Synthesis using Model-Integrated Computing (CoSMIC)
• Architectural Analysis and Design Language (AADL)
• At least 200 more packages since the 1970s
• Certified tools need to converge to an accepted standard modeling/analysis method for complex system reliability

Presenter
Presentation Notes
Typically, different companies and projects address the challenge by choosing their own tools to perform the upfront analysis and modeling, not following a common guideline. Multiple tools need to converge or be compatible with a set modeling standard for complex systems. The compliant tools could then be used to verify the requirements up front to mitigate reliability issues down the life-cycle. A notional approach would follow that shown in Figure 10, where the system V is followed but architectural modeling and analysis run parallel with the real development effort. This would allow reliability to be allocated during the design phase and measured during the implementation, test, and verification phases using the model. The downside to modeling, raised in certain circles, is getting people to believe those models: how do you certify a modeling tool and the actual models within it? Those issues should be addressed going forward.

21

Modification to Acquisition Model

[Figure 10: The acquisition "V" (Requirements Establishment, High Level Design, Detailed Specifications, Implementation/Coding, Verification, Development Testing, Operational Testing & Validation, Deployed System) with Architectural Model & Analysis running in parallel across all phases; reliability is allocated on the design leg and measured on the test leg.]

Proposes a standard modeling methodology to be applied at different phases of development to enhance requirements development, reliability allocation, reliability measurement, and testing (DISCLAIMER: DOES NOT REPLACE TESTING).

Presenter
Presentation Notes
A notional approach would follow that shown in Figure 10, where the system V is followed but architectural modeling and analysis run parallel with the real development effort. This would allow reliability to be allocated during the design phase and measured during the implementation, test, and verification phases using the model, bridging the design and test phases together. It is emphasized here that the architectural model would not replace critical testing but would augment the process to allow better requirement identification and verification. Thorough ground and flight tests should never be replaced by modeling; modeling would only allow for more robust requirements and a higher level of confidence in the requirements and design, and the model could be used in conjunction with testing to confirm the design. Proper modeling and analysis would reduce total program costs by enforcing more complete and correct initial requirements, which reduces the issues discovered later in testing that are expensive or impossible to fix and that force programs to accept high risks. Additionally, if the model is maintained and optimized, it could be used after system deployment to analyze the impacts of upgrades or changes to the system, allowing for more complete analysis and reducing overall system redesign costs.

A hurdle to cross with modeling and analysis is convincing people to believe those models. Some method to certify these models and modeling tools should be addressed in the future, and standards should be set in place for correct modeling techniques for complex systems. Lastly, consideration should be given to standard verification checking tools, such as the Motor Industry Software Reliability Association (MISRA) compliance verification tools for the use of C in safety-critical systems.

22

Systems Reliability Standard Establishment

• Establish a working group to define this standard
  – Need a technical society to lead the charge

• Collaborate with industry, academia, military, and societies
  – Focus on development of a reliability standard with AWR safety in mind
  – Draw upon their experiences to feed into this standard

• Study existing and previous complex systems
  – Shuttle, Space Station, missile systems, nuclear submarine and ship systems, nuclear control systems, military and commercial jet systems
  – Obtain software reliability information from existing and previous systems
  – Build a database to serve as a basis for future reliability figures

• Research prior efforts in complex systems analysis

• Establish a consensus-based modeling and analysis method

Presenter
Presentation Notes
In conclusion, methods for achieving a design for complex systems do exist; however, achieving reliability and attaining a level of qualification that would permit better AWRs does not. A standard needs to be developed to tackle this issue rather than relying on current methods to ascertain the airworthiness of complex systems. This will not occur overnight. An orchestrated collaboration among industry, academia, military labs, and professional technical societies, focused on development of this standard, should allow us to draw upon their experiences to feed into this reliability standard with AWR safety in mind. We have a long-running experiment with complex software systems in the Space Transportation System (STS), International Space Station (ISS), missile systems, nuclear submarine and ship systems, nuclear control systems, and military and commercial jet systems, from which we should be able to obtain at least inferred software reliability information from the architectures and run time those systems use. We should look at the lessons learned from these systems to see what could have been improved and what was done right that should be carried forward. The challenge is collecting this information into a central database and arriving at some figure of reliability from previous systems where data exists; this would at least provide a starting point for initial assessments and could be refined in the future. Also, this is not the only study aimed at establishing reliability metrics for complex software systems. There have been similar research projects that have risen and fallen, and the data from those projects should not be wasted but studied to feed into whatever standard is developed. While historical information would be useful, each design is unique and requires tools to accomplish the design; architectural modeling constructs should be further investigated as a possible augmentation to the design and test process. We need to determine which forum is best to conduct this effort (e.g., SAE, IEEE, AIAA, ACM, AHS, INCOSE, or other). As stated in the paper "Space Shuttle Avionics" [31]: "The designers, the flight crew, and other operational users of the system often have a mindset established in a previous program or experience, which results in a bias against new or different 'unconventional' approaches." If nothing is done to address this problem, it will only get worse over time. It is past time to address the issue of reliability of complex systems and software.

23

BACKUP SLIDES

24

PEO Aviation System Safety Management Decision Authority Matrix

Risk matrix (severity x probability; each cell assigns decision authority to Army Acquisition, PEO Aviation, or Program Management):

Severity (most credible): 1 Catastrophic, 2 Critical, 3 Marginal, 4 Negligible
Probability levels:
  A Frequent: P > 1E-3
  B Probable: 1E-4 < P <= 1E-3
  C Occasional: 1E-5 < P <= 1E-4
  D Remote: 1E-6 < P <= 1E-5
  E Improbable: 1E-7 < P <= 1E-6

Hazard Category – Description:
  1 Catastrophic – Death or permanent total disability; system loss
  2 Critical – Severe injury or minor occupational illness (no permanent effect); minor system or environmental damage
  3 Marginal – Minor injury or minor occupational illness (no permanent effect); minor system or environmental damage
  4 Negligible – Less than minor injury or occupational illness (no lost workdays), or less than minor environmental damage

Risk Level – Description – Probability (frequency per 100,000 flight hours):
  A – Frequent – > 100 (P > 1E-3)
  B – Probable – <= 100 and > 10 (1E-4 < P <= 1E-3)
  C – Occasional – <= 10 and > 1 (1E-5 < P <= 1E-4)
  D – Remote – <= 1 and > 0.1 (1E-6 < P <= 1E-5)
  E – Improbable – <= 0.1 and > 0.01 (1E-7 < P <= 1E-6)

Presenter
Presentation Notes
As already mentioned, it is the goal of the US Army Aviation Engineering Directorate that the system be safe and reliable to operate and perform the mission when delivered. Another goal is that the released system continue to safely perform the mission if maintained and operated per the operator's manual; replacement parts and overhaul work must be of high quality to support continued airworthiness. Per the Program Executive Office Memorandum 08-03 risk matrix, US Army flight control systems are to achieve 1E-9 reliability for flight-critical functions per civil airspace regulations [35] and 1E-6 reliability for tactical use [36]. Quantifying these numbers is established for component hardware but not for software. Just as hardware should have quantifiable reliability, so should software.
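For illustration, the probability bands in the matrix above translate directly into a small classifier. This sketch (a hypothetical helper, not an official tool) maps a per-flight-hour probability onto the A-E levels:

```python
# Thresholds are the lower bounds of each band in the PEO matrix.
LEVELS = [
    ("A", "Frequent",   1e-3),   # P > 1E-3
    ("B", "Probable",   1e-4),   # 1E-4 < P <= 1E-3
    ("C", "Occasional", 1e-5),   # 1E-5 < P <= 1E-4
    ("D", "Remote",     1e-6),   # 1E-6 < P <= 1E-5
    ("E", "Improbable", 1e-7),   # 1E-7 < P <= 1E-6
]

def probability_level(p_per_flight_hour: float) -> str:
    for code, name, lower_bound in LEVELS:
        if p_per_flight_hour > lower_bound:
            return f"{code} ({name})"
    return "below level E (P <= 1E-7)"

print(probability_level(5e-5))  # -> "C (Occasional)"
```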

25

Reliability Defined

• Software reliability – the probability that a given piece of software will execute without failure in a given environment for a given time [40]
  – Often debated as to how to measure

• Hardware reliability – the probability that a hardware component operates without failure over time
  – Well defined and established

• System reliability – the probability of success, or the probability that the system will perform its intended function under specified design limits [over a given amount of time] [39]
  – A combination of software and hardware reliability

Presenter
Presentation Notes
"Reliability is defined as the probability of success, or the probability that the system will perform its intended function under specified design limits" [39]. "Software reliability is defined as the probability that a given piece of software will execute without failure in a given environment for a given time" [40]. Systems rely on both, and thus an overall reliability must combine the two.
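One common simplification, offered here only as a sketch, treats hardware and software as independent series elements with constant failure rates, so the system figure for a mission of length t is the product of two exponentials (the rates below are invented):

```python
import math

def reliability(t_hours: float, failure_rate_per_hour: float) -> float:
    # Constant-failure-rate model: R(t) = exp(-lambda * t).
    return math.exp(-failure_rate_per_hour * t_hours)

t = 10.0                      # mission duration, hours
r_hw = reliability(t, 1e-5)   # invented hardware failure rate
r_sw = reliability(t, 4e-6)   # invented software failure rate
r_system = r_hw * r_sw        # independent series combination
print(f"R_hw={r_hw:.6f}  R_sw={r_sw:.6f}  R_system={r_system:.6f}")
```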

26

Hardware vs Software Reliability

Hardware: Failure rate has a bathtub curve; the burn-in state is similar to the software debugging state.
Software: Without considering program evolution, failure rate is statistically non-increasing.

Hardware: Material deterioration can cause failures even though the system is not used.
Software: Failures never occur if the software is not used.

Hardware: Failure data are fitted to some distributions; the selection of the underlying distribution is based on the analysis of failure data and experience. Emphasis is placed on analyzing failure data.
Software: Most models are analytically derived from assumptions; emphasis is on developing the model, the interpretation of the model assumptions, and the physical meaning of the parameters.

Hardware: Failures are caused by material deterioration, design errors, misuse, and environment.
Software: Failures are caused by incorrect logic, incorrect statements, or incorrect input data.

Hardware: Can be improved by better design, better materials, applying redundancy, and accelerated life-cycle testing.
Software: Can be improved by increasing the testing effort and correcting discovered faults; reliability tends to change continuously during testing due to the addition of problems in new code or the removal of problems by debugging.

Hardware: Repairs restore the original condition.
Software: Repairs establish a new piece of software.

Hardware: Failures are usually preceded by warnings.
Software: Failures are rarely preceded by warnings.

Hardware: Components can be standardized.
Software: Components have rarely been standardized.

Hardware: Can usually be tested exhaustively.
Software: Essentially requires infinite testing for completeness.

Reference: [39] Hoang Pham, "Software Reliability," Springer, 2000.

Presenter
Presentation Notes
"Reliability is defined as the probability of success, or the probability that the system will perform its intended function under specified design limits" [39]. "Software reliability is defined as the probability that a given piece of software will execute without failure in a given environment for a given time" [40]. Hoang Pham compares software versus hardware reliability as shown in Table 2 [39]. Systems rely on both, and thus an overall reliability must combine the two.
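The first row of the table (bathtub-shaped hardware failure rate versus a statistically non-increasing software failure rate) can be sketched with notional curves; all constants below are invented purely to produce the shapes:

```python
def hw_failure_rate(t: float) -> float:
    # Notional bathtub: decaying infant-mortality term, constant
    # useful-life floor, and a slowly growing wear-out term.
    return 0.01 / (1.0 + t) + 1e-5 + 1e-8 * t

def sw_failure_rate(faults_remaining: int) -> float:
    # Software intensity falls as faults are found and fixed; absent
    # new code, nothing "wears out" with calendar time.
    return 1e-6 * faults_remaining

for t in (0, 10, 100, 1000, 100000):
    print(f"hw t={t:6d} h: {hw_failure_rate(float(t)):.2e}")
for n in (100, 50, 10):
    print(f"sw faults={n:3d}:   {sw_failure_rate(n):.1e}")
```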

27

Acronym List

AADL – Architectural Analysis and Design Language
AC – Advisory Circular (FAA)
ACM – Association of Computing Machinery
AED – Aviation Engineering Directorate (AMRDEC)
AFTD – Aviation Flight Test Directorate (US Army)
AGC – Apollo Guidance Computer
AHS – American Helicopter Society
AIAA – American Institute of Aeronautics and Astronautics (Inc.)
AMCOM – Aviation and Missile Command (US Army)
AMRDEC – Aviation and Missiles Research, Development and Engineering Center (US Army)
AR – Army Regulation
ARINC – Aeronautical Radio, Inc.
ARP – Aerospace Recommended Practice
ASIF – Avionics Software Integration Facility
ATAM – Architecture Tradeoff Analysis Method
ATM – Air Traffic Management
AWR – Airworthiness Release
CAAS – Common Avionics Architecture System
CH-47 – Cargo Helicopter Chinook
CMM – Capability Maturity Model
CMMI – Capability Maturity Model Integration
CMU – Carnegie Mellon University
CNS – Communications, Navigation, Surveillance
CoSMIC – Component Synthesis using Model-Integrated Computing
CPS – Cyber-Physical System
CRC – Chemical Rubber Company (i.e., CRC Press)
DFBW – Digital Fly-By-Wire
DoD – Department of Defense
E3 – Electrical and Electromagnetic Effects
ESML – Embedded System Modeling Language
FAA – Federal Aviation Administration
FCS – Future Combat Systems
FHA – Functional Hazard Assessment
FMEA – Failure Modes Effects Analysis

28

Acronym List (concluded)

GPWS – Ground Proximity Warning System
IBM – International Business Machines
IEC – International Electrotechnical Commission
IL – Instrumentation Lab (now Draper Laboratory)
IMA – Integrated Modular Avionics
INCOSE – International Council On Systems Engineering
ISO – International Organization for Standardization
ISS – International Space Station
KAL – Korean Airlines
MISRA – Motor Industry Software Reliability Association
MIT – Massachusetts Institute of Technology
NASA – National Aeronautics and Space Administration (USA)
PDR – Preliminary Design Review
PEO – Program Executive Office
PNAS – Proceedings of the National Academy of Sciences
RAQ – Rotorcraft and Aircraft Qualification
RMA – Rate Monotonic Analysis
RTC – Redstone Test Center (US Army)
RTTC – Redstone Technical Test Center (US Army)
RTCA – Radio Technical Commission for Aeronautics
SAE – Society of Automotive Engineers
SED – Software Engineering Directorate (AMRDEC)
SEES – Software Engineering Evaluation System
SEI – Software Engineering Institute (CMU)
SIL – System Integration Laboratory
SSA – System Safety Assessment
STS – Space Transportation System
SysML – Systems Modeling Language
TMR – Triple Modular Redundant
TRL – Technical Readiness Level
UAS – Unmanned Aircraft System
UH-60 – Utility Helicopter Blackhawk
UML – Unified Modeling Language
US – United States
USL – Universal Systems Language

29

References

• [1] Israel Koren and Mani Krishna, "Fault-Tolerant Systems," Morgan Kaufmann, 2007.
• [2] Jiantao Pan, "Software Reliability," Carnegie Mellon University, Spring 1999.
• [3] Nachum Dershowitz, Software Horror Stories, http://www.cs.tau.ac.il/~nachumd/horror.html
• [4] http://www.air-attack.com
• [5] David A. Mindell, "Digital Apollo: Human and Machine in Spaceflight," The MIT Press, 2008.
• [6] Software Engineering Directorate, Software Engineering Evaluation System (SEES), "Program Manager Handbook for Flight Software Airworthiness," SED-SES-PMHFSA 001, December 2003.
• [7] Software Engineering Directorate, Software Engineering Evaluation System (SEES), "Program Manager Handbook for Software Safety," SED-SES-PMHSSA 001, February 2006.
• [8] AMCOM Regulation 385-17, "Software System Safety Policy," 15 March 2008.
• [9] NASA Software Safety Guidebook, NASA-GB-8719.13, 31 March 2004.
• [10] Margaret Hamilton and William Hackler, "Universal Systems Language: Lessons Learned from Apollo," IEEE Computer Society, 2008.
• [11] Margaret Hamilton, "Full Life Cycle Systems Engineering and Software Development Environment: Development Before The Fact In Action," http://www.htius.com/Articles/Full_Life_Cycle.htm
• [12] Peter Feiler, David Gluch, and John Hudak, "The Architecture Analysis & Design Language (AADL): An Introduction," CMU/SEI-2006-TN-011, February 2006.
• [13] Peter Feiler and John Hudak, "Developing AADL Models for Control Systems: A Practitioner's Guide," CMU/SEI-2007-TR-014, July 2007.
• [14] Bruce Lewis, "Using the Architecture Analysis and Design Language for System Verification and Validation," SEI Presentation, 2006.
• [15] Feiler, Gluch, Hudak, and Lewis, "Embedded System Architecture Analysis Using SAE AADL," CMU/SEI-2004-TN-004, June 2004.
• [16] Charles Pecheur and Stacy Nelson, "V&V of Advanced Systems at NASA," NASA/CR-2002-211402, April 2002.
• [17] Systems Integration Requirements Task Group, "ARP 4754: Certification Considerations for Highly-Integrated or Complex Aircraft Systems," SAE Aerospace, 10 April 1996.
• [18] SAE, "ARP 4761: Guidelines and Methods for Conducting the Safety Assessment Process on Civil Airborne Systems and Equipment," December 1996.
• [19] Aeronautical Radio, Inc. (ARINC), "ARINC Specification 653P1-2: Avionics Application Software Standard Interface, Part 1 – Required Services," 7 March 2006.

30

References (Continued)

• [20] Department of Defense, "MIL-STD-882D: Standard Practice for System Safety," 19 January 1993.
• [21] RTCA, Inc., "DO-178B: Software Considerations in Airborne Systems and Equipment Certification," 1 December 1992.
• [22] RTCA, Inc., "DO-254: Design Assurance Guidance for Airborne Electronic Hardware," 19 April 2000.
• [23] US Army, "Aeronautical Design Standard Handbook: Rotorcraft and Aircraft Qualification (RAQ) Handbook," 21 October 1996.
• [24] Cary R. Spitzer (editor), "Avionics: Elements, Software and Functions," CRC Press, 2007.
• [25] US Army, "Army Regulation 70-62: Airworthiness Qualification of Aircraft Systems," 21 May 2007.
• [26] US Army, "Army Regulation 95-1: Aviation Flight Regulations," 3 February 2006.
• [27] Barbacci, Clements, Lattanze, Northrop, and Wood, "Using the Architecture Tradeoff Analysis Method (ATAM) to Evaluate the Software Architecture for a Product Line of Avionics Systems: A Case Study," CMU/SEI-2003-TN-012, July 2003.
• [28] Peter Feiler, "All in the Family: CAAS & AADL," CMU/SEI-2008-SR-021, August 2008.
• [29] Chrissis, Konrad, and Shrum, "CMMI: Guidelines for Process Integration and Product Improvement," Pearson Education, 2007.
• [30] Brendan O'Connell, "Model Driven Performance Analysis for Avionics Systems," Draper Laboratory, January 2006.
• [31] John F. Hanaway and Robert W. Moorehead, "Space Shuttle Avionics Systems," NASA SP-504, 1989.
• [32] Lui Sha, "The Complexity Challenge in Modern Avionics Software," August 14, 2006.
• [33] "Incidents Prompt New Scrutiny of Airplane Software Glitches," Wall Street Journal, 30 May 2006.
• [34] Eyal Ophir, Clifford Nass, and Anthony Wagner, "Cognitive Control in Media Multitaskers," PNAS, 20 July 2009.
• [35] "Advisory Circular AC 25.1309-1A: System Design and Analysis," Federal Aviation Administration, 21 June 1988.
• [36] Program Executive Office Policy Memorandum 08-03.
• [38] National Science Foundation webpage on Cyber-Physical Systems, http://www.nsf.gov/pubs/2008/nsf08611/nsf08611.htm
• [39] Hoang Pham, "Software Reliability," Springer, 2000.
• [40] Paul Rook (editor), "Software Reliability Handbook," Elsevier Science Publishers Ltd., 1990.

31

References (Concluded)

• [41] http://www.navair.navy.mil/v22/index.cfm?fuseaction=news.detail&id=128
• [42] http://mars8.jpl.nasa.gov/msp98/news/mco990930.html
• [43] John Garman, "The Bug Heard 'Round the World," ACM SIGSOFT, October 1981.
• [44] "Mars Pathfinder Mission Status," July 15, 1997, http://marsprogram.jpl.nasa.gov/MPF/newspio/mpfstatus/pf970715.html
• [45] Nancy Leveson, "Safeware: System Safety and Computers," Addison-Wesley Publishing Company, 1995.
• [46] http://www.electronicaviation.com/aircraft/JAS-39_Gripen/810
• [47] Brandon Hill, "Lockheed's F-22 Raptor Gets Zapped by International Date Line," DailyTech LLC, February 26, 2007, http://www.freerepublic.com/focus/f-news/1791574/posts
• [48] http://www.military.com/news/article/human-error-cited-in-most-uav-crashes.html
• [49] Daniel Michaels and Andy Pasztor, "Incidents Prompt New Scrutiny Of Airplane Software Glitches: As Programs Grow Complex, Bugs Are Hard to Detect; A Jet's Roller-Coaster Ride; Teaching Pilots to Get Control," Wall Street Journal, May 30, 2006.


7

Evolution of Helicopter Systems

bull Past systems historically federated ndash Distributed Functionality ndash Hardware basedndash Simple easier to test

bull Now systems are becoming integratedndash Combined Functionality ndash More Software Intensivendash Complex more difficult to test

Chief Warrant Office Jim Beaty (back row far left) and crew of the of the Vietnam UH-1 Flying Bikinis (friend of Alex Boydston)

UH-1 Cockpit (US Army)

Chinook CAAS Cockpit (US Army)

CH-47 Chinook (US Army)

Presenter
Presentation Notes
The US Army has been involved in flight since the early Wright-B flight of 1909 Initially fixed wing and rotary wing aircraft in early aviation remained simple and federated The basics to aviate navigate and communicate were handled by dedicated gauges compasses gyroscopes and mechanical linkages to fly the aircraft With the dawn of the Space Age in the 1960s the rise of more complex electronic control in aviation appeared with the ApolloSaturn projects NASA Dryden F-8 project and others As time progressed more integrated modular avionics (IMA) and digital fly by wire (DFBW) came into military and commercial aircraft to reduce weight increase functionality and provide redundancy With this advancement of complex flight control technology testing verification and validation became problematic To this day complex systems have been challenging to qualify for the Federal Aviation Administration (FAA) and military airworthiness commands13

8

Present Approach to Testing

bull Several disciplines weigh in such as software avionics GNampCenvironmental E3 electrical human factors structures aerodynamics etc

bull Current test methodology per older federated systemsndash Hardware Mil-Std 810 Mil-Std 461ndash Requirements Analysis (Traceability)

bull Test at different levelsndash Individual software module testingndash Black box testingndash System Integration testing

bull Hot benchbull System Integration Lab (SIL)

ndash Aircraft level Testing bull Groundbull Flight Aviation Flight Test Directorate (AFTD) Testing (US Army Photo)

Aviation System Integration Facility (ASIF) (US Army Photo)

Presenter
Presentation Notes
Airworthiness qualification is the demonstration of an aircraft or aircraft subsystem or component including modifications to function satisfactorily when used and maintained within prescribed limits [25] The US Army has traditionally qualified systems and components by physical testing analysis demonstration or by similarity Historically most systems were federated They were hardware based simple and distributed Now they have become more integrated They are more software intensive complex and have combined functionality contained in one or more computers With this evolution from simple to more complex the Army is finding it more difficult to execute an AWR As systems evolve to more complex systems of systems this problem is only growing worse13The current test approach to achieving confidence in systems for an AWR for the US Army is based more on traditional federated avionics systems Experienced personnel in Software Vehicle Management Systems Avionic Communications Navigation amp Guidance Control Electrical Environmental Human Factors Electrical and Electro-magnetic Effects (E3) Structures Thermal Safety Integration Testing and Flight Test personnel and test pilots all play important roles in accomplishing test and review of new and existing systems While some may not consider areas such as thermal or EEE important to software development they are crucial since the software is running on physical systems that are affected by heat and susceptibility to electromagnetic radiation which can cause abnormal operation or bit errors Current test methodology for hardware relies on MIL-STD-810 MIL-STD-461 and requirements analysis such as traceability MIL-STD-810 is the Department of Defense Test Method Standard for Environmental Engineering Considerations and Laboratory Tests1313It focuses on equipment environmental design and test limits to the conditions that it will experience throughout its service life and establishing chamber test methods that replicate the effects of environments on the equipment rather than imitating the environments themselves MIL-STD-461 includes the ldquoRequirements for the Control of Electromagnetic Interference Characteristics of Subsystems and Equipmentrdquo These test standards are geared for hardware and physical effects and do not address software operation Current methods for software testing include testing the individual software modules black box (or interface) level system integration level and aircraft level testing System integration testing can include hot bench system integration labs (SILs) (such as the shown in Figure 4) and aircraft ground level testing as conducted by the Aviation Flight Test Directorate (AFTD) of Redstone Test Center (RTC) (see Figure 5) 13Neglecting complete and good requirements promotes risk It is common practice by Program Managers (PMs) to accept risks As issues are found in systems it is on AED to issue an Army Safety Action Memorandum (ASAM) that identifies a deficiency to the field pilots Concurrently an Airworthiness Impact Statement (AWIS) is issued to the PMs which contain a probabilistic analysis of how the identified shortcoming will affect risk The AMCOM Safety Office will produce its own calculations The PMs can either accept or reject this assessment Regardless cautions and warnings are placed in the AWR to keep the entire program and flight crew aware of issues13

9

Development Challenges

bull Legacy Aircraft often upgraded in a piecemeal fashionndash Makes certification difficultndash Desire to increase to modern requirements based on size of upgrade and

what it includes ndash hard to scope

bull New system requirements must be clear complete and testable ndash Certification requirements must be obvious

bull Orchestrating agreement between stakeholders is necessary to mitigatendash Juggling of multiple software buildsndash End system that is difficult to test certify and deployndash Escalating Costsndash System Safety from being poorly understoodndash Design iterations

Presenter
Presentation Notes
It would be wonderful if all systems were straightforward in design easily testable and simple to write an Airworthiness Release (AWR) for however that is not the case Legacy aircraft such as the Chinook have been upgraded in a piecemeal fashion acquiring much needed improvements in aviation navigation and communication The general recommended system development V-curve as shown earlier is not always followed in a strict sense although it should be the goal process Negligence to the proper process makes the establishment of certification very difficult For new system development and existing system upgrades requirements must be clear complete and testable The certification requirements must be made obvious in the development of the requirements establishment phase with the goal of being fully identified during the requirements development Orchestrating agreement among all stakeholders (eg the program manager systems engineers human factors engineers integrators test engineers manufacturers users and certifiers) is necessary to mitigate problems such as13juggling multiple software builds13producing a difficult-to-test difficult-to-certify and difficult-to-deploy systems13misunderstanding system safety and13requiring design iterations that impact schedules and costs

10

Complexity Issues

bull System Development costs and schedule increase with complexityndash Existing lack of schedule and funding resources

bull Keeps systems from achieving full compliance with specifications and requirements

bull Garbage in -gt Garbage OuthellipPoor requirements -gt Poor Systemndash Finding problems in new designs at PDR is too latendash Difficult to correct existing poorly designed fielded complex systems

bull Complexity amp reliability of complex systems is not fully understoodndash How do we accurately assess operating risk performance reliability of

complex systems based on limited testing and analysisndash How do we know when system design is good enoughndash Latent defects occur in supposedly well-tested mature systems

bull Avionics parts and software change constantlyndash Spiral development -gt new softwarehardware qualification required frequently ndash How do we streamline the process (partition the system) so the need for

complete re-qualification after changes is lessened

Presenter
Presentation Notes
In an old computer development lesson it is well known that if you put ldquogarbage in you will get garbage outrdquo Likewise in a system if you have poor requirements then you will end up with a poor system Finding problems in new designs at Preliminary Design Review (PDR) is too late It has been shown that discovering issues at this stage in the game will cause a reiteration on the design which costs time and money Furthermore dealing with existing fielded complex systems that have not gone through the rigors of proper systems engineering results quite often with an overwhelming mess of functionality to test verify validate qualify and certifydesigns at PDR is too late It has been shown that discovering issues at this stage in the life-cycle will cause a reiteration on the design which costs time and money Furthermore dealing with existing fielded complex systems that have not gone through the rigors of proper systems engineering results quite often with a complicated conglomeration of functionality to test verify validate qualify certify and maintain This is indicative where complexity has exceeded our understanding in how to certify a system We still do not fully understand complexity and how to address reliability of complex systems How do we accurately assess operating risks performance and reliability of complex systems based on limited testing and analysis How do we know when a system design is good enough How do we modularize spirally developed systems to minimize the need for re-qualification of unchanging portions of the system We are 30 plus years into this technology and we still deal with systems with latent defects that are occurring in supposedly well-tested and mature systems To further exacerbate the problem we are now dealing with complex system of systems (ie cyber-physical systems)13It is a given that you can keep on adding redundancy and complexity to a problem to attain a desired level of reliability but at some point in time the reliability will taper off At best we sometimes must satisfy for an optimum point before digressing in reliability In the same vein system development costs and schedule increase with complexity too (see Figure 7 and 8)13Avionics parts and software constantly change over the life of a program Typically a spiral development program occurs with complex software development which means that qualification is required frequently This begs the question of how to streamline the process so that the need to conduct a complete requalification is avoided13With these complex systems there are other hurdles to cross such as fully characterizing and conducting the Functional Hazard Assessments (FHAs) and Failure Mode Effects and Criticality Analyses (FMECAs) It is crucial for the safety assessment that these are conducted correctly to fully understand the risks and later perform the correct tests at the right level Additionally once the complex component hardware and software are integrated then yet other problems appear It is at that time that the disparity of teams crossing multiple contractors and development groups becomes obvious and if not properly coordinated could cause impacts to the schedule Other programmatic problems affect complex system development and qualification For instance lack of schedule and funding resources causes a shortcoming to adequately provide for the proper compliance with specification and requirements short-circuiting the systems engineering process An ever decreasing availability of trained engineers to support the 
development and test of such systems exists Non-technical political influences sometimes affect the reliability of complex systems and are difficult to avoid Lastly there is a lack of a centralized database that captures the various families of systems that have been built along with their characterization of success and failures Such a database from all past and present government complex systems could be valuable in establishing reliability basis for future models

11

Complexity Issues (continued)

bull Functional Hazard Assessments and related documentation are crucial

ndash Understanding risks ndash Performing the correct tests at the right level Lab test vs Flight Test

bull Saves flight time and money

bull Systems Integration for complex systems is a schedule driver

bull Need experienced personnel to work complex systems

bull Need a centralized database - just doesnrsquot existndash Determine data needed for quantifying reliability of complex systemsndash Capture the pertinent data on previous or existing complex systemsndash Understand successes and failures of previous and present complex systemsndash Establish baseline reliability figures for known architectures

bull Complex System of Systems exacerbates problem

12

Reliability vs Complexity ampCost vs Complexity

Notional Graphs

bull Reliability vs Complexity bull Cost amp Schedule vs Complexity

Rel

iabi

lity

Complexity Complexity

Cos

t amp S

ched

uleOptimum

Aggregation of part reliability

feeds into overall system reliability

Desired

Presenter
Presentation Notes
Avionics parts and software constantly change over the life of a program Typically a spiral development program occurs with complex software development which means that qualification is required frequently This begs the question of how to streamline the process so that the need to conduct a complete re-qualification is avoided13With these complex systems there are other hurdles to cross such as fully characterizing and conducting the Functional Hazard Assessments (FHAs) and Failure Mode Effects and Criticality Analyses (FMECAs) It is crucial for the safety assessment that these are conducted correctly to fully understand the risks and later perform the correct tests at the right level13Additionally once the complex component hardware and software are integrated then yet other problems crop up It is at that time that the disparity of teams crossing multiple contractors and development groups becomes obvious and if not properly coordinated could cause impacts to the schedule 13Other programmatic problems affect complex system development and qualification For instance lack of schedule and funding resources causes a shortcoming to adequately provide for the proper compliance with specification and requirements short-circuiting the systems engineering process An ever decreasing availability of trained engineers to support the development and test of such systems exists Non-technical political influences sometimes affect the reliability of complex systems and are difficult to avoid Lastly there is a lack of a centralized database that captures the various families of systems that have been built along with their characterization of success and failures Such a database from all past and present government complex systems could be valuable in establishing reliability basis for future models

13

A Few Examples of Complex Systems

bull This is not a new problem Other have struggled with the challenges of establishing confidence in complex systems

ndash NASAbull Apollo Guidance Computerbull Dryden F8 Crusaderbull Space Shuttlebull International Space Station

ndash Commercial Airlinersbull Airbus A320 and higherbull Boeing B777 B787

ndash Militarybull Ships and Submarinesbull Jets (F14F15 F16 F18 F22 F35 etc)bull Cargo Planes (C130J Hercules C17 Globemaster etc)bull Helicopters (Chinook Blackhawk Sea Stallion etc)bull Rocketsbull Unmanned Aerial Systemsbull Unmanned Ground Systemsbull Unmanned Submarine Systems Photos by US Army NASA US Navy and US Air Force

Presenter
Presentation Notes
As mentioned earlier complex avionics systems are not a new idea Since the early 1960rsquos complex avionic architectures have existed beginning with the ApolloSaturn program Massachusetts Institute of Technology (MIT) Instrumentation Lab (IL) which is now Draper Laboratory and International Business Machines (IBM) led the way with the MITIL Apollo Guidance Computer (AGC) and the Saturn V IBM triple modular redundant (TMR) voting guidance computer system The word software was not even coined at the time but engineers such as Margaret Hamilton MITIL Director of Apollo On-board Software can attest to the fact that some the same issues with creating reliable software then still exists today [5] A large majority of the issues then dealt with the communication between systems engineers and the programmers Requirements were thrown over the wall without the confirmation that the requirements were complete and a lot of the issues cropped up as interface problems Identifying these issues prompted Hamilton to create her own company and create a modeling language called Universal Systems Language (USL) to head off the problems experienced with Apollo [11] Some 200 plus modeling programs have been developed since Apollo and used to mitigate issues and increase confidence in systems of varying complexity13As time progressed other systems came along The NASA Dryden F8 Crusader was the first digital fly by wire (DFBW) jet aircraft that relied heavily on complex IMA and software for flight control The Space Transportation System (STS) shuttle includes a Quad Modular Redundant (QMR) system with a fifth backup flight computer containing uncommon code US Air Force and Naval airplanes that have possessed complex or redundant IMA configurations include the F14 Tomcat F15 Eagle F16 Falcon F18 Hornet F22 Raptor F35 Joint Strike Fighter F117 Nighthawk V22 Osprey C17 Globemaster and many more along with recent Unmanned Air Vehicle Systems (UAVS) The US Army complex systems on helicopters include the13RAH-66 Comanche DFBW Triple Modular Redundant (TMR) architecture13glass cockpit avionics on the UH-60M Blackhawk baseline 13Common Avionics Architecture System (CAAS) glass cockpit on the UH-60M Blackhawk modernization and CH-47F Chinooks13and other aircraft13Additionally there are many self-checking pair engine controller systems along with system of system Future Combat Systems (FCS) and Unmanned Air Vehicle Systems (UAVS) This has also permeated the commercial airliner market with the Airbus 320 and higher Airbus models Boeing 777 and Boeing 787 aircraft With this ever increasing technology something must be done about the reliability issue With such a wealth of data on aviation and non-aviation cyber-physical systems such as submarine ship nuclear medical locomotive and automotive systems there should be adequate information to get a start on modeling systems correctly for reliability Therefore this is not an isolated problem to avionics and other disciplines should aide in resolving this problem13

14

Some Complex System Failures

bull V-22 Osprey crashesbull Mars Climate Orbiter crashbull Mars Pathfinder software resetbull USS Vincennes downing an Airbus 320bull Therac-25 software radiation treatment

failurebull 1989 Airbus A320 air show crashbull China Airlines Airbus Industries A300

crashbull Ariane 5 satellite launcher malfunctionbull Failure of the primary flight system to

sync with the backup during prelaunch of STS-1

bull Mexicana Airlines Boeing 727 airliner crashed into a mountain due to the software not correctly determining the mountain position

bull Loss of the first American probe to Venus

bull Korean Airlines KAL 901 accidentbull Soviet Phobos I Mars probe lost

bull Three Mile Islandbull F-18 fighter plane crash due to bad

exceptionbull F-14 fighter plane lost to

uncontrollable spinbull Swedish Gripen prototype crashedbull Swedish Gripen air-show crashbull F-22 failure crossing the IDLbull 2006 German-Spanish Barracuda UAVbull 2004 FA-22 Raptor stealth fighter jet

crash bull FA-22 Raptor navigation system

software error at Nellis AFBbull 50 cockpit blackouts on A320bull A320 multiple avionics and electrical

failures at Newark NJbull Boeing 777 Malaysian Airlines jetlinerrsquos

nightmarish autopilot rollercoaster ridebull 3000 feet US Army and Air Force UAV

Crashes

bull hellip And Many Morehellip

Presenter
Presentation Notes
Multiple crashes have occurred with the V-22 Osprey [41].
In 1999 the Mars Climate Orbiter crashed because of incorrect units in a program, caused by poor systems engineering practices [42, 44].
In 1988 an Iranian Airbus A300 was shot down by the USS Vincennes, in part because of cryptic and misleading output displayed by the tracking software [3].
In 1989 an Airbus A320 crashed at an air show due to altitude indication and software handling [3].
In 1994 a China Airlines Airbus Industrie A300 crashed, killing 264, from faulty software [3].
In 1996 the first Ariane 5 satellite launcher destruction mishap was caused by a software design error: a few lines of Ada code containing unprotected variables. The horizontal velocity of the Ariane 5 exceeded that of the Ariane 4, and the guidance system veered the rocket off course. Insufficient testing did not catch this error, which was a carry-over from Ariane 4 [3, 39].
In 1986 a Mexicana Airlines Boeing 727 airliner crashed into a mountain because the software did not correctly determine the mountain's position [39].
In 1986 the Therac-25 radiation therapy machines overdosed cancer patients due to a flaw in the computer program controlling the highly automated devices [3, 39, 45].
During the maiden Space Shuttle launch in 1981 (STS-1, Columbia), the primary flight control computer system failed to establish sync with the backup during prelaunch [43].
On December 10, 1990, the Space Shuttle Columbia had to land early due to computer software problems [39].
In 1997 the Mars Pathfinder suffered software resets due to latent task execution caused by priority inversion on a mutex [3, 44].
An error in a single FORTRAN statement resulted in the loss of the first American probe to Venus. From G. J. Myers, Software Reliability: Principles & Practice, p. 25 [3].
In September 1999 the Korean Airlines KAL 901 accident in Guam killed 225 out of 254 aboard; a worldwide bug was discovered in barometric altimetry in the Ground Proximity Warning System (GPWS). From ACM SIGSOFT Software Engineering Notes, vol. 23, no. 1 [3].
The Soviet Phobos I Mars probe was lost due to a faulty software update, at a cost of 300 million rubles: its disorientation broke the radio link, and the solar batteries discharged before reacquisition. From Aviation Week, 13 Feb 1989 [3].
An F-18 fighter plane crashed due to a missing exception condition. From ACM SIGSOFT Software Engineering Notes, vol. 6, no. 2 [3].
An F-14 fighter plane was lost to an uncontrollable spin traced to tactical software. From ACM SIGSOFT Software Engineering Notes, vol. 9, no. 5 [3].
In 1989 a Swedish Gripen prototype crashed due to software in its digital fly-by-wire system [3, 46].
In 1995 another Gripen fighter plane crashed during an air show, caused by a software issue [3, 46].
On February 11, 2007, twelve F/A-22 Raptors were forced to head back to Hawaii when a software bug caused a computer crash as they were crossing the International Date Line. In the middle of the ocean, their navigation, fuel, and part of their communications systems dumped, and all attempts to reboot failed [47].
In February 2006 the German-Spanish Unmanned Combat Air Vehicle Barracuda crashed due to software failure [4].
In December 2004 a glitch in the flight control software probably caused an F/A-22 Raptor stealth fighter jet to crash on takeoff at Nellis Air Force Base [4].
In 2008 a United Airbus A320, registration N462UA, experienced multiple avionics and electrical failures, including loss of all communications, shortly after rotation while departing Newark Liberty International Airport in Newark, New Jersey [NTSB Report Identification DCA08IA033].
In 2006 a Boeing 777 Malaysian Airlines jetliner's autopilot caused a stall by climbing 3,000 feet. The pilots struggled to nose the plane down and plunged into a steep dive; after pulling back up, the pilots regained control. The cause was defective flight software providing incorrect data for airspeed and acceleration, confusing the flight computers and initially ignoring the pilot's commands [49].
US Army and Air Force UAVs have crashed due to control system or human error.

15

Lessons Learned from Failures

• From Nancy Leveson's paper "The Role of Software in Spacecraft Accidents":
– "Flaws in the safety culture, diffusion of responsibility and authority
– Limited communication channels and poor information flow
– Inadequate system and software engineering
– Poor or missing specifications
– Unnecessary complexity and software functionality
– Software reuse or changes without appropriate safety analysis
– [Shortcomings] in safety engineering practices
– Flaws in test and simulation environments
– Inadequate human factors design for software"

Presenter
Presentation Notes
In Dr. Nancy Leveson's paper [36], "The Role of Software in Spacecraft Accidents", she cited problems with software development within NASA on various projects. According to Dr. Leveson, there were "flaws in the safety culture, diffusion of responsibility and authority, limited communication channels and poor information flow, inadequate system and software engineering, poor or missing specifications, unnecessary complexity and software functionality, software reuse or changes without appropriate safety analysis, violation of basic safety engineering practices, inadequate system safety engineering, flaws in test and simulation environments, and inadequate human factors design for software". While these problems were identified for spacecraft development within NASA and corrected, aviation in general could learn from these lessons to mitigate issues with complex systems development.

16

Some Current Guidelines

• DO-178B – Software Considerations in Airborne Systems and Equipment Certification
• DO-248B – Final Report for the Clarification of DO-178B
• DO-278 – Guidelines for Communications, Navigation, Surveillance and Air Traffic Management (CNS/ATM) Systems Software Integrity Assurance
• DO-254 – Design Assurance Guidance for Airborne Electronic Hardware
• DO-297 – Integrated Modular Avionics (IMA) Development Guidance and Certification Considerations
• SAE-ARP4754 – Certification Consideration for Highly Integrated or Complex Aircraft Systems
• SAE-ARP4761 – Guidelines and Methods for Conducting the Safety Assessment Process on Airborne Systems and Equipment
• FAA Advisory Circular AC 27-1B – Certification of Normal Category Rotorcraft
• FAA Advisory Circular AC 29-2C – Certification of Transport Category Rotorcraft
• ISO/IEC 12207 – Software Life Cycle Processes
• ARINC 653 – Specification Standard for Time and Space Partitioning
• MIL-STD-882D – DoD System Safety
• ADS-51-HDBK – Rotorcraft and Aircraft Qualification Handbook
• AR 70-62 – Airworthiness Release Standard
• SED-SES-PMHFSA 001 – Software Engineering Directorate (SED) Software Engineering Evaluation System (SEES) Program Manager Handbook for Flight Software Airworthiness
• SED-SES-PMHSS 001 – SED SEES Program Manager Handbook for Software Safety

WHAT'S MISSING - Reliability Standard for Complex Systems

Presenter
Presentation Notes
These previously stated problems drove the development of these guidelines; however, there is no standard for system reliability that includes software. Other standards and circulars pertain to complex systems, but a reliability standard is missing for complex systems, one that would outline the process for establishing cyber-physical system reliability. This standard should indicate how to model, analyze, and ascertain the projected level of reliability.

17

Certification Assessment Considerations

• Sufficient data and time must be available for airworthiness evaluation
• Certification process
– Currently lengthy
– Depends much on human interpretation, trade-offs and risk mitigation
– Overwhelming for complex integrated systems (FHAs, FTAs, FMECAs, risk mitigation, etc.)
• A consistent industry-wide method is needed to assess a system at any stage of the life-cycle to allow a tradeoff of design alternatives
• Certification tasks outlined in DO-297 should be considered:
– Task 1: Module acceptance
– Task 2: Application software/hardware acceptance
– Task 3: IMA system acceptance
– Task 4: Aircraft integration of IMA system, including V&V
– Task 5: Change of modules or applications
– Task 6: Reuse of modules or applications

Presenter
Presentation Notes
In order to execute an AWR, sufficient data must be provided to the appropriate airworthiness authorities to demonstrate that safety requirements are met to the levels per the System Safety Assessment (SSA) or equivalent. The certification process is currently lengthy and depends much on human interpretation of the myriad of complex architecture functions.
The current guidelines, such as DO-178B, DO-254, DO-297, SAE-ARP4754 and SAE-ARP4761, along with many other guidelines, outline the proper steps that should be taken. System safety management's military standard is MIL-STD-882, which has been in use for decades. Civilian safety standards for the aviation industry include SAE ARP4754, which shows the incorporation of system safety activities into the design process and provides guidance on techniques to ensure a safe design. SAE ARP4761 contains significant guidance on how to perform the system safety activities described in SAE ARP4754. DO-178B outlines development assurance guidance for aviation software based on the failure condition categorization of the functionality provided by the software; DO-254 embodies similar guidance for aviation hardware. ARINC 653 is a widely accepted standard to ensure time and space partitioning for software. DO-297 does an excellent job of describing the certification tasks for an IMA system, which include:
Task 1: Module acceptance
Task 2: Application software/hardware acceptance
Task 3: IMA system acceptance
Task 4: Aircraft integration of IMA systems, including verification and validation
Task 5: Change of modules or applications
Task 6: Reuse of modules or applications
Taken together, these standards provide guidance that, if followed, will likely result in safe, highly reliable and cost-effective systems over the life-cycle of the system. Yet while these guidelines exist, there is not a consistent industry-wide method to assess a system at any stage of the life-cycle to allow a tradeoff of design alternatives. Also, there is not a standard outlining overall reliability for a system that includes both hardware and software reliability. In order to achieve this level of reliability, a standard should be developed to define the process and method to achieve a quantifiable reliability number that would in turn lead to acceptance.

18

Definition of Complexity and Reliability is Needed

[Figure: Components 1-4, each characterized by complexity fundamentals and reliability parametrics (TRL 3 or 4), integrate into Subsystems 1 and 2, each a system integration of components with reliability dependencies (TRL 6 or 7); these integrate into the realized system with reliability sensitivities (TRL 8 or 9), yielding a highly reliable complex system and a certificate (e.g., AWR).]

Presenter
Presentation Notes
As mentioned in the introduction, the overall goal of this paper is to state the need for an industry standard method for qualifying complex integrated systems to a specified reliability. In order to achieve this, the systems must be broken down to component levels and built up to subsystem and system levels. An overall aggregated system reliability value should result. The goal should be to establish the ability to assess the reliability at the component, subsystem, and then system level, with each phase working toward a higher Technical Readiness Level (TRL). The end result would be fed into the accepted Type Certificate (TC) and/or AWR.
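A minimal sketch of what such an aggregation could look like, assuming statistically independent components arranged in series at each level; the component values and structure below are hypothetical, for illustration only:

```python
# Illustrative sketch only: rolling component reliabilities up to subsystem
# and system level, assuming independence and a pure series structure.
def series_reliability(parts):
    """Reliability of independent series elements is the product of their reliabilities."""
    result = 1.0
    for r in parts:
        result *= r
    return result

# Hypothetical per-mission component reliabilities (components 1-4).
subsystem_1 = series_reliability([0.9995, 0.9990])
subsystem_2 = series_reliability([0.9992, 0.9998])
system = series_reliability([subsystem_1, subsystem_2])
print(f"Aggregated system reliability: {system:.6f}")
```

Real architectures would add redundancy (parallel paths), dependencies between components, and software terms, which is precisely why a standard aggregation method is needed.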

19

Analytical Models and Reliability

• Analytical models of hardware reliability are well understood
• Architecture modeling and software reliability modeling are not novel ideas, but they are highly debated
– There are many approaches and little consensus as to the best way
– Many models (Jelinski-Moranda, Littlewood-Verrall, Musa-Okumoto, etc.) [1]
– Many tools (over 200 tools have been built since the 1970s) [2]
• Predictability of software reliability is of great concern because software is a major contributor to unreliability [2]
• Software reliability is the probability of error-free operation of a computer program in a specified environment for a specified time [1]
• A basis is needed for setting reliability figures based on previous systems, and for iteratively refining those figures in the future
• NOT A REPLACEMENT FOR TESTING AND VERIFICATION

Presenter
Presentation Notes
As mentioned in the introduction, the overall goal of this paper is to state the need for an industry standard method for qualifying complex integrated systems to a specified reliability. While not a total solution, better modeling practices should be considered as part of a solution to bridge the gap between design, test and implementation. A method to model the architecture early in the requirements establishment phase, follow the detailed design and coding, and then verify the system along with the model may be a path to greater confidence in the system, reducing the risks, warnings and cautions that must be issued.
In order to achieve this, the systems must be broken down to component levels and built up to subsystem and system levels; an overall aggregated system reliability value should result (see Figure 9). The goal should be to establish the ability to assess the reliability at the component, subsystem, and then system level, with each phase working toward a higher Technical Readiness Level (TRL). The end result would be fed into the accepted Type Certificate (TC) or AWR.
To achieve this goal, modeling and analysis tools that follow a standard for modeling reliability should exist. As previously stated, over 200 tools have been created since the 1970s. Here is a list of a few of the current tools:
Universal Systems Language (USL)
Unified Modeling Language (UML)
Systems Modeling Language (SysML)
MATLAB/Simulink
Telelogic Rhapsody
MathCAD
Colored Petri Nets
Rate Monotonic Analysis (RMA)
STATEMATE (used by Airbus)
Standard for the Development of Safety-Critical Embedded Software (SCADE)
OPNET
Embedded System Modeling Language (ESML)
Component Synthesis using Model-Integrated Computing (CoSMIC)
Architectural Analysis and Design Language (AADL)
By no means is this list complete. Typically, different companies and projects address this challenge by choosing their own tools to perform the upfront analysis and modeling, not following a standard approach. Multiple tools need to converge or be compatible with the framework of a common tool such as AADL. The resultant tool(s) could be used to verify the requirements up front to mitigate reliability issues down the life-cycle.
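For orientation, the textbook forms of two of the models named above are sketched here; these are standard formulations from the software reliability literature, not equations reproduced from this presentation:

```latex
% Jelinski-Moranda: N initial faults, per-fault hazard \phi; the failure
% rate after the (i-1)th fault has been removed is
\lambda_i = \phi \, (N - i + 1)

% Musa-Okumoto (logarithmic Poisson): expected cumulative failures by
% time t, with initial intensity \lambda_0 and decay parameter \theta
\mu(t) = \frac{1}{\theta} \ln\!\left(\lambda_0 \, \theta \, t + 1\right)
```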

20

Tools for Modeling and Analysis

• Universal Systems Language (USL)
• Unified Modeling Language (UML)
• Systems Modeling Language (SysML)
• MATLAB/Simulink
• Telelogic Rhapsody
• MathCAD
• Colored Petri Nets
• Rate Monotonic Analysis (RMA)
• STATEMATE (used by Airbus)
• SCADE
• OPNET
• Embedded System Modeling Language (ESML)
• Component Synthesis using Model-Integrated Computing (CoSMIC)
• Architectural Analysis and Design Language (AADL)
• At least 200 more packages since the 70's
• Certified tools need to converge to an accepted standard modeling/analysis method for complex system reliability

Presenter
Presentation Notes
Typically, different companies and projects address the challenge by choosing their own tools to perform the upfront analysis and modeling, not following a guideline approach. Multiple tools need to converge or be compatible with a set modeling standard for complex systems. The compliant tools could be used to verify the requirements up front to mitigate reliability issues down the life-cycle. A notional approach would follow that shown in Figure 8, where the system "V" is followed but architectural modeling and analysis proceed in parallel with the real development effort. This would allow reliability to be allocated during the design phase and measured during the implementation, test and verification phase using the model. The downside to modeling, in certain circles, is getting people to believe those models. How do you certify a modeling tool, and the actual models within the tool? Those issues should be addressed going forward.

21

Modification to Acquisition Model

[Figure: the development "V" - Requirements Establishment, High Level Design, Detailed Specifications, Implementation/Coding, Verification, Development Testing, Operational Testing & Validation, Deployed System - with Architectural Model & Analysis running in parallel across all phases; reliability is allocated on the design side and measured on the test side.]

Proposed: a standard modeling methodology to be applied at different phases of development to enhance requirements development, reliability allocation, reliability measurement and testing (DISCLAIMER: DOES NOT REPLACE TESTING).

Presenter
Presentation Notes
A notional approach would follow that shown in Figure 10, where the system "V" is followed but architectural modeling and analysis proceed in parallel with the real development effort. This would allow reliability to be allocated during the design phase and measured during the implementation, test and verification phase using the model. This approach could bridge the design and test phases together. It is emphasized here that the architectural model would not replace critical testing but augments the process to allow for better requirement identification and verification. Thorough ground and flight tests should never be replaced by modeling; modeling would only allow for more robust requirements and a higher level of confidence in the design. The model could be used in conjunction with the testing to confirm the design. Proper modeling and analysis would reduce total program costs by enforcing more complete and correct initial requirements, which reduces issues discovered later in testing that are expensive or impossible to fix and that force the acceptance of high risks. Additionally, if the model is maintained and optimized, then it could possibly be used after system deployment to analyze the impacts of upgrades or changes to the system, allowing for more complete analysis and reducing overall system redesign costs.
A hurdle to cross with modeling and analysis is convincing people to believe those models. Some method to certify these models and modeling tools should be addressed in the future, and standards should be set in place for correct modeling techniques for complex systems. Lastly, consideration should be given to standard verification checking tools, such as the Motor Industry Software Reliability Association (MISRA) compliance verification tools for the use of C in safety-critical systems.
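As one hedged illustration of the "reliability allocated / reliability measured" loop in the figure, a simple equal-apportionment scheme could allocate a system target across series subsystems at design time and check measured values against the allocation at test time; all numbers below are hypothetical:

```python
# Illustrative sketch: equal apportionment of a system reliability target
# across n independent series subsystems, then a check against test results.
def allocate_equal(system_target: float, n: int) -> float:
    """Each of n series subsystems receives the n-th root of the system target."""
    return system_target ** (1.0 / n)

target = 0.999999                 # hypothetical system-level target
per_subsystem = allocate_equal(target, 4)

measured = [0.99999996, 0.99999991, 0.99999988, 0.99999993]  # hypothetical test data
shortfalls = [i for i, m in enumerate(measured, start=1) if m < per_subsystem]
print(f"Allocation per subsystem: {per_subsystem:.9f}")
print("Shortfalls in subsystems:", shortfalls or "none")
```

More refined allocation schemes weight subsystems by complexity or criticality, which is exactly the kind of choice a standard would need to prescribe.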

22

Systems Reliability Standard Establishment

• Establish a working group to define this standard
– Need a technical society to lead the charge on this
• Collaborate with industry, academia, military and societies
– Focus on development of a reliability standard with AWR safety in mind
– Draw upon their experiences to feed into this standard
• Study existing and previous complex systems
– Shuttle, Space Station, missile systems, nuclear submarine and ship systems, nuclear control systems, military and commercial jet systems
– Obtain software reliability information from existing and previous systems
– Build a database which would serve as a basis for future reliability figures
• Research prior efforts in complex systems analysis
• Establish a consensus-based modeling and analysis method

Presenter
Presentation Notes
In conclusion, methods for achieving a design for complex systems do exist; however, achieving reliability and attaining a level of qualification that would permit better AWRs does not. There needs to be a standard developed to tackle this issue rather than relying on current methods to ascertain airworthiness of complex systems. This will not occur overnight. An orchestrated collaboration among industry, academia, military labs and technical professional societies to focus on development of this standard should allow us to draw upon their experiences to feed into this reliability standard with AWR safety in mind. We have a long-living experiment with complex software systems in the Space Transportation System (STS), International Space Station (ISS), missile systems, nuclear submarine and ship systems, nuclear control systems, and military and commercial jet systems, from which we should be able to obtain at least inferred software reliability information from the architectures and run time that the systems use. We should look at the lessons learned from these systems to see what could have been improved and what was done right that should be carried forward. The challenge is collecting this information into a central database and arriving at some figure of reliability from previous systems where data exists. This would at least provide a starting point to allow initial assessments and could be optimized in the future. Also, this is not the only study for establishing reliability metrics for complex software systems; there have been similar research projects that have risen and fallen. The data from those projects should not be wasted but studied to feed into whatever standard is developed. While historical information would be useful, each design is unique and requires tools to accomplish the design. Architectural modeling constructs should be further investigated as a possible augmentation to the design and test process. We need to determine which forum is best to conduct this effort (e.g., SAE, IEEE, AIAA, ACM, AHS, INCOSE or other). As stated in the paper "Space Shuttle Avionics" [31], "The designers, the flight crew and other operational users of the system often have a mindset established in a previous program or experience which results in a bias against new or different 'unconventional' approaches." If nothing is done to address this problem, it will only get worse over time. It is past time to address the issue of reliability of complex systems and software.

23

BACKUP SLIDES


24

PEO Aviation System Safety Management Decision Authority Matrix

[Matrix: columns are the severity of the most credible outcome (1 Catastrophic, 2 Critical, 3 Marginal, 4 Negligible); rows are the probability levels A-E below; each cell assigns the risk acceptance decision authority: Army Acquisition, PEO Aviation, or Program Management.]

Hazard Category – Description:
1 Catastrophic – Death or permanent total disability; system loss
2 Critical – Severe injury or minor occupational illness (no permanent effect); minor system or environmental damage
3 Marginal – Minor injury or minor occupational illness (no permanent effect); minor system or environmental damage
4 Negligible – Less than minor injury or occupational illness (no lost workdays), or less than minor environmental damage

Risk Level – Description – Probability (frequency per 100,000 flight hours):
A – Frequent – > 100 (P > 1E-3)
B – Probable – <= 100 and > 10 (1E-4 < P <= 1E-3)
C – Occasional – <= 10 and > 1 (1E-5 < P <= 1E-4)
D – Remote – <= 1 and > 0.1 (1E-6 < P <= 1E-5)
E – Improbable – <= 0.1 and > 0.01 (1E-7 < P <= 1E-6)

Presenter
Presentation Notes
As already mentioned, it is the goal of the US Army Aviation Engineering Directorate that the system is safe and reliable to operate and will perform the mission when delivered. Another goal is that the released system continues to safely perform the mission if maintained and operated per the operator's manual. Replacement parts and overhaul work must be of high quality to support continued airworthiness. Per the Program Element Office Memorandum 08-03 Risk Matrix, US Army flight control systems are to achieve 1E-9 reliability for flight-critical functions per civil airspace regulations [35] and 1E-6 reliability for tactical use [36]. Quantifying these numbers is established for component hardware but not for software. Just as hardware should have quantifiable reliability, so should software.
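The probability bands in the matrix map directly onto the per-100,000-flight-hour frequencies; a worked conversion of the level A threshold (our illustration, not from the memorandum):

```latex
% Risk level A (Frequent): more than 100 occurrences per 100,000 flight hours
P = \frac{\text{occurrences}}{\text{flight hours}}
  > \frac{100}{10^{5}\ \text{FH}} = 10^{-3}\ \text{per flight hour}
```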

25

Reliability Defined

• Software Reliability – the probability that a given piece of software will execute without failure in a given environment for a given time [40]
– Often debated as to how to measure
• Hardware Reliability – the probability that a hardware component operates without failure over time
– Well defined and established
• System Reliability – the probability of success, or the probability that the system will perform its intended function under specified design limits [over a given amount of time] [39]
– A combination of software and hardware reliability

Presenter
Presentation Notes
"Reliability is defined as the probability of success, or the probability that the system will perform its intended function under specified design limits" [39]. "Software reliability is defined as the probability that a given piece of software will execute without failure in a given environment for a given time" [40]. Systems rely on both, and thus must have a combination of the two to formulate an overall reliability.
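One common way to formalize the combination, under the simplifying assumption that hardware and software failure processes are independent, is the product rule (a sketch, not a formula from the source):

```latex
% Reliability as a survival probability over the interval [0, t]:
R(t) = \Pr\{\text{no failure in } [0, t]\}

% Combining hardware and software, assuming independent failure processes:
R_{\mathrm{sys}}(t) = R_{\mathrm{hw}}(t) \cdot R_{\mathrm{sw}}(t)
```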

26

Hardware vs Software Reliability

Hardware Reliability vs. Software Reliability:

• Hardware: Failure rate follows a bathtub curve; the burn-in state is similar to the software debugging state. Software: Without considering program evolution, the failure rate is statistically non-increasing.
• Hardware: Material deterioration can cause failures even though the system is not used. Software: Failures never occur if the software is not used.
• Hardware: Failure data are fitted to some distributions; the selection of the underlying distribution is based on the analysis of failure data and experience, and emphasis is placed on analyzing failure data. Software: Most models are analytically derived from assumptions; emphasis is on developing the model, the interpretation of the model assumptions, and the physical meaning of the parameters.
• Hardware: Failures are caused by material deterioration, design errors, misuse and environment. Software: Failures are caused by incorrect logic, incorrect statements or incorrect input data.
• Hardware: Can be improved by better design, better materials, applying redundancy and accelerated life-cycle testing. Software: Can be improved by increasing testing effort and correcting discovered faults; reliability tends to change continuously during testing due to the addition of problems in new code or the removal of problems by debugging.
• Hardware repairs restore the original condition. Software repairs establish a new piece of software.
• Hardware failures are usually preceded by warnings. Software failures are rarely preceded by warnings.
• Hardware components can be standardized. Software components have rarely been standardized.
• Hardware can usually be tested exhaustively. Software essentially requires infinite testing for completeness.

Reference: [39] Hoang Pham, "Software Reliability", Springer, 2000.

Presenter
Presentation Notes
"Reliability is defined as the probability of success, or the probability that the system will perform its intended function under specified design limits" [39]. "Software reliability is defined as the probability that a given piece of software will execute without failure in a given environment for a given time" [40]. Hoang Pham compares software versus hardware reliability as shown in Table 2 [39]. Systems rely on both, and thus must have a combination of the two to formulate an overall reliability.
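The first row of the table (bathtub curve versus non-increasing failure rate) can be made concrete with a small numeric sketch; the hazard shapes and parameters below are illustrative, not drawn from Pham:

```python
import math

def hardware_hazard(t: float) -> float:
    """Bathtub shape: decaying infant mortality + constant base rate + wear-out term."""
    return 0.01 * math.exp(-t / 100.0) + 1e-4 + 1e-7 * t

def software_hazard(t: float, lam0: float = 0.01, theta: float = 0.05) -> float:
    """Musa-Okumoto failure intensity: non-increasing as faults are found and fixed."""
    return lam0 / (lam0 * theta * t + 1.0)

# Hardware hazard falls, flattens, then rises again; software hazard only falls.
for t in (0.0, 500.0, 5000.0):
    print(f"t={t:7.0f}  hw={hardware_hazard(t):.6f}  sw={software_hazard(t):.6f}")
```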

27

Acronym List

AADL – Architectural Analysis and Design Language
AC – Advisory Circular (FAA)
ACM – Association of Computing Machinery
AED – Aviation Engineering Directorate (AMRDEC)
AFTD – Aviation Flight Test Directorate (US Army)
AGC – Apollo Guidance Computer
AHS – American Helicopter Society
AIAA – American Institute of Aeronautics and Astronautics (Inc.)
AMCOM – Aviation and Missile Command (US Army)
AMRDEC – Aviation and Missiles Research, Development and Engineering Center (US Army)
AR – Army Regulation
ARINC – Aeronautical Radio, Inc.
ARP – Aerospace Recommended Practice
ASIF – Avionics Software Integration Facility
ATAM – Architecture Tradeoff Analysis Method
ATM – Air Traffic Management
AWR – Airworthiness Release
CAAS – Common Avionics Architecture System
CH-47 – Cargo Helicopter, Chinook
CMM – Capability Maturity Model
CMMI – Capability Maturity Model Index
CMU – Carnegie Mellon University
CNS – Communications, Navigation, Surveillance
CoSMIC – Component Synthesis using Model-Integrated Computing
CPS – Cyber-Physical System
CRC – Chemical Rubber Company (i.e., CRC Press)
DFBW – Digital Fly-By-Wire
DoD – Department of Defense
E3 – Electrical and Electromagnetic Effects
ESML – Embedded System Modeling Language
FAA – Federal Aviation Administration
FCS – Future Combat Systems
FHA – Functional Hazard Assessment
FMEA – Failure Modes Effects Analysis

28

Acronym List (concluded)

GPWS – Ground Proximity Warning System
IBM – International Business Machines
IEC – International Engineering Consortium
IL – Instrumentation Lab (now Draper Laboratory)
IMA – Integrated Modular Avionics
INCOSE – International Council On Systems Engineering
ISO – International Organization for Standardization
ISS – International Space Station
KAL – Korean Airlines
MISRA – Motor Industry Software Reliability Association
MIT – Massachusetts Institute of Technology
NASA – National Aeronautics and Space Administration (USA)
PDR – Preliminary Design Review
PEO – Program Element Office
PNAS – Proceedings of the National Academy of Sciences
RAQ – Rotorcraft and Aircraft Qualification
RMA – Rate Monotonic Analysis
RTC – Redstone Test Center (US Army)
RTTC – Redstone Technical Test Center (US Army)
RTCA – Radio Technical Commission for Aeronautics
SAE – Society of Automotive Engineers
SED – Software Engineering Directorate (AMRDEC)
SEES – Software Engineering Evaluation System
SEI – Software Engineering Institute (CMU)
SIL – System Integration Laboratory
SSA – System Safety Assessment
STS – Space Transportation System
SysML – Systems Modeling Language
TMR – Triple Modular Redundant
TRL – Technical Readiness Level
UAS – Unmanned Aircraft System
UH-60 – Utility Helicopter, Blackhawk
UML – Unified Modeling Language
US – United States
USL – Universal Systems Language

29

References

• [1] Israel Koren and Mani Krishna, "Fault-Tolerant Systems", Morgan Kaufmann, 2007.
• [2] Jianto Pan, "Software Reliability", Carnegie Mellon University, Spring 1999.
• [3] Nachum Dershowitz, http://www.cs.tau.ac.il/~nachumd/horror.html
• [4] http://www.air-attack.com/
• [5] David A. Mindell, "Digital Apollo: Human and Machine in Spaceflight", The MIT Press, 2008.
• [6] Software Engineering Directorate, Software Engineering Evaluation System (SEES), "Program Manager Handbook for Flight Software Airworthiness", SED-SES-PMHFSA 001, December 2003.
• [7] Software Engineering Directorate, Software Engineering Evaluation System (SEES), "Program Manager Handbook for Software Safety", SED-SES-PMHSSA 001, February 2006.
• [8] AMCOM Regulation 385-17, "Software System Safety Policy", 15 March 2008.
• [9] "NASA Software Safety Guidebook", NASA-GB-8719.13, 31 March 2004.
• [10] Margaret Hamilton and William Hackler, "Universal Systems Language: Lessons Learned from Apollo", IEEE Computer Society, 2008.
• [11] Margaret Hamilton, "Full Life Cycle Systems Engineering and Software Development Environment: Development Before The Fact In Action", http://www.htius.com/Articles/Full_Life_Cycle.htm
• [12] Peter Feiler, David Gluch, John Hudak, "The Architecture Analysis & Design Language (AADL): An Introduction", CMU/SEI-2006-TN-011, February 2006.
• [13] Peter Feiler, John Hudak, "Developing AADL Models for Control Systems: A Practitioner's Guide", CMU/SEI-2007-TR-014, July 2007.
• [14] Bruce Lewis, "Using the Architecture Analysis and Design Language for System Verification and Validation", SEI presentation, 2006.
• [15] Feiler, Gluch, Hudak, Lewis, "Embedded System Architecture Analysis Using SAE AADL", CMU/SEI-2004-TN-004, June 2004.
• [16] Charles Pecheur, Stacy Nelson, "V&V of Advanced Systems at NASA", NASA/CR-2002-211402, April 2002.
• [17] Systems Integration Requirements Task Group, "ARP 4754: Certification Considerations for Highly-Integrated or Complex Aircraft Systems", SAE Aerospace, 10 April 1996.
• [18] SAE, "ARP 4761: Guidelines and Methods for Conducting the Safety Assessment Process on Civil Airborne Systems and Equipment", December 1996.
• [19] Aeronautical Radio, Inc. (ARINC), "ARINC Specification 653P1-2: Avionics Application Software Standard Interface, Part 1 – Required Services", 7 March 2006.

30

References (Continued)

• [20] Department of Defense, "MIL-STD-882D, Standard Practice for System Safety", 19 January 1993.
• [21] RTCA, Inc., "DO-178: Software Considerations in Airborne Systems and Equipment Certification", 1 December 1992.
• [22] RTCA, Inc., "DO-254: Design Assurance Guidance for Airborne Electronic Hardware", 19 April 2000.
• [23] US Army, "Aeronautical Design Standard Handbook: Rotorcraft and Aircraft Qualification (RAQ) Handbook", 21 October 1996.
• [24] Cary R. Spitzer (editor), "Avionics: Elements, Software and Functions", CRC Press, 2007.
• [25] US Army, "Army Regulation 70-62: Airworthiness Qualification of Aircraft Systems", 21 May 2007.
• [26] US Army, "Army Regulation 95-1: Aviation Flight Regulations", 3 February 2006.
• [27] Barbacci, Clements, Lattanze, Northrop, Wood, "Using the Architecture Tradeoff Analysis Method (ATAM) to Evaluate the Software Architecture for a Product Line of Avionics Systems: A Case Study", CMU/SEI-2003-TN-012, July 2003.
• [28] Peter Feiler, "All in the Family: CAAS & AADL", CMU/SEI-2008-SR-021, August 2008.
• [29] Chrissis, Konrad, Shrum, "CMMI: Guidelines for Process Integration and Product Improvement", Pearson Education, 2007.
• [30] Brendan O'Connell, "Model Driven Performance Analysis for Avionics Systems", Draper Laboratory, January 2006.
• [31] John F. Hanaway, Robert W. Moorehead, "Space Shuttle Avionics Systems", NASA SP-504, 1989.
• [32] Lui Sha, "The Complexity Challenge in Modern Avionics Software", August 14, 2006.
• [33] "Incidents Prompt New Scrutiny of Airplane Software Glitches", Wall Street Journal, 30 May 2006.
• [34] Eyal Ophir, Clifford Nass and Anthony Wagner, "Cognitive Control in Media Multitaskers", PNAS, 20 July 2009.
• [35] Federal Aviation Administration, "Advisory Circular AC 25.1309-1A: System Design and Analysis", 21 June 1988.
• [36] Program Element Office Policy Memorandum 08-03.
• [38] National Science Foundation webpage on Cyber-Physical Systems, http://www.nsf.gov/pubs/2008/nsf08611/nsf08611.htm
• [39] Hoang Pham, "Software Reliability", Springer, 2000.
• [40] Paul Rook (editor), "Software Reliability Handbook", Elsevier Science Publishers Ltd., 1990.

31

References (Concluded)

• [41] http://www.navair.navy.mil/v22/index.cfm?fuseaction=news.detail&id=128
• [42] http://mars8.jpl.nasa.gov/msp98/news/mco990930.html
• [43] John Garman, "The Bug Heard 'Round the World", ACM SIGSOFT, October 1981.
• [44] "Mars Pathfinder Mission Status", July 15, 1997, http://marsprogram.jpl.nasa.gov/MPF/newspio/mpfstatus/pf970715.html
• [45] Nancy Leveson, "Safeware: System Safety and Computers", Addison-Wesley Publishing Company, 1995.
• [46] http://www.electronicaviation.com/aircraft/JAS-39_Gripen/810
• [47] Brandon Hill, "Lockheed's F-22 Raptor Gets Zapped by International Date Line", DailyTech LLC, February 26, 2007, http://www.freerepublic.com/focus/f-news/1791574/posts
• [48] http://www.military.com/news/article/human-error-cited-in-most-uav-crashes.html
• [49] Daniel Michaels and Andy Pasztor, "Incidents Prompt New Scrutiny Of Airplane Software Glitches: As Programs Grow Complex, Bugs Are Hard to Detect; A Jet's Roller-Coaster Ride; Teaching Pilots to Get Control", Wall Street Journal, May 30, 2006.

15

Lessons Learned from Failures

• From Nancy Leveson's paper "The Role of Software in Spacecraft Accidents":
  – "Flaws in the safety culture, diffusion of responsibility and authority
  – Limited communication channels and poor information flow
  – Inadequate system and software engineering
  – Poor or missing specifications
  – Unnecessary complexity and software functionality
  – Software reuse or changes without appropriate safety analysis
  – [Shortcomings] in safety engineering practices
  – Flaws in test and simulation environments
  – Inadequate human factors design for software"

Presenter
Presentation Notes
In Dr. Nancy Leveson's paper [36], "The Role of Software in Spacecraft Accidents," she cited problems with software development within NASA on various projects. According to Dr. Leveson, there were "flaws in the safety culture, diffusion of responsibility and authority, limited communication channels and poor information flow, inadequate system and software engineering, poor or missing specifications, unnecessary complexity and software functionality, software reuse or changes without appropriate safety analysis, violation of basic safety engineering practices, inadequate system safety engineering, flaws in test and simulation environments, and inadequate human factors design for software." While these problems were identified for spacecraft development within NASA and corrected, aviation in general could learn from these lessons to mitigate issues with complex systems development.

16

Some Current Guidelines

• DO-178B - Software Considerations in Airborne Systems and Equipment Certification
• DO-248B - Final Report for the Clarification of DO-178B
• DO-278 - Guidelines for Communications, Navigation, Surveillance and Air Traffic Management (CNS/ATM) Systems Software Integrity Assurance
• DO-254 - Design Assurance Guidance for Airborne Electronic Hardware
• DO-297 - Integrated Modular Avionics (IMA) Development Guidance and Certification Considerations
• SAE-ARP4754 - Certification Considerations for Highly Integrated or Complex Aircraft Systems
• SAE-ARP4761 - Guidelines and Methods for Conducting the Safety Assessment Process on Airborne Systems and Equipment
• FAA Advisory Circular AC27-1B - Certification of Normal Category Rotorcraft
• FAA Advisory Circular AC29-2C - Certification of Transport Category Rotorcraft
• ISO/IEC 12207 - Software Life Cycle Processes
• ARINC 653 - Specification Standard for Time and Space Partitioning
• MIL-STD-882D - DoD System Safety
• ADS-51-HDBK - Rotorcraft and Aircraft Qualification Handbook
• AR-70-62 - Airworthiness Release Standard
• SED-SES-PMHFSA 001 - Software Engineering Directorate (SED) Software Engineering Evaluation System (SEES) Program Manager Handbook for Flight Software Airworthiness
• SED-SES-PMHSS 001 - SED SEES Program Manager Handbook for Software Safety

WHAT'S MISSING: a Reliability Standard for Complex Systems

Presenter
Presentation Notes
The problems previously stated drove the development of these guidelines; however, there is no standard for system reliability that includes software. There are other standards and circulars that pertain to complex systems, but a reliability standard is missing for complex systems, one which would outline the process for establishing cyber-physical system reliability. This standard should indicate how to model, analyze, and ascertain the projected level of reliability.
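Of the guidelines above, ARINC 653's time partitioning lends itself to a small illustration. The sketch below is a notional Python model of the concept only (partition names and window lengths are invented; this is not the ARINC 653 APEX API): each partition owns a fixed window inside a repeating major frame, so a misbehaving application cannot consume another partition's time.

```python
# Notional model of ARINC 653-style time partitioning: a repeating major
# frame gives each partition a fixed window, regardless of demand.

MAJOR_FRAME_MS = 50
SCHEDULE = [                      # (partition, window in ms) -- hypothetical
    ("flight_controls", 20),
    ("navigation", 15),
    ("displays", 15),
]
assert sum(ms for _, ms in SCHEDULE) == MAJOR_FRAME_MS

def owner_of(t_ms: float) -> str:
    """Return which partition owns the CPU at absolute time t_ms."""
    offset = t_ms % MAJOR_FRAME_MS
    for name, length in SCHEDULE:
        if offset < length:
            return name
        offset -= length
    raise AssertionError("unreachable: the schedule covers the major frame")

# Even if 'displays' wants 40 ms of work per frame, it only ever executes
# inside its own 15 ms window; the other partitions' windows are untouched.
print(owner_of(0))    # flight_controls
print(owner_of(22))   # navigation
print(owner_of(37))   # displays
print(owner_of(57))   # flight_controls (second major frame)
```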

17

Certification Assessment Considerations

• Sufficient data and time must be available for airworthiness evaluation

• Certification process
  – Currently lengthy
  – Depends heavily on human interpretation, trade-offs, and risk mitigation
  – Overwhelming for complex integrated systems (FHAs, FTAs, FMECAs, risk mitigation, etc.)

• A consistent industry-wide method is needed to assess a system at any stage of the life-cycle to allow a tradeoff of design alternatives

• Certification tasks outlined in DO-297 should be considered
  – Task 1: Module acceptance
  – Task 2: Application software/hardware acceptance
  – Task 3: IMA system acceptance
  – Task 4: Aircraft integration of IMA system, including V&V
  – Task 5: Change of modules or applications
  – Task 6: Reuse of modules or applications

Presenter
Presentation Notes
In order to execute an AWR, sufficient data must be provided to the appropriate airworthiness authorities to demonstrate that safety requirements are met to the levels per the System Safety Assessment (SSA) or equivalent. The certification process is currently lengthy and depends on much human interpretation of the myriad of complex architecture functions.
The current guidelines, such as DO-178B, DO-254, DO-297, SAE-ARP-4754, and SAE-ARP-4761, along with many others, outline the proper steps that should be taken. System safety management's military standard is MIL-STD-882, which has been in use for decades. Civilian safety standards for the aviation industry include SAE ARP4754, which shows the incorporation of system safety activities into the design process and provides guidance on techniques to ensure a safe design. SAE ARP4761 contains significant guidance on how to perform the system safety activities spoken about in SAE ARP4754. DO-178B outlines development assurance guidance for aviation software based on the failure condition categorization of the functionality provided by the software. DO-254 embodies similar guidance for aviation hardware. ARINC 653 is a widely accepted standard to ensure time and space partitioning for software. DO-297 does an excellent job of describing the certification tasks to take for an IMA system, which include:
Task 1: Module acceptance
Task 2: Application software/hardware acceptance
Task 3: IMA system acceptance
Task 4: Aircraft integration of IMA systems, including verification and validation
Task 5: Change of modules or applications
Task 6: Reuse of modules or applications
Taken together, these standards provide guidance that, if followed, likely will result in safe, highly reliable, and cost-effective systems over the life-cycle of the system. Yet while these guidelines exist, there is not a consistent industry-wide method to assess a system at any stage of the life-cycle to allow a tradeoff of design alternatives. Also, there is not a standard outlining overall reliability for a system to include hardware and software reliability. In order to achieve this level of reliability, a standard should be developed to define the process and method to achieve a quantifiable reliability number that would in turn lead to acceptance.

18

Definition of Complexity and Reliability is Needed

[Figure: components 1-4, each characterized by complexity fundamentals and reliability parametrics, integrate into subsystems 1 and 2 (system integration of components, reliability dependencies), which in turn integrate into the realized system (reliability sensitivities). Maturity progresses from TRL 3 or 4 at the component level, through TRL 6 or 7 at the subsystem level, to TRL 8 or 9 for the integrated system, yielding a highly reliable complex system and a certificate (e.g., AWR).]

Presenter
Presentation Notes
As mentioned in the introduction, the overall goal of this paper is to state the need for an industry standard method for qualifying complex integrated systems to a specified reliability. In order to achieve this, the systems must be broken down to component levels and built up to subsystem and system levels. An overall aggregated system reliability value should result. The goal should be to establish the ability to assess the reliability from a component, subsystem, and then a system level, with each phase working toward a higher Technical Readiness Level (TRL). The end result would be fed into the accepted Type Certificate (TC) and/or AWR.
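As a concrete illustration of the component-to-subsystem-to-system roll-up in the figure, here is a minimal sketch assuming independent failures and series composition (any component failure fails its subsystem; any subsystem failure fails the system). The numeric values are invented placeholders, not figures from this paper.

```python
# Roll-up of an aggregated system reliability value from component
# reliabilities, assuming independence and series composition.
from math import prod

# Hypothetical per-mission component reliabilities (probability of success).
subsystem_1 = {"component_1": 0.9990, "component_2": 0.9985}
subsystem_2 = {"component_3": 0.9995, "component_4": 0.9970}

def series_reliability(parts: dict) -> float:
    """Series composition: every part must work for the block to work."""
    return prod(parts.values())

r_sub1 = series_reliability(subsystem_1)      # ~0.997502
r_sub2 = series_reliability(subsystem_2)      # ~0.996502
r_system = r_sub1 * r_sub2                    # ~0.994012

print(f"subsystem 1: {r_sub1:.6f}")
print(f"subsystem 2: {r_sub2:.6f}")
print(f"system:      {r_system:.6f}")
```

Redundant (parallel) components would instead combine as 1 - (1 - r)^n, which is how added redundancy buys back reliability at the cost of complexity.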

19

Analytical Models and Reliability

• Analytical models of hardware reliability are well understood

• Architecture modeling and software reliability modeling are not novel ideas, but they are highly debated
  – There are many approaches and little consensus as to the best way
  – Many models (Jelinski-Moranda, Littlewood-Verrall, Musa-Okumoto, etc.) [1]
  – Many tools (over 200 tools have been built since the 1970s) [2]

• Predictability of software reliability is of great concern because software is a major contributor to unreliability [2]

• Software reliability is the probability of error-free operation of a computer program in a specified environment for a specified time [1]

• Need a basis for setting reliability figures based on previous systems, and iteratively refine those figures in the future

• NOT A REPLACEMENT FOR TESTING AND VERIFICATION

Presenter
Presentation Notes
As mentioned in the introduction, the overall goal of this paper is to state the need for an industry standard method for qualifying complex integrated systems to a specified reliability. While not suggesting a total solution, better modeling practices should be considered toward a solution to bridge the gap between design, test, and implementation. A method to model the architecture early in the requirements establishment phase, follow the detailed design and coding, and then verify the system along with the model may be a path to greater confidence in the system and reduce the risks, warnings, and cautions that must be issued.
In order to achieve this, the systems must be broken down to component levels and built up to subsystem and system levels. An overall aggregated system reliability value should result (see Figure 9). The goal should be to establish the ability to assess the reliability from a component, subsystem, and then a system level, with each phase working toward a higher Technical Readiness Level (TRL). The end result would be fed into the accepted Type Certificate (TC) or AWR.
To achieve this goal, modeling and analysis tools that follow a standard for modeling reliability should exist. As previously stated, over 200 tools have been created since the 1970s. Here is a list of a few of the current tools:
Universal Systems Language (USL)
Unified Modeling Language (UML)
Systems Modeling Language (SysML)
MATLAB/Simulink
Telelogic Rhapsody
MathCAD
Colored Petri Nets
Rate Monotonic Analysis (RMA)
STATEMATE (used by Airbus)
Standard for the Development of Safety-Critical Embedded Software (SCADE)
OPNET
Embedded System Modeling Language (ESML)
Component Synthesis using Model-Integrated Computing (CoSMIC)
Architectural Analysis and Design Language (AADL)
By no means is this list complete. Typically, different companies and projects address this challenge and choose their own tools to perform the upfront analysis and modeling, not following a standard approach. Multiple tools need to converge or be compatible with the framework of a common tool, such as AADL. The resultant tool(s) could be used to verify the requirements up front to mitigate reliability issues down the life-cycle.
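As a worked illustration of the models named above (not part of the original paper), the sketch below evaluates the Musa-Okumoto logarithmic Poisson model. For a nonhomogeneous Poisson process model, the probability of failure-free operation over the next tau hours, given operation up to time t, is exp(-(mu(t + tau) - mu(t))). The parameter values are invented; in practice they would be fitted to observed failure data.

```python
# Sketch of the Musa-Okumoto logarithmic Poisson software reliability model:
# mu(t) = ln(lambda0 * theta * t + 1) / theta is the expected cumulative
# number of failures by execution time t.
from math import exp, log

LAMBDA0 = 0.05   # initial failure intensity, failures/hour (assumed)
THETA = 0.02     # failure intensity decay parameter (assumed)

def mu(t: float) -> float:
    """Expected cumulative failures by execution time t (hours)."""
    return log(LAMBDA0 * THETA * t + 1.0) / THETA

def reliability(tau: float, t: float) -> float:
    """Probability of failure-free operation over (t, t + tau]."""
    return exp(-(mu(t + tau) - mu(t)))

# Early in test, a 10-hour failure-free window is unlikely; after 2000 hours
# of reliability growth, the same window is far more likely.
print(f"R(10 h |    0 h) = {reliability(10.0, 0.0):.3f}")     # ~0.608
print(f"R(10 h | 2000 h) = {reliability(10.0, 2000.0):.3f}")  # ~0.847
```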

20

Tools for Modeling and Analysis

• Universal Systems Language (USL)
• Unified Modeling Language (UML)
• Systems Modeling Language (SysML)
• MATLAB/Simulink
• Telelogic Rhapsody
• MathCAD
• Colored Petri Nets
• Rate Monotonic Analysis (RMA)
• STATEMATE (used by Airbus)
• SCADE
• OPNET
• Embedded System Modeling Language (ESML)
• Component Synthesis using Model-Integrated Computing (CoSMIC)
• Architectural Analysis and Design Language (AADL)
• At least 200+ more packages since the 70's
• Certified tools need to converge to an accepted standard modeling/analysis method for complex system reliability

Presenter
Presentation Notes
Typically, different companies and projects address the challenge and choose their own tools to perform the upfront analysis and modeling, not following a guideline approach. Multiple tools need to converge or be compatible with a set modeling standard for complex systems. The compliant tools could be used to verify the requirements up front to mitigate reliability issues down the life-cycle. A notional approach would follow that shown in Figure 8, where the system V is followed but architectural modeling and analysis go parallel with the real development effort. This would allow reliability to be measured during the design phase and measured during the implementation, test, and verification phase using the model. The downside to modeling, in certain circles, is getting people to believe those models. How do you certify a modeling tool and the actual models within the tools? Those issues should be addressed going forward.

21

Modification to Acquisition Model

[Figure: systems engineering V with phases Requirements Establishment, High Level Design, Detailed Specifications, Implementation Coding, Verification, Development Testing, Operational Testing & Validation, and Deployed System; an "Architectural Model & Analysis" activity runs in parallel with the whole V, with reliability allocated on the design side and reliability measured on the test side.]

Propose a standard modeling methodology to be applied at different phases of development to enhance requirements development, reliability allocation, reliability measurement, and testing (DISCLAIMER: DOES NOT REPLACE TESTING)

Presenter
Presentation Notes
A notional approach would follow that shown in Figure 10, where the system V is followed but architectural modeling and analysis go parallel with the real development effort. This would allow reliability to be measured during the design phase and measured during the implementation, test, and verification phase using the model. This approach could bridge the design and test phases together. It is emphasized here that the architectural model would not replace critical testing but augments the process to allow for better requirement identification and verification. Thorough ground and flight tests should never be replaced by modeling. Modeling would only allow for more robust requirements and a higher level of confidence in the requirements and design. The model could be used in conjunction with the testing to confirm the design. Proper modeling and analysis would reduce total program costs by enforcing more complete and correct initial requirements, which reduces issues discovered down the road in testing that are expensive or impossible to fix and that force the acceptance of high risks. Additionally, if the model is maintained and optimized, then it could possibly be used after system deployment to analyze impacts of upgrades or changes to the system, allowing for more complete analysis and reduced overall system redesign costs.

A hurdle to cross with modeling and analysis is convincing people to believe those models. Some method to certify these models and modeling tools should be addressed in the future. Standards should be set in place for correct modeling techniques for complex systems. Lastly, consideration of standard verification checking tools should be made, such as the use of the Motor Industry Software Reliability Association (MISRA) compliance verification tool for the use of C in safety critical systems.
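Since the slide separates reliability "allocated" on the design leg of the V from reliability "measured" on the test leg, a small sketch may help. It uses equal apportionment, one common allocation scheme; the system target and measured values are invented, and the paper does not prescribe this particular method.

```python
# Sketch: allocate a system reliability target down to n series components
# (design leg of the V), then check measured values against it (test leg).

N_COMPONENTS = 4
R_SYSTEM_TARGET = 0.999                  # hypothetical per-mission target

# Equal apportionment: each series component gets the n-th root of the target.
r_allocated = R_SYSTEM_TARGET ** (1.0 / N_COMPONENTS)
print(f"allocated per-component target: {r_allocated:.6f}")   # ~0.999750

# Measured component reliabilities from development testing (invented):
measured = [0.99980, 0.99990, 0.99960, 0.99985]
r_measured_system = 1.0
for r in measured:
    r_measured_system *= r               # series composition

print(f"measured system reliability:    {r_measured_system:.6f}")
print("meets target" if r_measured_system >= R_SYSTEM_TARGET else "falls short")
```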

22

Systems Reliability Standard Establishment

• Establish a working group to define this standard
  – Need a technical society to lead the charge on this

• Collaborate with industry, academia, military, and societies
  – Focus on development of a reliability standard with AWR safety in mind
  – Draw upon the experiences to feed into this standard

• Study existing and previous complex systems
  – Shuttle, Space Station, missile systems, nuclear submarine and ship systems, nuclear control systems, military and commercial jet systems
  – Obtain software reliability information from given existing and previous systems
  – Build a database which would serve as a basis for future reliability

• Research prior efforts in complex systems analysis

• Establish a consensus-based modeling and analysis method

Presenter
Presentation Notes
In conclusion, methods for achieving a design for complex systems do exist; however, achieving reliability and attaining a level of qualification that would permit better AWRs does not. There needs to be a standard developed to tackle this issue rather than relying on current methods to ascertain airworthiness of complex systems. This will not occur overnight. An orchestrated collaboration among industry, academia, military labs, and technical professional societies to focus on development of this standard should allow us to draw upon the experiences to feed into this reliability standard with AWR safety in mind. We have a long-living experiment with complex software systems in the Space Transportation System (STS), International Space Station (ISS), missile systems, nuclear submarine and ship systems, nuclear control systems, and military and commercial jet systems, from which we should be able to obtain at least inferred software reliability information from the architecture and run time that the systems use. We should look at the lessons learned from these systems to see what could have been done to improve and what was done right that should be carried forward. The challenge is collecting this information into a central database and arriving at some figure of reliability from previous systems where data exists. This would at least provide a starting point to allow initial assessments and could be optimized in the future. Also, this is not the only study for establishing reliability metrics for complex software systems. There have been similar research projects that have risen and fallen; the data from those projects should not be wasted but studied to feed into whatever standard is developed. While historical information would be useful, each design is unique and requires tools to accomplish the design. Architectural modeling constructs should be further investigated as a possible augmentation to the design and test process. We need to determine which forum is best to conduct this effort (e.g., SAE, IEEE, AIAA, ACM, AHS, INCOSE, or other). As stated in the paper "Space Shuttle Avionics" [31], "The designers, the flight crew and other operational users of the system often have a mindset established in a previous program or experience which results in a bias against new or different 'unconventional' approaches." If nothing is done to address this problem, it will only get worse over time. It is past time to address the issue of reliability of complex systems and software.

23

BACKUP SLIDES

24

PEO Aviation System Safety Management Decision Authority Matrix

[Matrix: rows are probability levels A-E; columns are hazard severities 1 (Catastrophic), 2 (Critical), 3 (Marginal), and 4 (Negligible), by most credible outcome; cells assign the decision authority: Army Acquisition, PEO Aviation, or Program Management.]

Probability levels:
A - Frequent: P > 1E-3
B - Probable: 1E-4 < P <= 1E-3
C - Occasional: 1E-5 < P <= 1E-4
D - Remote: 1E-6 < P <= 1E-5
E - Improbable: 1E-7 < P <= 1E-6

Hazard Category / Description:
1 - Catastrophic: Death or permanent total disability, system loss
2 - Critical: Severe injury or minor occupational illness (no permanent effect), minor system or environmental damage
3 - Marginal: Minor injury or minor occupational illness (no permanent effect), minor system or environmental damage
4 - Negligible: Less than minor injury or occupational illness (no lost workdays), or less than minor environmental damage

Risk Level / Description / Probability (frequency per 100,000 flight hours):
A - Frequent: > 100 (P > 1E-3)
B - Probable: <= 100 and > 10 (1E-4 < P <= 1E-3)
C - Occasional: <= 10 and > 1 (1E-5 < P <= 1E-4)
D - Remote: <= 1 and > 0.1 (1E-6 < P <= 1E-5)
E - Improbable: <= 0.1 and > 0.01 (1E-7 < P <= 1E-6)

Presenter
Presentation Notes
As already mentioned, it is the goal of the US Army Aviation Engineering Directorate that the system is safe and reliable to operate and will perform the mission when delivered. Another goal is that the released system continues to safely perform the mission if maintained and operated per the operator's manual. Replacement parts and overhaul work must be high quality to support continued airworthiness. Per the Program Element Office Memorandum 08-03 Risk Matrix, US Army flight control systems are to achieve 1E-9 reliability for flight critical functions per civil airspace regulations [35] and 1E-6 reliability for tactical use [36]. Quantifying these numbers is established for component hardware but not for software. Just as hardware should have quantifiable reliability, so should software.
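The probability bands on this slide reduce to a simple lookup. A minimal sketch using those bands (the function and structure names are mine):

```python
# Classify a per-flight-hour failure probability into the risk levels
# defined on this slide (A through E).

BANDS = [                       # (level, name, lower bound, upper bound)
    ("A", "Frequent",   1e-3, 1.0),
    ("B", "Probable",   1e-4, 1e-3),
    ("C", "Occasional", 1e-5, 1e-4),
    ("D", "Remote",     1e-6, 1e-5),
    ("E", "Improbable", 1e-7, 1e-6),
]

def risk_level(p_per_flight_hour: float) -> str:
    """Return the risk level whose band contains p (lower < p <= upper)."""
    for level, name, lower, upper in BANDS:
        if lower < p_per_flight_hour <= upper:
            return f"{level} ({name})"
    return "below E (less than 1E-7 per flight hour)"

print(risk_level(5e-4))   # B (Probable)
print(risk_level(2e-6))   # D (Remote)
print(risk_level(1e-9))   # below E; cf. the 1E-9 flight-critical target
```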

25

Reliability Defined

• Software reliability - the probability that a given piece of software will execute without failure in a given environment for a given time [40]
  – Often debated as to how to measure

• Hardware reliability - the probability that a hardware component operates without failure over time
  – Well defined and established

• System reliability - the probability of success, or the probability that the system will perform its intended function under specified design limits [over a given amount of time] [39]
  – A combination of software and hardware reliability

Presenter
Presentation Notes
"Reliability is defined as the probability of success or the probability that the system will perform its intended function under specified design limits" [39]. "Software reliability is defined as the probability that a given piece of software will execute without failure in a given environment for a given time" [40]. Systems rely on both, and thus must have a combination of the two to formulate an overall reliability.
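One way to make "a combination of the two" precise, under the simplifying assumption that hardware and software failure processes are independent (an assumption of this sketch, not a claim of the paper), is the product form; with constant failure rates it collapses to a single exponential:

```latex
% System reliability as the product of hardware and software reliability,
% assuming independent failure processes:
R_{\mathrm{sys}}(t) = R_{\mathrm{hw}}(t)\, R_{\mathrm{sw}}(t)

% With constant failure rates \lambda_{\mathrm{hw}} and \lambda_{\mathrm{sw}}:
R_{\mathrm{sys}}(t) = e^{-\lambda_{\mathrm{hw}} t}\, e^{-\lambda_{\mathrm{sw}} t}
                    = e^{-(\lambda_{\mathrm{hw}} + \lambda_{\mathrm{sw}})\, t}
```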

26

Hardware vs Software Reliability

Hardware Reliability vs. Software Reliability:

• Hardware: Failure rate has a bathtub curve; the burn-in state is similar to the software debugging state.
  Software: Without considering program evolution, failure rate is statistically non-increasing.

• Hardware: Material deterioration can cause failures even though the system is not used.
  Software: Failures never occur if the software is not used.

• Hardware: Failure data are fitted to some distributions; the selection of the underlying distribution is based on the analysis of failure data and experience. Emphasis is placed on analyzing failure data.
  Software: Most models are analytically derived from assumptions; emphasis is on developing the model, the interpretation of the model assumptions, and the physical meaning of the parameters.

• Hardware: Failures are caused by material deterioration, design errors, misuse, and environment.
  Software: Failures are caused by incorrect logic, incorrect statements, or incorrect input data.

• Hardware: Can be improved by better design, better material, applying redundancy, and accelerated life cycle testing.
  Software: Can be improved by increasing testing effort and correcting discovered faults; reliability tends to change continuously during testing due to the addition of problems in new code or the removal of problems by debugging.

• Hardware: Repairs restore the original condition.
  Software: Repairs establish a new piece of software.

• Hardware: Failures are usually preceded by warnings.
  Software: Failures are rarely preceded by warnings.

• Hardware: Components can be standardized.
  Software: Components have rarely been standardized.

• Hardware: Can usually be tested exhaustively.
  Software: Essentially requires infinite testing for completeness.

Reference [39]: Hoang Pham, "Software Reliability," Springer, 2000.

Presenter
Presentation Notes
"Reliability is defined as the probability of success or the probability that the system will perform its intended function under specified design limits" [39]. "Software reliability is defined as the probability that a given piece of software will execute without failure in a given environment for a given time" [40]. Hoang Pham compares software versus hardware reliability as shown in Table 2 [39]. Systems rely on both, and thus must have a combination of the two to formulate an overall reliability.

27

Acronym List

ACRONYM - DEFINITION
AADL - Architectural Analysis and Design Language
AC - Advisory Circular (FAA)
ACM - Association of Computing Machinery
AED - Aviation Engineering Directorate (AMRDEC)
AFTD - Aviation Flight Test Directorate (US Army)
AGC - Apollo Guidance Computer
AHS - American Helicopter Society
AIAA - American Institute of Aeronautics and Astronautics (Inc.)
AMCOM - Aviation and Missile Command (US Army)
AMRDEC - Aviation and Missiles Research, Development and Engineering Center (US Army)
AR - Army Regulation
ARINC - Aeronautical Radio, Inc.
ARP - Aerospace Recommended Practice
ASIF - Avionics Software Integration Facility
ATAM - Architecture Tradeoff Analysis Method
ATM - Air Traffic Management
AWR - Airworthiness Release
CAAS - Common Avionics Architecture System
CH-47 - Cargo Helicopter (Chinook)
CMM - Capability Maturity Model
CMMI - Capability Maturity Model Integration
CMU - Carnegie Mellon University
CNS - Communications, Navigation, Surveillance
CoSMIC - Component Synthesis using Model-Integrated Computing
CPS - Cyber-Physical System
CRC - Chemical Rubber Company (i.e., CRC Press)
DFBW - Digital Fly-By-Wire
DoD - Department of Defense
E3 - Electrical and Electromagnetic Effects
ESML - Embedded System Modeling Language
FAA - Federal Aviation Administration
FCS - Future Combat Systems
FHA - Functional Hazard Assessment
FMEA - Failure Modes Effects Analysis

28

Acronym List (concluded)

ACRONYM - DEFINITION
GPWS - Ground Proximity Warning System
IBM - International Business Machines
IEC - International Electrotechnical Commission
IL - Instrumentation Lab (now Draper Laboratory)
IMA - Integrated Modular Avionics
INCOSE - International Council On Systems Engineering
ISO - International Organization for Standardization
ISS - International Space Station
KAL - Korean Airlines
MISRA - Motor Industry Software Reliability Association
MIT - Massachusetts Institute of Technology
NASA - National Aeronautics and Space Administration (USA)
PDR - Preliminary Design Review
PEO - Program Element Office
PNAS - Proceedings of the National Academy of Sciences
RAQ - Rotorcraft and Aircraft Qualification
RMA - Rate Monotonic Analysis
RTC - Redstone Test Center (US Army)
RTTC - Redstone Technical Test Center (US Army)
RTCA - Radio Technical Commission for Aeronautics
SAE - Society of Automotive Engineers
SED - Software Engineering Directorate (AMRDEC)
SEES - Software Engineering Evaluation System
SEI - Software Engineering Institute (CMU)
SIL - System Integration Laboratory
SSA - System Safety Assessment
STS - Space Transportation System
SysML - Systems Modeling Language
TMR - Triple Modular Redundant
TRL - Technical Readiness Level
UAS - Unmanned Aircraft System
UH-60 - Utility Helicopter (Blackhawk)
UML - Unified Modeling Language
US - United States
USL - Universal Systems Language

29

References

• [1] Israel Koren and Mani Krishna, "Fault-Tolerant Systems," Morgan Kaufmann, 2007.
• [2] Jiantao Pan, "Software Reliability," Carnegie Mellon University, Spring 1999.
• [3] Nachum Dershowitz, http://www.cs.tau.ac.il/~nachumd/horror.html
• [4] http://www.air-attack.com
• [5] David A. Mindell, "Digital Apollo: Human and Machine in Spaceflight," The MIT Press, 2008.
• [6] Software Engineering Directorate, Software Engineering Evaluation System (SEES), "Program Manager Handbook for Flight Software Airworthiness," SED-SES-PMHFSA 001, December 2003.
• [7] Software Engineering Directorate, Software Engineering Evaluation System (SEES), "Program Manager Handbook for Software Safety," SED-SES-PMHSSA 001, February 2006.
• [8] AMCOM Regulation 385-17, "Software System Safety Policy," 15 March 2008.
• [9] NASA Software Safety Guidebook, NASA-GB-8719.13, 31 March 2004.
• [10] Margaret Hamilton & William Hackler, "Universal Systems Language: Lessons Learned from Apollo," IEEE Computer Society, 2008.
• [11] Margaret Hamilton, "Full Life Cycle Systems Engineering and Software Development Environment: Development Before The Fact In Action," http://www.htius.com/Articles/Full_Life_Cycle.htm
• [12] Peter Feiler, David Gluch, John Hudak, "The Architecture Analysis & Design Language (AADL): An Introduction," CMU/SEI-2006-TN-011, February 2006.
• [13] Peter Feiler, John Hudak, "Developing AADL Models for Control Systems: A Practitioner's Guide," CMU/SEI-2007-TR-014, July 2007.
• [14] Bruce Lewis, "Using the Architecture Analysis and Design Language for System Verification and Validation," SEI Presentation, 2006.
• [15] Feiler, Gluch, Hudak, Lewis, "Embedded System Architecture Analysis Using SAE AADL," CMU/SEI-2004-TN-004, June 2004.
• [16] Charles Pecheur, Stacy Nelson, "V&V of Advanced Systems at NASA," NASA/CR-2002-211402, April 2002.
• [17] Systems Integration Requirements Task Group, "ARP 4754: Certification Considerations for Highly-Integrated or Complex Aircraft Systems," SAE Aerospace, 10 April 1996.
• [18] SAE, "ARP 4761: Guidelines and Methods for Conducting the Safety Assessment Process on Civil Airborne Systems and Equipment," December 1996.
• [19] Aeronautical Radio, Inc. (ARINC), "ARINC Specification 653P1-2: Avionics Application Software Standard Interface, Part 1 - Required Services," 7 March 2006.

30

References (Continued)

• [20] Department of Defense, "MIL-STD-882D: Standard Practice for System Safety," 19 January 1993.
• [21] RTCA, Incorporated, "DO-178: Software Considerations in Airborne Systems and Equipment Certification," 1 December 1992.
• [22] RTCA, Incorporated, "DO-254: Design Assurance Guidance for Airborne Electronic Hardware," 19 April 2000.
• [23] US Army, "Aeronautical Design Standard Handbook: Rotorcraft and Aircraft Qualification (RAQ) Handbook," 21 October 1996.
• [24] Cary R. Spitzer (Editor), "Avionics: Elements, Software and Functions," CRC Press, 2007.
• [25] US Army, "Army Regulation 70-62: Airworthiness Qualification of Aircraft Systems," 21 May 2007.
• [26] US Army, "Army Regulation 95-1: Aviation Flight Regulations," 3 February 2006.
• [27] Barbacci, Clements, Lattanze, Northrop, Wood, "Using the Architecture Tradeoff Analysis Method (ATAM) to Evaluate the Software Architecture for a Product Line of Avionics Systems: A Case Study," CMU/SEI-2003-TN-012, July 2003.
• [28] Peter Feiler, "All in the Family: CAAS & AADL," CMU/SEI-2008-SR-021, August 2008.
• [29] Chrissis, Konrad, Shrum, "CMMI: Guidelines for Process Integration and Product Improvement," Pearson Education, 2007.
• [30] Brendan O'Connell, "Model Driven Performance Analysis for Avionics Systems," Draper Laboratory, January 2006.
• [31] John F. Hanaway, Robert W. Moorehead, "Space Shuttle Avionics Systems," NASA SP-504, 1989.
• [32] Lui Sha, "The Complexity Challenge in Modern Avionics Software," August 14, 2006.
• [33] "Incidents Prompt New Scrutiny of Airplane Software Glitches," Wall Street Journal, 30 May 2006.
• [34] Eyal Ophir, Clifford Nass, and Anthony Wagner, "Cognitive Control in Media Multitaskers," PNAS, 20 July 2009.
• [35] "Advisory Circular AC 25.1309-1A: System Design and Analysis," Federal Aviation Administration, 21 June 1988.
• [36] Program Element Office Policy Memorandum 08-03.
• [38] http://www.nsf.gov/pubs/2008/nsf08611/nsf08611.htm, National Science Foundation webpage on Cyber-Physical Systems.
• [39] Hoang Pham, "Software Reliability," Springer, 2000.
• [40] Paul Rook (editor), "Software Reliability Handbook," Elsevier Science Publishers LTD, 1990.

31

References (Concluded)

• [41] http://www.navair.navy.mil/v22/index.cfm?fuseaction=news.detail&id=128
• [42] http://mars8.jpl.nasa.gov/msp98/news/mco990930.html
• [43] John Garman, "The Bug Heard 'Round the World," ACM SIGSOFT, October 1981.
• [44] "Mars Pathfinder Mission Status," July 15, 1997, http://marsprogram.jpl.nasa.gov/MPF/newspio/mpfstatus/pf970715.html
• [45] Nancy Leveson, "Safeware: System Safety and Computers," Addison-Wesley Publishing Company, 1995.
• [46] http://www.electronicaviation.com/aircraft/JAS-39_Gripen/810
• [47] Brandon Hill, "Lockheed's F-22 Raptor Gets Zapped by International Date Line," DailyTech LLC, February 26, 2007, http://www.freerepublic.com/focus/f-news/1791574/posts
• [48] http://www.military.com/news/article/human-error-cited-in-most-uav-crashes.html
• [49] Daniel Michaels and Andy Pasztor, "Incidents Prompt New Scrutiny Of Airplane Software Glitches: As Programs Grow Complex, Bugs Are Hard to Detect; A Jet's Roller-Coaster Ride; Teaching Pilots to Get Control," Wall Street Journal, May 30, 2006.


9

Development Challenges

bull Legacy Aircraft often upgraded in a piecemeal fashionndash Makes certification difficultndash Desire to increase to modern requirements based on size of upgrade and

what it includes ndash hard to scope

bull New system requirements must be clear complete and testable ndash Certification requirements must be obvious

bull Orchestrating agreement between stakeholders is necessary to mitigatendash Juggling of multiple software buildsndash End system that is difficult to test certify and deployndash Escalating Costsndash System Safety from being poorly understoodndash Design iterations

Presenter
Presentation Notes
It would be wonderful if all systems were straightforward in design easily testable and simple to write an Airworthiness Release (AWR) for however that is not the case Legacy aircraft such as the Chinook have been upgraded in a piecemeal fashion acquiring much needed improvements in aviation navigation and communication The general recommended system development V-curve as shown earlier is not always followed in a strict sense although it should be the goal process Negligence to the proper process makes the establishment of certification very difficult For new system development and existing system upgrades requirements must be clear complete and testable The certification requirements must be made obvious in the development of the requirements establishment phase with the goal of being fully identified during the requirements development Orchestrating agreement among all stakeholders (eg the program manager systems engineers human factors engineers integrators test engineers manufacturers users and certifiers) is necessary to mitigate problems such as13juggling multiple software builds13producing a difficult-to-test difficult-to-certify and difficult-to-deploy systems13misunderstanding system safety and13requiring design iterations that impact schedules and costs

10

Complexity Issues

bull System Development costs and schedule increase with complexityndash Existing lack of schedule and funding resources

bull Keeps systems from achieving full compliance with specifications and requirements

bull Garbage in -gt Garbage OuthellipPoor requirements -gt Poor Systemndash Finding problems in new designs at PDR is too latendash Difficult to correct existing poorly designed fielded complex systems

bull Complexity amp reliability of complex systems is not fully understoodndash How do we accurately assess operating risk performance reliability of

complex systems based on limited testing and analysisndash How do we know when system design is good enoughndash Latent defects occur in supposedly well-tested mature systems

bull Avionics parts and software change constantlyndash Spiral development -gt new softwarehardware qualification required frequently ndash How do we streamline the process (partition the system) so the need for

complete re-qualification after changes is lessened

Presenter
Presentation Notes
In an old computer development lesson it is well known that if you put ldquogarbage in you will get garbage outrdquo Likewise in a system if you have poor requirements then you will end up with a poor system Finding problems in new designs at Preliminary Design Review (PDR) is too late It has been shown that discovering issues at this stage in the game will cause a reiteration on the design which costs time and money Furthermore dealing with existing fielded complex systems that have not gone through the rigors of proper systems engineering results quite often with an overwhelming mess of functionality to test verify validate qualify and certifydesigns at PDR is too late It has been shown that discovering issues at this stage in the life-cycle will cause a reiteration on the design which costs time and money Furthermore dealing with existing fielded complex systems that have not gone through the rigors of proper systems engineering results quite often with a complicated conglomeration of functionality to test verify validate qualify certify and maintain This is indicative where complexity has exceeded our understanding in how to certify a system We still do not fully understand complexity and how to address reliability of complex systems How do we accurately assess operating risks performance and reliability of complex systems based on limited testing and analysis How do we know when a system design is good enough How do we modularize spirally developed systems to minimize the need for re-qualification of unchanging portions of the system We are 30 plus years into this technology and we still deal with systems with latent defects that are occurring in supposedly well-tested and mature systems To further exacerbate the problem we are now dealing with complex system of systems (ie cyber-physical systems)13It is a given that you can keep on adding redundancy and complexity to a problem to attain a desired level of reliability but at some point in time the reliability will taper off At best we sometimes must satisfy for an optimum point before digressing in reliability In the same vein system development costs and schedule increase with complexity too (see Figure 7 and 8)13Avionics parts and software constantly change over the life of a program Typically a spiral development program occurs with complex software development which means that qualification is required frequently This begs the question of how to streamline the process so that the need to conduct a complete requalification is avoided13With these complex systems there are other hurdles to cross such as fully characterizing and conducting the Functional Hazard Assessments (FHAs) and Failure Mode Effects and Criticality Analyses (FMECAs) It is crucial for the safety assessment that these are conducted correctly to fully understand the risks and later perform the correct tests at the right level Additionally once the complex component hardware and software are integrated then yet other problems appear It is at that time that the disparity of teams crossing multiple contractors and development groups becomes obvious and if not properly coordinated could cause impacts to the schedule Other programmatic problems affect complex system development and qualification For instance lack of schedule and funding resources causes a shortcoming to adequately provide for the proper compliance with specification and requirements short-circuiting the systems engineering process An ever decreasing availability of trained engineers to support the 
development and test of such systems exists Non-technical political influences sometimes affect the reliability of complex systems and are difficult to avoid Lastly there is a lack of a centralized database that captures the various families of systems that have been built along with their characterization of success and failures Such a database from all past and present government complex systems could be valuable in establishing reliability basis for future models

11

Complexity Issues (continued)

bull Functional Hazard Assessments and related documentation are crucial

ndash Understanding risks ndash Performing the correct tests at the right level Lab test vs Flight Test

bull Saves flight time and money

bull Systems Integration for complex systems is a schedule driver

bull Need experienced personnel to work complex systems

bull Need a centralized database - just doesnrsquot existndash Determine data needed for quantifying reliability of complex systemsndash Capture the pertinent data on previous or existing complex systemsndash Understand successes and failures of previous and present complex systemsndash Establish baseline reliability figures for known architectures

bull Complex System of Systems exacerbates problem

12

Reliability vs Complexity ampCost vs Complexity

Notional Graphs

bull Reliability vs Complexity bull Cost amp Schedule vs Complexity

Rel

iabi

lity

Complexity Complexity

Cos

t amp S

ched

uleOptimum

Aggregation of part reliability

feeds into overall system reliability

Desired

Presenter
Presentation Notes
Avionics parts and software constantly change over the life of a program Typically a spiral development program occurs with complex software development which means that qualification is required frequently This begs the question of how to streamline the process so that the need to conduct a complete re-qualification is avoided13With these complex systems there are other hurdles to cross such as fully characterizing and conducting the Functional Hazard Assessments (FHAs) and Failure Mode Effects and Criticality Analyses (FMECAs) It is crucial for the safety assessment that these are conducted correctly to fully understand the risks and later perform the correct tests at the right level13Additionally once the complex component hardware and software are integrated then yet other problems crop up It is at that time that the disparity of teams crossing multiple contractors and development groups becomes obvious and if not properly coordinated could cause impacts to the schedule 13Other programmatic problems affect complex system development and qualification For instance lack of schedule and funding resources causes a shortcoming to adequately provide for the proper compliance with specification and requirements short-circuiting the systems engineering process An ever decreasing availability of trained engineers to support the development and test of such systems exists Non-technical political influences sometimes affect the reliability of complex systems and are difficult to avoid Lastly there is a lack of a centralized database that captures the various families of systems that have been built along with their characterization of success and failures Such a database from all past and present government complex systems could be valuable in establishing reliability basis for future models

13

A Few Examples of Complex Systems

bull This is not a new problem Other have struggled with the challenges of establishing confidence in complex systems

ndash NASAbull Apollo Guidance Computerbull Dryden F8 Crusaderbull Space Shuttlebull International Space Station

ndash Commercial Airlinersbull Airbus A320 and higherbull Boeing B777 B787

ndash Militarybull Ships and Submarinesbull Jets (F14F15 F16 F18 F22 F35 etc)bull Cargo Planes (C130J Hercules C17 Globemaster etc)bull Helicopters (Chinook Blackhawk Sea Stallion etc)bull Rocketsbull Unmanned Aerial Systemsbull Unmanned Ground Systemsbull Unmanned Submarine Systems Photos by US Army NASA US Navy and US Air Force

Presenter
Presentation Notes
As mentioned earlier complex avionics systems are not a new idea Since the early 1960rsquos complex avionic architectures have existed beginning with the ApolloSaturn program Massachusetts Institute of Technology (MIT) Instrumentation Lab (IL) which is now Draper Laboratory and International Business Machines (IBM) led the way with the MITIL Apollo Guidance Computer (AGC) and the Saturn V IBM triple modular redundant (TMR) voting guidance computer system The word software was not even coined at the time but engineers such as Margaret Hamilton MITIL Director of Apollo On-board Software can attest to the fact that some the same issues with creating reliable software then still exists today [5] A large majority of the issues then dealt with the communication between systems engineers and the programmers Requirements were thrown over the wall without the confirmation that the requirements were complete and a lot of the issues cropped up as interface problems Identifying these issues prompted Hamilton to create her own company and create a modeling language called Universal Systems Language (USL) to head off the problems experienced with Apollo [11] Some 200 plus modeling programs have been developed since Apollo and used to mitigate issues and increase confidence in systems of varying complexity13As time progressed other systems came along The NASA Dryden F8 Crusader was the first digital fly by wire (DFBW) jet aircraft that relied heavily on complex IMA and software for flight control The Space Transportation System (STS) shuttle includes a Quad Modular Redundant (QMR) system with a fifth backup flight computer containing uncommon code US Air Force and Naval airplanes that have possessed complex or redundant IMA configurations include the F14 Tomcat F15 Eagle F16 Falcon F18 Hornet F22 Raptor F35 Joint Strike Fighter F117 Nighthawk V22 Osprey C17 Globemaster and many more along with recent Unmanned Air Vehicle Systems (UAVS) The US Army complex systems on helicopters include the13RAH-66 Comanche DFBW Triple Modular Redundant (TMR) architecture13glass cockpit avionics on the UH-60M Blackhawk baseline 13Common Avionics Architecture System (CAAS) glass cockpit on the UH-60M Blackhawk modernization and CH-47F Chinooks13and other aircraft13Additionally there are many self-checking pair engine controller systems along with system of system Future Combat Systems (FCS) and Unmanned Air Vehicle Systems (UAVS) This has also permeated the commercial airliner market with the Airbus 320 and higher Airbus models Boeing 777 and Boeing 787 aircraft With this ever increasing technology something must be done about the reliability issue With such a wealth of data on aviation and non-aviation cyber-physical systems such as submarine ship nuclear medical locomotive and automotive systems there should be adequate information to get a start on modeling systems correctly for reliability Therefore this is not an isolated problem to avionics and other disciplines should aide in resolving this problem13

14

Some Complex System Failures

bull V-22 Osprey crashesbull Mars Climate Orbiter crashbull Mars Pathfinder software resetbull USS Vincennes downing an Airbus 320bull Therac-25 software radiation treatment

failurebull 1989 Airbus A320 air show crashbull China Airlines Airbus Industries A300

crashbull Ariane 5 satellite launcher malfunctionbull Failure of the primary flight system to

sync with the backup during prelaunch of STS-1

bull Mexicana Airlines Boeing 727 airliner crashed into a mountain due to the software not correctly determining the mountain position

bull Loss of the first American probe to Venus

bull Korean Airlines KAL 901 accidentbull Soviet Phobos I Mars probe lost

bull Three Mile Islandbull F-18 fighter plane crash due to bad

exceptionbull F-14 fighter plane lost to

uncontrollable spinbull Swedish Gripen prototype crashedbull Swedish Gripen air-show crashbull F-22 failure crossing the IDLbull 2006 German-Spanish Barracuda UAVbull 2004 FA-22 Raptor stealth fighter jet

crash bull FA-22 Raptor navigation system

software error at Nellis AFBbull 50 cockpit blackouts on A320bull A320 multiple avionics and electrical

failures at Newark NJbull Boeing 777 Malaysian Airlines jetlinerrsquos

nightmarish autopilot rollercoaster ridebull 3000 feet US Army and Air Force UAV

Crashes

bull hellip And Many Morehellip

Presenter
Presentation Notes
Multiple crashes have occurred with the V-22 Osprey [41]13In 1999 the Mars Climate Orbiter crashed because of incorrect units in a program caused by poor systems engineering practices [42 44]13In 1988 an Airbus 320 was shot down by the USS Vincennes because of cryptic and misleading output displayed by the tracking software [3]13In 1989 an Airbus A320 crashed at an air show due to altitude indication and software handling [3]13In 1994 a China Airlines Airbus Industries A300 crash on killing 264 from faulty software [3]13In 1996 the first Ariane 5 satellite launcher destruction mishap was caused by a faulty software design error with a few lines of ADA code containing unprotected variables Horizontal velocity of the Ariane 5 exceeded the Arian 4 resulting in the guidance system veering the rocket off course Insufficient testing did not catch this error which was a carry-over from Ariane 4[3 39]13In 1986 a Mexicana Airlines Boeing 727 airliner crashed into a mountain due to the software not correctly determining the mountain position [39]13In 1986 the Therac-25 radiation therapy machines overdosed cancer patients due to flaw in the computer program controlling the highly automated devices [3 39 45]13During the maiden launch in 1981 of the Discovery space shuttle a failure of the primary flight control computer system to establish sync with the backup during prelaunch [43]13On December 10 1990 the Space Shuttle Columbia had to land early due to computer software problems [39]13In 1997 The Mars Pathfinder software reset problem due to latent task execution caused by priority inversion with a mutex [3 44]13An error in a single FORTRAN statement resulted in the loss of the first American probe to Venus From G J Myers Software Reliability Principles amp Practice p 25 [3]13In September 1999 the Korean Airlines KAL 901 accident in Guam killed 225 out of 254 aboard A worldwide bug was discovered in barometric altimetry in Ground Proximity Warning System (GPWS) From ACM SIGSOFT Software Engineering Notes vol 23 no 1 [3]13The Soviet Phobos I Mars probe was lost due to a faulty software update at a cost of 300 million rubles Its disorientation broke the radio link and the solar batteries discharged before reacquisition From Aviation Week 13 Feb 1989 [3]13An F-18 fighter plane crash due to a missing exception condition From ACM SIGSOFT Software Engineering Notes vol 6 no 2 [3]13An F-14 fighter plane was lost to uncontrollable spin traced to tactical software From ACM SIGSOFT Software Engineering vol9 no5 [3]13In 1989 Swedish Gripen prototype crashed due to software in their digital fly-by-wire system [3 46]13In 1995 another Gripen fighter plane crashed during air-show caused by a software issue [3 46]13On February 11 2007 twelve FA-22 Raptors were forced to head back to the Hawaii when a software bug caused a computer crash as they were crossing International Date Line In the middle of the ocean all their systems comprising navigation fuel and part of the communications systems dumped All the attempts to reboot failed[47]13February 2006 German-Spanish Unmanned Combat Air Vehicle Barracuda crash due to software failure [4]13December 2004 a glitch in the software for controlling flight probably caused an FA-22 Raptor stealth fighter jet to crash on takeoff at Nellis Air Force [4]13In 2008 a United Airbus A320 registration N462UA experienced multiple avionics and electrical failures including loss of all communications shortly after rotation while departing Newark Liberty International Airport in Newark 
New Jersey [NTSB Report Identification DCA08IA033] 13In 2006 a Boeing 777 Malaysian Airlines jetlinerrsquos autopilot caused a stall to occur by climbing 3000 feet Pilots struggled to nose down the plane but plunged into a steep dive After pulling back up the pilots regained control Cause was defective flight software providing incorrect data for airspeed and acceleration confusing the flight computers and initially ignoring the pilotrsquos commands[49]13US Army and Air Force UAV crashes from control system or human error13

15

Lessons Learned from Failures

bull From Nancy Levesonrsquos paper ldquoThe Role of Software in Spacecraft Accidentsrdquondash ldquoFlaws in the safety culture diffusion of responsibility and

authorityndash Limited communication channels and poor information flowndash Inadequate system and software engineering ndash Poor or missing specifications ndash Unnecessary complexity and software functionality ndash Software reuse or changes without appropriate safety analysisndash [Shortcomings] in safety engineering practices ndash Flaws in test and simulation environments ndash Inadequate human factors design for softwarerdquo

Presenter
Presentation Notes
In Dr Nancy Levesonrsquos paper [36] ldquoThe Role of Software in Spacecraft Accidentsrdquo she cited problems with software development issues within NASA on various projects According to Dr Leveson there were ldquoflaws in the safety culture diffusion of responsibility and authority limited communication channels and poor information flow inadequate system and software engineering poor or missing specifications unnecessary complexity and software functionality software reuse or changes without appropriate safety analysis violation of basic safety engineering practices inadequate system safety engineering flaws in test and simulation environments and inadequate human factors design for softwarerdquo While these problems were identified for spacecraft development within NASA and corrected aviation in general could learn from these lessons to mitigate issues with complex systems development

16

Some Current Guidelines

bull DO-178B - Software Considerations in Airborne Systems and Equipment Certification bull DO-248B ndash Final Report for the Clarification of DO-178Bbull DO-278 - Guidelines for Communications Navigation Surveillance and Air Traffic Management

(CNSATM) Systems Software Integrity Assurancebull DO-254 - Design Assurance Guidance for Airborne Electronic Hardware bull DO-297 ndash Integrated Modular Avionics (IMA) Development Guidance and Certification

Considerationsbull SAE-ARP4754 ndash Certification Consideration for Highly Integrated or Complex Aircraft Systemsbull SAE-ARP4671- Guidelines and Methods for Conducting the Safety Assessment Process on

Airborne Systems and Equipmentbull FAA Advisory Circular AC27-1B - Certification of Normal Category Rotorcraftbull FAA Advisory Circular AC29-2C - Certification of Transport Category Rotorcraftbull ISOIEC 12207 - Software Life Cycle Processesbull ARINC 653 - Specification Standard for Time and System Partitionbull MIL-STD-882D - DoD System Safetybull ADS-51-HDBK - Rotorcraft and Aircraft Qualification Handbookbull AR-70-62 - Airworthiness Release Standardbull SED-SES-PMHFSA 001 - Software Engineering Directorate (SED) Software Engineering

Evaluation System (SEES) Program Manager Handbook for Flight Software Airworthinessbull SED-SES-PMHSS 001 - SED SEES Program Manager Handbook for Software Safety

WHATrsquoS MISSING - Reliability Standard for Complex Systems

Presenter
Presentation Notes
These problems previously stated drove the development of these guidelines however there is no standard for system reliability that includes software There are other standards and circulars that pertain to complex systems but a reliability standard for complex systems but a reliability standard is missing for complex systems which would outline the process for establishing cyber-physical systems reliability This standard should indicate how to model and analyze and ascertain the projected level of reliability

17

Certification Assessment Considerations

bull Sufficient data and time must be available for air worthiness evaluation

bull Certification processndash Currently lengthy ndash Depends much on human interpretation trade offs and risk mitigation ndash Overwhelming for complex integrated systems (FHAs FTAs FMECAs

risk mitigation etc)

bull Consistent industry-wide method to assess a system at any stage of the life-cycle to allow a tradeoff of design alternatives

bull Certification Tasks outlined in DO-297 should be consideredndash Task 1 Module Acceptancendash Task 2 Application softwarehardware acceptancendash Task 3 IMA system acceptancendash Task 4 Aircraft integration of IMA system ndash including VampVndash Task 5 Change of modules or applicationsndash Task 6 Reused of modules or applications

Presenter
Presentation Notes
In order to execute an AWR, sufficient data must be provided to the appropriate airworthiness authorities to demonstrate that safety requirements are met to the levels per the System Safety Assessment (SSA) or equivalent. The certification process is currently lengthy and depends on much human interpretation of the myriad of complex architecture functions.

The current guidelines such as DO-178B, DO-254, DO-297, SAE-ARP4754, and SAE-ARP4761, along with many other guidelines, outline the proper steps that should be taken. System safety management's military standard is MIL-STD-882, which has been in use for decades. Civilian safety standards for the aviation industry include SAE ARP4754, which shows the incorporation of system safety activities into the design process and provides guidance on techniques to ensure a safe design. SAE ARP4761 contains significant guidance on how to perform the system safety activities described in SAE ARP4754. DO-178B outlines development assurance guidance for aviation software based on the failure condition categorization of the functionality provided by the software; DO-254 embodies similar guidance for aviation hardware. ARINC 653 is a widely accepted standard to ensure time and space partitioning for software. DO-297 does an excellent job of describing the certification tasks for an IMA system, which include:
Task 1: Module acceptance
Task 2: Application software/hardware acceptance
Task 3: IMA system acceptance
Task 4: Aircraft integration of IMA systems, including verification and validation
Task 5: Change of modules or applications
Task 6: Reuse of modules or applications
Taken together, these standards provide guidance that, if followed, will likely result in safe, highly reliable, and cost-effective systems over the life-cycle of the system. Yet while these guidelines exist, there is not a consistent industry-wide method to assess a system at any stage of the life-cycle to allow a tradeoff of design alternatives. Also, there is no standard outlining overall reliability for a system that includes both hardware and software reliability. To achieve this level of reliability, a standard should be developed to define the process and method to achieve a quantifiable reliability number that would in turn lead to acceptance.

18

Definition of Complexity and Reliability is Needed

[Figure: notional reliability build-up. Components 1-4, each characterized by complexity fundamentals and reliability parametrics, integrate into Subsystems 1 and 2, characterized by system integration of components and reliability dependencies; the subsystems integrate into the realized system, characterized by reliability sensitivities. Maturity progresses from TRL 3 or 4 (components) through TRL 6 or 7 (subsystems) to TRL 8 or 9 (realized system), culminating in a highly reliable complex system and a certificate (e.g., AWR).]

Presenter
Presentation Notes
As mentioned in the introduction, the overall goal of this paper is to state the need for an industry standard method for qualifying complex integrated systems to a specified reliability. In order to achieve this, systems must be broken down to component levels and built up to subsystem and system levels; an overall aggregated system reliability value should result. The goal should be to establish the ability to assess reliability at the component, subsystem, and then system level, with each phase working toward a higher Technical Readiness Level (TRL). The end result would be fed into the accepted Type Certificate (TC) and/or AWR.
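The component-to-system build-up described above can be made concrete with a small numerical sketch. The following Python fragment is illustrative only: it assumes independent components with constant (exponential) failure rates composed in series, and all names and rates are hypothetical.

```python
import math

def component_reliability(failure_rate_per_hour, hours):
    """Exponential lifetime model: R(t) = exp(-lambda * t)."""
    return math.exp(-failure_rate_per_hour * hours)

def series_reliability(reliabilities):
    """Independent items in series: the assembly works only if all items work."""
    product = 1.0
    for r in reliabilities:
        product *= r
    return product

# Hypothetical component failure rates (failures per flight hour)
component_rates = {"processor": 1e-6, "display": 5e-7, "data_bus": 2e-7, "io_module": 3e-7}

mission_hours = 10.0
subsystem_r = series_reliability(
    component_reliability(rate, mission_hours) for rate in component_rates.values()
)
print(f"Aggregated subsystem reliability over {mission_hours} h: {subsystem_r:.8f}")
```

The same composition rule would be applied again at the next level up, multiplying subsystem reliabilities to obtain a system figure, as the figure above suggests.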

19

Analytical Models and Reliability

• Analytical models of hardware reliability are well understood

• Architecture modeling and software reliability modeling are not novel ideas but are highly debated
  – There are many approaches and little consensus as to the best way
  – Many models (Jelinski-Moranda, Littlewood-Verrall, Musa-Okumoto, etc.) [1]
  – Many tools (over 200 built since the 1970s) [2]

• Predictability of software reliability is of great concern because software is a major contributor to system unreliability [2]

• Software reliability is the probability of error-free operation of a computer program in a specified environment for a specified time [1]

• A basis is needed for setting reliability figures from previous systems, with those figures iteratively refined in the future

• NOT A REPLACEMENT FOR TESTING AND VERIFICATION

Presenter
Presentation Notes
As mentioned in the introduction, the overall goal of this paper is to state the need for an industry standard method for qualifying complex integrated systems to a specified reliability. While not suggesting a total solution, better modeling practices should be considered as part of a solution to bridge the gap between design, test, and implementation. A method to model the architecture early in the requirements establishment phase, follow the detailed design and coding, and then verify the system along with the model may be a path to greater confidence in the system and may reduce the risks, warnings, and cautions that must be issued.

In order to achieve this, the systems must be broken down to component levels and built up to subsystem and system levels. An overall aggregated system reliability value should result (see Figure 9). The goal should be to establish the ability to assess reliability at the component, subsystem, and then system level, with each phase working toward a higher Technical Readiness Level (TRL). The end result would be fed into the accepted Type Certificate (TC) or AWR.

To achieve this goal, modeling and analysis tools that follow a standard for modeling reliability should exist. As previously stated, over 200 tools have been created since the 1970s. A few of the current tools: Universal Systems Language (USL), Unified Modeling Language (UML), Systems Modeling Language (SysML), MATLAB/Simulink, Telelogic Rhapsody, MathCAD, Colored Petri Nets, Rate Monotonic Analysis (RMA), STATEMATE (used by Airbus), Standard for the Development of Safety-Critical Embedded Software (SCADE), OPNET, Embedded System Modeling Language (ESML), Component Synthesis using Model-Integrated Computing (CoSMIC), and Architectural Analysis and Design Language (AADL). By no means is this list complete. Typically, different companies and projects address this challenge by choosing their own tools to perform the upfront analysis and modeling, not following a standard approach. Multiple tools need to converge or be compatible with the framework of a common tool such as AADL. The resultant tool(s) could be used to verify the requirements up front to mitigate reliability issues down the life-cycle.
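As a concrete illustration of one of the named models, the following sketch evaluates the Musa-Okumoto logarithmic Poisson model, whose failure intensity decreases as faults are found and removed. The parameter values are hypothetical and would in practice be fitted to observed failure data.

```python
import math

def expected_failures(t, lam0, theta):
    """Musa-Okumoto logarithmic Poisson: mean cumulative failures by exposure time t,
    mu(t) = ln(lam0 * theta * t + 1) / theta."""
    return math.log(lam0 * theta * t + 1.0) / theta

def reliability(dt, t, lam0, theta):
    """Probability of failure-free operation for dt hours after t hours of exposure:
    R(dt | t) = exp(-[mu(t + dt) - mu(t)])."""
    return math.exp(-(expected_failures(t + dt, lam0, theta)
                      - expected_failures(t, lam0, theta)))

# Hypothetical parameters: initial failure intensity and intensity decay
lam0, theta = 0.05, 0.02
print(f"R(10 h | 1000 h of test): {reliability(10.0, 1000.0, lam0, theta):.4f}")
```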

20

Tools for Modeling and Analysis

• Universal Systems Language (USL)
• Unified Modeling Language (UML)
• Systems Modeling Language (SysML)
• MATLAB/Simulink
• Telelogic Rhapsody
• MathCAD
• Colored Petri Nets
• Rate Monotonic Analysis (RMA)
• STATEMATE (used by Airbus)
• SCADE
• OPNET
• Embedded System Modeling Language (ESML)
• Component Synthesis using Model-Integrated Computing (CoSMIC)
• Architectural Analysis and Design Language (AADL)
• At least 200 more packages since the 1970s
• Certified tools need to converge to an accepted standard modeling/analysis method for complex system reliability

Presenter
Presentation Notes
Typically, different companies and projects address the challenge by choosing their own tools to perform the upfront analysis and modeling, not following a guideline approach. Multiple tools need to converge or be compatible with a set modeling standard for complex systems. The compliant tools could be used to verify the requirements up front to mitigate reliability issues down the life-cycle. A notional approach would follow that shown in Figure 8, where the system "V" is followed but architectural modeling and analysis run parallel with the real development effort. This would allow reliability to be allocated during the design phase and measured during the implementation, test, and verification phase using the model. The downside to modeling, voiced in certain circles, is getting people to believe those models. How do you certify a modeling tool, and the actual models within the tools? Those issues should be addressed going forward.

21

Modification to Acquisition Model

[Figure: modified acquisition "V": Requirements Establishment, High Level Design, Detailed Specifications, Implementation/Coding, Verification, Development Testing, Operational Testing & Validation, Deployed System, with an Architectural Model & Analysis activity running in parallel with the entire development effort. Reliability is allocated on the design leg and measured on the test leg.]

Propose a standard modeling methodology to be applied at different phases of development to enhance requirements development, reliability allocation, reliability measurement, and testing (DISCLAIMER: DOES NOT REPLACE TESTING)

Presenter
Presentation Notes
A notional approach would follow that shown in Figure 10, where the system "V" is followed but architectural modeling and analysis run parallel with the real development effort. This would allow reliability to be allocated during the design phase and measured during the implementation, test, and verification phase using the model. This approach could bridge the design and test phases together. It is emphasized here that the architectural model would not replace critical testing but would augment the process to allow for better requirement identification and verification. Thorough ground and flight tests should never be replaced by modeling; modeling would only allow for more robust requirements and design and a higher level of confidence in them. The model could be used in conjunction with the testing to confirm the design. Proper modeling and analysis would reduce total program costs by enforcing more complete and correct initial requirements, reducing issues discovered later in testing that are expensive or impossible to fix and that force the acceptance of high risks. Additionally, if the model is maintained and optimized, it could possibly be used after system deployment to analyze the impacts of upgrades or changes to the system, allowing for more complete analysis and reducing overall system redesign costs.

A hurdle to cross with modeling and analysis is convincing people to believe those models. Some method to certify these models and modeling tools should be addressed in the future. Standards should be set in place for correct modeling techniques for complex systems. Lastly, consideration should be given to standard verification checking tools, such as the Motor Industry Software Reliability Association (MISRA) compliance verification tool for the use of C in safety-critical systems.
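One way to make the "reliability allocated / reliability measured" loop of the figure concrete is a simple allocation check. The sketch below is notional: it splits a hypothetical system failure-rate budget across subsystems in proportion to assumed complexity weights, then compares measured rates against the allocation. All names, weights, and rates are hypothetical.

```python
def allocate_failure_rate(system_budget, complexity_weights):
    """Split a system failure-rate budget across subsystems in proportion to
    relative complexity (heavier weight -> larger share of the budget)."""
    total = sum(complexity_weights.values())
    return {name: system_budget * w / total for name, w in complexity_weights.items()}

# Hypothetical budget for a flight-critical function (failures per flight hour)
budget = 1e-6
weights = {"flight_control_sw": 5, "sensor_fusion": 3, "display_sw": 2}
allocated = allocate_failure_rate(budget, weights)

# Hypothetical rates measured later, during development and verification testing
measured = {"flight_control_sw": 4.0e-7, "sensor_fusion": 3.5e-7, "display_sw": 1.0e-7}

for name, limit in allocated.items():
    status = "OK" if measured[name] <= limit else "EXCEEDS ALLOCATION"
    print(f"{name}: allocated {limit:.2e}/h, measured {measured[name]:.2e}/h -> {status}")
```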

22

Systems Reliability Standard Establishment

• Establish a working group to define this standard
  – Need a technical society to lead the charge on this

• Collaborate with industry, academia, military, and societies
  – Focus on development of a reliability standard with AWR safety in mind
  – Draw upon their experiences to feed into this standard

• Study existing and previous complex systems
  – Shuttle, Space Station, missile systems, nuclear submarine and ship systems, nuclear control systems, military and commercial jet systems
  – Obtain software reliability information from existing and previous systems
  – Build a database to serve as a basis for future reliability figures

• Research prior efforts in complex systems analysis

• Establish a consensus-based modeling and analysis method

Presenter
Presentation Notes
In conclusion, methods for achieving a design for complex systems do exist; however, achieving reliability and attaining a level of qualification that would permit better AWRs does not. A standard needs to be developed to tackle this issue rather than relying on current methods to ascertain airworthiness of complex systems. This will not occur overnight. An orchestrated collaboration among industry, academia, military labs, and technical professional societies to focus on development of this standard should allow us to draw upon their experiences to feed into this reliability standard with AWR safety in mind.

We have a long-living experiment with complex software systems in the Space Transportation System (STS), International Space Station (ISS), missile systems, nuclear submarine and ship systems, nuclear control systems, and military and commercial jet systems, from which we should be able to obtain at least inferred software reliability information from the architectures and run times these systems use. We should look at the lessons learned from these systems to see what could have been improved and what was done right that should be carried forward. The challenge is collecting this information into a central database and arriving at some figure of reliability from previous systems where data exist. This would at least provide a starting point for initial assessments and could be refined in the future. Also, this is not the only study aiming to establish reliability metrics for complex software systems; similar research projects have risen and fallen. The data from those projects should not be wasted but studied to feed into whatever standard is developed. While historical information would be useful, each design is unique and requires tools to accomplish the design. Architectural modeling constructs should be further investigated as a possible augmentation to the design and test process.

We need to determine which forum is best to conduct this effort (e.g., SAE, IEEE, AIAA, ACM, AHS, INCOSE, or other). As stated in the paper "Space Shuttle Avionics" [31]: "The designers, the flight crew and other operational users of the system often have a mindset established in a previous program or experience which results in a bias against new or different 'unconventional' approaches." If nothing is done to address this problem, it will only get worse over time. It is past time to address the issue of reliability of complex systems and software.

23

BACKUP SLIDES


24

PEO Aviation System Safety Management Decision Authority Matrix

[Risk matrix: rows are most credible severity, Catastrophic (1), Critical (2), Marginal (3), Negligible (4); columns are probability levels, Frequent A (P > 1E-3), Probable B (1E-4 < P <= 1E-3), Occasional C (1E-5 < P <= 1E-4), Remote D (1E-6 < P <= 1E-5), Improbable E (1E-7 < P <= 1E-6). Each cell assigns the risk-acceptance decision authority: Army Acquisition, PEO Aviation, or Program Management.]

Hazard Category | Description
1 Catastrophic  | Death or permanent total disability; system loss
2 Critical      | Severe injury or minor occupational illness (no permanent effect); minor system or environmental damage
3 Marginal      | Minor injury or minor occupational illness (no permanent effect); minor system or environmental damage
4 Negligible    | Less than minor injury or occupational illness (no lost workdays), or less than minor environmental damage

Risk Level | Description | Probability (frequency per 100,000 flight hours)
A          | Frequent    | > 100 (P > 1E-3)
B          | Probable    | <= 100 and > 10 (1E-4 < P <= 1E-3)
C          | Occasional  | <= 10 and > 1 (1E-5 < P <= 1E-4)
D          | Remote      | <= 1 and > 0.1 (1E-6 < P <= 1E-5)
E          | Improbable  | <= 0.1 and > 0.01 (1E-7 < P <= 1E-6)

Presenter
Presentation Notes
As already mentioned, it is the goal of the US Army Aviation Engineering Directorate that the system is safe and reliable to operate and will perform the mission when delivered. Another goal is that the released system continues to safely perform the mission if maintained and operated per the operator's manual. Replacement parts and overhaul work must be of high quality to support continued airworthiness. Per the Program Element Office Memorandum 08-03 risk matrix, US Army flight control systems are to achieve 1E-9 reliability for flight-critical functions per civil airspace regulations [35] and 1E-6 reliability for tactical use [36]. Quantifying these numbers is established for component hardware but not for software. Just as hardware should have quantifiable reliability, so should software.
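The probability bands in the matrix lend themselves to a direct encoding. The following sketch is illustrative only; it maps a per-flight-hour hazard probability to the probability levels shown above.

```python
def probability_level(p_per_flight_hour):
    """Map a hazard probability (per flight hour) to the matrix probability levels."""
    if p_per_flight_hour > 1e-3:
        return "A (Frequent)"
    if p_per_flight_hour > 1e-4:
        return "B (Probable)"
    if p_per_flight_hour > 1e-5:
        return "C (Occasional)"
    if p_per_flight_hour > 1e-6:
        return "D (Remote)"
    if p_per_flight_hour > 1e-7:
        return "E (Improbable)"
    return "below the E band"

print(probability_level(2e-5))  # -> C (Occasional)
```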

25

Reliability Defined

• Software Reliability – the probability that a given piece of software will execute without failure in a given environment for a given time [40]
  – Often debated as to how to measure

• Hardware Reliability – the probability that a hardware component operates without failure over a given time
  – Well defined and established

• System Reliability – the probability of success, or the probability that the system will perform its intended function under specified design limits [over a given amount of time] [39]
  – A combination of software and hardware reliability

Presenter
Presentation Notes
"Reliability is defined as the probability of success or the probability that the system will perform its intended function under specified design limits." [39] "Software reliability is defined as the probability that a given piece of software will execute without failure in a given environment for a given time." [40] Systems rely on both and thus must have a combination of the two to formulate an overall reliability.
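Treating system reliability as the combination of hardware and software reliability can be illustrated with a minimal sketch, assuming independence and constant failure rates (a strong simplification); all rates are hypothetical.

```python
import math

def system_reliability(hw_failure_rates, sw_failure_rate, hours):
    """System reliability as the product of hardware and software reliability,
    assuming independent failures and constant failure rates."""
    r_hw = math.exp(-sum(hw_failure_rates) * hours)
    r_sw = math.exp(-sw_failure_rate * hours)
    return r_hw * r_sw

# Hypothetical rates (failures per hour) for an 8-hour mission
print(f"{system_reliability([1e-6, 5e-7], 2e-6, 8.0):.6f}")
```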

26

Hardware vs Software Reliability

Hardware Reliability | Software Reliability
Failure rate has a bathtub curve; the burn-in state is similar to the software debugging state. | Without considering program evolution, failure rate is statistically non-increasing.
Material deterioration can cause failures even though the system is not used. | Failures never occur if the software is not used.
Failure data are fitted to some distributions; the selection of the underlying distribution is based on the analysis of failure data and experience. Emphasis is placed on analyzing failure data. | Most models are analytically derived from assumptions; emphasis is on developing the model, the interpretation of the model assumptions, and the physical meaning of the parameters.
Failures are caused by material deterioration, design errors, misuse, and environment. | Failures are caused by incorrect logic, incorrect statements, or incorrect input data.
Can be improved by better design, better materials, applying redundancy, and accelerated life-cycle testing. | Can be improved by increasing testing effort and correcting discovered faults; reliability tends to change continuously during testing due to the addition of problems in new code or the removal of problems by debugging.
Hardware repairs restore the original condition. | Software repairs establish a new piece of software.
Hardware failures are usually preceded by warnings. | Software failures are rarely preceded by warnings.
Hardware components can be standardized. | Software components have rarely been standardized.
Hardware can usually be tested exhaustively. | Software essentially requires infinite testing for completeness.

Reference: [39] Hoang Pham, "Software Reliability", Springer, 2000.

Presenter
Presentation Notes
"Reliability is defined as the probability of success or the probability that the system will perform its intended function under specified design limits." [39] "Software reliability is defined as the probability that a given piece of software will execute without failure in a given environment for a given time." [40] Hoang Pham compares software versus hardware reliability as shown in Table 2 [39]. Systems rely on both and thus must have a combination of the two to formulate an overall reliability.
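The contrast between the two failure-rate behaviors in Table 2 can be sketched numerically. The toy models below are illustrative only: a bathtub-shaped hardware rate (burn-in, useful life, wear-out) versus a software rate that is non-increasing as discovered faults are corrected; all constants are hypothetical.

```python
import math

def hw_failure_rate(t_hours):
    """Toy bathtub curve: decreasing burn-in term + constant useful-life term
    + increasing wear-out term (all constants hypothetical)."""
    burn_in = 1e-4 * math.exp(-t_hours / 100.0)
    useful_life = 1e-6
    wear_out = 1e-6 * math.exp((t_hours - 5e4) / 5e3)
    return burn_in + useful_life + wear_out

def sw_failure_rate(faults_corrected):
    """Toy software rate: non-increasing as discovered faults are corrected."""
    return max(1e-4 - 2e-6 * faults_corrected, 0.0)

print(hw_failure_rate(10.0), hw_failure_rate(2.5e4), hw_failure_rate(6.0e4))
print(sw_failure_rate(0), sw_failure_rate(25))
```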

27

Acronym List

ACRONYM – DEFINITION
AADL – Architectural Analysis and Design Language
AC – Advisory Circular (FAA)
ACM – Association of Computing Machinery
AED – Aviation Engineering Directorate (AMRDEC)
AFTD – Aviation Flight Test Directorate (US Army)
AGC – Apollo Guidance Computer
AHS – American Helicopter Society
AIAA – American Institute of Aeronautics and Astronautics (Inc.)
AMCOM – Aviation and Missile Command (US Army)
AMRDEC – Aviation and Missile Research, Development and Engineering Center (US Army)
AR – Army Regulation
ARINC – Aeronautical Radio, Inc.
ARP – Aerospace Recommended Practice
ASIF – Avionics Software Integration Facility
ATAM – Architecture Tradeoff Analysis Method
ATM – Air Traffic Management
AWR – Airworthiness Release
CAAS – Common Avionics Architecture System
CH-47 – Cargo Helicopter Chinook
CMM – Capability Maturity Model
CMMI – Capability Maturity Model Integration
CMU – Carnegie Mellon University
CNS – Communications, Navigation, Surveillance
CoSMIC – Component Synthesis using Model-Integrated Computing
CPS – Cyber-Physical System
CRC – Chemical Rubber Company (i.e., CRC Press)
DFBW – Digital Fly-By-Wire
DoD – Department of Defense
E3 – Electrical and Electromagnetic Effects
ESML – Embedded System Modeling Language
FAA – Federal Aviation Administration
FCS – Future Combat Systems
FHA – Functional Hazard Assessment
FMEA – Failure Modes Effects Analysis

28

Acronym List (concluded)

ACRONYM – DEFINITION
GPWS – Ground Proximity Warning System
IBM – International Business Machines
IEC – International Electrotechnical Commission
IL – Instrumentation Lab (now Draper Laboratory)
IMA – Integrated Modular Avionics
INCOSE – International Council On Systems Engineering
ISO – International Organization for Standardization
ISS – International Space Station
KAL – Korean Airlines
MISRA – Motor Industry Software Reliability Association
MIT – Massachusetts Institute of Technology
NASA – National Aeronautics and Space Administration (USA)
PDR – Preliminary Design Review
PEO – Program Element Office
PNAS – Proceedings of the National Academy of Sciences
RAQ – Rotorcraft and Aircraft Qualification
RMA – Rate Monotonic Analysis
RTC – Redstone Test Center (US Army)
RTTC – Redstone Technical Test Center (US Army)
RTCA – Radio Technical Commission for Aeronautics
SAE – Society of Automotive Engineers
SED – Software Engineering Directorate (AMRDEC)
SEES – Software Engineering Evaluation System
SEI – Software Engineering Institute (CMU)
SIL – System Integration Laboratory
SSA – System Safety Assessment
STS – Space Transportation System
SysML – Systems Modeling Language
TMR – Triple Modular Redundant
TRL – Technical Readiness Level
UAS – Unmanned Aircraft System
UH-60 – Utility Helicopter Blackhawk
UML – Unified Modeling Language
US – United States
USL – Universal Systems Language

29

References

• [1] Israel Koren and Mani Krishna, "Fault-Tolerant Systems", Morgan Kaufmann, 2007.
• [2] Jiantao Pan, "Software Reliability", Carnegie Mellon University, Spring 1999.
• [3] Nachum Dershowitz, http://www.cs.tau.ac.il/~nachumd/horror.html
• [4] http://www.air-attack.com
• [5] David A. Mindell, "Digital Apollo: Human and Machine in Spaceflight", The MIT Press, 2008.
• [6] Software Engineering Directorate, Software Engineering Evaluation System (SEES), "Program Manager Handbook for Flight Software Airworthiness", SED-SES-PMHFSA 001, December 2003.
• [7] Software Engineering Directorate, Software Engineering Evaluation System (SEES), "Program Manager Handbook for Software Safety", SED-SES-PMHSSA 001, February 2006.
• [8] AMCOM Regulation 385-17, "Software System Safety Policy", 15 March 2008.
• [9] "NASA Software Safety Guidebook", NASA-GB-8719.13, 31 March 2004.
• [10] Margaret Hamilton & William Hackler, "Universal Systems Language: Lessons Learned from Apollo", IEEE Computer Society, 2008.
• [11] Margaret Hamilton, "Full Life Cycle Systems Engineering and Software Development Environment: Development Before The Fact In Action", http://www.htius.com/Articles/Full_Life_Cycle.htm
• [12] Peter Feiler, David Gluch, John Hudak, "The Architecture Analysis & Design Language (AADL): An Introduction", CMU/SEI-2006-TN-011, February 2006.
• [13] Peter Feiler, John Hudak, "Developing AADL Models for Control Systems: A Practitioner's Guide", CMU/SEI-2007-TR-014, July 2007.
• [14] Bruce Lewis, "Using the Architecture Analysis and Design Language for System Verification and Validation", SEI presentation, 2006.
• [15] Feiler, Gluch, Hudak, Lewis, "Embedded System Architecture Analysis Using SAE AADL", CMU/SEI-2004-TN-004, June 2004.
• [16] Charles Pecheur, Stacy Nelson, "V&V of Advanced Systems at NASA", NASA/CR-2002-211402, April 2002.
• [17] Systems Integration Requirements Task Group, "ARP 4754: Certification Considerations for Highly-Integrated or Complex Aircraft Systems", SAE Aerospace, 10 April 1996.
• [18] SAE, "ARP 4761: Guidelines and Methods for Conducting the Safety Assessment Process on Civil Airborne Systems and Equipment", December 1996.
• [19] Aeronautical Radio, Inc. (ARINC), "ARINC Specification 653P1-2: Avionics Application Software Standard Interface, Part 1 – Required Services", 7 March 2006.

30

References (Continued)

• [20] Department of Defense, "MIL-STD-882D: Standard Practice for System Safety", 19 January 1993.
• [21] RTCA, Inc., "DO-178: Software Considerations in Airborne Systems and Equipment Certification", 1 December 1992.
• [22] RTCA, Inc., "DO-254: Design Assurance Guidance for Airborne Electronic Hardware", 19 April 2000.
• [23] US Army, "Aeronautical Design Standard Handbook: Rotorcraft and Aircraft Qualification (RAQ) Handbook", 21 October 1996.
• [24] Cary R. Spitzer (editor), "Avionics: Elements, Software and Functions", CRC Press, 2007.
• [25] US Army, "Army Regulation 70-62: Airworthiness Qualification of Aircraft Systems", 21 May 2007.
• [26] US Army, "Army Regulation 95-1: Aviation Flight Regulations", 3 February 2006.
• [27] Barbacci, Clements, Lattanze, Northrop, Wood, "Using the Architecture Tradeoff Analysis Method (ATAM) to Evaluate the Software Architecture for a Product Line of Avionics Systems: A Case Study", CMU/SEI-2003-TN-012, July 2003.
• [28] Peter Feiler, "All in the Family: CAAS & AADL", CMU/SEI-2008-SR-021, August 2008.
• [29] Chrissis, Konrad, Shrum, "CMMI: Guidelines for Process Integration and Product Improvement", Pearson Education, 2007.
• [30] Brendan O'Connell, "Model Driven Performance Analysis for Avionics Systems", Draper Laboratory, January 2006.
• [31] John F. Hanaway, Robert W. Moorehead, "Space Shuttle Avionics Systems", NASA SP-504, 1989.
• [32] Lui Sha, "The Complexity Challenge in Modern Avionics Software", August 14, 2006.
• [33] "Incidents Prompt New Scrutiny of Airplane Software Glitches", Wall Street Journal, 30 May 2006.
• [34] Eyal Ophir, Clifford Nass, and Anthony Wagner, "Cognitive Control in Media Multitaskers", PNAS, 20 July 2009.
• [35] "Advisory Circular AC 25.1309-1A: System Design and Analysis", Federal Aviation Administration, 21 June 1988.
• [36] Program Element Office Policy Memorandum 08-03.
• [38] National Science Foundation webpage on Cyber-Physical Systems, http://www.nsf.gov/pubs/2008/nsf08611/nsf08611.htm
• [39] Hoang Pham, "Software Reliability", Springer, 2000.
• [40] Paul Rook (editor), "Software Reliability Handbook", Elsevier Science Publishers Ltd., 1990.

31

References (Concluded)

• [41] http://www.navair.navy.mil/v22/index.cfm?fuseaction=newsdetail&id=128
• [42] http://mars8.jpl.nasa.gov/msp98/news/mco990930.html
• [43] John Garman, "The Bug Heard Around the World", ACM SIGSOFT, October 1981.
• [44] "Mars Pathfinder Mission Status", July 15, 1997, http://marsprogram.jpl.nasa.gov/MPF/newspio/mpfstatus/pf970715.html
• [45] Nancy Leveson, "Safeware: System Safety and Computers", Addison-Wesley Publishing Company, 1995.
• [46] http://www.electronicaviation.com/aircraft/JAS-39_Gripen/810
• [47] Brandon Hill, "Lockheed's F-22 Raptor Gets Zapped by International Date Line", DailyTech LLC, February 26, 2007, http://www.freerepublic.com/focus/f-news/1791574/posts
• [48] http://www.military.com/news/article/human-error-cited-in-most-uav-crashes.html
• [49] Daniel Michaels and Andy Pasztor, "Incidents Prompt New Scrutiny Of Airplane Software Glitches: As Programs Grow Complex, Bugs Are Hard to Detect; A Jet's Roller-Coaster Ride; Teaching Pilots to Get Control", Wall Street Journal, May 30, 2006.



16

Some Current Guidelines

bull DO-178B - Software Considerations in Airborne Systems and Equipment Certification bull DO-248B ndash Final Report for the Clarification of DO-178Bbull DO-278 - Guidelines for Communications Navigation Surveillance and Air Traffic Management

(CNSATM) Systems Software Integrity Assurancebull DO-254 - Design Assurance Guidance for Airborne Electronic Hardware bull DO-297 ndash Integrated Modular Avionics (IMA) Development Guidance and Certification

Considerationsbull SAE-ARP4754 ndash Certification Consideration for Highly Integrated or Complex Aircraft Systemsbull SAE-ARP4671- Guidelines and Methods for Conducting the Safety Assessment Process on

Airborne Systems and Equipmentbull FAA Advisory Circular AC27-1B - Certification of Normal Category Rotorcraftbull FAA Advisory Circular AC29-2C - Certification of Transport Category Rotorcraftbull ISOIEC 12207 - Software Life Cycle Processesbull ARINC 653 - Specification Standard for Time and System Partitionbull MIL-STD-882D - DoD System Safetybull ADS-51-HDBK - Rotorcraft and Aircraft Qualification Handbookbull AR-70-62 - Airworthiness Release Standardbull SED-SES-PMHFSA 001 - Software Engineering Directorate (SED) Software Engineering

Evaluation System (SEES) Program Manager Handbook for Flight Software Airworthinessbull SED-SES-PMHSS 001 - SED SEES Program Manager Handbook for Software Safety

WHATrsquoS MISSING - Reliability Standard for Complex Systems

Presenter
Presentation Notes
These problems previously stated drove the development of these guidelines however there is no standard for system reliability that includes software There are other standards and circulars that pertain to complex systems but a reliability standard for complex systems but a reliability standard is missing for complex systems which would outline the process for establishing cyber-physical systems reliability This standard should indicate how to model and analyze and ascertain the projected level of reliability

17

Certification Assessment Considerations

bull Sufficient data and time must be available for air worthiness evaluation

bull Certification processndash Currently lengthy ndash Depends much on human interpretation trade offs and risk mitigation ndash Overwhelming for complex integrated systems (FHAs FTAs FMECAs

risk mitigation etc)

bull Consistent industry-wide method to assess a system at any stage of the life-cycle to allow a tradeoff of design alternatives

bull Certification Tasks outlined in DO-297 should be consideredndash Task 1 Module Acceptancendash Task 2 Application softwarehardware acceptancendash Task 3 IMA system acceptancendash Task 4 Aircraft integration of IMA system ndash including VampVndash Task 5 Change of modules or applicationsndash Task 6 Reused of modules or applications

Presenter
Presentation Notes
In order to execute an AWR sufficient data must be provided to the appropriate airworthiness authorities to demonstrate that safety requirements are met to the levels per the System Safety Assessment (SSA) or equivalent The certification process is currently lengthy and depends on much human interpretation of the myriad of complex architecture functions 13The current guidelines such as DO-178B DO-254 DO-297 SAE-ARP-4754 and SAE-ARP-4671 along with many other guidelines outline the proper steps that should be taken System safety managementrsquos military standard is MIL-STD-882 and has been in use for decades Civilian safety standards for the aviation industry include SAE ARP4754 which shows the incorporation of system safety activities into the design process and provides guidance on techniques to use to ensure a safe design SAE ARP4761 contains significant guidance on how to perform the system safety activities spoken about in SAE ARP4754 DO-178B outlines development assurance guidance for aviation software based on the failure condition categorization of the functionality provided by the software DO-254 embodies similar guidance for aviation hardware ARINC 653 is a widely accepted standard to ensure time and space partitioning for software DO-297 does an excellent job of describing the certification tasks to take for an IMA system which include13Task 1 Module acceptance13Task 2 Application softwarehardware acceptance13Task 3 IMA system acceptance13Task4 Aircraft integration of IMA systems including verification and validation13Task 5 Change of modules or applications13Task 6 Reuse of modules or applications13Taken together these standards provide guidance that if followed likely will result in safe highly reliable and cost-effective systems over the life-cycle of the system Yet while these guidelines exist there is not a consistent industry-wide method to assess a system at any stage of the life-cycle to allow a tradeoff of design alternatives Also there is not a standard outlining overall reliability for a system to include hardware and software reliability In order to achieve this level of reliability a standard should be developed to define the process and method to achieve a quantifiable reliability number that would in turn lead to acceptance13

18

TRL 3 or 4 TRL 6 or 7 TRL 8 or 9

Component 4

ComplexityFundamentals

Reliability Parametrics

Component 3

ComplexityFundamentals

Reliability Parametrics

Definition of Complexity and Reliability is Needed

Subsystem 1

SystemIntegration ofComponents

ReliabilityDependencies

Component 2

ComplexityFundamentals

Reliability Parametrics

Component 1

ComplexityFundamentals

Reliability Parametrics

Subsystem 2

SystemIntegration ofComponents

ReliabilityDependencies

Integration System

Realized System

ReliabilitySensitivities

High Reliable Complex System

Certificate(eg AWR)

Integration

Integration

Integration

Integration

Integration

Presenter
Presentation Notes
As mentioned in the introduction the overall goal of this paper is to state the need for an industry standard method for qualifying complex integrated systems to a specified reliability In order to achieve this the systems must be broken down to component levels and built up to subsystem and system levels An overall aggregated system reliability value should result The goal should be to establish the ability to assess the reliability from a component subsystem and then a system level with each phase working toward a higher Technical Readiness Level (TRL) The end result would be fed into the accepted Type Certificate (TC) andor AWR

19

Analytical Models and Reliability

bull Analytical models of hardware reliability are well understood

bull Architecture modeling and software reliability modeling is not a novel idea but is highly debated

ndash There are many approaches and little consensus as to best wayndash Many models (Jelinski-Moranda Littlewood-Verrall Musa-Okumoto etc) [1]ndash Many tools (over 200+ tools since 1970s have been built) [2]

bull Predictability of software reliability is of great concern because it is a major contributor to unreliability[2]

bull Software Reliability is the probability of error-free operation of a computer program in a specified environment for a specified time [1]

bull Need basis for setting reliability figures based on previous systems and iteratively refine those figures in the future

bull NOT A REPLACEMENT FOR TESTING AND VERIFICATION

Presenter
Presentation Notes
As mentioned in the introduction the overall goal of this paper is to state the need for an industry standard method for qualifying complex integrated systems to a specified reliability While not suggesting as solution in total better modeling practices should be considered toward a solution to bridge the gap between design test and implementation A method to model the architecture early in the requirements establishment phase follow the detailed design and coding and then be able to verify the system along with the model may be a path to greater confidence in the system and reduce the risks warnings and cautions that must be issued13In order to achieve this the systems must be broken down to component levels and built up to subsystem and system levels An overall aggregated system reliability value should result (see Figure 9) The goal should be to establish the ability to assess the reliability from a component subsystem and then a system level with each phase working toward a higher Technical Readiness Level (TRL) The end result would be fed into the accepted Type Certificate (TC) or AWR 13To achieve this goal modeling and analysis tools that follow a standard for modeling reliability should exist As previously stated over 200 tools have been created since 1970s Here is a list of a few of the current tools13Universal Systems Language (USL)13Unified Modeling Language (UML)13Systems Modeling Language (SysML)13MATLABSimulink13Telelogic Rhapsody13MathCAD13Colored Petri Nets13Rate Monotonic Analysis (RMA)13STATEMATE (Used by Airbus)13Standard for the Development of Safety-Critical Embedded Software (SCADE)13OPNET13Embedded System Modeling Language (ESML)13Component Synthesis using Model-Integrated Computing (CoSMIC)13Architectural Analysis and Design Language (AADL)13By no means is this list complete Typically different companies and projects address this challenge and choose unit tools to perform the upfront analysis and modeling not following a standard approach Multiple tools need to converge or be compatible with the framework of a common tool such as AADL The resultant tool(s) could be used to verify the requirements up front to mitigate reliability issues down the life-cycle 13

20

Tools for Modeling and Analysis

• Universal Systems Language (USL)
• Unified Modeling Language (UML)
• Systems Modeling Language (SysML)
• MATLAB/Simulink
• Telelogic Rhapsody
• MathCAD
• Colored Petri Nets
• Rate Monotonic Analysis (RMA)
• STATEMATE (used by Airbus)
• SCADE
• OPNET
• Embedded System Modeling Language (ESML)
• Component Synthesis using Model-Integrated Computing (CoSMIC)
• Architectural Analysis and Design Language (AADL)
• At least 200 more packages since the '70s
• Certified tools need to converge to an accepted standard modeling/analysis method for complex system reliability

Presenter
Presentation Notes
Typically, different companies and projects address the challenge and choose unit tools to perform the upfront analysis and modeling without following a guideline approach. Multiple tools need to converge or be compatible with a set modeling standard for complex systems. The compliant tools could be used to verify the requirements up front to mitigate reliability issues down the life-cycle. A notional approach would follow that shown in Figure 8, where the system V is followed but Architectural Modeling and Analysis run parallel to the real development effort. This would allow reliability to be allocated during the design phase and measured during the implementation, test and verification phase using the model. The downside to modeling, in certain circles, is getting people to believe those models. How do you certify a modeling tool, and the actual models within the tool? Those issues should be addressed going forward.

21

Modification to Acquisition Model

[Figure: V-model acquisition flow, Requirements Establishment → High Level Design → Detailed Specifications → Implementation Coding → Verification → Development Testing → Operational Testing & Validation → Deployed System, with an Architectural Model & Analysis activity running in parallel; reliability is allocated on the design leg and measured on the test leg.]

Propose standard modeling methodology to be applied at different phases of development to enhance requirements development, reliability allocation, reliability measurement and testing (DISCLAIMER: DOES NOT REPLACE TESTING)

Presenter
Presentation Notes
A notional approach would follow that shown in Figure 10, where the system V is followed but Architectural Modeling and Analysis run parallel to the real development effort. This would allow reliability to be allocated during the design phase and measured during the implementation, test and verification phase using the model. This approach could bridge the design and test phases together. It is emphasized here that the architectural model would not replace critical testing, but augments the process to allow for better requirement identification and verification. Thorough ground and flight tests should never be replaced by modeling. Modeling would only allow for more robust requirements and a higher level of confidence in the requirements and design. The model could be used in conjunction with the testing to confirm the design. Proper modeling and analysis would reduce total program costs by enforcing more complete and correct initial requirements, reducing the issues discovered later in testing that are expensive or impossible to fix and that force the acceptance of high risks. Additionally, if the model is maintained and optimized, it could possibly be used after system deployment to analyze the impacts of upgrades or changes to the system, allowing for more complete analysis and reducing overall system redesign costs.

A hurdle to cross with modeling and analysis is convincing people to believe those models. Some method to certify these models and modeling tools should be addressed in the future. Standards should be set in place for correct modeling techniques for complex systems. Lastly, consideration should be given to standard verification checking tools, such as the Motor Industry Software Reliability Association (MISRA) compliance verification tools for the use of C in safety-critical systems.
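To illustrate the "reliability allocated / reliability measured" split shown in the modified acquisition model, here is a minimal sketch of one simple allocation scheme (equal apportionment across series subsystems). The subsystem count, system target and measured values are assumed for demonstration; a real program would weight the allocation by complexity, criticality and technology maturity.

```python
# Equal apportionment: allocate a system reliability target across n
# series subsystems, then compare a measured roll-up against the target.
# The target and measurements are hypothetical.

n_subsystems = 4
system_target = 0.999

# Each series subsystem must meet the n-th root of the system target.
subsystem_target = system_target ** (1.0 / n_subsystems)
print(f"per-subsystem allocation: {subsystem_target:.6f}")

# Later in the life-cycle, measured estimates replace allocations.
measured = [0.99985, 0.99990, 0.99970, 0.99980]  # assumed test results
rollup = 1.0
for r in measured:
    rollup *= r
print(f"measured system roll-up:  {rollup:.6f}  "
      f"({'meets' if rollup >= system_target else 'misses'} target)")
```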

22

Systems Reliability Standard Establishment

• Establish a working group to define this standard
  – Need a technical society to lead the charge on this

• Collaborate with industry, academia, military and societies
  – Focus on development of a reliability standard with AWR safety in mind
  – Draw upon the experiences to feed into this standard

• Study existing and previous complex systems
  – Shuttle, Space Station, missile systems, nuclear submarine and ship systems, nuclear control systems, military and commercial jet systems
  – Obtain software reliability information from existing and previous systems
  – Build a database which would serve as a basis for future reliability figures

• Research prior efforts in complex systems analysis

• Establish a consensus-based modeling and analysis method

Presenter
Presentation Notes
In conclusion, methods for achieving a design for complex systems do exist; however, achieving reliability and attaining a level of qualification that would permit better AWRs does not. There needs to be a standard developed to tackle this issue rather than relying on current methods to ascertain airworthiness of complex systems. This will not occur overnight. An orchestrated collaboration among industry, academia, military labs and technical professional societies, focused on development of this standard, should allow us to draw upon our collective experiences to feed into this reliability standard with AWR safety in mind. We have a long-living experiment with complex software systems on the Space Transportation System (STS), International Space Station (ISS), missile systems, nuclear submarine and ship systems, nuclear control systems, and military and commercial jet systems, from which we should be able to obtain at least inferred software reliability information from the architecture and run time that the systems use. We should look at the lessons learned from these systems to see what could have been improved and what was done right that should be carried forward. The challenge is collecting this information into a central database and arriving at some figure of reliability from previous systems where data exists. This would at least provide a starting point to allow initial assessments and could be refined in the future. Also, this is not the only study seeking to establish reliability metrics for complex software systems. There have been similar research projects that have risen and fallen; the data from those projects should not be wasted but studied to feed into whatever standard is developed. While historical information would be useful, each design is unique and requires tools to accomplish the design. Architectural modeling constructs should be further investigated as a possible augmentation to the design and test process. We need to determine which forum is best to conduct this effort (e.g., SAE, IEEE, AIAA, ACM, AHS, INCOSE or other). As stated in the paper "Space Shuttle Avionics" [31], "The designers, the flight crew and other operational users of the system often have a mindset established in a previous program or experience which results in a bias against new or different 'unconventional' approaches." If nothing is done to address this problem, it will only get worse over time. It is past time to address the issue of reliability of complex systems and software.

23

BACKUP SLIDES


24

PEO Aviation System Safety Management Decision Authority Matrix

[Matrix: decision authority (Army Acquisition, PEO Aviation, or Program Management) assigned by hazard severity versus probability of occurrence.]

Severity (Most Credible): Catastrophic (1), Critical (2), Marginal (3), Negligible (4)
Probability: Frequent (A) P > 1E-3; Probable (B) 1E-4 < P <= 1E-3; Occasional (C) 1E-5 < P <= 1E-4; Remote (D) 1E-6 < P <= 1E-5; Improbable (E) 1E-7 < P <= 1E-6

Hazard Category | Description
1 Catastrophic | Death or permanent total disability; system loss
2 Critical | Severe injury or minor occupational illness (no permanent effect); minor system or environmental damage
3 Marginal | Minor injury or minor occupational illness (no permanent effect); minor system or environmental damage
4 Negligible | Less than minor injury or occupational illness (no lost workdays), or less than minor environmental damage

Risk Level | Description | Probability (frequency per 100,000 flight hours)
A | Frequent | > 100 (P > 1E-3)
B | Probable | <= 100 and > 10 (1E-4 < P <= 1E-3)
C | Occasional | <= 10 and > 1 (1E-5 < P <= 1E-4)
D | Remote | <= 1 and > 0.1 (1E-6 < P <= 1E-5)
E | Improbable | <= 0.1 and > 0.01 (1E-7 < P <= 1E-6)

Presenter
Presentation Notes
As already mentioned, it is the goal of the US Army Aviation Engineering Directorate that the system is safe and reliable to operate and will perform the mission when delivered. Another goal is that the released system continues to safely perform the mission if maintained and operated per the operator's manual. Replacement parts and overhaul work must be of high quality to support continued airworthiness. Per the Program Element Office Memorandum 08-03 Risk Matrix, US Army flight control systems are to achieve 1E-9 reliability for flight critical functions per civil airspace regulations [35] and 1E-6 reliability for tactical use [36]. Quantifying these numbers is established for component hardware but not for software. Just as hardware should have quantifiable reliability, so should software.
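To connect the matrix thresholds with the per-flight-hour figures in these notes, the sketch below classifies a failure probability into the risk levels defined above. The example probabilities fed to it are assumptions for demonstration only.

```python
# Classify a per-flight-hour failure probability into the risk levels
# defined by the PEO Aviation matrix above (A..E, else "below E").

LEVELS = [          # (level, lower bound, upper bound) per flight hour
    ("A Frequent",   1e-3, 1.0),
    ("B Probable",   1e-4, 1e-3),
    ("C Occasional", 1e-5, 1e-4),
    ("D Remote",     1e-6, 1e-5),
    ("E Improbable", 1e-7, 1e-6),
]

def risk_level(p):
    for level, lo, hi in LEVELS:
        if lo < p <= hi:
            return level
    return "below E"

# Hypothetical failure probabilities per flight hour; the last one
# matches the 1E-9 flight-critical target cited in the notes.
for p in (2e-4, 3e-6, 1e-9):
    print(f"P = {p:.0e}/FH -> {risk_level(p)}")
```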

25

Reliability Defined

• Software Reliability – the probability that a given piece of software will execute without failure in a given environment for a given time [40]
  – Often debated as to how to measure

• Hardware Reliability – the probability that a hardware component operates without failure over time
  – Well defined and established

• System Reliability – the probability of success, or the probability that the system will perform its intended function under specified design limits [over a given amount of time] [39]
  – A combination of software and hardware reliability

Presenter
Presentation Notes
"Reliability is defined as the probability of success or the probability that the system will perform its intended function under specified design limits." [39] "Software reliability is defined as the probability that a given piece of software will execute without failure in a given environment for a given time." [40] Systems rely on both hardware and software, and thus must combine the two to formulate an overall reliability.
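One common way to make the hardware/software combination concrete, under the simplifying assumptions that the two fail independently and each follows a constant failure rate, is the product form below. The exponential laws and the example rates are modeling assumptions for illustration, not figures from this paper.

```latex
% System reliability as the product of independent hardware and
% software reliabilities (assumed exponential failure laws):
\[
R_{sys}(t) = R_{hw}(t)\, R_{sw}(t)
           = e^{-\lambda_{hw} t}\, e^{-\lambda_{sw} t}
           = e^{-(\lambda_{hw} + \lambda_{sw})\, t}
\]
% Example (assumed rates): \lambda_{hw} = 10^{-6}/h, \lambda_{sw} = 10^{-5}/h
% give R_{sys}(10\ \mathrm{h}) = e^{-1.1 \times 10^{-4}} \approx 0.99989.
```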

26

Hardware vs Software Reliability

Hardware Reliability vs. Software Reliability:

Hardware: Failure rate has a bathtub curve; the burn-in state is similar to the software debugging state.
Software: Without considering program evolution, failure rate is statistically non-increasing.

Hardware: Material deterioration can cause failures even though the system is not used.
Software: Failures never occur if the software is not used.

Hardware: Failure data are fitted to some distributions; the selection of the underlying distribution is based on the analysis of failure data and experience. Emphasis is placed on analyzing failure data.
Software: Most models are analytically derived from assumptions. Emphasis is on developing the model, the interpretation of the model assumptions and the physical meaning of the parameters.

Hardware: Failures are caused by material deterioration, design errors, misuse and environment.
Software: Failures are caused by incorrect logic, incorrect statements or incorrect input data.

Hardware: Can be improved by better design, better material, applying redundancy and accelerated life-cycle testing.
Software: Can be improved by increasing testing effort and correcting discovered faults; reliability tends to change continuously during testing due to the addition of problems in new code or the removal of problems by debugging.

Hardware: Repairs restore the original condition.
Software: Repairs establish a new piece of software.

Hardware: Failures are usually preceded by warnings.
Software: Failures are rarely preceded by warnings.

Hardware: Components can be standardized.
Software: Components have rarely been standardized.

Hardware: Can usually be tested exhaustively.
Software: Essentially requires infinite testing for completeness.

Reference: [39] Hoang Pham, "Software Reliability", Springer, 2000.

Presenter
Presentation Notes
"Reliability is defined as the probability of success or the probability that the system will perform its intended function under specified design limits." [39] "Software reliability is defined as the probability that a given piece of software will execute without failure in a given environment for a given time." [40] Hoang Pham compares software versus hardware reliability as shown in Table 2 [39]. Systems rely on both, and thus must combine the two to formulate an overall reliability.
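The bathtub-curve row of Table 2 can be illustrated with a Weibull hazard function, whose shape parameter moves the failure rate through the burn-in, useful-life and wear-out regimes. The characteristic life and shape values below are assumed for demonstration.

```python
# Weibull hazard rate h(t) = (beta/eta) * (t/eta)**(beta - 1).
# beta < 1: decreasing rate (burn-in / infant mortality)
# beta = 1: constant rate (useful life, exponential case)
# beta > 1: increasing rate (wear-out)
# eta (characteristic life, hours) and the betas are assumed values.

def weibull_hazard(t, beta, eta=1000.0):
    return (beta / eta) * (t / eta) ** (beta - 1.0)

for beta, regime in ((0.5, "burn-in"), (1.0, "useful life"), (3.0, "wear-out")):
    rates = ", ".join(f"h({t:g} h)={weibull_hazard(t, beta):.2e}"
                      for t in (10.0, 100.0, 1000.0))
    print(f"beta={beta} ({regime}): {rates}")
```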

27

Acronym List

ACRONYM – DEFINITION
AADL – Architectural Analysis and Design Language
AC – Advisory Circular (FAA)
ACM – Association of Computing Machinery
AED – Aviation Engineering Directorate (AMRDEC)
AFTD – Aviation Flight Test Directorate (US Army)
AGC – Apollo Guidance Computer
AHS – American Helicopter Society
AIAA – American Institute of Aeronautics and Astronautics (Inc.)
AMCOM – Aviation and Missile Command (US Army)
AMRDEC – Aviation and Missiles Research, Development and Engineering Center (US Army)
AR – Army Regulation
ARINC – Aeronautical Radio, Inc.
ARP – Aerospace Recommended Practice
ASIF – Avionics Software Integration Facility
ATAM – Architecture Tradeoff Analysis Method
ATM – Air Traffic Management
AWR – Airworthiness Release
CAAS – Common Avionics Architecture System
CH-47 – Cargo Helicopter Chinook
CMM – Capability Maturity Model
CMMI – Capability Maturity Model Integration
CMU – Carnegie Mellon University
CNS – Communications, Navigation, Surveillance
CoSMIC – Component Synthesis using Model-Integrated Computing
CPS – Cyber-Physical System
CRC – Chemical Rubber Company (i.e., CRC Press)
DFBW – Digital Fly-By-Wire
DoD – Department of Defense
E3 – Electrical and Electromagnetic Effects
ESML – Embedded System Modeling Language
FAA – Federal Aviation Administration
FCS – Future Combat Systems
FHA – Functional Hazard Assessment
FMEA – Failure Modes Effects Analysis

28

Acronym List (concluded)

ACRONYM – DEFINITION
GPWS – Ground Proximity Warning System
IBM – International Business Machines
IEC – International Electrotechnical Commission
IL – Instrumentation Lab (now Draper Laboratory)
IMA – Integrated Modular Avionics
INCOSE – International Council On Systems Engineering
ISO – International Organization for Standardization
ISS – International Space Station
KAL – Korean Airlines
MISRA – Motor Industry Software Reliability Association
MIT – Massachusetts Institute of Technology
NASA – National Aeronautics and Space Administration (USA)
PDR – Preliminary Design Review
PEO – Program Element Office
PNAS – Proceedings of the National Academy of Sciences
RAQ – Rotorcraft and Aircraft Qualification
RMA – Rate Monotonic Analysis
RTC – Redstone Test Center (US Army)
RTTC – Redstone Technical Test Center (US Army)
RTCA – Radio Technical Commission for Aeronautics
SAE – Society of Automotive Engineers
SED – Software Engineering Directorate (AMRDEC)
SEES – Software Engineering Evaluation System
SEI – Software Engineering Institute (CMU)
SIL – System Integration Laboratory
SSA – System Safety Assessment
STS – Space Transportation System
SysML – Systems Modeling Language
TMR – Triple Modular Redundant
TRL – Technical Readiness Level
UAS – Unmanned Aircraft System
UH-60 – Utility Helicopter Blackhawk
UML – Unified Modeling Language
US – United States
USL – Universal Systems Language

29

References

• [1] Israel Koren and Mani Krishna, "Fault-Tolerant Systems", Morgan Kaufmann, 2007.
• [2] Jiantao Pan, "Software Reliability", Carnegie Mellon University, Spring 1999.
• [3] Nachum Dershowitz, http://www.cs.tau.ac.il/~nachumd/horror.html
• [4] http://www.air-attack.com
• [5] David A. Mindell, "Digital Apollo: Human and Machine in Spaceflight", The MIT Press, 2008.
• [6] Software Engineering Directorate, Software Engineering Evaluation System (SEES), "Program Manager Handbook for Flight Software Airworthiness", SED-SES-PMHFSA 001, December 2003.
• [7] Software Engineering Directorate, Software Engineering Evaluation System (SEES), "Program Manager Handbook for Software Safety", SED-SES-PMHSSA 001, February 2006.
• [8] AMCOM Regulation 385-17, "Software System Safety Policy", 15 March 2008.
• [9] "NASA Software Safety Guidebook", NASA-GB-8719.13, 31 March 2004.
• [10] Margaret Hamilton & William Hackler, "Universal Systems Language: Lessons Learned from Apollo", IEEE Computer Society, 2008.
• [11] Margaret Hamilton, "Full Life Cycle Systems Engineering and Software Development Environment: Development Before The Fact In Action", http://www.htius.com/Articles/Full_Life_Cycle.htm
• [12] Peter Feiler, David Gluch, John Hudak, "The Architecture Analysis & Design Language (AADL): An Introduction", CMU/SEI-2006-TN-011, February 2006.
• [13] Peter Feiler, John Hudak, "Developing AADL Models for Control Systems: A Practitioner's Guide", CMU/SEI-2007-TR-014, July 2007.
• [14] Bruce Lewis, "Using the Architecture Analysis and Design Language for System Verification and Validation", SEI presentation, 2006.
• [15] Feiler, Gluch, Hudak, Lewis, "Embedded System Architecture Analysis Using SAE AADL", CMU/SEI-2004-TN-004, June 2004.
• [16] Charles Pecheur, Stacy Nelson, "V&V of Advanced Systems at NASA", NASA/CR-2002-211402, April 2002.
• [17] Systems Integration Requirements Task Group, "ARP 4754: Certification Considerations for Highly-Integrated or Complex Aircraft Systems", SAE Aerospace, 10 April 1996.
• [18] SAE, "ARP 4761: Guidelines and Methods for Conducting the Safety Assessment Process on Civil Airborne Systems and Equipment", December 1996.
• [19] Aeronautical Radio, Inc. (ARINC), "ARINC Specification 653P1-2: Avionics Application Software Standard Interface, Part 1 – Required Services", 7 March 2006.

30

References (Continued)

• [20] Department of Defense, "MIL-STD-882D: Standard Practice for System Safety", 19 January 1993.
• [21] RTCA, Inc., "DO-178: Software Considerations in Airborne Systems and Equipment Certification", 1 December 1992.
• [22] RTCA, Inc., "DO-254: Design Assurance Guidance for Airborne Electronic Hardware", 19 April 2000.
• [23] US Army, "Aeronautical Design Standard Handbook: Rotorcraft and Aircraft Qualification (RAQ) Handbook", 21 October 1996.
• [24] Cary R. Spitzer (Editor), "Avionics: Elements, Software and Functions", CRC Press, 2007.
• [25] US Army, "Army Regulation 70-62: Airworthiness Qualification of Aircraft Systems", 21 May 2007.
• [26] US Army, "Army Regulation 95-1: Aviation Flight Regulations", 3 February 2006.
• [27] Barbacci, Clements, Lattanze, Northrop, Wood, "Using the Architecture Tradeoff Analysis Method (ATAM) to Evaluate the Software Architecture for a Product Line of Avionics Systems: A Case Study", CMU/SEI-2003-TN-012, July 2003.
• [28] Peter Feiler, "All in the Family: CAAS & AADL", CMU/SEI-2008-SR-021, August 2008.
• [29] Chrissis, Konrad, Shrum, "CMMI: Guidelines for Process Integration and Product Improvement", Pearson Education, 2007.
• [30] Brendan O'Connell, "Model Driven Performance Analysis for Avionics Systems", Draper Laboratory, January 2006.
• [31] John F. Hanaway, Robert W. Moorehead, "Space Shuttle Avionics Systems", NASA SP-504, 1989.
• [32] Lui Sha, "The Complexity Challenge in Modern Avionics Software", August 14, 2006.
• [33] "Incidents Prompt New Scrutiny of Airplane Software Glitches", Wall Street Journal, 30 May 2006.
• [34] Eyal Ophir, Clifford Nass and Anthony Wagner, "Cognitive Control in Media Multitaskers", PNAS, 20 July 2009.
• [35] Federal Aviation Administration, "Advisory Circular AC 25.1309-1A: System Design and Analysis", 21 June 1988.
• [36] Program Element Office Policy Memorandum 08-03.
• [38] National Science Foundation webpage on Cyber-Physical Systems, http://www.nsf.gov/pubs/2008/nsf08611/nsf08611.htm
• [39] Hoang Pham, "Software Reliability", Springer, 2000.
• [40] Paul Rook (editor), "Software Reliability Handbook", Elsevier Science Publishers Ltd., 1990.

31

References (Concluded)

• [41] http://www.navair.navy.mil/v22/index.cfm?fuseaction=newsdetail&id=128
• [42] http://mars8.jpl.nasa.gov/msp98/news/mco990930.html
• [43] John Garman, "The Bug Heard Around the World", ACM SIGSOFT, October 1981.
• [44] "Mars Pathfinder Mission Status", July 15, 1997, http://marsprogram.jpl.nasa.gov/MPF/newspio/mpfstatus/pf970715.html
• [45] Nancy Leveson, "Safeware: System Safety and Computers", Addison-Wesley Publishing Company, 1995.
• [46] http://www.electronicaviation.com/aircraft/JAS-39_Gripen/810
• [47] Brandon Hill, "Lockheed's F-22 Raptor Gets Zapped by International Date Line", DailyTech LLC, February 26, 2007, http://www.freerepublic.com/focus/f-news/1791574/posts
• [48] http://www.military.com/news/article/human-error-cited-in-most-uav-crashes.html
• [49] Daniel Michaels and Andy Pasztor, "Incidents Prompt New Scrutiny Of Airplane Software Glitches: As Programs Grow Complex, Bugs Are Hard to Detect; A Jet's Roller-Coaster Ride; Teaching Pilots to Get Control", Wall Street Journal, May 30, 2006.

19

Analytical Models and Reliability

bull Analytical models of hardware reliability are well understood

bull Architecture modeling and software reliability modeling is not a novel idea but is highly debated

ndash There are many approaches and little consensus as to best wayndash Many models (Jelinski-Moranda Littlewood-Verrall Musa-Okumoto etc) [1]ndash Many tools (over 200+ tools since 1970s have been built) [2]

bull Predictability of software reliability is of great concern because it is a major contributor to unreliability[2]

bull Software Reliability is the probability of error-free operation of a computer program in a specified environment for a specified time [1]

bull Need basis for setting reliability figures based on previous systems and iteratively refine those figures in the future

bull NOT A REPLACEMENT FOR TESTING AND VERIFICATION

Presenter
Presentation Notes
As mentioned in the introduction the overall goal of this paper is to state the need for an industry standard method for qualifying complex integrated systems to a specified reliability While not suggesting as solution in total better modeling practices should be considered toward a solution to bridge the gap between design test and implementation A method to model the architecture early in the requirements establishment phase follow the detailed design and coding and then be able to verify the system along with the model may be a path to greater confidence in the system and reduce the risks warnings and cautions that must be issued13In order to achieve this the systems must be broken down to component levels and built up to subsystem and system levels An overall aggregated system reliability value should result (see Figure 9) The goal should be to establish the ability to assess the reliability from a component subsystem and then a system level with each phase working toward a higher Technical Readiness Level (TRL) The end result would be fed into the accepted Type Certificate (TC) or AWR 13To achieve this goal modeling and analysis tools that follow a standard for modeling reliability should exist As previously stated over 200 tools have been created since 1970s Here is a list of a few of the current tools13Universal Systems Language (USL)13Unified Modeling Language (UML)13Systems Modeling Language (SysML)13MATLABSimulink13Telelogic Rhapsody13MathCAD13Colored Petri Nets13Rate Monotonic Analysis (RMA)13STATEMATE (Used by Airbus)13Standard for the Development of Safety-Critical Embedded Software (SCADE)13OPNET13Embedded System Modeling Language (ESML)13Component Synthesis using Model-Integrated Computing (CoSMIC)13Architectural Analysis and Design Language (AADL)13By no means is this list complete Typically different companies and projects address this challenge and choose unit tools to perform the upfront analysis and modeling not following a standard approach Multiple tools need to converge or be compatible with the framework of a common tool such as AADL The resultant tool(s) could be used to verify the requirements up front to mitigate reliability issues down the life-cycle 13

20

Tools for Modeling and Analysis

bull Universal Systems Language (USL)bull Unified Modeling Language (UML)bull Systems Modeling Language (SysML)bull MATLABSimulinkbull Telelogic Rhapsodybull MathCadbull Colored Petri Netsbull Rate Monotonic Analysis (RMA)bull STATEMATE (Used by Airbus)bull SCADEbull OPNETbull Embedded System Modeling Language (ESML)bull Component Synthesis using Model-Integrated Computing (CoSMIC)bull Architectural Analysis and Design Language (AADL)bull At least 200+ more packages since the 70rsquosbull Certified tools needs to converge to an accepted standard

modelinganalysis method for complex system reliability

Presenter
Presentation Notes
Typically different companies and projects address the challenge and choose unit tools to perform the upfront analysis and modeling not following a guideline approach Multiple tools need to converge or be compatible with a set modeling standard for complex systems The compliant tools could be used to verify the requirements up front to mitigate reliability issues down the life-cycle A notional approach would follow that shown in Figure 8 where the system V is followed but Architectural Modeling and Analysis go parallel with the real development effort This would allow reliability to be measured during the design phase and measured during the implementation test and verification phase using the model The down side to modeling from certain circles is getting people to believe those models How do you certify a modeling tool and the actual models within the tools Those issues should be addressed going forward

21

Modification to Acquisition Model

Requirements Establishment

High Level Design

Detailed Specifications

Implementation Coding

Verification

Development TestingA

rchi

tect

ural

Mod

el amp

A

naly

sis

Propose standard modeling methodology to be applied at different phases of development to enhance requirements development reliability allocation reliability

measurement and testing (DISCLAIMER DOES NOT REPLACE TESTING)

Reliabilityallocated Reliability

measured

Operational Testing amp Validation

Deployed System

Presenter
Presentation Notes
A notional approach would follow that shown in Figure 10 where the system V is followed but Architectural Modeling and Analysis go parallel with the real development effort This would allow reliability to be measured during the design phase and measured during the implementation test and verification phase using the model This approach could bridge the design and test phases together It is emphasized here that the architectural model would not replace critical testing but augments the process to allow for better requirement identification and verification Thorough ground and flight tests should never be replaced by modeling Modeling would only allow for more robust and a higher level of confidence in the requirements and design The model could be used in conjunction with the testing to confirm the design Proper modeling and analysis would reduce total program costs by enforcing more complete and correct initial requirements which reduce issues discovered down the road in testing that are expensive to fix or impossible to fix and having to accept high risks Additionally if the model is maintained and optimized then it could possibly be used after system deployment to analyze impacts of upgrades or changes to the system allowing for more complete analysis and reduce overall system redesign costs1313A hurdle to cross with modeling and analysis is convincing people to believe those models Some method to certify these models and modeling tools should be addressed in the future Standards should be set in place for correct modeling techniques for complex systems Lastly consideration of standard verification checking tools should be made such as with the use of the Motor Industry Software Reliability Association (MISRA) compliance verification tool for the use of C in safety critical systems

22

Systems Reliability Standard Establishment

bull Establish a working group to define this standardndash Need a technical society to lead the charge on this

bull Collaborate with industry academia military and societiesndash Focus on development of a reliability standard with AWR safety in mindndash Draw upon the experiences to feed into this standard

bull Study existing and previous complex systemsndash Shuttle Space Station missile systems nuclear submarine and ship

systems nuclear control systems military and commercial jet systems ndash Obtain software reliability information from given existing and previous

systemsndash Build database which would serve as basis for future reliability

bull Research prior efforts in complex systems analysis

bull Establish consensus based modeling and analysis method

Presenter
Presentation Notes
In conclusion methods for achieving a design for complex systems do exist however achieving reliability and attaining a level of qualification that would permit better AWRs does not There needs to be a standard developed to tackle this issue rather than relying on current methods to ascertain airworthiness of complex systems This will not occur overnight An orchestrated collaboration among industry academia military labs and technically professional societies to focus on development of this standard should allow us to draw upon the experiences to feed into this reliability standard with AWR safety in mind We have a long living experiment with complex software systems on the Space Transportation System (STS) International Space Station (ISS) missile systems nuclear submarine and ship systems nuclear control systems military and commercial jet systems from which we should be able to obtain at least inferred software reliability information from the architecture and run time that the systems use We should look at the lessons learned from these systems to see what could have been done to improve and what was done right that should be carried forward The challenge is collecting this information into a central database and arriving at some figure of reliability from previous systems where data exists This would at least provide a starting point to allow initial assessments and could be optimized in the future Also this is not the only study for establishing reliability metrics to complex software systems There have been research projects of similarity to this effort that have risen and fallen The data from those projects should not be wasted but studied to feed into whatever standard that is developed While historical information would be useful each design is unique and requires tools to accomplish the design Investigation of architectural modeling constructs should be further investigated as a possible augmentation to the design and test process We need to determine which forum is best to conduct this effort (eg SAE IEEE AIAA ACM AHS INCOSE or other) As stated in the paper ldquoSpace Shuttle Avionicsrdquo [31] ldquoThe designers the flight crew and other operational users of the system often have a mindset established in a previous program or experience which results in a bias against new or different lsquounconventionalrsquo approachesrdquo If nothing is done to address this problem it will only get worse over time It is past time to address the issue of reliability of complex systems and software

23

BACKUP SLIDES

BACKUP SLIDES

24

PEO Aviation System Safety Management Decision Authority Matrix

Severity(Most Credible)

FrequentP gt 1E-3

A

Probable 1D-4 lt P lt= 1E-3

B

Occasional1E-5 lt P lt= 1E-4

C

Remote1E-6 lt P lt= 1E-5

D

Improbable1E-7 lt P lt= 1E-6

E

Catastrophic1

Critical 2

Marginal 3

Negligible4

Army Acquisition

PEO Aviation

ProgramManagement

HazardCategory

Description

1 Catastrophic Death or permanent total disability system loss

2 Critical Severe injury or minor occupational illness (no permanent effect) minor system or environmental damage

3 Marginal Minor injury or minor occupational illness (no permanent effect) minor system or environmental damage


12

Reliability vs Complexity & Cost vs Complexity

Notional Graphs

• Reliability vs. Complexity  • Cost & Schedule vs. Complexity

[Two notional plots: reliability (y-axis) versus complexity (x-axis), with a desired reliability level marked, and cost & schedule (y-axis) versus complexity, with an optimum point marked. Aggregation of part reliability feeds into overall system reliability.]

Presenter
Presentation Notes
Avionics parts and software constantly change over the life of a program. Spiral development is typical for complex software, which means qualification is required frequently. This raises the question of how to streamline the process so that a complete re-qualification can be avoided.
With these complex systems there are other hurdles to cross, such as fully characterizing the system and conducting the Functional Hazard Assessments (FHAs) and Failure Mode, Effects and Criticality Analyses (FMECAs). It is crucial for the safety assessment that these are conducted correctly, to fully understand the risks and later perform the correct tests at the right level.
Additionally, once the complex component hardware and software are integrated, still other problems crop up. At that point the disparity of teams spanning multiple contractors and development groups becomes obvious and, if not properly coordinated, can impact the schedule.
Other programmatic problems affect complex system development and qualification. For instance, a lack of schedule and funding resources leads to inadequate compliance with specifications and requirements, short-circuiting the systems engineering process. The pool of trained engineers available to support the development and test of such systems is ever decreasing. Non-technical political influences sometimes affect the reliability of complex systems and are difficult to avoid. Lastly, there is no centralized database capturing the various families of systems that have been built, along with their record of successes and failures. Such a database, covering past and present government complex systems, could be valuable in establishing a reliability basis for future models.

13

A Few Examples of Complex Systems

• This is not a new problem. Others have struggled with the challenges of establishing confidence in complex systems:

– NASA: Apollo Guidance Computer; Dryden F8 Crusader; Space Shuttle; International Space Station

– Commercial airliners: Airbus A320 and later models; Boeing 777, 787

– Military: ships and submarines; jets (F14, F15, F16, F18, F22, F35, etc.); cargo planes (C130J Hercules, C17 Globemaster, etc.); helicopters (Chinook, Blackhawk, Sea Stallion, etc.); rockets; unmanned aerial systems; unmanned ground systems; unmanned submarine systems. Photos by US Army, NASA, US Navy, and US Air Force

Presenter
Presentation Notes
As mentioned earlier, complex avionics systems are not a new idea. Complex avionic architectures have existed since the early 1960s, beginning with the Apollo/Saturn program. The Massachusetts Institute of Technology (MIT) Instrumentation Lab (IL), now Draper Laboratory, and International Business Machines (IBM) led the way with the MIT/IL Apollo Guidance Computer (AGC) and the Saturn V IBM triple modular redundant (TMR) voting guidance computer system. The word "software" had not even been coined at the time, but engineers such as Margaret Hamilton, MIT/IL Director of Apollo On-board Software, can attest that some of the same issues with creating reliable software then still exist today [5]. A large majority of the issues then dealt with communication between systems engineers and programmers: requirements were thrown over the wall without confirmation that they were complete, and many issues surfaced as interface problems. Identifying these issues prompted Hamilton to found her own company and create a modeling language called Universal Systems Language (USL) to head off the problems experienced with Apollo [11]. Some 200-plus modeling programs have been developed since Apollo and used to mitigate issues and increase confidence in systems of varying complexity.
As time progressed, other systems came along. The NASA Dryden F8 Crusader was the first digital fly-by-wire (DFBW) jet aircraft that relied heavily on complex IMA and software for flight control. The Space Transportation System (STS) shuttle includes a quad modular redundant system with a fifth backup flight computer containing uncommon code. US Air Force and Navy airplanes with complex or redundant IMA configurations include the F14 Tomcat, F15 Eagle, F16 Falcon, F18 Hornet, F22 Raptor, F35 Joint Strike Fighter, F117 Nighthawk, V22 Osprey, C17 Globemaster, and many more, along with recent Unmanned Air Vehicle Systems (UAVS). US Army complex systems on helicopters include the RAH-66 Comanche DFBW triple modular redundant (TMR) architecture, the glass cockpit avionics on the UH-60M Blackhawk baseline, the Common Avionics Architecture System (CAAS) glass cockpit on the UH-60M Blackhawk modernization and CH-47F Chinooks, and other aircraft.
Additionally, there are many self-checking-pair engine controller systems, along with system-of-systems Future Combat Systems (FCS) and Unmanned Air Vehicle Systems (UAVS). Complexity has also permeated the commercial airliner market with the Airbus A320 and later Airbus models, the Boeing 777, and the Boeing 787. With this ever-increasing technology, something must be done about the reliability issue. With such a wealth of data on aviation and non-aviation cyber-physical systems, such as submarine, ship, nuclear, medical, locomotive, and automotive systems, there should be adequate information to get a start on modeling systems correctly for reliability. This is not a problem isolated to avionics, and other disciplines should aid in resolving it.

14

Some Complex System Failures

• V-22 Osprey crashes
• Mars Climate Orbiter crash
• Mars Pathfinder software reset
• USS Vincennes downing of an Iran Air Airbus A300
• Therac-25 software radiation treatment failure
• 1989 Airbus A320 air show crash
• China Airlines Airbus Industries A300 crash
• Ariane 5 satellite launcher malfunction
• Failure of the primary flight system to sync with the backup during prelaunch of STS-1
• Mexicana Airlines Boeing 727 crash into a mountain because the software did not correctly determine the mountain's position
• Loss of the first American probe to Venus
• Korean Airlines KAL 801 accident
• Soviet Phobos I Mars probe lost
• Three Mile Island
• F-18 fighter plane crash due to a bad exception
• F-14 fighter plane lost to an uncontrollable spin
• Swedish Gripen prototype crash
• Swedish Gripen air-show crash
• F-22 failure crossing the IDL
• 2006 German-Spanish Barracuda UAV crash
• 2004 F/A-22 Raptor stealth fighter jet crash
• F/A-22 Raptor navigation system software error at Nellis AFB
• 50 cockpit blackouts on the A320
• A320 multiple avionics and electrical failures at Newark, NJ
• Boeing 777 Malaysian Airlines jetliner's nightmarish 3,000-foot autopilot rollercoaster ride
• US Army and Air Force UAV crashes

• ... and many more ...

Presenter
Presentation Notes
Multiple crashes have occurred with the V-22 Osprey [41].
In 1999, the Mars Climate Orbiter crashed because of incorrect units in a program, caused by poor systems engineering practices [42, 44].
In 1988, an Iran Air Airbus was shot down by the USS Vincennes because of cryptic and misleading output displayed by the tracking software [3].
In 1989, an Airbus A320 crashed at an air show due to altitude indication and software handling [3].
In 1994, a China Airlines Airbus Industries A300 crashed, killing 264, from faulty software [3].
In 1996, the first Ariane 5 satellite launcher destruction mishap was caused by a software design error: a few lines of Ada code containing unprotected variables. The horizontal velocity of the Ariane 5 exceeded that of the Ariane 4, and the guidance system veered the rocket off course. Insufficient testing did not catch this error, which was a carry-over from Ariane 4 [3, 39].
In 1986, a Mexicana Airlines Boeing 727 crashed into a mountain because the software did not correctly determine the mountain's position [39].
In 1986, the Therac-25 radiation therapy machines overdosed cancer patients due to a flaw in the computer program controlling the highly automated devices [3, 39, 45].
During the maiden launch of the Space Shuttle Columbia (STS-1) in 1981, the primary flight computer system failed to establish sync with the backup during prelaunch [43].
On December 10, 1990, the Space Shuttle Columbia had to land early due to computer software problems [39].
In 1997, the Mars Pathfinder software reset problem arose from latent task execution caused by priority inversion on a mutex [3, 44].
An error in a single FORTRAN statement resulted in the loss of the first American probe to Venus (G. J. Myers, Software Reliability: Principles & Practice, p. 25) [3].
In August 1997, the Korean Airlines KAL 801 accident in Guam killed most of the 254 aboard; a worldwide bug was discovered in barometric altimetry in the Ground Proximity Warning System (GPWS) (ACM SIGSOFT Software Engineering Notes, vol. 23, no. 1) [3].
The Soviet Phobos I Mars probe was lost due to a faulty software update, at a cost of 300 million rubles; its disorientation broke the radio link, and the solar batteries discharged before reacquisition (Aviation Week, 13 Feb 1989) [3].
An F-18 crashed due to a missing exception condition (ACM SIGSOFT Software Engineering Notes, vol. 6, no. 2) [3].
An F-14 was lost to an uncontrollable spin traced to tactical software (ACM SIGSOFT Software Engineering Notes, vol. 9, no. 5) [3].
In 1989, a Swedish Gripen prototype crashed due to software in its digital fly-by-wire system [3, 46].
In 1995, another Gripen crashed during an air show, caused by a software issue [3, 46].
On February 11, 2007, twelve F/A-22 Raptors were forced to head back to Hawaii when a software bug caused a computer crash as they crossed the International Date Line. In the middle of the ocean, their navigation, fuel, and part of their communications systems failed, and all attempts to reboot failed [47].
In February 2006, the German-Spanish unmanned combat air vehicle Barracuda crashed due to a software failure [4].
In December 2004, a glitch in the flight control software probably caused an F/A-22 Raptor to crash on takeoff at Nellis Air Force Base [4].
In 2008, a United Airbus A320, registration N462UA, experienced multiple avionics and electrical failures, including loss of all communications, shortly after rotation while departing Newark Liberty International Airport in Newark, New Jersey [NTSB Report Identification DCA08IA033].
In 2006, a Boeing 777 Malaysian Airlines jetliner's autopilot caused a stall by climbing 3,000 feet. The pilots struggled to nose the plane down, plunged into a steep dive, then pulled back up and regained control. The cause was defective flight software providing incorrect airspeed and acceleration data, confusing the flight computers and initially ignoring the pilot's commands [49].
US Army and Air Force UAVs have crashed from control system or human error [48].
New Jersey [NTSB Report Identification DCA08IA033] 13In 2006 a Boeing 777 Malaysian Airlines jetlinerrsquos autopilot caused a stall to occur by climbing 3000 feet Pilots struggled to nose down the plane but plunged into a steep dive After pulling back up the pilots regained control Cause was defective flight software providing incorrect data for airspeed and acceleration confusing the flight computers and initially ignoring the pilotrsquos commands[49]13US Army and Air Force UAV crashes from control system or human error13

15

Lessons Learned from Failures

• From Nancy Leveson's paper "The Role of Software in Spacecraft Accidents":
– "Flaws in the safety culture, diffusion of responsibility and authority
– Limited communication channels and poor information flow
– Inadequate system and software engineering
– Poor or missing specifications
– Unnecessary complexity and software functionality
– Software reuse or changes without appropriate safety analysis
– [Shortcomings] in safety engineering practices
– Flaws in test and simulation environments
– Inadequate human factors design for software"

Presenter
Presentation Notes
In Dr. Nancy Leveson's paper "The Role of Software in Spacecraft Accidents," she cited problems with software development on various projects within NASA. According to Dr. Leveson, there were "flaws in the safety culture, diffusion of responsibility and authority, limited communication channels and poor information flow, inadequate system and software engineering, poor or missing specifications, unnecessary complexity and software functionality, software reuse or changes without appropriate safety analysis, violation of basic safety engineering practices, inadequate system safety engineering, flaws in test and simulation environments, and inadequate human factors design for software." While these problems were identified for spacecraft development within NASA and corrected, aviation in general could learn from these lessons to mitigate issues with complex systems development.

16

Some Current Guidelines

• DO-178B – Software Considerations in Airborne Systems and Equipment Certification
• DO-248B – Final Report for Clarification of DO-178B
• DO-278 – Guidelines for Communications, Navigation, Surveillance and Air Traffic Management (CNS/ATM) Systems Software Integrity Assurance
• DO-254 – Design Assurance Guidance for Airborne Electronic Hardware
• DO-297 – Integrated Modular Avionics (IMA) Development Guidance and Certification Considerations
• SAE ARP4754 – Certification Considerations for Highly-Integrated or Complex Aircraft Systems
• SAE ARP4761 – Guidelines and Methods for Conducting the Safety Assessment Process on Civil Airborne Systems and Equipment
• FAA Advisory Circular AC 27-1B – Certification of Normal Category Rotorcraft
• FAA Advisory Circular AC 29-2C – Certification of Transport Category Rotorcraft
• ISO/IEC 12207 – Software Life Cycle Processes
• ARINC 653 – specification standard for time and space partitioning (a toy illustration follows the presenter notes below)
• MIL-STD-882D – DoD Standard Practice for System Safety
• ADS-51-HDBK – Rotorcraft and Aircraft Qualification Handbook
• AR 70-62 – Airworthiness Release standard
• SED-SES-PMHFSA 001 – Software Engineering Directorate (SED) Software Engineering Evaluation System (SEES) Program Manager Handbook for Flight Software Airworthiness
• SED-SES-PMHSS 001 – SED SEES Program Manager Handbook for Software Safety

WHAT'S MISSING – a Reliability Standard for Complex Systems

Presenter
Presentation Notes
The problems previously stated drove the development of these guidelines; however, there is no standard for system reliability that includes software. Other standards and circulars pertain to complex systems, but a reliability standard is missing: one that would outline the process for establishing cyber-physical system reliability. Such a standard should indicate how to model, analyze, and ascertain the projected level of reliability.
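Since ARINC 653's time and space partitioning recurs throughout these guidelines, a toy sketch of the time-partitioning idea may help. This is a conceptual illustration only, not the ARINC 653 APEX API; the partition names and window sizes are made up:

```python
# Toy illustration of ARINC 653-style time partitioning: a fixed major frame
# is split into partition windows that repeat cyclically, so a fault in one
# partition cannot steal CPU time from another. All values are hypothetical.
MAJOR_FRAME_MS = 100
SCHEDULE = [("flight_controls", 40), ("navigation", 30),
            ("displays", 20), ("maintenance", 10)]

# The schedule must fill the major frame exactly; no partition may overrun.
assert sum(ms for _, ms in SCHEDULE) == MAJOR_FRAME_MS

def run_major_frames(n_frames):
    t = 0
    for frame in range(n_frames):
        for partition, window_ms in SCHEDULE:
            # Each partition runs only inside its pre-allocated window.
            print(f"t={t:4d} ms  frame {frame}: {partition} runs for {window_ms} ms")
            t += window_ms

run_major_frames(2)
```

The point of the fixed cyclic schedule is determinism: every partition's worst-case access to the processor is known at design time, which is what makes partitioned IMA systems analyzable.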

17

Certification Assessment Considerations

• Sufficient data and time must be available for airworthiness evaluation

• Certification process:
– Currently lengthy
– Depends heavily on human interpretation, trade-offs, and risk mitigation
– Overwhelming for complex integrated systems (FHAs, FTAs, FMECAs, risk mitigation, etc.)

• A consistent, industry-wide method is needed to assess a system at any stage of the life-cycle and allow tradeoffs of design alternatives

• The certification tasks outlined in DO-297 should be considered:
– Task 1: Module acceptance
– Task 2: Application software/hardware acceptance
– Task 3: IMA system acceptance
– Task 4: Aircraft integration of the IMA system, including V&V
– Task 5: Change of modules or applications
– Task 6: Reuse of modules or applications

Presenter
Presentation Notes
In order to execute an AWR, sufficient data must be provided to the appropriate airworthiness authorities to demonstrate that safety requirements are met to the levels in the System Safety Assessment (SSA) or equivalent. The certification process is currently lengthy and depends on much human interpretation of the myriad complex architecture functions.
Current guidelines such as DO-178B, DO-254, DO-297, SAE ARP4754, and SAE ARP4761, along with many others, outline the proper steps that should be taken. System safety management's military standard is MIL-STD-882, which has been in use for decades. Civilian safety standards for the aviation industry include SAE ARP4754, which shows the incorporation of system safety activities into the design process and provides guidance on techniques to ensure a safe design. SAE ARP4761 contains significant guidance on how to perform the system safety activities described in SAE ARP4754. DO-178B outlines development assurance guidance for aviation software based on the failure condition categorization of the functionality provided by the software; DO-254 embodies similar guidance for aviation hardware. ARINC 653 is a widely accepted standard for ensuring time and space partitioning of software. DO-297 does an excellent job of describing the certification tasks for an IMA system, which include:
Task 1: Module acceptance
Task 2: Application software/hardware acceptance
Task 3: IMA system acceptance
Task 4: Aircraft integration of IMA systems, including verification and validation
Task 5: Change of modules or applications
Task 6: Reuse of modules or applications
Taken together, these standards provide guidance that, if followed, will likely result in safe, highly reliable, and cost-effective systems over the life-cycle of the system. Yet while these guidelines exist, there is no consistent industry-wide method to assess a system at any stage of the life-cycle to allow a tradeoff of design alternatives. Nor is there a standard defining overall reliability for a system, including both hardware and software reliability. To achieve this level of reliability, a standard should be developed that defines the process and method for arriving at a quantifiable reliability number, which would in turn lead to acceptance.

18

Definition of Complexity and Reliability is Needed

[Notional diagram: Components 1 through 4, each characterized by complexity fundamentals and reliability parametrics (TRL 3 or 4), integrate into Subsystems 1 and 2, each characterized by system integration of components and reliability dependencies (TRL 6 or 7). Subsystem integration yields the realized system, characterized by reliability sensitivities (TRL 8 or 9), producing a highly reliable complex system and a certificate (e.g., AWR).]

Presenter
Presentation Notes
As mentioned in the introduction, the overall goal of this paper is to state the need for an industry standard method for qualifying complex integrated systems to a specified reliability. To achieve this, systems must be broken down to the component level and built up to the subsystem and system levels, producing an overall aggregated system reliability value. The goal should be the ability to assess reliability at the component, subsystem, and then system level, with each phase working toward a higher Technology Readiness Level (TRL). The end result would feed into the accepted Type Certificate (TC) and/or AWR.
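To make the component-to-subsystem-to-system rollup in the diagram concrete, here is a minimal sketch using the standard series/parallel reliability formulas. The component values and groupings are illustrative assumptions, not figures from the paper:

```python
# Sketch: aggregating part reliabilities into subsystem and system reliability,
# in the spirit of the component -> subsystem -> system diagram above.
# All names and numbers are hypothetical.

def series(rels):
    """All parts must work: R = product of the R_i."""
    out = 1.0
    for r in rels:
        out *= r
    return out

def parallel(rels):
    """Redundant parts, any one suffices: R = 1 - product of (1 - R_i)."""
    out = 1.0
    for r in rels:
        out *= (1.0 - r)
    return 1.0 - out

# Component reliabilities over one mission (hypothetical parametrics).
subsystem1 = series([0.9995, 0.9990])        # components 1 and 2 in series
subsystem2 = parallel([0.995, 0.995])        # components 3 and 4 redundant
system = series([subsystem1, subsystem2])    # integrated, realized system

print(f"Subsystem 1: {subsystem1:.6f}")
print(f"Subsystem 2: {subsystem2:.6f}")
print(f"System:      {system:.6f}")
```

The same two formulas, applied recursively, let a reliability figure be rolled up or sensitivity-checked at any level of the integration hierarchy.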

19

Analytical Models and Reliability

• Analytical models of hardware reliability are well understood

• Architecture modeling and software reliability modeling are not novel ideas, but they are highly debated:
– There are many approaches and little consensus as to the best way
– Many models exist (Jelinski-Moranda, Littlewood-Verrall, Musa-Okumoto, etc.) [1] (see the fitting sketch below)
– Many tools exist (over 200 tools have been built since the 1970s) [2]

• Predictability of software reliability is of great concern because software is a major contributor to system unreliability [2]

• Software reliability is the probability of error-free operation of a computer program in a specified environment for a specified time [1]

• A basis is needed for setting reliability figures from previous systems, with those figures iteratively refined in the future

• NOT A REPLACEMENT FOR TESTING AND VERIFICATION
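As a concrete taste of the model families named above, here is a minimal sketch of fitting the Jelinski-Moranda model [1] to inter-failure times by profiling the likelihood over the initial fault count. The failure data are made up; real programs would demand far more care in data collection and model selection:

```python
# Sketch: maximum-likelihood fit of the Jelinski-Moranda software reliability
# model. JM assumes N initial faults, each failure removes one fault, and the
# hazard during interval i is phi * (N - i + 1).
import math

def fit_jelinski_moranda(times, n_max=500):
    """Fit JM to inter-failure times t_1..t_n; returns (N, phi) estimates."""
    n = len(times)
    best = None
    for N in range(n, n_max + 1):
        # enumerate gives i = 0..n-1, so the multiplier N - (i+1) + 1 = N - i.
        weight = sum((N - i) * t for i, t in enumerate(times))
        phi = n / weight                      # closed-form MLE of phi for this N
        ll = sum(math.log(phi * (N - i)) - phi * (N - i) * t
                 for i, t in enumerate(times))
        if best is None or ll > best[0]:
            best = (ll, N, phi)
    # If the maximum sits at n_max, a finite MLE may not exist for these data.
    return best[1], best[2]

# Made-up inter-failure times (hours); lengthening gaps suggest reliability growth.
times = [5.0, 7.5, 10.2, 16.0, 25.3, 41.8, 70.1]
N, phi = fit_jelinski_moranda(times)
remaining = N - len(times)
mttf_next = 1.0 / (phi * remaining) if remaining > 0 else float("inf")
print(f"Estimated initial faults N = {N}, phi = {phi:.5f}")
print(f"Faults remaining = {remaining}, expected time to next failure = {mttf_next:.1f} h")
```

This illustrates why such models are debated: the estimates are sensitive to the assumed fault-removal behavior, which is exactly the kind of assumption a standard would need to pin down.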

Presenter
Presentation Notes
As mentioned in the introduction, the overall goal of this paper is to state the need for an industry standard method for qualifying complex integrated systems to a specified reliability. While not suggesting a total solution, better modeling practices should be considered as part of a solution to bridge the gap between design, test, and implementation. A method to model the architecture early in the requirements establishment phase, follow the detailed design and coding, and then verify the system along with the model may be a path to greater confidence in the system, reducing the risks, warnings, and cautions that must be issued.
To achieve this, systems must be broken down to the component level and built up to the subsystem and system levels, producing an overall aggregated system reliability value (see Figure 9). The goal should be the ability to assess reliability at the component, subsystem, and then system level, with each phase working toward a higher Technology Readiness Level (TRL). The end result would feed into the accepted Type Certificate (TC) or AWR.
To achieve this goal, modeling and analysis tools that follow a standard for modeling reliability should exist. As previously stated, over 200 tools have been created since the 1970s; a few of the current ones are Universal Systems Language (USL), Unified Modeling Language (UML), Systems Modeling Language (SysML), MATLAB/Simulink, Telelogic Rhapsody, MathCAD, Colored Petri Nets, Rate Monotonic Analysis (RMA), STATEMATE (used by Airbus), the Standard for the Development of Safety-Critical Embedded Software (SCADE), OPNET, Embedded System Modeling Language (ESML), Component Synthesis using Model-Integrated Computing (CoSMIC), and the Architectural Analysis and Design Language (AADL). By no means is this list complete. Typically, different companies and projects address this challenge by choosing their own tools for up-front analysis and modeling, not following a standard approach. Multiple tools need to converge, or be compatible with the framework of a common tool such as AADL. The resultant tool(s) could be used to verify the requirements up front, to mitigate reliability issues down the life-cycle.

20

Tools for Modeling and Analysis

• Universal Systems Language (USL)
• Unified Modeling Language (UML)
• Systems Modeling Language (SysML)
• MATLAB/Simulink
• Telelogic Rhapsody
• MathCAD
• Colored Petri Nets
• Rate Monotonic Analysis (RMA)
• STATEMATE (used by Airbus)
• SCADE
• OPNET
• Embedded System Modeling Language (ESML)
• Component Synthesis using Model-Integrated Computing (CoSMIC)
• Architectural Analysis and Design Language (AADL)
• At least 200 more packages since the 1970s
• Certified tools need to converge to an accepted standard modeling/analysis method for complex system reliability

Presenter
Presentation Notes
Typically, different companies and projects address the challenge by choosing their own tools to perform the up-front analysis and modeling, not following a guideline approach. Multiple tools need to converge, or be compatible with a set modeling standard for complex systems. The compliant tools could be used to verify the requirements up front, to mitigate reliability issues down the life-cycle. A notional approach would follow that shown in Figure 8, where the system V is followed but architectural modeling and analysis runs parallel with the real development effort. This would allow reliability to be allocated during the design phase and measured, using the model, during the implementation, test, and verification phase. The downside to modeling, in certain circles, is getting people to believe the models: how do you certify a modeling tool, and the actual models within it? Those issues should be addressed going forward.

21

Modification to Acquisition Model

[Notional V-model: Requirements Establishment → High-Level Design → Detailed Specifications → Implementation/Coding → Verification → Development Testing → Operational Testing & Validation → Deployed System, with an Architectural Model & Analysis activity running in parallel with the whole V. Reliability is allocated on the design leg and measured on the test leg.]

Proposed: a standard modeling methodology applied at different phases of development to enhance requirements development, reliability allocation, reliability measurement, and testing (DISCLAIMER: DOES NOT REPLACE TESTING).

Presenter
Presentation Notes
A notional approach would follow that shown in Figure 10, where the system V is followed but architectural modeling and analysis runs parallel with the real development effort. This would allow reliability to be allocated during the design phase and measured, using the model, during the implementation, test, and verification phase, bridging the design and test phases together. It is emphasized here that the architectural model would not replace critical testing; it augments the process to allow better requirement identification and verification. Thorough ground and flight tests should never be replaced by modeling. Modeling only allows for more robust requirements and a higher level of confidence in the requirements and design, and the model could be used in conjunction with testing to confirm the design. Proper modeling and analysis would reduce total program costs by enforcing more complete and correct initial requirements, which reduces the issues discovered later in testing that are expensive or impossible to fix and that force the acceptance of high risks. Additionally, if the model is maintained and optimized, it could be used after system deployment to analyze the impacts of upgrades or changes, allowing more complete analysis and reducing overall system redesign costs.
A hurdle to cross with modeling and analysis is convincing people to believe the models. Some method to certify these models and modeling tools should be addressed in the future, and standards should be set in place for correct modeling techniques for complex systems. Lastly, standard verification checking tools should be considered, such as Motor Industry Software Reliability Association (MISRA) compliance verification tooling for the use of C in safety-critical systems.
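A toy sketch of the "reliability allocated / reliability measured" loop described above; the subsystem names, budgets, and measured rates are hypothetical:

```python
# Sketch: equal apportionment of a system failure-rate budget across
# subsystems on the design leg, compared against measured rates on the test
# leg. All numbers are made up for illustration.

SYSTEM_BUDGET = 1e-6   # allowed failures per flight hour (tactical target [36])
subsystems = ["flight control", "displays", "navigation", "communications"]

allocated = SYSTEM_BUDGET / len(subsystems)   # per-subsystem design allocation
measured = {"flight control": 1.1e-7, "displays": 3.0e-7,
            "navigation": 2.2e-7, "communications": 2.4e-7}

for name in subsystems:
    status = "OK" if measured[name] <= allocated else "OVER BUDGET"
    print(f"{name:15s} allocated {allocated:.1e}  measured {measured[name]:.1e}  {status}")

total = sum(measured.values())
print(f"system measured {total:.1e} vs budget {SYSTEM_BUDGET:.1e} "
      f"-> {'meets' if total <= SYSTEM_BUDGET else 'misses'} target")
```

Note that one subsystem can exceed its equal-share allocation while the system total still meets the budget; surfacing exactly these tradeoffs during design, rather than in qualification testing, is the point of running the model alongside the V.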

22

Systems Reliability Standard Establishment

• Establish a working group to define this standard
– A technical society is needed to lead the charge

• Collaborate with industry, academia, military, and societies
– Focus on development of a reliability standard with AWR safety in mind
– Draw upon collective experience to feed into this standard

• Study existing and previous complex systems
– Shuttle, Space Station, missile systems, nuclear submarine and ship systems, nuclear control systems, military and commercial jet systems
– Obtain software reliability information from existing and previous systems
– Build a database to serve as a basis for future reliability

• Research prior efforts in complex systems analysis

• Establish a consensus-based modeling and analysis method

Presenter
Presentation Notes
In conclusion, methods for designing complex systems do exist; however, achieving reliability and attaining a level of qualification that would permit better AWRs does not. A standard needs to be developed to tackle this issue, rather than relying on current methods to ascertain airworthiness of complex systems. This will not occur overnight. An orchestrated collaboration among industry, academia, military labs, and professional technical societies, focused on development of this standard, should allow us to draw upon collective experience to feed into a reliability standard with AWR safety in mind. We have a long-lived experiment with complex software systems in the Space Transportation System (STS), International Space Station (ISS), missile systems, nuclear submarine and ship systems, nuclear control systems, and military and commercial jet systems, from which we should be able to obtain at least inferred software reliability information from the architectures and run time these systems use. We should look at the lessons learned from these systems to see what could have been improved and what was done right that should be carried forward. The challenge is collecting this information into a central database and arriving at some figure of reliability from previous systems where data exist; this would at least provide a starting point for initial assessments and could be refined in the future. Nor is this the only study of reliability metrics for complex software systems: similar research projects have risen and fallen, and the data from those projects should not be wasted but studied to feed into whatever standard is developed. While historical information would be useful, each design is unique and requires tools to accomplish it, so architectural modeling constructs should be further investigated as a possible augmentation to the design and test process. We need to determine which forum is best to conduct this effort (e.g., SAE, IEEE, AIAA, ACM, AHS, INCOSE, or other). As stated in "Space Shuttle Avionics Systems" [31], "The designers, the flight crew, and other operational users of the system often have a mindset established in a previous program or experience, which results in a bias against new or different 'unconventional' approaches." If nothing is done to address this problem, it will only get worse over time. It is past time to address the issue of reliability of complex systems and software.

23

BACKUP SLIDES

24

PEO Aviation System Safety Management Decision Authority Matrix

[Decision authority matrix: rows are hazard severity (1 Catastrophic, 2 Critical, 3 Marginal, 4 Negligible); columns are probability levels A Frequent (P > 1E-3), B Probable (1E-4 < P <= 1E-3), C Occasional (1E-5 < P <= 1E-4), D Remote (1E-6 < P <= 1E-5), and E Improbable (1E-7 < P <= 1E-6). The cells, not recoverable from the extracted text, assign decision authority among Army Acquisition, PEO Aviation, and Program Management.]

Hazard Category – Description
1 Catastrophic – Death or permanent total disability; system loss
2 Critical – Severe injury or severe occupational illness; major system or environmental damage
3 Marginal – Minor injury or minor occupational illness (no permanent effect); minor system or environmental damage
4 Negligible – Less than minor injury or occupational illness (no lost workdays); less than minor environmental damage

Risk Level – Description – Frequency (per 100,000 flight hours)
A – Frequent – more than 100 (P > 1E-3)
B – Probable – more than 10, up to 100 (1E-4 < P <= 1E-3)
C – Occasional – more than 1, up to 10 (1E-5 < P <= 1E-4)
D – Remote – more than 0.1, up to 1 (1E-6 < P <= 1E-5)
E – Improbable – more than 0.01, up to 0.1 (1E-7 < P <= 1E-6)

Presenter
Presentation Notes
As already mentioned, it is the goal of the US Army Aviation Engineering Directorate that the system is safe and reliable to operate and will perform the mission when delivered. Another goal is that the released system continue to safely perform the mission if maintained and operated per the operator's manual; replacement parts and overhaul work must be of high quality to support continued airworthiness. Per the Program Executive Office Memorandum 08-03 risk matrix, US Army flight control systems are to achieve failure probabilities on the order of 1E-9 per flight hour for flight-critical functions, per civil airspace regulations [35], and 1E-6 for tactical use [36]. Quantifying these numbers is established practice for component hardware but not for software. Just as hardware should have quantifiable reliability, so should software.
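A small sketch of the probability banding in the table above, mapping a per-flight-hour failure probability onto the slide's frequency levels (the function and its name are ours, not from the memorandum):

```python
# Sketch: classify a per-flight-hour failure probability into the risk-level
# bands A-E from the table above. Bounds are taken directly from the slide.
LEVELS = [
    ("A", "Frequent",   1e-3, float("inf")),
    ("B", "Probable",   1e-4, 1e-3),
    ("C", "Occasional", 1e-5, 1e-4),
    ("D", "Remote",     1e-6, 1e-5),
    ("E", "Improbable", 1e-7, 1e-6),
]

def frequency_level(p):
    """Return (letter, name) for probability p per flight hour."""
    for letter, name, low, high in LEVELS:
        if low < p <= high:
            return letter, name
    return None, "below Improbable (P <= 1E-7)"

print(frequency_level(5e-4))   # -> ('B', 'Probable')
print(frequency_level(2e-9))   # -> (None, 'below Improbable (P <= 1E-7)')
```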

25

Reliability Defined

• Software Reliability – the probability that a given piece of software will execute without failure in a given environment for a given time [40]. Often debated as to how to measure.

• Hardware Reliability – the probability that a hardware component operates without failure over a given time. Well defined and established.

• System Reliability – the probability of success, or the probability that the system will perform its intended function under specified design limits [over a given amount of time] [39]. A combination of software and hardware reliability.

Presenter
Presentation Notes
"Reliability is defined as the probability of success or the probability that the system will perform its intended function under specified design limits" [39]. "Software reliability is defined as the probability that a given piece of software will execute without failure in a given environment for a given time" [40]. Systems rely on both, and thus must combine the two to formulate an overall reliability.

26

Hardware vs Software Reliability

Hardware Reliability vs. Software Reliability (Table 2, after [39]):
• HW: Failure rate has a bathtub curve; the burn-in state is similar to the software debugging state. / SW: Without considering program evolution, the failure rate is statistically non-increasing.
• HW: Material deterioration can cause failures even though the system is not used. / SW: Failures never occur if the software is not used.
• HW: Failure data are fitted to known distributions; selection of the underlying distribution is based on analysis of failure data and experience, so emphasis is placed on analyzing failure data. / SW: Most models are analytically derived from assumptions; emphasis is on developing the model, interpreting its assumptions, and the physical meaning of its parameters.
• HW: Failures are caused by material deterioration, design errors, misuse, and environment. / SW: Failures are caused by incorrect logic, incorrect statements, or incorrect input data.
• HW: Can be improved by better design, better materials, redundancy, and accelerated life-cycle testing. / SW: Can be improved by increasing the testing effort and correcting discovered faults; reliability tends to change continuously during testing as new code adds problems and debugging removes them.
• HW: Repairs restore the original condition. / SW: Repairs establish a new piece of software.
• HW: Failures are usually preceded by warnings. / SW: Failures are rarely preceded by warnings.
• HW: Components can be standardized. / SW: Components have rarely been standardized.
• HW: Can usually be tested exhaustively. / SW: Essentially requires infinite testing for completeness.

Reference: [39] Hoang Pham, "Software Reliability," Springer, 2000.

Presenter
Presentation Notes
"Reliability is defined as the probability of success or the probability that the system will perform its intended function under specified design limits" [39]. "Software reliability is defined as the probability that a given piece of software will execute without failure in a given environment for a given time" [40]. Hoang Pham compares software versus hardware reliability as shown in Table 2 [39]. Systems rely on both, and thus must combine the two to formulate an overall reliability.
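To illustrate the first row of the table, here is a minimal sketch contrasting a hardware bathtub hazard with a statistically non-increasing software failure intensity (we use the Musa-Okumoto logarithmic Poisson intensity as the software example); all parameters are illustrative assumptions, not fitted to any data:

```python
# Sketch: hardware "bathtub" hazard vs. a non-increasing software failure
# intensity, per the table's first row. Shapes and parameters are made up.
def hardware_hazard(t, burn_in=0.05, constant=0.01, wearout=1e-7):
    """Bathtub: decreasing burn-in + constant useful-life + increasing wear-out."""
    return burn_in / (1.0 + t) + constant + wearout * t**2

def software_intensity(t, lam0=0.05, theta=0.02):
    """Musa-Okumoto intensity lam0 / (1 + lam0*theta*t): falls as faults are fixed."""
    return lam0 / (1.0 + lam0 * theta * t)

for t in (0, 10, 100, 1000):
    print(f"t={t:5d} h  hardware h(t)={hardware_hazard(t):.5f}  "
          f"software lambda(t)={software_intensity(t):.5f}")
```

Running this shows the hardware hazard dip and then climb again (wear-out) while the software intensity only decreases, which is why the two demand different reliability models.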

27

Acronym List
AADL – Architectural Analysis and Design Language
AC – Advisory Circular (FAA)
ACM – Association of Computing Machinery
AED – Aviation Engineering Directorate (AMRDEC)
AFTD – Aviation Flight Test Directorate (US Army)
AGC – Apollo Guidance Computer
AHS – American Helicopter Society
AIAA – American Institute of Aeronautics and Astronautics (Inc.)
AMCOM – Aviation and Missile Command (US Army)
AMRDEC – Aviation and Missiles Research, Development and Engineering Center (US Army)
AR – Army Regulation
ARINC – Aeronautical Radio, Inc.
ARP – Aerospace Recommended Practice
ASIF – Avionics Software Integration Facility
ATAM – Architecture Tradeoff Analysis Method
ATM – Air Traffic Management
AWR – Airworthiness Release
CAAS – Common Avionics Architecture System
CH-47 – Cargo Helicopter, Chinook
CMM – Capability Maturity Model
CMMI – Capability Maturity Model Integration
CMU – Carnegie Mellon University
CNS – Communications, Navigation, Surveillance
CoSMIC – Component Synthesis using Model-Integrated Computing
CPS – Cyber-Physical System
CRC – Chemical Rubber Company (i.e., CRC Press)
DFBW – Digital Fly-By-Wire
DoD – Department of Defense
E3 – Electrical and Electromagnetic Effects
ESML – Embedded System Modeling Language
FAA – Federal Aviation Administration
FCS – Future Combat Systems
FHA – Functional Hazard Assessment
FMEA – Failure Modes Effects Analysis

28

Acronym List (concluded)

GPWS – Ground Proximity Warning System
IBM – International Business Machines
IEC – International Electrotechnical Commission
IL – Instrumentation Lab (now Draper Laboratory)
IMA – Integrated Modular Avionics
INCOSE – International Council On Systems Engineering
ISO – International Organization for Standardization
ISS – International Space Station
KAL – Korean Airlines
MISRA – Motor Industry Software Reliability Association
MIT – Massachusetts Institute of Technology
NASA – National Aeronautics and Space Administration (USA)
PDR – Preliminary Design Review
PEO – Program Executive Office
PNAS – Proceedings of the National Academy of Sciences
RAQ – Rotorcraft and Aircraft Qualification
RMA – Rate Monotonic Analysis
RTC – Redstone Test Center (US Army)
RTTC – Redstone Technical Test Center (US Army)
RTCA – Radio Technical Commission for Aeronautics
SAE – Society of Automotive Engineers
SED – Software Engineering Directorate (AMRDEC)
SEES – Software Engineering Evaluation System
SEI – Software Engineering Institute (CMU)
SIL – System Integration Laboratory
SSA – System Safety Assessment
STS – Space Transportation System
SysML – Systems Modeling Language
TMR – Triple Modular Redundant
TRL – Technology Readiness Level
UAS – Unmanned Aircraft System
UH-60 – Utility Helicopter, Blackhawk
UML – Unified Modeling Language
US – United States
USL – Universal Systems Language

29

References

• [1] Israel Koren and Mani Krishna, "Fault-Tolerant Systems," Morgan Kaufmann, 2007.
• [2] Jiantao Pan, "Software Reliability," Carnegie Mellon University, Spring 1999.
• [3] Nachum Dershowitz, http://www.cs.tau.ac.il/~nachumd/horror.html
• [4] http://www.air-attack.com
• [5] David A. Mindell, "Digital Apollo: Human and Machine in Spaceflight," The MIT Press, 2008.
• [6] Software Engineering Directorate, Software Engineering Evaluation System (SEES), "Program Manager Handbook for Flight Software Airworthiness," SED-SES-PMHFSA 001, December 2003.
• [7] Software Engineering Directorate, Software Engineering Evaluation System (SEES), "Program Manager Handbook for Software Safety," SED-SES-PMHSSA 001, February 2006.
• [8] AMCOM Regulation 385-17, "Software System Safety Policy," 15 March 2008.
• [9] "NASA Software Safety Guidebook," NASA-GB-8719.13, 31 March 2004.
• [10] Margaret Hamilton and William Hackler, "Universal Systems Language: Lessons Learned from Apollo," IEEE Computer Society, 2008.
• [11] Margaret Hamilton, "Full Life Cycle Systems Engineering and Software Development Environment: Development Before The Fact In Action," http://www.htius.com/Articles/Full_Life_Cycle.htm
• [12] Peter Feiler, David Gluch, and John Hudak, "The Architecture Analysis & Design Language (AADL): An Introduction," CMU/SEI-2006-TN-011, February 2006.
• [13] Peter Feiler and John Hudak, "Developing AADL Models for Control Systems: A Practitioner's Guide," CMU/SEI-2007-TR-014, July 2007.
• [14] Bruce Lewis, "Using the Architecture Analysis and Design Language for System Verification and Validation," SEI presentation, 2006.
• [15] Feiler, Gluch, Hudak, and Lewis, "Embedded System Architecture Analysis Using SAE AADL," CMU/SEI-2004-TN-004, June 2004.
• [16] Charles Pecheur and Stacy Nelson, "V&V of Advanced Systems at NASA," NASA/CR-2002-211402, April 2002.
• [17] Systems Integration Requirements Task Group, "ARP 4754: Certification Considerations for Highly-Integrated or Complex Aircraft Systems," SAE Aerospace, 10 April 1996.
• [18] SAE, "ARP 4761: Guidelines and Methods for Conducting the Safety Assessment Process on Civil Airborne Systems and Equipment," December 1996.
• [19] Aeronautical Radio, Inc. (ARINC), "ARINC Specification 653P1-2: Avionics Application Software Standard Interface, Part 1 – Required Services," 7 March 2006.

30

References (Continued)

• [20] Department of Defense, "MIL-STD-882D: Standard Practice for System Safety," 10 February 2000.
• [21] RTCA, Inc., "DO-178B: Software Considerations in Airborne Systems and Equipment Certification," 1 December 1992.
• [22] RTCA, Inc., "DO-254: Design Assurance Guidance for Airborne Electronic Hardware," 19 April 2000.
• [23] U.S. Army, "Aeronautical Design Standard Handbook: Rotorcraft and Aircraft Qualification (RAQ) Handbook," 21 October 1996.
• [24] Cary R. Spitzer (editor), "Avionics: Elements, Software and Functions," CRC Press, 2007.
• [25] U.S. Army, "Army Regulation 70-62: Airworthiness Qualification of Aircraft Systems," 21 May 2007.
• [26] U.S. Army, "Army Regulation 95-1: Aviation Flight Regulations," 3 February 2006.
• [27] Barbacci, Clements, Lattanze, Northrop, and Wood, "Using the Architecture Tradeoff Analysis Method (ATAM) to Evaluate the Software Architecture for a Product Line of Avionics Systems: A Case Study," CMU/SEI-2003-TN-012, July 2003.
• [28] Peter Feiler, "All in the Family: CAAS & AADL," CMU/SEI-2008-SR-021, August 2008.
• [29] Chrissis, Konrad, and Shrum, "CMMI: Guidelines for Process Integration and Product Improvement," Pearson Education, 2007.
• [30] Brendan O'Connell, "Model Driven Performance Analysis for Avionics Systems," Draper Laboratory, January 2006.
• [31] John F. Hanaway and Robert W. Moorehead, "Space Shuttle Avionics Systems," NASA SP-504, 1989.
• [32] Lui Sha, "The Complexity Challenge in Modern Avionics Software," August 14, 2006.
• [33] "Incidents Prompt New Scrutiny of Airplane Software Glitches," Wall Street Journal, 30 May 2006.
• [34] Eyal Ophir, Clifford Nass, and Anthony Wagner, "Cognitive Control in Media Multitaskers," PNAS, 20 July 2009.
• [35] Federal Aviation Administration, "Advisory Circular AC 25.1309-1A: System Design and Analysis," 21 June 1988.
• [36] Program Executive Office Policy Memorandum 08-03.
• [38] National Science Foundation webpage on Cyber-Physical Systems, http://www.nsf.gov/pubs/2008/nsf08611/nsf08611.htm
• [39] Hoang Pham, "Software Reliability," Springer, 2000.
• [40] Paul Rook (editor), "Software Reliability Handbook," Elsevier Science Publishers Ltd., 1990.

31

References (Concluded)

• [41] http://www.navair.navy.mil/v22/index.cfm?fuseaction=news.detail&id=128
• [42] http://mars8.jpl.nasa.gov/msp98/news/mco990930.html
• [43] John Garman, "The Bug Heard Around the World," ACM SIGSOFT Software Engineering Notes, October 1981.
• [44] "Mars Pathfinder Mission Status," 15 July 1997, http://marsprogram.jpl.nasa.gov/MPF/newspio/mpfstatus/pf970715.html
• [45] Nancy Leveson, "Safeware: System Safety and Computers," Addison-Wesley Publishing Company, 1995.
• [46] http://www.electronicaviation.com/aircraft/JAS-39_Gripen/810
• [47] Brandon Hill, "Lockheed's F-22 Raptor Gets Zapped by International Date Line," DailyTech LLC, February 26, 2007, http://www.freerepublic.com/focus/f-news/1791574/posts
• [48] http://www.military.com/news/article/human-error-cited-in-most-uav-crashes.html
• [49] Daniel Michaels and Andy Pasztor, "Incidents Prompt New Scrutiny of Airplane Software Glitches: As Programs Grow Complex, Bugs Are Hard to Detect; A Jet's Roller-Coaster Ride; Teaching Pilots to Get Control," Wall Street Journal, May 30, 2006.

  • Qualification and Reliability of Complex Electronic Rotorcraft Systems, by Alex Boydston & Dr. William Lewis, AMRDEC, for AFRL Safe and Secure Symposium, 15-17 June 2010
  • Agenda
  • Objective
  • Defense Acquisition Approach to Systems Development and Test
  • US Army Airworthiness
  • AED and Qualification
  • Evolution of Helicopter Systems
  • Present Approach to Testing
  • Development Challenges
  • Complexity Issues
  • Complexity Issues (continued)
  • Reliability vs Complexity & Cost vs Complexity
  • A Few Examples of Complex Systems
  • Some Complex System Failures
  • Lessons Learned from Failures
  • Some Current Guidelines
  • Certification Assessment Considerations
  • Definition of Complexity and Reliability is Needed
  • Analytical Models and Reliability
  • Tools for Modeling and Analysis
  • Modification to Acquisition Model
  • Systems Reliability Standard Establishment
  • BACKUP SLIDES
  • PEO Aviation System Safety Management Decision Authority Matrix
  • Reliability Defined
  • Hardware vs Software Reliability
  • Acronym List
  • Acronym List (concluded)
  • References
  • References (Continued)
  • References (Concluded)

13

A Few Examples of Complex Systems

bull This is not a new problem Other have struggled with the challenges of establishing confidence in complex systems

ndash NASAbull Apollo Guidance Computerbull Dryden F8 Crusaderbull Space Shuttlebull International Space Station

ndash Commercial Airlinersbull Airbus A320 and higherbull Boeing B777 B787

ndash Militarybull Ships and Submarinesbull Jets (F14F15 F16 F18 F22 F35 etc)bull Cargo Planes (C130J Hercules C17 Globemaster etc)bull Helicopters (Chinook Blackhawk Sea Stallion etc)bull Rocketsbull Unmanned Aerial Systemsbull Unmanned Ground Systemsbull Unmanned Submarine Systems Photos by US Army NASA US Navy and US Air Force

Presenter
Presentation Notes
As mentioned earlier complex avionics systems are not a new idea Since the early 1960rsquos complex avionic architectures have existed beginning with the ApolloSaturn program Massachusetts Institute of Technology (MIT) Instrumentation Lab (IL) which is now Draper Laboratory and International Business Machines (IBM) led the way with the MITIL Apollo Guidance Computer (AGC) and the Saturn V IBM triple modular redundant (TMR) voting guidance computer system The word software was not even coined at the time but engineers such as Margaret Hamilton MITIL Director of Apollo On-board Software can attest to the fact that some the same issues with creating reliable software then still exists today [5] A large majority of the issues then dealt with the communication between systems engineers and the programmers Requirements were thrown over the wall without the confirmation that the requirements were complete and a lot of the issues cropped up as interface problems Identifying these issues prompted Hamilton to create her own company and create a modeling language called Universal Systems Language (USL) to head off the problems experienced with Apollo [11] Some 200 plus modeling programs have been developed since Apollo and used to mitigate issues and increase confidence in systems of varying complexity13As time progressed other systems came along The NASA Dryden F8 Crusader was the first digital fly by wire (DFBW) jet aircraft that relied heavily on complex IMA and software for flight control The Space Transportation System (STS) shuttle includes a Quad Modular Redundant (QMR) system with a fifth backup flight computer containing uncommon code US Air Force and Naval airplanes that have possessed complex or redundant IMA configurations include the F14 Tomcat F15 Eagle F16 Falcon F18 Hornet F22 Raptor F35 Joint Strike Fighter F117 Nighthawk V22 Osprey C17 Globemaster and many more along with recent Unmanned Air Vehicle Systems (UAVS) The US Army complex systems on helicopters include the13RAH-66 Comanche DFBW Triple Modular Redundant (TMR) architecture13glass cockpit avionics on the UH-60M Blackhawk baseline 13Common Avionics Architecture System (CAAS) glass cockpit on the UH-60M Blackhawk modernization and CH-47F Chinooks13and other aircraft13Additionally there are many self-checking pair engine controller systems along with system of system Future Combat Systems (FCS) and Unmanned Air Vehicle Systems (UAVS) This has also permeated the commercial airliner market with the Airbus 320 and higher Airbus models Boeing 777 and Boeing 787 aircraft With this ever increasing technology something must be done about the reliability issue With such a wealth of data on aviation and non-aviation cyber-physical systems such as submarine ship nuclear medical locomotive and automotive systems there should be adequate information to get a start on modeling systems correctly for reliability Therefore this is not an isolated problem to avionics and other disciplines should aide in resolving this problem13

14

Some Complex System Failures

• V-22 Osprey crashes
• Mars Climate Orbiter crash
• Mars Pathfinder software reset
• USS Vincennes downing of an Airbus A300
• Therac-25 software radiation treatment failure
• 1989 Airbus A320 air show crash
• China Airlines Airbus Industries A300 crash
• Ariane 5 satellite launcher malfunction
• Failure of the primary flight system to sync with the backup during prelaunch of STS-1
• Mexicana Airlines Boeing 727 airliner crashed into a mountain due to the software not correctly determining the mountain position
• Loss of the first American probe to Venus
• Korean Airlines KAL 801 accident
• Soviet Phobos I Mars probe lost
• Three Mile Island
• F-18 fighter plane crash due to a bad exception
• F-14 fighter plane lost to uncontrollable spin
• Swedish Gripen prototype crash
• Swedish Gripen air-show crash
• F-22 failure crossing the IDL
• 2006 German-Spanish Barracuda UAV crash
• 2004 F/A-22 Raptor stealth fighter jet crash
• F/A-22 Raptor navigation system software error at Nellis AFB
• 50 cockpit blackouts on A320s
• A320 multiple avionics and electrical failures at Newark, NJ
• Boeing 777 Malaysian Airlines jetliner's nightmarish 3,000-foot autopilot rollercoaster ride
• US Army and Air Force UAV crashes
• … And Many More …

Presenter
Presentation Notes
Multiple crashes have occurred with the V-22 Osprey [41].
In 1999 the Mars Climate Orbiter crashed because of incorrect units in a program, caused by poor systems engineering practices [42, 44].
In 1988 an Airbus A300 was shot down by the USS Vincennes, in part because of cryptic and misleading output displayed by the tracking software [3].
In 1989 an Airbus A320 crashed at an air show due to altitude indication and software handling [3].
In 1994 a China Airlines Airbus Industries A300 crashed, killing 264, from faulty software [3].
In 1996 the first Ariane 5 satellite launcher destruction mishap was caused by a software design error: a few lines of Ada code containing unprotected variables. The horizontal velocity of the Ariane 5 exceeded that of the Ariane 4, resulting in the guidance system veering the rocket off course. Insufficient testing did not catch this error, which was a carry-over from Ariane 4 [3, 39].
In 1986 a Mexicana Airlines Boeing 727 airliner crashed into a mountain due to the software not correctly determining the mountain's position [39].
In 1986 the Therac-25 radiation therapy machines overdosed cancer patients due to a flaw in the computer program controlling the highly automated devices [3, 39, 45].
During the maiden launch of the Space Shuttle Columbia in 1981 (STS-1), the primary flight control computer system failed to establish sync with the backup during prelaunch [43].
On December 10, 1990, the Space Shuttle Columbia had to land early due to computer software problems [39].
In 1997 the Mars Pathfinder software reset problem occurred due to latent task execution caused by priority inversion with a mutex [3, 44].
An error in a single FORTRAN statement resulted in the loss of the first American probe to Venus. From G. J. Myers, Software Reliability: Principles & Practice, p. 25 [3].
In August 1997 the Korean Airlines KAL 801 accident in Guam killed 225 out of 254 aboard. A worldwide bug was discovered in barometric altimetry in the Ground Proximity Warning System (GPWS). From ACM SIGSOFT Software Engineering Notes, vol. 23, no. 1 [3].
The Soviet Phobos I Mars probe was lost due to a faulty software update, at a cost of 300 million rubles. Its disorientation broke the radio link, and the solar batteries discharged before reacquisition. From Aviation Week, 13 Feb 1989 [3].
An F-18 fighter plane crashed due to a missing exception condition. From ACM SIGSOFT Software Engineering Notes, vol. 6, no. 2 [3].
An F-14 fighter plane was lost to an uncontrollable spin, traced to tactical software. From ACM SIGSOFT Software Engineering Notes, vol. 9, no. 5 [3].
In 1989 a Swedish Gripen prototype crashed due to software in its digital fly-by-wire system [3, 46].
In 1995 another Gripen fighter plane crashed during an air show, caused by a software issue [3, 46].
On February 11, 2007, twelve F/A-22 Raptors were forced to head back to Hawaii when a software bug caused a computer crash as they were crossing the International Date Line. In the middle of the ocean, their systems comprising navigation, fuel, and part of the communications dumped, and all attempts to reboot failed [47].
In February 2006 the German-Spanish Unmanned Combat Air Vehicle Barracuda crashed due to software failure [4].
In December 2004 a glitch in the flight control software probably caused an F/A-22 Raptor stealth fighter jet to crash on takeoff at Nellis Air Force Base [4].
In 2008 a United Airbus A320, registration N462UA, experienced multiple avionics and electrical failures, including loss of all communications, shortly after rotation while departing Newark Liberty International Airport in Newark, New Jersey [NTSB Report Identification DCA08IA033].
In 2006 a Boeing 777 Malaysian Airlines jetliner's autopilot caused a stall by climbing 3,000 feet. Pilots struggled to nose down the plane but plunged into a steep dive; after pulling back up, the pilots regained control. The cause was defective flight software providing incorrect data for airspeed and acceleration, confusing the flight computers and initially ignoring the pilot's commands [49].
US Army and Air Force UAVs have crashed from control system or human error [48].

15

Lessons Learned from Failures

• From Nancy Leveson's paper "The Role of Software in Spacecraft Accidents":
  – "Flaws in the safety culture, diffusion of responsibility and authority
  – Limited communication channels and poor information flow
  – Inadequate system and software engineering
  – Poor or missing specifications
  – Unnecessary complexity and software functionality
  – Software reuse or changes without appropriate safety analysis
  – [Shortcomings] in safety engineering practices
  – Flaws in test and simulation environments
  – Inadequate human factors design for software"

Presenter
Presentation Notes
In Dr. Nancy Leveson's paper "The Role of Software in Spacecraft Accidents" [36], she cited problems with software development within NASA on various projects. According to Dr. Leveson, there were "flaws in the safety culture, diffusion of responsibility and authority, limited communication channels and poor information flow, inadequate system and software engineering, poor or missing specifications, unnecessary complexity and software functionality, software reuse or changes without appropriate safety analysis, violation of basic safety engineering practices, inadequate system safety engineering, flaws in test and simulation environments, and inadequate human factors design for software." While these problems were identified for spacecraft development within NASA and corrected, aviation in general could learn from these lessons to mitigate issues with complex systems development.

16

Some Current Guidelines

• DO-178B – Software Considerations in Airborne Systems and Equipment Certification
• DO-248B – Final Report for Clarification of DO-178B
• DO-278 – Guidelines for Communications, Navigation, Surveillance and Air Traffic Management (CNS/ATM) Systems Software Integrity Assurance
• DO-254 – Design Assurance Guidance for Airborne Electronic Hardware
• DO-297 – Integrated Modular Avionics (IMA) Development Guidance and Certification Considerations
• SAE-ARP4754 – Certification Considerations for Highly Integrated or Complex Aircraft Systems
• SAE-ARP4761 – Guidelines and Methods for Conducting the Safety Assessment Process on Airborne Systems and Equipment
• FAA Advisory Circular AC 27-1B – Certification of Normal Category Rotorcraft
• FAA Advisory Circular AC 29-2C – Certification of Transport Category Rotorcraft
• ISO/IEC 12207 – Software Life Cycle Processes
• ARINC 653 – Specification Standard for Time and Space Partitioning
• MIL-STD-882D – DoD System Safety
• ADS-51-HDBK – Rotorcraft and Aircraft Qualification Handbook
• AR 70-62 – Airworthiness Release Standard
• SED-SES-PMHFSA 001 – Software Engineering Directorate (SED) Software Engineering Evaluation System (SEES) Program Manager Handbook for Flight Software Airworthiness
• SED-SES-PMHSS 001 – SED SEES Program Manager Handbook for Software Safety

WHAT'S MISSING: a Reliability Standard for Complex Systems

Presenter
Presentation Notes
The problems previously stated drove the development of these guidelines; however, there is no standard for system reliability that includes software. Other standards and circulars pertain to complex systems, but a reliability standard, one which would outline the process for establishing cyber-physical system reliability, is missing. This standard should indicate how to model, analyze, and ascertain the projected level of reliability.

17

Certification Assessment Considerations

• Sufficient data and time must be available for airworthiness evaluation
• Certification process:
  – Currently lengthy
  – Depends heavily on human interpretation, trade-offs, and risk mitigation
  – Overwhelming for complex integrated systems (FHAs, FTAs, FMECAs, risk mitigation, etc.)
• A consistent industry-wide method is needed to assess a system at any stage of the life-cycle to allow a tradeoff of design alternatives
• Certification tasks outlined in DO-297 should be considered:
  – Task 1: Module acceptance
  – Task 2: Application software/hardware acceptance
  – Task 3: IMA system acceptance
  – Task 4: Aircraft integration of IMA system, including V&V
  – Task 5: Change of modules or applications
  – Task 6: Reuse of modules or applications

Presenter
Presentation Notes
In order to execute an AWR, sufficient data must be provided to the appropriate airworthiness authorities to demonstrate that safety requirements are met to the levels per the System Safety Assessment (SSA) or equivalent. The certification process is currently lengthy and depends on much human interpretation of the myriad complex architecture functions.
Current guidelines such as DO-178B, DO-254, DO-297, SAE-ARP4754, and SAE-ARP4761, along with many others, outline the proper steps that should be taken. System safety management's military standard is MIL-STD-882, which has been in use for decades. Civilian safety standards for the aviation industry include SAE ARP4754, which shows the incorporation of system safety activities into the design process and provides guidance on techniques to ensure a safe design. SAE ARP4761 contains significant guidance on how to perform the system safety activities described in SAE ARP4754. DO-178B outlines development assurance guidance for aviation software based on the failure condition categorization of the functionality provided by the software; DO-254 embodies similar guidance for aviation hardware. ARINC 653 is a widely accepted standard to ensure time and space partitioning for software. DO-297 does an excellent job of describing the certification tasks for an IMA system, which include:
Task 1: Module acceptance
Task 2: Application software/hardware acceptance
Task 3: IMA system acceptance
Task 4: Aircraft integration of IMA systems, including verification and validation
Task 5: Change of modules or applications
Task 6: Reuse of modules or applications
Taken together, these standards provide guidance that, if followed, will likely result in safe, highly reliable, and cost-effective systems over the life-cycle of the system. Yet while these guidelines exist, there is not a consistent industry-wide method to assess a system at any stage of the life-cycle to allow a tradeoff of design alternatives. Nor is there a standard outlining overall reliability for a system, including both hardware and software reliability. To achieve this level of reliability, a standard should be developed to define the process and method to achieve a quantifiable reliability number that would in turn lead to acceptance.

18

Definition of Complexity and Reliability is Needed

[Figure: components 1 through 4, each characterized by complexity fundamentals and reliability parametrics, integrate into subsystems 1 and 2, each characterized by system integration of components and reliability dependencies; the subsystems integrate into the realized system with its reliability sensitivities. Maturity progresses from TRL 3 or 4, through TRL 6 or 7, to TRL 8 or 9, culminating in a highly reliable complex system and a certificate (e.g., AWR).]

Presenter
Presentation Notes
As mentioned in the introduction, the overall goal of this paper is to state the need for an industry standard method for qualifying complex integrated systems to a specified reliability. In order to achieve this, the systems must be broken down to component levels and built up to subsystem and system levels; an overall aggregated system reliability value should result. The goal should be to establish the ability to assess reliability at the component, subsystem, and then system level, with each phase working toward a higher Technical Readiness Level (TRL). The end result would be fed into the accepted Type Certificate (TC) and/or AWR.
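To make the roll-up concrete, below is a minimal Python sketch, not from the briefing, that aggregates hypothetical per-mission component reliabilities into subsystem and system figures under an independent series assumption (every component must work); the structure mirrors the component-to-subsystem-to-system hierarchy in the figure.

```python
import math

# Hypothetical per-mission component reliabilities, grouped by subsystem,
# mirroring the figure's component -> subsystem -> system roll-up.
subsystems = {
    "subsystem_1": [0.9999990, 0.9999995],  # components 1 and 2
    "subsystem_2": [0.9999980, 0.9999992],  # components 3 and 4
}

def series(reliabilities):
    """Series assumption: all elements must work and fail independently,
    so the aggregate reliability is the product of the parts."""
    return math.prod(reliabilities)

subsystem_r = {name: series(rs) for name, rs in subsystems.items()}
system_r = series(subsystem_r.values())

for name, r in subsystem_r.items():
    print(f"{name}: {r:.7f}")
print(f"aggregated system reliability: {system_r:.7f}")
```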

19

Analytical Models and Reliability

• Analytical models of hardware reliability are well understood
• Architecture modeling and software reliability modeling are not novel ideas, but they are highly debated:
  – There are many approaches and little consensus as to the best way
  – Many models (Jelinski-Moranda, Littlewood-Verrall, Musa-Okumoto, etc.) [1]
  – Many tools (over 200 tools have been built since the 1970s) [2]
• Predictability of software reliability is of great concern because software is a major contributor to unreliability [2]
• Software reliability is the probability of error-free operation of a computer program in a specified environment for a specified time [1]
• A basis is needed for setting reliability figures based on previous systems, iteratively refining those figures in the future
• NOT A REPLACEMENT FOR TESTING AND VERIFICATION
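As a concrete illustration of the model families cited above, here is a minimal Python sketch of the Musa-Okumoto logarithmic Poisson execution-time model; the parameter values are hypothetical, chosen only to show the shape of the prediction, not to represent any real program.

```python
import math

def mean_failures(t, lam0, theta):
    """Musa-Okumoto expected cumulative failures by execution time t:
    mu(t) = (1/theta) * ln(lam0 * theta * t + 1)."""
    return math.log(lam0 * theta * t + 1.0) / theta

def intensity(t, lam0, theta):
    """Current failure intensity: lambda(t) = lam0 / (lam0 * theta * t + 1)."""
    return lam0 / (lam0 * theta * t + 1.0)

# Hypothetical parameters: initial intensity 0.05 failures/hour, decay 0.4.
lam0, theta = 0.05, 0.4
for hours in (100, 1_000, 10_000):
    print(f"{hours:>6} hr: mu = {mean_failures(hours, lam0, theta):6.2f}, "
          f"lambda = {intensity(hours, lam0, theta):.2e} failures/hr")
```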

Presenter
Presentation Notes
As mentioned in the introduction, the overall goal of this paper is to state the need for an industry standard method for qualifying complex integrated systems to a specified reliability. While not suggesting a total solution, better modeling practices should be considered as part of a solution to bridge the gap between design, test, and implementation. A method to model the architecture early in the requirements establishment phase, follow the detailed design and coding, and then verify the system along with the model may be a path to greater confidence in the system, reducing the risks, warnings, and cautions that must be issued.
In order to achieve this, the systems must be broken down to component levels and built up to subsystem and system levels; an overall aggregated system reliability value should result (see Figure 9). The goal should be to establish the ability to assess reliability at the component, subsystem, and then system level, with each phase working toward a higher Technical Readiness Level (TRL). The end result would be fed into the accepted Type Certificate (TC) or AWR.
To achieve this goal, modeling and analysis tools that follow a standard for modeling reliability should exist. As previously stated, over 200 tools have been created since the 1970s. A few of the current tools: Universal Systems Language (USL), Unified Modeling Language (UML), Systems Modeling Language (SysML), MATLAB/Simulink, Telelogic Rhapsody, MathCAD, Colored Petri Nets, Rate Monotonic Analysis (RMA), STATEMATE (used by Airbus), Standard for the Development of Safety-Critical Embedded Software (SCADE), OPNET, Embedded System Modeling Language (ESML), Component Synthesis using Model-Integrated Computing (CoSMIC), and Architectural Analysis and Design Language (AADL). By no means is this list complete. Typically, different companies and projects address this challenge by choosing unit tools to perform the upfront analysis and modeling, not following a standard approach. Multiple tools need to converge or be compatible with the framework of a common tool such as AADL. The resultant tool(s) could be used to verify the requirements up front to mitigate reliability issues down the life-cycle.

20

Tools for Modeling and Analysis

• Universal Systems Language (USL)
• Unified Modeling Language (UML)
• Systems Modeling Language (SysML)
• MATLAB/Simulink
• Telelogic Rhapsody
• MathCAD
• Colored Petri Nets
• Rate Monotonic Analysis (RMA)
• STATEMATE (used by Airbus)
• SCADE
• OPNET
• Embedded System Modeling Language (ESML)
• Component Synthesis using Model-Integrated Computing (CoSMIC)
• Architectural Analysis and Design Language (AADL)
• At least 200 more packages since the 1970s
• Certified tools need to converge to an accepted standard modeling/analysis method for complex system reliability

Presenter
Presentation Notes
Typically, different companies and projects address the challenge by choosing unit tools to perform the upfront analysis and modeling, not following a guideline approach. Multiple tools need to converge or be compatible with a set modeling standard for complex systems. The compliant tools could be used to verify the requirements up front to mitigate reliability issues down the life-cycle. A notional approach would follow that shown in Figure 8, where the system "V" is followed but architectural modeling and analysis proceed in parallel with the real development effort. This would allow reliability to be allocated during the design phase and measured during the implementation, test, and verification phase using the model. The downside to modeling, in certain circles, is getting people to believe those models: how do you certify a modeling tool and the actual models within the tool? Those issues should be addressed going forward.

21

Modification to Acquisition Model

[Figure: the system "V" development flow: Requirements Establishment → High Level Design → Detailed Specifications → Implementation/Coding → Verification → Development Testing → Operational Testing & Validation → Deployed System, with an "Architectural Model & Analysis" activity running in parallel across all phases; reliability is allocated on the design leg and measured on the test leg.]

Propose a standard modeling methodology to be applied at different phases of development to enhance requirements development, reliability allocation, reliability measurement, and testing (DISCLAIMER: DOES NOT REPLACE TESTING).

Presenter
Presentation Notes
A notional approach would follow that shown in Figure 10, where the system "V" is followed but architectural modeling and analysis proceed in parallel with the real development effort. This would allow reliability to be allocated during the design phase and measured during the implementation, test, and verification phase using the model. This approach could bridge the design and test phases together. It is emphasized here that the architectural model would not replace critical testing but augments the process to allow for better requirement identification and verification. Thorough ground and flight tests should never be replaced by modeling; modeling would only allow for more robust requirements and a higher level of confidence in the requirements and design. The model could be used in conjunction with the testing to confirm the design. Proper modeling and analysis would reduce total program costs by enforcing more complete and correct initial requirements, reducing issues discovered later in testing that are expensive or impossible to fix and that force the acceptance of high risks. Additionally, if the model is maintained and optimized, it could be used after system deployment to analyze the impacts of upgrades or changes to the system, allowing for more complete analysis and reducing overall system redesign costs.

A hurdle to cross with modeling and analysis is convincing people to believe those models. Some method to certify these models and modeling tools should be addressed in the future, and standards should be set in place for correct modeling techniques for complex systems. Lastly, consideration should be given to standard verification checking tools, such as the Motor Industry Software Reliability Association (MISRA) compliance verification tools for the use of C in safety-critical systems.
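One simple way the slide's "reliability allocated / reliability measured" loop could work is sketched below in Python; the equal-apportionment scheme and every number are illustrative assumptions, not a method prescribed by the paper.

```python
import math

def apportion_equal(system_target, n_components):
    """Equal apportionment over a series system: each of n components gets
    the same target r such that r ** n equals the system target."""
    return system_target ** (1.0 / n_components)

# Design leg: allocate a hypothetical 0.999999 mission-reliability target
# across four series components.
target_each = apportion_equal(0.999999, 4)
print(f"per-component allocation: {target_each:.8f}")

# Test leg: roll measured component reliabilities back up and compare.
measured = [0.99999985, 0.99999990, 0.99999970, 0.99999980]
system_measured = math.prod(measured)
print(f"measured system reliability: {system_measured:.8f}")
print("meets allocation" if system_measured >= 0.999999 else "falls short")
```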

22

Systems Reliability Standard Establishment

• Establish a working group to define this standard
  – Need a technical society to lead the charge
• Collaborate with industry, academia, military, and societies
  – Focus on development of a reliability standard with AWR safety in mind
  – Draw upon experience to feed into this standard
• Study existing and previous complex systems
  – Shuttle, Space Station, missile systems, nuclear submarine and ship systems, nuclear control systems, military and commercial jet systems
  – Obtain software reliability information from existing and previous systems
  – Build a database to serve as the basis for future reliability assessments
• Research prior efforts in complex systems analysis
• Establish a consensus-based modeling and analysis method

Presenter
Presentation Notes
In conclusion, methods for achieving a design for complex systems do exist; however, achieving reliability and attaining a level of qualification that would permit better AWRs does not. A standard needs to be developed to tackle this issue rather than relying on current methods to ascertain airworthiness of complex systems. This will not occur overnight. An orchestrated collaboration among industry, academia, military labs, and technical professional societies, focused on development of this standard, should allow us to draw upon experience to feed into this reliability standard with AWR safety in mind. We have a long-living experiment with complex software systems on the Space Transportation System (STS), International Space Station (ISS), missile systems, nuclear submarine and ship systems, nuclear control systems, and military and commercial jet systems, from which we should be able to obtain at least inferred software reliability information from the architectures and run time those systems use. We should look at the lessons learned from these systems to see what could have been improved and what was done right that should be carried forward. The challenge is collecting this information into a central database and arriving at some figure of reliability from previous systems where data exists. This would at least provide a starting point for initial assessments and could be refined in the future. Also, this is not the only study aimed at establishing reliability metrics for complex software systems; similar research projects have risen and fallen, and the data from those projects should not be wasted but studied to feed into whatever standard is developed. While historical information would be useful, each design is unique and requires tools to accomplish the design. Architectural modeling constructs should be further investigated as a possible augmentation to the design and test process. We need to determine which forum is best to conduct this effort (e.g., SAE, IEEE, AIAA, ACM, AHS, INCOSE, or other). As stated in the paper "Space Shuttle Avionics" [31]: "The designers, the flight crew and other operational users of the system often have a mindset established in a previous program or experience which results in a bias against new or different 'unconventional' approaches." If nothing is done to address this problem, it will only get worse over time. It is past time to address the issue of reliability of complex systems and software.

23

BACKUP SLIDES


24

PEO Aviation System Safety Management Decision Authority Matrix

Severity (Most Credible) vs. Probability:

Probability (per flight hour):
  A – Frequent: P > 1E-3
  B – Probable: 1E-4 < P <= 1E-3
  C – Occasional: 1E-5 < P <= 1E-4
  D – Remote: 1E-6 < P <= 1E-5
  E – Improbable: 1E-7 < P <= 1E-6

Severity: 1 – Catastrophic, 2 – Critical, 3 – Marginal, 4 – Negligible

[Matrix: each severity/probability cell assigns decision authority to Army Acquisition, PEO Aviation, or Program Management.]

Hazard Category / Description:
  1 Catastrophic – Death or permanent total disability; system loss
  2 Critical – Severe injury or severe occupational illness; major system or environmental damage
  3 Marginal – Minor injury or minor occupational illness (no permanent effect); minor system or environmental damage
  4 Negligible – Less than minor injury or occupational illness (no lost workdays), or less than minor environmental damage

Risk Level / Description / Probability (frequency per 100,000 flight hours):
  A – Frequent: > 100 (P > 1E-3)
  B – Probable: <= 100 and > 10 (1E-4 < P <= 1E-3)
  C – Occasional: <= 10 and > 1 (1E-5 < P <= 1E-4)
  D – Remote: <= 1 and > 0.1 (1E-6 < P <= 1E-5)
  E – Improbable: <= 0.1 and > 0.01 (1E-7 < P <= 1E-6)
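For illustration only, a minimal Python sketch that maps a failure frequency onto the probability letters above; the function name and the fall-off case below the matrix floor are assumptions, not part of the PEO matrix.

```python
def hazard_probability_level(failures_per_100k_flight_hours):
    """Map a frequency (per 100,000 flight hours) to the matrix letter.
    Band edges follow the slide: A > 100, B (10, 100], C (1, 10],
    D (0.1, 1], E (0.01, 0.1]."""
    f = failures_per_100k_flight_hours
    if f > 100:
        return "A (Frequent)"
    if f > 10:
        return "B (Probable)"
    if f > 1:
        return "C (Occasional)"
    if f > 0.1:
        return "D (Remote)"
    if f > 0.01:
        return "E (Improbable)"
    return "below matrix floor (P <= 1E-7 per flight hour)"

# 2.5 failures per 100,000 flight hours is 2.5E-5 per flight hour -> C.
print(hazard_probability_level(2.5))
```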

Presenter
Presentation Notes
As already mentioned, it is the goal of the US Army Aviation Engineering Directorate that the system is safe and reliable to operate and will perform the mission when delivered. Another goal is that the released system continues to safely perform the mission if maintained and operated per the operator's manual; replacement parts and overhaul work must be of high quality to support continued airworthiness. Per the Program Executive Office Memorandum 08-03 risk matrix, US Army flight control systems are to achieve 1E-9 reliability for flight-critical functions per civil airspace regulations [35] and 1E-6 reliability for tactical use [36]. Quantifying these numbers is established for component hardware but not for software. Just as hardware should have quantifiable reliability, so should software.

25

Reliability Defined

• Software reliability – the probability that a given piece of software will execute without failure in a given environment for a given time [40]
  – Often debated as to how it should be measured
• Hardware reliability – the probability that a hardware component operates without failure over time
  – Well defined and established
• System reliability – the probability of success, i.e., the probability that the system will perform its intended function under specified design limits [over a given amount of time] [39]
  – A combination of software and hardware reliability
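As a minimal sketch of the "combination" point, the Python below multiplies hardware and software reliabilities under the common constant-failure-rate (exponential) and independence assumptions; the rates and mission time are hypothetical.

```python
import math

def reliability_exponential(t_hours, failure_rate_per_hour):
    """Constant-rate survival probability: R(t) = exp(-lambda * t)."""
    return math.exp(-failure_rate_per_hour * t_hours)

# Hypothetical 10-hour mission.
r_hw = reliability_exponential(10, 1e-6)  # hardware at 1e-6 failures/hr
r_sw = reliability_exponential(10, 1e-7)  # software treated the same way
r_system = r_hw * r_sw                    # independent series combination
print(f"R_hw = {r_hw:.8f}, R_sw = {r_sw:.8f}, R_system = {r_system:.8f}")
```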

Presenter
Presentation Notes
"Reliability is defined as the probability of success, or the probability that the system will perform its intended function under specified design limits" [39]. "Software reliability is defined as the probability that a given piece of software will execute without failure in a given environment for a given time" [40]. Systems rely on both hardware and software and thus must combine the two to formulate an overall reliability.

26

Hardware vs Software Reliability

Hardware Reliability: Failure rate has a bathtub curve; the burn-in state is similar to the software debugging state.
Software Reliability: Without considering program evolution, failure rate is statistically non-increasing.

Hardware Reliability: Material deterioration can cause failures even though the system is not used.
Software Reliability: Failures never occur if the software is not used.

Hardware Reliability: Failure data are fitted to some distributions; the selection of the underlying distribution is based on the analysis of failure data and experience. Emphasis is placed on analyzing failure data.
Software Reliability: Most models are analytically derived from assumptions; emphasis is on developing the model, the interpretation of the model assumptions, and the physical meaning of the parameters.

Hardware Reliability: Failures are caused by material deterioration, design errors, misuse, and environment.
Software Reliability: Failures are caused by incorrect logic, incorrect statements, or incorrect input data.

Hardware Reliability: Can be improved by better design, better material, applying redundancy, and accelerated life-cycle testing.
Software Reliability: Can be improved by increasing testing effort and correcting discovered faults; reliability tends to change continuously during testing due to the addition of problems in new code or the removal of problems by debugging.

Hardware Reliability: Hardware repairs restore the original condition.
Software Reliability: Software repairs establish a new piece of software.

Hardware Reliability: Hardware failures are usually preceded by warnings.
Software Reliability: Software failures are rarely preceded by warnings.

Hardware Reliability: Hardware components can be standardized.
Software Reliability: Software components have rarely been standardized.

Hardware Reliability: Hardware can usually be tested exhaustively.
Software Reliability: Software essentially requires infinite testing for completeness.

Reference: [39] Hoang Pham, "Software Reliability," Springer, 2000.
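To illustrate the bathtub-curve row, here is a small Python sketch of a piecewise hazard rate with an infant-mortality decay, a constant useful-life floor, and a wear-out ramp; every constant is invented for shape only, since real hardware data would be fitted to distributions as the table notes.

```python
def bathtub_hazard(t_hours):
    """Illustrative hazard rate h(t) in failures/hour: a decaying burn-in
    term, a constant useful-life floor, and a wear-out ramp after onset."""
    burn_in = 1e-3 / (1.0 + t_hours)      # infant mortality decays away
    useful_life = 1e-5                    # flat bottom of the bathtub
    wear_out = 1e-9 * max(0.0, t_hours - 50_000.0)  # ramp after 50k hours
    return burn_in + useful_life + wear_out

for t in (0, 100, 10_000, 60_000):
    print(f"h({t:>6}) = {bathtub_hazard(t):.2e} failures/hr")
```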

Presenter
Presentation Notes
"Reliability is defined as the probability of success, or the probability that the system will perform its intended function under specified design limits" [39]. "Software reliability is defined as the probability that a given piece of software will execute without failure in a given environment for a given time" [40]. Hoang Pham compares software versus hardware reliability as shown in Table 2 [39]. Systems rely on both and thus must combine the two to formulate an overall reliability.

27

Acronym List

AADL – Architectural Analysis and Design Language
AC – Advisory Circular (FAA)
ACM – Association for Computing Machinery
AED – Aviation Engineering Directorate (AMRDEC)
AFTD – Aviation Flight Test Directorate (US Army)
AGC – Apollo Guidance Computer
AHS – American Helicopter Society
AIAA – American Institute of Aeronautics and Astronautics (Inc.)
AMCOM – Aviation and Missile Command (US Army)
AMRDEC – Aviation and Missile Research, Development and Engineering Center (US Army)
AR – Army Regulation
ARINC – Aeronautical Radio, Inc.
ARP – Aerospace Recommended Practice
ASIF – Avionics Software Integration Facility
ATAM – Architecture Tradeoff Analysis Method
ATM – Air Traffic Management
AWR – Airworthiness Release
CAAS – Common Avionics Architecture System
CH-47 – Cargo Helicopter (Chinook)
CMM – Capability Maturity Model
CMMI – Capability Maturity Model Integration
CMU – Carnegie Mellon University
CNS – Communications, Navigation, Surveillance
CoSMIC – Component Synthesis using Model-Integrated Computing
CPS – Cyber-Physical System
CRC – Chemical Rubber Company (i.e., CRC Press)
DFBW – Digital Fly-By-Wire
DoD – Department of Defense
E3 – Electrical and Electromagnetic Effects
ESML – Embedded System Modeling Language
FAA – Federal Aviation Administration
FCS – Future Combat Systems
FHA – Functional Hazard Assessment
FMEA – Failure Modes Effects Analysis

28

Acronym List (concluded)

GPWS – Ground Proximity Warning System
IBM – International Business Machines
IEC – International Electrotechnical Commission
IL – Instrumentation Lab (now Draper Laboratory)
IMA – Integrated Modular Avionics
INCOSE – International Council on Systems Engineering
ISO – International Organization for Standardization
ISS – International Space Station
KAL – Korean Airlines
MISRA – Motor Industry Software Reliability Association
MIT – Massachusetts Institute of Technology
NASA – National Aeronautics and Space Administration (USA)
PDR – Preliminary Design Review
PEO – Program Executive Office
PNAS – Proceedings of the National Academy of Sciences
RAQ – Rotorcraft and Aircraft Qualification
RMA – Rate Monotonic Analysis
RTC – Redstone Test Center (US Army)
RTTC – Redstone Technical Test Center (US Army)
RTCA – Radio Technical Commission for Aeronautics
SAE – Society of Automotive Engineers
SED – Software Engineering Directorate (AMRDEC)
SEES – Software Engineering Evaluation System
SEI – Software Engineering Institute (CMU)
SIL – System Integration Laboratory
SSA – System Safety Assessment
STS – Space Transportation System
SysML – Systems Modeling Language
TMR – Triple Modular Redundant
TRL – Technical Readiness Level
UAS – Unmanned Aircraft System
UH-60 – Utility Helicopter (Blackhawk)
UML – Unified Modeling Language
US – United States
USL – Universal Systems Language

29

References

• [1] Israel Koren and Mani Krishna, "Fault-Tolerant Systems," Morgan Kaufmann, 2007.
• [2] Jiantao Pan, "Software Reliability," Carnegie Mellon University, Spring 1999.
• [3] Nachum Dershowitz, http://www.cs.tau.ac.il/~nachumd/horror.html
• [4] http://www.air-attack.com
• [5] David A. Mindell, "Digital Apollo: Human and Machine in Spaceflight," The MIT Press, 2008.
• [6] Software Engineering Directorate, Software Engineering Evaluation System (SEES), "Program Manager Handbook for Flight Software Airworthiness," SED-SES-PMHFSA 001, December 2003.
• [7] Software Engineering Directorate, Software Engineering Evaluation System (SEES), "Program Manager Handbook for Software Safety," SED-SES-PMHSSA 001, February 2006.
• [8] AMCOM Regulation 385-17, "Software System Safety Policy," 15 March 2008.
• [9] "NASA Software Safety Guidebook," NASA-GB-8719.13, 31 March 2004.
• [10] Margaret Hamilton and William Hackler, "Universal Systems Language: Lessons Learned from Apollo," IEEE Computer Society, 2008.
• [11] Margaret Hamilton, "Full Life Cycle Systems Engineering and Software Development Environment: Development Before The Fact In Action," http://www.htius.com/Articles/Full_Life_Cycle.htm
• [12] Peter Feiler, David Gluch, and John Hudak, "The Architecture Analysis & Design Language (AADL): An Introduction," CMU/SEI-2006-TN-011, February 2006.
• [13] Peter Feiler and John Hudak, "Developing AADL Models for Control Systems: A Practitioner's Guide," CMU/SEI-2007-TR-014, July 2007.
• [14] Bruce Lewis, "Using the Architecture Analysis and Design Language for System Verification and Validation," SEI presentation, 2006.
• [15] Feiler, Gluch, Hudak, and Lewis, "Embedded System Architecture Analysis Using SAE AADL," CMU/SEI-2004-TN-004, June 2004.
• [16] Charles Pecheur and Stacy Nelson, "V&V of Advanced Systems at NASA," NASA/CR-2002-211402, April 2002.
• [17] Systems Integration Requirements Task Group, "ARP 4754: Certification Considerations for Highly-Integrated or Complex Aircraft Systems," SAE Aerospace, 10 April 1996.
• [18] SAE, "ARP 4761: Guidelines and Methods for Conducting the Safety Assessment Process on Civil Airborne Systems and Equipment," December 1996.
• [19] Aeronautical Radio, Inc. (ARINC), "ARINC Specification 653P1-2: Avionics Application Software Standard Interface, Part 1 – Required Services," 7 March 2006.

30

References (Continued)

• [20] Department of Defense, "MIL-STD-882D: Standard Practice for System Safety," 10 February 2000.
• [21] RTCA, Inc., "DO-178B: Software Considerations in Airborne Systems and Equipment Certification," 1 December 1992.
• [22] RTCA, Inc., "DO-254: Design Assurance Guidance for Airborne Electronic Hardware," 19 April 2000.
• [23] US Army, "Aeronautical Design Standard Handbook: Rotorcraft and Aircraft Qualification (RAQ) Handbook," 21 October 1996.
• [24] Cary R. Spitzer (editor), "Avionics: Elements, Software and Functions," CRC Press, 2007.
• [25] US Army, "Army Regulation 70-62: Airworthiness Qualification of Aircraft Systems," 21 May 2007.
• [26] US Army, "Army Regulation 95-1: Aviation Flight Regulations," 3 February 2006.
• [27] Barbacci, Clements, Lattanze, Northrop, and Wood, "Using the Architecture Tradeoff Analysis Method (ATAM) to Evaluate the Software Architecture for a Product Line of Avionics Systems: A Case Study," CMU/SEI-2003-TN-012, July 2003.
• [28] Peter Feiler, "All in the Family: CAAS & AADL," CMU/SEI-2008-SR-021, August 2008.
• [29] Chrissis, Konrad, and Shrum, "CMMI: Guidelines for Process Integration and Product Improvement," Pearson Education, 2007.
• [30] Brendan O'Connell, "Model Driven Performance Analysis for Avionics Systems," Draper Laboratory, January 2006.
• [31] John F. Hanaway and Robert W. Moorehead, "Space Shuttle Avionics Systems," NASA SP-504, 1989.
• [32] Lui Sha, "The Complexity Challenge in Modern Avionics Software," August 14, 2006.
• [33] "Incidents Prompt New Scrutiny of Airplane Software Glitches," Wall Street Journal, 30 May 2006.
• [34] Eyal Ophir, Clifford Nass, and Anthony Wagner, "Cognitive Control in Media Multitaskers," PNAS, 20 July 2009.
• [35] "Advisory Circular AC 25.1309-1A: System Design and Analysis," Federal Aviation Administration, 21 June 1988.
• [36] Program Executive Office Policy Memorandum 08-03.
• [38] National Science Foundation webpage on Cyber-Physical Systems, http://www.nsf.gov/pubs/2008/nsf08611/nsf08611.htm
• [39] Hoang Pham, "Software Reliability," Springer, 2000.
• [40] Paul Rook (editor), "Software Reliability Handbook," Elsevier Science Publishers Ltd., 1990.

31

References (Concluded)

• [41] http://www.navair.navy.mil/v22/index.cfm?fuseaction=news.detail&id=128
• [42] http://mars8.jpl.nasa.gov/msp98/news/mco990930.html
• [43] John Garman, "The 'Bug' Heard 'Round the World," ACM SIGSOFT Software Engineering Notes, October 1981.
• [44] "Mars Pathfinder Mission Status," July 15, 1997, http://marsprogram.jpl.nasa.gov/MPF/newspio/mpf/status/pf970715.html
• [45] Nancy Leveson, "Safeware: System Safety and Computers," Addison-Wesley Publishing Company, 1995.
• [46] http://www.electronicaviation.com/aircraft/JAS-39_Gripen/810
• [47] Brandon Hill, "Lockheed's F-22 Raptor Gets Zapped by International Date Line," DailyTech LLC, February 26, 2007, http://www.freerepublic.com/focus/f-news/1791574/posts
• [48] http://www.military.com/news/article/human-error-cited-in-most-uav-crashes.html
• [49] Daniel Michaels and Andy Pasztor, "Incidents Prompt New Scrutiny of Airplane Software Glitches: As Programs Grow Complex, Bugs Are Hard to Detect; A Jet's Roller-Coaster Ride; Teaching Pilots to Get Control," Wall Street Journal, May 30, 2006.


14

Some Complex System Failures

bull V-22 Osprey crashesbull Mars Climate Orbiter crashbull Mars Pathfinder software resetbull USS Vincennes downing an Airbus 320bull Therac-25 software radiation treatment

failurebull 1989 Airbus A320 air show crashbull China Airlines Airbus Industries A300

crashbull Ariane 5 satellite launcher malfunctionbull Failure of the primary flight system to

sync with the backup during prelaunch of STS-1

bull Mexicana Airlines Boeing 727 airliner crashed into a mountain due to the software not correctly determining the mountain position

bull Loss of the first American probe to Venus

bull Korean Airlines KAL 901 accidentbull Soviet Phobos I Mars probe lost

bull Three Mile Islandbull F-18 fighter plane crash due to bad

exceptionbull F-14 fighter plane lost to

uncontrollable spinbull Swedish Gripen prototype crashedbull Swedish Gripen air-show crashbull F-22 failure crossing the IDLbull 2006 German-Spanish Barracuda UAVbull 2004 FA-22 Raptor stealth fighter jet

crash bull FA-22 Raptor navigation system

software error at Nellis AFBbull 50 cockpit blackouts on A320bull A320 multiple avionics and electrical

failures at Newark NJbull Boeing 777 Malaysian Airlines jetlinerrsquos

nightmarish autopilot rollercoaster ridebull 3000 feet US Army and Air Force UAV

Crashes

bull hellip And Many Morehellip

Presenter
Presentation Notes
Multiple crashes have occurred with the V-22 Osprey [41]13In 1999 the Mars Climate Orbiter crashed because of incorrect units in a program caused by poor systems engineering practices [42 44]13In 1988 an Airbus 320 was shot down by the USS Vincennes because of cryptic and misleading output displayed by the tracking software [3]13In 1989 an Airbus A320 crashed at an air show due to altitude indication and software handling [3]13In 1994 a China Airlines Airbus Industries A300 crash on killing 264 from faulty software [3]13In 1996 the first Ariane 5 satellite launcher destruction mishap was caused by a faulty software design error with a few lines of ADA code containing unprotected variables Horizontal velocity of the Ariane 5 exceeded the Arian 4 resulting in the guidance system veering the rocket off course Insufficient testing did not catch this error which was a carry-over from Ariane 4[3 39]13In 1986 a Mexicana Airlines Boeing 727 airliner crashed into a mountain due to the software not correctly determining the mountain position [39]13In 1986 the Therac-25 radiation therapy machines overdosed cancer patients due to flaw in the computer program controlling the highly automated devices [3 39 45]13During the maiden launch in 1981 of the Discovery space shuttle a failure of the primary flight control computer system to establish sync with the backup during prelaunch [43]13On December 10 1990 the Space Shuttle Columbia had to land early due to computer software problems [39]13In 1997 The Mars Pathfinder software reset problem due to latent task execution caused by priority inversion with a mutex [3 44]13An error in a single FORTRAN statement resulted in the loss of the first American probe to Venus From G J Myers Software Reliability Principles amp Practice p 25 [3]13In September 1999 the Korean Airlines KAL 901 accident in Guam killed 225 out of 254 aboard A worldwide bug was discovered in barometric altimetry in Ground Proximity Warning System (GPWS) From ACM SIGSOFT Software Engineering Notes vol 23 no 1 [3]13The Soviet Phobos I Mars probe was lost due to a faulty software update at a cost of 300 million rubles Its disorientation broke the radio link and the solar batteries discharged before reacquisition From Aviation Week 13 Feb 1989 [3]13An F-18 fighter plane crash due to a missing exception condition From ACM SIGSOFT Software Engineering Notes vol 6 no 2 [3]13An F-14 fighter plane was lost to uncontrollable spin traced to tactical software From ACM SIGSOFT Software Engineering vol9 no5 [3]13In 1989 Swedish Gripen prototype crashed due to software in their digital fly-by-wire system [3 46]13In 1995 another Gripen fighter plane crashed during air-show caused by a software issue [3 46]13On February 11 2007 twelve FA-22 Raptors were forced to head back to the Hawaii when a software bug caused a computer crash as they were crossing International Date Line In the middle of the ocean all their systems comprising navigation fuel and part of the communications systems dumped All the attempts to reboot failed[47]13February 2006 German-Spanish Unmanned Combat Air Vehicle Barracuda crash due to software failure [4]13December 2004 a glitch in the software for controlling flight probably caused an FA-22 Raptor stealth fighter jet to crash on takeoff at Nellis Air Force [4]13In 2008 a United Airbus A320 registration N462UA experienced multiple avionics and electrical failures including loss of all communications shortly after rotation while departing Newark Liberty International Airport in Newark 
New Jersey [NTSB Report Identification DCA08IA033] 13In 2006 a Boeing 777 Malaysian Airlines jetlinerrsquos autopilot caused a stall to occur by climbing 3000 feet Pilots struggled to nose down the plane but plunged into a steep dive After pulling back up the pilots regained control Cause was defective flight software providing incorrect data for airspeed and acceleration confusing the flight computers and initially ignoring the pilotrsquos commands[49]13US Army and Air Force UAV crashes from control system or human error13

15

Lessons Learned from Failures

bull From Nancy Levesonrsquos paper ldquoThe Role of Software in Spacecraft Accidentsrdquondash ldquoFlaws in the safety culture diffusion of responsibility and

authorityndash Limited communication channels and poor information flowndash Inadequate system and software engineering ndash Poor or missing specifications ndash Unnecessary complexity and software functionality ndash Software reuse or changes without appropriate safety analysisndash [Shortcomings] in safety engineering practices ndash Flaws in test and simulation environments ndash Inadequate human factors design for softwarerdquo

Presenter
Presentation Notes
In Dr Nancy Levesonrsquos paper [36] ldquoThe Role of Software in Spacecraft Accidentsrdquo she cited problems with software development issues within NASA on various projects According to Dr Leveson there were ldquoflaws in the safety culture diffusion of responsibility and authority limited communication channels and poor information flow inadequate system and software engineering poor or missing specifications unnecessary complexity and software functionality software reuse or changes without appropriate safety analysis violation of basic safety engineering practices inadequate system safety engineering flaws in test and simulation environments and inadequate human factors design for softwarerdquo While these problems were identified for spacecraft development within NASA and corrected aviation in general could learn from these lessons to mitigate issues with complex systems development

16

Some Current Guidelines

bull DO-178B - Software Considerations in Airborne Systems and Equipment Certification bull DO-248B ndash Final Report for the Clarification of DO-178Bbull DO-278 - Guidelines for Communications Navigation Surveillance and Air Traffic Management

(CNSATM) Systems Software Integrity Assurancebull DO-254 - Design Assurance Guidance for Airborne Electronic Hardware bull DO-297 ndash Integrated Modular Avionics (IMA) Development Guidance and Certification

Considerationsbull SAE-ARP4754 ndash Certification Consideration for Highly Integrated or Complex Aircraft Systemsbull SAE-ARP4671- Guidelines and Methods for Conducting the Safety Assessment Process on

Airborne Systems and Equipmentbull FAA Advisory Circular AC27-1B - Certification of Normal Category Rotorcraftbull FAA Advisory Circular AC29-2C - Certification of Transport Category Rotorcraftbull ISOIEC 12207 - Software Life Cycle Processesbull ARINC 653 - Specification Standard for Time and System Partitionbull MIL-STD-882D - DoD System Safetybull ADS-51-HDBK - Rotorcraft and Aircraft Qualification Handbookbull AR-70-62 - Airworthiness Release Standardbull SED-SES-PMHFSA 001 - Software Engineering Directorate (SED) Software Engineering

Evaluation System (SEES) Program Manager Handbook for Flight Software Airworthinessbull SED-SES-PMHSS 001 - SED SEES Program Manager Handbook for Software Safety

WHATrsquoS MISSING - Reliability Standard for Complex Systems

Presenter
Presentation Notes
These problems previously stated drove the development of these guidelines however there is no standard for system reliability that includes software There are other standards and circulars that pertain to complex systems but a reliability standard for complex systems but a reliability standard is missing for complex systems which would outline the process for establishing cyber-physical systems reliability This standard should indicate how to model and analyze and ascertain the projected level of reliability

17

Certification Assessment Considerations

bull Sufficient data and time must be available for air worthiness evaluation

bull Certification processndash Currently lengthy ndash Depends much on human interpretation trade offs and risk mitigation ndash Overwhelming for complex integrated systems (FHAs FTAs FMECAs

risk mitigation etc)

bull Consistent industry-wide method to assess a system at any stage of the life-cycle to allow a tradeoff of design alternatives

bull Certification Tasks outlined in DO-297 should be consideredndash Task 1 Module Acceptancendash Task 2 Application softwarehardware acceptancendash Task 3 IMA system acceptancendash Task 4 Aircraft integration of IMA system ndash including VampVndash Task 5 Change of modules or applicationsndash Task 6 Reused of modules or applications

Presenter
Presentation Notes
In order to execute an AWR sufficient data must be provided to the appropriate airworthiness authorities to demonstrate that safety requirements are met to the levels per the System Safety Assessment (SSA) or equivalent The certification process is currently lengthy and depends on much human interpretation of the myriad of complex architecture functions 13The current guidelines such as DO-178B DO-254 DO-297 SAE-ARP-4754 and SAE-ARP-4671 along with many other guidelines outline the proper steps that should be taken System safety managementrsquos military standard is MIL-STD-882 and has been in use for decades Civilian safety standards for the aviation industry include SAE ARP4754 which shows the incorporation of system safety activities into the design process and provides guidance on techniques to use to ensure a safe design SAE ARP4761 contains significant guidance on how to perform the system safety activities spoken about in SAE ARP4754 DO-178B outlines development assurance guidance for aviation software based on the failure condition categorization of the functionality provided by the software DO-254 embodies similar guidance for aviation hardware ARINC 653 is a widely accepted standard to ensure time and space partitioning for software DO-297 does an excellent job of describing the certification tasks to take for an IMA system which include13Task 1 Module acceptance13Task 2 Application softwarehardware acceptance13Task 3 IMA system acceptance13Task4 Aircraft integration of IMA systems including verification and validation13Task 5 Change of modules or applications13Task 6 Reuse of modules or applications13Taken together these standards provide guidance that if followed likely will result in safe highly reliable and cost-effective systems over the life-cycle of the system Yet while these guidelines exist there is not a consistent industry-wide method to assess a system at any stage of the life-cycle to allow a tradeoff of design alternatives Also there is not a standard outlining overall reliability for a system to include hardware and software reliability In order to achieve this level of reliability a standard should be developed to define the process and method to achieve a quantifiable reliability number that would in turn lead to acceptance13

18

TRL 3 or 4 TRL 6 or 7 TRL 8 or 9

Component 4

ComplexityFundamentals

Reliability Parametrics

Component 3

ComplexityFundamentals

Reliability Parametrics

Definition of Complexity and Reliability is Needed

Subsystem 1

SystemIntegration ofComponents

ReliabilityDependencies

Component 2

ComplexityFundamentals

Reliability Parametrics

Component 1

ComplexityFundamentals

Reliability Parametrics

Subsystem 2

SystemIntegration ofComponents

ReliabilityDependencies

Integration System

Realized System

ReliabilitySensitivities

High Reliable Complex System

Certificate(eg AWR)

Integration

Integration

Integration

Integration

Integration

Presenter
Presentation Notes
As mentioned in the introduction the overall goal of this paper is to state the need for an industry standard method for qualifying complex integrated systems to a specified reliability In order to achieve this the systems must be broken down to component levels and built up to subsystem and system levels An overall aggregated system reliability value should result The goal should be to establish the ability to assess the reliability from a component subsystem and then a system level with each phase working toward a higher Technical Readiness Level (TRL) The end result would be fed into the accepted Type Certificate (TC) andor AWR

19

Analytical Models and Reliability

bull Analytical models of hardware reliability are well understood

bull Architecture modeling and software reliability modeling is not a novel idea but is highly debated

ndash There are many approaches and little consensus as to best wayndash Many models (Jelinski-Moranda Littlewood-Verrall Musa-Okumoto etc) [1]ndash Many tools (over 200+ tools since 1970s have been built) [2]

bull Predictability of software reliability is of great concern because it is a major contributor to unreliability[2]

bull Software Reliability is the probability of error-free operation of a computer program in a specified environment for a specified time [1]

bull Need basis for setting reliability figures based on previous systems and iteratively refine those figures in the future

bull NOT A REPLACEMENT FOR TESTING AND VERIFICATION

Presenter
Presentation Notes
As mentioned in the introduction the overall goal of this paper is to state the need for an industry standard method for qualifying complex integrated systems to a specified reliability While not suggesting as solution in total better modeling practices should be considered toward a solution to bridge the gap between design test and implementation A method to model the architecture early in the requirements establishment phase follow the detailed design and coding and then be able to verify the system along with the model may be a path to greater confidence in the system and reduce the risks warnings and cautions that must be issued13In order to achieve this the systems must be broken down to component levels and built up to subsystem and system levels An overall aggregated system reliability value should result (see Figure 9) The goal should be to establish the ability to assess the reliability from a component subsystem and then a system level with each phase working toward a higher Technical Readiness Level (TRL) The end result would be fed into the accepted Type Certificate (TC) or AWR 13To achieve this goal modeling and analysis tools that follow a standard for modeling reliability should exist As previously stated over 200 tools have been created since 1970s Here is a list of a few of the current tools13Universal Systems Language (USL)13Unified Modeling Language (UML)13Systems Modeling Language (SysML)13MATLABSimulink13Telelogic Rhapsody13MathCAD13Colored Petri Nets13Rate Monotonic Analysis (RMA)13STATEMATE (Used by Airbus)13Standard for the Development of Safety-Critical Embedded Software (SCADE)13OPNET13Embedded System Modeling Language (ESML)13Component Synthesis using Model-Integrated Computing (CoSMIC)13Architectural Analysis and Design Language (AADL)13By no means is this list complete Typically different companies and projects address this challenge and choose unit tools to perform the upfront analysis and modeling not following a standard approach Multiple tools need to converge or be compatible with the framework of a common tool such as AADL The resultant tool(s) could be used to verify the requirements up front to mitigate reliability issues down the life-cycle 13

20

Tools for Modeling and Analysis

bull Universal Systems Language (USL)bull Unified Modeling Language (UML)bull Systems Modeling Language (SysML)bull MATLABSimulinkbull Telelogic Rhapsodybull MathCadbull Colored Petri Netsbull Rate Monotonic Analysis (RMA)bull STATEMATE (Used by Airbus)bull SCADEbull OPNETbull Embedded System Modeling Language (ESML)bull Component Synthesis using Model-Integrated Computing (CoSMIC)bull Architectural Analysis and Design Language (AADL)bull At least 200+ more packages since the 70rsquosbull Certified tools needs to converge to an accepted standard

modelinganalysis method for complex system reliability

Presenter
Presentation Notes
Typically different companies and projects address the challenge and choose unit tools to perform the upfront analysis and modeling not following a guideline approach Multiple tools need to converge or be compatible with a set modeling standard for complex systems The compliant tools could be used to verify the requirements up front to mitigate reliability issues down the life-cycle A notional approach would follow that shown in Figure 8 where the system V is followed but Architectural Modeling and Analysis go parallel with the real development effort This would allow reliability to be measured during the design phase and measured during the implementation test and verification phase using the model The down side to modeling from certain circles is getting people to believe those models How do you certify a modeling tool and the actual models within the tools Those issues should be addressed going forward

21

Modification to Acquisition Model

[Figure: systems-engineering "V" diagram - Requirements Establishment, High Level Design, Detailed Specifications, Implementation Coding, Verification, Development Testing, Operational Testing & Validation, Deployed System - with an "Architectural Model & Analysis" track running in parallel; reliability is allocated on the design leg and measured on the test leg.]

Propose a standard modeling methodology to be applied at different phases of development to enhance requirements development, reliability allocation, reliability measurement and testing (DISCLAIMER: DOES NOT REPLACE TESTING).

Presenter
Presentation Notes
A notional approach would follow that shown in Figure 10, where the system "V" is followed but Architectural Modeling and Analysis runs in parallel with the real development effort. This would allow reliability to be allocated during the design phase and measured during the implementation, test and verification phase using the model. This approach could bridge the design and test phases together. It is emphasized here that the architectural model would not replace critical testing but would augment the process to allow for better requirement identification and verification. Thorough ground and flight tests should never be replaced by modeling. Modeling would only allow for more robust requirements and design and a higher level of confidence in them. The model could be used in conjunction with the testing to confirm the design. Proper modeling and analysis would reduce total program costs by enforcing more complete and correct initial requirements, which reduces issues discovered later in testing that are expensive or impossible to fix and would otherwise force the program to accept high risks. Additionally, if the model is maintained and optimized, it could be used after system deployment to analyze the impacts of upgrades or changes to the system, allowing for more complete analysis and reducing overall system redesign costs.

A hurdle to cross with modeling and analysis is convincing people to believe those models. Some method to certify these models and modeling tools should be addressed in the future. Standards should be set in place for correct modeling techniques for complex systems. Lastly, standard verification checking tools should be considered, such as Motor Industry Software Reliability Association (MISRA) compliance verification tools for the use of C in safety-critical systems.
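Since these notes close on MISRA-style compliance checking for C in safety-critical systems, the fragment below illustrates the flavor of coding patterns such checkers look for; the rules in the comments are paraphrased, not quoted from the MISRA standard, and the sensor driver is a hypothetical stub.

```c
#include <stdint.h>
#include <stdio.h>

/* Illustrative patterns in the spirit of MISRA C guidance: fixed-width
   types, no dynamic allocation, constant loop bounds, and checked
   return values. */

#define CHANNELS 4u

typedef enum { STATUS_OK = 0, STATUS_FAIL = 1 } status_t;

/* Stub standing in for a hypothetical sensor driver. */
static status_t read_channel(uint8_t ch, uint16_t *value)
{
    *value = (uint16_t)(100u * ch);
    return STATUS_OK;
}

static status_t poll_all_channels(uint16_t out[CHANNELS])
{
    status_t result = STATUS_OK;
    for (uint8_t ch = 0u; ch < CHANNELS; ch++) {        /* bounded loop */
        if (read_channel(ch, &out[ch]) != STATUS_OK) {  /* return checked */
            result = STATUS_FAIL;
        }
    }
    return result;
}

int main(void)
{
    uint16_t values[CHANNELS];
    if (poll_all_channels(values) == STATUS_OK) {
        for (uint8_t ch = 0u; ch < CHANNELS; ch++) {
            printf("channel %u = %u\n", (unsigned)ch, (unsigned)values[ch]);
        }
    }
    return 0;
}
```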

22

Systems Reliability Standard Establishment

• Establish a working group to define this standard
  – Need a technical society to lead the charge on this
• Collaborate with industry, academia, military and societies
  – Focus on development of a reliability standard with AWR safety in mind
  – Draw upon their experiences to feed into this standard
• Study existing and previous complex systems
  – Shuttle, Space Station, missile systems, nuclear submarine and ship systems, nuclear control systems, military and commercial jet systems
  – Obtain software reliability information from these existing and previous systems
  – Build a database which would serve as the basis for future reliability figures
• Research prior efforts in complex systems analysis
• Establish a consensus-based modeling and analysis method

Presenter
Presentation Notes
In conclusion, methods for achieving a design for complex systems do exist; however, achieving reliability and attaining a level of qualification that would permit better AWRs does not. There needs to be a standard developed to tackle this issue rather than relying on current methods to ascertain airworthiness of complex systems. This will not occur overnight. An orchestrated collaboration among industry, academia, military labs and technical professional societies to focus on development of this standard should allow us to draw upon shared experiences to feed into this reliability standard with AWR safety in mind. We have a long-living experiment with complex software systems in the Space Transportation System (STS), International Space Station (ISS), missile systems, nuclear submarine and ship systems, nuclear control systems, and military and commercial jet systems, from which we should be able to obtain at least inferred software reliability information from the architectures and run times these systems use. We should look at the lessons learned from these systems to see what could have been improved and what was done right that should be carried forward. The challenge is collecting this information into a central database and arriving at some figure of reliability from previous systems where data exists. This would at least provide a starting point for initial assessments and could be refined in the future. Also, this is not the only study aimed at establishing reliability metrics for complex software systems. Similar research projects have risen and fallen; the data from those projects should not be wasted but studied to feed into whatever standard is developed. While historical information would be useful, each design is unique and requires tools to accomplish the design. Architectural modeling constructs should be further investigated as a possible augmentation to the design and test process. We need to determine which forum is best to conduct this effort (e.g., SAE, IEEE, AIAA, ACM, AHS, INCOSE or other). As stated in the paper "Space Shuttle Avionics" [31], "The designers, the flight crew and other operational users of the system often have a mindset established in a previous program or experience which results in a bias against new or different 'unconventional' approaches." If nothing is done to address this problem, it will only get worse over time. It is past time to address the issue of reliability of complex systems and software.
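As one hedged illustration of the kind of figure a historical-reliability database could yield, the sketch below computes a constant-failure-rate point estimate from fleet history; the failure count and operating hours are invented for illustration.

```c
#include <stdio.h>

/* Point estimate of a constant failure rate from fielded history:
   lambda_hat = observed failures / cumulative operating hours.
   All numbers below are assumed, not real program data. */
int main(void)
{
    double failures = 3.0;        /* failures observed in service */
    double hours = 2.5e6;         /* cumulative fleet operating hours */
    double lambda_hat = failures / hours;

    printf("lambda_hat = %.2e per hour (MTBF ~ %.2e hours)\n",
           lambda_hat, 1.0 / lambda_hat);
    return 0;
}
```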

23

BACKUP SLIDES


24

PEO Aviation System Safety Management Decision Authority Matrix

[Matrix: hazard severity (columns 1-4) versus probability level (rows A-E); each cell assigns the risk decision authority to Army Acquisition, PEO Aviation or Program Management.]

Severity (Most Credible): 1 Catastrophic, 2 Critical, 3 Marginal, 4 Negligible
Probability levels: A Frequent (P > 1E-3), B Probable (1E-4 < P <= 1E-3), C Occasional (1E-5 < P <= 1E-4), D Remote (1E-6 < P <= 1E-5), E Improbable (1E-7 < P <= 1E-6)

Hazard Category | Description
1 Catastrophic | Death or permanent total disability; system loss
2 Critical | Severe injury or minor occupational illness (no permanent effect); minor system or environmental damage
3 Marginal | Minor injury or minor occupational illness (no permanent effect); minor system or environmental damage
4 Negligible | Less than minor injury or occupational illness (no lost workdays), or less than minor environmental damage

Risk Level | Description | Probability (frequency per 100,000 flight hours)
A | Frequent | > 100 (P > 1E-3)
B | Probable | <= 100 and > 10 (1E-4 < P <= 1E-3)
C | Occasional | <= 10 and > 1 (1E-5 < P <= 1E-4)
D | Remote | <= 1 and > 0.1 (1E-6 < P <= 1E-5)
E | Improbable | <= 0.1 and > 0.01 (1E-7 < P <= 1E-6)

Presenter
Presentation Notes
As already mentioned, it is the goal of the US Army Aviation Engineering Directorate that the system is safe and reliable to operate and will perform the mission when delivered. Another goal is that the released system continues to safely perform the mission if maintained and operated per the operator's manual. Replacement parts and overhaul work must be of high quality to support continued airworthiness. Per the Program Element Office Memorandum 08-03 Risk Matrix, US Army flight control systems are to achieve 1E-9 reliability for flight-critical functions per civil airspace regulations [35] and 1E-6 reliability for tactical use [36]. Quantifying these numbers is established for component hardware but not for software. Just as hardware should have quantifiable reliability, so should software.
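As a worked reading of the probability bands in the matrix above, this small helper maps a per-flight-hour hazard probability to risk levels A through E; the thresholds are taken directly from the table, while the function itself is only illustrative.

```c
#include <stdio.h>

/* Map a per-flight-hour hazard probability to the PEO Aviation risk
   levels from the matrix above. Probabilities at or below 1E-7 fall
   outside band E and are reported as '-'. */
static char risk_level(double p)
{
    if (p > 1e-3) return 'A';   /* Frequent   */
    if (p > 1e-4) return 'B';   /* Probable   */
    if (p > 1e-5) return 'C';   /* Occasional */
    if (p > 1e-6) return 'D';   /* Remote     */
    if (p > 1e-7) return 'E';   /* Improbable */
    return '-';
}

int main(void)
{
    printf("P = 5e-4 per hour -> level %c\n", risk_level(5e-4));  /* B */
    printf("P = 2e-7 per hour -> level %c\n", risk_level(2e-7));  /* E */
    return 0;
}
```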

25

Reliability Defined

• Software Reliability - the probability that a given piece of software will execute without failure in a given environment for a given time [40]
  – Often debated as to how to measure
• Hardware Reliability - the probability that a hardware component operates without failure over time
  – Well defined and established
• System Reliability - the probability of success, or the probability that the system will perform its intended function under specified design limits [over a given amount of time] [39]
  – A combination of software and hardware reliability

Presenter
Presentation Notes
"Reliability is defined as the probability of success or the probability that the system will perform its intended function under specified design limits." [39] "Software reliability is defined as the probability that a given piece of software will execute without failure in a given environment for a given time." [40] Systems rely on both, and thus must combine the two to formulate an overall reliability.
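One simple way to write the combination described in these notes, under the strong assumption that hardware and software failure processes are independent (with constant failure rates in the second line), is the following illustrative form:

```latex
% System reliability as the product of hardware and software reliability,
% assuming independent failure processes:
R_{\mathrm{sys}}(t) = R_{\mathrm{hw}}(t)\, R_{\mathrm{sw}}(t)

% With constant failure rates \lambda_{\mathrm{hw}} and \lambda_{\mathrm{sw}}:
R_{\mathrm{sys}}(t) = e^{-\lambda_{\mathrm{hw}} t}\, e^{-\lambda_{\mathrm{sw}} t}
                    = e^{-(\lambda_{\mathrm{hw}} + \lambda_{\mathrm{sw}})\, t}
```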

26

Hardware vs Software Reliability

Hardware Reliability | Software Reliability
Failure rate has a bathtub curve; the burn-in state is similar to the software debugging state. | Without considering program evolution, failure rate is statistically non-increasing.
Material deterioration can cause failures even though the system is not used. | Failures never occur if the software is not used.
Failure data are fitted to some distributions; the selection of the underlying distribution is based on the analysis of failure data and experience. Emphasis is placed on analyzing failure data. | Most models are analytically derived from assumptions. Emphasis is on developing the model, the interpretation of the model assumptions and the physical meaning of the parameters.
Failures are caused by material deterioration, design errors, misuse and environment. | Failures are caused by incorrect logic, incorrect statements or incorrect input data.
Can be improved by better design, better material, applying redundancy and accelerated life-cycle testing. | Can be improved by increasing testing effort and correcting discovered faults; reliability tends to change continuously during testing due to the addition of problems in new code or the removal of problems by debugging.
Hardware repairs restore the original condition. | Software repairs establish a new piece of software.
Hardware failures are usually preceded by warnings. | Software failures are rarely preceded by warnings.
Hardware components can be standardized. | Software components have rarely been standardized.
Hardware can usually be tested exhaustively. | Software essentially requires infinite testing for completeness.

Reference: [39] Hoang Pham, "Software Reliability", Springer, 2000.

Presenter
Presentation Notes
"Reliability is defined as the probability of success or the probability that the system will perform its intended function under specified design limits." [39] "Software reliability is defined as the probability that a given piece of software will execute without failure in a given environment for a given time." [40] Hoang Pham compares software versus hardware reliability as shown in Table 2 [39]. Systems rely on both, and thus must combine the two to formulate an overall reliability.
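The table's observation that software reliability changes continuously as faults are found and removed is what software reliability growth models capture. As a sketch, the code below evaluates the Goel-Okumoto model, one of the NHPP models in this family; the parameter values are assumed purely for illustration.

```c
#include <stdio.h>
#include <math.h>

/* Goel-Okumoto NHPP model: expected cumulative failures by test time t is
   m(t) = a * (1 - exp(-b t)), where a is the expected total fault content
   and b the per-fault detection rate. Parameter values are illustrative. */
static double go_mean_failures(double a, double b, double t)
{
    return a * (1.0 - exp(-b * t));
}

/* Failure intensity (failures per unit test time) at time t. */
static double go_intensity(double a, double b, double t)
{
    return a * b * exp(-b * t);
}

int main(void)
{
    const double a = 120.0;   /* assumed total fault content        */
    const double b = 0.02;    /* assumed detection rate per hour    */
    for (double t = 0.0; t <= 200.0; t += 50.0) {
        printf("t = %6.1f h  expected failures = %7.2f  intensity = %6.3f /h\n",
               t, go_mean_failures(a, b, t), go_intensity(a, b, t));
    }
    return 0;
}
```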

27

Acronym List

AADL - Architectural Analysis and Design Language
AC - Advisory Circular (FAA)
ACM - Association of Computing Machinery
AED - Aviation Engineering Directorate (AMRDEC)
AFTD - Aviation Flight Test Directorate (US Army)
AGC - Apollo Guidance Computer
AHS - American Helicopter Society
AIAA - American Institute of Aeronautics and Astronautics (Inc.)
AMCOM - Aviation and Missile Command (US Army)
AMRDEC - Aviation and Missiles Research, Development and Engineering Center (US Army)
AR - Army Regulation
ARINC - Aeronautical Radio, Inc.
ARP - Aerospace Recommended Practice
ASIF - Avionics Software Integration Facility
ATAM - Architecture Tradeoff Analysis Method
ATM - Air Traffic Management
AWR - Airworthiness Release
CAAS - Common Avionics Architecture System
CH-47 - Cargo Helicopter, Chinook
CMM - Capability Maturity Model
CMMI - Capability Maturity Model Integration
CMU - Carnegie Mellon University
CNS - Communications, Navigation, Surveillance
CoSMIC - Component Synthesis using Model-Integrated Computing
CPS - Cyber-Physical System
CRC - Chemical Rubber Company (i.e., CRC Press)
DFBW - Digital Fly-By-Wire
DoD - Department of Defense
E3 - Electrical and Electromagnetic Effects
ESML - Embedded System Modeling Language
FAA - Federal Aviation Administration
FCS - Future Combat Systems
FHA - Functional Hazard Assessment
FMEA - Failure Modes Effects Analysis

28

Acronym List (concluded)

GPWS - Ground Proximity Warning System
IBM - International Business Machines
IEC - International Electrotechnical Commission
IL - Instrumentation Lab (now Draper Laboratory)
IMA - Integrated Modular Avionics
INCOSE - International Council On Systems Engineering
ISO - International Organization for Standardization
ISS - International Space Station
KAL - Korean Airlines
MISRA - Motor Industry Software Reliability Association
MIT - Massachusetts Institute of Technology
NASA - National Aeronautics and Space Administration (USA)
PDR - Preliminary Design Review
PEO - Program Element Office
PNAS - Proceedings of the National Academy of Sciences
RAQ - Rotorcraft and Aircraft Qualification
RMA - Rate Monotonic Analysis
RTC - Redstone Test Center (US Army)
RTTC - Redstone Technical Test Center (US Army)
RTCA - Radio Technical Commission for Aeronautics
SAE - Society of Automotive Engineers
SED - Software Engineering Directorate (AMRDEC)
SEES - Software Engineering Evaluation System
SEI - Software Engineering Institute (CMU)
SIL - System Integration Laboratory
SSA - System Safety Assessment
STS - Space Transportation System
SysML - Systems Modeling Language
TMR - Triple Modular Redundant
TRL - Technical Readiness Level
UAS - Unmanned Aircraft System
UH-60 - Utility Helicopter, Blackhawk
UML - Unified Modeling Language
US - United States
USL - Universal Systems Language

29

References

• [1] Israel Koren and Mani Krishna, "Fault-Tolerant Systems", Morgan Kaufmann, 2007.
• [2] Jiantao Pan, "Software Reliability", Carnegie Mellon University, Spring 1999.
• [3] Nachum Dershowitz, http://www.cs.tau.ac.il/~nachumd/horror.html
• [4] http://www.air-attack.com
• [5] David A. Mindell, "Digital Apollo: Human and Machine in Spaceflight", The MIT Press, 2008.
• [6] Software Engineering Directorate, Software Engineering Evaluation System (SEES), "Program Manager Handbook for Flight Software Airworthiness", SED-SES-PMHFSA 001, December 2003.
• [7] Software Engineering Directorate, Software Engineering Evaluation System (SEES), "Program Manager Handbook for Software Safety", SED-SES-PMHSSA 001, February 2006.
• [8] AMCOM Regulation 385-17, "Software System Safety Policy", 15 March 2008.
• [9] "NASA Software Safety Guidebook", NASA-GB-8719.13, 31 March 2004.
• [10] Margaret Hamilton & William Hackler, "Universal Systems Language: Lessons Learned from Apollo", IEEE Computer Society, 2008.
• [11] Margaret Hamilton, "Full Life Cycle Systems Engineering and Software Development Environment: Development Before The Fact In Action", http://www.htius.com/Articles/Full_Life_Cycle.htm
• [12] Peter Feiler, David Gluch, John Hudak, "The Architecture Analysis & Design Language (AADL): An Introduction", CMU/SEI-2006-TN-011, February 2006.
• [13] Peter Feiler, John Hudak, "Developing AADL Models for Control Systems: A Practitioner's Guide", CMU/SEI-2007-TR-014, July 2007.
• [14] Bruce Lewis, "Using the Architecture Analysis and Design Language for System Verification and Validation", SEI Presentation, 2006.
• [15] Feiler, Gluch, Hudak, Lewis, "Embedded System Architecture Analysis Using SAE AADL", CMU/SEI-2004-TN-004, June 2004.
• [16] Charles Pecheur, Stacy Nelson, "V&V of Advanced Systems at NASA", NASA/CR-2002-211402, April 2002.
• [17] Systems Integration Requirements Task Group, "ARP 4754: Certification Considerations for Highly-Integrated or Complex Aircraft Systems", SAE Aerospace, 10 April 1996.
• [18] SAE, "ARP 4761: Guidelines and Methods for Conducting the Safety Assessment Process on Civil Airborne Systems and Equipment", December 1996.
• [19] Aeronautical Radio, Inc. (ARINC), "ARINC Specification 653P1-2: Avionics Application Software Standard Interface, Part 1 – Required Services", 7 March 2006.

30

References (Continued)

• [20] Department of Defense, "MIL-STD-882D: Standard Practice for System Safety", 19 January 1993.
• [21] RTCA, Inc., "DO-178: Software Considerations in Airborne Systems and Equipment Certification", 1 December 1992.
• [22] RTCA, Inc., "DO-254: Design Assurance Guidance for Airborne Electronic Hardware", 19 April 2000.
• [23] US Army, "Aeronautical Design Standard Handbook: Rotorcraft and Aircraft Qualification (RAQ) Handbook", 21 October 1996.
• [24] Cary R. Spitzer (Editor), "Avionics: Elements, Software and Functions", CRC Press, 2007.
• [25] US Army, "Army Regulation 70-62, Airworthiness Qualification of Aircraft Systems", 21 May 2007.
• [26] US Army, "Army Regulation 95-1, Aviation Flight Regulations", 3 February 2006.
• [27] Barbacci, Clements, Lattanze, Northrop, Wood, "Using the Architecture Tradeoff Analysis Method (ATAM) to Evaluate the Software Architecture for a Product Line of Avionics Systems: A Case Study", CMU/SEI-2003-TN-012, July 2003.
• [28] Peter Feiler, "All in the Family: CAAS & AADL", CMU/SEI-2008-SR-021, August 2008.
• [29] Chrissis, Konrad, Shrum, "CMMI: Guidelines for Process Integration and Product Improvement", Pearson Education, 2007.
• [30] Brendan O'Connell, "Model Driven Performance Analysis for Avionics Systems", Draper Laboratory, January 2006.
• [31] John F. Hanaway, Robert W. Moorehead, "Space Shuttle Avionics Systems", NASA SP-504, 1989.
• [32] Lui Sha, "The Complexity Challenge in Modern Avionics Software", August 14, 2006.
• [33] "Incidents Prompt New Scrutiny of Airplane Software Glitches", Wall Street Journal, 30 May 2006.
• [34] Eyal Ophir, Clifford Nass and Anthony Wagner, "Cognitive Control in Media Multitaskers", PNAS, 20 July 2009.
• [35] "Advisory Circular AC 25.1309-1A, System Design and Analysis", Federal Aviation Administration, 21 June 1988.
• [36] Program Element Office Policy Memorandum 08-03.
• [38] National Science Foundation webpage on Cyber-Physical Systems, http://www.nsf.gov/pubs/2008/nsf08611/nsf08611.htm
• [39] Hoang Pham, "Software Reliability", Springer, 2000.
• [40] Paul Rook (editor), "Software Reliability Handbook", Elsevier Science Publishers Ltd., 1990.

31

References (Concluded)

• [41] http://www.navair.navy.mil/v22/index.cfm?fuseaction=news.detail&id=128
• [42] http://mars8.jpl.nasa.gov/msp98/news/mco990930.html
• [43] John Garman, "The Bug Heard Around the World", ACM SIGSOFT, October 1981.
• [44] "Mars Pathfinder Mission Status", July 15, 1997, http://marsprogram.jpl.nasa.gov/MPF/newspio/mpfstatus/pf970715.html
• [45] Nancy Leveson, "Safeware: System Safety and Computers", Addison-Wesley Publishing Company, 1995.
• [46] http://www.electronicaviation.com/aircraft/JAS-39_Gripen/810
• [47] Brandon Hill, "Lockheed's F-22 Raptor Gets Zapped by International Date Line", DailyTech LLC, February 26, 2007, http://www.freerepublic.com/focus/f-news/1791574/posts
• [48] http://www.military.com/news/article/human-error-cited-in-most-uav-crashes.html
• [49] Daniel Michaels and Andy Pasztor, "Incidents Prompt New Scrutiny Of Airplane Software Glitches: As Programs Grow Complex, Bugs Are Hard to Detect; A Jet's Roller-Coaster Ride; Teaching Pilots to Get Control", Wall Street Journal, May 30, 2006.

  • Qualification and Reliability of Complex Electronic Rotorcraft Systems, by Alex Boydston & Dr. William Lewis, AMRDEC, for AFRL Safe and Secure Symposium, 15-17 June 2010
  • Agenda
  • Objective
  • Defense Acquisition Approach to Systems Development and Test
  • US Army Airworthiness
  • AED and Qualification
  • Evolution of Helicopter Systems
  • Present Approach to Testing
  • Development Challenges
  • Complexity Issues
  • Complexity Issues (continued)
  • Reliability vs Complexity & Cost vs Complexity
  • A Few Examples of Complex Systems
  • Some Complex System Failures
  • Lessons Learned from Failures
  • Some Current Guidelines
  • Certification Assessment Considerations
  • Definition of Complexity and Reliability is Needed
  • Analytical Models and Reliability
  • Tools for Modeling and Analysis
  • Modification to Acquisition Model
  • Systems Reliability Standard Establishment
  • BACKUP SLIDES
  • PEO Aviation System Safety Management Decision Authority Matrix
  • Reliability Defined
  • Hardware vs Software Reliability
  • Acronym List
  • Acronym List (concluded)
  • References
  • References (Continued)
  • References (Concluded)


19

Analytical Models and Reliability

bull Analytical models of hardware reliability are well understood

bull Architecture modeling and software reliability modeling is not a novel idea but is highly debated

ndash There are many approaches and little consensus as to best wayndash Many models (Jelinski-Moranda Littlewood-Verrall Musa-Okumoto etc) [1]ndash Many tools (over 200+ tools since 1970s have been built) [2]

bull Predictability of software reliability is of great concern because it is a major contributor to unreliability[2]

bull Software Reliability is the probability of error-free operation of a computer program in a specified environment for a specified time [1]

bull Need basis for setting reliability figures based on previous systems and iteratively refine those figures in the future

bull NOT A REPLACEMENT FOR TESTING AND VERIFICATION

Presenter
Presentation Notes
As mentioned in the introduction the overall goal of this paper is to state the need for an industry standard method for qualifying complex integrated systems to a specified reliability While not suggesting as solution in total better modeling practices should be considered toward a solution to bridge the gap between design test and implementation A method to model the architecture early in the requirements establishment phase follow the detailed design and coding and then be able to verify the system along with the model may be a path to greater confidence in the system and reduce the risks warnings and cautions that must be issued13In order to achieve this the systems must be broken down to component levels and built up to subsystem and system levels An overall aggregated system reliability value should result (see Figure 9) The goal should be to establish the ability to assess the reliability from a component subsystem and then a system level with each phase working toward a higher Technical Readiness Level (TRL) The end result would be fed into the accepted Type Certificate (TC) or AWR 13To achieve this goal modeling and analysis tools that follow a standard for modeling reliability should exist As previously stated over 200 tools have been created since 1970s Here is a list of a few of the current tools13Universal Systems Language (USL)13Unified Modeling Language (UML)13Systems Modeling Language (SysML)13MATLABSimulink13Telelogic Rhapsody13MathCAD13Colored Petri Nets13Rate Monotonic Analysis (RMA)13STATEMATE (Used by Airbus)13Standard for the Development of Safety-Critical Embedded Software (SCADE)13OPNET13Embedded System Modeling Language (ESML)13Component Synthesis using Model-Integrated Computing (CoSMIC)13Architectural Analysis and Design Language (AADL)13By no means is this list complete Typically different companies and projects address this challenge and choose unit tools to perform the upfront analysis and modeling not following a standard approach Multiple tools need to converge or be compatible with the framework of a common tool such as AADL The resultant tool(s) could be used to verify the requirements up front to mitigate reliability issues down the life-cycle 13

20

Tools for Modeling and Analysis

bull Universal Systems Language (USL)bull Unified Modeling Language (UML)bull Systems Modeling Language (SysML)bull MATLABSimulinkbull Telelogic Rhapsodybull MathCadbull Colored Petri Netsbull Rate Monotonic Analysis (RMA)bull STATEMATE (Used by Airbus)bull SCADEbull OPNETbull Embedded System Modeling Language (ESML)bull Component Synthesis using Model-Integrated Computing (CoSMIC)bull Architectural Analysis and Design Language (AADL)bull At least 200+ more packages since the 70rsquosbull Certified tools needs to converge to an accepted standard

modelinganalysis method for complex system reliability

Presenter
Presentation Notes
Typically different companies and projects address the challenge and choose unit tools to perform the upfront analysis and modeling not following a guideline approach Multiple tools need to converge or be compatible with a set modeling standard for complex systems The compliant tools could be used to verify the requirements up front to mitigate reliability issues down the life-cycle A notional approach would follow that shown in Figure 8 where the system V is followed but Architectural Modeling and Analysis go parallel with the real development effort This would allow reliability to be measured during the design phase and measured during the implementation test and verification phase using the model The down side to modeling from certain circles is getting people to believe those models How do you certify a modeling tool and the actual models within the tools Those issues should be addressed going forward

21

Modification to Acquisition Model

Requirements Establishment

High Level Design

Detailed Specifications

Implementation Coding

Verification

Development TestingA

rchi

tect

ural

Mod

el amp

A

naly

sis

Propose standard modeling methodology to be applied at different phases of development to enhance requirements development reliability allocation reliability

measurement and testing (DISCLAIMER DOES NOT REPLACE TESTING)

Reliabilityallocated Reliability

measured

Operational Testing amp Validation

Deployed System

Presenter
Presentation Notes
A notional approach would follow that shown in Figure 10 where the system V is followed but Architectural Modeling and Analysis go parallel with the real development effort This would allow reliability to be measured during the design phase and measured during the implementation test and verification phase using the model This approach could bridge the design and test phases together It is emphasized here that the architectural model would not replace critical testing but augments the process to allow for better requirement identification and verification Thorough ground and flight tests should never be replaced by modeling Modeling would only allow for more robust and a higher level of confidence in the requirements and design The model could be used in conjunction with the testing to confirm the design Proper modeling and analysis would reduce total program costs by enforcing more complete and correct initial requirements which reduce issues discovered down the road in testing that are expensive to fix or impossible to fix and having to accept high risks Additionally if the model is maintained and optimized then it could possibly be used after system deployment to analyze impacts of upgrades or changes to the system allowing for more complete analysis and reduce overall system redesign costs1313A hurdle to cross with modeling and analysis is convincing people to believe those models Some method to certify these models and modeling tools should be addressed in the future Standards should be set in place for correct modeling techniques for complex systems Lastly consideration of standard verification checking tools should be made such as with the use of the Motor Industry Software Reliability Association (MISRA) compliance verification tool for the use of C in safety critical systems

22

Systems Reliability Standard Establishment

bull Establish a working group to define this standardndash Need a technical society to lead the charge on this

bull Collaborate with industry academia military and societiesndash Focus on development of a reliability standard with AWR safety in mindndash Draw upon the experiences to feed into this standard

bull Study existing and previous complex systemsndash Shuttle Space Station missile systems nuclear submarine and ship

systems nuclear control systems military and commercial jet systems ndash Obtain software reliability information from given existing and previous

systemsndash Build database which would serve as basis for future reliability

bull Research prior efforts in complex systems analysis

bull Establish consensus based modeling and analysis method

Presenter
Presentation Notes
In conclusion methods for achieving a design for complex systems do exist however achieving reliability and attaining a level of qualification that would permit better AWRs does not There needs to be a standard developed to tackle this issue rather than relying on current methods to ascertain airworthiness of complex systems This will not occur overnight An orchestrated collaboration among industry academia military labs and technically professional societies to focus on development of this standard should allow us to draw upon the experiences to feed into this reliability standard with AWR safety in mind We have a long living experiment with complex software systems on the Space Transportation System (STS) International Space Station (ISS) missile systems nuclear submarine and ship systems nuclear control systems military and commercial jet systems from which we should be able to obtain at least inferred software reliability information from the architecture and run time that the systems use We should look at the lessons learned from these systems to see what could have been done to improve and what was done right that should be carried forward The challenge is collecting this information into a central database and arriving at some figure of reliability from previous systems where data exists This would at least provide a starting point to allow initial assessments and could be optimized in the future Also this is not the only study for establishing reliability metrics to complex software systems There have been research projects of similarity to this effort that have risen and fallen The data from those projects should not be wasted but studied to feed into whatever standard that is developed While historical information would be useful each design is unique and requires tools to accomplish the design Investigation of architectural modeling constructs should be further investigated as a possible augmentation to the design and test process We need to determine which forum is best to conduct this effort (eg SAE IEEE AIAA ACM AHS INCOSE or other) As stated in the paper ldquoSpace Shuttle Avionicsrdquo [31] ldquoThe designers the flight crew and other operational users of the system often have a mindset established in a previous program or experience which results in a bias against new or different lsquounconventionalrsquo approachesrdquo If nothing is done to address this problem it will only get worse over time It is past time to address the issue of reliability of complex systems and software

23

BACKUP SLIDES

BACKUP SLIDES

24

PEO Aviation System Safety Management Decision Authority Matrix

Severity(Most Credible)

FrequentP gt 1E-3

A

Probable 1D-4 lt P lt= 1E-3

B

Occasional1E-5 lt P lt= 1E-4

C

Remote1E-6 lt P lt= 1E-5

D

Improbable1E-7 lt P lt= 1E-6

E

Catastrophic1

Critical 2

Marginal 3

Negligible4

Army Acquisition

PEO Aviation

ProgramManagement

HazardCategory

Description

1 Catastrophic Death or permanent total disability system loss

2 Critical Severe injury or minor occupational illness (no permanent effect) minor system or environmental damage

3 Marginal Minor injury or minor occupational illness (no permanent effect) minor system or environmental damage

4 Negligible Less than minor injury or occupational illness (no lost workdays) or less than minor environmental damage

RiskLevel

Description Probability (Frequency) (per 100000 flight hours)

A Frequent gt 100 (P gt 1E-3)

B Probable lt=100 and gt10 (1E-4 lt P lt= 1E-3)

C Occasional lt= 10 and gt1 (1E-5 lt P lt= 1E-4)

D Remote lt=1 and gt01 (1E-6 lt P lt= 1E-5)

E Improbable lt=01 and gt001 (1E-7 lt P lt= 1E-6)

Presenter
Presentation Notes
As already mentioned it is the goal of the US Army Aviation Engineering Directorate that the system is safe and reliable to operate and will perform the mission when delivered Another goal includes the released system continues to safely perform the mission if maintained and operated per the operatorrsquos manual Replacement parts and overhaul work must be high quality to support continued airworthiness Per the Program Element Office Memorandum 08-03 Risk Matrix US Army flight control systems are to achieve 1E-9 reliability for flight critical functions per civil airspace regulations [35] and 1E-6 reliability for tactical use [36] Quantifying these numbers is established for component hardware but not for software Just as hardware should have quantifiable reliability so should software13

25

Reliability Defined

bull Software Reliability - the probability that a given piece of software will execute without failure in a given environment for a given time [40] ndash Often debated as to how to measure

bull Hardware Reliability - the probability that a hardware component fails over time ndash Well defined and established

bull System Reliability - the probability of success or the probability that the system will perform its intended function under specified design limits [over a given amount of time] [39] ndash A combination of software and hardware reliability

Presenter
Presentation Notes
ldquoReliability is defined as the probability of success or the probability that the system will perform its intended function under specified design limitsrdquo[39] ldquoSoftware reliability is defined as the probability that a given piece of software will execute without failure in a given environment for a given timerdquo [40] Systems rely on both and thus must have a combination of the two to formulate an overall reliability

26

Hardware vs Software Reliability

Hardware Reliability Software ReliabilityFailure rate has a bathtub curve The burn-in state is similar to the software debugging state

Without considering program evolution failure rate is statistically non-increasing

Material deterioration can cause failures even though the system is not used

Failures never occur if the software is not used

Failure data are fitted to some distributions The selection of the underlying distribution is based on the analysis of failure data and experiences Emphasis is placed on analyzing failure data

Most models are analytically derived from assumptions Emphasis is on developing the model the interpretation of the model assumptions and the physical meaning of the parameters

Failures are caused by material deterioration design errors misuse and environment

Failures are caused by incorrect logic incorrect statements or incorrect input data

Can be improved by better design better material applying redundancy and accelerated life cycle testing

Can be improved by increasing testing effort and correcting discovered faults Reliability tends to change continuously during testing due to the addition of problems in new code or the removal of problems by debugging errors

Hardware repairs restore the original condition Software repairs establish a new piece of software

Hardware failures are usually preceded by warnings Software failures are rarely preceded by warnings

Hardware components can be standardized Software components have rarely been standardized

Hardware can usually be tested exhaustively Software essentially requires infinite testing for completeness

Reference [39] Hoang Pham ldquoSoftware Reliabilityrdquo Springer 2000

Presenter
Presentation Notes
ldquoReliability is defined as the probability of success or the probability that the system will perform its intended function under specified design limitsrdquo[39] ldquoSoftware reliability is defined as the probability that a given piece of software will execute without failure in a given environment for a given timerdquo [40] Hong Pham compares software versus hardware reliability with the information as shown in Table 2 [39] Systems rely on both and thus must have a combination of the two to formulate an overall reliability

27

Acronym ListACRONYM DEFINITIONAADL Architectural Analysis and Design LanguageAC Advisory Circular (FAA)ACM Association of Computing MachineryAED Aviation Engineering Directorate (AMRDEC)AFTD Aviation Flight Test Directorate (US Army)AGC Apollo Guidance ComputerAHS American Helicopter SocietyAIAA American Institute of Aeronautics and Astronautics (Inc)AMCOM Aviation and Missile Command (US Army)AMRDEC Aviation and Missiles Research Development and Engineering Center (US Army)AR Army RegulationARINC Aeronautical Radio Inc ARP Aerospace Recommended PracticeASIF Avionics Software Integration FacilityATAM Architecture Tradeoff Analysis MethodATM Air Traffic ManagementAWR Airworthiness ReleaseCAAS Common Avionics Architecture SystemCH-47 Cargo Helicopter ChinookCMM Capability Maturity ModelCMMI Capability Maturity Model IndexCMU Carnegie Mellon UniversityCNS Communications Navigation SurveillanceCoSMIC Component Synthesis using Model-Integrated ComputingCPS Cyber-Physical SystemCRC Chemical Rubber Company (ie CRC Press)DFBW Digital Fly-By-WireDoD Department of DefenseE3 Electrical and Electromagnetic EffectsESML Embedded System Modeling LanguageFAA Federal Aviation AdministrationFCS Future Combat SystemsFHA Functional Hazard AssessmentFMEA Failure Modes Effects Analysis

28

Acronym List (concluded)

ACRONYM DEFINITIONGPWS Ground Proximity Warning SystemIBM International Business MachinesIEC International Engineering ConsortiumIL Instrumentation Lab (now Draper Laboratory)IMA Integrated Modular AvionicsINCOSE International Council On Systems EngineeringISO International Organization for StandardizationISS International Space StationKAL Korean AirlinesMISRA Motor Industry Standard Software Reliability AssociationMIT Massachusetts Institute of TechnologyNASA National Aeronautics and Space Administration (USA)PDR Preliminary Design ReviewPEO Program Element OfficePNAS Proceedings of the National Academy of SciencesRAQ Rotorcraft and Aircraft QualificationRMA Rate Monotonic AnalysisRTC Redstone Test Center (US Army) RTTC Redstone Technical Test Center (US Army)RTCA Radio Technical Commission for AeronauticsSAE Society of Automotive EngineersSED Software Engineering Directorate (AMRDEC)SEES Software Engineering Evaluation SystemSEI Software Engineering Institute (CMU)SIL System Integration LaboratorySSA System Safety AssessmentSTS Space Transportation SystemSysML Systems Modeling LanguageTMR Triple Modular RedundantTRL Technical Readiness LevelUAS Unmanned Aircraft SystemUH-60 Utility Helicopter BlackhawkUML Unified Modeling LanguageUS United StatesUSL Universal Systems Language

29

References

bull [1] Israel Koren and Mani Krishna ldquoFault-Tolerant Systemsrdquo Morgan Kaufmann 2007bull [2] Jianto Pan ldquoSoftware Reliabilityrdquo Carnegie Mellon University Spring 1999bull [3] Nachum Dershowitz httpwwwcstauacil~nachumdhorrorhtmlbull [4] httpwwwair-attackcombull [5] David A Mindell ldquoDigital Apollo Human and Machine in Spaceflightrdquo The MIT Press 2008bull [6] Software Engineering Directorate Software Engineering Evaluation System (SEES) ldquoProgram Manager Handbook for

Flight Software Airworthiness SED-SES-PMHFSA 001 December 2003bull [7] Software Engineering Directorate Software Engineering Evaluation System (SEES) ldquoProgram Manager Handbook for

Software Safety SED-SES-PMHSSA 001 February 2006bull [8] AMCOM Regulation 385-17 Software System Safety Policy 15 March 2008bull [9] NASA Software Safety Guidebook NASA-GB-871913 31 March 2004bull [10] Margaret Hamilton amp William Hackler ldquoUniversal Systems Language Lessons Learned from Apollordquo IEEE Computer

Society 2008bull [11] Margaret Hamilton ldquoFull Life Cycle Systems Engineering and Software Development Environment Development Before

The Fact In Actionrdquo httpwwwhtiuscomArticlesFull_Life_Cyclehtmbull [12] Peter Feiler David Gluch John Hudak ldquo The Architecture Analysis amp Design Language (AADL) An Introductionrdquo

CMUSEI-2006-TN-011 February 2006bull [13] Peter Feiler John Hudak ldquoDeveloping AADL Models for Control Systems A Practitionerrsquos Guiderdquo CMUSEI-2007-TR-

014 July 2007bull [14] Bruce Lewis ldquoUsing the Architecture Analysis and Design Language for System Verification and Validationrdquo SEI

Presentation 2006bull [15] Feiler Gluch Hudak Lewis ldquoEmbedded System Architecture Analysis Using SAE AADLrdquo CMUSEI-2004-TN-004 June

2004bull [16] Charles Pecheur Stacy Nelson ldquoVampV of Advanced Systems at NASArdquo NASACR-2002-211402 April 2002bull [17] Systems Integration Requirements Task Group ldquoARP 4754 Certification Considerations for Highly-Integrated or

Complex Aircraft Systemsrdquo SAE Aerospace 10 April 1996bull [18] SAE ldquoARP 4761 Guidelines and Methods for Conducting the Safety Assessment Process on Civil Airborne System and

Equipmentrdquo December 1996bull [19] Aeronautical Radio Inc (ARINC) ldquoARINC Specification 653P1-2 Avionics Application Software Standard Interface Part 1

ndash Required Servicesrdquo 7 March 2006

30

References (Continued)

bull [20] Department of Defense ldquoMIL-STD-882D Standard Practice for System Safetyrdquo 19 January 1993bull [21] RTCA Incorporated ldquoDO-178 Software Considerations in Airborne Systems and Equipment Certificationrdquo 1

December 1992bull [22] RTCA Incorporated ldquoDO-254 Design Assurance Guidance for Airborne Electronic Hardwarerdquo 19 April 2000bull [23] US Army ldquoAeronautical Design Standard Handbook Rotorcraft and Aircraft Qualification (RAQ) Handbookrdquo 21

October 1996bull [24] Cary R Spitzer (Editor) ldquoAvionics Elements Software and Functionsrdquo CRC Press 2007bull [25] US Army ldquoArmy Regulation 70-62 Airworthiness Qualification of Aircraft Systemsrdquo 21 May 2007bull [26] US Army ldquoArmy Regulation 95-1 Aviation Flight Regulationsrdquo 3 February 2006bull [27] ldquoUsing the Architecture Tradeoff Analysis Method (ATAM) to Evaluate the Software Architecture for a Product Line

of Avionics Systems A Case Studyrdquo Barbacci Clements Lattanze Northrop Wood July 2003 CMUSEI-2003-TN-012bull [28] ldquoAll in the Family CAAS amp AADLrdquo Peter Feiler August 2008 CMUSEI-2008-SR-021bull [29] ldquoCMMI Guidelines for Process Integration and Product Improvementrdquo Chrissis Konrad Shrum Pearson Education

2007bull [30] ldquoModel Driven Performance Analysis for Avionics Systemsrdquo Brendan OrsquoConnell Draper Laboratory January 2006bull [31] John F Hanaway Robert W Moorehead ldquoSpace Shuttle Avionics Systemsrdquo NASA SP-504 1989bull [32] Lui Sha ldquoThe Complexity Challenge in Modern Avionics Softwarerdquo August 14 2006bull [33] ldquoIncidents Prompt New Scrutiny of Airplane Software Glitchesrdquo 30 May 2006 Wall Street Journalbull [34] Eyal Ophir Clifford Nass and Anthony Wagner ldquo Cognitive Control in Media Multitaskersrdquo PNAS 20 July 2009bull [35] ldquoAdvisory Circular AC 251309-1A System Design and Analysisrdquo Federal Aviation Administration 21 June 1988bull [36] Program Element Office Policy Memorandum 08-03bull [38] httpwwwnsfgovpubs2008nsf08611nsf08611htm National Science Foundation webpage on Cyber-Physical

Systemsbull [39] Hoang Pham ldquoSoftware Reliabilityrdquo Springer 2000bull [40] Paul Rook editor ldquoSoftware Reliability Handbookrdquo Elsevier Science Publishers LTD 1990

31

References (Concluded)

bull [41]httpwwwnavairnavymilv22indexcfmfuseaction=newsdetailampid=128bull [42]httpmars8jplnasagovmsp98newsmco990930htmlbull [43]John Garmen ldquoThe Bug Heard Around the Worldrdquo ACM SIGSOFT October 1981bull [44]httpmarsprogramjplnasagovMPFnewspiompfstatuspf970715html ldquoMars Pathfinder Mission Statusrdquo July 15

1997bull [45] Nancy Leveson ldquoSafeware System Safety and Computersrdquo Addison-Wesley Publishing Company 1995bull [46] httpwwwelectronicaviationcomaircraftJAS-39_Gripen810bull [47] Brandon Hillhttpwwwfreerepubliccomfocusf-news1791574posts Lockheeds F-22 Raptor Gets Zapped by

International Date Line DailyTech LLC February 26 2007 bull [48] httpwwwmilitarycomnewsarticlehuman-error-cited-in-most-uav-crasheshtmlbull [49] Daniel Michaels and Andy Pasztor ldquoIncidents Prompt New Scrutiny Of Airplane Software Glitches As Programs

Grow Complex Bugs Are Hard to Detect A Jets Roller-Coaster Ride Teaching Pilots to Get Controlrdquo Wall-Street Journal May 30 2006

  • Qualification and Reliability of Complex Electronic Rotorcraft Systemsby Alex Boydston amp Dr William Lewis AMRDECfor AFRL Safe and Secure Symposium 15-17 June 2010
  • Agenda
  • Objective
  • Defense Acquisition Approach to Systems Development and Test
  • US Army Airworthiness
  • AED and Qualification
  • Evolution of Helicopter Systems
  • Present Approach to Testing
  • Development Challenges
  • Complexity Issues
  • Complexity Issues (continued)
  • Reliability vs Complexity amp Cost vs Complexity
  • A Few Examples of Complex Systems
  • Some Complex System Failures
  • Lessons Learned from Failures
  • Some Current Guidelines
  • Certification Assessment Considerations
  • Definition of Complexity and Reliability is Needed
  • Analytical Models and Reliability
  • Tools for Modeling and Analysis
  • Modification to Acquisition Model
  • Systems Reliability Standard Establishment
  • BACKUP SLIDES
  • PEO Aviation System Safety Management Decision Authority Matrix
  • Reliability Defined
  • Hardware vs Software Reliability
  • Acronym List
  • Acronym List (concluded)
  • References
  • References (Continued)
  • References (Concluded)

16

Some Current Guidelines

bull DO-178B - Software Considerations in Airborne Systems and Equipment Certification bull DO-248B ndash Final Report for the Clarification of DO-178Bbull DO-278 - Guidelines for Communications Navigation Surveillance and Air Traffic Management

(CNSATM) Systems Software Integrity Assurancebull DO-254 - Design Assurance Guidance for Airborne Electronic Hardware bull DO-297 ndash Integrated Modular Avionics (IMA) Development Guidance and Certification

Considerationsbull SAE-ARP4754 ndash Certification Consideration for Highly Integrated or Complex Aircraft Systemsbull SAE-ARP4671- Guidelines and Methods for Conducting the Safety Assessment Process on

Airborne Systems and Equipmentbull FAA Advisory Circular AC27-1B - Certification of Normal Category Rotorcraftbull FAA Advisory Circular AC29-2C - Certification of Transport Category Rotorcraftbull ISOIEC 12207 - Software Life Cycle Processesbull ARINC 653 - Specification Standard for Time and System Partitionbull MIL-STD-882D - DoD System Safetybull ADS-51-HDBK - Rotorcraft and Aircraft Qualification Handbookbull AR-70-62 - Airworthiness Release Standardbull SED-SES-PMHFSA 001 - Software Engineering Directorate (SED) Software Engineering

Evaluation System (SEES) Program Manager Handbook for Flight Software Airworthinessbull SED-SES-PMHSS 001 - SED SEES Program Manager Handbook for Software Safety

WHATrsquoS MISSING - Reliability Standard for Complex Systems

Presenter
Presentation Notes
These problems previously stated drove the development of these guidelines however there is no standard for system reliability that includes software There are other standards and circulars that pertain to complex systems but a reliability standard for complex systems but a reliability standard is missing for complex systems which would outline the process for establishing cyber-physical systems reliability This standard should indicate how to model and analyze and ascertain the projected level of reliability

17

Certification Assessment Considerations

bull Sufficient data and time must be available for air worthiness evaluation

bull Certification processndash Currently lengthy ndash Depends much on human interpretation trade offs and risk mitigation ndash Overwhelming for complex integrated systems (FHAs FTAs FMECAs

risk mitigation etc)

bull Consistent industry-wide method to assess a system at any stage of the life-cycle to allow a tradeoff of design alternatives

bull Certification Tasks outlined in DO-297 should be consideredndash Task 1 Module Acceptancendash Task 2 Application softwarehardware acceptancendash Task 3 IMA system acceptancendash Task 4 Aircraft integration of IMA system ndash including VampVndash Task 5 Change of modules or applicationsndash Task 6 Reused of modules or applications

Presenter
Presentation Notes
In order to execute an AWR sufficient data must be provided to the appropriate airworthiness authorities to demonstrate that safety requirements are met to the levels per the System Safety Assessment (SSA) or equivalent The certification process is currently lengthy and depends on much human interpretation of the myriad of complex architecture functions 13The current guidelines such as DO-178B DO-254 DO-297 SAE-ARP-4754 and SAE-ARP-4671 along with many other guidelines outline the proper steps that should be taken System safety managementrsquos military standard is MIL-STD-882 and has been in use for decades Civilian safety standards for the aviation industry include SAE ARP4754 which shows the incorporation of system safety activities into the design process and provides guidance on techniques to use to ensure a safe design SAE ARP4761 contains significant guidance on how to perform the system safety activities spoken about in SAE ARP4754 DO-178B outlines development assurance guidance for aviation software based on the failure condition categorization of the functionality provided by the software DO-254 embodies similar guidance for aviation hardware ARINC 653 is a widely accepted standard to ensure time and space partitioning for software DO-297 does an excellent job of describing the certification tasks to take for an IMA system which include13Task 1 Module acceptance13Task 2 Application softwarehardware acceptance13Task 3 IMA system acceptance13Task4 Aircraft integration of IMA systems including verification and validation13Task 5 Change of modules or applications13Task 6 Reuse of modules or applications13Taken together these standards provide guidance that if followed likely will result in safe highly reliable and cost-effective systems over the life-cycle of the system Yet while these guidelines exist there is not a consistent industry-wide method to assess a system at any stage of the life-cycle to allow a tradeoff of design alternatives Also there is not a standard outlining overall reliability for a system to include hardware and software reliability In order to achieve this level of reliability a standard should be developed to define the process and method to achieve a quantifiable reliability number that would in turn lead to acceptance13

18

Definition of Complexity and Reliability is Needed

[Figure 9 (diagram): Components 1-4, each characterized by complexity fundamentals and reliability parametrics (TRL 3 or 4), integrate into Subsystems 1 and 2, each characterized by system integration of components and reliability dependencies (TRL 6 or 7). These in turn integrate into the realized system, characterized by reliability sensitivities (TRL 8 or 9), yielding a highly reliable complex system and a certificate (e.g. AWR).]

Presenter
Presentation Notes
As mentioned in the introduction, the overall goal of this paper is to state the need for an industry standard method for qualifying complex integrated systems to a specified reliability. In order to achieve this, systems must be broken down to component levels and built up to subsystem and system levels. An overall aggregated system reliability value should result. The goal should be to establish the ability to assess reliability at the component, subsystem, and then system level, with each phase working toward a higher Technical Readiness Level (TRL). The end result would be fed into the accepted Type Certificate (TC) and/or AWR.
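To make the component-to-system roll-up concrete, here is a minimal sketch (not from the paper; the series structure, the independence assumption and all numbers are illustrative) of aggregating component reliabilities into subsystem and system values. A real standard would need to extend this baseline with the dependency and sensitivity information called out in the figure.

```python
from math import prod

def series_reliability(reliabilities):
    """Reliability of a series arrangement of independent elements.

    Assumes statistical independence and a pure series structure --
    a simplification; real complex systems have coupled failure modes.
    """
    return prod(reliabilities)

# Hypothetical component reliabilities over one mission (illustrative only)
subsystem_1 = series_reliability([0.9995, 0.9990])   # components 1 and 2
subsystem_2 = series_reliability([0.9993, 0.9997])   # components 3 and 4
system      = series_reliability([subsystem_1, subsystem_2])

print(f"Subsystem 1: {subsystem_1:.6f}")
print(f"Subsystem 2: {subsystem_2:.6f}")
print(f"Aggregated system reliability: {system:.6f}")
```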

19

Analytical Models and Reliability

• Analytical models of hardware reliability are well understood

• Architecture modeling and software reliability modeling are not novel ideas, but are highly debated
  – There are many approaches and little consensus as to the best way
  – Many models (Jelinski-Moranda, Littlewood-Verrall, Musa-Okumoto, etc.) [1]
  – Many tools (over 200 built since the 1970s) [2]

• Predictability of software reliability is of great concern because software is a major contributor to unreliability [2]

• Software reliability is the probability of error-free operation of a computer program in a specified environment for a specified time [1]

• Need a basis for setting reliability figures from previous systems, iteratively refined in the future

• NOT A REPLACEMENT FOR TESTING AND VERIFICATION

Presenter
Presentation Notes
As mentioned in the introduction, the overall goal of this paper is to state the need for an industry standard method for qualifying complex integrated systems to a specified reliability. While not suggesting a complete solution, better modeling practices should be considered as part of a solution to bridge the gap between design, test and implementation. A method to model the architecture early in the requirements establishment phase, follow the detailed design and coding, and then verify the system along with the model may be a path to greater confidence in the system and fewer risks, warnings and cautions that must be issued.
In order to achieve this, systems must be broken down to component levels and built up to subsystem and system levels. An overall aggregated system reliability value should result (see Figure 9). The goal should be to establish the ability to assess reliability at the component, subsystem and then system level, with each phase working toward a higher Technical Readiness Level (TRL). The end result would be fed into the accepted Type Certificate (TC) or AWR.
To achieve this goal, modeling and analysis tools that follow a standard for modeling reliability should exist. As previously stated, over 200 tools have been created since the 1970s. Here is a list of a few of the current tools:
Universal Systems Language (USL)
Unified Modeling Language (UML)
Systems Modeling Language (SysML)
MATLAB/Simulink
Telelogic Rhapsody
MathCAD
Colored Petri Nets
Rate Monotonic Analysis (RMA)
STATEMATE (used by Airbus)
Standard for the Development of Safety-Critical Embedded Software (SCADE)
OPNET
Embedded System Modeling Language (ESML)
Component Synthesis using Model-Integrated Computing (CoSMIC)
Architectural Analysis and Design Language (AADL)
By no means is this list complete. Typically, different companies and projects address this challenge and choose unique tools to perform the upfront analysis and modeling, not following a standard approach. Multiple tools need to converge or be compatible with the framework of a common tool such as AADL. The resultant tool(s) could be used to verify the requirements up front to mitigate reliability issues down the life-cycle.
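As a concrete taste of the model families named on the slide, the following is a minimal sketch (all parameter values are assumed purely for illustration) of the Musa-Okumoto logarithmic Poisson execution-time model: the expected number of failures by execution time t is mu(t) = (1/theta) * ln(lambda0 * theta * t + 1), and the probability of failure-free operation over a further mission of length tau follows from the non-homogeneous Poisson process assumption.

```python
from math import exp, log

def mu(t, lam0, theta):
    """Musa-Okumoto mean-value function: expected failures by execution time t."""
    return (1.0 / theta) * log(lam0 * theta * t + 1.0)

def reliability(tau, t, lam0, theta):
    """Probability of failure-free operation for mission time tau,
    given t units of prior test/execution time (NHPP assumption)."""
    return exp(-(mu(t + tau, lam0, theta) - mu(t, lam0, theta)))

# Illustrative (assumed) parameters: initial failure intensity lam0 and
# failure-intensity decay theta would be fitted from test data in practice.
lam0, theta = 0.05, 0.02          # failures/hour, per-failure decay
t_test = 2000.0                   # hours of accumulated test time
print(f"Expected failures so far: {mu(t_test, lam0, theta):.1f}")
print(f"R(10 h mission): {reliability(10.0, t_test, lam0, theta):.4f}")
```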

20

Tools for Modeling and Analysis

• Universal Systems Language (USL)
• Unified Modeling Language (UML)
• Systems Modeling Language (SysML)
• MATLAB/Simulink
• Telelogic Rhapsody
• MathCAD
• Colored Petri Nets
• Rate Monotonic Analysis (RMA)
• STATEMATE (used by Airbus)
• SCADE
• OPNET
• Embedded System Modeling Language (ESML)
• Component Synthesis using Model-Integrated Computing (CoSMIC)
• Architectural Analysis and Design Language (AADL)
• At least 200+ more packages since the 1970s
• Certified tools need to converge to an accepted standard modeling/analysis method for complex system reliability

Presenter
Presentation Notes
Typically, different companies and projects address the challenge and choose unique tools to perform the upfront analysis and modeling, not following a guideline approach. Multiple tools need to converge or be compatible with a set modeling standard for complex systems. The compliant tools could be used to verify the requirements up front to mitigate reliability issues down the life-cycle. A notional approach would follow that shown in Figure 8, where the system V is followed but Architectural Modeling and Analysis go in parallel with the real development effort. This would allow reliability to be allocated during the design phase and measured during the implementation, test and verification phase using the model. The downside to modeling, in certain circles, is getting people to believe those models. How do you certify a modeling tool, and the actual models within the tools? Those issues should be addressed going forward.

21

Modification to Acquisition Model

[Figure 10 (diagram): the acquisition V - Requirements Establishment, High Level Design, Detailed Specifications, Implementation Coding on the design leg; Verification, Development Testing, Operational Testing & Validation, Deployed System on the test leg - with an Architectural Model & Analysis activity spanning the V in parallel. Reliability is allocated on the design leg and measured on the test leg.]

Propose standard modeling methodology to be applied at different phases of development to enhance requirements development, reliability allocation, reliability measurement and testing (DISCLAIMER: DOES NOT REPLACE TESTING)

Presenter
Presentation Notes
A notional approach would follow that shown in Figure 10, where the system V is followed but Architectural Modeling and Analysis go in parallel with the real development effort. This would allow reliability to be allocated during the design phase and measured during the implementation, test and verification phase using the model. This approach could bridge the design and test phases together. It is emphasized here that the architectural model would not replace critical testing, but augments the process to allow for better requirement identification and verification. Thorough ground and flight tests should never be replaced by modeling. Modeling would only allow for more robust requirements and design and a higher level of confidence in them. The model could be used in conjunction with the testing to confirm the design. Proper modeling and analysis would reduce total program costs by enforcing more complete and correct initial requirements, which reduces issues discovered later in testing that are expensive or impossible to fix and that force the program to accept high risks. Additionally, if the model is maintained and optimized, it could possibly be used after system deployment to analyze the impacts of upgrades or changes to the system, allowing for more complete analysis and reducing overall system redesign costs.

A hurdle to cross with modeling and analysis is convincing people to believe those models. Some method to certify these models and modeling tools should be addressed in the future. Standards should be set in place for correct modeling techniques for complex systems. Lastly, consideration should be given to standard verification checking tools, such as Motor Industry Software Reliability Association (MISRA) compliance verification tools for the use of C in safety-critical systems.
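As one small illustration of the "reliability allocated, reliability measured" loop on the V, here is a sketch (equal apportionment over assumed series subsystems; the target and all measured values are illustrative) of allocating a system target on the design leg and rolling measured values back up on the test leg.

```python
def allocate_equal(system_target, n_subsystems):
    """Equal apportionment: each of n series subsystems gets the n-th root
    of the system target (assumes independence and a series structure)."""
    return system_target ** (1.0 / n_subsystems)

# Assumed system-level target for one mission (illustrative only)
target = 0.999
n = 4
per_subsystem = allocate_equal(target, n)
print(f"Each of {n} subsystems must achieve R >= {per_subsystem:.6f}")

# Later, measured values from development testing can be rolled back up:
measured = [0.99985, 0.99979, 0.99991, 0.99968]
achieved = 1.0
for r in measured:
    achieved *= r
print(f"Measured system reliability: {achieved:.6f} (target {target})")
```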

22

Systems Reliability Standard Establishment

• Establish a working group to define this standard
  – Need a technical society to lead the charge

• Collaborate with industry, academia, military and societies
  – Focus on development of a reliability standard with AWR safety in mind
  – Draw upon experience to feed into this standard

• Study existing and previous complex systems
  – Shuttle, Space Station, missile systems, nuclear submarine and ship systems, nuclear control systems, military and commercial jet systems
  – Obtain software reliability information from existing and previous systems
  – Build a database to serve as the basis for future reliability figures

• Research prior efforts in complex systems analysis

• Establish a consensus-based modeling and analysis method

Presenter
Presentation Notes
In conclusion, methods for achieving a design for complex systems do exist; however, achieving reliability and attaining a level of qualification that would permit better AWRs does not. There needs to be a standard developed to tackle this issue rather than relying on current methods to ascertain airworthiness of complex systems. This will not occur overnight. An orchestrated collaboration among industry, academia, military labs and technical professional societies to focus on development of this standard should allow us to draw upon experience to feed into this reliability standard with AWR safety in mind. We have a long-living experiment with complex software systems in the Space Transportation System (STS), International Space Station (ISS), missile systems, nuclear submarine and ship systems, nuclear control systems, and military and commercial jet systems, from which we should be able to obtain at least inferred software reliability information from the architecture and run time that the systems use. We should look at the lessons learned from these systems to see what could have been improved and what was done right that should be carried forward. The challenge is collecting this information into a central database and arriving at some figure of reliability from previous systems where data exists. This would at least provide a starting point to allow initial assessments and could be refined in the future. Also, this is not the only study for establishing reliability metrics for complex software systems. There have been similar research projects that have risen and fallen; the data from those projects should not be wasted, but studied to feed into whatever standard is developed. While historical information would be useful, each design is unique and requires tools to accomplish the design. Architectural modeling constructs should be further investigated as a possible augmentation to the design and test process. We need to determine which forum is best to conduct this effort (e.g. SAE, IEEE, AIAA, ACM, AHS, INCOSE or other). As stated in the paper "Space Shuttle Avionics" [31]: "The designers, the flight crew and other operational users of the system often have a mindset established in a previous program or experience which results in a bias against new or different 'unconventional' approaches." If nothing is done to address this problem, it will only get worse over time. It is past time to address the issue of reliability of complex systems and software.

23

BACKUP SLIDES

24

PEO Aviation System Safety Management Decision Authority Matrix

[Risk matrix (diagram): severity of the most credible outcome (Catastrophic 1, Critical 2, Marginal 3, Negligible 4) crossed with probability level (A Frequent, P > 1E-3; B Probable, 1E-4 < P <= 1E-3; C Occasional, 1E-5 < P <= 1E-4; D Remote, 1E-6 < P <= 1E-5; E Improbable, 1E-7 < P <= 1E-6). Each cell assigns the risk decision authority: Army Acquisition, PEO Aviation, or Program Management.]

Hazard Category / Description:
1 Catastrophic: Death or permanent total disability; system loss
2 Critical: Severe injury or minor occupational illness (no permanent effect); minor system or environmental damage
3 Marginal: Minor injury or minor occupational illness (no permanent effect); minor system or environmental damage
4 Negligible: Less than minor injury or occupational illness (no lost workdays), or less than minor environmental damage

Risk Level / Description / Probability (frequency per 100,000 flight hours):
A Frequent: > 100 (P > 1E-3)
B Probable: <= 100 and > 10 (1E-4 < P <= 1E-3)
C Occasional: <= 10 and > 1 (1E-5 < P <= 1E-4)
D Remote: <= 1 and > 0.1 (1E-6 < P <= 1E-5)
E Improbable: <= 0.1 and > 0.01 (1E-7 < P <= 1E-6)

Presenter
Presentation Notes
As already mentioned, it is the goal of the US Army Aviation Engineering Directorate that the system be safe and reliable to operate and perform the mission when delivered. Another goal is that the released system continue to safely perform the mission if maintained and operated per the operator's manual. Replacement parts and overhaul work must be of high quality to support continued airworthiness. Per the Program Executive Office Memorandum 08-03 Risk Matrix, US Army flight control systems are to achieve a 1E-9 failure probability for flight-critical functions per civil airspace regulations [35] and 1E-6 for tactical use [36]. Quantifying these numbers is established for component hardware but not for software. Just as hardware should have quantifiable reliability, so should software.
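The probability bands in this matrix translate directly into a lookup. The sketch below (function name and example values are illustrative; the bands come from the slide) maps a per-flight-hour failure probability to its risk level letter.

```python
def risk_level(p):
    """Map a per-flight-hour failure probability to the PEO Aviation
    risk level letter, using the probability bands from the matrix."""
    if p > 1e-3:
        return "A (Frequent)"
    if p > 1e-4:
        return "B (Probable)"
    if p > 1e-5:
        return "C (Occasional)"
    if p > 1e-6:
        return "D (Remote)"
    if p > 1e-7:
        return "E (Improbable)"
    return "below E (e.g. the 1E-9 flight-critical target)"

for p in (5e-4, 2e-6, 1e-9):
    print(f"P = {p:.0e}/flight hour -> {risk_level(p)}")
```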

25

Reliability Defined

• Software reliability: the probability that a given piece of software will execute without failure in a given environment for a given time [40]
  – Often debated as to how to measure

• Hardware reliability: the probability that a hardware component operates without failure over a given time
  – Well defined and established

• System reliability: the probability of success, or the probability that the system will perform its intended function under specified design limits [over a given amount of time] [39]
  – A combination of software and hardware reliability

Presenter
Presentation Notes
"Reliability is defined as the probability of success, or the probability that the system will perform its intended function under specified design limits." [39] "Software reliability is defined as the probability that a given piece of software will execute without failure in a given environment for a given time." [40] Systems rely on both hardware and software, and thus must combine the two to formulate an overall reliability.
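A minimal sketch of how that combination could be formed, assuming a constant-failure-rate hardware model and an independently assessed software reliability (both values are illustrative, and independence itself is a simplifying assumption):

```python
from math import exp

def hardware_reliability(t_hours, failure_rate_per_hour):
    """Constant-failure-rate (exponential) hardware model: R = exp(-lambda*t)."""
    return exp(-failure_rate_per_hour * t_hours)

def system_reliability(t_hours, hw_rate, sw_reliability):
    """Series combination of hardware and software reliability,
    assuming independent failure processes (a simplification)."""
    return hardware_reliability(t_hours, hw_rate) * sw_reliability

# Assumed values for a 10-hour mission (illustrative only)
R_sys = system_reliability(10.0, hw_rate=1e-5, sw_reliability=0.9998)
print(f"System reliability over 10 h: {R_sys:.6f}")
```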

26

Hardware vs Software Reliability

Hardware Reliability vs. Software Reliability:

1. HW: Failure rate follows a bathtub curve; the burn-in state is similar to the software debugging state.
   SW: Without considering program evolution, the failure rate is statistically non-increasing.
2. HW: Material deterioration can cause failures even though the system is not used.
   SW: Failures never occur if the software is not used.
3. HW: Failure data are fitted to some distributions; the selection of the underlying distribution is based on the analysis of failure data and experience. Emphasis is placed on analyzing failure data.
   SW: Most models are analytically derived from assumptions; emphasis is on developing the model, the interpretation of the model assumptions, and the physical meaning of the parameters.
4. HW: Failures are caused by material deterioration, design errors, misuse and environment.
   SW: Failures are caused by incorrect logic, incorrect statements or incorrect input data.
5. HW: Can be improved by better design, better materials, applying redundancy and accelerated life-cycle testing.
   SW: Can be improved by increasing testing effort and correcting discovered faults; reliability tends to change continuously during testing due to the addition of problems in new code or the removal of problems by debugging.
6. HW: Repairs restore the original condition.
   SW: Repairs establish a new piece of software.
7. HW: Failures are usually preceded by warnings.
   SW: Failures are rarely preceded by warnings.
8. HW: Components can be standardized.
   SW: Components have rarely been standardized.
9. HW: Can usually be tested exhaustively.
   SW: Essentially requires infinite testing for completeness.

Reference: [39] Hoang Pham, "Software Reliability", Springer, 2000

Presenter
Presentation Notes
"Reliability is defined as the probability of success, or the probability that the system will perform its intended function under specified design limits." [39] "Software reliability is defined as the probability that a given piece of software will execute without failure in a given environment for a given time." [40] Hoang Pham compares software versus hardware reliability as shown in Table 2 [39]. Systems rely on both, and thus must have a combination of the two to formulate an overall reliability.
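To illustrate the table's first row, here is a small sketch (all shape and scale parameters are assumed purely for illustration) of a hardware bathtub hazard built from a decreasing Weibull burn-in term, a constant useful-life term, and an increasing Weibull wear-out term; a software failure rate, by contrast, is modeled as non-increasing absent code change.

```python
def weibull_hazard(t, shape, scale):
    """Weibull hazard rate h(t) = (shape/scale) * (t/scale)**(shape - 1)."""
    return (shape / scale) * (t / scale) ** (shape - 1)

def bathtub_hazard(t):
    """Illustrative bathtub curve: decreasing infant-mortality term
    (shape < 1), constant useful-life term, and increasing wear-out
    term (shape > 1). All parameters are assumed for illustration."""
    return (weibull_hazard(t, 0.5, 1000.0)      # burn-in / infant mortality
            + 2e-5                              # constant random failures
            + weibull_hazard(t, 5.0, 20000.0))  # wear-out

for t in (10.0, 1000.0, 15000.0, 25000.0):
    print(f"h({t:>7.0f} h) = {bathtub_hazard(t):.2e} per hour")
```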

27

Acronym List

ACRONYM - DEFINITION
AADL - Architectural Analysis and Design Language
AC - Advisory Circular (FAA)
ACM - Association for Computing Machinery
AED - Aviation Engineering Directorate (AMRDEC)
AFTD - Aviation Flight Test Directorate (US Army)
AGC - Apollo Guidance Computer
AHS - American Helicopter Society
AIAA - American Institute of Aeronautics and Astronautics (Inc.)
AMCOM - Aviation and Missile Command (US Army)
AMRDEC - Aviation and Missile Research, Development and Engineering Center (US Army)
AR - Army Regulation
ARINC - Aeronautical Radio, Inc.
ARP - Aerospace Recommended Practice
ASIF - Avionics Software Integration Facility
ATAM - Architecture Tradeoff Analysis Method
ATM - Air Traffic Management
AWR - Airworthiness Release
CAAS - Common Avionics Architecture System
CH-47 - Cargo Helicopter (Chinook)
CMM - Capability Maturity Model
CMMI - Capability Maturity Model Integration
CMU - Carnegie Mellon University
CNS - Communications, Navigation, Surveillance
CoSMIC - Component Synthesis using Model-Integrated Computing
CPS - Cyber-Physical System
CRC - Chemical Rubber Company (i.e. CRC Press)
DFBW - Digital Fly-By-Wire
DoD - Department of Defense
E3 - Electrical and Electromagnetic Effects
ESML - Embedded System Modeling Language
FAA - Federal Aviation Administration
FCS - Future Combat Systems
FHA - Functional Hazard Assessment
FMEA - Failure Modes Effects Analysis

28

Acronym List (concluded)

ACRONYM - DEFINITION
GPWS - Ground Proximity Warning System
IBM - International Business Machines
IEC - International Engineering Consortium
IL - Instrumentation Lab (now Draper Laboratory)
IMA - Integrated Modular Avionics
INCOSE - International Council On Systems Engineering
ISO - International Organization for Standardization
ISS - International Space Station
KAL - Korean Airlines
MISRA - Motor Industry Software Reliability Association
MIT - Massachusetts Institute of Technology
NASA - National Aeronautics and Space Administration (USA)
PDR - Preliminary Design Review
PEO - Program Executive Office
PNAS - Proceedings of the National Academy of Sciences
RAQ - Rotorcraft and Aircraft Qualification
RMA - Rate Monotonic Analysis
RTC - Redstone Test Center (US Army)
RTTC - Redstone Technical Test Center (US Army)
RTCA - Radio Technical Commission for Aeronautics
SAE - Society of Automotive Engineers
SED - Software Engineering Directorate (AMRDEC)
SEES - Software Engineering Evaluation System
SEI - Software Engineering Institute (CMU)
SIL - System Integration Laboratory
SSA - System Safety Assessment
STS - Space Transportation System
SysML - Systems Modeling Language
TMR - Triple Modular Redundant
TRL - Technical Readiness Level
UAS - Unmanned Aircraft System
UH-60 - Utility Helicopter (Blackhawk)
UML - Unified Modeling Language
US - United States
USL - Universal Systems Language

29

References

• [1] Israel Koren and Mani Krishna, "Fault-Tolerant Systems", Morgan Kaufmann, 2007
• [2] Jiantao Pan, "Software Reliability", Carnegie Mellon University, Spring 1999
• [3] Nachum Dershowitz, http://www.cs.tau.ac.il/~nachumd/horror.html
• [4] http://www.air-attack.com
• [5] David A. Mindell, "Digital Apollo: Human and Machine in Spaceflight", The MIT Press, 2008
• [6] Software Engineering Directorate, Software Engineering Evaluation System (SEES), "Program Manager Handbook for Flight Software Airworthiness", SED-SES-PMHFSA 001, December 2003
• [7] Software Engineering Directorate, Software Engineering Evaluation System (SEES), "Program Manager Handbook for Software Safety", SED-SES-PMHSSA 001, February 2006
• [8] AMCOM Regulation 385-17, "Software System Safety Policy", 15 March 2008
• [9] NASA Software Safety Guidebook, NASA-GB-8719.13, 31 March 2004
• [10] Margaret Hamilton & William Hackler, "Universal Systems Language: Lessons Learned from Apollo", IEEE Computer Society, 2008
• [11] Margaret Hamilton, "Full Life Cycle Systems Engineering and Software Development Environment: Development Before The Fact In Action", http://www.htius.com/Articles/Full_Life_Cycle.htm
• [12] Peter Feiler, David Gluch, John Hudak, "The Architecture Analysis & Design Language (AADL): An Introduction", CMU/SEI-2006-TN-011, February 2006
• [13] Peter Feiler, John Hudak, "Developing AADL Models for Control Systems: A Practitioner's Guide", CMU/SEI-2007-TR-014, July 2007
• [14] Bruce Lewis, "Using the Architecture Analysis and Design Language for System Verification and Validation", SEI Presentation, 2006
• [15] Feiler, Gluch, Hudak, Lewis, "Embedded System Architecture Analysis Using SAE AADL", CMU/SEI-2004-TN-004, June 2004
• [16] Charles Pecheur, Stacy Nelson, "V&V of Advanced Systems at NASA", NASA/CR-2002-211402, April 2002
• [17] Systems Integration Requirements Task Group, "ARP 4754: Certification Considerations for Highly-Integrated or Complex Aircraft Systems", SAE Aerospace, 10 April 1996
• [18] SAE, "ARP 4761: Guidelines and Methods for Conducting the Safety Assessment Process on Civil Airborne Systems and Equipment", December 1996
• [19] Aeronautical Radio, Inc. (ARINC), "ARINC Specification 653P1-2: Avionics Application Software Standard Interface, Part 1 – Required Services", 7 March 2006

30

References (Continued)

• [20] Department of Defense, "MIL-STD-882D: Standard Practice for System Safety", 19 January 1993
• [21] RTCA, Inc., "DO-178: Software Considerations in Airborne Systems and Equipment Certification", 1 December 1992
• [22] RTCA, Inc., "DO-254: Design Assurance Guidance for Airborne Electronic Hardware", 19 April 2000
• [23] US Army, "Aeronautical Design Standard Handbook: Rotorcraft and Aircraft Qualification (RAQ) Handbook", 21 October 1996
• [24] Cary R. Spitzer (Editor), "Avionics: Elements, Software and Functions", CRC Press, 2007
• [25] US Army, "Army Regulation 70-62: Airworthiness Qualification of Aircraft Systems", 21 May 2007
• [26] US Army, "Army Regulation 95-1: Aviation Flight Regulations", 3 February 2006
• [27] Barbacci, Clements, Lattanze, Northrop, Wood, "Using the Architecture Tradeoff Analysis Method (ATAM) to Evaluate the Software Architecture for a Product Line of Avionics Systems: A Case Study", CMU/SEI-2003-TN-012, July 2003
• [28] Peter Feiler, "All in the Family: CAAS & AADL", CMU/SEI-2008-SR-021, August 2008
• [29] Chrissis, Konrad, Shrum, "CMMI: Guidelines for Process Integration and Product Improvement", Pearson Education, 2007
• [30] Brendan O'Connell, "Model Driven Performance Analysis for Avionics Systems", Draper Laboratory, January 2006
• [31] John F. Hanaway, Robert W. Moorehead, "Space Shuttle Avionics Systems", NASA SP-504, 1989
• [32] Lui Sha, "The Complexity Challenge in Modern Avionics Software", August 14, 2006
• [33] "Incidents Prompt New Scrutiny of Airplane Software Glitches", Wall Street Journal, 30 May 2006
• [34] Eyal Ophir, Clifford Nass and Anthony Wagner, "Cognitive Control in Media Multitaskers", PNAS, 20 July 2009
• [35] "Advisory Circular AC 25.1309-1A: System Design and Analysis", Federal Aviation Administration, 21 June 1988
• [36] Program Executive Office Policy Memorandum 08-03
• [38] National Science Foundation webpage on Cyber-Physical Systems, http://www.nsf.gov/pubs/2008/nsf08611/nsf08611.htm
• [39] Hoang Pham, "Software Reliability", Springer, 2000
• [40] Paul Rook (editor), "Software Reliability Handbook", Elsevier Science Publishers Ltd., 1990

31

References (Concluded)

• [41] http://www.navair.navy.mil/v22/index.cfm?fuseaction=news.detail&id=128
• [42] http://mars8.jpl.nasa.gov/msp98/news/mco990930.html
• [43] John Garman, "The Bug Heard Around the World", ACM SIGSOFT, October 1981
• [44] "Mars Pathfinder Mission Status", July 15, 1997, http://marsprogram.jpl.nasa.gov/MPF/newspio/mpfstatus/pf970715.html
• [45] Nancy Leveson, "Safeware: System Safety and Computers", Addison-Wesley Publishing Company, 1995
• [46] http://www.electronicaviation.com/aircraft/JAS-39_Gripen/810
• [47] Brandon Hill, "Lockheed's F-22 Raptor Gets Zapped by International Date Line", DailyTech LLC, February 26, 2007, http://www.freerepublic.com/focus/f-news/1791574/posts
• [48] http://www.military.com/news/article/human-error-cited-in-most-uav-crashes.html
• [49] Daniel Michaels and Andy Pasztor, "Incidents Prompt New Scrutiny Of Airplane Software Glitches: As Programs Grow Complex, Bugs Are Hard to Detect; A Jet's Roller-Coaster Ride; Teaching Pilots to Get Control", Wall Street Journal, May 30, 2006


17

Certification Assessment Considerations

bull Sufficient data and time must be available for air worthiness evaluation

bull Certification processndash Currently lengthy ndash Depends much on human interpretation trade offs and risk mitigation ndash Overwhelming for complex integrated systems (FHAs FTAs FMECAs

risk mitigation etc)

bull Consistent industry-wide method to assess a system at any stage of the life-cycle to allow a tradeoff of design alternatives

bull Certification Tasks outlined in DO-297 should be consideredndash Task 1 Module Acceptancendash Task 2 Application softwarehardware acceptancendash Task 3 IMA system acceptancendash Task 4 Aircraft integration of IMA system ndash including VampVndash Task 5 Change of modules or applicationsndash Task 6 Reused of modules or applications

Presenter
Presentation Notes
In order to execute an AWR sufficient data must be provided to the appropriate airworthiness authorities to demonstrate that safety requirements are met to the levels per the System Safety Assessment (SSA) or equivalent The certification process is currently lengthy and depends on much human interpretation of the myriad of complex architecture functions 13The current guidelines such as DO-178B DO-254 DO-297 SAE-ARP-4754 and SAE-ARP-4671 along with many other guidelines outline the proper steps that should be taken System safety managementrsquos military standard is MIL-STD-882 and has been in use for decades Civilian safety standards for the aviation industry include SAE ARP4754 which shows the incorporation of system safety activities into the design process and provides guidance on techniques to use to ensure a safe design SAE ARP4761 contains significant guidance on how to perform the system safety activities spoken about in SAE ARP4754 DO-178B outlines development assurance guidance for aviation software based on the failure condition categorization of the functionality provided by the software DO-254 embodies similar guidance for aviation hardware ARINC 653 is a widely accepted standard to ensure time and space partitioning for software DO-297 does an excellent job of describing the certification tasks to take for an IMA system which include13Task 1 Module acceptance13Task 2 Application softwarehardware acceptance13Task 3 IMA system acceptance13Task4 Aircraft integration of IMA systems including verification and validation13Task 5 Change of modules or applications13Task 6 Reuse of modules or applications13Taken together these standards provide guidance that if followed likely will result in safe highly reliable and cost-effective systems over the life-cycle of the system Yet while these guidelines exist there is not a consistent industry-wide method to assess a system at any stage of the life-cycle to allow a tradeoff of design alternatives Also there is not a standard outlining overall reliability for a system to include hardware and software reliability In order to achieve this level of reliability a standard should be developed to define the process and method to achieve a quantifiable reliability number that would in turn lead to acceptance13

18

TRL 3 or 4 TRL 6 or 7 TRL 8 or 9

Component 4

ComplexityFundamentals

Reliability Parametrics

Component 3

ComplexityFundamentals

Reliability Parametrics

Definition of Complexity and Reliability is Needed

Subsystem 1

SystemIntegration ofComponents

ReliabilityDependencies

Component 2

ComplexityFundamentals

Reliability Parametrics

Component 1

ComplexityFundamentals

Reliability Parametrics

Subsystem 2

SystemIntegration ofComponents

ReliabilityDependencies

Integration System

Realized System

ReliabilitySensitivities

High Reliable Complex System

Certificate(eg AWR)

Integration

Integration

Integration

Integration

Integration

Presenter
Presentation Notes
As mentioned in the introduction the overall goal of this paper is to state the need for an industry standard method for qualifying complex integrated systems to a specified reliability In order to achieve this the systems must be broken down to component levels and built up to subsystem and system levels An overall aggregated system reliability value should result The goal should be to establish the ability to assess the reliability from a component subsystem and then a system level with each phase working toward a higher Technical Readiness Level (TRL) The end result would be fed into the accepted Type Certificate (TC) andor AWR

19

Analytical Models and Reliability

bull Analytical models of hardware reliability are well understood

bull Architecture modeling and software reliability modeling is not a novel idea but is highly debated

ndash There are many approaches and little consensus as to best wayndash Many models (Jelinski-Moranda Littlewood-Verrall Musa-Okumoto etc) [1]ndash Many tools (over 200+ tools since 1970s have been built) [2]

bull Predictability of software reliability is of great concern because it is a major contributor to unreliability[2]

bull Software Reliability is the probability of error-free operation of a computer program in a specified environment for a specified time [1]

bull Need basis for setting reliability figures based on previous systems and iteratively refine those figures in the future

bull NOT A REPLACEMENT FOR TESTING AND VERIFICATION

Presenter
Presentation Notes
As mentioned in the introduction the overall goal of this paper is to state the need for an industry standard method for qualifying complex integrated systems to a specified reliability While not suggesting as solution in total better modeling practices should be considered toward a solution to bridge the gap between design test and implementation A method to model the architecture early in the requirements establishment phase follow the detailed design and coding and then be able to verify the system along with the model may be a path to greater confidence in the system and reduce the risks warnings and cautions that must be issued13In order to achieve this the systems must be broken down to component levels and built up to subsystem and system levels An overall aggregated system reliability value should result (see Figure 9) The goal should be to establish the ability to assess the reliability from a component subsystem and then a system level with each phase working toward a higher Technical Readiness Level (TRL) The end result would be fed into the accepted Type Certificate (TC) or AWR 13To achieve this goal modeling and analysis tools that follow a standard for modeling reliability should exist As previously stated over 200 tools have been created since 1970s Here is a list of a few of the current tools13Universal Systems Language (USL)13Unified Modeling Language (UML)13Systems Modeling Language (SysML)13MATLABSimulink13Telelogic Rhapsody13MathCAD13Colored Petri Nets13Rate Monotonic Analysis (RMA)13STATEMATE (Used by Airbus)13Standard for the Development of Safety-Critical Embedded Software (SCADE)13OPNET13Embedded System Modeling Language (ESML)13Component Synthesis using Model-Integrated Computing (CoSMIC)13Architectural Analysis and Design Language (AADL)13By no means is this list complete Typically different companies and projects address this challenge and choose unit tools to perform the upfront analysis and modeling not following a standard approach Multiple tools need to converge or be compatible with the framework of a common tool such as AADL The resultant tool(s) could be used to verify the requirements up front to mitigate reliability issues down the life-cycle 13

20

Tools for Modeling and Analysis

bull Universal Systems Language (USL)bull Unified Modeling Language (UML)bull Systems Modeling Language (SysML)bull MATLABSimulinkbull Telelogic Rhapsodybull MathCadbull Colored Petri Netsbull Rate Monotonic Analysis (RMA)bull STATEMATE (Used by Airbus)bull SCADEbull OPNETbull Embedded System Modeling Language (ESML)bull Component Synthesis using Model-Integrated Computing (CoSMIC)bull Architectural Analysis and Design Language (AADL)bull At least 200+ more packages since the 70rsquosbull Certified tools needs to converge to an accepted standard

modelinganalysis method for complex system reliability

Presenter
Presentation Notes
Typically different companies and projects address the challenge and choose unit tools to perform the upfront analysis and modeling not following a guideline approach Multiple tools need to converge or be compatible with a set modeling standard for complex systems The compliant tools could be used to verify the requirements up front to mitigate reliability issues down the life-cycle A notional approach would follow that shown in Figure 8 where the system V is followed but Architectural Modeling and Analysis go parallel with the real development effort This would allow reliability to be measured during the design phase and measured during the implementation test and verification phase using the model The down side to modeling from certain circles is getting people to believe those models How do you certify a modeling tool and the actual models within the tools Those issues should be addressed going forward

21

Modification to Acquisition Model

Requirements Establishment

High Level Design

Detailed Specifications

Implementation Coding

Verification

Development TestingA

rchi

tect

ural

Mod

el amp

A

naly

sis

Propose standard modeling methodology to be applied at different phases of development to enhance requirements development reliability allocation reliability

measurement and testing (DISCLAIMER DOES NOT REPLACE TESTING)

Reliabilityallocated Reliability

measured

Operational Testing amp Validation

Deployed System

Presenter
Presentation Notes
A notional approach would follow that shown in Figure 10 where the system V is followed but Architectural Modeling and Analysis go parallel with the real development effort This would allow reliability to be measured during the design phase and measured during the implementation test and verification phase using the model This approach could bridge the design and test phases together It is emphasized here that the architectural model would not replace critical testing but augments the process to allow for better requirement identification and verification Thorough ground and flight tests should never be replaced by modeling Modeling would only allow for more robust and a higher level of confidence in the requirements and design The model could be used in conjunction with the testing to confirm the design Proper modeling and analysis would reduce total program costs by enforcing more complete and correct initial requirements which reduce issues discovered down the road in testing that are expensive to fix or impossible to fix and having to accept high risks Additionally if the model is maintained and optimized then it could possibly be used after system deployment to analyze impacts of upgrades or changes to the system allowing for more complete analysis and reduce overall system redesign costs1313A hurdle to cross with modeling and analysis is convincing people to believe those models Some method to certify these models and modeling tools should be addressed in the future Standards should be set in place for correct modeling techniques for complex systems Lastly consideration of standard verification checking tools should be made such as with the use of the Motor Industry Software Reliability Association (MISRA) compliance verification tool for the use of C in safety critical systems

22

Systems Reliability Standard Establishment

bull Establish a working group to define this standardndash Need a technical society to lead the charge on this

bull Collaborate with industry academia military and societiesndash Focus on development of a reliability standard with AWR safety in mindndash Draw upon the experiences to feed into this standard

bull Study existing and previous complex systemsndash Shuttle Space Station missile systems nuclear submarine and ship

systems nuclear control systems military and commercial jet systems ndash Obtain software reliability information from given existing and previous

systemsndash Build database which would serve as basis for future reliability

bull Research prior efforts in complex systems analysis

bull Establish consensus based modeling and analysis method

Presenter
Presentation Notes
In conclusion methods for achieving a design for complex systems do exist however achieving reliability and attaining a level of qualification that would permit better AWRs does not There needs to be a standard developed to tackle this issue rather than relying on current methods to ascertain airworthiness of complex systems This will not occur overnight An orchestrated collaboration among industry academia military labs and technically professional societies to focus on development of this standard should allow us to draw upon the experiences to feed into this reliability standard with AWR safety in mind We have a long living experiment with complex software systems on the Space Transportation System (STS) International Space Station (ISS) missile systems nuclear submarine and ship systems nuclear control systems military and commercial jet systems from which we should be able to obtain at least inferred software reliability information from the architecture and run time that the systems use We should look at the lessons learned from these systems to see what could have been done to improve and what was done right that should be carried forward The challenge is collecting this information into a central database and arriving at some figure of reliability from previous systems where data exists This would at least provide a starting point to allow initial assessments and could be optimized in the future Also this is not the only study for establishing reliability metrics to complex software systems There have been research projects of similarity to this effort that have risen and fallen The data from those projects should not be wasted but studied to feed into whatever standard that is developed While historical information would be useful each design is unique and requires tools to accomplish the design Investigation of architectural modeling constructs should be further investigated as a possible augmentation to the design and test process We need to determine which forum is best to conduct this effort (eg SAE IEEE AIAA ACM AHS INCOSE or other) As stated in the paper ldquoSpace Shuttle Avionicsrdquo [31] ldquoThe designers the flight crew and other operational users of the system often have a mindset established in a previous program or experience which results in a bias against new or different lsquounconventionalrsquo approachesrdquo If nothing is done to address this problem it will only get worse over time It is past time to address the issue of reliability of complex systems and software

23

BACKUP SLIDES

BACKUP SLIDES

24

PEO Aviation System Safety Management Decision Authority Matrix

Severity(Most Credible)

FrequentP gt 1E-3

A

Probable 1D-4 lt P lt= 1E-3

B

Occasional1E-5 lt P lt= 1E-4

C

Remote1E-6 lt P lt= 1E-5

D

Improbable1E-7 lt P lt= 1E-6

E

Catastrophic1

Critical 2

Marginal 3

Negligible4

Army Acquisition

PEO Aviation

ProgramManagement

HazardCategory

Description

1 Catastrophic Death or permanent total disability system loss

2 Critical Severe injury or minor occupational illness (no permanent effect) minor system or environmental damage

3 Marginal Minor injury or minor occupational illness (no permanent effect) minor system or environmental damage

4 Negligible Less than minor injury or occupational illness (no lost workdays) or less than minor environmental damage

RiskLevel

Description Probability (Frequency) (per 100000 flight hours)

A Frequent gt 100 (P gt 1E-3)

B Probable lt=100 and gt10 (1E-4 lt P lt= 1E-3)

C Occasional lt= 10 and gt1 (1E-5 lt P lt= 1E-4)

D Remote lt=1 and gt01 (1E-6 lt P lt= 1E-5)

E Improbable lt=01 and gt001 (1E-7 lt P lt= 1E-6)

Presenter
Presentation Notes
As already mentioned it is the goal of the US Army Aviation Engineering Directorate that the system is safe and reliable to operate and will perform the mission when delivered Another goal includes the released system continues to safely perform the mission if maintained and operated per the operatorrsquos manual Replacement parts and overhaul work must be high quality to support continued airworthiness Per the Program Element Office Memorandum 08-03 Risk Matrix US Army flight control systems are to achieve 1E-9 reliability for flight critical functions per civil airspace regulations [35] and 1E-6 reliability for tactical use [36] Quantifying these numbers is established for component hardware but not for software Just as hardware should have quantifiable reliability so should software13

25

Reliability Defined

bull Software Reliability - the probability that a given piece of software will execute without failure in a given environment for a given time [40] ndash Often debated as to how to measure

bull Hardware Reliability - the probability that a hardware component fails over time ndash Well defined and established

bull System Reliability - the probability of success or the probability that the system will perform its intended function under specified design limits [over a given amount of time] [39] ndash A combination of software and hardware reliability

Presenter
Presentation Notes
ldquoReliability is defined as the probability of success or the probability that the system will perform its intended function under specified design limitsrdquo[39] ldquoSoftware reliability is defined as the probability that a given piece of software will execute without failure in a given environment for a given timerdquo [40] Systems rely on both and thus must have a combination of the two to formulate an overall reliability

26

Hardware vs Software Reliability

Hardware Reliability Software ReliabilityFailure rate has a bathtub curve The burn-in state is similar to the software debugging state

Without considering program evolution failure rate is statistically non-increasing

Material deterioration can cause failures even though the system is not used

Failures never occur if the software is not used

Failure data are fitted to some distributions The selection of the underlying distribution is based on the analysis of failure data and experiences Emphasis is placed on analyzing failure data

Most models are analytically derived from assumptions Emphasis is on developing the model the interpretation of the model assumptions and the physical meaning of the parameters

Failures are caused by material deterioration design errors misuse and environment

Failures are caused by incorrect logic incorrect statements or incorrect input data

Can be improved by better design better material applying redundancy and accelerated life cycle testing

Can be improved by increasing testing effort and correcting discovered faults Reliability tends to change continuously during testing due to the addition of problems in new code or the removal of problems by debugging errors

Hardware repairs restore the original condition Software repairs establish a new piece of software

Hardware failures are usually preceded by warnings Software failures are rarely preceded by warnings

Hardware components can be standardized Software components have rarely been standardized

Hardware can usually be tested exhaustively Software essentially requires infinite testing for completeness

Reference [39] Hoang Pham ldquoSoftware Reliabilityrdquo Springer 2000

Presenter
Presentation Notes
ldquoReliability is defined as the probability of success or the probability that the system will perform its intended function under specified design limitsrdquo[39] ldquoSoftware reliability is defined as the probability that a given piece of software will execute without failure in a given environment for a given timerdquo [40] Hong Pham compares software versus hardware reliability with the information as shown in Table 2 [39] Systems rely on both and thus must have a combination of the two to formulate an overall reliability

27

Acronym ListACRONYM DEFINITIONAADL Architectural Analysis and Design LanguageAC Advisory Circular (FAA)ACM Association of Computing MachineryAED Aviation Engineering Directorate (AMRDEC)AFTD Aviation Flight Test Directorate (US Army)AGC Apollo Guidance ComputerAHS American Helicopter SocietyAIAA American Institute of Aeronautics and Astronautics (Inc)AMCOM Aviation and Missile Command (US Army)AMRDEC Aviation and Missiles Research Development and Engineering Center (US Army)AR Army RegulationARINC Aeronautical Radio Inc ARP Aerospace Recommended PracticeASIF Avionics Software Integration FacilityATAM Architecture Tradeoff Analysis MethodATM Air Traffic ManagementAWR Airworthiness ReleaseCAAS Common Avionics Architecture SystemCH-47 Cargo Helicopter ChinookCMM Capability Maturity ModelCMMI Capability Maturity Model IndexCMU Carnegie Mellon UniversityCNS Communications Navigation SurveillanceCoSMIC Component Synthesis using Model-Integrated ComputingCPS Cyber-Physical SystemCRC Chemical Rubber Company (ie CRC Press)DFBW Digital Fly-By-WireDoD Department of DefenseE3 Electrical and Electromagnetic EffectsESML Embedded System Modeling LanguageFAA Federal Aviation AdministrationFCS Future Combat SystemsFHA Functional Hazard AssessmentFMEA Failure Modes Effects Analysis

28

Acronym List (concluded)

ACRONYM DEFINITIONGPWS Ground Proximity Warning SystemIBM International Business MachinesIEC International Engineering ConsortiumIL Instrumentation Lab (now Draper Laboratory)IMA Integrated Modular AvionicsINCOSE International Council On Systems EngineeringISO International Organization for StandardizationISS International Space StationKAL Korean AirlinesMISRA Motor Industry Standard Software Reliability AssociationMIT Massachusetts Institute of TechnologyNASA National Aeronautics and Space Administration (USA)PDR Preliminary Design ReviewPEO Program Element OfficePNAS Proceedings of the National Academy of SciencesRAQ Rotorcraft and Aircraft QualificationRMA Rate Monotonic AnalysisRTC Redstone Test Center (US Army) RTTC Redstone Technical Test Center (US Army)RTCA Radio Technical Commission for AeronauticsSAE Society of Automotive EngineersSED Software Engineering Directorate (AMRDEC)SEES Software Engineering Evaluation SystemSEI Software Engineering Institute (CMU)SIL System Integration LaboratorySSA System Safety AssessmentSTS Space Transportation SystemSysML Systems Modeling LanguageTMR Triple Modular RedundantTRL Technical Readiness LevelUAS Unmanned Aircraft SystemUH-60 Utility Helicopter BlackhawkUML Unified Modeling LanguageUS United StatesUSL Universal Systems Language

29

References

bull [1] Israel Koren and Mani Krishna ldquoFault-Tolerant Systemsrdquo Morgan Kaufmann 2007bull [2] Jianto Pan ldquoSoftware Reliabilityrdquo Carnegie Mellon University Spring 1999bull [3] Nachum Dershowitz httpwwwcstauacil~nachumdhorrorhtmlbull [4] httpwwwair-attackcombull [5] David A Mindell ldquoDigital Apollo Human and Machine in Spaceflightrdquo The MIT Press 2008bull [6] Software Engineering Directorate Software Engineering Evaluation System (SEES) ldquoProgram Manager Handbook for

Flight Software Airworthiness SED-SES-PMHFSA 001 December 2003bull [7] Software Engineering Directorate Software Engineering Evaluation System (SEES) ldquoProgram Manager Handbook for

Software Safety SED-SES-PMHSSA 001 February 2006bull [8] AMCOM Regulation 385-17 Software System Safety Policy 15 March 2008bull [9] NASA Software Safety Guidebook NASA-GB-871913 31 March 2004bull [10] Margaret Hamilton amp William Hackler ldquoUniversal Systems Language Lessons Learned from Apollordquo IEEE Computer

Society 2008bull [11] Margaret Hamilton ldquoFull Life Cycle Systems Engineering and Software Development Environment Development Before

The Fact In Actionrdquo httpwwwhtiuscomArticlesFull_Life_Cyclehtmbull [12] Peter Feiler David Gluch John Hudak ldquo The Architecture Analysis amp Design Language (AADL) An Introductionrdquo

CMUSEI-2006-TN-011 February 2006bull [13] Peter Feiler John Hudak ldquoDeveloping AADL Models for Control Systems A Practitionerrsquos Guiderdquo CMUSEI-2007-TR-

014 July 2007bull [14] Bruce Lewis ldquoUsing the Architecture Analysis and Design Language for System Verification and Validationrdquo SEI

Presentation 2006bull [15] Feiler Gluch Hudak Lewis ldquoEmbedded System Architecture Analysis Using SAE AADLrdquo CMUSEI-2004-TN-004 June

2004bull [16] Charles Pecheur Stacy Nelson ldquoVampV of Advanced Systems at NASArdquo NASACR-2002-211402 April 2002bull [17] Systems Integration Requirements Task Group ldquoARP 4754 Certification Considerations for Highly-Integrated or

Complex Aircraft Systemsrdquo SAE Aerospace 10 April 1996bull [18] SAE ldquoARP 4761 Guidelines and Methods for Conducting the Safety Assessment Process on Civil Airborne System and

Equipmentrdquo December 1996bull [19] Aeronautical Radio Inc (ARINC) ldquoARINC Specification 653P1-2 Avionics Application Software Standard Interface Part 1

ndash Required Servicesrdquo 7 March 2006

30

References (Continued)

bull [20] Department of Defense ldquoMIL-STD-882D Standard Practice for System Safetyrdquo 19 January 1993bull [21] RTCA Incorporated ldquoDO-178 Software Considerations in Airborne Systems and Equipment Certificationrdquo 1

December 1992bull [22] RTCA Incorporated ldquoDO-254 Design Assurance Guidance for Airborne Electronic Hardwarerdquo 19 April 2000bull [23] US Army ldquoAeronautical Design Standard Handbook Rotorcraft and Aircraft Qualification (RAQ) Handbookrdquo 21

October 1996bull [24] Cary R Spitzer (Editor) ldquoAvionics Elements Software and Functionsrdquo CRC Press 2007bull [25] US Army ldquoArmy Regulation 70-62 Airworthiness Qualification of Aircraft Systemsrdquo 21 May 2007bull [26] US Army ldquoArmy Regulation 95-1 Aviation Flight Regulationsrdquo 3 February 2006bull [27] ldquoUsing the Architecture Tradeoff Analysis Method (ATAM) to Evaluate the Software Architecture for a Product Line

of Avionics Systems A Case Studyrdquo Barbacci Clements Lattanze Northrop Wood July 2003 CMUSEI-2003-TN-012bull [28] ldquoAll in the Family CAAS amp AADLrdquo Peter Feiler August 2008 CMUSEI-2008-SR-021bull [29] ldquoCMMI Guidelines for Process Integration and Product Improvementrdquo Chrissis Konrad Shrum Pearson Education

2007bull [30] ldquoModel Driven Performance Analysis for Avionics Systemsrdquo Brendan OrsquoConnell Draper Laboratory January 2006bull [31] John F Hanaway Robert W Moorehead ldquoSpace Shuttle Avionics Systemsrdquo NASA SP-504 1989bull [32] Lui Sha ldquoThe Complexity Challenge in Modern Avionics Softwarerdquo August 14 2006bull [33] ldquoIncidents Prompt New Scrutiny of Airplane Software Glitchesrdquo 30 May 2006 Wall Street Journalbull [34] Eyal Ophir Clifford Nass and Anthony Wagner ldquo Cognitive Control in Media Multitaskersrdquo PNAS 20 July 2009bull [35] ldquoAdvisory Circular AC 251309-1A System Design and Analysisrdquo Federal Aviation Administration 21 June 1988bull [36] Program Element Office Policy Memorandum 08-03bull [38] httpwwwnsfgovpubs2008nsf08611nsf08611htm National Science Foundation webpage on Cyber-Physical

Systemsbull [39] Hoang Pham ldquoSoftware Reliabilityrdquo Springer 2000bull [40] Paul Rook editor ldquoSoftware Reliability Handbookrdquo Elsevier Science Publishers LTD 1990

31

References (Concluded)

bull [41]httpwwwnavairnavymilv22indexcfmfuseaction=newsdetailampid=128bull [42]httpmars8jplnasagovmsp98newsmco990930htmlbull [43]John Garmen ldquoThe Bug Heard Around the Worldrdquo ACM SIGSOFT October 1981bull [44]httpmarsprogramjplnasagovMPFnewspiompfstatuspf970715html ldquoMars Pathfinder Mission Statusrdquo July 15

1997bull [45] Nancy Leveson ldquoSafeware System Safety and Computersrdquo Addison-Wesley Publishing Company 1995bull [46] httpwwwelectronicaviationcomaircraftJAS-39_Gripen810bull [47] Brandon Hillhttpwwwfreerepubliccomfocusf-news1791574posts Lockheeds F-22 Raptor Gets Zapped by

International Date Line DailyTech LLC February 26 2007 bull [48] httpwwwmilitarycomnewsarticlehuman-error-cited-in-most-uav-crasheshtmlbull [49] Daniel Michaels and Andy Pasztor ldquoIncidents Prompt New Scrutiny Of Airplane Software Glitches As Programs

Grow Complex Bugs Are Hard to Detect A Jets Roller-Coaster Ride Teaching Pilots to Get Controlrdquo Wall-Street Journal May 30 2006

  • Qualification and Reliability of Complex Electronic Rotorcraft Systemsby Alex Boydston amp Dr William Lewis AMRDECfor AFRL Safe and Secure Symposium 15-17 June 2010
  • Agenda
  • Objective
  • Defense Acquisition Approach to Systems Development and Test
  • US Army Airworthiness
  • AED and Qualification
  • Evolution of Helicopter Systems
  • Present Approach to Testing
  • Development Challenges
  • Complexity Issues
  • Complexity Issues (continued)
  • Reliability vs Complexity amp Cost vs Complexity
  • A Few Examples of Complex Systems
  • Some Complex System Failures
  • Lessons Learned from Failures
  • Some Current Guidelines
  • Certification Assessment Considerations
  • Definition of Complexity and Reliability is Needed
  • Analytical Models and Reliability
  • Tools for Modeling and Analysis
  • Modification to Acquisition Model
  • Systems Reliability Standard Establishment
  • BACKUP SLIDES
  • PEO Aviation System Safety Management Decision Authority Matrix
  • Reliability Defined
  • Hardware vs Software Reliability
  • Acronym List
  • Acronym List (concluded)
  • References
  • References (Continued)
  • References (Concluded)

18

Definition of Complexity and Reliability is Needed

[Figure: Components 1 through 4, each characterized by complexity fundamentals and reliability parametrics (TRL 3 or 4), integrate into Subsystems 1 and 2, each characterized by system integration of components and reliability dependencies (TRL 6 or 7). The subsystems in turn integrate into the realized integration system, characterized by reliability sensitivities (TRL 8 or 9), yielding a highly reliable complex system and a certificate (e.g., AWR).]

Presenter
Presentation Notes
As mentioned in the introduction, the overall goal of this paper is to state the need for an industry standard method for qualifying complex integrated systems to a specified reliability. In order to achieve this, the systems must be broken down to component levels and built up to subsystem and system levels. An overall aggregated system reliability value should result. The goal should be to establish the ability to assess the reliability at the component, subsystem, and then system level, with each phase working toward a higher Technology Readiness Level (TRL). The end result would be fed into the accepted Type Certificate (TC) and/or AWR.
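
To make this roll-up concrete, the following is a minimal sketch; the structure (two series subsystems of two components each) and every numeric value are illustrative assumptions, not figures from the paper.

```python
def series(reliabilities):
    """Series structure: every element must work, so reliabilities multiply."""
    r = 1.0
    for ri in reliabilities:
        r *= ri
    return r

def parallel(reliabilities):
    """Parallel (redundant) structure: it fails only if every element fails."""
    q = 1.0
    for ri in reliabilities:
        q *= 1.0 - ri
    return 1.0 - q

# Assumed per-mission component reliabilities (illustrative values only).
component = {"c1": 0.999, "c2": 0.995, "c3": 0.990, "c4": 0.998}

# Subsystem 1 integrates components 1 and 2; subsystem 2 integrates 3 and 4.
sub1 = series([component["c1"], component["c2"]])
sub2 = series([component["c3"], component["c4"]])

# The realized system here assumes both subsystems are required (series);
# a redundant subsystem pair would use parallel() instead.
system = series([sub1, sub2])

print(f"Subsystem 1: {sub1:.6f}  Subsystem 2: {sub2:.6f}  System: {system:.6f}")
```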

19

Analytical Models and Reliability

• Analytical models of hardware reliability are well understood

• Architecture modeling and software reliability modeling are not novel ideas, but they are highly debated
  – There are many approaches and little consensus as to the best way
  – Many models (Jelinski-Moranda, Littlewood-Verrall, Musa-Okumoto, etc.) [1]
  – Many tools (more than 200 built since the 1970s) [2]

• Predictability of software reliability is of great concern because software is a major contributor to system unreliability [2]

• Software reliability is the probability of error-free operation of a computer program in a specified environment for a specified time [1]

• Need a basis for setting reliability figures based on previous systems, and iteratively refine those figures in the future

• NOT A REPLACEMENT FOR TESTING AND VERIFICATION
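
As a concrete instance of two of the models named above, the sketch below evaluates the Jelinski-Moranda and Musa-Okumoto reliability expressions for assumed parameter values; in practice the parameters would be fitted to a program's failure data, and none of these numbers come from the paper.

```python
import math

def jm_reliability(N, phi, k, t):
    """Jelinski-Moranda: with N initial faults, a per-fault hazard phi, and
    k faults already found and removed, R(t) = exp(-phi * (N - k) * t)."""
    return math.exp(-phi * (N - k) * t)

def mo_mean_failures(lam0, theta, t):
    """Musa-Okumoto logarithmic Poisson: expected failures observed by time t,
    mu(t) = ln(lam0 * theta * t + 1) / theta."""
    return math.log(lam0 * theta * t + 1.0) / theta

def mo_reliability(lam0, theta, t, x):
    """Reliability over a mission of length x after test exposure t:
    R = exp(-(mu(t + x) - mu(t)))."""
    return math.exp(-(mo_mean_failures(lam0, theta, t + x)
                      - mo_mean_failures(lam0, theta, t)))

# Illustrative parameters only.
print(jm_reliability(N=50, phi=0.001, k=45, t=10.0))          # ~0.951
print(mo_reliability(lam0=0.5, theta=0.05, t=1000.0, x=10.0)) # ~0.83
```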

Presenter
Presentation Notes
As mentioned in the introduction, the overall goal of this paper is to state the need for an industry standard method for qualifying complex integrated systems to a specified reliability. While not suggesting a total solution, better modeling practices should be considered as part of a solution to bridge the gap between design, test, and implementation. A method to model the architecture early in the requirements establishment phase, follow the detailed design and coding, and then verify the system along with the model may be a path to greater confidence in the system and may reduce the risks, warnings, and cautions that must be issued.

In order to achieve this, the systems must be broken down to component levels and built up to subsystem and system levels. An overall aggregated system reliability value should result (see Figure 9). The goal should be to establish the ability to assess the reliability at the component, subsystem, and then system level, with each phase working toward a higher Technology Readiness Level (TRL). The end result would be fed into the accepted Type Certificate (TC) or AWR.

To achieve this goal, modeling and analysis tools that follow a standard for modeling reliability should exist. As previously stated, over 200 tools have been created since the 1970s. Here is a list of a few of the current tools: Universal Systems Language (USL), Unified Modeling Language (UML), Systems Modeling Language (SysML), MATLAB/Simulink, Telelogic Rhapsody, MathCAD, Colored Petri Nets, Rate Monotonic Analysis (RMA), STATEMATE (used by Airbus), Standard for the Development of Safety-Critical Embedded Software (SCADE), OPNET, Embedded System Modeling Language (ESML), Component Synthesis using Model-Integrated Computing (CoSMIC), and Architecture Analysis and Design Language (AADL).

By no means is this list complete. Typically, different companies and projects address this challenge and choose unit tools to perform the upfront analysis and modeling, not following a standard approach. Multiple tools need to converge or be compatible with the framework of a common tool such as AADL. The resultant tool(s) could be used to verify the requirements up front to mitigate reliability issues down the life-cycle.

20

Tools for Modeling and Analysis

• Universal Systems Language (USL)
• Unified Modeling Language (UML)
• Systems Modeling Language (SysML)
• MATLAB/Simulink
• Telelogic Rhapsody
• MathCAD
• Colored Petri Nets
• Rate Monotonic Analysis (RMA)
• STATEMATE (used by Airbus)
• SCADE
• OPNET
• Embedded System Modeling Language (ESML)
• Component Synthesis using Model-Integrated Computing (CoSMIC)
• Architecture Analysis and Design Language (AADL)
• At least 200 more packages since the 1970s
• Certified tools need to converge to an accepted standard modeling/analysis method for complex system reliability

Presenter
Presentation Notes
Typically, different companies and projects address the challenge and choose unit tools to perform the upfront analysis and modeling, not following a guideline approach. Multiple tools need to converge or be compatible with a set modeling standard for complex systems. The compliant tools could be used to verify the requirements up front to mitigate reliability issues down the life-cycle. A notional approach would follow that shown in Figure 8, where the system V is followed but Architectural Modeling and Analysis runs parallel with the real development effort. This would allow reliability to be allocated during the design phase and measured during the implementation, test, and verification phase using the model. The downside to modeling, voiced in certain circles, is getting people to believe those models. How do you certify a modeling tool, and the actual models within the tool? Those issues should be addressed going forward.

21

Modification to Acquisition Model

[Figure: Modified acquisition V-model. Phases: Requirements Establishment, High-Level Design, Detailed Specifications, Implementation/Coding, Verification, Development Testing, Operational Testing & Validation, and Deployed System, with an Architectural Model & Analysis activity running in parallel across every phase. Reliability is allocated on the design side of the V and measured on the test side.]

Propose a standard modeling methodology to be applied at different phases of development to enhance requirements development, reliability allocation, reliability measurement, and testing. (DISCLAIMER: DOES NOT REPLACE TESTING)

Presenter
Presentation Notes
A notional approach would follow that shown in Figure 10, where the system V is followed but Architectural Modeling and Analysis runs parallel with the real development effort. This would allow reliability to be allocated during the design phase and measured during the implementation, test, and verification phase using the model. This approach could bridge the design and test phases together. It is emphasized here that the architectural model would not replace critical testing but augments the process to allow for better requirement identification and verification. Thorough ground and flight tests should never be replaced by modeling. Modeling would only allow for more robust requirements and a higher level of confidence in the requirements and design. The model could be used in conjunction with the testing to confirm the design. Proper modeling and analysis would reduce total program costs by enforcing more complete and correct initial requirements, which reduces the issues discovered later in testing that are expensive or impossible to fix and that force programs to accept high risks. Additionally, if the model is maintained and optimized, it could possibly be used after system deployment to analyze the impacts of upgrades or changes to the system, allowing for more complete analysis and reducing overall system redesign costs.

A hurdle to cross with modeling and analysis is convincing people to believe those models. Some method to certify these models and modeling tools should be addressed in the future. Standards should be set in place for correct modeling techniques for complex systems. Lastly, consideration of standard verification checking tools should be made, such as the Motor Industry Software Reliability Association (MISRA) compliance verification tool for the use of C in safety-critical systems.
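
To make "reliability allocated / reliability measured" concrete, a common starting point on the design side of the V is equal apportionment of a system target across serial elements; the target value and element count below are assumptions for illustration only.

```python
def allocate_equal(r_system_target, n_elements):
    """Equal apportionment: for n elements in series, each must meet
    R_i = R_sys ** (1/n) so the product of the R_i meets R_sys."""
    return r_system_target ** (1.0 / n_elements)

# Illustrative: allocate a 0.999 mission-reliability target over 4 subsystems.
target, n = 0.999, 4
per_subsystem = allocate_equal(target, n)
print(f"Each of {n} serial subsystems must achieve R >= {per_subsystem:.6f}")
print(f"Check (product of allocations): {per_subsystem ** n:.6f}")
```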

22

Systems Reliability Standard Establishment

• Establish a working group to define this standard
  – Need a technical society to lead the charge on this

• Collaborate with industry, academia, military, and societies
  – Focus on development of a reliability standard with AWR safety in mind
  – Draw upon the experiences to feed into this standard

• Study existing and previous complex systems
  – Shuttle, Space Station, missile systems, nuclear submarine and ship systems, nuclear control systems, military and commercial jet systems
  – Obtain software reliability information from existing and previous systems
  – Build a database which would serve as a basis for future reliability assessments

• Research prior efforts in complex systems analysis

• Establish a consensus-based modeling and analysis method

Presenter
Presentation Notes
In conclusion, methods for achieving a design for complex systems do exist; however, achieving reliability and attaining a level of qualification that would permit better AWRs does not. There needs to be a standard developed to tackle this issue rather than relying on current methods to ascertain airworthiness of complex systems. This will not occur overnight. An orchestrated collaboration among industry, academia, military labs, and technical professional societies to focus on development of this standard should allow us to draw upon the experiences to feed into this reliability standard with AWR safety in mind. We have a long-living experiment with complex software systems on the Space Transportation System (STS), International Space Station (ISS), missile systems, nuclear submarine and ship systems, nuclear control systems, and military and commercial jet systems, from which we should be able to obtain at least inferred software reliability information from the architecture and run time that the systems use. We should look at the lessons learned from these systems to see what could have been improved and what was done right that should be carried forward. The challenge is collecting this information into a central database and arriving at some figure of reliability from previous systems where data exists. This would at least provide a starting point to allow initial assessments and could be optimized in the future. Also, this is not the only study for establishing reliability metrics for complex software systems. There have been research projects similar to this effort that have risen and fallen. The data from those projects should not be wasted but studied to feed into whatever standard is developed. While historical information would be useful, each design is unique and requires tools to accomplish the design. Architectural modeling constructs should be further investigated as a possible augmentation to the design and test process. We need to determine which forum is best to conduct this effort (e.g., SAE, IEEE, AIAA, ACM, AHS, INCOSE, or other). As stated in the paper "Space Shuttle Avionics" [31]: "The designers, the flight crew, and other operational users of the system often have a mindset established in a previous program or experience which results in a bias against new or different 'unconventional' approaches." If nothing is done to address this problem, it will only get worse over time. It is past time to address the issue of reliability of complex systems and software.

23

BACKUP SLIDES

24

PEO Aviation System Safety Management Decision Authority Matrix

[Matrix: rows are the severity categories Catastrophic (1), Critical (2), Marginal (3), and Negligible (4); columns are the probability levels A (Frequent, P > 1E-3), B (Probable, 1E-4 < P <= 1E-3), C (Occasional, 1E-5 < P <= 1E-4), D (Remote, 1E-6 < P <= 1E-5), and E (Improbable, 1E-7 < P <= 1E-6). Each cell assigns the risk-acceptance decision authority: Army Acquisition, PEO Aviation, or Program Management.]

Hazard Category / Description:
1 Catastrophic – Death or permanent total disability, system loss
2 Critical – Severe injury or minor occupational illness (no permanent effect), minor system or environmental damage
3 Marginal – Minor injury or minor occupational illness (no permanent effect), minor system or environmental damage
4 Negligible – Less than minor injury or occupational illness (no lost workdays), or less than minor environmental damage

Risk Level / Description / Probability (Frequency, per 100,000 flight hours):
A Frequent – >100 (P > 1E-3)
B Probable – <=100 and >10 (1E-4 < P <= 1E-3)
C Occasional – <=10 and >1 (1E-5 < P <= 1E-4)
D Remote – <=1 and >0.1 (1E-6 < P <= 1E-5)
E Improbable – <=0.1 and >0.01 (1E-7 < P <= 1E-6)

Presenter
Presentation Notes
As already mentioned, it is the goal of the US Army Aviation Engineering Directorate that the system is safe and reliable to operate and will perform the mission when delivered. Another goal is that the released system continues to safely perform the mission if maintained and operated per the operator's manual. Replacement parts and overhaul work must be of high quality to support continued airworthiness. Per the Program Executive Office Memorandum 08-03 Risk Matrix, US Army flight control systems are to achieve 1E-9 reliability for flight-critical functions per civil airspace regulations [35] and 1E-6 reliability for tactical use [36]. Quantifying these numbers is established for component hardware but not for software. Just as hardware should have quantifiable reliability, so should software.
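
The probability bands in the matrix above translate directly into a classification rule. A minimal sketch follows; the band edges are taken from the table, while the function name and the per-flight-hour framing are assumptions for illustration.

```python
def risk_level(p):
    """Map a hazard probability per flight hour to the matrix levels A-E."""
    if p > 1e-3:
        return "A (Frequent)"
    elif p > 1e-4:
        return "B (Probable)"
    elif p > 1e-5:
        return "C (Occasional)"
    elif p > 1e-6:
        return "D (Remote)"
    elif p > 1e-7:
        return "E (Improbable)"
    return "below matrix range"

print(risk_level(5e-4))  # -> B (Probable)
print(risk_level(1e-9))  # flight-critical target falls below the matrix range
```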

25

Reliability Defined

• Software Reliability – the probability that a given piece of software will execute without failure in a given environment for a given time [40]
  – Often debated as to how to measure

• Hardware Reliability – the probability that a hardware component fails over time
  – Well defined and established

• System Reliability – the probability of success, or the probability that the system will perform its intended function under specified design limits [over a given amount of time] [39]
  – A combination of software and hardware reliability
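
For independent hardware and software in series with constant failure rates (an illustrative textbook assumption, not a formula from the slides), the combination in the last bullet reduces to a product:

$$R_{\mathrm{sys}}(t) = R_{\mathrm{HW}}(t)\,R_{\mathrm{SW}}(t) = e^{-\lambda_{\mathrm{HW}} t}\, e^{-\lambda_{\mathrm{SW}} t} = e^{-(\lambda_{\mathrm{HW}} + \lambda_{\mathrm{SW}})\, t}$$

For example, with assumed rates λ_HW = 1E-6 per hour and λ_SW = 1E-5 per hour over a 2-hour mission, R_sys = e^(-2.2E-5) ≈ 0.999978.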

Presenter
Presentation Notes
"Reliability is defined as the probability of success or the probability that the system will perform its intended function under specified design limits" [39]. "Software reliability is defined as the probability that a given piece of software will execute without failure in a given environment for a given time" [40]. Systems rely on both and thus must have a combination of the two to formulate an overall reliability.

26

Hardware vs Software Reliability

• Hardware: Failure rate has a bathtub curve; the burn-in state is similar to the software debugging state.
  Software: Without considering program evolution, failure rate is statistically non-increasing.

• Hardware: Material deterioration can cause failures even though the system is not used.
  Software: Failures never occur if the software is not used.

• Hardware: Failure data are fitted to some distributions; the selection of the underlying distribution is based on the analysis of failure data and experience. Emphasis is placed on analyzing failure data.
  Software: Most models are analytically derived from assumptions. Emphasis is on developing the model, the interpretation of the model assumptions, and the physical meaning of the parameters.

• Hardware: Failures are caused by material deterioration, design errors, misuse, and environment.
  Software: Failures are caused by incorrect logic, incorrect statements, or incorrect input data.

• Hardware: Can be improved by better design, better material, applying redundancy, and accelerated life-cycle testing.
  Software: Can be improved by increasing the testing effort and correcting discovered faults; reliability tends to change continuously during testing due to the addition of problems in new code or the removal of problems by debugging.

• Hardware: Repairs restore the original condition.
  Software: Repairs establish a new piece of software.

• Hardware: Failures are usually preceded by warnings.
  Software: Failures are rarely preceded by warnings.

• Hardware: Components can be standardized.
  Software: Components have rarely been standardized.

• Hardware: Can usually be tested exhaustively.
  Software: Essentially requires infinite testing for completeness.

Reference: [39] Hoang Pham, "Software Reliability," Springer, 2000.

Presenter
Presentation Notes
"Reliability is defined as the probability of success or the probability that the system will perform its intended function under specified design limits" [39]. "Software reliability is defined as the probability that a given piece of software will execute without failure in a given environment for a given time" [40]. Hoang Pham compares software versus hardware reliability as shown in Table 2 [39]. Systems rely on both and thus must have a combination of the two to formulate an overall reliability.
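
The bathtub curve in the first row of the table can be modeled as the sum of a decreasing infant-mortality hazard, a constant useful-life hazard, and an increasing wear-out hazard. The Weibull-based sketch below illustrates this; all shape and scale parameters are assumed values, not data from the references.

```python
def weibull_hazard(t, shape, scale):
    """Weibull hazard rate h(t) = (shape/scale) * (t/scale)**(shape - 1);
    shape < 1 gives a decreasing hazard, shape > 1 an increasing one."""
    return (shape / scale) * (t / scale) ** (shape - 1.0)

def bathtub_hazard(t):
    """Sum of infant-mortality, constant useful-life, and wear-out terms."""
    infant = weibull_hazard(t, shape=0.5, scale=1000.0)
    useful_life = 1e-4  # constant random-failure rate (assumed)
    wear_out = weibull_hazard(t, shape=3.0, scale=20000.0)
    return infant + useful_life + wear_out

for hours in (10, 100, 1000, 10000, 30000):
    print(f"h({hours:>5} h) = {bathtub_hazard(hours):.2e} failures/hour")
```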

27

Acronym List

AADL – Architecture Analysis and Design Language
AC – Advisory Circular (FAA)
ACM – Association for Computing Machinery
AED – Aviation Engineering Directorate (AMRDEC)
AFTD – Aviation Flight Test Directorate (US Army)
AGC – Apollo Guidance Computer
AHS – American Helicopter Society
AIAA – American Institute of Aeronautics and Astronautics (Inc.)
AMCOM – Aviation and Missile Command (US Army)
AMRDEC – Aviation and Missile Research, Development and Engineering Center (US Army)
AR – Army Regulation
ARINC – Aeronautical Radio, Inc.
ARP – Aerospace Recommended Practice
ASIF – Avionics Software Integration Facility
ATAM – Architecture Tradeoff Analysis Method
ATM – Air Traffic Management
AWR – Airworthiness Release
CAAS – Common Avionics Architecture System
CH-47 – Cargo Helicopter, Chinook
CMM – Capability Maturity Model
CMMI – Capability Maturity Model Integration
CMU – Carnegie Mellon University
CNS – Communications, Navigation, Surveillance
CoSMIC – Component Synthesis using Model-Integrated Computing
CPS – Cyber-Physical System
CRC – Chemical Rubber Company (i.e., CRC Press)
DFBW – Digital Fly-By-Wire
DoD – Department of Defense
E3 – Electrical and Electromagnetic Effects
ESML – Embedded System Modeling Language
FAA – Federal Aviation Administration
FCS – Future Combat Systems
FHA – Functional Hazard Assessment
FMEA – Failure Modes and Effects Analysis

28

Acronym List (concluded)

GPWS – Ground Proximity Warning System
IBM – International Business Machines
IEC – International Engineering Consortium
IL – Instrumentation Lab (now Draper Laboratory)
IMA – Integrated Modular Avionics
INCOSE – International Council On Systems Engineering
ISO – International Organization for Standardization
ISS – International Space Station
KAL – Korean Airlines
MISRA – Motor Industry Software Reliability Association
MIT – Massachusetts Institute of Technology
NASA – National Aeronautics and Space Administration (USA)
PDR – Preliminary Design Review
PEO – Program Executive Office
PNAS – Proceedings of the National Academy of Sciences
RAQ – Rotorcraft and Aircraft Qualification
RMA – Rate Monotonic Analysis
RTC – Redstone Test Center (US Army)
RTTC – Redstone Technical Test Center (US Army)
RTCA – Radio Technical Commission for Aeronautics
SAE – Society of Automotive Engineers
SED – Software Engineering Directorate (AMRDEC)
SEES – Software Engineering Evaluation System
SEI – Software Engineering Institute (CMU)
SIL – System Integration Laboratory
SSA – System Safety Assessment
STS – Space Transportation System
SysML – Systems Modeling Language
TMR – Triple Modular Redundant
TRL – Technology Readiness Level
UAS – Unmanned Aircraft System
UH-60 – Utility Helicopter, Blackhawk
UML – Unified Modeling Language
US – United States
USL – Universal Systems Language

29

References

• [1] Israel Koren and Mani Krishna, "Fault-Tolerant Systems," Morgan Kaufmann, 2007.
• [2] Jiantao Pan, "Software Reliability," Carnegie Mellon University, Spring 1999.
• [3] Nachum Dershowitz, http://www.cs.tau.ac.il/~nachumd/horror.html
• [4] http://www.air-attack.com
• [5] David A. Mindell, "Digital Apollo: Human and Machine in Spaceflight," The MIT Press, 2008.
• [6] Software Engineering Directorate, Software Engineering Evaluation System (SEES), "Program Manager Handbook for Flight Software Airworthiness," SED-SES-PMHFSA 001, December 2003.
• [7] Software Engineering Directorate, Software Engineering Evaluation System (SEES), "Program Manager Handbook for Software Safety," SED-SES-PMHSSA 001, February 2006.
• [8] AMCOM Regulation 385-17, "Software System Safety Policy," 15 March 2008.
• [9] "NASA Software Safety Guidebook," NASA-GB-8719.13, 31 March 2004.
• [10] Margaret Hamilton and William Hackler, "Universal Systems Language: Lessons Learned from Apollo," IEEE Computer Society, 2008.
• [11] Margaret Hamilton, "Full Life Cycle Systems Engineering and Software Development Environment: Development Before The Fact In Action," http://www.htius.com/Articles/Full_Life_Cycle.htm
• [12] Peter Feiler, David Gluch, and John Hudak, "The Architecture Analysis & Design Language (AADL): An Introduction," CMU/SEI-2006-TN-011, February 2006.
• [13] Peter Feiler and John Hudak, "Developing AADL Models for Control Systems: A Practitioner's Guide," CMU/SEI-2007-TR-014, July 2007.
• [14] Bruce Lewis, "Using the Architecture Analysis and Design Language for System Verification and Validation," SEI presentation, 2006.
• [15] Feiler, Gluch, Hudak, and Lewis, "Embedded System Architecture Analysis Using SAE AADL," CMU/SEI-2004-TN-004, June 2004.
• [16] Charles Pecheur and Stacy Nelson, "V&V of Advanced Systems at NASA," NASA/CR-2002-211402, April 2002.
• [17] Systems Integration Requirements Task Group, "ARP 4754: Certification Considerations for Highly-Integrated or Complex Aircraft Systems," SAE Aerospace, 10 April 1996.
• [18] SAE, "ARP 4761: Guidelines and Methods for Conducting the Safety Assessment Process on Civil Airborne Systems and Equipment," December 1996.
• [19] Aeronautical Radio, Inc. (ARINC), "ARINC Specification 653P1-2: Avionics Application Software Standard Interface, Part 1 – Required Services," 7 March 2006.

30

References (Continued)

• [20] Department of Defense, "MIL-STD-882D: Standard Practice for System Safety," 19 January 1993.
• [21] RTCA, Inc., "DO-178: Software Considerations in Airborne Systems and Equipment Certification," 1 December 1992.
• [22] RTCA, Inc., "DO-254: Design Assurance Guidance for Airborne Electronic Hardware," 19 April 2000.
• [23] US Army, "Aeronautical Design Standard Handbook: Rotorcraft and Aircraft Qualification (RAQ) Handbook," 21 October 1996.
• [24] Cary R. Spitzer (editor), "Avionics: Elements, Software and Functions," CRC Press, 2007.
• [25] US Army, "Army Regulation 70-62: Airworthiness Qualification of Aircraft Systems," 21 May 2007.
• [26] US Army, "Army Regulation 95-1: Aviation Flight Regulations," 3 February 2006.
• [27] Barbacci, Clements, Lattanze, Northrop, and Wood, "Using the Architecture Tradeoff Analysis Method (ATAM) to Evaluate the Software Architecture for a Product Line of Avionics Systems: A Case Study," CMU/SEI-2003-TN-012, July 2003.
• [28] Peter Feiler, "All in the Family: CAAS & AADL," CMU/SEI-2008-SR-021, August 2008.
• [29] Chrissis, Konrad, and Shrum, "CMMI: Guidelines for Process Integration and Product Improvement," Pearson Education, 2007.
• [30] Brendan O'Connell, "Model Driven Performance Analysis for Avionics Systems," Draper Laboratory, January 2006.
• [31] John F. Hanaway and Robert W. Moorehead, "Space Shuttle Avionics Systems," NASA SP-504, 1989.
• [32] Lui Sha, "The Complexity Challenge in Modern Avionics Software," 14 August 2006.
• [33] "Incidents Prompt New Scrutiny of Airplane Software Glitches," Wall Street Journal, 30 May 2006.
• [34] Eyal Ophir, Clifford Nass, and Anthony Wagner, "Cognitive Control in Media Multitaskers," PNAS, 20 July 2009.
• [35] Federal Aviation Administration, "Advisory Circular AC 25.1309-1A: System Design and Analysis," 21 June 1988.
• [36] Program Executive Office Policy Memorandum 08-03.
• [38] National Science Foundation webpage on Cyber-Physical Systems, http://www.nsf.gov/pubs/2008/nsf08611/nsf08611.htm
• [39] Hoang Pham, "Software Reliability," Springer, 2000.
• [40] Paul Rook (editor), "Software Reliability Handbook," Elsevier Science Publishers Ltd., 1990.

31

References (Concluded)

• [41] http://www.navair.navy.mil/v22/index.cfm?fuseaction=news.detail&id=128
• [42] http://mars8.jpl.nasa.gov/msp98/news/mco990930.html
• [43] John Garman, "The Bug Heard Around the World," ACM SIGSOFT, October 1981.
• [44] "Mars Pathfinder Mission Status," 15 July 1997, http://marsprogram.jpl.nasa.gov/MPF/newspio/mpfstatus/pf970715.html
• [45] Nancy Leveson, "Safeware: System Safety and Computers," Addison-Wesley Publishing Company, 1995.
• [46] http://www.electronicaviation.com/aircraft/JAS-39_Gripen/810
• [47] Brandon Hill, "Lockheed's F-22 Raptor Gets Zapped by International Date Line," DailyTech LLC, 26 February 2007, http://www.freerepublic.com/focus/f-news/1791574/posts
• [48] http://www.military.com/news/article/human-error-cited-in-most-uav-crashes.html
• [49] Daniel Michaels and Andy Pasztor, "Incidents Prompt New Scrutiny Of Airplane Software Glitches: As Programs Grow Complex, Bugs Are Hard to Detect; A Jet's Roller-Coaster Ride; Teaching Pilots to Get Control," Wall Street Journal, 30 May 2006.

  • Qualification and Reliability of Complex Electronic Rotorcraft Systemsby Alex Boydston amp Dr William Lewis AMRDECfor AFRL Safe and Secure Symposium 15-17 June 2010
  • Agenda
  • Objective
  • Defense Acquisition Approach to Systems Development and Test
  • US Army Airworthiness
  • AED and Qualification
  • Evolution of Helicopter Systems
  • Present Approach to Testing
  • Development Challenges
  • Complexity Issues
  • Complexity Issues (continued)
  • Reliability vs Complexity amp Cost vs Complexity
  • A Few Examples of Complex Systems
  • Some Complex System Failures
  • Lessons Learned from Failures
  • Some Current Guidelines
  • Certification Assessment Considerations
  • Definition of Complexity and Reliability is Needed
  • Analytical Models and Reliability
  • Tools for Modeling and Analysis
  • Modification to Acquisition Model
  • Systems Reliability Standard Establishment
  • BACKUP SLIDES
  • PEO Aviation System Safety Management Decision Authority Matrix
  • Reliability Defined
  • Hardware vs Software Reliability
  • Acronym List
  • Acronym List (concluded)
  • References
  • References (Continued)
  • References (Concluded)

19

Analytical Models and Reliability

bull Analytical models of hardware reliability are well understood

bull Architecture modeling and software reliability modeling is not a novel idea but is highly debated

ndash There are many approaches and little consensus as to best wayndash Many models (Jelinski-Moranda Littlewood-Verrall Musa-Okumoto etc) [1]ndash Many tools (over 200+ tools since 1970s have been built) [2]

bull Predictability of software reliability is of great concern because it is a major contributor to unreliability[2]

bull Software Reliability is the probability of error-free operation of a computer program in a specified environment for a specified time [1]

bull Need basis for setting reliability figures based on previous systems and iteratively refine those figures in the future

bull NOT A REPLACEMENT FOR TESTING AND VERIFICATION

Presenter
Presentation Notes
As mentioned in the introduction the overall goal of this paper is to state the need for an industry standard method for qualifying complex integrated systems to a specified reliability While not suggesting as solution in total better modeling practices should be considered toward a solution to bridge the gap between design test and implementation A method to model the architecture early in the requirements establishment phase follow the detailed design and coding and then be able to verify the system along with the model may be a path to greater confidence in the system and reduce the risks warnings and cautions that must be issued13In order to achieve this the systems must be broken down to component levels and built up to subsystem and system levels An overall aggregated system reliability value should result (see Figure 9) The goal should be to establish the ability to assess the reliability from a component subsystem and then a system level with each phase working toward a higher Technical Readiness Level (TRL) The end result would be fed into the accepted Type Certificate (TC) or AWR 13To achieve this goal modeling and analysis tools that follow a standard for modeling reliability should exist As previously stated over 200 tools have been created since 1970s Here is a list of a few of the current tools13Universal Systems Language (USL)13Unified Modeling Language (UML)13Systems Modeling Language (SysML)13MATLABSimulink13Telelogic Rhapsody13MathCAD13Colored Petri Nets13Rate Monotonic Analysis (RMA)13STATEMATE (Used by Airbus)13Standard for the Development of Safety-Critical Embedded Software (SCADE)13OPNET13Embedded System Modeling Language (ESML)13Component Synthesis using Model-Integrated Computing (CoSMIC)13Architectural Analysis and Design Language (AADL)13By no means is this list complete Typically different companies and projects address this challenge and choose unit tools to perform the upfront analysis and modeling not following a standard approach Multiple tools need to converge or be compatible with the framework of a common tool such as AADL The resultant tool(s) could be used to verify the requirements up front to mitigate reliability issues down the life-cycle 13

20

Tools for Modeling and Analysis

bull Universal Systems Language (USL)bull Unified Modeling Language (UML)bull Systems Modeling Language (SysML)bull MATLABSimulinkbull Telelogic Rhapsodybull MathCadbull Colored Petri Netsbull Rate Monotonic Analysis (RMA)bull STATEMATE (Used by Airbus)bull SCADEbull OPNETbull Embedded System Modeling Language (ESML)bull Component Synthesis using Model-Integrated Computing (CoSMIC)bull Architectural Analysis and Design Language (AADL)bull At least 200+ more packages since the 70rsquosbull Certified tools needs to converge to an accepted standard

modelinganalysis method for complex system reliability

Presenter
Presentation Notes
Typically different companies and projects address the challenge and choose unit tools to perform the upfront analysis and modeling not following a guideline approach Multiple tools need to converge or be compatible with a set modeling standard for complex systems The compliant tools could be used to verify the requirements up front to mitigate reliability issues down the life-cycle A notional approach would follow that shown in Figure 8 where the system V is followed but Architectural Modeling and Analysis go parallel with the real development effort This would allow reliability to be measured during the design phase and measured during the implementation test and verification phase using the model The down side to modeling from certain circles is getting people to believe those models How do you certify a modeling tool and the actual models within the tools Those issues should be addressed going forward

21

Modification to Acquisition Model

Requirements Establishment

High Level Design

Detailed Specifications

Implementation Coding

Verification

Development TestingA

rchi

tect

ural

Mod

el amp

A

naly

sis

Propose standard modeling methodology to be applied at different phases of development to enhance requirements development reliability allocation reliability

measurement and testing (DISCLAIMER DOES NOT REPLACE TESTING)

Reliabilityallocated Reliability

measured

Operational Testing amp Validation

Deployed System

Presenter
Presentation Notes
A notional approach would follow that shown in Figure 10 where the system V is followed but Architectural Modeling and Analysis go parallel with the real development effort This would allow reliability to be measured during the design phase and measured during the implementation test and verification phase using the model This approach could bridge the design and test phases together It is emphasized here that the architectural model would not replace critical testing but augments the process to allow for better requirement identification and verification Thorough ground and flight tests should never be replaced by modeling Modeling would only allow for more robust and a higher level of confidence in the requirements and design The model could be used in conjunction with the testing to confirm the design Proper modeling and analysis would reduce total program costs by enforcing more complete and correct initial requirements which reduce issues discovered down the road in testing that are expensive to fix or impossible to fix and having to accept high risks Additionally if the model is maintained and optimized then it could possibly be used after system deployment to analyze impacts of upgrades or changes to the system allowing for more complete analysis and reduce overall system redesign costs1313A hurdle to cross with modeling and analysis is convincing people to believe those models Some method to certify these models and modeling tools should be addressed in the future Standards should be set in place for correct modeling techniques for complex systems Lastly consideration of standard verification checking tools should be made such as with the use of the Motor Industry Software Reliability Association (MISRA) compliance verification tool for the use of C in safety critical systems

22

Systems Reliability Standard Establishment

bull Establish a working group to define this standardndash Need a technical society to lead the charge on this

bull Collaborate with industry academia military and societiesndash Focus on development of a reliability standard with AWR safety in mindndash Draw upon the experiences to feed into this standard

bull Study existing and previous complex systemsndash Shuttle Space Station missile systems nuclear submarine and ship

systems nuclear control systems military and commercial jet systems ndash Obtain software reliability information from given existing and previous

systemsndash Build database which would serve as basis for future reliability

bull Research prior efforts in complex systems analysis

bull Establish consensus based modeling and analysis method

Presenter
Presentation Notes
In conclusion methods for achieving a design for complex systems do exist however achieving reliability and attaining a level of qualification that would permit better AWRs does not There needs to be a standard developed to tackle this issue rather than relying on current methods to ascertain airworthiness of complex systems This will not occur overnight An orchestrated collaboration among industry academia military labs and technically professional societies to focus on development of this standard should allow us to draw upon the experiences to feed into this reliability standard with AWR safety in mind We have a long living experiment with complex software systems on the Space Transportation System (STS) International Space Station (ISS) missile systems nuclear submarine and ship systems nuclear control systems military and commercial jet systems from which we should be able to obtain at least inferred software reliability information from the architecture and run time that the systems use We should look at the lessons learned from these systems to see what could have been done to improve and what was done right that should be carried forward The challenge is collecting this information into a central database and arriving at some figure of reliability from previous systems where data exists This would at least provide a starting point to allow initial assessments and could be optimized in the future Also this is not the only study for establishing reliability metrics to complex software systems There have been research projects of similarity to this effort that have risen and fallen The data from those projects should not be wasted but studied to feed into whatever standard that is developed While historical information would be useful each design is unique and requires tools to accomplish the design Investigation of architectural modeling constructs should be further investigated as a possible augmentation to the design and test process We need to determine which forum is best to conduct this effort (eg SAE IEEE AIAA ACM AHS INCOSE or other) As stated in the paper ldquoSpace Shuttle Avionicsrdquo [31] ldquoThe designers the flight crew and other operational users of the system often have a mindset established in a previous program or experience which results in a bias against new or different lsquounconventionalrsquo approachesrdquo If nothing is done to address this problem it will only get worse over time It is past time to address the issue of reliability of complex systems and software

23

BACKUP SLIDES

BACKUP SLIDES

24

PEO Aviation System Safety Management Decision Authority Matrix

Severity(Most Credible)

FrequentP gt 1E-3

A

Probable 1D-4 lt P lt= 1E-3

B

Occasional1E-5 lt P lt= 1E-4

C

Remote1E-6 lt P lt= 1E-5

D

Improbable1E-7 lt P lt= 1E-6

E

Catastrophic1

Critical 2

Marginal 3

Negligible4

Army Acquisition

PEO Aviation

ProgramManagement

HazardCategory

Description

1 Catastrophic Death or permanent total disability system loss

2 Critical Severe injury or minor occupational illness (no permanent effect) minor system or environmental damage

3 Marginal Minor injury or minor occupational illness (no permanent effect) minor system or environmental damage

4 Negligible Less than minor injury or occupational illness (no lost workdays) or less than minor environmental damage

RiskLevel

Description Probability (Frequency) (per 100000 flight hours)

A Frequent gt 100 (P gt 1E-3)

B Probable lt=100 and gt10 (1E-4 lt P lt= 1E-3)

C Occasional lt= 10 and gt1 (1E-5 lt P lt= 1E-4)

D Remote lt=1 and gt01 (1E-6 lt P lt= 1E-5)

E Improbable lt=01 and gt001 (1E-7 lt P lt= 1E-6)

Presenter
Presentation Notes
As already mentioned it is the goal of the US Army Aviation Engineering Directorate that the system is safe and reliable to operate and will perform the mission when delivered Another goal includes the released system continues to safely perform the mission if maintained and operated per the operatorrsquos manual Replacement parts and overhaul work must be high quality to support continued airworthiness Per the Program Element Office Memorandum 08-03 Risk Matrix US Army flight control systems are to achieve 1E-9 reliability for flight critical functions per civil airspace regulations [35] and 1E-6 reliability for tactical use [36] Quantifying these numbers is established for component hardware but not for software Just as hardware should have quantifiable reliability so should software13

25

Reliability Defined

bull Software Reliability - the probability that a given piece of software will execute without failure in a given environment for a given time [40] ndash Often debated as to how to measure

bull Hardware Reliability - the probability that a hardware component fails over time ndash Well defined and established

bull System Reliability - the probability of success or the probability that the system will perform its intended function under specified design limits [over a given amount of time] [39] ndash A combination of software and hardware reliability

Presenter
Presentation Notes
ldquoReliability is defined as the probability of success or the probability that the system will perform its intended function under specified design limitsrdquo[39] ldquoSoftware reliability is defined as the probability that a given piece of software will execute without failure in a given environment for a given timerdquo [40] Systems rely on both and thus must have a combination of the two to formulate an overall reliability

26

Hardware vs Software Reliability

Hardware Reliability Software ReliabilityFailure rate has a bathtub curve The burn-in state is similar to the software debugging state

Without considering program evolution failure rate is statistically non-increasing

Material deterioration can cause failures even though the system is not used

Failures never occur if the software is not used

Failure data are fitted to some distributions The selection of the underlying distribution is based on the analysis of failure data and experiences Emphasis is placed on analyzing failure data

Most models are analytically derived from assumptions Emphasis is on developing the model the interpretation of the model assumptions and the physical meaning of the parameters

Failures are caused by material deterioration design errors misuse and environment

Failures are caused by incorrect logic incorrect statements or incorrect input data

Can be improved by better design better material applying redundancy and accelerated life cycle testing

Can be improved by increasing testing effort and correcting discovered faults Reliability tends to change continuously during testing due to the addition of problems in new code or the removal of problems by debugging errors

Hardware repairs restore the original condition Software repairs establish a new piece of software

Hardware failures are usually preceded by warnings Software failures are rarely preceded by warnings

Hardware components can be standardized Software components have rarely been standardized

Hardware can usually be tested exhaustively Software essentially requires infinite testing for completeness

Reference [39] Hoang Pham ldquoSoftware Reliabilityrdquo Springer 2000

Presenter
Presentation Notes
ldquoReliability is defined as the probability of success or the probability that the system will perform its intended function under specified design limitsrdquo[39] ldquoSoftware reliability is defined as the probability that a given piece of software will execute without failure in a given environment for a given timerdquo [40] Hong Pham compares software versus hardware reliability with the information as shown in Table 2 [39] Systems rely on both and thus must have a combination of the two to formulate an overall reliability

27

Acronym ListACRONYM DEFINITIONAADL Architectural Analysis and Design LanguageAC Advisory Circular (FAA)ACM Association of Computing MachineryAED Aviation Engineering Directorate (AMRDEC)AFTD Aviation Flight Test Directorate (US Army)AGC Apollo Guidance ComputerAHS American Helicopter SocietyAIAA American Institute of Aeronautics and Astronautics (Inc)AMCOM Aviation and Missile Command (US Army)AMRDEC Aviation and Missiles Research Development and Engineering Center (US Army)AR Army RegulationARINC Aeronautical Radio Inc ARP Aerospace Recommended PracticeASIF Avionics Software Integration FacilityATAM Architecture Tradeoff Analysis MethodATM Air Traffic ManagementAWR Airworthiness ReleaseCAAS Common Avionics Architecture SystemCH-47 Cargo Helicopter ChinookCMM Capability Maturity ModelCMMI Capability Maturity Model IndexCMU Carnegie Mellon UniversityCNS Communications Navigation SurveillanceCoSMIC Component Synthesis using Model-Integrated ComputingCPS Cyber-Physical SystemCRC Chemical Rubber Company (ie CRC Press)DFBW Digital Fly-By-WireDoD Department of DefenseE3 Electrical and Electromagnetic EffectsESML Embedded System Modeling LanguageFAA Federal Aviation AdministrationFCS Future Combat SystemsFHA Functional Hazard AssessmentFMEA Failure Modes Effects Analysis

28

Acronym List (concluded)

ACRONYM DEFINITIONGPWS Ground Proximity Warning SystemIBM International Business MachinesIEC International Engineering ConsortiumIL Instrumentation Lab (now Draper Laboratory)IMA Integrated Modular AvionicsINCOSE International Council On Systems EngineeringISO International Organization for StandardizationISS International Space StationKAL Korean AirlinesMISRA Motor Industry Standard Software Reliability AssociationMIT Massachusetts Institute of TechnologyNASA National Aeronautics and Space Administration (USA)PDR Preliminary Design ReviewPEO Program Element OfficePNAS Proceedings of the National Academy of SciencesRAQ Rotorcraft and Aircraft QualificationRMA Rate Monotonic AnalysisRTC Redstone Test Center (US Army) RTTC Redstone Technical Test Center (US Army)RTCA Radio Technical Commission for AeronauticsSAE Society of Automotive EngineersSED Software Engineering Directorate (AMRDEC)SEES Software Engineering Evaluation SystemSEI Software Engineering Institute (CMU)SIL System Integration LaboratorySSA System Safety AssessmentSTS Space Transportation SystemSysML Systems Modeling LanguageTMR Triple Modular RedundantTRL Technical Readiness LevelUAS Unmanned Aircraft SystemUH-60 Utility Helicopter BlackhawkUML Unified Modeling LanguageUS United StatesUSL Universal Systems Language

29

References

bull [1] Israel Koren and Mani Krishna ldquoFault-Tolerant Systemsrdquo Morgan Kaufmann 2007bull [2] Jianto Pan ldquoSoftware Reliabilityrdquo Carnegie Mellon University Spring 1999bull [3] Nachum Dershowitz httpwwwcstauacil~nachumdhorrorhtmlbull [4] httpwwwair-attackcombull [5] David A Mindell ldquoDigital Apollo Human and Machine in Spaceflightrdquo The MIT Press 2008bull [6] Software Engineering Directorate Software Engineering Evaluation System (SEES) ldquoProgram Manager Handbook for

Flight Software Airworthiness SED-SES-PMHFSA 001 December 2003bull [7] Software Engineering Directorate Software Engineering Evaluation System (SEES) ldquoProgram Manager Handbook for

Software Safety SED-SES-PMHSSA 001 February 2006bull [8] AMCOM Regulation 385-17 Software System Safety Policy 15 March 2008bull [9] NASA Software Safety Guidebook NASA-GB-871913 31 March 2004bull [10] Margaret Hamilton amp William Hackler ldquoUniversal Systems Language Lessons Learned from Apollordquo IEEE Computer

Society 2008bull [11] Margaret Hamilton ldquoFull Life Cycle Systems Engineering and Software Development Environment Development Before

The Fact In Actionrdquo httpwwwhtiuscomArticlesFull_Life_Cyclehtmbull [12] Peter Feiler David Gluch John Hudak ldquo The Architecture Analysis amp Design Language (AADL) An Introductionrdquo

CMUSEI-2006-TN-011 February 2006bull [13] Peter Feiler John Hudak ldquoDeveloping AADL Models for Control Systems A Practitionerrsquos Guiderdquo CMUSEI-2007-TR-

014 July 2007bull [14] Bruce Lewis ldquoUsing the Architecture Analysis and Design Language for System Verification and Validationrdquo SEI

Presentation 2006bull [15] Feiler Gluch Hudak Lewis ldquoEmbedded System Architecture Analysis Using SAE AADLrdquo CMUSEI-2004-TN-004 June

2004bull [16] Charles Pecheur Stacy Nelson ldquoVampV of Advanced Systems at NASArdquo NASACR-2002-211402 April 2002bull [17] Systems Integration Requirements Task Group ldquoARP 4754 Certification Considerations for Highly-Integrated or

Complex Aircraft Systemsrdquo SAE Aerospace 10 April 1996bull [18] SAE ldquoARP 4761 Guidelines and Methods for Conducting the Safety Assessment Process on Civil Airborne System and

Equipmentrdquo December 1996bull [19] Aeronautical Radio Inc (ARINC) ldquoARINC Specification 653P1-2 Avionics Application Software Standard Interface Part 1

ndash Required Servicesrdquo 7 March 2006

30

References (Continued)

bull [20] Department of Defense ldquoMIL-STD-882D Standard Practice for System Safetyrdquo 19 January 1993bull [21] RTCA Incorporated ldquoDO-178 Software Considerations in Airborne Systems and Equipment Certificationrdquo 1

December 1992bull [22] RTCA Incorporated ldquoDO-254 Design Assurance Guidance for Airborne Electronic Hardwarerdquo 19 April 2000bull [23] US Army ldquoAeronautical Design Standard Handbook Rotorcraft and Aircraft Qualification (RAQ) Handbookrdquo 21

October 1996bull [24] Cary R Spitzer (Editor) ldquoAvionics Elements Software and Functionsrdquo CRC Press 2007bull [25] US Army ldquoArmy Regulation 70-62 Airworthiness Qualification of Aircraft Systemsrdquo 21 May 2007bull [26] US Army ldquoArmy Regulation 95-1 Aviation Flight Regulationsrdquo 3 February 2006bull [27] ldquoUsing the Architecture Tradeoff Analysis Method (ATAM) to Evaluate the Software Architecture for a Product Line

of Avionics Systems A Case Studyrdquo Barbacci Clements Lattanze Northrop Wood July 2003 CMUSEI-2003-TN-012bull [28] ldquoAll in the Family CAAS amp AADLrdquo Peter Feiler August 2008 CMUSEI-2008-SR-021bull [29] ldquoCMMI Guidelines for Process Integration and Product Improvementrdquo Chrissis Konrad Shrum Pearson Education

2007bull [30] ldquoModel Driven Performance Analysis for Avionics Systemsrdquo Brendan OrsquoConnell Draper Laboratory January 2006bull [31] John F Hanaway Robert W Moorehead ldquoSpace Shuttle Avionics Systemsrdquo NASA SP-504 1989bull [32] Lui Sha ldquoThe Complexity Challenge in Modern Avionics Softwarerdquo August 14 2006bull [33] ldquoIncidents Prompt New Scrutiny of Airplane Software Glitchesrdquo 30 May 2006 Wall Street Journalbull [34] Eyal Ophir Clifford Nass and Anthony Wagner ldquo Cognitive Control in Media Multitaskersrdquo PNAS 20 July 2009bull [35] ldquoAdvisory Circular AC 251309-1A System Design and Analysisrdquo Federal Aviation Administration 21 June 1988bull [36] Program Element Office Policy Memorandum 08-03bull [38] httpwwwnsfgovpubs2008nsf08611nsf08611htm National Science Foundation webpage on Cyber-Physical

Systemsbull [39] Hoang Pham ldquoSoftware Reliabilityrdquo Springer 2000bull [40] Paul Rook editor ldquoSoftware Reliability Handbookrdquo Elsevier Science Publishers LTD 1990

31

References (Concluded)

bull [41]httpwwwnavairnavymilv22indexcfmfuseaction=newsdetailampid=128bull [42]httpmars8jplnasagovmsp98newsmco990930htmlbull [43]John Garmen ldquoThe Bug Heard Around the Worldrdquo ACM SIGSOFT October 1981bull [44]httpmarsprogramjplnasagovMPFnewspiompfstatuspf970715html ldquoMars Pathfinder Mission Statusrdquo July 15

1997bull [45] Nancy Leveson ldquoSafeware System Safety and Computersrdquo Addison-Wesley Publishing Company 1995bull [46] httpwwwelectronicaviationcomaircraftJAS-39_Gripen810bull [47] Brandon Hillhttpwwwfreerepubliccomfocusf-news1791574posts Lockheeds F-22 Raptor Gets Zapped by

International Date Line DailyTech LLC February 26 2007 bull [48] httpwwwmilitarycomnewsarticlehuman-error-cited-in-most-uav-crasheshtmlbull [49] Daniel Michaels and Andy Pasztor ldquoIncidents Prompt New Scrutiny Of Airplane Software Glitches As Programs

Grow Complex Bugs Are Hard to Detect A Jets Roller-Coaster Ride Teaching Pilots to Get Controlrdquo Wall-Street Journal May 30 2006

  • Qualification and Reliability of Complex Electronic Rotorcraft Systemsby Alex Boydston amp Dr William Lewis AMRDECfor AFRL Safe and Secure Symposium 15-17 June 2010
  • Agenda
  • Objective
  • Defense Acquisition Approach to Systems Development and Test
  • US Army Airworthiness
  • AED and Qualification
  • Evolution of Helicopter Systems
  • Present Approach to Testing
  • Development Challenges
  • Complexity Issues
  • Complexity Issues (continued)
  • Reliability vs Complexity amp Cost vs Complexity
  • A Few Examples of Complex Systems
  • Some Complex System Failures
  • Lessons Learned from Failures
  • Some Current Guidelines
  • Certification Assessment Considerations
  • Definition of Complexity and Reliability is Needed
  • Analytical Models and Reliability
  • Tools for Modeling and Analysis
  • Modification to Acquisition Model
  • Systems Reliability Standard Establishment
  • BACKUP SLIDES
  • PEO Aviation System Safety Management Decision Authority Matrix
  • Reliability Defined
  • Hardware vs Software Reliability
  • Acronym List
  • Acronym List (concluded)
  • References
  • References (Continued)
  • References (Concluded)

20

Tools for Modeling and Analysis

bull Universal Systems Language (USL)bull Unified Modeling Language (UML)bull Systems Modeling Language (SysML)bull MATLABSimulinkbull Telelogic Rhapsodybull MathCadbull Colored Petri Netsbull Rate Monotonic Analysis (RMA)bull STATEMATE (Used by Airbus)bull SCADEbull OPNETbull Embedded System Modeling Language (ESML)bull Component Synthesis using Model-Integrated Computing (CoSMIC)bull Architectural Analysis and Design Language (AADL)bull At least 200+ more packages since the 70rsquosbull Certified tools needs to converge to an accepted standard

modelinganalysis method for complex system reliability

Presenter
Presentation Notes
Typically different companies and projects address the challenge and choose unit tools to perform the upfront analysis and modeling not following a guideline approach Multiple tools need to converge or be compatible with a set modeling standard for complex systems The compliant tools could be used to verify the requirements up front to mitigate reliability issues down the life-cycle A notional approach would follow that shown in Figure 8 where the system V is followed but Architectural Modeling and Analysis go parallel with the real development effort This would allow reliability to be measured during the design phase and measured during the implementation test and verification phase using the model The down side to modeling from certain circles is getting people to believe those models How do you certify a modeling tool and the actual models within the tools Those issues should be addressed going forward

21

Modification to Acquisition Model

Requirements Establishment

High Level Design

Detailed Specifications

Implementation Coding

Verification

Development TestingA

rchi

tect

ural

Mod

el amp

A

naly

sis

Propose standard modeling methodology to be applied at different phases of development to enhance requirements development reliability allocation reliability

measurement and testing (DISCLAIMER DOES NOT REPLACE TESTING)

Reliabilityallocated Reliability

measured

Operational Testing amp Validation

Deployed System

Presenter
Presentation Notes
A notional approach would follow that shown in Figure 10 where the system V is followed but Architectural Modeling and Analysis go parallel with the real development effort This would allow reliability to be measured during the design phase and measured during the implementation test and verification phase using the model This approach could bridge the design and test phases together It is emphasized here that the architectural model would not replace critical testing but augments the process to allow for better requirement identification and verification Thorough ground and flight tests should never be replaced by modeling Modeling would only allow for more robust and a higher level of confidence in the requirements and design The model could be used in conjunction with the testing to confirm the design Proper modeling and analysis would reduce total program costs by enforcing more complete and correct initial requirements which reduce issues discovered down the road in testing that are expensive to fix or impossible to fix and having to accept high risks Additionally if the model is maintained and optimized then it could possibly be used after system deployment to analyze impacts of upgrades or changes to the system allowing for more complete analysis and reduce overall system redesign costs1313A hurdle to cross with modeling and analysis is convincing people to believe those models Some method to certify these models and modeling tools should be addressed in the future Standards should be set in place for correct modeling techniques for complex systems Lastly consideration of standard verification checking tools should be made such as with the use of the Motor Industry Software Reliability Association (MISRA) compliance verification tool for the use of C in safety critical systems

22

Systems Reliability Standard Establishment

bull Establish a working group to define this standardndash Need a technical society to lead the charge on this

bull Collaborate with industry academia military and societiesndash Focus on development of a reliability standard with AWR safety in mindndash Draw upon the experiences to feed into this standard

bull Study existing and previous complex systemsndash Shuttle Space Station missile systems nuclear submarine and ship

systems nuclear control systems military and commercial jet systems ndash Obtain software reliability information from given existing and previous

systemsndash Build database which would serve as basis for future reliability

bull Research prior efforts in complex systems analysis

bull Establish consensus based modeling and analysis method

Presenter
Presentation Notes
In conclusion methods for achieving a design for complex systems do exist however achieving reliability and attaining a level of qualification that would permit better AWRs does not There needs to be a standard developed to tackle this issue rather than relying on current methods to ascertain airworthiness of complex systems This will not occur overnight An orchestrated collaboration among industry academia military labs and technically professional societies to focus on development of this standard should allow us to draw upon the experiences to feed into this reliability standard with AWR safety in mind We have a long living experiment with complex software systems on the Space Transportation System (STS) International Space Station (ISS) missile systems nuclear submarine and ship systems nuclear control systems military and commercial jet systems from which we should be able to obtain at least inferred software reliability information from the architecture and run time that the systems use We should look at the lessons learned from these systems to see what could have been done to improve and what was done right that should be carried forward The challenge is collecting this information into a central database and arriving at some figure of reliability from previous systems where data exists This would at least provide a starting point to allow initial assessments and could be optimized in the future Also this is not the only study for establishing reliability metrics to complex software systems There have been research projects of similarity to this effort that have risen and fallen The data from those projects should not be wasted but studied to feed into whatever standard that is developed While historical information would be useful each design is unique and requires tools to accomplish the design Investigation of architectural modeling constructs should be further investigated as a possible augmentation to the design and test process We need to determine which forum is best to conduct this effort (eg SAE IEEE AIAA ACM AHS INCOSE or other) As stated in the paper ldquoSpace Shuttle Avionicsrdquo [31] ldquoThe designers the flight crew and other operational users of the system often have a mindset established in a previous program or experience which results in a bias against new or different lsquounconventionalrsquo approachesrdquo If nothing is done to address this problem it will only get worse over time It is past time to address the issue of reliability of complex systems and software

23

BACKUP SLIDES

24

PEO Aviation System Safety Management Decision Authority Matrix

Severity columns (Most Credible): Catastrophic (1), Critical (2), Marginal (3), Negligible (4)
Probability rows: Risk Levels A–E, defined below
Decision authority levels assigned by matrix cell: Army Acquisition, PEO Aviation, Program Management

Hazard Category – Description:
1 Catastrophic – Death or permanent total disability; system loss
2 Critical – Severe injury or minor occupational illness (no permanent effect); minor system or environmental damage
3 Marginal – Minor injury or minor occupational illness (no permanent effect); minor system or environmental damage
4 Negligible – Less than minor injury or occupational illness (no lost workdays), or less than minor environmental damage

Risk Level – Description – Probability (Frequency per 100,000 flight hours):
A – Frequent: > 100 (P > 1E-3)
B – Probable: <= 100 and > 10 (1E-4 < P <= 1E-3)
C – Occasional: <= 10 and > 1 (1E-5 < P <= 1E-4)
D – Remote: <= 1 and > 0.1 (1E-6 < P <= 1E-5)
E – Improbable: <= 0.1 and > 0.01 (1E-7 < P <= 1E-6)

Presenter
Presentation Notes
As already mentioned, it is the goal of the US Army Aviation Engineering Directorate that the system is safe and reliable to operate and will perform the mission when delivered. Another goal is that the released system continues to safely perform the mission if maintained and operated per the operator's manual. Replacement parts and overhaul work must be of high quality to support continued airworthiness. Per the Program Executive Office Memorandum 08-03 Risk Matrix, US Army flight control systems are to achieve 1E-9 reliability for flight-critical functions per civil airspace regulations [35] and 1E-6 reliability for tactical use [36]. Quantifying these numbers is established for component hardware but not for software. Just as hardware should have quantifiable reliability, so should software.
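For illustration, the matrix's probability bands can be expressed as a simple classifier. The function below is a sketch (the function name is ours); the band boundaries are taken directly from the matrix above, stated per flight hour.

def risk_level(p: float) -> str:
    # Map a per-flight-hour hazard probability to the risk levels in the
    # matrix above; boundaries are as shown on the slide.
    if p > 1e-3:
        return "A (Frequent)"
    if p > 1e-4:
        return "B (Probable)"
    if p > 1e-5:
        return "C (Occasional)"
    if p > 1e-6:
        return "D (Remote)"
    if p > 1e-7:
        return "E (Improbable)"
    return "below matrix range (P <= 1E-7)"

print(risk_level(5e-5))  # -> C (Occasional)
print(risk_level(1e-9))  # the 1E-9 flight-critical target falls below the matrix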

25

Reliability Defined

• Software Reliability – the probability that a given piece of software will execute without failure in a given environment for a given time [40]
  – Often debated as to how to measure

• Hardware Reliability – the probability that a hardware component performs without failure over time
  – Well defined and established

• System Reliability – the probability of success, or the probability that the system will perform its intended function under specified design limits [over a given amount of time] [39]
  – A combination of software and hardware reliability

Presenter
Presentation Notes
"Reliability is defined as the probability of success, or the probability that the system will perform its intended function under specified design limits" [39]. "Software reliability is defined as the probability that a given piece of software will execute without failure in a given environment for a given time" [40]. Systems rely on both, and thus an overall reliability must combine the two.
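A minimal sketch of how the two definitions combine into system reliability, assuming independent hardware and software failure processes and constant (exponential) failure rates; the numeric rates are illustrative only, not values from the slides.

import math

LAMBDA_HW = 1.0e-5  # assumed hardware failure rate per hour (illustrative)
LAMBDA_SW = 4.0e-6  # assumed software failure rate per hour (illustrative)

def r_hw(t):
    # Hardware reliability with an exponential model: R(t) = exp(-lambda * t)
    return math.exp(-LAMBDA_HW * t)

def r_sw(t):
    # Software reliability over the same interval, constant-rate model
    return math.exp(-LAMBDA_SW * t)

def r_system(t):
    # System reliability as the combination of the two, assuming
    # independent hardware and software failure behavior
    return r_hw(t) * r_sw(t)

print(f"R_system(10 h) = {r_system(10.0):.6f}")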

26

Hardware vs Software Reliability

Hardware: Failure rate has a bathtub curve; the burn-in state is similar to the software debugging state.
Software: Without considering program evolution, failure rate is statistically non-increasing.

Hardware: Material deterioration can cause failures even though the system is not used.
Software: Failures never occur if the software is not used.

Hardware: Failure data are fitted to some distributions; the selection of the underlying distribution is based on the analysis of failure data and experience. Emphasis is placed on analyzing failure data.
Software: Most models are analytically derived from assumptions; emphasis is on developing the model, the interpretation of the model assumptions, and the physical meaning of the parameters.

Hardware: Failures are caused by material deterioration, design errors, misuse, and environment.
Software: Failures are caused by incorrect logic, incorrect statements, or incorrect input data.

Hardware: Can be improved by better design, better materials, applying redundancy, and accelerated life-cycle testing.
Software: Can be improved by increasing testing effort and correcting discovered faults; reliability tends to change continuously during testing due to the addition of problems in new code or the removal of problems by debugging.

Hardware: Repairs restore the original condition.
Software: Repairs establish a new piece of software.

Hardware: Failures are usually preceded by warnings.
Software: Failures are rarely preceded by warnings.

Hardware: Components can be standardized.
Software: Components have rarely been standardized.

Hardware: Can usually be tested exhaustively.
Software: Essentially requires infinite testing for completeness.

Reference: [39] Hoang Pham, "Software Reliability", Springer, 2000.

Presenter
Presentation Notes
"Reliability is defined as the probability of success, or the probability that the system will perform its intended function under specified design limits" [39]. "Software reliability is defined as the probability that a given piece of software will execute without failure in a given environment for a given time" [40]. Hoang Pham compares software versus hardware reliability as shown in Table 2 [39]. Systems rely on both, and thus an overall reliability must combine the two.
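The table's point that software reliability changes during testing as faults are removed is usually captured with reliability growth models. The sketch below uses the Goel-Okumoto NHPP model, one common example from the software reliability literature [39]; the parameter values are illustrative, and this particular model is our choice for illustration, not one the slides prescribe.

import math

def m(t, a=100.0, b=0.05):
    # Goel-Okumoto mean value function m(t) = a * (1 - exp(-b * t)):
    # expected cumulative faults found after t hours of testing.
    # a = total fault content, b = detection rate (illustrative values).
    return a * (1.0 - math.exp(-b * t))

def reliability(x, t, a=100.0, b=0.05):
    # Probability of failure-free operation for x hours after t hours of
    # testing: R(x | t) = exp(-(m(t + x) - m(t)))
    return math.exp(-(m(t + x, a, b) - m(t, a, b)))

# Reliability of a one-hour run improves as testing removes faults.
for t in (0, 20, 50, 100):
    print(f"after {t:3d} test hours: R(1 h) = {reliability(1.0, t):.4f}")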

27

Acronym List

ACRONYM – DEFINITION
AADL – Architectural Analysis and Design Language
AC – Advisory Circular (FAA)
ACM – Association for Computing Machinery
AED – Aviation Engineering Directorate (AMRDEC)
AFTD – Aviation Flight Test Directorate (US Army)
AGC – Apollo Guidance Computer
AHS – American Helicopter Society
AIAA – American Institute of Aeronautics and Astronautics (Inc.)
AMCOM – Aviation and Missile Command (US Army)
AMRDEC – Aviation and Missile Research, Development and Engineering Center (US Army)
AR – Army Regulation
ARINC – Aeronautical Radio, Inc.
ARP – Aerospace Recommended Practice
ASIF – Avionics Software Integration Facility
ATAM – Architecture Tradeoff Analysis Method
ATM – Air Traffic Management
AWR – Airworthiness Release
CAAS – Common Avionics Architecture System
CH-47 – Cargo Helicopter Chinook
CMM – Capability Maturity Model
CMMI – Capability Maturity Model Integration
CMU – Carnegie Mellon University
CNS – Communications, Navigation, Surveillance
CoSMIC – Component Synthesis using Model-Integrated Computing
CPS – Cyber-Physical System
CRC – Chemical Rubber Company (i.e., CRC Press)
DFBW – Digital Fly-By-Wire
DoD – Department of Defense
E3 – Electromagnetic Environmental Effects
ESML – Embedded System Modeling Language
FAA – Federal Aviation Administration
FCS – Future Combat Systems
FHA – Functional Hazard Assessment
FMEA – Failure Modes and Effects Analysis

28

Acronym List (concluded)

ACRONYM – DEFINITION
GPWS – Ground Proximity Warning System
IBM – International Business Machines
IEC – International Electrotechnical Commission
IL – Instrumentation Lab (now Draper Laboratory)
IMA – Integrated Modular Avionics
INCOSE – International Council on Systems Engineering
ISO – International Organization for Standardization
ISS – International Space Station
KAL – Korean Air Lines
MISRA – Motor Industry Software Reliability Association
MIT – Massachusetts Institute of Technology
NASA – National Aeronautics and Space Administration (USA)
PDR – Preliminary Design Review
PEO – Program Executive Office
PNAS – Proceedings of the National Academy of Sciences
RAQ – Rotorcraft and Aircraft Qualification
RMA – Rate Monotonic Analysis
RTC – Redstone Test Center (US Army)
RTTC – Redstone Technical Test Center (US Army)
RTCA – Radio Technical Commission for Aeronautics
SAE – Society of Automotive Engineers
SED – Software Engineering Directorate (AMRDEC)
SEES – Software Engineering Evaluation System
SEI – Software Engineering Institute (CMU)
SIL – System Integration Laboratory
SSA – System Safety Assessment
STS – Space Transportation System
SysML – Systems Modeling Language
TMR – Triple Modular Redundant
TRL – Technology Readiness Level
UAS – Unmanned Aircraft System
UH-60 – Utility Helicopter Blackhawk
UML – Unified Modeling Language
US – United States
USL – Universal Systems Language

29

References

• [1] Israel Koren and Mani Krishna, "Fault-Tolerant Systems", Morgan Kaufmann, 2007.
• [2] Jiantao Pan, "Software Reliability", Carnegie Mellon University, Spring 1999.
• [3] Nachum Dershowitz, http://www.cs.tau.ac.il/~nachumd/horror.html
• [4] http://www.air-attack.com
• [5] David A. Mindell, "Digital Apollo: Human and Machine in Spaceflight", The MIT Press, 2008.
• [6] Software Engineering Directorate, Software Engineering Evaluation System (SEES), "Program Manager Handbook for Flight Software Airworthiness", SED-SES-PMHFSA 001, December 2003.
• [7] Software Engineering Directorate, Software Engineering Evaluation System (SEES), "Program Manager Handbook for Software Safety", SED-SES-PMHSSA 001, February 2006.
• [8] AMCOM Regulation 385-17, "Software System Safety Policy", 15 March 2008.
• [9] NASA, "Software Safety Guidebook", NASA-GB-8719.13, 31 March 2004.
• [10] Margaret Hamilton and William Hackler, "Universal Systems Language: Lessons Learned from Apollo", IEEE Computer Society, 2008.
• [11] Margaret Hamilton, "Full Life Cycle Systems Engineering and Software Development Environment: Development Before The Fact In Action", http://www.htius.com/Articles/Full_Life_Cycle.htm
• [12] Peter Feiler, David Gluch, and John Hudak, "The Architecture Analysis & Design Language (AADL): An Introduction", CMU/SEI-2006-TN-011, February 2006.
• [13] Peter Feiler and John Hudak, "Developing AADL Models for Control Systems: A Practitioner's Guide", CMU/SEI-2007-TR-014, July 2007.
• [14] Bruce Lewis, "Using the Architecture Analysis and Design Language for System Verification and Validation", SEI presentation, 2006.
• [15] Feiler, Gluch, Hudak, and Lewis, "Embedded System Architecture Analysis Using SAE AADL", CMU/SEI-2004-TN-004, June 2004.
• [16] Charles Pecheur and Stacy Nelson, "V&V of Advanced Systems at NASA", NASA/CR-2002-211402, April 2002.
• [17] Systems Integration Requirements Task Group, "ARP 4754: Certification Considerations for Highly-Integrated or Complex Aircraft Systems", SAE Aerospace, 10 April 1996.
• [18] SAE, "ARP 4761: Guidelines and Methods for Conducting the Safety Assessment Process on Civil Airborne Systems and Equipment", December 1996.
• [19] Aeronautical Radio, Inc. (ARINC), "ARINC Specification 653P1-2: Avionics Application Software Standard Interface, Part 1 – Required Services", 7 March 2006.

30

References (Continued)

• [20] Department of Defense, "MIL-STD-882D: Standard Practice for System Safety", 19 January 1993.
• [21] RTCA, Inc., "DO-178: Software Considerations in Airborne Systems and Equipment Certification", 1 December 1992.
• [22] RTCA, Inc., "DO-254: Design Assurance Guidance for Airborne Electronic Hardware", 19 April 2000.
• [23] US Army, "Aeronautical Design Standard Handbook: Rotorcraft and Aircraft Qualification (RAQ) Handbook", 21 October 1996.
• [24] Cary R. Spitzer (editor), "Avionics: Elements, Software and Functions", CRC Press, 2007.
• [25] US Army, "Army Regulation 70-62: Airworthiness Qualification of Aircraft Systems", 21 May 2007.
• [26] US Army, "Army Regulation 95-1: Aviation Flight Regulations", 3 February 2006.
• [27] Barbacci, Clements, Lattanze, Northrop, and Wood, "Using the Architecture Tradeoff Analysis Method (ATAM) to Evaluate the Software Architecture for a Product Line of Avionics Systems: A Case Study", CMU/SEI-2003-TN-012, July 2003.
• [28] Peter Feiler, "All in the Family: CAAS & AADL", CMU/SEI-2008-SR-021, August 2008.
• [29] Chrissis, Konrad, and Shrum, "CMMI: Guidelines for Process Integration and Product Improvement", Pearson Education, 2007.
• [30] Brendan O'Connell, "Model Driven Performance Analysis for Avionics Systems", Draper Laboratory, January 2006.
• [31] John F. Hanaway and Robert W. Moorehead, "Space Shuttle Avionics Systems", NASA SP-504, 1989.
• [32] Lui Sha, "The Complexity Challenge in Modern Avionics Software", 14 August 2006.
• [33] "Incidents Prompt New Scrutiny of Airplane Software Glitches", Wall Street Journal, 30 May 2006.
• [34] Eyal Ophir, Clifford Nass, and Anthony Wagner, "Cognitive Control in Media Multitaskers", PNAS, 20 July 2009.
• [35] Federal Aviation Administration, "Advisory Circular AC 25.1309-1A: System Design and Analysis", 21 June 1988.
• [36] Program Executive Office Policy Memorandum 08-03.
• [38] National Science Foundation webpage on Cyber-Physical Systems, http://www.nsf.gov/pubs/2008/nsf08611/nsf08611.htm
• [39] Hoang Pham, "Software Reliability", Springer, 2000.
• [40] Paul Rook (editor), "Software Reliability Handbook", Elsevier Science Publishers Ltd., 1990.

31

References (Concluded)

• [41] http://www.navair.navy.mil/v22/index.cfm?fuseaction=news.detail&id=128
• [42] http://mars8.jpl.nasa.gov/msp98/news/mco990930.html
• [43] John Garman, "The Bug Heard Around the World", ACM SIGSOFT, October 1981.
• [44] "Mars Pathfinder Mission Status", 15 July 1997, http://marsprogram.jpl.nasa.gov/MPF/newspio/mpf/status/pf970715.html
• [45] Nancy Leveson, "Safeware: System Safety and Computers", Addison-Wesley Publishing Company, 1995.
• [46] http://www.electronicaviation.com/aircraft/JAS-39_Gripen/810
• [47] Brandon Hill, "Lockheed's F-22 Raptor Gets Zapped by International Date Line", DailyTech LLC, 26 February 2007, http://www.freerepublic.com/focus/f-news/1791574/posts
• [48] http://www.military.com/news/article/human-error-cited-in-most-uav-crashes.html
• [49] Daniel Michaels and Andy Pasztor, "Incidents Prompt New Scrutiny Of Airplane Software Glitches: As Programs Grow Complex, Bugs Are Hard to Detect; A Jet's Roller-Coaster Ride; Teaching Pilots to Get Control", Wall Street Journal, 30 May 2006.


21

Modification to Acquisition Model

Requirements Establishment

High Level Design

Detailed Specifications

Implementation Coding

Verification

Development TestingA

rchi

tect

ural

Mod

el amp

A

naly

sis

Propose standard modeling methodology to be applied at different phases of development to enhance requirements development reliability allocation reliability

measurement and testing (DISCLAIMER DOES NOT REPLACE TESTING)

Reliabilityallocated Reliability

measured

Operational Testing amp Validation

Deployed System

Presenter
Presentation Notes
A notional approach would follow that shown in Figure 10 where the system V is followed but Architectural Modeling and Analysis go parallel with the real development effort This would allow reliability to be measured during the design phase and measured during the implementation test and verification phase using the model This approach could bridge the design and test phases together It is emphasized here that the architectural model would not replace critical testing but augments the process to allow for better requirement identification and verification Thorough ground and flight tests should never be replaced by modeling Modeling would only allow for more robust and a higher level of confidence in the requirements and design The model could be used in conjunction with the testing to confirm the design Proper modeling and analysis would reduce total program costs by enforcing more complete and correct initial requirements which reduce issues discovered down the road in testing that are expensive to fix or impossible to fix and having to accept high risks Additionally if the model is maintained and optimized then it could possibly be used after system deployment to analyze impacts of upgrades or changes to the system allowing for more complete analysis and reduce overall system redesign costs1313A hurdle to cross with modeling and analysis is convincing people to believe those models Some method to certify these models and modeling tools should be addressed in the future Standards should be set in place for correct modeling techniques for complex systems Lastly consideration of standard verification checking tools should be made such as with the use of the Motor Industry Software Reliability Association (MISRA) compliance verification tool for the use of C in safety critical systems

22

Systems Reliability Standard Establishment

bull Establish a working group to define this standardndash Need a technical society to lead the charge on this

bull Collaborate with industry academia military and societiesndash Focus on development of a reliability standard with AWR safety in mindndash Draw upon the experiences to feed into this standard

bull Study existing and previous complex systemsndash Shuttle Space Station missile systems nuclear submarine and ship

systems nuclear control systems military and commercial jet systems ndash Obtain software reliability information from given existing and previous

systemsndash Build database which would serve as basis for future reliability

bull Research prior efforts in complex systems analysis

bull Establish consensus based modeling and analysis method

Presenter
Presentation Notes
In conclusion methods for achieving a design for complex systems do exist however achieving reliability and attaining a level of qualification that would permit better AWRs does not There needs to be a standard developed to tackle this issue rather than relying on current methods to ascertain airworthiness of complex systems This will not occur overnight An orchestrated collaboration among industry academia military labs and technically professional societies to focus on development of this standard should allow us to draw upon the experiences to feed into this reliability standard with AWR safety in mind We have a long living experiment with complex software systems on the Space Transportation System (STS) International Space Station (ISS) missile systems nuclear submarine and ship systems nuclear control systems military and commercial jet systems from which we should be able to obtain at least inferred software reliability information from the architecture and run time that the systems use We should look at the lessons learned from these systems to see what could have been done to improve and what was done right that should be carried forward The challenge is collecting this information into a central database and arriving at some figure of reliability from previous systems where data exists This would at least provide a starting point to allow initial assessments and could be optimized in the future Also this is not the only study for establishing reliability metrics to complex software systems There have been research projects of similarity to this effort that have risen and fallen The data from those projects should not be wasted but studied to feed into whatever standard that is developed While historical information would be useful each design is unique and requires tools to accomplish the design Investigation of architectural modeling constructs should be further investigated as a possible augmentation to the design and test process We need to determine which forum is best to conduct this effort (eg SAE IEEE AIAA ACM AHS INCOSE or other) As stated in the paper ldquoSpace Shuttle Avionicsrdquo [31] ldquoThe designers the flight crew and other operational users of the system often have a mindset established in a previous program or experience which results in a bias against new or different lsquounconventionalrsquo approachesrdquo If nothing is done to address this problem it will only get worse over time It is past time to address the issue of reliability of complex systems and software

23

BACKUP SLIDES

BACKUP SLIDES

24

PEO Aviation System Safety Management Decision Authority Matrix

Severity(Most Credible)

FrequentP gt 1E-3

A

Probable 1D-4 lt P lt= 1E-3

B

Occasional1E-5 lt P lt= 1E-4

C

Remote1E-6 lt P lt= 1E-5

D

Improbable1E-7 lt P lt= 1E-6

E

Catastrophic1

Critical 2

Marginal 3

Negligible4

Army Acquisition

PEO Aviation

ProgramManagement

HazardCategory

Description

1 Catastrophic Death or permanent total disability system loss

2 Critical Severe injury or minor occupational illness (no permanent effect) minor system or environmental damage

3 Marginal Minor injury or minor occupational illness (no permanent effect) minor system or environmental damage

4 Negligible Less than minor injury or occupational illness (no lost workdays) or less than minor environmental damage

RiskLevel

Description Probability (Frequency) (per 100000 flight hours)

A Frequent gt 100 (P gt 1E-3)

B Probable lt=100 and gt10 (1E-4 lt P lt= 1E-3)

C Occasional lt= 10 and gt1 (1E-5 lt P lt= 1E-4)

D Remote lt=1 and gt01 (1E-6 lt P lt= 1E-5)

E Improbable lt=01 and gt001 (1E-7 lt P lt= 1E-6)

Presenter
Presentation Notes
As already mentioned it is the goal of the US Army Aviation Engineering Directorate that the system is safe and reliable to operate and will perform the mission when delivered Another goal includes the released system continues to safely perform the mission if maintained and operated per the operatorrsquos manual Replacement parts and overhaul work must be high quality to support continued airworthiness Per the Program Element Office Memorandum 08-03 Risk Matrix US Army flight control systems are to achieve 1E-9 reliability for flight critical functions per civil airspace regulations [35] and 1E-6 reliability for tactical use [36] Quantifying these numbers is established for component hardware but not for software Just as hardware should have quantifiable reliability so should software13

25

Reliability Defined

bull Software Reliability - the probability that a given piece of software will execute without failure in a given environment for a given time [40] ndash Often debated as to how to measure

bull Hardware Reliability - the probability that a hardware component fails over time ndash Well defined and established

bull System Reliability - the probability of success or the probability that the system will perform its intended function under specified design limits [over a given amount of time] [39] ndash A combination of software and hardware reliability

Presenter
Presentation Notes
ldquoReliability is defined as the probability of success or the probability that the system will perform its intended function under specified design limitsrdquo[39] ldquoSoftware reliability is defined as the probability that a given piece of software will execute without failure in a given environment for a given timerdquo [40] Systems rely on both and thus must have a combination of the two to formulate an overall reliability

26

Hardware vs Software Reliability

Hardware Reliability Software ReliabilityFailure rate has a bathtub curve The burn-in state is similar to the software debugging state

Without considering program evolution failure rate is statistically non-increasing

Material deterioration can cause failures even though the system is not used

Failures never occur if the software is not used

Failure data are fitted to some distributions The selection of the underlying distribution is based on the analysis of failure data and experiences Emphasis is placed on analyzing failure data

Most models are analytically derived from assumptions Emphasis is on developing the model the interpretation of the model assumptions and the physical meaning of the parameters

Failures are caused by material deterioration design errors misuse and environment

Failures are caused by incorrect logic incorrect statements or incorrect input data

Can be improved by better design better material applying redundancy and accelerated life cycle testing

Can be improved by increasing testing effort and correcting discovered faults Reliability tends to change continuously during testing due to the addition of problems in new code or the removal of problems by debugging errors

Hardware repairs restore the original condition Software repairs establish a new piece of software

Hardware failures are usually preceded by warnings Software failures are rarely preceded by warnings

Hardware components can be standardized Software components have rarely been standardized

Hardware can usually be tested exhaustively Software essentially requires infinite testing for completeness

Reference [39] Hoang Pham ldquoSoftware Reliabilityrdquo Springer 2000

Presenter
Presentation Notes
ldquoReliability is defined as the probability of success or the probability that the system will perform its intended function under specified design limitsrdquo[39] ldquoSoftware reliability is defined as the probability that a given piece of software will execute without failure in a given environment for a given timerdquo [40] Hong Pham compares software versus hardware reliability with the information as shown in Table 2 [39] Systems rely on both and thus must have a combination of the two to formulate an overall reliability

27

Acronym ListACRONYM DEFINITIONAADL Architectural Analysis and Design LanguageAC Advisory Circular (FAA)ACM Association of Computing MachineryAED Aviation Engineering Directorate (AMRDEC)AFTD Aviation Flight Test Directorate (US Army)AGC Apollo Guidance ComputerAHS American Helicopter SocietyAIAA American Institute of Aeronautics and Astronautics (Inc)AMCOM Aviation and Missile Command (US Army)AMRDEC Aviation and Missiles Research Development and Engineering Center (US Army)AR Army RegulationARINC Aeronautical Radio Inc ARP Aerospace Recommended PracticeASIF Avionics Software Integration FacilityATAM Architecture Tradeoff Analysis MethodATM Air Traffic ManagementAWR Airworthiness ReleaseCAAS Common Avionics Architecture SystemCH-47 Cargo Helicopter ChinookCMM Capability Maturity ModelCMMI Capability Maturity Model IndexCMU Carnegie Mellon UniversityCNS Communications Navigation SurveillanceCoSMIC Component Synthesis using Model-Integrated ComputingCPS Cyber-Physical SystemCRC Chemical Rubber Company (ie CRC Press)DFBW Digital Fly-By-WireDoD Department of DefenseE3 Electrical and Electromagnetic EffectsESML Embedded System Modeling LanguageFAA Federal Aviation AdministrationFCS Future Combat SystemsFHA Functional Hazard AssessmentFMEA Failure Modes Effects Analysis

28

Acronym List (concluded)

ACRONYM DEFINITIONGPWS Ground Proximity Warning SystemIBM International Business MachinesIEC International Engineering ConsortiumIL Instrumentation Lab (now Draper Laboratory)IMA Integrated Modular AvionicsINCOSE International Council On Systems EngineeringISO International Organization for StandardizationISS International Space StationKAL Korean AirlinesMISRA Motor Industry Standard Software Reliability AssociationMIT Massachusetts Institute of TechnologyNASA National Aeronautics and Space Administration (USA)PDR Preliminary Design ReviewPEO Program Element OfficePNAS Proceedings of the National Academy of SciencesRAQ Rotorcraft and Aircraft QualificationRMA Rate Monotonic AnalysisRTC Redstone Test Center (US Army) RTTC Redstone Technical Test Center (US Army)RTCA Radio Technical Commission for AeronauticsSAE Society of Automotive EngineersSED Software Engineering Directorate (AMRDEC)SEES Software Engineering Evaluation SystemSEI Software Engineering Institute (CMU)SIL System Integration LaboratorySSA System Safety AssessmentSTS Space Transportation SystemSysML Systems Modeling LanguageTMR Triple Modular RedundantTRL Technical Readiness LevelUAS Unmanned Aircraft SystemUH-60 Utility Helicopter BlackhawkUML Unified Modeling LanguageUS United StatesUSL Universal Systems Language

29

References

bull [1] Israel Koren and Mani Krishna ldquoFault-Tolerant Systemsrdquo Morgan Kaufmann 2007bull [2] Jianto Pan ldquoSoftware Reliabilityrdquo Carnegie Mellon University Spring 1999bull [3] Nachum Dershowitz httpwwwcstauacil~nachumdhorrorhtmlbull [4] httpwwwair-attackcombull [5] David A Mindell ldquoDigital Apollo Human and Machine in Spaceflightrdquo The MIT Press 2008bull [6] Software Engineering Directorate Software Engineering Evaluation System (SEES) ldquoProgram Manager Handbook for

Flight Software Airworthiness SED-SES-PMHFSA 001 December 2003bull [7] Software Engineering Directorate Software Engineering Evaluation System (SEES) ldquoProgram Manager Handbook for

Software Safety SED-SES-PMHSSA 001 February 2006bull [8] AMCOM Regulation 385-17 Software System Safety Policy 15 March 2008bull [9] NASA Software Safety Guidebook NASA-GB-871913 31 March 2004bull [10] Margaret Hamilton amp William Hackler ldquoUniversal Systems Language Lessons Learned from Apollordquo IEEE Computer

Society 2008bull [11] Margaret Hamilton ldquoFull Life Cycle Systems Engineering and Software Development Environment Development Before

The Fact In Actionrdquo httpwwwhtiuscomArticlesFull_Life_Cyclehtmbull [12] Peter Feiler David Gluch John Hudak ldquo The Architecture Analysis amp Design Language (AADL) An Introductionrdquo

CMUSEI-2006-TN-011 February 2006bull [13] Peter Feiler John Hudak ldquoDeveloping AADL Models for Control Systems A Practitionerrsquos Guiderdquo CMUSEI-2007-TR-

014 July 2007bull [14] Bruce Lewis ldquoUsing the Architecture Analysis and Design Language for System Verification and Validationrdquo SEI

Presentation 2006bull [15] Feiler Gluch Hudak Lewis ldquoEmbedded System Architecture Analysis Using SAE AADLrdquo CMUSEI-2004-TN-004 June

2004bull [16] Charles Pecheur Stacy Nelson ldquoVampV of Advanced Systems at NASArdquo NASACR-2002-211402 April 2002bull [17] Systems Integration Requirements Task Group ldquoARP 4754 Certification Considerations for Highly-Integrated or

Complex Aircraft Systemsrdquo SAE Aerospace 10 April 1996bull [18] SAE ldquoARP 4761 Guidelines and Methods for Conducting the Safety Assessment Process on Civil Airborne System and

Equipmentrdquo December 1996bull [19] Aeronautical Radio Inc (ARINC) ldquoARINC Specification 653P1-2 Avionics Application Software Standard Interface Part 1

ndash Required Servicesrdquo 7 March 2006

30

References (Continued)

bull [20] Department of Defense ldquoMIL-STD-882D Standard Practice for System Safetyrdquo 19 January 1993bull [21] RTCA Incorporated ldquoDO-178 Software Considerations in Airborne Systems and Equipment Certificationrdquo 1

December 1992bull [22] RTCA Incorporated ldquoDO-254 Design Assurance Guidance for Airborne Electronic Hardwarerdquo 19 April 2000bull [23] US Army ldquoAeronautical Design Standard Handbook Rotorcraft and Aircraft Qualification (RAQ) Handbookrdquo 21

October 1996bull [24] Cary R Spitzer (Editor) ldquoAvionics Elements Software and Functionsrdquo CRC Press 2007bull [25] US Army ldquoArmy Regulation 70-62 Airworthiness Qualification of Aircraft Systemsrdquo 21 May 2007bull [26] US Army ldquoArmy Regulation 95-1 Aviation Flight Regulationsrdquo 3 February 2006bull [27] ldquoUsing the Architecture Tradeoff Analysis Method (ATAM) to Evaluate the Software Architecture for a Product Line

of Avionics Systems A Case Studyrdquo Barbacci Clements Lattanze Northrop Wood July 2003 CMUSEI-2003-TN-012bull [28] ldquoAll in the Family CAAS amp AADLrdquo Peter Feiler August 2008 CMUSEI-2008-SR-021bull [29] ldquoCMMI Guidelines for Process Integration and Product Improvementrdquo Chrissis Konrad Shrum Pearson Education

2007bull [30] ldquoModel Driven Performance Analysis for Avionics Systemsrdquo Brendan OrsquoConnell Draper Laboratory January 2006bull [31] John F Hanaway Robert W Moorehead ldquoSpace Shuttle Avionics Systemsrdquo NASA SP-504 1989bull [32] Lui Sha ldquoThe Complexity Challenge in Modern Avionics Softwarerdquo August 14 2006bull [33] ldquoIncidents Prompt New Scrutiny of Airplane Software Glitchesrdquo 30 May 2006 Wall Street Journalbull [34] Eyal Ophir Clifford Nass and Anthony Wagner ldquo Cognitive Control in Media Multitaskersrdquo PNAS 20 July 2009bull [35] ldquoAdvisory Circular AC 251309-1A System Design and Analysisrdquo Federal Aviation Administration 21 June 1988bull [36] Program Element Office Policy Memorandum 08-03bull [38] httpwwwnsfgovpubs2008nsf08611nsf08611htm National Science Foundation webpage on Cyber-Physical

Systemsbull [39] Hoang Pham ldquoSoftware Reliabilityrdquo Springer 2000bull [40] Paul Rook editor ldquoSoftware Reliability Handbookrdquo Elsevier Science Publishers LTD 1990

31

References (Concluded)

bull [41]httpwwwnavairnavymilv22indexcfmfuseaction=newsdetailampid=128bull [42]httpmars8jplnasagovmsp98newsmco990930htmlbull [43]John Garmen ldquoThe Bug Heard Around the Worldrdquo ACM SIGSOFT October 1981bull [44]httpmarsprogramjplnasagovMPFnewspiompfstatuspf970715html ldquoMars Pathfinder Mission Statusrdquo July 15

1997bull [45] Nancy Leveson ldquoSafeware System Safety and Computersrdquo Addison-Wesley Publishing Company 1995bull [46] httpwwwelectronicaviationcomaircraftJAS-39_Gripen810bull [47] Brandon Hillhttpwwwfreerepubliccomfocusf-news1791574posts Lockheeds F-22 Raptor Gets Zapped by

International Date Line DailyTech LLC February 26 2007 bull [48] httpwwwmilitarycomnewsarticlehuman-error-cited-in-most-uav-crasheshtmlbull [49] Daniel Michaels and Andy Pasztor ldquoIncidents Prompt New Scrutiny Of Airplane Software Glitches As Programs

Grow Complex Bugs Are Hard to Detect A Jets Roller-Coaster Ride Teaching Pilots to Get Controlrdquo Wall-Street Journal May 30 2006

  • Qualification and Reliability of Complex Electronic Rotorcraft Systemsby Alex Boydston amp Dr William Lewis AMRDECfor AFRL Safe and Secure Symposium 15-17 June 2010
  • Agenda
  • Objective
  • Defense Acquisition Approach to Systems Development and Test
  • US Army Airworthiness
  • AED and Qualification
  • Evolution of Helicopter Systems
  • Present Approach to Testing
  • Development Challenges
  • Complexity Issues
  • Complexity Issues (continued)
  • Reliability vs Complexity amp Cost vs Complexity
  • A Few Examples of Complex Systems
  • Some Complex System Failures
  • Lessons Learned from Failures
  • Some Current Guidelines
  • Certification Assessment Considerations
  • Definition of Complexity and Reliability is Needed
  • Analytical Models and Reliability
  • Tools for Modeling and Analysis
  • Modification to Acquisition Model
  • Systems Reliability Standard Establishment
  • BACKUP SLIDES
  • PEO Aviation System Safety Management Decision Authority Matrix
  • Reliability Defined
  • Hardware vs Software Reliability
  • Acronym List
  • Acronym List (concluded)
  • References
  • References (Continued)
  • References (Concluded)

22

Systems Reliability Standard Establishment

bull Establish a working group to define this standardndash Need a technical society to lead the charge on this

bull Collaborate with industry academia military and societiesndash Focus on development of a reliability standard with AWR safety in mindndash Draw upon the experiences to feed into this standard

bull Study existing and previous complex systemsndash Shuttle Space Station missile systems nuclear submarine and ship

systems nuclear control systems military and commercial jet systems ndash Obtain software reliability information from given existing and previous

systemsndash Build database which would serve as basis for future reliability

bull Research prior efforts in complex systems analysis

bull Establish consensus based modeling and analysis method

Presenter
Presentation Notes
In conclusion methods for achieving a design for complex systems do exist however achieving reliability and attaining a level of qualification that would permit better AWRs does not There needs to be a standard developed to tackle this issue rather than relying on current methods to ascertain airworthiness of complex systems This will not occur overnight An orchestrated collaboration among industry academia military labs and technically professional societies to focus on development of this standard should allow us to draw upon the experiences to feed into this reliability standard with AWR safety in mind We have a long living experiment with complex software systems on the Space Transportation System (STS) International Space Station (ISS) missile systems nuclear submarine and ship systems nuclear control systems military and commercial jet systems from which we should be able to obtain at least inferred software reliability information from the architecture and run time that the systems use We should look at the lessons learned from these systems to see what could have been done to improve and what was done right that should be carried forward The challenge is collecting this information into a central database and arriving at some figure of reliability from previous systems where data exists This would at least provide a starting point to allow initial assessments and could be optimized in the future Also this is not the only study for establishing reliability metrics to complex software systems There have been research projects of similarity to this effort that have risen and fallen The data from those projects should not be wasted but studied to feed into whatever standard that is developed While historical information would be useful each design is unique and requires tools to accomplish the design Investigation of architectural modeling constructs should be further investigated as a possible augmentation to the design and test process We need to determine which forum is best to conduct this effort (eg SAE IEEE AIAA ACM AHS INCOSE or other) As stated in the paper ldquoSpace Shuttle Avionicsrdquo [31] ldquoThe designers the flight crew and other operational users of the system often have a mindset established in a previous program or experience which results in a bias against new or different lsquounconventionalrsquo approachesrdquo If nothing is done to address this problem it will only get worse over time It is past time to address the issue of reliability of complex systems and software

23

BACKUP SLIDES

BACKUP SLIDES

24

PEO Aviation System Safety Management Decision Authority Matrix

Severity(Most Credible)

FrequentP gt 1E-3

A

Probable 1D-4 lt P lt= 1E-3

B

Occasional1E-5 lt P lt= 1E-4

C

Remote1E-6 lt P lt= 1E-5

D

Improbable1E-7 lt P lt= 1E-6

E

Catastrophic1

Critical 2

Marginal 3

Negligible4

Army Acquisition

PEO Aviation

ProgramManagement

HazardCategory

Description

1 Catastrophic Death or permanent total disability system loss

2 Critical Severe injury or minor occupational illness (no permanent effect) minor system or environmental damage

3 Marginal Minor injury or minor occupational illness (no permanent effect) minor system or environmental damage

4 Negligible Less than minor injury or occupational illness (no lost workdays) or less than minor environmental damage

RiskLevel

Description Probability (Frequency) (per 100000 flight hours)

A Frequent gt 100 (P gt 1E-3)

B Probable lt=100 and gt10 (1E-4 lt P lt= 1E-3)

C Occasional lt= 10 and gt1 (1E-5 lt P lt= 1E-4)

D Remote lt=1 and gt01 (1E-6 lt P lt= 1E-5)

E Improbable lt=01 and gt001 (1E-7 lt P lt= 1E-6)

Presenter
Presentation Notes
As already mentioned it is the goal of the US Army Aviation Engineering Directorate that the system is safe and reliable to operate and will perform the mission when delivered Another goal includes the released system continues to safely perform the mission if maintained and operated per the operatorrsquos manual Replacement parts and overhaul work must be high quality to support continued airworthiness Per the Program Element Office Memorandum 08-03 Risk Matrix US Army flight control systems are to achieve 1E-9 reliability for flight critical functions per civil airspace regulations [35] and 1E-6 reliability for tactical use [36] Quantifying these numbers is established for component hardware but not for software Just as hardware should have quantifiable reliability so should software13

25

Reliability Defined

bull Software Reliability - the probability that a given piece of software will execute without failure in a given environment for a given time [40] ndash Often debated as to how to measure

bull Hardware Reliability - the probability that a hardware component fails over time ndash Well defined and established

bull System Reliability - the probability of success or the probability that the system will perform its intended function under specified design limits [over a given amount of time] [39] ndash A combination of software and hardware reliability

Presenter
Presentation Notes
ldquoReliability is defined as the probability of success or the probability that the system will perform its intended function under specified design limitsrdquo[39] ldquoSoftware reliability is defined as the probability that a given piece of software will execute without failure in a given environment for a given timerdquo [40] Systems rely on both and thus must have a combination of the two to formulate an overall reliability

26

Hardware vs Software Reliability

Hardware Reliability Software ReliabilityFailure rate has a bathtub curve The burn-in state is similar to the software debugging state

Without considering program evolution failure rate is statistically non-increasing

Material deterioration can cause failures even though the system is not used

Failures never occur if the software is not used

Failure data are fitted to some distributions The selection of the underlying distribution is based on the analysis of failure data and experiences Emphasis is placed on analyzing failure data

Most models are analytically derived from assumptions Emphasis is on developing the model the interpretation of the model assumptions and the physical meaning of the parameters

Failures are caused by material deterioration design errors misuse and environment

Failures are caused by incorrect logic incorrect statements or incorrect input data

Can be improved by better design better material applying redundancy and accelerated life cycle testing

Can be improved by increasing testing effort and correcting discovered faults Reliability tends to change continuously during testing due to the addition of problems in new code or the removal of problems by debugging errors

Hardware repairs restore the original condition Software repairs establish a new piece of software

Hardware failures are usually preceded by warnings Software failures are rarely preceded by warnings

Hardware components can be standardized Software components have rarely been standardized

Hardware can usually be tested exhaustively Software essentially requires infinite testing for completeness

Reference [39] Hoang Pham ldquoSoftware Reliabilityrdquo Springer 2000

Presenter
Presentation Notes
ldquoReliability is defined as the probability of success or the probability that the system will perform its intended function under specified design limitsrdquo[39] ldquoSoftware reliability is defined as the probability that a given piece of software will execute without failure in a given environment for a given timerdquo [40] Hong Pham compares software versus hardware reliability with the information as shown in Table 2 [39] Systems rely on both and thus must have a combination of the two to formulate an overall reliability

27

Acronym ListACRONYM DEFINITIONAADL Architectural Analysis and Design LanguageAC Advisory Circular (FAA)ACM Association of Computing MachineryAED Aviation Engineering Directorate (AMRDEC)AFTD Aviation Flight Test Directorate (US Army)AGC Apollo Guidance ComputerAHS American Helicopter SocietyAIAA American Institute of Aeronautics and Astronautics (Inc)AMCOM Aviation and Missile Command (US Army)AMRDEC Aviation and Missiles Research Development and Engineering Center (US Army)AR Army RegulationARINC Aeronautical Radio Inc ARP Aerospace Recommended PracticeASIF Avionics Software Integration FacilityATAM Architecture Tradeoff Analysis MethodATM Air Traffic ManagementAWR Airworthiness ReleaseCAAS Common Avionics Architecture SystemCH-47 Cargo Helicopter ChinookCMM Capability Maturity ModelCMMI Capability Maturity Model IndexCMU Carnegie Mellon UniversityCNS Communications Navigation SurveillanceCoSMIC Component Synthesis using Model-Integrated ComputingCPS Cyber-Physical SystemCRC Chemical Rubber Company (ie CRC Press)DFBW Digital Fly-By-WireDoD Department of DefenseE3 Electrical and Electromagnetic EffectsESML Embedded System Modeling LanguageFAA Federal Aviation AdministrationFCS Future Combat SystemsFHA Functional Hazard AssessmentFMEA Failure Modes Effects Analysis

28

Acronym List (concluded)

ACRONYM DEFINITIONGPWS Ground Proximity Warning SystemIBM International Business MachinesIEC International Engineering ConsortiumIL Instrumentation Lab (now Draper Laboratory)IMA Integrated Modular AvionicsINCOSE International Council On Systems EngineeringISO International Organization for StandardizationISS International Space StationKAL Korean AirlinesMISRA Motor Industry Standard Software Reliability AssociationMIT Massachusetts Institute of TechnologyNASA National Aeronautics and Space Administration (USA)PDR Preliminary Design ReviewPEO Program Element OfficePNAS Proceedings of the National Academy of SciencesRAQ Rotorcraft and Aircraft QualificationRMA Rate Monotonic AnalysisRTC Redstone Test Center (US Army) RTTC Redstone Technical Test Center (US Army)RTCA Radio Technical Commission for AeronauticsSAE Society of Automotive EngineersSED Software Engineering Directorate (AMRDEC)SEES Software Engineering Evaluation SystemSEI Software Engineering Institute (CMU)SIL System Integration LaboratorySSA System Safety AssessmentSTS Space Transportation SystemSysML Systems Modeling LanguageTMR Triple Modular RedundantTRL Technical Readiness LevelUAS Unmanned Aircraft SystemUH-60 Utility Helicopter BlackhawkUML Unified Modeling LanguageUS United StatesUSL Universal Systems Language

29

References

bull [1] Israel Koren and Mani Krishna ldquoFault-Tolerant Systemsrdquo Morgan Kaufmann 2007bull [2] Jianto Pan ldquoSoftware Reliabilityrdquo Carnegie Mellon University Spring 1999bull [3] Nachum Dershowitz httpwwwcstauacil~nachumdhorrorhtmlbull [4] httpwwwair-attackcombull [5] David A Mindell ldquoDigital Apollo Human and Machine in Spaceflightrdquo The MIT Press 2008bull [6] Software Engineering Directorate Software Engineering Evaluation System (SEES) ldquoProgram Manager Handbook for

Flight Software Airworthiness SED-SES-PMHFSA 001 December 2003bull [7] Software Engineering Directorate Software Engineering Evaluation System (SEES) ldquoProgram Manager Handbook for

Software Safety SED-SES-PMHSSA 001 February 2006bull [8] AMCOM Regulation 385-17 Software System Safety Policy 15 March 2008bull [9] NASA Software Safety Guidebook NASA-GB-871913 31 March 2004bull [10] Margaret Hamilton amp William Hackler ldquoUniversal Systems Language Lessons Learned from Apollordquo IEEE Computer

Society 2008bull [11] Margaret Hamilton ldquoFull Life Cycle Systems Engineering and Software Development Environment Development Before

The Fact In Actionrdquo httpwwwhtiuscomArticlesFull_Life_Cyclehtmbull [12] Peter Feiler David Gluch John Hudak ldquo The Architecture Analysis amp Design Language (AADL) An Introductionrdquo

CMUSEI-2006-TN-011 February 2006bull [13] Peter Feiler John Hudak ldquoDeveloping AADL Models for Control Systems A Practitionerrsquos Guiderdquo CMUSEI-2007-TR-

014 July 2007bull [14] Bruce Lewis ldquoUsing the Architecture Analysis and Design Language for System Verification and Validationrdquo SEI

Presentation 2006bull [15] Feiler Gluch Hudak Lewis ldquoEmbedded System Architecture Analysis Using SAE AADLrdquo CMUSEI-2004-TN-004 June

2004bull [16] Charles Pecheur Stacy Nelson ldquoVampV of Advanced Systems at NASArdquo NASACR-2002-211402 April 2002bull [17] Systems Integration Requirements Task Group ldquoARP 4754 Certification Considerations for Highly-Integrated or

Complex Aircraft Systemsrdquo SAE Aerospace 10 April 1996bull [18] SAE ldquoARP 4761 Guidelines and Methods for Conducting the Safety Assessment Process on Civil Airborne System and

Equipmentrdquo December 1996bull [19] Aeronautical Radio Inc (ARINC) ldquoARINC Specification 653P1-2 Avionics Application Software Standard Interface Part 1

ndash Required Servicesrdquo 7 March 2006

30

References (Continued)

bull [20] Department of Defense ldquoMIL-STD-882D Standard Practice for System Safetyrdquo 19 January 1993bull [21] RTCA Incorporated ldquoDO-178 Software Considerations in Airborne Systems and Equipment Certificationrdquo 1

December 1992bull [22] RTCA Incorporated ldquoDO-254 Design Assurance Guidance for Airborne Electronic Hardwarerdquo 19 April 2000bull [23] US Army ldquoAeronautical Design Standard Handbook Rotorcraft and Aircraft Qualification (RAQ) Handbookrdquo 21

October 1996bull [24] Cary R Spitzer (Editor) ldquoAvionics Elements Software and Functionsrdquo CRC Press 2007bull [25] US Army ldquoArmy Regulation 70-62 Airworthiness Qualification of Aircraft Systemsrdquo 21 May 2007bull [26] US Army ldquoArmy Regulation 95-1 Aviation Flight Regulationsrdquo 3 February 2006bull [27] ldquoUsing the Architecture Tradeoff Analysis Method (ATAM) to Evaluate the Software Architecture for a Product Line

of Avionics Systems A Case Studyrdquo Barbacci Clements Lattanze Northrop Wood July 2003 CMUSEI-2003-TN-012bull [28] ldquoAll in the Family CAAS amp AADLrdquo Peter Feiler August 2008 CMUSEI-2008-SR-021bull [29] ldquoCMMI Guidelines for Process Integration and Product Improvementrdquo Chrissis Konrad Shrum Pearson Education

2007bull [30] ldquoModel Driven Performance Analysis for Avionics Systemsrdquo Brendan OrsquoConnell Draper Laboratory January 2006bull [31] John F Hanaway Robert W Moorehead ldquoSpace Shuttle Avionics Systemsrdquo NASA SP-504 1989bull [32] Lui Sha ldquoThe Complexity Challenge in Modern Avionics Softwarerdquo August 14 2006bull [33] ldquoIncidents Prompt New Scrutiny of Airplane Software Glitchesrdquo 30 May 2006 Wall Street Journalbull [34] Eyal Ophir Clifford Nass and Anthony Wagner ldquo Cognitive Control in Media Multitaskersrdquo PNAS 20 July 2009bull [35] ldquoAdvisory Circular AC 251309-1A System Design and Analysisrdquo Federal Aviation Administration 21 June 1988bull [36] Program Element Office Policy Memorandum 08-03bull [38] httpwwwnsfgovpubs2008nsf08611nsf08611htm National Science Foundation webpage on Cyber-Physical

Systemsbull [39] Hoang Pham ldquoSoftware Reliabilityrdquo Springer 2000bull [40] Paul Rook editor ldquoSoftware Reliability Handbookrdquo Elsevier Science Publishers LTD 1990

31

References (Concluded)

bull [41]httpwwwnavairnavymilv22indexcfmfuseaction=newsdetailampid=128bull [42]httpmars8jplnasagovmsp98newsmco990930htmlbull [43]John Garmen ldquoThe Bug Heard Around the Worldrdquo ACM SIGSOFT October 1981bull [44]httpmarsprogramjplnasagovMPFnewspiompfstatuspf970715html ldquoMars Pathfinder Mission Statusrdquo July 15

1997bull [45] Nancy Leveson ldquoSafeware System Safety and Computersrdquo Addison-Wesley Publishing Company 1995bull [46] httpwwwelectronicaviationcomaircraftJAS-39_Gripen810bull [47] Brandon Hillhttpwwwfreerepubliccomfocusf-news1791574posts Lockheeds F-22 Raptor Gets Zapped by

International Date Line DailyTech LLC February 26 2007 bull [48] httpwwwmilitarycomnewsarticlehuman-error-cited-in-most-uav-crasheshtmlbull [49] Daniel Michaels and Andy Pasztor ldquoIncidents Prompt New Scrutiny Of Airplane Software Glitches As Programs

Grow Complex Bugs Are Hard to Detect A Jets Roller-Coaster Ride Teaching Pilots to Get Controlrdquo Wall-Street Journal May 30 2006

  • Qualification and Reliability of Complex Electronic Rotorcraft Systemsby Alex Boydston amp Dr William Lewis AMRDECfor AFRL Safe and Secure Symposium 15-17 June 2010
  • Agenda
  • Objective
  • Defense Acquisition Approach to Systems Development and Test
  • US Army Airworthiness
  • AED and Qualification
  • Evolution of Helicopter Systems
  • Present Approach to Testing
  • Development Challenges
  • Complexity Issues
  • Complexity Issues (continued)
  • Reliability vs Complexity amp Cost vs Complexity
  • A Few Examples of Complex Systems
  • Some Complex System Failures
  • Lessons Learned from Failures
  • Some Current Guidelines
  • Certification Assessment Considerations
  • Definition of Complexity and Reliability is Needed
  • Analytical Models and Reliability
  • Tools for Modeling and Analysis
  • Modification to Acquisition Model
  • Systems Reliability Standard Establishment
  • BACKUP SLIDES
  • PEO Aviation System Safety Management Decision Authority Matrix
  • Reliability Defined
  • Hardware vs Software Reliability
  • Acronym List
  • Acronym List (concluded)
  • References
  • References (Continued)
  • References (Concluded)

23

BACKUP SLIDES

BACKUP SLIDES

24

PEO Aviation System Safety Management Decision Authority Matrix

Severity(Most Credible)

FrequentP gt 1E-3

A

Probable 1D-4 lt P lt= 1E-3

B

Occasional1E-5 lt P lt= 1E-4

C

Remote1E-6 lt P lt= 1E-5

D

Improbable1E-7 lt P lt= 1E-6

E

Catastrophic1

Critical 2

Marginal 3

Negligible4

Army Acquisition

PEO Aviation

ProgramManagement

HazardCategory

Description

1 Catastrophic Death or permanent total disability system loss

2 Critical Severe injury or minor occupational illness (no permanent effect) minor system or environmental damage

3 Marginal Minor injury or minor occupational illness (no permanent effect) minor system or environmental damage

4 Negligible Less than minor injury or occupational illness (no lost workdays) or less than minor environmental damage

RiskLevel

Description Probability (Frequency) (per 100000 flight hours)

A Frequent gt 100 (P gt 1E-3)

B Probable lt=100 and gt10 (1E-4 lt P lt= 1E-3)

C Occasional lt= 10 and gt1 (1E-5 lt P lt= 1E-4)

D Remote lt=1 and gt01 (1E-6 lt P lt= 1E-5)

E Improbable lt=01 and gt001 (1E-7 lt P lt= 1E-6)

Presenter
Presentation Notes
As already mentioned it is the goal of the US Army Aviation Engineering Directorate that the system is safe and reliable to operate and will perform the mission when delivered Another goal includes the released system continues to safely perform the mission if maintained and operated per the operatorrsquos manual Replacement parts and overhaul work must be high quality to support continued airworthiness Per the Program Element Office Memorandum 08-03 Risk Matrix US Army flight control systems are to achieve 1E-9 reliability for flight critical functions per civil airspace regulations [35] and 1E-6 reliability for tactical use [36] Quantifying these numbers is established for component hardware but not for software Just as hardware should have quantifiable reliability so should software13

25

Reliability Defined

bull Software Reliability - the probability that a given piece of software will execute without failure in a given environment for a given time [40] ndash Often debated as to how to measure

bull Hardware Reliability - the probability that a hardware component fails over time ndash Well defined and established

bull System Reliability - the probability of success or the probability that the system will perform its intended function under specified design limits [over a given amount of time] [39] ndash A combination of software and hardware reliability

Presenter
Presentation Notes
ldquoReliability is defined as the probability of success or the probability that the system will perform its intended function under specified design limitsrdquo[39] ldquoSoftware reliability is defined as the probability that a given piece of software will execute without failure in a given environment for a given timerdquo [40] Systems rely on both and thus must have a combination of the two to formulate an overall reliability

26

Hardware vs Software Reliability

Hardware Reliability vs. Software Reliability (after Table 2 in [39]):

• Hardware: Failure rate has a bathtub curve; the burn-in state is similar to the software debugging state.
  Software: Without considering program evolution, the failure rate is statistically non-increasing.

• Hardware: Material deterioration can cause failures even though the system is not used.
  Software: Failures never occur if the software is not used.

• Hardware: Failure data are fitted to some distributions; the selection of the underlying distribution is based on the analysis of failure data and experience, and emphasis is placed on analyzing failure data.
  Software: Most models are analytically derived from assumptions; emphasis is on developing the model, the interpretation of the model assumptions, and the physical meaning of the parameters.

• Hardware: Failures are caused by material deterioration, design errors, misuse, and environment.
  Software: Failures are caused by incorrect logic, incorrect statements, or incorrect input data.

• Hardware: Can be improved by better design, better materials, applied redundancy, and accelerated life-cycle testing.
  Software: Can be improved by increasing testing effort and correcting discovered faults; reliability tends to change continuously during testing due to the addition of problems in new code or the removal of problems by debugging.

• Hardware: Repairs restore the original condition.
  Software: Repairs establish a new piece of software.

• Hardware: Failures are usually preceded by warnings.
  Software: Failures are rarely preceded by warnings.

• Hardware: Components can be standardized.
  Software: Components have rarely been standardized.

• Hardware: Can usually be tested exhaustively.
  Software: Essentially requires infinite testing for completeness.

Reference: [39] Hoang Pham, "Software Reliability," Springer, 2000.
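
The table's first row, a bathtub-shaped hardware failure rate against a statistically non-increasing software failure rate, can be illustrated numerically. The sketch below uses an assumed three-phase hazard for the hardware side and a Goel-Okumoto-style intensity, lambda(t) = a*b*exp(-b*t), for the software side; all parameter values are invented for illustration and are not taken from Pham [39].

```python
import math

def hardware_hazard(t: float) -> float:
    """Bathtub-shaped hazard: decreasing burn-in + constant useful life + increasing wear-out."""
    infant = 0.01 * math.exp(-t / 100.0)   # burn-in phase (decreasing)
    useful = 0.001                          # useful-life phase (roughly constant)
    wearout = 1e-8 * t ** 2                 # wear-out phase (increasing)
    return infant + useful + wearout

def software_intensity(t: float, a: float = 100.0, b: float = 0.01) -> float:
    """Goel-Okumoto-style failure intensity; non-increasing while the code is unchanged.
    a: assumed expected total faults; b: assumed per-fault detection rate."""
    return a * b * math.exp(-b * t)

for t in (0.0, 100.0, 1000.0, 10000.0):
    print(f"t={t:8.0f}: hardware={hardware_hazard(t):.5f}  software={software_intensity(t):.5f}")
```

As t grows, the assumed hardware hazard eventually rises again (wear-out) while the software intensity only decays, matching the contrast drawn in the first rows of the table.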

Presenter
Presentation Notes
"Reliability is defined as the probability of success or the probability that the system will perform its intended function under specified design limits" [39]. "Software reliability is defined as the probability that a given piece of software will execute without failure in a given environment for a given time" [40]. Hoang Pham compares software and hardware reliability as shown in Table 2 [39]. Systems rely on both, and thus must combine the two to formulate an overall reliability.

27

Acronym List

ACRONYM  DEFINITION
AADL     Architecture Analysis and Design Language
AC       Advisory Circular (FAA)
ACM      Association for Computing Machinery
AED      Aviation Engineering Directorate (AMRDEC)
AFTD     Aviation Flight Test Directorate (US Army)
AGC      Apollo Guidance Computer
AHS      American Helicopter Society
AIAA     American Institute of Aeronautics and Astronautics (Inc.)
AMCOM    Aviation and Missile Command (US Army)
AMRDEC   Aviation and Missile Research, Development and Engineering Center (US Army)
AR       Army Regulation
ARINC    Aeronautical Radio, Inc.
ARP      Aerospace Recommended Practice
ASIF     Avionics Software Integration Facility
ATAM     Architecture Tradeoff Analysis Method
ATM      Air Traffic Management
AWR      Airworthiness Release
CAAS     Common Avionics Architecture System
CH-47    Cargo Helicopter (Chinook)
CMM      Capability Maturity Model
CMMI     Capability Maturity Model Integration
CMU      Carnegie Mellon University
CNS      Communications, Navigation, Surveillance
CoSMIC   Component Synthesis using Model-Integrated Computing
CPS      Cyber-Physical System
CRC      Chemical Rubber Company (i.e., CRC Press)
DFBW     Digital Fly-By-Wire
DoD      Department of Defense
E3       Electrical and Electromagnetic Effects
ESML     Embedded System Modeling Language
FAA      Federal Aviation Administration
FCS      Future Combat Systems
FHA      Functional Hazard Assessment
FMEA     Failure Modes and Effects Analysis

28

Acronym List (concluded)

ACRONYM  DEFINITION
GPWS     Ground Proximity Warning System
IBM      International Business Machines
IEC      International Engineering Consortium
IL       Instrumentation Lab (now Draper Laboratory)
IMA      Integrated Modular Avionics
INCOSE   International Council on Systems Engineering
ISO      International Organization for Standardization
ISS      International Space Station
KAL      Korean Air Lines
MISRA    Motor Industry Software Reliability Association
MIT      Massachusetts Institute of Technology
NASA     National Aeronautics and Space Administration (USA)
PDR      Preliminary Design Review
PEO      Program Executive Office
PNAS     Proceedings of the National Academy of Sciences
RAQ      Rotorcraft and Aircraft Qualification
RMA      Rate Monotonic Analysis
RTC      Redstone Test Center (US Army)
RTTC     Redstone Technical Test Center (US Army)
RTCA     Radio Technical Commission for Aeronautics
SAE      Society of Automotive Engineers
SED      Software Engineering Directorate (AMRDEC)
SEES     Software Engineering Evaluation System
SEI      Software Engineering Institute (CMU)
SIL      System Integration Laboratory
SSA      System Safety Assessment
STS      Space Transportation System
SysML    Systems Modeling Language
TMR      Triple Modular Redundant
TRL      Technology Readiness Level
UAS      Unmanned Aircraft System
UH-60    Utility Helicopter (Blackhawk)
UML      Unified Modeling Language
US       United States
USL      Universal Systems Language

29

References

• [1] Israel Koren and C. Mani Krishna, "Fault-Tolerant Systems," Morgan Kaufmann, 2007.
• [2] Jiantao Pan, "Software Reliability," Carnegie Mellon University, Spring 1999.
• [3] Nachum Dershowitz, http://www.cs.tau.ac.il/~nachumd/horror.html
• [4] http://www.air-attack.com
• [5] David A. Mindell, "Digital Apollo: Human and Machine in Spaceflight," The MIT Press, 2008.
• [6] Software Engineering Directorate, Software Engineering Evaluation System (SEES), "Program Manager Handbook for Flight Software Airworthiness," SED-SES-PMHFSA 001, December 2003.
• [7] Software Engineering Directorate, Software Engineering Evaluation System (SEES), "Program Manager Handbook for Software Safety," SED-SES-PMHSSA 001, February 2006.
• [8] AMCOM Regulation 385-17, "Software System Safety Policy," 15 March 2008.
• [9] NASA Software Safety Guidebook, NASA-GB-8719.13, 31 March 2004.
• [10] Margaret Hamilton & William Hackler, "Universal Systems Language: Lessons Learned from Apollo," IEEE Computer Society, 2008.
• [11] Margaret Hamilton, "Full Life Cycle Systems Engineering and Software Development Environment: Development Before The Fact In Action," http://www.htius.com/Articles/Full_Life_Cycle.htm
• [12] Peter Feiler, David Gluch, John Hudak, "The Architecture Analysis & Design Language (AADL): An Introduction," CMU/SEI-2006-TN-011, February 2006.
• [13] Peter Feiler, John Hudak, "Developing AADL Models for Control Systems: A Practitioner's Guide," CMU/SEI-2007-TR-014, July 2007.
• [14] Bruce Lewis, "Using the Architecture Analysis and Design Language for System Verification and Validation," SEI Presentation, 2006.
• [15] Feiler, Gluch, Hudak, Lewis, "Embedded System Architecture Analysis Using SAE AADL," CMU/SEI-2004-TN-004, June 2004.
• [16] Charles Pecheur, Stacy Nelson, "V&V of Advanced Systems at NASA," NASA/CR-2002-211402, April 2002.
• [17] Systems Integration Requirements Task Group, "ARP 4754: Certification Considerations for Highly-Integrated or Complex Aircraft Systems," SAE Aerospace, 10 April 1996.
• [18] SAE, "ARP 4761: Guidelines and Methods for Conducting the Safety Assessment Process on Civil Airborne Systems and Equipment," December 1996.
• [19] Aeronautical Radio, Inc. (ARINC), "ARINC Specification 653P1-2: Avionics Application Software Standard Interface, Part 1 – Required Services," 7 March 2006.

30

References (Continued)

• [20] Department of Defense, "MIL-STD-882D: Standard Practice for System Safety," 19 January 1993.
• [21] RTCA, Inc., "DO-178: Software Considerations in Airborne Systems and Equipment Certification," 1 December 1992.
• [22] RTCA, Inc., "DO-254: Design Assurance Guidance for Airborne Electronic Hardware," 19 April 2000.
• [23] US Army, "Aeronautical Design Standard Handbook: Rotorcraft and Aircraft Qualification (RAQ) Handbook," 21 October 1996.
• [24] Cary R. Spitzer (Editor), "Avionics: Elements, Software and Functions," CRC Press, 2007.
• [25] US Army, "Army Regulation 70-62: Airworthiness Qualification of Aircraft Systems," 21 May 2007.
• [26] US Army, "Army Regulation 95-1: Aviation Flight Regulations," 3 February 2006.
• [27] Barbacci, Clements, Lattanze, Northrop, Wood, "Using the Architecture Tradeoff Analysis Method (ATAM) to Evaluate the Software Architecture for a Product Line of Avionics Systems: A Case Study," CMU/SEI-2003-TN-012, July 2003.
• [28] Peter Feiler, "All in the Family: CAAS & AADL," CMU/SEI-2008-SR-021, August 2008.
• [29] Chrissis, Konrad, Shrum, "CMMI: Guidelines for Process Integration and Product Improvement," Pearson Education, 2007.
• [30] Brendan O'Connell, "Model Driven Performance Analysis for Avionics Systems," Draper Laboratory, January 2006.
• [31] John F. Hanaway, Robert W. Moorehead, "Space Shuttle Avionics System," NASA SP-504, 1989.
• [32] Lui Sha, "The Complexity Challenge in Modern Avionics Software," August 14, 2006.
• [33] "Incidents Prompt New Scrutiny of Airplane Software Glitches," Wall Street Journal, 30 May 2006.
• [34] Eyal Ophir, Clifford Nass, and Anthony Wagner, "Cognitive Control in Media Multitaskers," PNAS, 20 July 2009.
• [35] Federal Aviation Administration, "Advisory Circular AC 25.1309-1A: System Design and Analysis," 21 June 1988.
• [36] Program Executive Office (PEO) Policy Memorandum 08-03.
• [38] National Science Foundation webpage on Cyber-Physical Systems, http://www.nsf.gov/pubs/2008/nsf08611/nsf08611.htm
• [39] Hoang Pham, "Software Reliability," Springer, 2000.
• [40] Paul Rook (editor), "Software Reliability Handbook," Elsevier Science Publishers Ltd., 1990.

31

References (Concluded)

• [41] http://www.navair.navy.mil/v22/index.cfm?fuseaction=newsdetail&id=128
• [42] http://mars8.jpl.nasa.gov/msp98/news/mco990930.html
• [43] John Garman, "The Bug Heard 'Round the World," ACM SIGSOFT, October 1981.
• [44] "Mars Pathfinder Mission Status," July 15, 1997, http://marsprogram.jpl.nasa.gov/MPF/newspio/mpf/status/pf970715.html
• [45] Nancy Leveson, "Safeware: System Safety and Computers," Addison-Wesley Publishing Company, 1995.
• [46] http://www.electronicaviation.com/aircraft/JAS-39_Gripen/810
• [47] Brandon Hill, "Lockheed's F-22 Raptor Gets Zapped by International Date Line," DailyTech LLC, February 26, 2007, http://www.freerepublic.com/focus/f-news/1791574/posts
• [48] http://www.military.com/news/article/human-error-cited-in-most-uav-crashes.html
• [49] Daniel Michaels and Andy Pasztor, "Incidents Prompt New Scrutiny Of Airplane Software Glitches: As Programs Grow Complex, Bugs Are Hard to Detect; A Jet's Roller-Coaster Ride; Teaching Pilots to Get Control," Wall Street Journal, May 30, 2006.

  • Qualification and Reliability of Complex Electronic Rotorcraft Systemsby Alex Boydston amp Dr William Lewis AMRDECfor AFRL Safe and Secure Symposium 15-17 June 2010
  • Agenda
  • Objective
  • Defense Acquisition Approach to Systems Development and Test
  • US Army Airworthiness
  • AED and Qualification
  • Evolution of Helicopter Systems
  • Present Approach to Testing
  • Development Challenges
  • Complexity Issues
  • Complexity Issues (continued)
  • Reliability vs Complexity amp Cost vs Complexity
  • A Few Examples of Complex Systems
  • Some Complex System Failures
  • Lessons Learned from Failures
  • Some Current Guidelines
  • Certification Assessment Considerations
  • Definition of Complexity and Reliability is Needed
  • Analytical Models and Reliability
  • Tools for Modeling and Analysis
  • Modification to Acquisition Model
  • Systems Reliability Standard Establishment
  • BACKUP SLIDES
  • PEO Aviation System Safety Management Decision Authority Matrix
  • Reliability Defined
  • Hardware vs Software Reliability
  • Acronym List
  • Acronym List (concluded)
  • References
  • References (Continued)
  • References (Concluded)

24

PEO Aviation System Safety Management Decision Authority Matrix

Severity(Most Credible)

FrequentP gt 1E-3

A

Probable 1D-4 lt P lt= 1E-3

B

Occasional1E-5 lt P lt= 1E-4

C

Remote1E-6 lt P lt= 1E-5

D

Improbable1E-7 lt P lt= 1E-6

E

Catastrophic1

Critical 2

Marginal 3

Negligible4

Army Acquisition

PEO Aviation

ProgramManagement

HazardCategory

Description

1 Catastrophic Death or permanent total disability system loss

2 Critical Severe injury or minor occupational illness (no permanent effect) minor system or environmental damage

3 Marginal Minor injury or minor occupational illness (no permanent effect) minor system or environmental damage

4 Negligible Less than minor injury or occupational illness (no lost workdays) or less than minor environmental damage

RiskLevel

Description Probability (Frequency) (per 100000 flight hours)

A Frequent gt 100 (P gt 1E-3)

B Probable lt=100 and gt10 (1E-4 lt P lt= 1E-3)

C Occasional lt= 10 and gt1 (1E-5 lt P lt= 1E-4)

D Remote lt=1 and gt01 (1E-6 lt P lt= 1E-5)

E Improbable lt=01 and gt001 (1E-7 lt P lt= 1E-6)

Presenter
Presentation Notes
As already mentioned it is the goal of the US Army Aviation Engineering Directorate that the system is safe and reliable to operate and will perform the mission when delivered Another goal includes the released system continues to safely perform the mission if maintained and operated per the operatorrsquos manual Replacement parts and overhaul work must be high quality to support continued airworthiness Per the Program Element Office Memorandum 08-03 Risk Matrix US Army flight control systems are to achieve 1E-9 reliability for flight critical functions per civil airspace regulations [35] and 1E-6 reliability for tactical use [36] Quantifying these numbers is established for component hardware but not for software Just as hardware should have quantifiable reliability so should software13

25

Reliability Defined

bull Software Reliability - the probability that a given piece of software will execute without failure in a given environment for a given time [40] ndash Often debated as to how to measure

bull Hardware Reliability - the probability that a hardware component fails over time ndash Well defined and established

bull System Reliability - the probability of success or the probability that the system will perform its intended function under specified design limits [over a given amount of time] [39] ndash A combination of software and hardware reliability

Presenter
Presentation Notes
ldquoReliability is defined as the probability of success or the probability that the system will perform its intended function under specified design limitsrdquo[39] ldquoSoftware reliability is defined as the probability that a given piece of software will execute without failure in a given environment for a given timerdquo [40] Systems rely on both and thus must have a combination of the two to formulate an overall reliability

26

Hardware vs Software Reliability

Hardware Reliability Software ReliabilityFailure rate has a bathtub curve The burn-in state is similar to the software debugging state

Without considering program evolution failure rate is statistically non-increasing

Material deterioration can cause failures even though the system is not used

Failures never occur if the software is not used

Failure data are fitted to some distributions The selection of the underlying distribution is based on the analysis of failure data and experiences Emphasis is placed on analyzing failure data

Most models are analytically derived from assumptions Emphasis is on developing the model the interpretation of the model assumptions and the physical meaning of the parameters

Failures are caused by material deterioration design errors misuse and environment

Failures are caused by incorrect logic incorrect statements or incorrect input data

Can be improved by better design better material applying redundancy and accelerated life cycle testing

Can be improved by increasing testing effort and correcting discovered faults Reliability tends to change continuously during testing due to the addition of problems in new code or the removal of problems by debugging errors

Hardware repairs restore the original condition Software repairs establish a new piece of software

Hardware failures are usually preceded by warnings Software failures are rarely preceded by warnings

Hardware components can be standardized Software components have rarely been standardized

Hardware can usually be tested exhaustively Software essentially requires infinite testing for completeness

Reference [39] Hoang Pham ldquoSoftware Reliabilityrdquo Springer 2000

Presenter
Presentation Notes
ldquoReliability is defined as the probability of success or the probability that the system will perform its intended function under specified design limitsrdquo[39] ldquoSoftware reliability is defined as the probability that a given piece of software will execute without failure in a given environment for a given timerdquo [40] Hong Pham compares software versus hardware reliability with the information as shown in Table 2 [39] Systems rely on both and thus must have a combination of the two to formulate an overall reliability

27

Acronym ListACRONYM DEFINITIONAADL Architectural Analysis and Design LanguageAC Advisory Circular (FAA)ACM Association of Computing MachineryAED Aviation Engineering Directorate (AMRDEC)AFTD Aviation Flight Test Directorate (US Army)AGC Apollo Guidance ComputerAHS American Helicopter SocietyAIAA American Institute of Aeronautics and Astronautics (Inc)AMCOM Aviation and Missile Command (US Army)AMRDEC Aviation and Missiles Research Development and Engineering Center (US Army)AR Army RegulationARINC Aeronautical Radio Inc ARP Aerospace Recommended PracticeASIF Avionics Software Integration FacilityATAM Architecture Tradeoff Analysis MethodATM Air Traffic ManagementAWR Airworthiness ReleaseCAAS Common Avionics Architecture SystemCH-47 Cargo Helicopter ChinookCMM Capability Maturity ModelCMMI Capability Maturity Model IndexCMU Carnegie Mellon UniversityCNS Communications Navigation SurveillanceCoSMIC Component Synthesis using Model-Integrated ComputingCPS Cyber-Physical SystemCRC Chemical Rubber Company (ie CRC Press)DFBW Digital Fly-By-WireDoD Department of DefenseE3 Electrical and Electromagnetic EffectsESML Embedded System Modeling LanguageFAA Federal Aviation AdministrationFCS Future Combat SystemsFHA Functional Hazard AssessmentFMEA Failure Modes Effects Analysis

28

Acronym List (concluded)

ACRONYM DEFINITIONGPWS Ground Proximity Warning SystemIBM International Business MachinesIEC International Engineering ConsortiumIL Instrumentation Lab (now Draper Laboratory)IMA Integrated Modular AvionicsINCOSE International Council On Systems EngineeringISO International Organization for StandardizationISS International Space StationKAL Korean AirlinesMISRA Motor Industry Standard Software Reliability AssociationMIT Massachusetts Institute of TechnologyNASA National Aeronautics and Space Administration (USA)PDR Preliminary Design ReviewPEO Program Element OfficePNAS Proceedings of the National Academy of SciencesRAQ Rotorcraft and Aircraft QualificationRMA Rate Monotonic AnalysisRTC Redstone Test Center (US Army) RTTC Redstone Technical Test Center (US Army)RTCA Radio Technical Commission for AeronauticsSAE Society of Automotive EngineersSED Software Engineering Directorate (AMRDEC)SEES Software Engineering Evaluation SystemSEI Software Engineering Institute (CMU)SIL System Integration LaboratorySSA System Safety AssessmentSTS Space Transportation SystemSysML Systems Modeling LanguageTMR Triple Modular RedundantTRL Technical Readiness LevelUAS Unmanned Aircraft SystemUH-60 Utility Helicopter BlackhawkUML Unified Modeling LanguageUS United StatesUSL Universal Systems Language

29

References

bull [1] Israel Koren and Mani Krishna ldquoFault-Tolerant Systemsrdquo Morgan Kaufmann 2007bull [2] Jianto Pan ldquoSoftware Reliabilityrdquo Carnegie Mellon University Spring 1999bull [3] Nachum Dershowitz httpwwwcstauacil~nachumdhorrorhtmlbull [4] httpwwwair-attackcombull [5] David A Mindell ldquoDigital Apollo Human and Machine in Spaceflightrdquo The MIT Press 2008bull [6] Software Engineering Directorate Software Engineering Evaluation System (SEES) ldquoProgram Manager Handbook for

Flight Software Airworthiness SED-SES-PMHFSA 001 December 2003bull [7] Software Engineering Directorate Software Engineering Evaluation System (SEES) ldquoProgram Manager Handbook for

Software Safety SED-SES-PMHSSA 001 February 2006bull [8] AMCOM Regulation 385-17 Software System Safety Policy 15 March 2008bull [9] NASA Software Safety Guidebook NASA-GB-871913 31 March 2004bull [10] Margaret Hamilton amp William Hackler ldquoUniversal Systems Language Lessons Learned from Apollordquo IEEE Computer

Society 2008bull [11] Margaret Hamilton ldquoFull Life Cycle Systems Engineering and Software Development Environment Development Before

The Fact In Actionrdquo httpwwwhtiuscomArticlesFull_Life_Cyclehtmbull [12] Peter Feiler David Gluch John Hudak ldquo The Architecture Analysis amp Design Language (AADL) An Introductionrdquo

CMUSEI-2006-TN-011 February 2006bull [13] Peter Feiler John Hudak ldquoDeveloping AADL Models for Control Systems A Practitionerrsquos Guiderdquo CMUSEI-2007-TR-

014 July 2007bull [14] Bruce Lewis ldquoUsing the Architecture Analysis and Design Language for System Verification and Validationrdquo SEI

Presentation 2006bull [15] Feiler Gluch Hudak Lewis ldquoEmbedded System Architecture Analysis Using SAE AADLrdquo CMUSEI-2004-TN-004 June

2004bull [16] Charles Pecheur Stacy Nelson ldquoVampV of Advanced Systems at NASArdquo NASACR-2002-211402 April 2002bull [17] Systems Integration Requirements Task Group ldquoARP 4754 Certification Considerations for Highly-Integrated or

Complex Aircraft Systemsrdquo SAE Aerospace 10 April 1996bull [18] SAE ldquoARP 4761 Guidelines and Methods for Conducting the Safety Assessment Process on Civil Airborne System and

Equipmentrdquo December 1996bull [19] Aeronautical Radio Inc (ARINC) ldquoARINC Specification 653P1-2 Avionics Application Software Standard Interface Part 1

ndash Required Servicesrdquo 7 March 2006

30

References (Continued)

bull [20] Department of Defense ldquoMIL-STD-882D Standard Practice for System Safetyrdquo 19 January 1993bull [21] RTCA Incorporated ldquoDO-178 Software Considerations in Airborne Systems and Equipment Certificationrdquo 1

December 1992bull [22] RTCA Incorporated ldquoDO-254 Design Assurance Guidance for Airborne Electronic Hardwarerdquo 19 April 2000bull [23] US Army ldquoAeronautical Design Standard Handbook Rotorcraft and Aircraft Qualification (RAQ) Handbookrdquo 21

October 1996bull [24] Cary R Spitzer (Editor) ldquoAvionics Elements Software and Functionsrdquo CRC Press 2007bull [25] US Army ldquoArmy Regulation 70-62 Airworthiness Qualification of Aircraft Systemsrdquo 21 May 2007bull [26] US Army ldquoArmy Regulation 95-1 Aviation Flight Regulationsrdquo 3 February 2006bull [27] ldquoUsing the Architecture Tradeoff Analysis Method (ATAM) to Evaluate the Software Architecture for a Product Line

of Avionics Systems A Case Studyrdquo Barbacci Clements Lattanze Northrop Wood July 2003 CMUSEI-2003-TN-012bull [28] ldquoAll in the Family CAAS amp AADLrdquo Peter Feiler August 2008 CMUSEI-2008-SR-021bull [29] ldquoCMMI Guidelines for Process Integration and Product Improvementrdquo Chrissis Konrad Shrum Pearson Education

2007bull [30] ldquoModel Driven Performance Analysis for Avionics Systemsrdquo Brendan OrsquoConnell Draper Laboratory January 2006bull [31] John F Hanaway Robert W Moorehead ldquoSpace Shuttle Avionics Systemsrdquo NASA SP-504 1989bull [32] Lui Sha ldquoThe Complexity Challenge in Modern Avionics Softwarerdquo August 14 2006bull [33] ldquoIncidents Prompt New Scrutiny of Airplane Software Glitchesrdquo 30 May 2006 Wall Street Journalbull [34] Eyal Ophir Clifford Nass and Anthony Wagner ldquo Cognitive Control in Media Multitaskersrdquo PNAS 20 July 2009bull [35] ldquoAdvisory Circular AC 251309-1A System Design and Analysisrdquo Federal Aviation Administration 21 June 1988bull [36] Program Element Office Policy Memorandum 08-03bull [38] httpwwwnsfgovpubs2008nsf08611nsf08611htm National Science Foundation webpage on Cyber-Physical

Systemsbull [39] Hoang Pham ldquoSoftware Reliabilityrdquo Springer 2000bull [40] Paul Rook editor ldquoSoftware Reliability Handbookrdquo Elsevier Science Publishers LTD 1990

31

References (Concluded)

bull [41]httpwwwnavairnavymilv22indexcfmfuseaction=newsdetailampid=128bull [42]httpmars8jplnasagovmsp98newsmco990930htmlbull [43]John Garmen ldquoThe Bug Heard Around the Worldrdquo ACM SIGSOFT October 1981bull [44]httpmarsprogramjplnasagovMPFnewspiompfstatuspf970715html ldquoMars Pathfinder Mission Statusrdquo July 15

1997bull [45] Nancy Leveson ldquoSafeware System Safety and Computersrdquo Addison-Wesley Publishing Company 1995bull [46] httpwwwelectronicaviationcomaircraftJAS-39_Gripen810bull [47] Brandon Hillhttpwwwfreerepubliccomfocusf-news1791574posts Lockheeds F-22 Raptor Gets Zapped by

International Date Line DailyTech LLC February 26 2007 bull [48] httpwwwmilitarycomnewsarticlehuman-error-cited-in-most-uav-crasheshtmlbull [49] Daniel Michaels and Andy Pasztor ldquoIncidents Prompt New Scrutiny Of Airplane Software Glitches As Programs

Grow Complex Bugs Are Hard to Detect A Jets Roller-Coaster Ride Teaching Pilots to Get Controlrdquo Wall-Street Journal May 30 2006

  • Qualification and Reliability of Complex Electronic Rotorcraft Systemsby Alex Boydston amp Dr William Lewis AMRDECfor AFRL Safe and Secure Symposium 15-17 June 2010
  • Agenda
  • Objective
  • Defense Acquisition Approach to Systems Development and Test
  • US Army Airworthiness
  • AED and Qualification
  • Evolution of Helicopter Systems
  • Present Approach to Testing
  • Development Challenges
  • Complexity Issues
  • Complexity Issues (continued)
  • Reliability vs Complexity amp Cost vs Complexity
  • A Few Examples of Complex Systems
  • Some Complex System Failures
  • Lessons Learned from Failures
  • Some Current Guidelines
  • Certification Assessment Considerations
  • Definition of Complexity and Reliability is Needed
  • Analytical Models and Reliability
  • Tools for Modeling and Analysis
  • Modification to Acquisition Model
  • Systems Reliability Standard Establishment
  • BACKUP SLIDES
  • PEO Aviation System Safety Management Decision Authority Matrix
  • Reliability Defined
  • Hardware vs Software Reliability
  • Acronym List
  • Acronym List (concluded)
  • References
  • References (Continued)
  • References (Concluded)

25

Reliability Defined

bull Software Reliability - the probability that a given piece of software will execute without failure in a given environment for a given time [40] ndash Often debated as to how to measure

bull Hardware Reliability - the probability that a hardware component fails over time ndash Well defined and established

bull System Reliability - the probability of success or the probability that the system will perform its intended function under specified design limits [over a given amount of time] [39] ndash A combination of software and hardware reliability

Presenter
Presentation Notes
ldquoReliability is defined as the probability of success or the probability that the system will perform its intended function under specified design limitsrdquo[39] ldquoSoftware reliability is defined as the probability that a given piece of software will execute without failure in a given environment for a given timerdquo [40] Systems rely on both and thus must have a combination of the two to formulate an overall reliability

26

Hardware vs Software Reliability

Hardware Reliability Software ReliabilityFailure rate has a bathtub curve The burn-in state is similar to the software debugging state

Without considering program evolution failure rate is statistically non-increasing

Material deterioration can cause failures even though the system is not used

Failures never occur if the software is not used

Failure data are fitted to some distributions The selection of the underlying distribution is based on the analysis of failure data and experiences Emphasis is placed on analyzing failure data

Most models are analytically derived from assumptions Emphasis is on developing the model the interpretation of the model assumptions and the physical meaning of the parameters

Failures are caused by material deterioration design errors misuse and environment

Failures are caused by incorrect logic incorrect statements or incorrect input data

Can be improved by better design better material applying redundancy and accelerated life cycle testing

Can be improved by increasing testing effort and correcting discovered faults Reliability tends to change continuously during testing due to the addition of problems in new code or the removal of problems by debugging errors

Hardware repairs restore the original condition Software repairs establish a new piece of software

Hardware failures are usually preceded by warnings Software failures are rarely preceded by warnings

Hardware components can be standardized Software components have rarely been standardized

Hardware can usually be tested exhaustively Software essentially requires infinite testing for completeness

Reference [39] Hoang Pham ldquoSoftware Reliabilityrdquo Springer 2000

Presenter
Presentation Notes
ldquoReliability is defined as the probability of success or the probability that the system will perform its intended function under specified design limitsrdquo[39] ldquoSoftware reliability is defined as the probability that a given piece of software will execute without failure in a given environment for a given timerdquo [40] Hong Pham compares software versus hardware reliability with the information as shown in Table 2 [39] Systems rely on both and thus must have a combination of the two to formulate an overall reliability

27

Acronym ListACRONYM DEFINITIONAADL Architectural Analysis and Design LanguageAC Advisory Circular (FAA)ACM Association of Computing MachineryAED Aviation Engineering Directorate (AMRDEC)AFTD Aviation Flight Test Directorate (US Army)AGC Apollo Guidance ComputerAHS American Helicopter SocietyAIAA American Institute of Aeronautics and Astronautics (Inc)AMCOM Aviation and Missile Command (US Army)AMRDEC Aviation and Missiles Research Development and Engineering Center (US Army)AR Army RegulationARINC Aeronautical Radio Inc ARP Aerospace Recommended PracticeASIF Avionics Software Integration FacilityATAM Architecture Tradeoff Analysis MethodATM Air Traffic ManagementAWR Airworthiness ReleaseCAAS Common Avionics Architecture SystemCH-47 Cargo Helicopter ChinookCMM Capability Maturity ModelCMMI Capability Maturity Model IndexCMU Carnegie Mellon UniversityCNS Communications Navigation SurveillanceCoSMIC Component Synthesis using Model-Integrated ComputingCPS Cyber-Physical SystemCRC Chemical Rubber Company (ie CRC Press)DFBW Digital Fly-By-WireDoD Department of DefenseE3 Electrical and Electromagnetic EffectsESML Embedded System Modeling LanguageFAA Federal Aviation AdministrationFCS Future Combat SystemsFHA Functional Hazard AssessmentFMEA Failure Modes Effects Analysis

28

Acronym List (concluded)

ACRONYM DEFINITIONGPWS Ground Proximity Warning SystemIBM International Business MachinesIEC International Engineering ConsortiumIL Instrumentation Lab (now Draper Laboratory)IMA Integrated Modular AvionicsINCOSE International Council On Systems EngineeringISO International Organization for StandardizationISS International Space StationKAL Korean AirlinesMISRA Motor Industry Standard Software Reliability AssociationMIT Massachusetts Institute of TechnologyNASA National Aeronautics and Space Administration (USA)PDR Preliminary Design ReviewPEO Program Element OfficePNAS Proceedings of the National Academy of SciencesRAQ Rotorcraft and Aircraft QualificationRMA Rate Monotonic AnalysisRTC Redstone Test Center (US Army) RTTC Redstone Technical Test Center (US Army)RTCA Radio Technical Commission for AeronauticsSAE Society of Automotive EngineersSED Software Engineering Directorate (AMRDEC)SEES Software Engineering Evaluation SystemSEI Software Engineering Institute (CMU)SIL System Integration LaboratorySSA System Safety AssessmentSTS Space Transportation SystemSysML Systems Modeling LanguageTMR Triple Modular RedundantTRL Technical Readiness LevelUAS Unmanned Aircraft SystemUH-60 Utility Helicopter BlackhawkUML Unified Modeling LanguageUS United StatesUSL Universal Systems Language

29

References

bull [1] Israel Koren and Mani Krishna ldquoFault-Tolerant Systemsrdquo Morgan Kaufmann 2007bull [2] Jianto Pan ldquoSoftware Reliabilityrdquo Carnegie Mellon University Spring 1999bull [3] Nachum Dershowitz httpwwwcstauacil~nachumdhorrorhtmlbull [4] httpwwwair-attackcombull [5] David A Mindell ldquoDigital Apollo Human and Machine in Spaceflightrdquo The MIT Press 2008bull [6] Software Engineering Directorate Software Engineering Evaluation System (SEES) ldquoProgram Manager Handbook for

Flight Software Airworthiness SED-SES-PMHFSA 001 December 2003bull [7] Software Engineering Directorate Software Engineering Evaluation System (SEES) ldquoProgram Manager Handbook for

Software Safety SED-SES-PMHSSA 001 February 2006bull [8] AMCOM Regulation 385-17 Software System Safety Policy 15 March 2008bull [9] NASA Software Safety Guidebook NASA-GB-871913 31 March 2004bull [10] Margaret Hamilton amp William Hackler ldquoUniversal Systems Language Lessons Learned from Apollordquo IEEE Computer

Society 2008bull [11] Margaret Hamilton ldquoFull Life Cycle Systems Engineering and Software Development Environment Development Before

The Fact In Actionrdquo httpwwwhtiuscomArticlesFull_Life_Cyclehtmbull [12] Peter Feiler David Gluch John Hudak ldquo The Architecture Analysis amp Design Language (AADL) An Introductionrdquo

CMUSEI-2006-TN-011 February 2006bull [13] Peter Feiler John Hudak ldquoDeveloping AADL Models for Control Systems A Practitionerrsquos Guiderdquo CMUSEI-2007-TR-

014 July 2007bull [14] Bruce Lewis ldquoUsing the Architecture Analysis and Design Language for System Verification and Validationrdquo SEI

Presentation 2006bull [15] Feiler Gluch Hudak Lewis ldquoEmbedded System Architecture Analysis Using SAE AADLrdquo CMUSEI-2004-TN-004 June

2004bull [16] Charles Pecheur Stacy Nelson ldquoVampV of Advanced Systems at NASArdquo NASACR-2002-211402 April 2002bull [17] Systems Integration Requirements Task Group ldquoARP 4754 Certification Considerations for Highly-Integrated or

Complex Aircraft Systemsrdquo SAE Aerospace 10 April 1996bull [18] SAE ldquoARP 4761 Guidelines and Methods for Conducting the Safety Assessment Process on Civil Airborne System and

Equipmentrdquo December 1996bull [19] Aeronautical Radio Inc (ARINC) ldquoARINC Specification 653P1-2 Avionics Application Software Standard Interface Part 1

ndash Required Servicesrdquo 7 March 2006

30

References (Continued)

bull [20] Department of Defense ldquoMIL-STD-882D Standard Practice for System Safetyrdquo 19 January 1993bull [21] RTCA Incorporated ldquoDO-178 Software Considerations in Airborne Systems and Equipment Certificationrdquo 1

December 1992bull [22] RTCA Incorporated ldquoDO-254 Design Assurance Guidance for Airborne Electronic Hardwarerdquo 19 April 2000bull [23] US Army ldquoAeronautical Design Standard Handbook Rotorcraft and Aircraft Qualification (RAQ) Handbookrdquo 21

October 1996bull [24] Cary R Spitzer (Editor) ldquoAvionics Elements Software and Functionsrdquo CRC Press 2007bull [25] US Army ldquoArmy Regulation 70-62 Airworthiness Qualification of Aircraft Systemsrdquo 21 May 2007bull [26] US Army ldquoArmy Regulation 95-1 Aviation Flight Regulationsrdquo 3 February 2006bull [27] ldquoUsing the Architecture Tradeoff Analysis Method (ATAM) to Evaluate the Software Architecture for a Product Line

of Avionics Systems A Case Studyrdquo Barbacci Clements Lattanze Northrop Wood July 2003 CMUSEI-2003-TN-012bull [28] ldquoAll in the Family CAAS amp AADLrdquo Peter Feiler August 2008 CMUSEI-2008-SR-021bull [29] ldquoCMMI Guidelines for Process Integration and Product Improvementrdquo Chrissis Konrad Shrum Pearson Education

2007bull [30] ldquoModel Driven Performance Analysis for Avionics Systemsrdquo Brendan OrsquoConnell Draper Laboratory January 2006bull [31] John F Hanaway Robert W Moorehead ldquoSpace Shuttle Avionics Systemsrdquo NASA SP-504 1989bull [32] Lui Sha ldquoThe Complexity Challenge in Modern Avionics Softwarerdquo August 14 2006bull [33] ldquoIncidents Prompt New Scrutiny of Airplane Software Glitchesrdquo 30 May 2006 Wall Street Journalbull [34] Eyal Ophir Clifford Nass and Anthony Wagner ldquo Cognitive Control in Media Multitaskersrdquo PNAS 20 July 2009bull [35] ldquoAdvisory Circular AC 251309-1A System Design and Analysisrdquo Federal Aviation Administration 21 June 1988bull [36] Program Element Office Policy Memorandum 08-03bull [38] httpwwwnsfgovpubs2008nsf08611nsf08611htm National Science Foundation webpage on Cyber-Physical

Systemsbull [39] Hoang Pham ldquoSoftware Reliabilityrdquo Springer 2000bull [40] Paul Rook editor ldquoSoftware Reliability Handbookrdquo Elsevier Science Publishers LTD 1990

31

References (Concluded)

bull [41]httpwwwnavairnavymilv22indexcfmfuseaction=newsdetailampid=128bull [42]httpmars8jplnasagovmsp98newsmco990930htmlbull [43]John Garmen ldquoThe Bug Heard Around the Worldrdquo ACM SIGSOFT October 1981bull [44]httpmarsprogramjplnasagovMPFnewspiompfstatuspf970715html ldquoMars Pathfinder Mission Statusrdquo July 15

1997bull [45] Nancy Leveson ldquoSafeware System Safety and Computersrdquo Addison-Wesley Publishing Company 1995bull [46] httpwwwelectronicaviationcomaircraftJAS-39_Gripen810bull [47] Brandon Hillhttpwwwfreerepubliccomfocusf-news1791574posts Lockheeds F-22 Raptor Gets Zapped by

International Date Line DailyTech LLC February 26 2007 bull [48] httpwwwmilitarycomnewsarticlehuman-error-cited-in-most-uav-crasheshtmlbull [49] Daniel Michaels and Andy Pasztor ldquoIncidents Prompt New Scrutiny Of Airplane Software Glitches As Programs

Grow Complex Bugs Are Hard to Detect A Jets Roller-Coaster Ride Teaching Pilots to Get Controlrdquo Wall-Street Journal May 30 2006

  • Qualification and Reliability of Complex Electronic Rotorcraft Systemsby Alex Boydston amp Dr William Lewis AMRDECfor AFRL Safe and Secure Symposium 15-17 June 2010
  • Agenda
  • Objective
  • Defense Acquisition Approach to Systems Development and Test
  • US Army Airworthiness
  • AED and Qualification
  • Evolution of Helicopter Systems
  • Present Approach to Testing
  • Development Challenges
  • Complexity Issues
  • Complexity Issues (continued)
  • Reliability vs Complexity amp Cost vs Complexity
  • A Few Examples of Complex Systems
  • Some Complex System Failures
  • Lessons Learned from Failures
  • Some Current Guidelines
  • Certification Assessment Considerations
  • Definition of Complexity and Reliability is Needed
  • Analytical Models and Reliability
  • Tools for Modeling and Analysis
  • Modification to Acquisition Model
  • Systems Reliability Standard Establishment
  • BACKUP SLIDES
  • PEO Aviation System Safety Management Decision Authority Matrix
  • Reliability Defined
  • Hardware vs Software Reliability
  • Acronym List
  • Acronym List (concluded)
  • References
  • References (Continued)
  • References (Concluded)

26

Hardware vs Software Reliability

Hardware Reliability Software ReliabilityFailure rate has a bathtub curve The burn-in state is similar to the software debugging state

Without considering program evolution failure rate is statistically non-increasing

Material deterioration can cause failures even though the system is not used

Failures never occur if the software is not used

Failure data are fitted to some distributions The selection of the underlying distribution is based on the analysis of failure data and experiences Emphasis is placed on analyzing failure data

Most models are analytically derived from assumptions Emphasis is on developing the model the interpretation of the model assumptions and the physical meaning of the parameters

Failures are caused by material deterioration design errors misuse and environment

Failures are caused by incorrect logic incorrect statements or incorrect input data

Can be improved by better design better material applying redundancy and accelerated life cycle testing

Can be improved by increasing testing effort and correcting discovered faults Reliability tends to change continuously during testing due to the addition of problems in new code or the removal of problems by debugging errors

Hardware repairs restore the original condition Software repairs establish a new piece of software

Hardware failures are usually preceded by warnings Software failures are rarely preceded by warnings

Hardware components can be standardized Software components have rarely been standardized

Hardware can usually be tested exhaustively Software essentially requires infinite testing for completeness

Reference [39] Hoang Pham ldquoSoftware Reliabilityrdquo Springer 2000

Presenter
Presentation Notes
ldquoReliability is defined as the probability of success or the probability that the system will perform its intended function under specified design limitsrdquo[39] ldquoSoftware reliability is defined as the probability that a given piece of software will execute without failure in a given environment for a given timerdquo [40] Hong Pham compares software versus hardware reliability with the information as shown in Table 2 [39] Systems rely on both and thus must have a combination of the two to formulate an overall reliability

27

Acronym ListACRONYM DEFINITIONAADL Architectural Analysis and Design LanguageAC Advisory Circular (FAA)ACM Association of Computing MachineryAED Aviation Engineering Directorate (AMRDEC)AFTD Aviation Flight Test Directorate (US Army)AGC Apollo Guidance ComputerAHS American Helicopter SocietyAIAA American Institute of Aeronautics and Astronautics (Inc)AMCOM Aviation and Missile Command (US Army)AMRDEC Aviation and Missiles Research Development and Engineering Center (US Army)AR Army RegulationARINC Aeronautical Radio Inc ARP Aerospace Recommended PracticeASIF Avionics Software Integration FacilityATAM Architecture Tradeoff Analysis MethodATM Air Traffic ManagementAWR Airworthiness ReleaseCAAS Common Avionics Architecture SystemCH-47 Cargo Helicopter ChinookCMM Capability Maturity ModelCMMI Capability Maturity Model IndexCMU Carnegie Mellon UniversityCNS Communications Navigation SurveillanceCoSMIC Component Synthesis using Model-Integrated ComputingCPS Cyber-Physical SystemCRC Chemical Rubber Company (ie CRC Press)DFBW Digital Fly-By-WireDoD Department of DefenseE3 Electrical and Electromagnetic EffectsESML Embedded System Modeling LanguageFAA Federal Aviation AdministrationFCS Future Combat SystemsFHA Functional Hazard AssessmentFMEA Failure Modes Effects Analysis

28

Acronym List (concluded)

ACRONYM DEFINITIONGPWS Ground Proximity Warning SystemIBM International Business MachinesIEC International Engineering ConsortiumIL Instrumentation Lab (now Draper Laboratory)IMA Integrated Modular AvionicsINCOSE International Council On Systems EngineeringISO International Organization for StandardizationISS International Space StationKAL Korean AirlinesMISRA Motor Industry Standard Software Reliability AssociationMIT Massachusetts Institute of TechnologyNASA National Aeronautics and Space Administration (USA)PDR Preliminary Design ReviewPEO Program Element OfficePNAS Proceedings of the National Academy of SciencesRAQ Rotorcraft and Aircraft QualificationRMA Rate Monotonic AnalysisRTC Redstone Test Center (US Army) RTTC Redstone Technical Test Center (US Army)RTCA Radio Technical Commission for AeronauticsSAE Society of Automotive EngineersSED Software Engineering Directorate (AMRDEC)SEES Software Engineering Evaluation SystemSEI Software Engineering Institute (CMU)SIL System Integration LaboratorySSA System Safety AssessmentSTS Space Transportation SystemSysML Systems Modeling LanguageTMR Triple Modular RedundantTRL Technical Readiness LevelUAS Unmanned Aircraft SystemUH-60 Utility Helicopter BlackhawkUML Unified Modeling LanguageUS United StatesUSL Universal Systems Language

29

References

bull [1] Israel Koren and Mani Krishna ldquoFault-Tolerant Systemsrdquo Morgan Kaufmann 2007bull [2] Jianto Pan ldquoSoftware Reliabilityrdquo Carnegie Mellon University Spring 1999bull [3] Nachum Dershowitz httpwwwcstauacil~nachumdhorrorhtmlbull [4] httpwwwair-attackcombull [5] David A Mindell ldquoDigital Apollo Human and Machine in Spaceflightrdquo The MIT Press 2008bull [6] Software Engineering Directorate Software Engineering Evaluation System (SEES) ldquoProgram Manager Handbook for

Flight Software Airworthiness SED-SES-PMHFSA 001 December 2003bull [7] Software Engineering Directorate Software Engineering Evaluation System (SEES) ldquoProgram Manager Handbook for

Software Safety SED-SES-PMHSSA 001 February 2006bull [8] AMCOM Regulation 385-17 Software System Safety Policy 15 March 2008bull [9] NASA Software Safety Guidebook NASA-GB-871913 31 March 2004bull [10] Margaret Hamilton amp William Hackler ldquoUniversal Systems Language Lessons Learned from Apollordquo IEEE Computer

Society 2008bull [11] Margaret Hamilton ldquoFull Life Cycle Systems Engineering and Software Development Environment Development Before

The Fact In Actionrdquo httpwwwhtiuscomArticlesFull_Life_Cyclehtmbull [12] Peter Feiler David Gluch John Hudak ldquo The Architecture Analysis amp Design Language (AADL) An Introductionrdquo

CMUSEI-2006-TN-011 February 2006bull [13] Peter Feiler John Hudak ldquoDeveloping AADL Models for Control Systems A Practitionerrsquos Guiderdquo CMUSEI-2007-TR-

014 July 2007bull [14] Bruce Lewis ldquoUsing the Architecture Analysis and Design Language for System Verification and Validationrdquo SEI

Presentation 2006bull [15] Feiler Gluch Hudak Lewis ldquoEmbedded System Architecture Analysis Using SAE AADLrdquo CMUSEI-2004-TN-004 June

2004bull [16] Charles Pecheur Stacy Nelson ldquoVampV of Advanced Systems at NASArdquo NASACR-2002-211402 April 2002bull [17] Systems Integration Requirements Task Group ldquoARP 4754 Certification Considerations for Highly-Integrated or

Complex Aircraft Systemsrdquo SAE Aerospace 10 April 1996bull [18] SAE ldquoARP 4761 Guidelines and Methods for Conducting the Safety Assessment Process on Civil Airborne System and

Equipmentrdquo December 1996bull [19] Aeronautical Radio Inc (ARINC) ldquoARINC Specification 653P1-2 Avionics Application Software Standard Interface Part 1

ndash Required Servicesrdquo 7 March 2006

30

References (Continued)

bull [20] Department of Defense ldquoMIL-STD-882D Standard Practice for System Safetyrdquo 19 January 1993bull [21] RTCA Incorporated ldquoDO-178 Software Considerations in Airborne Systems and Equipment Certificationrdquo 1

December 1992bull [22] RTCA Incorporated ldquoDO-254 Design Assurance Guidance for Airborne Electronic Hardwarerdquo 19 April 2000bull [23] US Army ldquoAeronautical Design Standard Handbook Rotorcraft and Aircraft Qualification (RAQ) Handbookrdquo 21

October 1996bull [24] Cary R Spitzer (Editor) ldquoAvionics Elements Software and Functionsrdquo CRC Press 2007bull [25] US Army ldquoArmy Regulation 70-62 Airworthiness Qualification of Aircraft Systemsrdquo 21 May 2007bull [26] US Army ldquoArmy Regulation 95-1 Aviation Flight Regulationsrdquo 3 February 2006bull [27] ldquoUsing the Architecture Tradeoff Analysis Method (ATAM) to Evaluate the Software Architecture for a Product Line

of Avionics Systems A Case Studyrdquo Barbacci Clements Lattanze Northrop Wood July 2003 CMUSEI-2003-TN-012bull [28] ldquoAll in the Family CAAS amp AADLrdquo Peter Feiler August 2008 CMUSEI-2008-SR-021bull [29] ldquoCMMI Guidelines for Process Integration and Product Improvementrdquo Chrissis Konrad Shrum Pearson Education

2007bull [30] ldquoModel Driven Performance Analysis for Avionics Systemsrdquo Brendan OrsquoConnell Draper Laboratory January 2006bull [31] John F Hanaway Robert W Moorehead ldquoSpace Shuttle Avionics Systemsrdquo NASA SP-504 1989bull [32] Lui Sha ldquoThe Complexity Challenge in Modern Avionics Softwarerdquo August 14 2006bull [33] ldquoIncidents Prompt New Scrutiny of Airplane Software Glitchesrdquo 30 May 2006 Wall Street Journalbull [34] Eyal Ophir Clifford Nass and Anthony Wagner ldquo Cognitive Control in Media Multitaskersrdquo PNAS 20 July 2009bull [35] ldquoAdvisory Circular AC 251309-1A System Design and Analysisrdquo Federal Aviation Administration 21 June 1988bull [36] Program Element Office Policy Memorandum 08-03bull [38] httpwwwnsfgovpubs2008nsf08611nsf08611htm National Science Foundation webpage on Cyber-Physical

Systemsbull [39] Hoang Pham ldquoSoftware Reliabilityrdquo Springer 2000bull [40] Paul Rook editor ldquoSoftware Reliability Handbookrdquo Elsevier Science Publishers LTD 1990

31

References (Concluded)

bull [41]httpwwwnavairnavymilv22indexcfmfuseaction=newsdetailampid=128bull [42]httpmars8jplnasagovmsp98newsmco990930htmlbull [43]John Garmen ldquoThe Bug Heard Around the Worldrdquo ACM SIGSOFT October 1981bull [44]httpmarsprogramjplnasagovMPFnewspiompfstatuspf970715html ldquoMars Pathfinder Mission Statusrdquo July 15

1997bull [45] Nancy Leveson ldquoSafeware System Safety and Computersrdquo Addison-Wesley Publishing Company 1995bull [46] httpwwwelectronicaviationcomaircraftJAS-39_Gripen810bull [47] Brandon Hillhttpwwwfreerepubliccomfocusf-news1791574posts Lockheeds F-22 Raptor Gets Zapped by

International Date Line DailyTech LLC February 26 2007 bull [48] httpwwwmilitarycomnewsarticlehuman-error-cited-in-most-uav-crasheshtmlbull [49] Daniel Michaels and Andy Pasztor ldquoIncidents Prompt New Scrutiny Of Airplane Software Glitches As Programs

Grow Complex Bugs Are Hard to Detect A Jets Roller-Coaster Ride Teaching Pilots to Get Controlrdquo Wall-Street Journal May 30 2006

  • Qualification and Reliability of Complex Electronic Rotorcraft Systemsby Alex Boydston amp Dr William Lewis AMRDECfor AFRL Safe and Secure Symposium 15-17 June 2010
  • Agenda
  • Objective
  • Defense Acquisition Approach to Systems Development and Test
  • US Army Airworthiness
  • AED and Qualification
  • Evolution of Helicopter Systems
  • Present Approach to Testing
  • Development Challenges
  • Complexity Issues
  • Complexity Issues (continued)
  • Reliability vs Complexity amp Cost vs Complexity
  • A Few Examples of Complex Systems
  • Some Complex System Failures
  • Lessons Learned from Failures
  • Some Current Guidelines
  • Certification Assessment Considerations
  • Definition of Complexity and Reliability is Needed
  • Analytical Models and Reliability
  • Tools for Modeling and Analysis
  • Modification to Acquisition Model
  • Systems Reliability Standard Establishment
  • BACKUP SLIDES
  • PEO Aviation System Safety Management Decision Authority Matrix
  • Reliability Defined
  • Hardware vs Software Reliability
  • Acronym List
  • Acronym List (concluded)
  • References
  • References (Continued)
  • References (Concluded)

27

Acronym ListACRONYM DEFINITIONAADL Architectural Analysis and Design LanguageAC Advisory Circular (FAA)ACM Association of Computing MachineryAED Aviation Engineering Directorate (AMRDEC)AFTD Aviation Flight Test Directorate (US Army)AGC Apollo Guidance ComputerAHS American Helicopter SocietyAIAA American Institute of Aeronautics and Astronautics (Inc)AMCOM Aviation and Missile Command (US Army)AMRDEC Aviation and Missiles Research Development and Engineering Center (US Army)AR Army RegulationARINC Aeronautical Radio Inc ARP Aerospace Recommended PracticeASIF Avionics Software Integration FacilityATAM Architecture Tradeoff Analysis MethodATM Air Traffic ManagementAWR Airworthiness ReleaseCAAS Common Avionics Architecture SystemCH-47 Cargo Helicopter ChinookCMM Capability Maturity ModelCMMI Capability Maturity Model IndexCMU Carnegie Mellon UniversityCNS Communications Navigation SurveillanceCoSMIC Component Synthesis using Model-Integrated ComputingCPS Cyber-Physical SystemCRC Chemical Rubber Company (ie CRC Press)DFBW Digital Fly-By-WireDoD Department of DefenseE3 Electrical and Electromagnetic EffectsESML Embedded System Modeling LanguageFAA Federal Aviation AdministrationFCS Future Combat SystemsFHA Functional Hazard AssessmentFMEA Failure Modes Effects Analysis

28

Acronym List (concluded)

ACRONYM DEFINITIONGPWS Ground Proximity Warning SystemIBM International Business MachinesIEC International Engineering ConsortiumIL Instrumentation Lab (now Draper Laboratory)IMA Integrated Modular AvionicsINCOSE International Council On Systems EngineeringISO International Organization for StandardizationISS International Space StationKAL Korean AirlinesMISRA Motor Industry Standard Software Reliability AssociationMIT Massachusetts Institute of TechnologyNASA National Aeronautics and Space Administration (USA)PDR Preliminary Design ReviewPEO Program Element OfficePNAS Proceedings of the National Academy of SciencesRAQ Rotorcraft and Aircraft QualificationRMA Rate Monotonic AnalysisRTC Redstone Test Center (US Army) RTTC Redstone Technical Test Center (US Army)RTCA Radio Technical Commission for AeronauticsSAE Society of Automotive EngineersSED Software Engineering Directorate (AMRDEC)SEES Software Engineering Evaluation SystemSEI Software Engineering Institute (CMU)SIL System Integration LaboratorySSA System Safety AssessmentSTS Space Transportation SystemSysML Systems Modeling LanguageTMR Triple Modular RedundantTRL Technical Readiness LevelUAS Unmanned Aircraft SystemUH-60 Utility Helicopter BlackhawkUML Unified Modeling LanguageUS United StatesUSL Universal Systems Language

29

References

bull [1] Israel Koren and Mani Krishna ldquoFault-Tolerant Systemsrdquo Morgan Kaufmann 2007bull [2] Jianto Pan ldquoSoftware Reliabilityrdquo Carnegie Mellon University Spring 1999bull [3] Nachum Dershowitz httpwwwcstauacil~nachumdhorrorhtmlbull [4] httpwwwair-attackcombull [5] David A Mindell ldquoDigital Apollo Human and Machine in Spaceflightrdquo The MIT Press 2008bull [6] Software Engineering Directorate Software Engineering Evaluation System (SEES) ldquoProgram Manager Handbook for

Flight Software Airworthiness SED-SES-PMHFSA 001 December 2003bull [7] Software Engineering Directorate Software Engineering Evaluation System (SEES) ldquoProgram Manager Handbook for

Software Safety SED-SES-PMHSSA 001 February 2006bull [8] AMCOM Regulation 385-17 Software System Safety Policy 15 March 2008bull [9] NASA Software Safety Guidebook NASA-GB-871913 31 March 2004bull [10] Margaret Hamilton amp William Hackler ldquoUniversal Systems Language Lessons Learned from Apollordquo IEEE Computer

Society 2008bull [11] Margaret Hamilton ldquoFull Life Cycle Systems Engineering and Software Development Environment Development Before

The Fact In Actionrdquo httpwwwhtiuscomArticlesFull_Life_Cyclehtmbull [12] Peter Feiler David Gluch John Hudak ldquo The Architecture Analysis amp Design Language (AADL) An Introductionrdquo

CMUSEI-2006-TN-011 February 2006bull [13] Peter Feiler John Hudak ldquoDeveloping AADL Models for Control Systems A Practitionerrsquos Guiderdquo CMUSEI-2007-TR-

014 July 2007bull [14] Bruce Lewis ldquoUsing the Architecture Analysis and Design Language for System Verification and Validationrdquo SEI

Presentation 2006bull [15] Feiler Gluch Hudak Lewis ldquoEmbedded System Architecture Analysis Using SAE AADLrdquo CMUSEI-2004-TN-004 June

2004bull [16] Charles Pecheur Stacy Nelson ldquoVampV of Advanced Systems at NASArdquo NASACR-2002-211402 April 2002bull [17] Systems Integration Requirements Task Group ldquoARP 4754 Certification Considerations for Highly-Integrated or

Complex Aircraft Systemsrdquo SAE Aerospace 10 April 1996bull [18] SAE ldquoARP 4761 Guidelines and Methods for Conducting the Safety Assessment Process on Civil Airborne System and

Equipmentrdquo December 1996bull [19] Aeronautical Radio Inc (ARINC) ldquoARINC Specification 653P1-2 Avionics Application Software Standard Interface Part 1

ndash Required Servicesrdquo 7 March 2006

30

References (Continued)

• [20] Department of Defense, "MIL-STD-882D, Standard Practice for System Safety", 10 February 2000.
• [21] RTCA, Inc., "DO-178B, Software Considerations in Airborne Systems and Equipment Certification", 1 December 1992.
• [22] RTCA, Inc., "DO-254, Design Assurance Guidance for Airborne Electronic Hardware", 19 April 2000.
• [23] US Army, "Aeronautical Design Standard Handbook: Rotorcraft and Aircraft Qualification (RAQ) Handbook", 21 October 1996.
• [24] Cary R. Spitzer (editor), "Avionics: Elements, Software and Functions", CRC Press, 2007.
• [25] US Army, "Army Regulation 70-62, Airworthiness Qualification of Aircraft Systems", 21 May 2007.
• [26] US Army, "Army Regulation 95-1, Aviation Flight Regulations", 3 February 2006.
• [27] Barbacci, Clements, Lattanze, Northrop, Wood, "Using the Architecture Tradeoff Analysis Method (ATAM) to Evaluate the Software Architecture for a Product Line of Avionics Systems: A Case Study", CMU/SEI-2003-TN-012, July 2003.
• [28] Peter Feiler, "All in the Family: CAAS & AADL", CMU/SEI-2008-SR-021, August 2008.
• [29] Chrissis, Konrad, Shrum, "CMMI: Guidelines for Process Integration and Product Improvement", Pearson Education, 2007.
• [30] Brendan O'Connell, "Model Driven Performance Analysis for Avionics Systems", Draper Laboratory, January 2006.
• [31] John F. Hanaway, Robert W. Moorehead, "Space Shuttle Avionics System", NASA SP-504, 1989.
• [32] Lui Sha, "The Complexity Challenge in Modern Avionics Software", 14 August 2006.
• [33] "Incidents Prompt New Scrutiny of Airplane Software Glitches", Wall Street Journal, 30 May 2006.
• [34] Eyal Ophir, Clifford Nass and Anthony Wagner, "Cognitive Control in Media Multitaskers", PNAS, 20 July 2009.
• [35] Federal Aviation Administration, "Advisory Circular AC 25.1309-1A, System Design and Analysis", 21 June 1988.
• [36] Program Executive Office Policy Memorandum 08-03.
• [38] National Science Foundation webpage on Cyber-Physical Systems, http://www.nsf.gov/pubs/2008/nsf08611/nsf08611.htm
• [39] Hoang Pham, "Software Reliability", Springer, 2000.
• [40] Paul Rook (editor), "Software Reliability Handbook", Elsevier Science Publishers Ltd., 1990.

31

References (Concluded)

• [41] http://www.navair.navy.mil/v22/index.cfm?fuseaction=news.detail&id=128
• [42] http://mars8.jpl.nasa.gov/msp98/news/mco990930.html
• [43] John Garman, "The 'Bug' Heard 'Round the World", ACM SIGSOFT Software Engineering Notes, October 1981.
• [44] "Mars Pathfinder Mission Status", 15 July 1997, http://marsprogram.jpl.nasa.gov/MPF/newspio/mpfstatus/pf970715.html
• [45] Nancy Leveson, "Safeware: System Safety and Computers", Addison-Wesley Publishing Company, 1995.
• [46] http://www.electronicaviation.com/aircraft/JAS-39_Gripen/810
• [47] Brandon Hill, "Lockheed's F-22 Raptor Gets Zapped by International Date Line", DailyTech LLC, 26 February 2007, http://www.freerepublic.com/focus/f-news/1791574/posts
• [48] http://www.military.com/news/article/human-error-cited-in-most-uav-crashes.html
• [49] Daniel Michaels and Andy Pasztor, "Incidents Prompt New Scrutiny Of Airplane Software Glitches: As Programs Grow Complex, Bugs Are Hard to Detect; A Jet's Roller-Coaster Ride; Teaching Pilots to Get Control", Wall Street Journal, 30 May 2006.

