4-15-99 1 Steven F. Mattern Science and Engineering Associates, Inc. (505) 346-9839


Page 2:

Today's Topics

Software Safety Analysis

– A Historical Perspective
– A Personal Perspective

Elements of Sound Safety Engineering

Structured Software Safety

Object Oriented Software Safety

Products Produced (examples)

Page 3:

Failures… A Performance Issue

Undesired Performance

Unintended Performance

Fault-Induced Performance

Unsafe Performance

Page 4:

Undesired Event Analysis

• System Safety Engineering
– Hazards
– Hazard Causes
– Hazard Mitigation and Fault Avoidance

• Reliability Engineering
– Faults and Failure Modes
– Fault Pathways to Undesired Events
– Fault Detection, Fault Tolerance, and Fault Recovery

• Operational Effectiveness
– Faults and Failure Conditions
– Fault Pathways to Undesired Events
– Undesired Event Mitigation or Fail Safe/Fail Operational

"CAUSED BY" a Combination of:

Hardware

Software

Human Error

Software-Influenced Human Error

HMI

Data Input

Page 5:

UNDESIRED EVENT

Software, Hardware, Human Error

Software Failure Modes and/or Causal Factors

System-Level Undesired Events and/or Conditions

THIS WAY

BRIDGING Between System-Level Events and the System Hardware, Software, and the Operator/Maintainer... VIA System Safety Engineering Analysis Techniques

Top-Down Approach

Page 6:

Elements of System Safety

• Definition of Safety-Critical Functions
• Tailored Safety Requirements & Guidelines
• Identification of System/Subsystem Hazards & Failure Modes
• Determination of System-Level Effects
• Categorization of Hazard Severity & Likelihood
• Identification of Hazard Causes (HW/SW/Human Interaction)
• Derivation of Functional Hazard Mitigation Requirements
• Determination of Safety Requirements Implementation
• Determination of Residual Safety Risk
• Final Categorization of Hazard Severity & Likelihood
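
The initial and final categorization steps in the list above can be sketched as a small lookup. This is a minimal illustration, not the presenter's method: the category names assume the common MIL-STD-882-style severity and probability labels, and the product-based thresholds in `risk_level` are invented for the example.

```python
from enum import IntEnum


class Severity(IntEnum):
    CATASTROPHIC = 1
    CRITICAL = 2
    MARGINAL = 3
    NEGLIGIBLE = 4


class Likelihood(IntEnum):
    FREQUENT = 1
    PROBABLE = 2
    OCCASIONAL = 3
    REMOTE = 4
    IMPROBABLE = 5


def risk_level(severity: Severity, likelihood: Likelihood) -> str:
    """Map a (severity, likelihood) pair to a coarse risk level.

    ASSUMPTION: the product-based thresholds are illustrative only;
    a real program tailors its own risk matrix.
    """
    score = int(severity) * int(likelihood)  # lower score = higher risk
    if score <= 3:
        return "HIGH"
    if score <= 8:
        return "MEDIUM"
    return "LOW"


# Initial categorization of a hypothetical hazard, before mitigation.
initial = risk_level(Severity.CATASTROPHIC, Likelihood.OCCASIONAL)
# Final categorization: mitigation lowers likelihood, not severity.
final = risk_level(Severity.CATASTROPHIC, Likelihood.IMPROBABLE)
print(initial, "->", final)  # prints "HIGH -> MEDIUM"
```

The gap between the two results is one way to express the residual safety risk that remains after the mitigation requirements are implemented.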

Page 7:

Sources of System Safety Requirements

Generic Software Safety Requirements

Guidelines and Specifications

Derived Functional Safety Requirements

Lessons Learned

Design Specifications

Similar System Analysis

User Inputs

Systems Engineering Automated Environment

Hazard Causal Factor Analysis

System & Subsystem Hazard Analysis

System Requirements

System Architecture

Initial Constraints

Functional Requirements

SYSTEM SAFETY REQUIREMENTS

Page 8:

ROOT HAZARD

Failure Mode A

Failure Mode B

Software-Influenced Human Error

Failure Mode C

Failure Mode D

Causal Factors (repeated under each branch): H/W, S/W, HE, S/WIHE

Initial Depth of Analysis

To the Depth Required to Mitigate Effectively

PDR-Level

CDR-Level

To The Depth Required
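
The hazard tree above lends itself to a simple recursive structure in which the analysis is cut off at a chosen depth and deepened later. The sketch below is an assumption-laden illustration: the `Node` class, the `walk` function, and every leaf name are hypothetical, and the integer depth limit only stands in for the review-level cutoffs named on the slide.

```python
from dataclasses import dataclass, field
from typing import List, Tuple

# Causal factor categories as abbreviated on the slide.
CAUSAL_CATEGORIES = ("H/W", "S/W", "HE", "S/WIHE")


@dataclass
class Node:
    name: str
    category: str = ""  # one of CAUSAL_CATEGORIES for causal factors
    children: List["Node"] = field(default_factory=list)


def walk(node: Node, max_depth: int, depth: int = 0) -> List[Tuple[int, str, str]]:
    """Return (depth, name, category) rows, descending no deeper than max_depth."""
    rows = [(depth, node.name, node.category)]
    if depth < max_depth:
        for child in node.children:
            rows.extend(walk(child, max_depth, depth + 1))
    return rows


# Hypothetical tree; the causal factors are invented for illustration.
root = Node("Root hazard", children=[
    Node("Failure Mode A", children=[
        Node("Sensor driver returns stale value", "S/W", children=[
            Node("Buffer not refreshed after timeout", "S/W"),
        ]),
        Node("Operator misreads display", "S/WIHE"),
    ]),
    Node("Failure Mode B", children=[Node("Connector fault", "H/W")]),
])

shallow = walk(root, max_depth=2)  # an initial depth of analysis
deep = walk(root, max_depth=3)     # deeper, where mitigation requires it
for depth, name, category in deep:
    print("  " * depth + name, category)
```

Only the branches that need more detail to be mitigated effectively have to be extended, so the tree can stay shallow elsewhere.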

Page 9:

Methods For Causal Factor Analyses

• Safety-Critical Functions Analysis
• System States/Modes Analysis
• Hazards Analysis (System & Subsystem Level)
• Fault Tree Analysis (Usually Limited to System-Level Hazards)
• Failure Modes & Effects Criticality Analysis
• Hybrid Event Trees or Reliability Block Diagrams
• Software Data Flow Analysis
• Software Functional Flow Analysis
• Interface Analysis (Hardware/Software/Operator/Maintainer)

Tailored For Customer, Program, & Environment

Page 10:

A REQUIRED FUNCTION DOES NOT OCCUR
Failure of the software to perform a required function; that is, the function is never executed, or no output is produced.

AN UNDESIRED EVENT OCCURS
The software performs a function not required (for example, getting the wrong answer, issuing the wrong control instruction, or doing the right thing but under inappropriate conditions).

AN INCORRECT SEQUENCE OF REQUIRED EVENTS
The software possesses sequencing problems. For example, failing to ensure that two events happen at the same time, at different times, or in a particular order.

TIMING FAILURES IN EVENT SEQUENCES
The software exceeds maximum time constraints between events, fails to ensure minimum time constraints between events, or possesses duration failures.

AN INCORRECT RESPONSE TO A SAFETY-CRITICAL EVENT
The software fails to recognize a hazardous condition requiring corrective action, fails to initiate a fault tolerant response to a recognized safety-critical function, or produces the wrong response to a hazardous condition or failure mode.

Potential Influence of Software
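
Four of the five software failure categories above can be checked mechanically against a recorded event trace. The following is a minimal sketch under stated assumptions: `check_sequence`, the event names, and the millisecond limits are all hypothetical, and each event is assumed to occur at most once. The fifth category (incorrect response to a safety-critical event) needs knowledge of the hazard itself and is not covered.

```python
def check_sequence(events, required_order, min_gap_ms, max_gap_ms):
    """Classify problems in a trace of (name, timestamp_ms) pairs.

    ASSUMPTION: each event name occurs at most once in the trace.
    """
    findings = []
    names = [name for name, _ in events]

    # A required function does not occur.
    missing = [n for n in required_order if n not in names]
    if missing:
        findings.append(("REQUIRED FUNCTION DOES NOT OCCUR", missing))

    # An undesired event occurs (something not required was executed).
    extra = [n for n in names if n not in required_order]
    if extra:
        findings.append(("UNDESIRED EVENT OCCURS", extra))

    # Incorrect sequence: required events present but out of order.
    observed = [n for n in names if n in required_order]
    expected = [n for n in required_order if n in names]
    if observed != expected:
        findings.append(("INCORRECT SEQUENCE", observed))

    # Timing failures between consecutive events.
    for (a, t_a), (b, t_b) in zip(events, events[1:]):
        gap = t_b - t_a
        if gap < min_gap_ms or gap > max_gap_ms:
            findings.append(("TIMING FAILURE", (a, b, gap)))
    return findings


required = ["select", "verify", "execute"]
trace = [("verify", 0), ("select", 200), ("execute", 5000)]
for finding in check_sequence(trace, required, min_gap_ms=100, max_gap_ms=1000):
    print(finding)
```

For this trace the check reports an incorrect sequence (`verify` before `select`) and one timing failure (4800 ms between `select` and `execute`).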

Page 11:

Hazards (HAZ)

Software Req (SWR)

CSCI Scenario (SCE)

Segment Scenario (SSC)

Segment Behavior (SBE)

Interface (CIM/CID)

Testing (VER_SWR)

Object Oriented Design

Page 12:

MCS SCE00820 Activate a Scheduled Mission

CUI SCE00??? Begin Resupply Mission

SAS SCE00899 Perform Docked Resupply (Fwd Veh)

#5 <Commence_Execution_Order>

#1 MCS <Commence_Execution_Command>

#2 Notify Crew to Begin Resupply Mission (CUI)

#3 Uses SAS SCE00882

#1 VMG <Dock_Deploy_Notification>

#2 Uses SAS SCE00868

#1 <Inventory_Request> (RPC and ACU)

#2 <Ammunition_Inventory_Data> (RPC)

SAS SCE00882 Develop Resupply Op Request (Fwd Veh)

SAS SCE00868 Obtain Vehicle Inventory

2

1

VMG SCE02057 Deploy SPH

#6 <Dock_Deploy_Notification>

3

Notes:

1. Cannot find the Crew Notification within CUI to begin resupply mission. SCE00899 sends message… but to where? The SAS to CUI interface describes Resupply_Authorization_Request, but it does not appear in the CUI SRS.
2. This Scenario uses the term "USES". Is "Uses" the same as "Invokes"?
3. It is unclear whether SCE00868 and SCE00871 are processed concurrently or in series.
4. This message is used to request resupply guidances. Resupply guidances are provided by the POC, and include substitution rules, end-state guidance, controlled supply rate, and resupply thresholds. However, there is no development of this message in the TDA SRS… No message received and no message provided to SAS.
5. AFT SCE01132 does not receive this message from RPC SCE00498, does not obtain inventory data from any known source, nor does it send the data back to RPC SCE00498.
6. Same comment as #5.
7. Assume the command to Retrieve Projectile Inventory Data is processed internal to RPC SCE00498.
8. ACU SCE02338 has not yet been developed.

RPC SCE00489 Provide Inventory Data

AFT SCE01132 Provides Propellant Inventory Data

#1 <Inventory_Request>

#2 <Propellant_Inventory_Request> (AFT)

#3 Retrieve the Projectile Inventory Data

#4 <Propellant_Inventory_Request>

#5 <Ammunition_Inventory_Data>

5

6

7

ACU SCE02338 Perform Resupply Fuel Transfer (Sender)

8

Safety Analysis of Scenarios
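
Several of the notes on this slide describe the same defect: one scenario sends a message that the receiving specification never lists, or that goes to a scenario that does not yet exist. A cross-check of that kind could be sketched as follows. The data layout, the function `unmatched_messages`, and the scenario and message names are all hypothetical and are not taken from the program's SRS documents.

```python
def unmatched_messages(scenarios):
    """Find sent messages that the target scenario does not list as an input.

    scenarios: {scenario_id: {"sends": {message: target_id},
                              "receives": set of message names}}
    """
    findings = []
    for sender, spec in scenarios.items():
        for message, target in spec["sends"].items():
            if target not in scenarios:
                findings.append((sender, message, target, "target scenario not defined"))
            elif message not in scenarios[target]["receives"]:
                findings.append((sender, message, target, "not listed as input by target"))
    return findings


# Hypothetical scenario set, invented for illustration.
scenarios = {
    "SCE-A": {
        "sends": {"Inventory_Request": "SCE-B"},
        "receives": set(),
    },
    "SCE-B": {
        "sends": {"Inventory_Data": "SCE-A", "Status_Notice": "SCE-C"},
        "receives": {"Inventory_Request"},
    },
}

for finding in unmatched_messages(scenarios):
    print(finding)
```

Each finding is a question for the design or domain expert rather than a confirmed hazard; the analyst still has to judge whether the gap matters in the context of a specific hazard.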

Page 13:

QUESTIONS/DISCUSSION

1. IFC SCE00598, Step #6 sends the <Warning_Order> to both RPC SCE00391 and RPC SCE00493. Additionally, <Warning_Order> is not listed in the IFC SRS Input/Output Tables.

Steps for Scenario Execute A Fire Command (SCE00598)

# | Stimulus (As Req.) | Scenario Steps | Response (As Req.)
5 | | IFC determines whether a check firing condition exists. |
6 | | IFC sends ammunition order to RPC. | [RPC: Warning_Order]
7 | [RPC: Ammunition_Order_Data] | RPC sends projectile weights and preliminary propellant temperature for all rounds contained in the ammunition order to IFC. |

RPC SCE00493 shows receipt of <Ammunition_Order> in Step #1 while RPC SCE00391 shows receipt of <Ammunition_Order> as the scenario stimulus. Do we need to make the SCE00391 Step #1 the Rx of <Ammunition_Order>? Finally, RPC SCE00391 Step #1 indicates that the scenario "Invokes RPC SCE00493", raising an apparent conflict with scenario activation in response to <Ammunition_Order>.

Steps for Scenario SPH Supply Multiple Rounds for Firing (SCE00391)

# | Stimulus (As Req.) | Scenario Steps | Response (As Req.)
1 | | Invoke Scenario [SCE00493] "SPH Select Inventory for Firing" |
2 | | Verify the ability to support the fire mission event |

Contextual Applicability to Hazards

Page 14:

Test Requirements Based Upon Causes & Mitigation

UNDESIRED EVENT

Failure Mode A

Failure Mode B

Failure Mode C

Failure Mode D

Hardware Failure

Human Error

Software Error

Timing Error

Sequencing Error

Algorithm Error

• What Causes, Initiates, or Influences The Undesired Event?

• What Requirements Have Been Implemented To Mitigate?

• How Will Testing Be Accomplished to Prove Implementation of Mitigation Requirements?

• Testing for Functionality and Loss of Functionality.

• How Will The System Respond to Failure? Fault Tolerance/Recovery

Page 15:

[Diagram: Hazards (SHZ) and Hazards (HAZ) linked to multiple CSCI Scenarios (SCE), each linked to Testing (VER_SWR)]

Context, Detail, Completeness

Segment, Element, CSCI

Testing Scenario Completeness

Page 16:

Software Requirements Traceability Matrix

System Safety Requirement | Software Requirement Description | Affected CSCI | Test Procedure Requirement | Test Results

1.2.3.1

1.2.3.3

1.3.4.1

The WCS Shall Monitor the Status of the JWS Missiles That Are Powered up.

The WCS Shall Safe and Deselect Any JWS Missile That Fails BIT

The JWS Missile Shall Withhold Active RF Emissions Until Terminal Impact Phase

System Safety Requirement Test or Analysis Activity

Verified Date Comments

System Safety Requirements Verification Matrix

The JWS WCS Shall Include Processing Elements to Verify IPL Results

The JWS WCS Shall Include Software Elements to Safe the RM During Unsafe States

The JWS Missile Shall Include Software to Prevent Active RF Until Terminal Phase

JWS WCS

JWS WCS

JWS Missile

FTH

WDC

FIC

VU0012

VU10002

VU234003

FAIL

PASS

PASS

5/7/02

5/9/02

6/3/02

Failed 1ST Attempt

See Test Log 1001

None

AN/SWY-1/2/3 Safety Program

HAZARD ACTION REPORT

HAR Number: Date:

FROM: TO: Commander, Naval Surface Warfare Center, Dahlgren Division, Code G71 (J. Bowden), 17320 Dahlgren Road, Dahlgren, VA 22448-5001

ACTION ADDRESS:

SYSTEM: COMPONENT:

SYSTEM PHASE OR OPERATION:

HAZARD DESCRIPTION:

SUMMARY OF FAILURE MODES AND CAUSAL FACTORS:

HAZARD CONTROL/ MITIGATION ACTION(S):

REFERENCES:

ORIGINATOR: TELEPHONE:

FINAL RESOLUTION:

HAZARD RESOLVED & HAR CLOSED:

USE ADDITIONAL SHEETS AS NECESSARY

PAGE _ OF _

HRI:

PROGRAM MANAGER DATE

PRINCIPAL FOR SAFETY DATE

Evidence of Hazard Control
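
The traceability and verification matrices on this slide serve as evidence only if every safety requirement traces to a test procedure with a passing result. A minimal sketch of that closure check follows; the `TraceRow` fields mirror the matrix column headings, but the row values and the `open_items` helper are hypothetical.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class TraceRow:
    requirement_id: str   # system safety requirement number
    csci: str             # affected CSCI
    test_procedure: str   # empty string if no test is assigned yet
    result: str           # "PASS", "FAIL", or "" if not yet run


def open_items(rows: List[TraceRow]) -> List[str]:
    """Return requirement IDs that lack a test procedure or a passing result."""
    return [
        row.requirement_id
        for row in rows
        if not row.test_procedure or row.result != "PASS"
    ]


# Hypothetical rows, invented for illustration.
matrix = [
    TraceRow("REQ-1", "CSCI-A", "TP-001", "FAIL"),
    TraceRow("REQ-2", "CSCI-B", "TP-002", "PASS"),
    TraceRow("REQ-3", "CSCI-C", "", ""),
]

print(open_items(matrix))  # prints ['REQ-1', 'REQ-3']
```

A hazard action report would stay open while any requirement tied to its mitigation appears in this list.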

Page 17:

EXAMPLE

1.0 Man Machine Interface (MMI)

1.1 PROCESS CANCELLATION
1.1.1 The System shall be designed such that the operator may exit from a potentially unsafe state with a single action.
1.1.2 Exiting from an unsafe state or condition shall place the system in a known safe state, report the failure, and display the system status to the operator.

1.2 SAFETY-CRITICAL PROCESS INITIATION
1.2.1 Safety-Critical operator displays, legends, and other interface functions shall be clear, concise, and unambiguous.
1.2.2 Safety-Critical displays shall be duplicated, where possible, on separate display devices.
1.2.3 Safety-Critical alerts to the crew shall be readily distinguishable from routine alerts.
1.2.4 Upon detection of an unsafe state, the system shall alert the operator to the anomaly detected, the action taken, and the resulting system configuration and status.

1.3 OPERATOR ENTRY ERRORS
1.3.1 The software shall be capable of detecting improper operator entries, or sequence of entries, and thereby prevent the execution of safety-critical functions.
1.3.2 The software shall alert the operator to an erroneous entry.
1.3.3 Operator alerts shall indicate the error and corrective action.
1.3.4 After operator corrective action to an erroneous entry, the software shall provide positive confirmation of a valid data entry, and a real-time indication that the system is functioning properly.
1.3.5 Safety-Critical functions which require several seconds or longer to process shall provide a status indicator to the operator during processing.

1.4 POSITIVE FEEDBACK
1.4.1 Software control of safety-critical functions shall have positive feedback indicators to the operator to provide assurance that the system is functioning properly.

1.5 SAFETY-CRITICAL ALERTS
1.5.1 The operator shall not be able to clear a safety-critical alert without taking corrective action.

Initial Software Safety Constraints
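
Two of the constraints above translate almost directly into code: an alert that cannot be cleared without corrective action (1.5.1) and operator entry checking with feedback (1.3.1 through 1.3.4). The sketch below is illustrative only; the class, the function, and the numeric limits are assumptions rather than part of any fielded interface.

```python
class SafetyCriticalAlert:
    """An alert that stays active until a corrective action is recorded."""

    def __init__(self, message):
        self.message = message
        self.active = True
        self.corrective_action = None

    def record_corrective_action(self, description):
        self.corrective_action = description

    def clear(self):
        """Refuse to clear until a corrective action is recorded (cf. 1.5.1)."""
        if self.corrective_action is None:
            return False
        self.active = False
        return True


def check_entry(value, low, high):
    """Return (accepted, feedback) for an operator entry (cf. 1.3.1 to 1.3.4)."""
    if not low <= value <= high:
        # Reject the entry and tell the operator how to correct it.
        return False, f"Entry {value} rejected: enter a value from {low} to {high}."
    # Positive confirmation of a valid entry.
    return True, f"Entry {value} accepted."


alert = SafetyCriticalAlert("Unsafe state detected")
print(alert.clear())                       # False: no corrective action yet
alert.record_corrective_action("Returned system to known safe state")
print(alert.clear())                       # True
print(check_entry(120, low=0, high=90))
```

Returning the feedback text alongside the accept/reject decision keeps the alerting requirement and the blocking requirement in one place, so a caller cannot apply one without the other.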

Page 18:

Safety-Critical Function Analysis

1. Define System Function
2. Determine Ramifications of Function Failure
3. Determine Subsystems Impacted
4. Determine Safety Significance

Example

Columns: System Function | Ramification of Failure | Relevant Subsystems | Safety Significance

Tactical & Technical Software; Data Displays & Controls; Control (Armament)

Tactical & Technical Software; Data Displays & Controls; Training; Maintenance

Fail to define baseline data - Results in weapon firing into friendly forces

Tactical & Technical Software; Data Displays & Controls; General Processing; Mass Memory; Sensor/Actuator Interface; Servo Amplifier

Fail to define baseline data - Results in weapon firing into friendly forces (dependent on modes and states of the weapon system)

Fail to protect self from friendly forces

Fire projectile into friendly forces

Tactical & Technical Software; Data Display & Controls; General Processing; Mass Memory; Sensor/Actuator Interface; Servo Amplifier

Sensors & Peripherals

Tactical & Technical Software

1. Engagement Control A. Check-Fire; Manual B. Cease-Fire; Automatic

Fail to break engagement of an incorrectly designated target; Fire projectile into friendly forces if incorrectly identified friendly target as hostile

Safety-Critical Function

2. Mode Control A. Tactical: move, shoot, resupply B. Support: training, test, & maintenance

Inadvertent weapon firing

Safety-Critical Function

3. Initialization, Re-Initialization

4. Re-Configuration

Safety-Critical Function

Safety-Critical Function

Safety-Critical Function

Safety-Critical Function

5. Transmit Friend Signal To Friendly Forces

6. Target Integrity

Improper actions leading to designation of incorrect target. Fail to accomplish offensive or defensive operation. Fail to protect self or friendly forces

Tactical & Technical Software; Data Displays & Controls

Safety-Critical Function

7. Receive, Verify, & Process Battlefield Information

Page 19:

Safe States & Modes Analysis

1. Define System States and Substates
2. Define Primary Capabilities of the System State
3. Identify Prohibited Functions of the System State
4. Identify Essential Functions of the System State

Example

SYSTEM STATE | PRIMARY CAPABILITIES | PROHIBITED FUNCTIONS | ESSENTIAL FUNCTIONS
1. Unpowered State | Weapon System Is Shut Down and Powered off | System Operation of any kind | Weapon System Physical Security
2. Maintenance State | Perform Intrusive Fault Protection and Isolation. Perform Servicing, Repair, And LRU Replacement Actions, Etc. | Weapon Control System Test, Training, and Tactical Operation | Primary Power On
3. Test State | Test WCS software and hardware end-to-end using synthetic training targets and Operational Software | Tactical Links Installed; Arm/Enable Switch On | Break-Engage Active; Primary Power On; Training Software Load
4. Training State | Perform WCS Operator Training using Simulated Environments Using Training Software. No Live-fire, Simulated Target | Tactical Links Installed; Arm/Enable Switch On | Primary Power On; Sim Software Loaded; Fire Command Cleared
5. Tactical State | Weapon System Active for Live-fire | Training, Maintenance, and Test Modes | Primary Power On
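
A state table of this kind can be turned into a run-time or test-time check: for a given state, no prohibited condition may be active and every essential condition must be present. The sketch below encodes only the Training and Tactical states, and assumes the prohibited and essential entries line up with the states in the order they appear on the slide. The identifier spellings and the `state_violations` helper are hypothetical.

```python
# ASSUMPTION: entries are matched to states by their order on the slide.
STATE_RULES = {
    "Training": {
        "prohibited": {"tactical_links_installed", "arm_enable_switch_on"},
        "essential": {"primary_power_on", "sim_software_loaded", "fire_command_cleared"},
    },
    "Tactical": {
        "prohibited": {"training_mode", "maintenance_mode", "test_mode"},
        "essential": {"primary_power_on"},
    },
}


def state_violations(state, active_conditions):
    """Return (prohibited conditions found, essential conditions missing)."""
    rules = STATE_RULES[state]
    found = sorted(rules["prohibited"] & active_conditions)
    missing = sorted(rules["essential"] - active_conditions)
    return found, missing


# Hypothetical snapshot: training requested with the arm switch left on.
snapshot = {"primary_power_on", "sim_software_loaded", "arm_enable_switch_on"}
print(state_violations("Training", snapshot))
```

For this snapshot the check reports `arm_enable_switch_on` as a prohibited condition and `fire_command_cleared` as a missing essential condition, so entry into the Training state would be refused.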

Page 20:

Functional Analysis of Software Architecture

[Diagram: software architecture hierarchy; unit codes and labels listed by row as extracted]

LTE: Controller Unit LCR

LEE, LRE, LBE, LSP: Event Queue Processing Unit LCR; Event Queue Processing Unit LCR; Bearing/Jam Line Event Queue Processing LCR; Sector Processing Unit LCR

LGD, LTP, LME: Get Data Unit LCR; Processing Unit LCR; Modification Event Processing LCR

LNE, LNW, LRW, LSD: Track Not Eligible Unit LCR; #1 Solution Processing Unit LCR; #2 Solution Processing Unit LCR; Save Data Unit LCR

A40

B40

C40

• Graphical Representation for Each Hazard or Failure Mode

• Links Specific Software Modules to Undesired Events

• Safety Implication/Effects Communicated to the Design/Domain Expert

Example Analysis

Page 21:

Do SILs Make Sense?

• Definition of Safety-Critical Functions

• Tailored Safety Requirements & Guidelines

• Identification of System/Subsystem Hazards & Failure Modes

• Determination of System-Level Effects

• Categorization of Hazard Severity & Likelihood

• Identification of Hazard Causes (HW/SW/Human Interaction)

• Derivation of Functional Hazard Mitigation Requirements

• Determination of Safety Requirements Implementation

• Determination of Residual Safety Risk

• Final Categorization of Hazard Severity & Likelihood

Elements of System Safety

Page 22:

The Role of SILs

• Forces the Functional Linking of Software Architecture to Undesired Events or Hazards of the System

• Forces “Cause Analysis” to be Accomplished to Prove Software Influences or Causes the Hazard to Initiate

• Forces the Software Development Team to Interact with the Safety Team

• Forces a Defined Protocol of Software Development Activity in the Design, Code, Test, V&V, and Configuration Management for each of the SIL Categories

Page 23:

Issues To Consider

• Solving Problems = High Visibility; Preventing Problems = Low Visibility

• Size and Complexity of the Software

• Software Development Life Cycle, Tools/Techniques
– Structured Design
– Object Oriented Design

• Interfaces (Hardware, Software, Human)
– Functionality of Interface
– Ramification of Loss of Interface

• Role of Systems Engineering

• Requirements versus Recommendations/Considerations

• Budgets for “Specialty Engineering”