University of Washington, Information Security & Risk Management
IMT 553 Final Project: Evaluation of Preventive Technologies
Date: June 2, 2015
Authors: Larry DeBellis, Carlos Cabello, Mary Marks, Steve Morehouse, Steve Vincent, Mike Whaley

FMEA Final Project


Spring 2015 IMT 553 Final Report DeBellis, Cabello, Marks, Morehouse, Vincent, Whaley

Table of Contents

Executive Summary
    Risk Assessment Key
    Top 5 Risks
Company Overview
I. Failure Modes
    TOP Failure Modes and Effects Analysis (FMEA)
II. Measures to reduce the named residual risks
    Patching
    Change Control Risk
    Antiquated firewall
    Employee Turnover
    Microsoft Exchange Server
    Bring Your Own Device (BYOD)
    Backup
    Physical security
    Connectivity
    Availability
    Non-segmented Network
    Business Risks
    Patient Tracking/Medical Records/Claims Management and Customer Billing Risks
III. Residual Risks
IV. Plan of Action and Milestones (POA&M)
Acronyms
Team
ANNEX 1: Severity, Probability, & Hazard Score Key
ANNEX 2: Risk Evaluation
ANNEX 3: Recommended Control Measures
ANNEX 4: Residual Risk Assessment
ANNEX 5: Plan of Action and Milestones (POA&M)
ANNEX 6: Bibliography


Executive Summary

This report identifies the known hazards and lists the risks associated with them. We weigh each risk against its control measures and identify the residual risk that remains, based on our ability to adopt the controls and on how well they minimize the risk.

Risks can be divided into technical and business risks. Technical risks affect the systems that support operations and do not directly face the customer. Business risks are customer-facing risks that affect business continuity and company growth.

Risk Assessment Key

The current Risk Assessment identifies 4 High, 11 Medium, and 6 Low risks. If the recommended actions are taken expeditiously, we anticipate a Residual Risk of zero High, 2 Medium, and 19 Low, which we recommend the officers find acceptable given the controls to be implemented.

Risk             Low   Medium   High
Current Risk       6       11      4
Residual Risk     19        2      0
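As a quick consistency check on the key above, the following Python sketch tallies both rows and reports how the counts shift between levels once the controls are applied. The variable and function names are illustrative only and are not part of any tooling used for this assessment.

```python
# Risk counts copied from the Risk Assessment Key; names are illustrative.
current = {"Low": 6, "Medium": 11, "High": 4}
residual = {"Low": 19, "Medium": 2, "High": 0}


def total(counts):
    """Return the total number of risks across all levels."""
    return sum(counts.values())


def shift(before, after):
    """Return the change in count per level (after minus before)."""
    return {level: after[level] - before[level] for level in before}


if __name__ == "__main__":
    # Both rows should describe the same set of risks.
    assert total(current) == total(residual)
    print("Total risks:", total(current))
    print("Change per level:", shift(current, residual))
```

Both rows sum to the same total, so the residual assessment accounts for every risk in the current assessment.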

Top 5 Risks

1. Patching
2. Hardware
3. BYOD
4. Data Backups
5. Change Control

Company Overview

Kangaroo Inc. is a Seattle, Washington-based provider of dental practice management and imaging software solutions. The company offers digital imaging equipment, dental supplies, and software/hardware technical support services, and recently launched a cloud-based practice management solution for its clients.

The company’s main servers in Seattle are connected via the Internet to its clients throughout the United States.

Our customers depend on our services to support their business and we provide services that organize their day-to-day activities and host their data. We want to make sure our customers have full confidence in our ability to protect their data in terms of confidentiality, integrity, and availability.


I. Failure Modes

TOP Failure Modes and Effects Analysis (FMEA)

Preventive Technology: Patching
How It Fails: Unpatched vulnerabilities allow attack or other server software failure.
Causes: High turnover in IT department. Insufficient documentation and tracking of updates. Heterogeneous systems = high patch diversity and pace.
Effects: Unexpected datacenter downtimes resulting from poor patch management. Impacts regional office servers, connections to HQ services, potential loss of data. Affects customers during business hours if servers are down for patching.
Comments: Software patching and updates are an ongoing, organization-wide problem. We face challenges keeping our software suite up to date and applying patches in a timely manner. Critical patches must be applied during regular business hours, and this creates an ongoing problem for the company and our customers. Our clients experience downtime when we apply critical software patches during business hours. This places our ability to meet SLAs at risk.

Preventive Technology: Antiquated firewall with expired support
How It Fails: Expired IDS/IPS in regional offices
Causes: High turnover within IT Department. IT budget re-allocated to business projects.
Effects: HQ caught an average of 801 malware events a week on its perimeter devices. DoS attacks are becoming a common event.
Comments: We have only a single firewall device protecting our data, which places data availability and our ability to maintain SLAs at risk. We have an immediate problem with vendor support that has expired. To address this, staff will be assigned to track and manage software support contracts to avoid a lapse.

Preventive Technology: Running MSFT Exchange Server 2007. Have not upgraded to 2010.
How It Fails: Spam filter for Exchange server not reliable. Exchange Server management is a mess. Microsoft has halted support for Exchange 2007.
Causes: Increased attack risk
Effects: 23% of users click on phishing links, and 11% of those click on the attachments.
Comments: Our company uses Microsoft Exchange Server 2007 and has not upgraded to Exchange Server 2010. The spam filter is not reliable and our users are receiving an increasing number of spam messages. Microsoft is sunsetting support for our aging release of Exchange Server. Without Microsoft support, we are less able to defend ourselves against cyber-attacks and to obtain new defenses against emerging threats.

Preventive Technology: BYOD
How It Fails: Employee devices introduce malware onto network
Causes: No malware detection implemented on BYOD (e.g. employee phones)
Effects: Malware present on network is a threat and requires funding and staff resources to combat and to determine extent of damage and potential loss of data
Causes: Flat network (non-segmented) allows devices on same LAN as corporate devices
Effects: Malware "jumps" from BYOD devices to corporate devices on same LAN
Comments: The Bring Your Own Device (BYOD) risk manifests itself in various ways and requires as many controls. The first control is the design of a BYOD policy that includes controls to reduce the cost and risks introduced by BYOD. Our BYOD policy must be well thought out, address our changing environment, and be updated regularly. The second BYOD control identified is to implement a VLAN or other form of network segregation with a separate WiFi signal, which is supported by most high-end small office/home office (SOHO) and nearly all enterprise WiFi access points / routers. The separate WiFi network would be for non-company personal and business use and, based on demand, could be split even further for increased separation. As an added note, we do have to ensure our access points/routers support this network segregation.

Preventive Technology: Data Backups
How It Fails: Corrupted Backup
Causes: Hardware Fails
Effects: If full-backup, then impact is one-week data loss. If incremental, then impact is "n-days" data loss.
How It Fails: Same-Location Disaster
Causes: Lack of offsite store
Effects: Current lack of an offsite store means a disaster affecting the datacenter may destroy all backups
How It Fails: Recovery Fails
Causes: Hardware Fails
Effects: Assumes good copy of data; recovery attempt fails, but replacing hardware will allow another recovery attempt.
Comments: We have three different risk mitigations for data backups.
1. Test each backup immediately after copy to ensure non-corruption and to allow time to repair the backup system and make new copies.
2. Transfer backup copies to an offsite storage vault outside of the regional disaster impact zone. We would also keep an additional copy of our most recent backup locally for recovery.
3. Keep spare hardware onsite, including replacement drives and blades, and ensure technicians are trained to swap out failed parts.

Preventive Technology: Change Control Management
How It Fails: Patch-management delayed
Causes: Maintenance window scheduling adversely impacted by insufficient coordination with customers and/or SLA language not supportive
Effects: Delayed patching results in potentially vulnerable systems in the datacenter
Comments: The risks associated with Software Change Management are similar to software patching issues. They apply to software configuration management and the hardware it supports. Issues occur when software fixes are not tested against prior software releases and hardware configurations. We must implement controls to ensure that current software fixes are thoroughly tested, so that they fix the problem and do not affect prior fixes.

Preventive Technology: Business Continuity Planning
How It Fails: Earthquake destroys datacenter
Causes: HQ office building is in a level 4 earthquake zone; it is a 3-story brick building that is not seismically retrofitted, with no plans to retrofit.
Effects: All hosted services offline, all hardware damaged, all data backups destroyed (assuming current status of no offsite backups)
How It Fails: Thieves destroy or damage servers
Causes: Server not secured in a proper manner
Effects: Stolen sensitive information can be used against our clients and damage reputation.
Comments: Business continuity of operations after a disaster is a concern.
• Our headquarters, regional infrastructure, and remote employee access are at risk in case of a disaster.
• These risks potentially inhibit our ability to perform day-to-day operations.

Preventive Technology: Connectivity
How It Fails: Datacenter to ISP link goes down
Causes: Datacenter is only connected upstream via a single ISP
Effects: All hosted services offline

Preventive Technology: Availability
How It Fails: Hardware failure (firewall or server)
Causes: Architecture of datacenter; multiple single points of failure (e.g. single firewall device, single blade-chassis)
Effects: Hosted services offline. Firewall or server chassis failure means services are offline until manual intervention.
Causes: Service agreement; datacenter staff is not authorized to enter Kangaroo cabinet; repairs require Kangaroo staff to travel to datacenter.
Effects: Hosted services offline. Any failure within the cabinet requires Kangaroo staff to travel to site.

Preventive Technology: Regional network not segmented, phones on same network as workstations
How It Fails: Allows malicious traffic to mask itself as VOIP traffic and infect workstations.
Causes: Infrastructure grew fast at regional locations and lack of segmentation was an oversight
Effects: Potential downtime at regional office if workstations infected.


Preventive Technology: Patient tracking
How It Fails: Patient database unavailable. Tracking system software not properly exchanging data between database and customer system.
Causes: Servers are segmented from each other and unable to communicate with each other
Effects:
• Customers (Providers) unable to conduct patient monitoring because medical records not available.
• Medical files and billing and insurance records contain the most valuable patient data and are the most often and successfully targeted (55% successfully targeted).
• Providers at risk of malpractice due to lack of visibility into patient information.
• Company at risk of liability for a breach of personal financial and health information belonging to our customers and their patients.
• We are at risk of business disruption, which adversely affects our company, our customer providers, and their patients.
• Adverse impact on future financial results due to the theft, destruction, loss, misappropriation or release of confidential data or IP.

Preventive Technology: Insurance eligibility verification
How It Fails: Kangaroo system offline
Causes: Unauthorized access to patient records due to employee negligence, criminal activity, service disruptions, network failure
Effects: Customers (HC Providers) unable to determine insurance eligibility of patients

Preventive Technology: Appointment scheduling
How It Fails: Kangaroo system offline
Causes:
• Antiquated operating system is outdated and no longer supported by the manufacturer
• Patient kept on hold, lost call, unanswered phone
Effects:
• Customers unable to schedule new appointments or modify existing appointments.
• Loss of revenue risk due to operational and business delays.
• Negative publicity resulting in reputation or brand damage with our customers, suppliers or industry peers.
• Adverse impact on future financial results due to the theft, destruction, loss, misappropriation or release of confidential data or IP.


Preventive Technology: Claims management
How It Fails: Kangaroo system offline
Causes: Antiquated operating system is no longer supported by the manufacturer
Effects:
• Company unable to process medical claims.
• Liability risk due to provider customer loss of revenue due to operational and business delays.
• Negative publicity resulting in reputation or brand damage with our customers, suppliers or industry peers.
• Operational or business delays resulting from the disruption of IS and subsequent cleanup and mitigation activities.

Preventive Technology: Customer billing
How It Fails: Kangaroo system offline
Causes:
• Antiquated operating system is no longer supported by the manufacturer
• Inadequate or incomplete documentation.
Effects:
• Company unable to process customer billing for claims.
• Billing and insurance records contain some of the most valuable patient data and are targeted (46% successfully targeted).
• Liability risk for a breach of personal financial and health information belonging to our customers and their patients.
• Business disruption risk adversely affects our company, our customer providers, and their patients.
• Adverse impact on future financial results due to the theft, destruction, loss, misappropriation or release of confidential data or IP.

Preventive Technology: Prescription drug management
How It Fails: Kangaroo system offline
Causes: Legacy systems unable to support minimum requirements
Effects: Providers are unable to enter new prescriptions for patients and unable to conduct drug interaction checking for new prescriptions.


II. Measures to reduce the named residual risks

We applied the FMEA issue tracking to focus on the “what” rather than the “how” or “why”. This approach allowed for thorough application of controls.

The following actions address the risks identified above. Each risk is addressed in turn below.

Patching

• Establish clear policy and procedures for systems maintenance.
• Ensure recommended patches are tested first in a development environment, then in a production environment to simulate the real environment as closely as possible.
o When concerned and conscientious testers adequately test the code, it will be patched in a timely manner.
• Employ knowledgeable configuration management staff who carefully monitor and control our releases. Not only will we know exactly what is in the software fixes, but we can also recreate these fixes later for failure diagnostics and analysis. More importantly, the software generation process will be deterministic and controlled, to minimize the errors introduced into it.
• Ensure staff communicate effectively and work in tandem so that fixes work together and with the targeted hardware, and so that fixes or patches are pushed out once, without hot fixes or overlay releases.
• Carefully document release notes to ensure our own development and test staff are aware of the changes and can buy off on the fixes.
o The release notes will be informative to users, who will be aware of our fixes and confident that they have been thoroughly and comprehensively tested and that rework by hot fixes or overlay releases is minimized.
• Apply patches to our headquarters and regional offices.
• Ensure that our regional offices get the attention they need and are not overlooked.

Change Control Risk

The previous controls for patching also apply to change control risk.

• Ensure recommended patches are tested like other software fixes, namely in a thoughtful, deliberate, and consistent manner.
• Establish clear policies and procedures for systems maintenance, similar to software patching.

Antiquated firewall

Concerning our antiquated firewall, we have an immediate problem with vendor support that has expired.

• Assign staff to closely manage and track software support contracts and coverage to ensure support does not lapse.
• Submit a follow-up request for coverage in time for the next financial cycle to ensure our coverage does not lapse. As a result, our software support personnel must be involved in budgeting.
• Deploy new Network Intrusion Prevention System (NIPS) hardware in regional offices.
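The contract-tracking control above could be supported by a simple expiry check. The following Python sketch is illustrative only: the contract names, the expiry dates, and the 90-day warning window are all assumptions made for the example, not values from this assessment.

```python
from dataclasses import dataclass
from datetime import date, timedelta


@dataclass
class SupportContract:
    """One vendor support contract (example values below are hypothetical)."""
    product: str
    expires: date


def needs_attention(contracts, today, warn_days=90):
    """Return contracts that have lapsed or will lapse within warn_days of today."""
    cutoff = today + timedelta(days=warn_days)
    return [c for c in contracts if c.expires <= cutoff]


if __name__ == "__main__":
    # Hypothetical inventory used only to exercise the check.
    contracts = [
        SupportContract("HQ firewall", date(2015, 3, 1)),
        SupportContract("Regional IDS/IPS", date(2015, 8, 15)),
        SupportContract("Backup software", date(2016, 6, 30)),
    ]
    for contract in needs_attention(contracts, today=date(2015, 6, 2)):
        print(f"{contract.product}: support expires {contract.expires.isoformat()}")
```

Running a check like this ahead of each financial cycle would give the assigned staff a list of renewals to include in the budget request.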

Employee Turnover

We also have a problem with turnover in our IT department, which manifests itself in infrastructure issues.

• Provide the necessary leadership, i.e. management, to recruit and retain talented personnel.
• Work with Human Resources (HR) to ensure compensation is commensurate with similar outside staff.
• Develop an IT personnel lifecycle that can support continued operations in the following years without disruption.
• Include a plan for career growth for our employees to
o Minimize bottlenecks
o Provide opportunities to either specialize in an area of IT or generalize in areas that support the customer and our company, while complementing their own careers.
• Implement a training plan to support the education of our IT staff so they can intelligently plan and deploy solutions to minimize our firewall risk.

Microsoft Exchange Server

We are still running Microsoft Exchange Server 2007 and have not upgraded to the most recent version.

• Upgrade from Microsoft Exchange Server 2007 to 2010 or higher.
• Establish a contingency plan in case our main server is taken over by an attacker.
o To mitigate this, our IT staff would keep up to date with current attack trends and design monitoring rules as new attacks are found in the wild. This ties in with our IT staff recommendations mentioned before.
• Secure the email server by using well-established guidelines, such as the National Vulnerability Database (ref. https://web.nvd.nist.gov/view/ncp/repository/checklistDetail?id=186), to reduce our exposure significantly.

Bring Your Own Device (BYOD)

The Bring Your Own Device (BYOD) risk, although not as high a risk as others, still manifests itself in various ways and requires as many controls.

• Design a BYOD policy including controls to reduce the cost and risks introduced by BYOD. This policy must be well thought out and address our changing environment; it must be updated regularly, as with our other policies.
• Implement a VLAN or other form of network segregation with a separate WiFi signal, which is supported by most high-end small office/home office (SOHO) and nearly all enterprise WiFi access points / routers.
o The separate WiFi network would be for non-company personal and business use and, based on demand, could be split even further for increased separation.
o As an added note, we do have to ensure our access points/routers support this network segregation.

Backup

We have three different ways of mitigating our backup risks.

• Test each backup immediately after copy to ensure non-corruption and to allow time for repair of the backup system and the making of new copies.
• Transfer backup copies to an offsite storage vault outside of the regional disaster impact zone.
o We would also keep an additional copy of our most recent backup locally for recovery.
• Keep spare hardware onsite, including replacement drives and blades, and ensure technicians are trained to swap out failed parts.
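The first control, testing each backup immediately after the copy, can be sketched as a checksum comparison between the source and the backup copy. This is a minimal sketch that assumes file-level backups; the function names are hypothetical, and a commercial backup product would normally provide its own verification step.

```python
import hashlib
import shutil
import tempfile
from pathlib import Path


def sha256_of(path, chunk_size=1024 * 1024):
    """Return the SHA-256 hex digest of a file, reading it in chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def backup_is_intact(source, backup):
    """Return True when the backup file exists and its hash matches the source."""
    source, backup = Path(source), Path(backup)
    if not backup.is_file():
        return False
    return sha256_of(source) == sha256_of(backup)


if __name__ == "__main__":
    # Demonstration with throwaway files; real paths would come from the backup job.
    with tempfile.TemporaryDirectory() as workdir:
        source = Path(workdir) / "records.db"
        backup = Path(workdir) / "records.db.bak"
        source.write_bytes(b"patient records")
        shutil.copy(source, backup)
        print("Backup intact:", backup_is_intact(source, backup))
```

A failed check right after the copy leaves time to repair the backup system and make a new copy, which is the purpose of the control.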

Physical security

Physical security of our main datacenter is also an identified risk. We mitigate this as follows:

• Implement a continuity plan that includes a cloud backup of data servers to a third party vendor in Salt Lake City, UT.
• Set up a failover data server to bring data back online within minutes of ISP DNS load balancing.
o Failover data servers are hosted in a third party facility with security controls similar to those of the Duwamish-based facility.

Connectivity

Connectivity is also a risk when the datacenter to ISP link goes down.

• Engage a service, e.g. Cloudflare, to provide an always-online presence as a control.
o Cloudflare caches static pages and serves them to visitors to maintain general site availability during an outage or DDoS attack.

Availability

Availability in the event of a hardware failure, e.g. of a firewall or server, is a risk that needs a control. We have identified two controls.

• Deploy redundant architecture.
o If using a single device for a function, e.g. firewall, ensure it has redundancy built in for power, connectivity, and compute function. Also, ensure it "fails-open", which assumes a layered defense.
• Provision replacement parts as appropriate, and train staff in replacement and re-configuration procedures.

Non-segmented Network

Another risk is that our regional network is not segmented. VOIP phones are on the same network as our workstations, which presents a potential risk.

• Place VLANs on managed switches to separate VOIP traffic from workstation traffic.

Business Risks

The following are some of our business risks that directly affect our clients. These risks are particularly relevant to our medical-based business.

Patient Tracking/Medical Records/Claims Management and Customer Billing Risks

The first and most significant business risks are Patient Tracking/Medical Records, Claims Management, and Customer Billing. These risks are grouped together in our analysis since their controls are similar and intertwined.

The following controls apply to each of these three risks.

Policies and Procedures
● Establish clear policies and procedures for systems maintenance.
● Communicate with customers regarding scheduled downtime maintenance windows, preferably choosing a window that affects them the least.
● Ensure recommended patches are tested first in a development environment, then in a production environment to simulate the real environment as closely as possible.

Business Continuity Plan
● Build, test, and maintain a business continuity plan that includes a cloud backup of data servers to an offsite location in a different geographical region, e.g. to a third party vendor in Salt Lake City, UT.
● Set up failover data servers to bring data back online within minutes of ISP DNS load balancing. Failover data servers can be hosted in a third party facility with security controls similar to those of the Duwamish-based facility.

Always-Online Service
● Engage a service, e.g. Cloudflare, to provide an always-online presence. Cloudflare caches static pages and serves them to visitors to maintain general site availability during an outage or DDoS attack.

Insurance Eligibility Verification

• Ensure that our customers can generate revenue from the people they interact with, and can determine what payments are available and at what price, based on varying conditions.
o While this is a complex issue, we can use a software as a service (SaaS) solution to enable our customers’ administrative staff to improve insurance eligibility verification and meet criteria such as the Health Information Exchange Accreditation Program (HIEAP).

Appointment Scheduling

Since appointments are the first step in which our customers interact with “their” customers, the integrity of that system must remain intact, even when disconnected.

• Implement an integrated communication solution that works directly with the patient Customer Relationship Manager (CRM) to address this risk.

Prescription Drug Management

The last risk to be addressed is Prescription Drug Management, where the risk is the prescription of the wrong drug or dosage, or a lethal or harmful reaction to the drugs being prescribed.

• Use a software as a service (SaaS) solution for this risk that meets criteria such as the Health Information Exchange Accreditation Program (HIEAP).
o This solution takes into consideration all medication the patient is on and all known side effects. Knowing all known adverse drug interactions would also reduce negative reactions from multiple medications.


III. Residual Risks

We still have some high areas of concern that remain after our controls are in place. We still need to verify the effectiveness of our controls, but at first glance, there are no high probability, high severity risks.

We do have some high severity, medium probability risks. They can be grouped in the following categories.

• Software patching and overall software development is our first high severity, medium probability risk.
o This falls on the technical side and affects our customers by weakening the defense of their data through the firewall and other patch support.

The risks are obvious, but they have underlying causes. Using a root cause and corrective action analysis, we can trace this problem down to the support staff, namely our IT department, which develops, maintains, and updates this software.

While it appears to be a software problem, it is really a people problem. There are Band-Aid fixes we can apply temporarily, but a longer-term fix is needed, namely a good IT department, which is outside the scope of this paper.

We also have a high severity, low probability risk, which is another software problem: keeping up to date with the latest Microsoft Exchange Server software. The same applies to other older software. The software needs to be updated to minimize risk, and a combination of time and money is needed to remedy this issue.


IV. Plan of Action and Milestones (POA&M)

A Plan Of Action and Milestones (POA&M) is an effective tool for prioritizing and tracking issues to ensure they are resolved in a time and sequence that supports the overall strategic intent. We used this tool to track risks, identify solutions, assign responsibility, and then have a clear picture of what we had either been unable to resolve in time or did not have an effective solution to address. This helped determine our residual risk.

Some medium and high risks remain outstanding; we were not able to close them and must accept them.

The attached POA&M lists the outstanding issues with their points of contact (POCs) and remediation status. Any deviation from the plan must be approved by management, including delayed fixes or modifications of planned controls.

| Item # (Effect) | Process Function | Failure Modes | Causes | Current Risk | Actions to Reduce Failure Mode (Recommended Additional Controls) | Residual Risk |
| --- | --- | --- | --- | --- | --- | --- |
| 1 | Patching | Unpatched vulnerabilities allow attack or other server software failure. | High turnover in IT department. Insufficient documentation and tracking of updates. Heterogeneous systems = high patch diversity and pace. | High | Establish clear policy and procedures for systems maintenance. Ensure recommended patches are tested in a development environment and applied to production systems in a timely manner. | Low |
| 4 | BYOD | Employee devices introduce malware onto network | No malware detection implemented on BYOD (e.g. employee phones) | Med | Discussion and design of a BYOD policy, as well as the possible controls and their costs to reduce the increased risk introduced by BYOD. | Med |
| 10 | Change Control | Patch-management delayed | Maintenance window scheduling adversely impacted by insufficient coordination with customers and/or SLA language not supportive | High | Establish clear policy and procedures for systems maintenance. Communicate with customers regarding scheduled downtime maintenance windows. Ensure recommended patches are tested in development environment and applied to production systems in a timely manner. | Med |
| 16 | Patient Tracking | Patient database unavailable. Tracking system software not properly exchanging data between database and customer system | Servers are segmented from each other and unable to communicate with each other | High | Implement backup and failover controls specified in items 12-14 | Low |
| 20 | Customer Billing | Kangaroo system offline | The manufacturer no longer supports the antiquated operating system. Inadequate or incomplete documentation. | High | Implement backup and failover controls specified in items 12-14 | Low |
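The POA&M summary above can also be kept as structured data so that outstanding items are easy to list. The following is a minimal sketch, not the team's actual tooling: the `PoamItem` class, the `outstanding` helper, and the `Low < Med < High` ordering are assumptions introduced for illustration, while the item numbers and risk ratings are copied from the summary rows.

```python
from dataclasses import dataclass

# Assumed ordering of the qualitative ratings used in the report.
RISK_ORDER = {"Low": 0, "Med": 1, "High": 2}


@dataclass
class PoamItem:
    """One POA&M row (hypothetical structure, fields taken from the summary table)."""
    item: int
    process_function: str
    current_risk: str
    residual_risk: str


def outstanding(items, threshold="Med"):
    """Return the items whose residual risk is at or above the threshold."""
    floor = RISK_ORDER[threshold]
    return [i for i in items if RISK_ORDER[i.residual_risk] >= floor]


# Rows transcribed from the POA&M summary.
poam = [
    PoamItem(1, "Patching", "High", "Low"),
    PoamItem(4, "BYOD", "Med", "Med"),
    PoamItem(10, "Change Control", "High", "Med"),
    PoamItem(16, "Patient Tracking", "High", "Low"),
    PoamItem(20, "Customer Billing", "High", "Low"),
]

open_items = [i.item for i in outstanding(poam)]
```

With the rows above, `open_items` holds the item numbers whose residual risk is still Med or High, which are the risks the team must either keep working or formally accept.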


Acronyms

BYOD - Bring Your Own Device
CRM - Customer Relationship Management
DDoS - Distributed Denial of Service
DNS - Domain Name System
HIEAP - Health Information Exchange Accreditation Program
HR - Human Resources
IT - Information Technology
NIPS - Network Intrusion Prevention Solution
ISP - Internet Service Provider
SaaS - Software as a Service
SOHO - Small Office Home Office
VOIP - Voice over Internet Protocol
VLAN - Virtual Local Area Network
WIFI - Wireless network, a play on "Hi-Fi"

Team

Larry DeBellis - Project Manager
Carlos Cabello - Company Structure / Research Analyst
Jordan Hanna - Technical Analysis / Control Measures
Mary Marks - Business Analysis / Research Analyst / Type Editor
David L. Morse - Propose Controls / Research Analyst
Mike Whaley - Company Structure / Research Analyst
Steve Vincent - Report Documents / Business Continuity
Steve Morehouse - Final Report Documents / Research Analyst

Annex 1: Severity, Probability, & Hazard Score Key

Annex 1: Severity Key

| Effect | Severity of Effect | Ranking |
| --- | --- | --- |
| Hazardous without warning | Very high severity ranking when a potential failure mode affects business operations without warning | 10 |
| Hazardous with warning | Very high severity ranking when a potential failure mode affects business operations with warning | 9 |
| Very High | System inoperable with destructive failure without compromising safety | 8 |
| High | System inoperable with network damage | 7 |
| Moderate | System inoperable with minor damage | 6 |
| Low | System inoperable without damage | 5 |
| Very Low | System operable with significant degradation of performance | 4 |
| Minor | System operable with some degradation of performance | 3 |
| Very Minor | System operable with minimal interference | 2 |
| None | No effect | 1 |

Rating bands: High, Med, Low

Annex 1: Probability Key

| Probability of Failure | Failure Probability | Ranking |
| --- | --- | --- |
| Very High: Failure is almost inevitable | >1 in 2 | 10 |
| | 1 in 3 | 9 |
| High: Repeated failures | 1 in 8 | 8 |
| | 1 in 20 | 7 |
| Moderate: Occasional failures | 1 in 80 | 6 |
| | 1 in 400 | 5 |
| | 1 in 2,000 | 4 |
| Low: Relatively few failures | 1 in 15,000 | 3 |
| | 1 in 150,000 | 2 |
| Remote: Failure is unlikely | <1 in 1,500,000 | 1 |

Rating bands: High, Med, Low
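The Probability Key can be read as a lookup from an estimated failure rate to a 1-10 ranking. The sketch below is one possible reading, not something the report specifies: the function name `probability_rank` is hypothetical, and it assumes that a rate falling between two listed values takes the ranking of the nearest listed rate that it meets or exceeds, and that anything below 1 in 150,000 is ranked 1.

```python
# Thresholds transcribed from the Probability Key: (failure rate, ranking).
# The ">1 in 2" row (ranking 10) is handled separately below.
PROBABILITY_KEY = [
    (1 / 3, 9),
    (1 / 8, 8),
    (1 / 20, 7),
    (1 / 80, 6),
    (1 / 400, 5),
    (1 / 2_000, 4),
    (1 / 15_000, 3),
    (1 / 150_000, 2),
]


def probability_rank(failure_rate: float) -> int:
    """Map a failure rate in [0, 1] to a 1-10 probability ranking."""
    if not 0.0 <= failure_rate <= 1.0:
        raise ValueError("failure_rate must be between 0 and 1")
    if failure_rate > 1 / 2:
        return 10
    # Assumption: in-between rates round down to the nearest listed threshold.
    for threshold, rank in PROBABILITY_KEY:
        if failure_rate >= threshold:
            return rank
    return 1
```

For example, a failure expected about once in 20 opportunities maps to ranking 7, while a rate of 1 in 100 sits between the 1-in-80 and 1-in-400 rows and, under the round-down assumption, maps to ranking 5.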

Annex 1: Hazard Key

| Hazard | Severity of Hazard | Ranking |
| --- | --- | --- |
| Absolute Uncertainty | Design control cannot detect potential cause/mechanism and subsequent failure mode | 10 |
| Very Remote | Very remote chance the design control will detect potential cause/mechanism and subsequent failure mode | 9 |
| Remote | Remote chance the design control will detect potential cause/mechanism and subsequent failure mode | 8 |
| Very Low | Very low chance the design control will detect potential cause/mechanism and subsequent failure mode | 7 |
| Low | Low chance the design control will detect potential cause/mechanism and subsequent failure mode | 6 |
| Moderate | Moderate chance the design control will detect potential cause/mechanism and subsequent failure mode | 5 |
| Moderately High | Moderately high chance the design control will detect potential cause/mechanism and subsequent failure mode | 4 |
| High | High chance the design control will detect potential cause/mechanism and subsequent failure mode | 3 |
| Very High | Very high chance the design control will detect potential cause/mechanism and subsequent failure mode | 2 |
| Almost Certain | Design control will detect potential cause/mechanism and subsequent failure mode | 1 |

Rating bands: High, Med, Low
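The three keys in this annex each give a 1-10 ranking. In common FMEA practice these rankings are multiplied into a single Risk Priority Number (RPN) used to order failure modes. The report itself records qualitative High/Med/Low ratings and does not state that it computes an RPN, so the sketch below is an illustration of the general technique under that assumption; the function name and the example values are hypothetical.

```python
def risk_priority_number(severity: int, probability: int, hazard: int) -> int:
    """Multiply three 1-10 FMEA rankings into a Risk Priority Number (1-1000)."""
    for name, value in (("severity", severity),
                        ("probability", probability),
                        ("hazard", hazard)):
        if not 1 <= value <= 10:
            raise ValueError(f"{name} ranking must be between 1 and 10")
    return severity * probability * hazard


# Hypothetical rankings for a single failure mode, for illustration only.
example_rpn = risk_priority_number(severity=8, probability=7, hazard=3)
```

A higher RPN indicates a failure mode that deserves attention sooner. Because the product hides which factor drives the score, the individual rankings are usually reviewed alongside it.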

Annex 2: Risk Evaluation

FAILURE MODE AND EFFECTS ANALYSIS (FMEA)

| Item # (Effect) | Process Function | Failure Modes | Causes | Effects | Severity | Probability | Hazard Score |
| --- | --- | --- | --- | --- | --- | --- | --- |
| 1 | Patching | Unpatched vulnerabilities allow attack or other server software failure. | High turnover in IT department. Insufficient documentation and tracking of updates. Heterogeneous systems = high patch diversity and pace. | Impact to clients via datacenter unexpected downtimes resulting from poor patch management. Impacts to regional offices' servers or connections to HQ services, potential loss of data. | High | High [1] | High |
| 2 | Antiquated firewall with expired support | Expired IDS/IPS in regional offices | High turnover within IT Dept. Funds used for reinvestment in IT are reallocated for Marketing & Sales events. | HQ caught an average of 801 malware events on their perimeter devices a week. DoS attacks are becoming a common event. | High | High | Low |
| 3 | Customer runs Microsoft Exchange Server 2007 and has not upgraded to Microsoft Exchange Server 2010 | Spam filter for Exchange server not reliable and management of Exchange is a mess. Microsoft has halted support for Exchange 2007. | Increased attack risk | 23% of users click on phishing links and 11% of those are clicking on the attachments | High [2] | High [3] | Med |
| 4 | BYOD | Employee devices introduce malware onto network | No malware detection implemented on BYOD (e.g. employee phones) | Malware present on network is a threat and requires expending resources to combat, including determining extent of damage and potential loss of data | Med | High | Med [4] |
| 5 | | | Flat network (non-segmented) allows BYOD devices on same LAN as corporate devices | Malware "jumps" from BYOD devices to corporate devices on same LAN | Med | Med [5] | Med [6] |
| 6 | Data Backups | Corrupted Backup | Hardware Fails | If full-backup, then impact is one-week data loss | Med | Low | Med |
| 7 | | | | If incremental, then impact is low | Low | Low | Low |
| 8 | | Same-Location Disaster | Lack of offsite store | Current lack of an offsite store means a disaster affecting the datacenter may destroy all backups | High | Low | Low |
| 9 | | Recovery-Fails | Hardware Fails | Assumes good copy of data, recovery attempt fails, but replacing hardware will allow another recovery attempt. | Low | Low | Low |

| Item # (Effect) | Process Function | Failure Modes | Causes | Effects | Severity | Probability | Hazard Score |
| --- | --- | --- | --- | --- | --- | --- | --- |
| 10 | Change Control | Patch-management delayed | Maintenance window scheduling adversely impacted by insufficient coordination with customers and/or SLA language not supportive | Delayed patching results in potentially vulnerable systems in the datacenter | High [7] | High | High |
| 11 | Physical Security (datacenter) | Earthquake destroys datacenter | Building is in a level 4 earthquake zone, 3-story, brick, not seismically retrofitted | All hosted services offline, all hardware damaged, all data backups destroyed (assuming current status of no offsite backups) | High | Low | Med |
| | | Thieves destroy or damage servers | Server not secured in a proper manner | Stolen sensitive information can be used against our clients, resulting in reputation damage | High | Low | Med |
| 12 | Connectivity | Datacenter to ISP link goes down | Datacenter is only connected upstream via a single ISP | All hosted services offline | High | Low | Med |
| 13 | Availability | Hardware failure (firewall or server) | Architecture of datacenter; multiple single points of failure (e.g. single firewall device, single blade chassis) | Hosted services offline. Failure of the firewall or the server chassis means services are offline until manual intervention. | High | Low | Low |
| 14 | | | Service agreement; datacenter staff is not authorized to enter Kangaroo cabinet, repairs require Kangaroo staff to travel to datacenter. | Hosted services offline. Any failure within the cabinet requires Kangaroo staff to travel to site. | High | Low | Med [8] |
| 15 | Regional network not segmented. Phones on same network as workstations | Allows malicious traffic to mask itself as VOIP traffic and infect workstations. | Infrastructure grew fast at regional locations and lack of segmentation was an oversight | Potential downtime at regional office if workstations infected. | Med | Med | Med [9] |

| Item # (Effect) | Process Function | Failure Modes | Causes | Effects | Severity | Probability | Hazard Score |
| --- | --- | --- | --- | --- | --- | --- | --- |
| 16 | Patient tracking | Patient database unavailable. Tracking system software not properly exchanging data between database and customer system. | Servers are segmented from each other and unable to communicate with each other | Customers (providers) unable to conduct patient monitoring because medical records are not available. Medical files and billing and insurance records contain the most valuable patient data and are the most often and successfully targeted (55% successfully targeted). Providers at risk of malpractice due to lack of visibility into patient information. Company at risk of liability for a breach of personal financial and health information belonging to our customers and their patients. We are at risk of business disruption which adversely affects our company, our customer providers, and their patients. Adverse impact on future financial results due to the theft, destruction, loss, misappropriation or release of confidential data or IP. | High | High | High |
| 17 | Insurance eligibility verification | Kangaroo system offline | Unauthorized access to patients' records due to employee negligence, criminal activity, service disruptions, network failure | Customers (HC providers) unable to determine insurance eligibility of patients | High | Med | Med |

| Item # (Effect) | Process Function | Failure Modes | Causes | Effects | Severity | Probability | Hazard Score |
| --- | --- | --- | --- | --- | --- | --- | --- |
| 18 | Appointment scheduling | Kangaroo system offline | Antiquated operating system is outdated and no longer supported by the manufacturer. Patient kept on hold, lost call, unanswered phone. | Customers unable to schedule new appointments or modify existing appointments. Company at risk of loss of revenue due to operational and business delays. Company at risk of negative publicity resulting in reputation or brand damage with our customers, suppliers or industry peers. Adverse impact on future financial results due to the theft, destruction, loss, misappropriation or release of confidential data or IP. | Med | Med | Low |
| 19 | Claims management | Kangaroo system offline | Antiquated operating system is no longer supported by the manufacturer | Company unable to process medical claims. Company at risk of liability due to provider customer loss of revenue due to operational and business delays. Company at risk of negative publicity resulting in reputation or brand damage with our customers, suppliers or industry peers. Company at risk of operational or business delays resulting from the disruption of IS and subsequent clean-up and mitigation activities. | High | Med | Med |

| Item # (Effect) | Process Function | Failure Modes | Causes | Effects | Severity | Probability | Hazard Score |
| --- | --- | --- | --- | --- | --- | --- | --- |
| 20 | Customer billing | Kangaroo system offline | Antiquated operating system is no longer supported by the manufacturer. Inadequate or incomplete documentation. | Company unable to process customer billing for claims. Billing and insurance records contain some of the most valuable patient data and are targeted (46% successfully targeted). Company at risk of liability for a breach of personal financial and health information belonging to our customers and their patients. We are at risk of business disruption which adversely affects our company, our customer providers, and their patients. Adverse impact on future financial results due to the theft, destruction, loss, misappropriation or release of confidential data or IP. | High | High | High |
| 21 | Prescription drug management | Kangaroo system offline | Legacy systems unable to support minimum requirements | Providers are unable to enter new prescriptions for patients; unable to conduct drug interaction checking for new prescriptions. | High | Med | Med |

[1] "average time between vulnerability discovery and the release of exploit code is less than one week"; "99% of intrusions result from exploitation of known vulnerabilities or configuration errors where countermeasures were available". http://www.sans.org/reading-room/whitepapers/application/reducing-organizational-risk-virtual-patching-33589

[2] http://www.theemailadmin.com/2011/05/5-repercussions-of-a-hacked-exchange-server-account/

[3] Very high probability of compromise. This is like having Windows XP and having it face the web. Since the mail server is public facing (assuming in a DMZ at least), it is very vulnerable to attackers as an initial attack vector, and in its current state is one of the lowest hanging fruits.

[4] Depends on the current policy and Network Security Monitoring (NSM) in place, but a phone with malware is generally not used as a network pivot, and worms / viruses will usually alert on an IPS / IDS.

[5] Phone with malware is generally not used as a network pivot, and worms / viruses will usually alert on an IPS / IDS.

[6] Higher hazard than just BYOD because now we are considering it connected to the entire network, rather than just general BYOD.

[7] "average time between vulnerability discovery and the release of exploit code is less than one week"; "99% of intrusions result from exploitation of known vulnerabilities or configuration errors where countermeasures were available". http://www.sans.org/reading-room/whitepapers/application/reducing-organizational-risk-virtual-patching-33589

[8] I scored this higher as it requires staff to physically travel to the site to determine full extent. This means the "control" does not quickly or fully detect the issue.

[9] (industry standard AV actively monitoring workstations)

Annex 3: Recommended Control Measures

FAILURE MODE AND EFFECTS ANALYSIS (FMEA)

| Item # (Effect) | Process Function | Failure Modes | Causes | Actions to Reduce Failure Mode (Recommended Additional Controls) |
| --- | --- | --- | --- | --- |
| 1 | Patching | Unpatched vulnerabilities allow attack or other server software failure. | High turnover in IT department. Insufficient documentation and tracking of updates. Heterogeneous systems = high patch diversity and pace. | Establish clear policy and procedures for systems maintenance. Ensure recommended patches are tested in the development environment and applied to production systems in a timely manner. |
| 2 | Antiquated firewall with expired support | Expired IDS/IPS in regional offices | High turnover within IT Dept. Funds used for reinvestment in IT are reallocated for Marketing & Sales events. | Deploy new NIPS hardware in regional offices. Establish clear policy and procedures for systems maintenance. Ensure recommended patches are tested in the development environment and applied to production systems in a timely manner. |
| 3 | Customer runs Microsoft Exchange Server 2007 and has not upgraded to Microsoft Exchange Server 2010 | Spam filter for Exchange server not reliable and management of Exchange is a mess | Increased attack risk | Upgrade Microsoft Exchange Server 2007 to 2010 or higher. If this is not a very near-future task, then consider designing monitoring rules specifically to watch the email server. Have a contingency plan ready if the server is taken over by an attacker. Keep up to date with current attack trends, and design monitoring rules as new attacks are found in the wild. If 2007 will be kept for a reasonable amount of time (e.g. 6+ months), then it should be locked down as well as possible using well-established guidelines, such as https://web.nvd.nist.gov/view/ncp/repository/checklistDetail?id=186 |
| 4 | BYOD | Employee devices introduce malware onto network | No malware detection implemented on BYOD (e.g. employee phones) | Discussion and design of a BYOD policy, as well as the possible controls and their costs to reduce the increased risk introduced by BYOD. |
| 5 | | | Flat network (non-segmented) allows BYOD devices on same LAN as corporate devices | VLAN or other form of network segregation with separate WIFI signal (supported by most high-end SOHO and nearly all enterprise WIFI access points / routers) |
| 6 | Data Backups | Corrupted Backup | Hardware Fails | Test each backup immediately after copy to ensure it is not corrupted and to allow time for repair of the backup system and making new copies. |
| 7 | | | | Test each backup immediately after copy to ensure it is not corrupted and to allow time for repair of the backup system and making new copies. |
| 8 | | Same-Location Disaster | Lack of offsite store | Transfer backup copies to offsite storage vault outside of regional disaster impact zone. Note: keep additional copy of most recent set locally for recovery. |
| 9 | | Recovery-Fails | Hardware Fails | Have spare hardware onsite (replacement drives, blades) and ensure technicians are trained to swap out failed parts. |

| Item # (Effect) | Process Function | Failure Modes | Causes | Actions to Reduce Failure Mode (Recommended Additional Controls) |
| --- | --- | --- | --- | --- |
| 10 | Change Control | Patch-management delayed | Maintenance window scheduling adversely impacted by insufficient coordination with customers and/or SLA language not supportive | Establish clear policy and procedures for systems maintenance. Communicate with customers regarding scheduled downtime maintenance windows. Ensure recommended patches are tested in the development environment and applied to production systems in a timely manner. |
| 11 | Physical Security (datacenter) | Earthquake destroys datacenter | Building is in a level 4 earthquake zone, 3-story, brick, not seismically retrofitted | Continuity plan to include a cloud backup of data servers to Salt Lake City, UT. Failover data servers set up in facility in Salt Lake City, UT. Failover data server set up to bring data back on line within minutes of ISP DNS load balancing. Failover data servers hosted in 3rd party facility with similar security controls as the Duamish based facility. |
| 12 | Connectivity | Datacenter to ISP link goes down | Datacenter is only connected upstream via a single ISP | Engage a service (e.g. Cloudflare) to provide always-online presence. (Cloudflare caches static pages and serves them to visitors to maintain general site availability during outage or DDoS attack.) |
| 13 | Availability | Hardware failure (firewall or server) | Architecture of datacenter; multiple single points of failure (e.g. single firewall device, single blade server chassis) | Deploy redundant architecture. If using a single device for a function (e.g. firewall), ensure it has redundancy built in for power, connectivity and compute function. Also ensure it "fails-open" (this assumes layered defense). |
| 14 | | | Service agreement; datacenter staff is not authorized to enter Kangaroo cabinet, repairs require Kangaroo staff to travel to datacenter. | Position replacement parts as appropriate, and train staff in replacement and re-configuration procedures. |
| 15 | Regional network not segmented. Phones on same network as workstations | Allows malicious traffic to mask itself as VOIP traffic and infect workstations. | Infrastructure grew fast at regional locations and lack of segmentation was an oversight | VLANs on managed switches to separate VOIP traffic from workstation traffic |

| Item # (Effect) | Process Function | Failure Modes | Causes | Actions to Reduce Failure Mode (Recommended Additional Controls) |
| --- | --- | --- | --- | --- |
| 16 | Patient tracking | Patient database unavailable. Tracking system software not properly exchanging data between database and customer system. | Undetected breach, unauthorized access to patient data, employee negligence, criminal activity, service disruptions, network failure | See the list of item 16 actions following this table. |
| 17 | Insurance eligibility verification | Kangaroo system offline | Unauthorized access to patient records due to employee negligence, criminal activity, service disruptions, network failure | Backup and automatic failover in the event of systems failure to avoid interruption in service. Automatically detect malware, including email attachments, and unauthorized access. |
| 18 | Appointment scheduling | Kangaroo system offline | Unauthorized access to patient records due to employee negligence, criminal activity, service disruptions, network failure | Same actions as item #18 |

Item 16 actions:

- Backup and automatic failover in the event of systems failure to avoid interruption in service.
- Automatically detect malware, including email attachments, and unauthorized access to patient medical records.
- Implement SIEM technology to effectively detect unauthorized access to patient information.
- Hire infosec experts (direct or consulting) and train internal IT and end-user staff. The Ponemon 2015 Benchmark Study on Privacy & Security of Healthcare Data (Ponemon HC) found that most health care providers and business partners lack the software tools, staff, and expertise to detect and prevent loss or theft of patient data (Source: https://www2.idexpertscorp.com/fifth-annual-ponemon-study-on-privacy-security-incidents-of-healthcare-data).
- Audit existing policies and procedures and implement policy changes to address security gaps. Ponemon HC found that more data breaches are discovered through audits and assessments, followed by employee detections.
- Implement and enforce security policies and conduct employee security awareness training to address employee negligence, the #1 security threat facing HC organizations (Source: Ponemon HC).
- Implement, strengthen, and enforce BYOD policies to reduce compromise from smartphones and tablets. Ponemon HC found this is the first year that smartphones and tablets are the types of devices most commonly compromised or stolen. Before 2015, the primary sources of compromise were desktop and laptop computers.
- Assign top priority to protecting electronic health records. Ponemon HC research found medical records are in the top 2 types of patient data most frequently lost or stolen.
- Invest in information security technologies and engage security experts to harden systems to deter hackers.
- Offer credit monitoring for patients. Despite the risks to patients who have had their records lost or stolen, 65 percent of respondents do not offer protection services. Only 19 percent offer credit monitoring (Ponemon HC); [...] percent offer other identity monitoring.

| Item # (Effect) | Process Function | Failure Modes | Causes | Actions to Reduce Failure Mode (Recommended Additional Controls) |
| --- | --- | --- | --- | --- |
| 19 | Claims management | Kangaroo system offline | Unauthorized access to patient records due to employee negligence, criminal activity, service disruptions, network failure | Same actions as item #18 |
| 20 | Customer billing | Kangaroo system offline | Antiquated operating system is no longer supported by the manufacturer. Inadequate or incomplete documentation. | Customer billing records breach is the #1 target for HC InfoSec breach (Ponemon HC). Same actions as item #17 |
| 21 | Prescription drug management | Kangaroo system offline | Legacy systems unable to support minimum requirements | Same actions as item #18 |

Annex 4: Residual Risk Assessment

FAILURE MODE AND EFFECTS ANALYSIS (FMEA): Residual Risk (after Recommended Controls)

| Item # (Effect) | Process Function | Failure Modes | Causes | Actions to Reduce Failure Mode (Recommended Additional Controls) | Severity | Probability | Hazard Score |
| --- | --- | --- | --- | --- | --- | --- | --- |
| 1 | Patching | Unpatched vulnerabilities allow attack or other server software failure. | High turnover in IT department. Insufficient documentation and tracking of updates. Heterogeneous systems = high patch diversity and pace. | Establish clear policy and procedures for systems maintenance. Ensure recommended patches are tested in the development environment and applied to production systems in a timely manner. | High | Med | Low |
| 2 | Antiquated firewall with expired support | Expired IDS/IPS in regional offices | High turnover within IT Dept. | Deploy new Network Intrusion Prevention Solution (NIPS) hardware in regional offices. Establish clear policy and procedures for systems maintenance. Ensure recommended patches are tested in the development environment and applied to production systems in a timely manner. | High | Med | Low |
| 3 | Customer runs Microsoft Exchange Server 2007 and has not upgraded to Microsoft Exchange Server 2010 | Spam filter for Exchange server not reliable and management of Exchange is a mess | Increased attack risk | Upgrade Microsoft Exchange Server 2007 to 2010 or higher. Have a contingency plan ready if the server is taken over by an attacker. Keep up to date with current attack trends, and design monitoring rules as new attacks are found in the wild. Secure the email server by using well-established guidelines, such as https://web.nvd.nist.gov/view/ncp/repository/checklistDetail?id=186, to reduce probability significantly. | High | Low | Low |
| 4 | BYOD | Employee devices introduce malware onto network | No malware detection implemented on BYOD (e.g. employee phones) | Discussion and design of a BYOD policy, as well as the possible controls and their costs to reduce the increased risk introduced by BYOD. | Med | Med | Med |
| 5 | | | Flat network (non-segmented) allows BYOD devices on same LAN as corporate devices | VLAN or other form of network segregation with separate WIFI signal (supported by most high-end SOHO and nearly all enterprise WIFI access points / routers) | Med | Low | Low |
| 6 | Data Backups | Corrupted Backup | Hardware Fails | Test each backup immediately after copy to ensure it is not corrupted and to allow time for repair of the backup system and making new copies. | Med | Low | Low |
| 7 | | | | Test each backup immediately after copy to ensure it is not corrupted and to allow time for repair of the backup system and making new copies. | Low | Low | Low |
| 8 | | Same-Location Disaster | Lack of offsite store | Transfer backup copies to offsite storage vault outside of regional disaster impact zone. Note: keep additional copy of most recent set locally for recovery. | High | Low | Low |
| 9 | | Recovery-Fails | Hardware Fails | Have spare hardware onsite (replacement drives, blades) and ensure technicians are trained to swap out failed parts. | Low | Low | Low |
| 10 | Change Control | Patch-management delayed | Maintenance window scheduling adversely impacted by insufficient coordination with customers and/or SLA language not supportive | Establish clear policy and procedures for systems maintenance. Communicate with customers regarding scheduled downtime maintenance windows. Ensure recommended patches are tested in the development environment and applied to production systems in a timely manner. | Med | Med | Med |

| Item # (Effect) | Process Function | Failure Modes | Causes | Actions to Reduce Failure Mode (Recommended Additional Controls) | Severity | Probability | Hazard Score |
| --- | --- | --- | --- | --- | --- | --- | --- |
| 11 | Physical Security (datacenter) | Earthquake destroys datacenter | Building is in a level 4 earthquake zone, 3-story, brick, not seismically retrofitted | Continuity plan to include a cloud backup of data servers to Salt Lake City, UT. Failover data servers set up in facility in Salt Lake City, UT. Failover data server set up to bring data back on line within minutes of ISP DNS load balancing. Failover data servers hosted in 3rd party facility with similar security controls as the Duamish based facility. | High | Low | Low |
| 12 | Connectivity | Datacenter to ISP link goes down | Datacenter is only connected upstream via a single ISP | Engage a service (e.g. Cloudflare) to provide always-online presence. (Cloudflare caches static pages and serves them to visitors to maintain general site availability during outage or DDoS attack.) | High | Low | Low |
| 13 | Availability | Hardware failure (firewall or server) | Architecture of datacenter; multiple single points of failure (e.g. single firewall device, single blade server chassis) | Deploy redundant architecture. If using a single device for a function (e.g. firewall), ensure it has redundancy built in for power, connectivity and compute function. Also ensure it "fails-open" (this assumes layered defense). | High | Low | Low |
| 14 | | | Service agreement; datacenter staff is not authorized to enter Kangaroo cabinet, repairs require Kangaroo staff to travel to datacenter. | Position replacement parts as appropriate, and train staff in replacement and re-configuration procedures. | High | Low | Low |
| 15 | Regional network not segmented. Phones on same network as workstations | Allows malicious traffic to mask itself as VOIP traffic and infect workstations. | Infrastructure grew fast at regional locations and lack of segmentation was an oversight | VLANs on managed switches to separate VOIP traffic from workstation traffic | Low | Low | Low |
| 16 | Patient Tracking | Patient database unavailable. Tracking system software not properly exchanging data between database and customer system. | Servers are segmented from each other and unable to communicate with each other | Implement backup and failover controls specified in items 12-14 | High | Low | Low |
| 17 | Insurance Eligibility Verification | Kangaroo system offline | Unauthorized access to patients' records due to employee negligence, criminal activity, service disruptions, network failure | Use a software as a service (SaaS) solution that enables administrative staff to improve insurance eligibility verification and meets criteria such as the Health Information Exchange Accreditation Program (HIEAP) | Med | Med | Low |
| 18 | Appointment Scheduling | Kangaroo system offline | Antiquated operating system is outdated and no longer supported by the manufacturer. Patient kept on hold, lost call, unanswered phone. | Integrated communication solution that works directly with the Patient Customer Relationship Manager (CRM) | Med | Low | Low |

Page 30: FMEA Final Project

Item#(Effect)

ProcessFunction Actions to Reduce Failure Mode (Recommended Additional Controls)

Severity Probability Hazard Score

19 ClaimsManagement Implement backup and failover controls specified in items 12-14 med low low

20 CustomerBilling Implement backup and failover controls specified in items 12-14 high low low

21PrescriptionDrugManagement

Use a software as a service (Saas) solution for prescription drugmanagement that meets criteria such as Health Information ExchangedAccreditation Program (HIEAP) which takes into consideration allmedication the patient is on; prohibiting negative adverse reactions frommultiple medications

med low low

FAILURE MODE AND EFFECTS ANALYSIS (FMEA) Residual Risk (after Recommended Controls)

Failure Modes and Causes

Failure mode: Kangaroo system offline
Cause: Antiquated operating system is no longer supported by the

Failure mode: Kangaroo system offline
Cause: * Antiquated operating system is no longer supported by the manufacturer
* Inadequate or incomplete

Failure mode: Kangaroo system offline
Cause: Legacy systems unable to support minimum requirements


Columns: Item # (Effect); Process Function; POC; Resources Required; Scheduled Completion; Actions to Reduce Failure Mode (Recommended Additional Controls); Progress; Residual Risk

Item 1
Process Function: Patching
POC: Change Management Board
Resources Required: Inter-departmental, 80 HRS
Scheduled Completion: 30 days
Actions: Establish clear policy and procedures for systems maintenance. Ensure recommended patches are tested in the dev environment and applied to production systems in a timely manner.
Residual Risk: Low

Item 2
Process Function: Antiquated firewall with expired support
POC: CISO
Resources Required: Inter-departmental, 300 HRS
Scheduled Completion: 90 days
Actions: Deploy new Network Intrusion Prevention Solution (NIPS) hardware in regional offices. Establish clear policy and procedures for systems maintenance. Ensure recommended patches are tested in the dev environment and applied to production systems in a timely manner.
Residual Risk: Low

Item 3
Process Function: Mail Server Outdated
POC: CIO
Resources Required: IT, 120 HRS
Scheduled Completion: 60 days
Actions: Upgrade Microsoft Exchange Server 2007 to 2010 or higher. Have a contingency plan ready if the server is taken over by an attacker. Keep up to date with current attack trends, and design monitoring rules as new attacks are found in the wild. Secure the email server by using well-established guidelines, such as https://web.nvd.nist.gov/view/ncp/repository/checklistDetail?id=186, to reduce probability significantly.
Residual Risk: Low

Item 4
POC: CISO
Resources Required: Inter-departmental, 80 HRS
Scheduled Completion: 60 days
Actions: Discussion and design of a BYOD policy, as well as the possible controls and their costs to reduce the increased risk introduced by BYOD.
Residual Risk: med

Item 5
POC: CIO
Resources Required: Inter-departmental, 600 HRS
Scheduled Completion: 120 days
Actions: VLAN or other form of network segregation with separate Wi-Fi signal (supported by most high-end SOHO and nearly all enterprise Wi-Fi access points / routers)
Residual Risk: low

Item 6
POC: CIO
Resources Required: IT, 20 HRS
Scheduled Completion: 30 days
Actions: Test each backup immediately after copy to ensure it is not corrupted and to allow time for repair of the backup system and for making new copies.
Residual Risk: low

Item 7
POC: CIO
Resources Required: IT, 20 HRS
Scheduled Completion: 30 days
Actions: Test each backup immediately after copy to ensure it is not corrupted and to allow time for repair of the backup system and for making new copies.

Item 8
POC: DR Team
Resources Required: IT, 20 HRS
Scheduled Completion: 30 days
Actions: Transfer backup copies to an offsite storage vault outside of the regional disaster impact zone. Note: keep an additional copy of the most recent set locally for recovery.
Residual Risk: low

Item 9
POC: DR Team
Resources Required: IT, 60 HRS
Scheduled Completion: 60 days
Actions: Have spare hardware onsite (replacement drives, blades) and ensure technicians are trained to swap out failed parts.
Residual Risk: low

Item 10
Process Function: Change Control
POC: Change Management Board
Resources Required: Inter-departmental, 80 HRS
Scheduled Completion: 30 days
Actions: Establish clear policy and procedures for systems maintenance. Communicate with customers regarding scheduled downtime and maintenance windows. Ensure recommended patches are tested in the dev environment and applied to production systems in a timely manner.
Residual Risk: med

Item 11
Process Function: Physical Security (datacenter)
POC: DR Team
Resources Required: Inter-departmental, 80 HRS
Scheduled Completion: 60 days
Actions: Continuity plan to include a cloud backup of data servers to Salt Lake City, UT. Failover data servers set up in a facility in Salt Lake City, UT. Failover data server set up to bring data back online within minutes of ISP DNS load balancing. Failover data servers hosted in a 3rd party facility with similar security controls as the Duamish based facility.
Residual Risk: low

Item 12
Process Function: Connectivity
POC: CIO
Resources Required: IT, 40 HRS
Scheduled Completion: 15 days
Actions: Engage a service (e.g., Cloudflare) to provide always-online presence. (Cloudflare caches static pages and serves them to visitors to maintain general site availability during an outage or DDoS attack.)
Residual Risk: low

Item 13
POC: CIO
Resources Required: IT, 20 HRS
Scheduled Completion: 60 days
Actions: Deploy redundant architecture. If using a single device for a function (e.g., firewall), ensure it has redundancy built in for power, connectivity, and compute function. Also ensure it "fails-open" (this assumes layered defense).
Residual Risk: low

Item 14
POC: CIO
Resources Required: IT, 200 HRS
Scheduled Completion: 90 days
Actions: Position replacement parts as appropriate, and train staff in replacement and re-configuration procedures.
Residual Risk: low

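POAM

Failure Modes and Current Risk (in row order)

Failure mode: Unpatched vulnerabilities allow attack or other server
Current risk: High

Failure mode: Expired IDS/IPS in regional offices
Current risk: Low

Failure mode: Spam filter for Exchange server not reliable and management of Exchange is a mess
Current risk: med

Process Function: BYOD
Failure mode: Employee devices introduce malware onto network
Current risk: med [1]; med [2]

Process Function: Data Backups
Failure mode: Corrupted Backup
Current risk: med; low
Failure mode: Same-Location Disaster
Current risk: low
Failure mode: Recovery-Fails
Current risk: low

Failure mode: Patch-management delayed
Current risk: high

Failure mode: earthquake destroys datacenter
Current risk: med

Failure mode: datacenter to ISP link goes down
Current risk: med

Process Function: Availability
Failure mode: hardware failure (firewall or server)
Current risk: low; med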
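The backup controls above call for testing each backup immediately after the copy is made. One simple way to sketch that check in Python is to compare SHA-256 digests of the source and the copy. The function names `sha256_of` and `copy_and_verify` are hypothetical, and the sketch assumes file-level backups on a local or mounted filesystem, which the report does not specify.

```python
import hashlib
import shutil
from pathlib import Path


def sha256_of(path: Path, chunk_size: int = 1 << 20) -> str:
    """Return the SHA-256 hex digest of a file, read in chunks."""
    digest = hashlib.sha256()
    with path.open("rb") as fh:
        for chunk in iter(lambda: fh.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def copy_and_verify(source: Path, backup: Path) -> bool:
    """Copy source to backup, then confirm both files hash identically.

    Assumption: a matching digest is treated as "not corrupted". It does
    not prove that the backup can be restored into a working system.
    """
    shutil.copy2(source, backup)
    return sha256_of(source) == sha256_of(backup)
```

A digest match only shows that the copy is bit-identical to the source at copy time; a periodic test restore is still needed to show that recovery works.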

Annex 5: POAM


Columns: Item # (Effect); Process Function; POC; Resources Required; Scheduled Completion; Actions to Reduce Failure Mode (Recommended Additional Controls); Progress; Residual Risk

Item 15
Process Function: Regional network not segmented
POC: CIO
Resources Required: Inter-departmental, 600 HRS
Scheduled Completion: 120 days
Actions: VLANs on managed switches to separate VoIP traffic from workstation traffic
Residual Risk: low

Item 16
Process Function: Patient Tracking
POC: CIO
Resources Required: see items 12-14
Scheduled Completion: see items 12-14
Actions: Implement backup and failover controls specified in items 12-14
Residual Risk: low

Item 17
Process Function: Insurance Eligibility Verification
POC: CEO / HR
Resources Required: Interdepartmental, 300 HRS
Scheduled Completion: 120 days
Actions: Use a software as a service (SaaS) solution that enables administrative staff to improve insurance eligibility verification and that meets criteria such as the Health Information Exchange Accreditation Program (HIEAP)
Residual Risk: low

Item 18
Process Function: Appointment Scheduling
POC: CTO + BZ Mngr
Resources Required: Interdepartmental, 300 HRS
Scheduled Completion: 120 days
Actions: Integrated communication solution that works directly with the patient Customer Relationship Manager (CRM)
Residual Risk: low

Item 19
Process Function: Claims Management
POC: CTO + BZ Mngr
Resources Required: see items 12-14
Scheduled Completion: see items 12-14
Actions: Implement backup and failover controls specified in items 12-14
Residual Risk: low

Item 20
Process Function: Customer Billing
POC: CTO + BZ Mngr
Resources Required: see items 12-14
Scheduled Completion: see items 12-14
Actions: Implement backup and failover controls specified in items 12-14
Residual Risk: low

Item 21
Process Function: Prescription Drug Management
POC: CTO + BZ Mngr
Resources Required: Interdepartmental, 300 HRS
Scheduled Completion: 120 days
Actions: Use a software as a service (SaaS) solution for prescription drug management that meets criteria such as the Health Information Exchange Accreditation Program (HIEAP) and takes into consideration all medication the patient is on, to prevent adverse reactions from multiple medications
Residual Risk: low

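POAM

Failure Modes and Current Risk (in row order)

Failure mode: Allows malicious traffic to mask itself as VoIP traffic and
Current risk: Med [3]

Failure mode: Patient database unavailable. Tracking system software not properly exchanging data between
Current risk: high

Failure mode: Kangaroo system offline
Current risk: med

Failure mode: Kangaroo system
Current risk: low

Failure mode: Kangaroo system
Current risk: med

Failure mode: Kangaroo system
Current risk: high

Failure mode: Kangaroo system offline
Current risk: med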
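The segmentation control for the regional networks depends on VoIP and workstation traffic living in separate, non-overlapping address ranges. A minimal Python sketch of how an address plan could be sanity-checked before VLANs are configured is shown here. The subnets in `VLAN_PLAN` and both function names are hypothetical; the report does not specify addresses, VLAN IDs, or switch vendors.

```python
import ipaddress

# Hypothetical address plan (assumption): the report gives no subnets.
VLAN_PLAN = {
    "voip": ipaddress.ip_network("10.20.0.0/24"),
    "workstations": ipaddress.ip_network("10.30.0.0/24"),
}


def overlapping_segments(plan):
    """Return pairs of segment names whose subnets overlap."""
    names = sorted(plan)
    return [
        (a, b)
        for i, a in enumerate(names)
        for b in names[i + 1:]
        if plan[a].overlaps(plan[b])
    ]


def segment_of(address, plan):
    """Return the name of the segment containing address, or None."""
    ip = ipaddress.ip_address(address)
    for name, network in plan.items():
        if ip in network:
            return name
    return None
```

An empty result from `overlapping_segments(VLAN_PLAN)` only confirms the plan on paper; the switch and firewall configuration still has to enforce it.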


[1] Depends on the current policy and Network Security Monitoring (NSM) in place, but a phone with malware is generally not used as a network pivot, and worms / viruses will usually alert on an IPS / IDS.
[2] Higher hazard than just BYOD because now we are considering it connected to the entire network, rather than just general BYOD.
[3] (industry standard AV actively monitoring workstations)
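A POAM of this kind is usually worked in priority order. The sketch below, in Python, sorts a handful of the items above by current risk and then by scheduled completion. The class `PoamItem`, the function `prioritize`, and the ranking in `RISK_RANK` are hypothetical. Pairing each item with the current risk found at the same row position of the POAM is also an assumption, and only items whose process function is named are included.

```python
from dataclasses import dataclass

# Assumed ordering: lower rank is worked first.
RISK_RANK = {"high": 0, "med": 1, "low": 2}


@dataclass(frozen=True)
class PoamItem:
    item: int
    process_function: str
    current_risk: str
    days_to_complete: int


def prioritize(items):
    """Sort by current risk (high first), then by shortest schedule."""
    return sorted(
        items,
        key=lambda i: (RISK_RANK[i.current_risk.lower()], i.days_to_complete),
    )


# Sample rows transcribed from the POAM; the risk pairing is assumed.
items = [
    PoamItem(1, "Patching", "High", 30),
    PoamItem(2, "Antiquated firewall with expired support", "Low", 90),
    PoamItem(3, "Mail Server Outdated", "med", 60),
    PoamItem(10, "Change Control", "high", 30),
    PoamItem(12, "Connectivity", "med", 15),
]
ordered = [i.item for i in prioritize(items)]
```

Because Python's sort is stable, items with the same risk and schedule keep their original POAM order.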


Annex 6: Bibliography

2015 Data Breach Investigations Report, Verizon http://www.verizonenterprise.com/DBIR/2015/

2015 Second Annual Data Breach Industry Forecast, Experian http://www.experian.com/assets/data-breach/white-papers/2015-industry-forecast-experian.pdf?_ga=1.172114915.1943093614.1418003182

Deloitte COSO Guide, Risk Assessment in Practice, October 2012 http://www.coso.org/documents/COSOAnncsOnlineSurvy2GainInpt4Updt2IntrnlCntrlIntgratdFrmwrk%20-%20for%20merge_files/COSO-ERM%20Risk%20Assessment%20inPractice%20Thought%20Paper%20OCtober%202012.pdf

Failure Mode Effects Analysis, Create a Simple Framework To Validate FMEA Performance, Steve Pollock, Six Sigma Forum Magazine, August 2005 http://rube.asq.org/sixsigma/create-a-simple-framework-to-validate-fmea-performance.pdf

Fifth Annual Benchmark Study on Privacy & Security of Healthcare Data, Ponemon Institute, May 2015 https://www2.idexpertscorp.com/fifth-annual-ponemon-study-on-privacy-security-incidents-of-healthcare-data

iSixSigma Quick Guide to Failure Mode and Effects Analysis http://www.isixsigma.com/tools-templates/fmea/quick-guide-failure-mode-and-effects-analysis/

Institute for Safe Medication Practices, Example of a Health Care Failure Mode and Effects Analysis for IV Patient Controlled Analgesia (PCA) http://www.ismp.org/Tools/FMEAofPCA.pdf

Planning For Failure, by John Kindervag, Rick Holland, and Heidi Shey, February 11, 2015 Forrester Research https://www.forrester.com/Planning+For+Failure/fulltext/-/E-RES60564

Tips for Creating an Information Security Assessment Report, Lenny Zeltser https://zeltser.com/security-assessment-report-cheat-sheet/