
  • Control Engineering Practice 71 (2018) 96–107

    Contents lists available at ScienceDirect

    Control Engineering Practice

    journal homepage: www.elsevier.com/locate/conengprac

    Analysis and design of time-deadbands for univariate alarm systems✩

    Muhammad Shahzad Afzal a,*, Tongwen Chen a, Ali Bandehkhoda b, Iman Izadi b
    a Department of Electrical and Computer Engineering, University of Alberta, Edmonton, AB, Canada T6G 1H9
    b Department of Electrical and Computer Engineering, Isfahan University of Technology, Isfahan 84165-8311, Iran

    Article info

    Keywords: Alarm configurations; Alarm systems; Time-deadbands; Markov processes

    Abstract

    Time-deadbands (or alarm latches) are popular alarm configuration methods used in industry to improve the alarm system performance. In this paper, time-deadband based configurations for the case of univariate alarm systems are analyzed. Mathematical models are developed based on Markov processes, and analytical expressions for performance indices (the false alarm rate, missed alarm rate, and expected detection delay) are derived. Systematic design procedures are also proposed, and the utility of the methods is illustrated through design examples.

    © 2017 Elsevier Ltd. All rights reserved.

    1. Introduction

    For safe and productive operation, large-scale industrial processes are equipped with hundreds or even thousands of sensors and actuators. While this has helped in increasing the number of monitored process variables, it has also resulted in degraded alarm system performance: operators these days see many false and nuisance alarms on their monitoring screens because of poor alarm configurations, which are programmed very conveniently in Distributed Control Systems (DCS). Various surveys have shown that the performance of alarm systems in different industrial sectors is far from the benchmarks set by the two de facto standards on monitoring systems, ISA 18.2 and EEMUA 191 (EEMUA, 2007; International Society of Automation, 2009). In view of the current status of alarm systems, industry personnel have started to put considerable effort into alarm system management and rationalization. Alarm configuration methods such as filters, deadbands, and delay-timers are commonly used to reduce false and chattering alarms. In addition, some advanced techniques such as state-based alarming, logic-based alarming, and predictive alarming are also in practice (Hwang et al., 2008; Jang, Suh, Kim, Suh, & Park, 2013; Jerhotova, Sikora, & Stluka, 2012). For the last few years, researchers from academia have also been working with industry to help improve alarm systems. This collaboration has enriched the published literature on the use and design of various alarm configuration methods (Adnan, Izadi, & Chen, 2011; Cheng, Izadi, & Chen, 2011; Hugo, 2009; Simeu-Abazi, Lefebvre, & Derain, 2011). In the following subsection, different types of alarm configuration methods are described and a literature survey is provided.

    ✩ A preliminary version of this paper appeared in the Proceedings of the American Control Conference (ACC), pp. 4815–4820, Seattle, WA, USA, May 24–26, 2017.
    * Corresponding author.

    E-mail addresses: [email protected] (M.S. Afzal), [email protected] (T. Chen), [email protected] (A. Bandehkhoda), [email protected] (I. Izadi).

    1.1. Taxonomy of alarm configuration methods

    Alarm configuration methods can be broadly classified into two main types, namely, basic methods and advanced or enhanced methods (International Society of Automation, 2009). Filters, deadbands, and delay-timers can be categorized as the basic alarm configuration methods, whereas techniques like state-based alarming, predictive alarming, and logic-based alarming fall under the umbrella of advanced methods. Fig. 1 gives the taxonomy of alarm configuration methods. In this figure, vertical dots in the last tier show that there is a wide range of filters and advanced methods available, while only a few are shown here as examples.

    In the literature many papers can be found that deal with the analysis and design of different alarm configuration methods. For example, in Cheng et al. (2011) and Cheng, Izadi, and Chen (2013) the problem of designing optimal alarm filters was studied, and it was found that the log-likelihood ratio filters gave the optimal performance in terms of alarm system accuracy. Numerical optimization procedures were also proposed for linear and quadratic forms of optimal filters. In Adnan et al. (2011) the authors computed the detection delays for both on and off delay-timers, and for measurement-deadbands. A design procedure based on the Receiver Operating Characteristic (ROC) curve was also proposed. Analytical expressions for a chattering index based on the run-length distribution were computed for delay-timers and measurement-deadbands in Naghoosi, Izadi, and Chen (2011). The concept of generalized delay-timers was studied in Adnan, Cheng, Izadi, and Chen (2013), and the performance of the generalized delay-timers was also compared with the traditional on and off delay-timers. In Tan, Sun, Azad, and Chen (2017) performance indices such as the False Alarm Rate (FAR), Missed Alarm Rate (MAR), and Expected Detection Delay (EDD) were computed for rank order filters based on univariate alarm systems, and the performance was compared with other filters. In Afzal and Chen (2017) the authors studied the application of delay-timers for multimode processes. The performance indices (FAR, MAR, and EDD) were computed, and a design procedure based on particle swarm optimization was proposed.

    https://doi.org/10.1016/j.conengprac.2017.10.016
    Received 22 March 2017; Received in revised form 7 September 2017; Accepted 29 October 2017
    Available online 20 November 2017
    0967-0661/© 2017 Elsevier Ltd. All rights reserved.

    A few papers can also be found in the literature that deal with enhanced configuration methods for alarm systems. In Miao, Sforna, and Liu (1996) the authors developed a logic-based alarm system for a power distribution unit by taking into account the information from the breaker operation and the sequence of event recorders. Based on the testing results provided in the paper, the logic-based alarm system showed superior performance. A method of dynamic alarming based on online removal of chattering and repeating alarms was proposed in Wang and Chen (2014). In this method, alarm durations and the time difference between two alarms were considered in the detection of chattering and repeating alarms. In Nihlwing and Kaarstad (2012) the authors devised a state-based alarm system for a nuclear power plant simulator, and tests showed that the state-based alarm system achieved higher usability ratings than the traditional alarm system. In Lai and Chen (2015) a pattern mining based predictive alarming system was proposed for alarm floods. In this method, a multiple sequence alignment algorithm was developed, and a similarity score was used to detect the similarity of incoming alarm sequences with the mined database. Evidence theory based alarm systems were proposed in Xu, Li, Song, Wen, and Xu (2016), where fuzzy thresholds were designed and a recursive algorithm was proposed to generate alarms based on online processing of sampled process variables. Further work on advanced alarm configuration methods can be found in Hu, Wang, and Chen (2015), Jerhotova et al. (2012), Laberge, Bullemer, Tolsma, and Dal Vernon (2014) and Varga, Szeifert, and Abonyi (2009).

    1.2. Contribution of our work

    Time-deadbands, also known as alarm latches, are one of the types of deadbands used in industry, as shown in Fig. 1. A few papers on the quantitative analysis of measurement-deadbands can be found in the literature, which can help in assessing and designing alarm systems based on measurement-deadbands (Adnan et al., 2011; Naghoosi et al., 2011). Some literature also deals with time-deadband configurations: in Kondaveeti, Izadi, Shah, and Chen (2011) the authors discussed practical implications of the use of alarm latches based on qualitative analysis, and in Hugo (2009) the author proposed a time-series prediction based approach to estimate the lengths of time-deadbands. However, to the best of the authors' knowledge, no study is available that provides a quantitative analysis of the performance of time-deadband configurations and proposes systematic design procedures. Consequently, in this paper we analyze time-deadband configurations for univariate alarm systems. In the field of alarm system analysis and design, the contributions of the paper are threefold: (1) A mathematical model is developed for time-deadband configurations based on Markov processes; (2) Analytical expressions for the performance indices (the false alarm rate, missed alarm rate, and expected detection delay) are derived; (3) Design procedures based on process data and alarm data are proposed.

    1.3. Organization of the paper

    The rest of the paper is organized as follows. In Section 2, background information on the types of deadbands is given. A run-length based encoding scheme for alarm sequences is also introduced in this section. In Section 3, a model of the time-deadband configuration is developed. In Section 4, definitions of different performance indices are provided, and their expressions for time-deadbands are derived. Section 5 provides the description of design procedures for time-deadbands along with illustrative examples. In Section 6, a comparative discussion with conventional delay-timers is provided. Concluding remarks are given in Section 7.

    Fig. 1. Taxonomy of alarm configuration methods.

    2. Background

    In this section, different types of deadbands are discussed. A brief description of run-length based encoding of alarm sequences is also part of this section.

    2.1. Types of deadbands

    Two types of deadband configurations can be found in alarm systems: measurement-deadbands and time-deadbands, as shown in Fig. 1. A measurement-deadband can only be applied to a continuous process variable, and it can be considered as a bi-threshold alarm system, where an alarm is raised when the process variable goes above the upper threshold, and the alarm is cleared when the process variable falls below the lower threshold. Let $x(t)$ be the measurement of a process variable at time $t$, and $x_a(t)$ be the corresponding alarm signal; then mathematically the measurement-deadband configuration can be defined as:

    $$
    x_a(t) =
    \begin{cases}
    1, & \text{if } x_a(t-1) = 0 \text{ and } x(t) \geq \zeta_u \\
    0, & \text{if } x_a(t-1) = 1 \text{ and } x(t) < \zeta_l \\
    x_a(t-1), & \text{otherwise}
    \end{cases}
    \tag{1}
    $$

    where $\zeta_u$ and $\zeta_l$ are the upper and lower thresholds for raising and clearing an alarm, respectively. Fig. 2 shows an example of a process variable configured with a measurement-deadband. From this figure it can be seen that there are a few instances when the process variable goes above the lower threshold; however, alarms are delayed until it crosses the upper threshold. Similarly, clearance of alarms is delayed until the process variable goes below the lower threshold.
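The bi-threshold rule of Eq. (1) can be sketched in a few lines of Python. This is only an illustrative sketch: the function name `measurement_deadband`, the argument names, and the choice of a no-alarm state before the first sample are assumptions made here, not part of the paper.

```python
def measurement_deadband(x, upper, lower, initial_alarm=0):
    """Apply the bi-threshold rule of Eq. (1) to a list of samples.

    `initial_alarm` is the assumed alarm state before the first sample.
    """
    alarm = []
    prev = initial_alarm
    for value in x:
        if prev == 0 and value >= upper:
            prev = 1  # raise: the variable reached the upper threshold
        elif prev == 1 and value < lower:
            prev = 0  # clear: the variable fell below the lower threshold
        # otherwise the previous alarm state is held
        alarm.append(prev)
    return alarm


# Hypothetical samples, chosen only to exercise all three branches.
samples = [0.2, 0.6, 1.1, 0.8, 0.6, 0.4, 0.7, 1.0]
print(measurement_deadband(samples, upper=1.0, lower=0.5))
# prints [0, 0, 1, 1, 1, 0, 0, 1]
```

Note how the values 0.8 and 0.6 keep the alarm active even though they are below the upper threshold, and how 0.7 does not raise a new alarm even though it is above the lower threshold.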


    Fig. 2. Example of a measurement-deadband configuration.

    Table 1. Recommendations for time-deadband configurations.

    Variable type    Time-deadband (s)
    Flow             15
    Pressure         15
    Level            60
    Temperature      60
    Composition      120

    Time-deadbands constitute uni-threshold alarm systems. Unlike measurement-deadbands, time-deadbands can be applied to both continuous and discrete process variables. In general, a time-deadband represents two waiting times: (1) the minimum waiting-time before an alarm can be cleared; (2) the minimum waiting-time before an alarm can be raised. In this paper, these two waiting times are denoted as $T_A$ and $T_{NA}$, respectively. Whenever a process variable goes above the threshold, an alarm is raised; however, the alarm can be cleared only after the minimum waiting-time ($T_A$) in the alarm state has passed. Similarly, an alarm can be raised only if the minimum waiting-time ($T_{NA}$) in the no-alarm state has passed. Mathematically, an alarm signal $x_a(t)$ for a time-deadband configuration can be written as:

    $$
    x_a(t) =
    \begin{cases}
    1, & \text{if } x_a[t - T_{NA} : t - 1] = 0 \text{ and } x(t) \geq \zeta \\
    0, & \text{if } x_a[t - T_A : t - 1] = 1 \text{ and } x(t) < \zeta \\
    x_a(t-1), & \text{otherwise}
    \end{cases}
    \tag{2}
    $$

    where $x(t)$ is the underlying process variable, $\zeta$ is the alarm threshold, and $x_a[t_1 : t_2] = c$ denotes that the alarm signal equals $c$ at every sample from $t_1$ to $t_2$. In Fig. 3 an example of a time-deadband configuration on a continuous process variable is shown. In this example the waiting times are configured to be 5 sample-times for both the alarm and no-alarm states. From this figure it can be seen that the first alarm is raised at the time instance marked as $x_1$. Although the variable goes below the threshold ($x_2$) after a few samples, the alarm is not cleared because of the configured waiting-time in the alarm state. Similarly, at the time instance $y_1$ the process variable goes above the threshold again; however, the raising of an alarm is delayed until $y_2$ because of the waiting-time in the no-alarm state.
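A minimal sketch of the time-deadband rule of Eq. (2) follows, written as a counter of how long the alarm signal has stayed in its current state. The function name `time_deadband` and the start-up convention (the no-alarm state is assumed to have already lasted at least $T_{NA}$ samples before the first sample) are assumptions made for illustration; Eq. (2) itself does not specify the start-up behavior.

```python
def time_deadband(x, threshold, t_a, t_na):
    """Alarm sequence under Eq. (2) for samples `x` and a high threshold."""
    alarm = []
    state = 0
    run = t_na  # assumed: no-alarm has already lasted t_na samples at start-up
    for value in x:
        # `run` counts the preceding consecutive samples spent in `state`.
        if state == 0 and run >= t_na and value >= threshold:
            state, run = 1, 0  # raise: no-alarm waiting-time has passed
        elif state == 1 and run >= t_a and value < threshold:
            state, run = 0, 0  # clear: alarm waiting-time has passed
        alarm.append(state)
        run += 1
    return alarm


# Hypothetical data: compare with a plain threshold check.
x = [0, 2, 0, 0, 0, 2, 2, 2, 0]
print([int(v >= 1.0) for v in x])                  # [0, 1, 0, 0, 0, 1, 1, 1, 0]
print(time_deadband(x, 1.0, t_a=3, t_na=2))        # [0, 1, 1, 1, 0, 0, 1, 1, 1]
```

In the second output the first alarm is latched for three samples before it clears, and the second alarm is raised one sample late because the no-alarm waiting-time had not yet elapsed.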

    Recommendations on the use of both measurement-deadbands and time-deadbands can be found in the ISA 18.2 and EEMUA 191 standards (EEMUA, 2007; International Society of Automation, 2009). For the benefit of readers, recommendations for time-deadbands based on the types of process variables are listed in Table 1.

    It is worth mentioning here that these recommendations are only used as initial estimates for designing time-deadbands. Actual values for the waiting times are typically decided by considering the criticality of the process variable and the signal-to-noise ratio of the measurements.

    Fig. 3. Example of a time-deadband configuration.

    2.2. Run-length encoding of alarm sequences

    An alarm sequence is a series of ones and zeros, which is generated by comparing the measurements of a process variable against a configured threshold. The design of different types of alarm configurations on a process variable can make use of either actual process measurements or the corresponding alarm sequence. In this paper both design scenarios for time-deadbands are considered, and for the case of alarm sequence based design, run-length based encoding schemes of alarm sequences are utilized.

    Run-length encoding is a type of lossy or lossless data compression, in which runs (specific sequences of elements occurring throughout a dataset) of the data are used to represent the entire dataset (Kotz & Johnson, 1988). Run-length based encoding has applications in many fields, such as image and signal processing, statistical control, and finance. For the case of alarm signals, runs contain sequences of 1's (alarm state) and 0's (no-alarm state), and depending on the definition of the run many types of run-length encoding schemes can be defined. In the following, a few of these types are described with the help of an example of an alarm sequence, shown in Fig. 4. For illustration purposes, the length of each alarm and no-alarm state is also indicated in the figure.

    RTN–RTN run-length based encoding

    A Return-to-Normal (RTN) point in an alarm sequence is a time instance at which the alarm sequence returns to a no-alarm state from an alarm state, as shown in Fig. 4. A lossy run-length encoding scheme for an alarm sequence can be generated by considering the length from one RTN point to the next RTN point to be a run. In other words, for the RTN–RTN run-length encoding scheme, a sequence of 0's for the no-alarm state and the following 1's for the alarm state form a run, and it is assumed that the alarm sequence starts in a no-alarm state. In cases where this assumption does not hold, the 1's corresponding to the first alarm state are discarded, as is the case for the example shown in Fig. 4. For this alarm sequence, the first RTN–RTN run is formed by summing up the lengths of the first no-alarm state and the second alarm state, i.e., 1 + 2 = 3. Overall, the RTN–RTN run-length encoding will be: RTN–RTN = {3, 5, 3, 2, 6, 3}.

    ALM–ALM run-length based encoding

    An Alarm (ALM) point in an alarm sequence is a time instance at which the alarm sequence jumps to an alarm state from a no-alarm state. Another lossy encoding scheme for an alarm sequence can be generated by considering the length between two consecutive ALM points in the alarm sequence to be a run. In particular, the sum of the lengths of an alarm state and the following no-alarm state constitutes a run. For the example shown in Fig. 4, the ALM–ALM encoding scheme will be: ALM–ALM = {3, 4, 4, 3, 3, 5}.

    Fig. 4. Time trend of an alarm sequence.

    Fig. 5. Normal and abnormal distributions of a process variable.

    Lossless run-length encoding

    A Lossless Run-Length Encoding (LRLE) scheme for an alarm sequence can be generated by considering the lengths of alarm and no-alarm states separately and in a time-synchronous manner. Unlike the RTN–RTN and ALM–ALM encoding schemes, alarm sequences represented using LRLE can be reconstructed without any errors. The LRLE based representation of the considered example will be: LRLE = {2, 1, 2, 2, 3, 1, 2, 1, 1, 2, 4, 1, 2}. The usage of run-length encoding schemes for designing time-deadbands is illustrated later in Section 5.2.
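The three encodings can be reproduced with a short Python sketch. The alarm sequence below is reconstructed from the LRLE values quoted in the text, assuming the sequence of Fig. 4 starts in an alarm state (as the RTN–RTN description implies). The function names are illustrative, and two conventions are assumptions made here: a trailing run without a partner is dropped, and a leading no-alarm run is skipped in the ALM–ALM encoding.

```python
from itertools import groupby


def run_lengths(alarm):
    """Return (state, length) pairs for consecutive runs of 0's and 1's."""
    return [(state, len(list(group))) for state, group in groupby(alarm)]


def lrle(alarm):
    """Lossless encoding: the length of every run, in time order."""
    return [length for _, length in run_lengths(alarm)]


def _paired_sums(runs):
    # Sum each run with the run that follows it; an unpaired tail is dropped.
    return [runs[i][1] + runs[i + 1][1] for i in range(0, len(runs) - 1, 2)]


def rtn_rtn(alarm):
    """No-alarm run plus the following alarm run (leading alarm discarded)."""
    runs = run_lengths(alarm)
    if runs and runs[0][0] == 1:
        runs = runs[1:]
    return _paired_sums(runs)


def alm_alm(alarm):
    """Alarm run plus the following no-alarm run (leading no-alarm skipped)."""
    runs = run_lengths(alarm)
    if runs and runs[0][0] == 0:
        runs = runs[1:]
    return _paired_sums(runs)


# Rebuild the example sequence: even-indexed runs are alarm states.
lengths = [2, 1, 2, 2, 3, 1, 2, 1, 1, 2, 4, 1, 2]
sequence = []
for i, n in enumerate(lengths):
    sequence += [1 - i % 2] * n

print(lrle(sequence))     # [2, 1, 2, 2, 3, 1, 2, 1, 1, 2, 4, 1, 2]
print(rtn_rtn(sequence))  # [3, 5, 3, 2, 6, 3]
print(alm_alm(sequence))  # [3, 4, 4, 3, 3, 5]
```

The outputs match the three sets given in the text. The lossy encodings lose the split between alarm and no-alarm durations within each run, which is why only LRLE allows the sequence to be rebuilt.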

    3. Modeling of time-deadbands

    Consider a process variable $x$ following a distribution $P_n(x)$ during normal operation, and $P_{ab}(x)$ during abnormal operation of the process, as shown in Fig. 5. Without loss of generality, assume that only a high alarm threshold is configured on this variable. Let $p_2 = 1 - p_1$ be the probability of the process variable going above the threshold during normal operation (shaded region under the normal distribution), and $q_1 = 1 - q_2$ be the probability of the process variable falling below the threshold during abnormal operation (shaded region under the abnormal distribution of the process variable). Further, let $T_A$ sample-times be the minimum waiting-time before an alarm can be cleared, and $T_{NA}$ be the minimum waiting-time in the no-alarm state before an alarm can be raised. Then such an alarm configuration can be defined completely using a semi-Markov process. A semi-Markov process is an extension of standard Markov processes in which, in addition to modeling the state transition probabilities, the time spent in each state (resting time) is also captured (Ghosh, 2012). For the aforementioned time-deadband configuration, a semi-Markov process with two states, namely, the alarm state (A) and the no-alarm state (NA), is defined. Transition probabilities between the two states are inferred from the probability distributions of the process variable, and the resting time of each state depends on both the probability distributions and the time-deadband configuration. For the normal operation of the process variable, the following set of equations completely describes the time-deadband configuration:

    Fig. 6. A sample-path of a semi-Markov chain for a time-deadband configuration.

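    $$
    \begin{aligned}
    \mathrm{P}\left[ S_{[t} = \mathrm{A} \mid S_{[t-d:t-1]} = \mathrm{A} \right] &=
    \begin{cases} p_2, & d > T_A \\ 1, & d \leq T_A \end{cases} \\
    \mathrm{P}\left[ S_{[t} = \mathrm{NA} \mid S_{[t-d:t-1]} = \mathrm{A} \right] &=
    \begin{cases} p_1, & d > T_A \\ 0, & d \leq T_A \end{cases} \\
    \mathrm{P}\left[ S_{[t} = \mathrm{A} \mid S_{[t-d:t-1]} = \mathrm{NA} \right] &=
    \begin{cases} p_2, & d > T_{NA} \\ 0, & d \leq T_{NA} \end{cases} \\
    \mathrm{P}\left[ S_{[t} = \mathrm{NA} \mid S_{[t-d:t-1]} = \mathrm{NA} \right] &=
    \begin{cases} p_1, & d > T_{NA} \\ 1, & d \leq T_{NA} \end{cases}
    \end{aligned}
    \tag{3}
    $$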

    where the notation $S_{[t_1:t_2]}$ represents the status of the state during the time interval $t_1$ to $t_2$ (end points included), and $S_{[t}$ represents the status of the state starting from time $t$. For example, $\mathrm{P}\left[ S_{[t} = \mathrm{A} \mid S_{[t-d:t-1]} = \mathrm{NA} \right]$ represents the probability that the process is in state A starting from time $t$, given that the process was in the NA state during the time interval $[t-d : t-1]$. In general, $\mathrm{P}\left[ S_{[t} = i \mid S_{[t-d:t-1]} = j \right]$ represents the probability of the model switching to state $i \in \{\mathrm{A}, \mathrm{NA}\}$ at time $t$, given that the state of the model was $j \in \{\mathrm{A}, \mathrm{NA}\}$ during the time interval $[t-d : t-1]$. A similar set of equations can be written for the case when the process is operating under abnormal conditions, by replacing $p_1$ and $p_2$ with $q_1$ and $q_2$, respectively. Fig. 6 shows an example of the sample path of a semi-Markov process based model with the time-deadband configuration ($T_{NA} = 2$, $T_A = 3$). In this example the probabilities of switching between the alarm and no-alarm states are assumed to be $p_1 = 0.8$ and $p_2 = 0.2$. Waiting times in both the alarm and no-alarm states are represented as horizontal lines of probability equal to 1, and the length of each line corresponds to the amount of waiting-time in the respective state. Once the waiting time has passed, the probability of switching to a different state depends on the underlying process variable distribution.

    While a semi-Markov process can completely define the time-deadband configuration, for analysis purposes and to derive performance indices it is necessary to study the long-term behavior of the model, which is not a trivial task for semi-Markov process based models. Fortunately, for the case of time-deadbands, it is possible to convert the semi-Markov model to a standard Markov model by introducing non-self-transitioning states corresponding to the lengths of the waiting times for both the alarm and no-alarm states. Fig. 7 shows the resultant Markov chain based model of the semi-Markov process shown in Fig. 6. Three non-self-transitioning states (A1, A2, A3) correspond to the waiting-time in the alarm state ($T_A = 3$), and two non-self-transitioning states (NA1, NA2) represent $T_{NA} = 2$. A more generic standard Markov model for the time-deadband configuration ($T_A$, $T_{NA}$) under the normal operation of the process is shown in Fig. 8. A similar model can be obtained for the abnormal situation by replacing $p_1$ and $p_2$ with $q_1$ and $q_2$, respectively.

    Fig. 7. The Markov chain model of a time-deadband configuration ($T_A = 3$, $T_{NA} = 2$).

    Fig. 8. The Markov chain model of a time-deadband configuration ($T_A$, $T_{NA}$).

    3.1. Long-term behavior of the model

    The performance of time-deadband configurations can be assessed by studying the long-term behavior of the Markov chain model. Under the assumption that the samples from the process variable are independent and identically distributed, the long-term behavior of the model can be studied by finding the stationary distribution of the model. For the proposed Markov model (Fig. 8) the state transition matrix can be written as:

    $$
    P_n =
    \begin{bmatrix}
    P_{11} & P_{12} \\
    P_{21} & P_{22}
    \end{bmatrix}_{(\alpha+\beta) \times (\alpha+\beta)}
    \tag{4}
    $$

    where $\alpha = 1 + T_{NA}$, $\beta = 1 + T_A$, and $P_{ij}$ ($1 \leq i, j \leq 2$) are sub-matrices, which are given by:

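    $$
    P_{11} =
    \begin{bmatrix}
    0 & 1 & 0 & \cdots & 0 \\
    0 & 0 & 1 & \cdots & 0 \\
    \vdots & \vdots & \vdots & \ddots & \vdots \\
    0 & 0 & 0 & \cdots & 1 \\
    0 & 0 & 0 & \cdots & p_1
    \end{bmatrix}_{\alpha \times \alpha}
    \quad
    P_{12} =
    \begin{bmatrix}
    0 & 0 & 0 & \cdots & 0 \\
    0 & 0 & 0 & \cdots & 0 \\
    \vdots & \vdots & \vdots & \ddots & \vdots \\
    0 & 0 & 0 & \cdots & 0 \\
    p_2 & 0 & 0 & \cdots & 0
    \end{bmatrix}_{\alpha \times \beta}
    \tag{5}
    $$

    $$
    P_{21} =
    \begin{bmatrix}
    0 & 0 & 0 & \cdots & 0 \\
    0 & 0 & 0 & \cdots & 0 \\
    \vdots & \vdots & \vdots & \ddots & \vdots \\
    0 & 0 & 0 & \cdots & 0 \\
    p_1 & 0 & 0 & \cdots & 0
    \end{bmatrix}_{\beta \times \alpha}
    \quad
    P_{22} =
    \begin{bmatrix}
    0 & 1 & 0 & \cdots & 0 \\
    0 & 0 & 1 & \cdots & 0 \\
    \vdots & \vdots & \vdots & \ddots & \vdots \\
    0 & 0 & 0 & \cdots & 1 \\
    0 & 0 & 0 & \cdots & p_2
    \end{bmatrix}_{\beta \times \beta}.
    \tag{6}
    $$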

    A similar state transition matrix ($P_{ab}$) can be obtained by replacing $p_1$ and $p_2$ with $q_1$ and $q_2$, respectively. The stationary distribution of the model can be found using the following equation (Zucchini & MacDonald, 2009):

    $$
    \pi_n = \mathbf{1}_{\alpha+\beta} \left( \mathbf{I} - P_n + \mathbf{U} \right)^{-1}_{(\alpha+\beta) \times (\alpha+\beta)}
    \tag{7}
    $$

    where $\mathbf{1}$ represents a row vector of ones, $\mathbf{I}$ is an identity matrix, and $\mathbf{U}$ is a square matrix of ones. This results in the following:

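    $$
    \pi_n =
    \begin{bmatrix}
    \delta_1 & \cdots & \delta_{\alpha-1} & \delta_{\alpha} & \delta_{\alpha+1} & \cdots & \delta_{\alpha+\beta-1} & \delta_{\alpha+\beta}
    \end{bmatrix}
    \tag{8}
    $$

    where $\delta_k$ ($1 \leq k \leq \alpha + \beta$) represents the probability of the $k$th state in the long-term run of the model. After some simplifications the stationary distribution of the model reduces to the following vector: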

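    $$
    \pi_n =
    \frac{
    \begin{bmatrix}
    1 & \cdots & 1 & \dfrac{1}{p_2} & 1 & \cdots & 1 & \dfrac{1}{p_1}
    \end{bmatrix}
    }{T_A + T_{NA} + \dfrac{1}{p_1} + \dfrac{1}{p_2}}.
    \tag{9}
    $$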

    A similar distribution of the model can be written for the abnormal operation of the process variable:

    $$
    \pi_{ab} =
    \frac{
    \begin{bmatrix}
    1 & \cdots & 1 & \dfrac{1}{q_2} & 1 & \cdots & 1 & \dfrac{1}{q_1}
    \end{bmatrix}
    }{T_A + T_{NA} + \dfrac{1}{q_1} + \dfrac{1}{q_2}}.
    \tag{10}
    $$

    In the long run of the process, the stationary distributions of the model can be used to find the probability that the process variable is in one of the alarm or no-alarm states.
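As a cross-check of Eqs. (4) to (9), the sketch below builds the transition matrix for given $p_1$, $T_A$, and $T_{NA}$, solves Eq. (7) with exact rational arithmetic, and compares the result with the closed form of Eq. (9). The function names, the state ordering (NA$_1$, ..., NA$_{T_{NA}}$, NA, A$_1$, ..., A$_{T_A}$, A), and the use of Gauss-Jordan elimination in place of an explicit matrix inverse are choices made here for illustration only.

```python
from fractions import Fraction


def transition_matrix(p1, t_a, t_na):
    """P_n of Eq. (4); assumed order: NA_1..NA_TNA, NA, A_1..A_TA, A."""
    p2 = 1 - p1
    alpha, beta = 1 + t_na, 1 + t_a
    n = alpha + beta
    P = [[Fraction(0)] * n for _ in range(n)]
    for i in range(alpha - 1):
        P[i][i + 1] = Fraction(1)      # forced waiting in no-alarm
    P[alpha - 1][alpha - 1] = p1       # NA stays in NA
    P[alpha - 1][alpha] = p2           # NA moves to the first alarm state
    for i in range(alpha, n - 1):
        P[i][i + 1] = Fraction(1)      # forced waiting in alarm
    P[n - 1][n - 1] = p2               # A stays in A
    P[n - 1][0] = p1                   # A moves to the first no-alarm state
    return P


def stationary(P):
    """Solve pi (I - P + U) = 1, i.e. Eq. (7), by Gauss-Jordan elimination."""
    n = len(P)
    # Row i of the transposed system: sum_j M[j][i] * pi_j = 1.
    aug = [[(1 if i == j else 0) - P[j][i] + 1 for j in range(n)] + [Fraction(1)]
           for i in range(n)]
    for col in range(n):
        pivot = next(r for r in range(col, n) if aug[r][col] != 0)
        aug[col], aug[pivot] = aug[pivot], aug[col]
        scale = aug[col][col]
        aug[col] = [v / scale for v in aug[col]]
        for r in range(n):
            if r != col and aug[r][col] != 0:
                factor = aug[r][col]
                aug[r] = [a - factor * b for a, b in zip(aug[r], aug[col])]
    return [row[n] for row in aug]


def closed_form(p1, t_a, t_na):
    """Stationary vector of Eq. (9)."""
    p2 = 1 - p1
    weights = [Fraction(1)] * t_na + [1 / p2] + [Fraction(1)] * t_a + [1 / p1]
    total = sum(weights)
    return [w / total for w in weights]


p1 = Fraction(4, 5)  # p1 = 0.8, p2 = 0.2, as in the example of Fig. 6
pi_numeric = stationary(transition_matrix(p1, t_a=3, t_na=2))
print(pi_numeric == closed_form(p1, t_a=3, t_na=2))  # True
```

Using `Fraction` avoids floating-point tolerance questions, so the comparison with Eq. (9) is exact for this configuration.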

    3.2. Discussion on the assumptions

    While developing the model a number of assumptions were made. In this subsection the impact of these assumptions for practical cases is discussed.

    Distributions of the process data

    It is assumed that the distributions of the process data are known for both the normal and abnormal operations. However, in many cases it is not possible to know the distributions of the process data beforehand. In such cases, a Kernel Density Estimation (KDE) based approach can be used to find the estimates of the distributions, given that sufficient historical data is available. Another hurdle in finding the estimates for the normal and abnormal distributions is distinguishing between the normal and abnormal data in the historical data. This problem can be overcome either by referring to the event logs of the process operation or by using data based methods to distinguish the abnormal data from normal data; e.g., in Yu, Zhu, Wang, and Zhao (2017) the authors proposed a correlation directions based method for abnormal data detection.
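One way the KDE step might look in practice is sketched below: with a Gaussian kernel, the probability mass above the threshold has a closed form, so $p_2$ (from normal data) and $q_2$ (from abnormal data) can be estimated without numerical integration. The function name `kde_exceedance`, the Gaussian kernel, and the normal-reference rule-of-thumb bandwidth are all assumptions made for illustration; the paper does not prescribe a particular kernel or bandwidth.

```python
from statistics import NormalDist, stdev


def kde_exceedance(samples, threshold, bandwidth=None):
    """Estimate P(x >= threshold) from a Gaussian-kernel density estimate."""
    n = len(samples)
    if bandwidth is None:
        # Rule-of-thumb bandwidth (an assumed default, not from the paper).
        bandwidth = 1.06 * stdev(samples) * n ** (-0.2)
    kernel = NormalDist()
    # Each kernel centered at a sample contributes its tail mass above the threshold.
    tail = sum(1.0 - kernel.cdf((threshold - s) / bandwidth) for s in samples)
    return tail / n


# Hypothetical historical data, symmetric around zero.
normal_data = [-2.0, -1.0, 0.0, 1.0, 2.0]
p2_hat = kde_exceedance(normal_data, threshold=1.5)
p1_hat = 1.0 - p2_hat
print(p1_hat, p2_hat)
```

Applying the same function to abnormal data gives an estimate of $q_2$, with $q_1 = 1 - q_2$. The quality of these estimates depends on how well the normal and abnormal segments have been separated beforehand.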

    Existence of the stationary distributions

    To study the long-term behavior of the time-deadband configuration, it is assumed that the samples of the process variable during both the normal and abnormal operation are independent and identically distributed. However, in practice this assumption does not hold true when considering the entire historical data of the process variable, because such data includes many transitions due to mode changes and various other factors pertaining to the process operation. If only part of the data is considered, by eliminating the transitional changes, an identical distribution for the process variable can be assumed (Basseville & Nikiforov, 1993). Furthermore, for the case of univariate alarm systems, the underlying assumption is that the process variable under consideration is independent of the effects of the other variables in the process.


    4. Performance assessment

    For performance assessment of time-deadband configurations, three performance indices, the false alarm rate, the missed alarm rate, and the expected detection delay, are considered.

    4.1. False alarm rate

    The false alarm rate is the probability of raising an alarm while the process is operating under normal conditions. For a univariate alarm system without any configuration installed, the false alarm rate is the probability that the process variable goes above the threshold under the normal distribution $P_n(x)$:

    \[
    \mathrm{FAR} = \int_{\zeta}^{\infty} P_n(x)\,dx \tag{11}
    \]

    where $\zeta$ represents the alarm threshold. With time-deadband configurations, the false alarm rate can no longer be computed by considering only the normal distribution of the process variable; the length of $T_A$ must also be taken into account. Therefore, the false alarm rate for a time-deadband configuration is calculated by summing the probabilities of all the alarm states ($A_1, A_2, \ldots, A_{T_A}, A$) in the Markov model developed in Section 3, i.e.

    \[
    \mathrm{FAR} = \mathrm{P}(A_1) + \mathrm{P}(A_2) + \cdots + \mathrm{P}(A_{T_A}) + \mathrm{P}(A). \tag{12}
    \]

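    The probabilities of all the alarm states can be found from the stationary distribution ($\pi_n$), given by (9), and after a few simplifications the following expression for the false alarm rate is obtained:

    \[
    \mathrm{FAR} = \frac{\overbrace{1 + 1 + \cdots + 1}^{T_A} + \dfrac{1}{p_1}}{T_A + T_{NA} + \dfrac{1}{p_1} + \dfrac{1}{p_2}} \tag{13}
    \]

    \[
    \phantom{\mathrm{FAR}} = \frac{T_A + \dfrac{1}{p_1}}{T_A + T_{NA} + \dfrac{1}{p_1} + \dfrac{1}{p_2}}. \tag{14}
    \]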

    4.2. Missed alarm rate

    The missed alarm rate is the probability that an alarm system fails to raise an alarm while the process is in the abnormal mode. Theoretically, for a univariate alarm system without any configuration installed, this probability can be calculated as follows:

    \[
    \mathrm{MAR} = \int_{-\infty}^{\zeta} P_{ab}(x)\,dx \tag{15}
    \]

    where $P_{ab}(x)$ is the distribution of the process variable under abnormal conditions. The missed alarm probability for the time-deadband configuration can be calculated by summing the probabilities corresponding to the no-alarm states ($NA_1, NA_2, \ldots, NA_{T_{NA}}, NA$), i.e.

    \[
    \mathrm{MAR} = \mathrm{P}(NA_1) + \mathrm{P}(NA_2) + \cdots + \mathrm{P}(NA_{T_{NA}}) + \mathrm{P}(NA). \tag{16}
    \]

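    After plugging in the probabilities of the no-alarm states from (10), the following analytical expression for the missed alarm rate is obtained:

    \[
    \mathrm{MAR} = \frac{\overbrace{1 + 1 + \cdots + 1}^{T_{NA}} + \dfrac{1}{q_2}}{T_A + T_{NA} + \dfrac{1}{q_1} + \dfrac{1}{q_2}} \tag{17}
    \]

    \[
    \phantom{\mathrm{MAR}} = \frac{T_{NA} + \dfrac{1}{q_2}}{T_A + T_{NA} + \dfrac{1}{q_1} + \dfrac{1}{q_2}}. \tag{18}
    \]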
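The closed forms (14) and (18) are simple enough to evaluate directly. The sketch below does so for the Gaussian example used later in the paper (normal data with mean 0, abnormal data with mean 2, unit standard deviation). It assumes that $p_1$ and $q_1$ are the probabilities of a sample falling below the threshold under the normal and abnormal distributions, with $p_2 = 1 - p_1$ and $q_2 = 1 - q_1$; the definitions themselves belong to Section 3, and this reading is an inference that is consistent with the values in Table 2. All function names are illustrative.

```python
import math

def gaussian_cdf(x, mean=0.0, std=1.0):
    """Gaussian CDF written with the error function."""
    return 0.5 * (1.0 + math.erf((x - mean) / (std * math.sqrt(2.0))))

def far_time_deadband(t_a, t_na, p1, p2):
    """False alarm rate of a time-deadband configuration, Eq. (14)."""
    return (t_a + 1.0 / p1) / (t_a + t_na + 1.0 / p1 + 1.0 / p2)

def mar_time_deadband(t_a, t_na, q1, q2):
    """Missed alarm rate of a time-deadband configuration, Eq. (18)."""
    return (t_na + 1.0 / q2) / (t_a + t_na + 1.0 / q1 + 1.0 / q2)

def far_mar_gaussian(t_a, t_na, threshold, mean_n=0.0, mean_ab=2.0, std=1.0):
    """Evaluate (14) and (18) for Gaussian normal and abnormal data."""
    p1 = gaussian_cdf(threshold, mean_n, std)    # assumed meaning: P_n(x < threshold)
    q1 = gaussian_cdf(threshold, mean_ab, std)   # assumed meaning: P_ab(x < threshold)
    return (far_time_deadband(t_a, t_na, p1, 1.0 - p1),
            mar_time_deadband(t_a, t_na, q1, 1.0 - q1))

for t_a, t_na, threshold in [(0, 0, 1.0), (15, 2, 2.4), (65, 1, 3.0)]:
    far, mar = far_mar_gaussian(t_a, t_na, threshold)
    print(f"T_A={t_a:2d} T_NA={t_na:2d} threshold={threshold:.2f} "
          f"FAR={100 * far:.2f}% MAR={100 * mar:.2f}%")
```

With $T_A = T_{NA} = 0$ both expressions collapse to the plain tail probabilities of (11) and (15), which gives a quick sanity check on the formulas.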

    4.3. Expected detection delay

    The detection delay is defined as the time taken by the alarm system to raise an alarm after the process has entered the abnormal region of operation. The mean value of the detection delay is termed the expected detection delay. Let $t_{ab}$ be the time instant at which the process enters the abnormal region, and let $t_a$ be the time at which an alarm is raised; then the detection delay, in number of time samples, can be defined as:

    \[
    \text{Detection delay} = t_a - t_{ab}. \tag{19}
    \]

    For the Markov process based model, the detection delay can be defined as the number of time samples taken by the Markov chain to switch from the no-alarm states ($NA_1, NA_2, \ldots, NA_{T_{NA}}, NA$) to any of the alarm states ($A_1, A_2, \ldots, A_{T_A}, A$), which is known as the hitting time of the Markov chain (Lawler, 2006). The probability of a hitting time of $z$ samples for the developed model can be found as:

    \[
    \mathrm{P}(\text{detection delay} = z) = \pi_n P_{ab} \bar{P}^{\,z} \big[\, \overbrace{0 \;\cdots\; 0}^{T_{NA}+1} \;\; \underbrace{1 \;\cdots\; 1}_{T_A+1} \,\big]^{T} \tag{20}
    \]

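    where $\bar{P}$ is a square matrix of size $\alpha + \beta$, obtained from $P_{ab}$ by replacing all the transition probabilities corresponding to the alarm states with zeros. The expected detection delay can be found by taking the mean value of the hitting time over the range of delays $z \in [0, \infty)$:

    \[
    \mathrm{EDD} = \sum_{z=0}^{\infty} z\, \mathrm{P}(\text{detection delay} = z)
    = \pi_n P_{ab} \left( \sum_{z=0}^{\infty} z\, \bar{P}^{\,z} \right) \big[\, \overbrace{0 \;\cdots\; 0}^{T_{NA}+1} \;\; \underbrace{1 \;\cdots\; 1}_{T_A+1} \,\big]^{T}. \tag{21}
    \]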

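    Since $\bar{P}$ is a sub-stochastic matrix with all eigenvalues strictly less than 1 in magnitude, $\sum_{z=0}^{\infty} z\, \bar{P}^{\,z}$ converges to $\bar{P}(I - \bar{P})^{-2}$ (Lawler, 2006), and the following expression for EDD is obtained:

    \[
    \mathrm{EDD} = \pi_n P_{ab} \bar{P} (I - \bar{P})^{-2} \big[\, 0 \;\cdots\; 0 \;\; 1 \;\cdots\; 1 \,\big]^{T}. \tag{22}
    \]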
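The hitting-time sum behind (21) and (22) can also be evaluated numerically by propagating the probability mass step by step instead of inverting a matrix. The sketch below does this for the Gaussian example (threshold 2.4, $T_A = 15$, $T_{NA} = 2$). It rests on several assumptions that are not spelled out in this section: the states are ordered as $NA_1, \ldots, NA_{T_{NA}}, NA, A_1, \ldots, A_{T_A}, A$; the latched states advance deterministically; $p_1$ and $q_1$ are the probabilities of a sample falling below the threshold in normal and abnormal operation; and the stationary weights under normal operation are proportional to $[1, \ldots, 1, 1/p_2, 1, \ldots, 1, 1/p_1]$. Only the rows of the alarm states are zeroed in the sub-stochastic matrix, so that mass arriving in an alarm state is counted once and then discarded.

```python
import math

def gaussian_cdf(x, mean=0.0, std=1.0):
    return 0.5 * (1.0 + math.erf((x - mean) / (std * math.sqrt(2.0))))

def transition_matrix(t_na, t_a, stay_na, stay_a):
    """Chain over NA_1..NA_TNA, NA, A_1..A_TA, A (assumed state ordering)."""
    size = t_na + t_a + 2
    na, a = t_na, size - 1
    P = [[0.0] * size for _ in range(size)]
    for i in range(size - 1):
        if i != na:
            P[i][i + 1] = 1.0          # latched states advance deterministically
    P[na][na] = stay_na                # remain in NA
    P[na][na + 1] = 1.0 - stay_na      # alarm is raised
    P[a][a] = stay_a                   # remain in A
    P[a][0] = 1.0 - stay_a             # alarm is cleared
    return P

def vec_mat(v, M):
    """Row vector times matrix."""
    return [sum(v[i] * M[i][j] for i in range(len(v))) for j in range(len(M[0]))]

def stationary_normal(t_na, t_a, p1):
    """Assumed closed-form stationary distribution under normal operation."""
    weights = [1.0] * t_na + [1.0 / (1.0 - p1)] + [1.0] * t_a + [1.0 / p1]
    total = sum(weights)
    return [w / total for w in weights]

def edd_by_propagation(t_na, t_a, p1, q1, tol=1e-12):
    """Sum z * P(detection delay = z) by repeated multiplication."""
    size = t_na + t_a + 2
    alarm_states = range(t_na + 1, size)
    P_ab = transition_matrix(t_na, t_a, q1, 1.0 - q1)
    P_bar = [row[:] for row in P_ab]
    for i in alarm_states:
        P_bar[i] = [0.0] * size        # zero the alarm-state rows
    v = vec_mat(stationary_normal(t_na, t_a, p1), P_ab)   # the z = 0 term adds nothing
    edd, z = 0.0, 0
    while sum(v[: t_na + 1]) > tol:    # mass still waiting in no-alarm states
        v = vec_mat(v, P_bar)
        z += 1
        edd += z * sum(v[i] for i in alarm_states)
    return edd

p1 = gaussian_cdf(2.4, 0.0)   # assumed: P_n(x < threshold)
q1 = gaussian_cdf(2.4, 2.0)   # assumed: P_ab(x < threshold)
edd_numeric = edd_by_propagation(2, 15, p1, q1)
print(round(edd_numeric, 2))
```

Under these assumptions the result should land close to the expected detection delay that Table 2 reports for the same configuration (1.73 samples).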

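    Eq. (22) can be further simplified by letting $\delta = \pi_n P_{ab}$ and $\gamma = \bar{P}(I - \bar{P})^{-2}[\,0 \cdots 0 \; 1 \cdots 1\,]^{T}$; the following expressions for $\delta$ and $\gamma$ are then obtained:

    \[
    \delta = \frac{\Big[\, \overbrace{\tfrac{q_1}{p_1} \;\; 1 \;\cdots\; 1}^{T_{NA}} \;\; \big(1 + \tfrac{q_1}{p_2}\big) \;\; \overbrace{\tfrac{q_2}{p_2} \;\; 1 \;\cdots\; 1}^{T_A} \;\; \big(1 + \tfrac{q_2}{p_1}\big) \Big]}{T_A + T_{NA} + \dfrac{1}{p_1} + \dfrac{1}{p_2}} \tag{23}
    \]

    \[
    \gamma = \Big[\, \overbrace{\tfrac{q_2 T_{NA} + 1}{q_2} \;\; \tfrac{q_2 (T_{NA} - 1) + 1}{q_2} \;\cdots\; \tfrac{q_2 + 1}{q_2}}^{T_{NA}} \;\; 1 + \tfrac{q_1}{q_2} \;\; \underbrace{0 \;\cdots\; 0}_{T_A+1} \Big]^{T}. \tag{24}
    \]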

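    Finally, the expression for the expected detection delay for time-deadbands takes the following form:

    \[
    \mathrm{EDD} = \frac{\dfrac{q_1}{p_1} \left( \dfrac{q_2 T_{NA} + 1}{q_2} \right) + \displaystyle\sum_{i=1}^{T_{NA}-1} \dfrac{q_2 (T_{NA} - i) + 1}{q_2} + \dfrac{1}{q_2} \left( 1 + \dfrac{q_1}{p_2} \right)}{T_A + T_{NA} + \dfrac{1}{p_1} + \dfrac{1}{p_2}}. \tag{25}
    \]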
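Expression (25) can be checked against the reported design results by direct evaluation. The sketch below evaluates it for the five configurations and thresholds listed in Table 2, using the Gaussian example of Section 5.1 (normal mean 0, abnormal mean 2, unit standard deviation). As before, reading $p_1$ and $q_1$ as the probabilities of a sample falling below the threshold is an assumption inferred from the tabulated values, and the function names are illustrative.

```python
import math

def gaussian_cdf(x, mean=0.0, std=1.0):
    return 0.5 * (1.0 + math.erf((x - mean) / (std * math.sqrt(2.0))))

def edd_time_deadband(t_a, t_na, p1, p2, q1, q2):
    """Expected detection delay in samples, Eq. (25)."""
    first = (q1 / p1) * (q2 * t_na + 1.0) / q2
    # The sum runs over i = 1 .. T_NA - 1 and is empty when T_NA <= 1.
    middle = sum((q2 * (t_na - i) + 1.0) / q2 for i in range(1, t_na))
    last = (1.0 / q2) * (1.0 + q1 / p2)
    return (first + middle + last) / (t_a + t_na + 1.0 / p1 + 1.0 / p2)

def edd_gaussian(t_a, t_na, threshold, mean_n=0.0, mean_ab=2.0, std=1.0):
    p1 = gaussian_cdf(threshold, mean_n, std)    # assumed meaning: P_n(x < threshold)
    q1 = gaussian_cdf(threshold, mean_ab, std)   # assumed meaning: P_ab(x < threshold)
    return edd_time_deadband(t_a, t_na, p1, 1.0 - p1, q1, 1.0 - q1)

# (T_A, T_NA, threshold, EDD reported in Table 2)
table_2 = [(0, 0, 1.0, 0.35), (15, 2, 2.4, 1.73), (25, 10, 2.6, 2.63),
           (35, 0, 2.8, 3.43), (65, 1, 3.0, 4.87)]
for t_a, t_na, threshold, reported in table_2:
    computed = edd_gaussian(t_a, t_na, threshold)
    print(f"T_A={t_a:2d} T_NA={t_na:2d} computed={computed:.3f} reported={reported}")
```

The computed values should agree with the tabulated ones to within rounding, which also supports the assumed reading of $p_1$, $p_2$, $q_1$, and $q_2$.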

    Table 2
    Performance comparison of the time-deadband configurations.

    | Configuration | (FAR, MAR) (%) | EDD (samples) | Threshold |
    |---|---|---|---|
    | $T_A = 0$, $T_{NA} = 0$ | (15.87, 15.87) | 0.35 | 1 |
    | $T_A = 15$, $T_{NA} = 2$ | (11.43, 22.88) | 1.73 | 2.40 |
    | $T_A = 25$, $T_{NA} = 10$ | (10.38, 34.10) | 2.63 | 2.60 |
    | $T_A = 35$, $T_{NA} = 0$ | (8.42, 11.52) | 3.43 | 2.80 |
    | $T_A = 65$, $T_{NA} = 1$ | (8.17, 9.94) | 4.87 | 3.0 |

    4.4. Simulation verification

    To verify the analytical expressions derived for the false alarm rate, the missed alarm rate, and the expected detection delay, 10,000 Monte Carlo simulations are performed, considering Gaussian and Gamma distributions of the process variables. Different settings of the time-deadband configurations are also considered. Fig. 9 shows the simulation results obtained for the false alarm rate. From the figures, it can be seen that the curves obtained using the Markov process model and the Monte Carlo simulations are very close to each other, which validates the proposed analytical expression for the false alarm rate for time-deadbands. Satisfactory results are also obtained when testing the formulas for the missed alarm rate and the expected detection delay; these are shown in Figs. 10 and 11, respectively.
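A small Monte Carlo check in the spirit of this verification can be sketched as follows. It simulates a time-deadband on independent Gaussian samples and compares the fraction of time spent in alarm with the closed-form false alarm rate (14). The latch logic is an assumption reconstructed from the state structure of the Markov model: once an alarm is raised, the next $T_A$ samples are ignored, and once it clears, the next $T_{NA}$ samples are ignored. The settings $T_{NA} = 3$, $T_A = 5$ follow the Gaussian panel of Fig. 9; the threshold of 1 and the sample size are arbitrary choices for this example.

```python
import math
import random

def gaussian_cdf(x, mean=0.0, std=1.0):
    return 0.5 * (1.0 + math.erf((x - mean) / (std * math.sqrt(2.0))))

def far_time_deadband(t_a, t_na, p1, p2):
    """Closed-form false alarm rate, Eq. (14)."""
    return (t_a + 1.0 / p1) / (t_a + t_na + 1.0 / p1 + 1.0 / p2)

def simulated_alarm_fraction(samples, threshold, t_a, t_na):
    """Fraction of samples during which the latched alarm is active."""
    in_alarm, hold, alarm_time = False, 0, 0
    for x in samples:
        alarm_time += in_alarm            # a bool adds as 0 or 1
        if hold > 0:
            hold -= 1                     # inside a deadband: the sample is ignored
        elif in_alarm and x < threshold:
            in_alarm, hold = False, t_na  # alarm clears, no-alarm deadband starts
        elif not in_alarm and x > threshold:
            in_alarm, hold = True, t_a    # alarm raised, alarm deadband starts
    return alarm_time / len(samples)

random.seed(1)
threshold, t_a, t_na = 1.0, 5, 3
samples = [random.gauss(0.0, 1.0) for _ in range(200_000)]   # normal operation only

p1 = gaussian_cdf(threshold)              # assumed meaning: P_n(x < threshold)
far_formula = far_time_deadband(t_a, t_na, p1, 1.0 - p1)
far_simulated = simulated_alarm_fraction(samples, threshold, t_a, t_na)
print(round(far_formula, 4), round(far_simulated, 4))
```

The same loop, run on abnormal data and counting the no-alarm time instead, gives the corresponding check for the missed alarm rate (18).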

    5. Design of time-deadbands

    In this section two design methods are proposed for time-deadband configurations. The first method makes use of the process data and the performance indices (FAR, MAR, and EDD) to design a threshold and a time-deadband configuration on the process variable. The second method is based on the use of alarm data, and the objective is to design a time-deadband configuration such that the alarm count and alarm chattering are reduced, while the threshold setting is not altered.

    5.1. Design based on process data

    Let $y$ be the collection of all the design parameters, i.e., $y = \{\text{threshold}, T_{NA}, T_A\}$; then the design problem can be stated as:

    \[
    \text{Design } y \quad \text{s.t.} \quad
    \begin{cases}
    \mathrm{FAR} \le a \\
    \mathrm{MAR} \le b \\
    \mathrm{EDD} \le c
    \end{cases} \tag{26}
    \]

    where $a$, $b$, $c$ are the upper allowable limits on the false alarm rate, the missed alarm rate, and the expected detection delay, respectively. For this design problem, a Receiver Operating Characteristic (ROC) curve based graphical method is utilized. Furthermore, it is assumed that the distributions of the process variable under normal and abnormal situations are known.

    For illustration purposes, consider a process variable that is Gaussian distributed with a mean of 0 and a standard deviation of 1 during normal operation; for the abnormal operation the mean of the data is shifted to 2, while the standard deviation is kept the same. Time-trends of the variable are shown in Fig. 12, where the first 1000 samples correspond to the normal operation and the last 1000 samples to the abnormal operation of the process. The objective is to design a time-deadband configuration and an alarm threshold such that the false and missed alarm rates are less than 10%, and the expected detection delay is less than 5 samples. Various configurations of the time-deadbands are tested, and the values of the performance indices are calculated using (14), (18) and (25) over the range of the threshold. The resulting ROC curves for some selected time-deadband configurations are shown in Fig. 13. The points nearest to the origin are found by calculating the Euclidean distance between each point on the curve and the origin. These points are listed in Table 2 along with the corresponding values of the expected detection delays and the thresholds. From this table, it can be seen that the time-deadband configuration of $T_A = 65$ and $T_{NA} = 1$ satisfies the desired performance requirements, and thus can be used for the considered process variable.
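The graphical procedure described above can be mimicked numerically: sweep the threshold, evaluate (14), (18) and (25), keep the ROC point nearest to the origin for each configuration, and then test it against the limits in (26). The sketch below does this for the configurations of Table 2. The threshold grid (0 to 4 in steps of 0.2) is an assumption chosen for this example, as is the reading of $p_1$ and $q_1$ as the probabilities of a sample falling below the threshold; the function names are illustrative.

```python
import math

def gaussian_cdf(x, mean=0.0, std=1.0):
    return 0.5 * (1.0 + math.erf((x - mean) / (std * math.sqrt(2.0))))

def indices(t_a, t_na, threshold, mean_n=0.0, mean_ab=2.0):
    """Return (FAR, MAR, EDD) from Eqs. (14), (18), (25) for unit-variance Gaussian data."""
    p1 = gaussian_cdf(threshold, mean_n)
    p2 = 1.0 - p1
    q1 = gaussian_cdf(threshold, mean_ab)
    q2 = 1.0 - q1
    cycle_n = t_a + t_na + 1.0 / p1 + 1.0 / p2
    far = (t_a + 1.0 / p1) / cycle_n
    mar = (t_na + 1.0 / q2) / (t_a + t_na + 1.0 / q1 + 1.0 / q2)
    edd = ((q1 / p1) * (q2 * t_na + 1.0) / q2
           + sum((q2 * (t_na - i) + 1.0) / q2 for i in range(1, t_na))
           + (1.0 + q1 / p2) / q2) / cycle_n
    return far, mar, edd

def nearest_to_origin(t_a, t_na, thresholds):
    """Threshold whose (FAR, MAR) point has the smallest Euclidean distance to (0, 0)."""
    return min(thresholds, key=lambda th: math.hypot(*indices(t_a, t_na, th)[:2]))

thresholds = [round(0.2 * k, 1) for k in range(21)]        # assumed grid: 0.0, 0.2, ..., 4.0
configs = [(0, 0), (15, 2), (25, 10), (35, 0), (65, 1)]    # (T_A, T_NA) pairs from Table 2
limits = (0.10, 0.10, 5.0)                                 # a, b, c in (26)

feasible = []
for t_a, t_na in configs:
    best = nearest_to_origin(t_a, t_na, thresholds)
    far, mar, edd = indices(t_a, t_na, best)
    if all(value <= limit for value, limit in zip((far, mar, edd), limits)):
        feasible.append((t_a, t_na, best))
    print(f"T_A={t_a:2d} T_NA={t_na:2d} threshold={best:.1f} "
          f"FAR={100 * far:.2f}% MAR={100 * mar:.2f}% EDD={edd:.2f}")
print("feasible:", feasible)
```

A finer grid moves the nearest points slightly, so the selected thresholds should be read as grid-dependent rather than as exact optima.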

    Fig. 9. Simulation verification of the FAR formula for time-deadbands: (a) Gaussian distributed variable ($T_{NA} = 3$, $T_A = 5$); (b) Gamma distributed variable ($T_{NA} = 8$, $T_A = 8$).

    5.2. Design based on alarm data

    Operators are often reluctant to change the alarm limits once the process has been commissioned and is running. For such cases, a design procedure that involves a change in the threshold of the process variable is not suitable. At the same time, the operators are interested in improving the alarm system performance by removing chattering alarms and reducing the alarm count on their monitoring screens. The objective of this design procedure is to reduce the chattering alarms, and thus the alarm count, without altering the threshold settings. To quantify alarm chattering, the run-length distribution based chattering index proposed in Kondaveeti et al. (2013) is used. The chattering index makes use of the ALM–ALM encoding scheme, and mathematically it is defined as:

    \[
    \psi = \sum_{r \in \mathbb{N}} \mathrm{P}(r)\, \frac{1}{r} \tag{27}
    \]

    where $\mathrm{P}(r)$ is the probability of run length $r$ in the alarm sequence encoded as ALM–ALM, and $\mathbb{N}$ is the set of all run lengths found in the alarm sequence.
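The index in (27) can be computed from a binary alarm signal in a few lines. The sketch below treats an ALM–ALM run length as the number of samples between two consecutive alarm activations and takes $\mathrm{P}(r)$ as the empirical relative frequency of each run length; both readings are assumptions made for this example, and the reference by Kondaveeti et al. (2013) remains the authority on the exact normalization. With one sample per second, the result is in alarms/s.

```python
from collections import Counter

def alarm_onsets(alarm_signal):
    """Sample indices at which the alarm switches from 0 to 1.

    An alarm already active at index 0 is not counted as an onset here.
    """
    return [t for t in range(1, len(alarm_signal))
            if alarm_signal[t] == 1 and alarm_signal[t - 1] == 0]

def alm_alm_run_lengths(alarm_signal):
    """Samples elapsed between consecutive alarm activations."""
    onsets = alarm_onsets(alarm_signal)
    return [later - earlier for earlier, later in zip(onsets, onsets[1:])]

def chattering_index(run_lengths):
    """Eq. (27): sum over distinct run lengths r of P(r) / r."""
    counts = Counter(run_lengths)
    total = sum(counts.values())
    return sum((count / total) / r for r, count in counts.items())

signal = [0, 1, 0, 1, 0, 1, 1, 0, 0, 1]      # illustrative alarm signal
runs = alm_alm_run_lengths(signal)            # onsets at 1, 3, 5, 9
psi = chattering_index(runs)
print(runs, round(psi, 4))
```

Short run lengths dominate the index because of the $1/r$ weighting, which is why rapidly repeating alarms push it up.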

    Fig. 10. Simulation verification of the MAR formula for time-deadbands: (a) Gaussian distributed variable ($T_{NA} = 10$, $T_A = 6$); (b) Gamma distributed variable ($T_{NA} = 20$, $T_A = 12$).

    5.2.1. Time-deadband only on no-alarm state

    In this case the time-deadband is configured only on the no-alarm state, i.e., $T_{NA} \neq 0$ and $T_A = 0$. This configuration is useful for cases where operators can tolerate a waiting time in the no-alarm state before an alarm can be raised. For design purposes, a range of possible values for $T_{NA}$ is considered, and an algorithm is proposed to compute the percentage alarm reduction after applying a $T_{NA}$ of a certain length to the historical alarm data. Algorithm 1 provides the pseudo code of the design procedure for $T_{NA}$. The algorithm makes use of the RTN–RTN run-length encoded alarm data. While traversing the encoded alarm data, the alarm count is reduced by one every time the condition on line 6 of the nested while loop is satisfied. In practice, this condition means that with the time-deadband in place the alarm would not have been raised in the first place. For design purposes, a range of values for $T_{NA}$ is fed into the algorithm to obtain a set of possible solutions.
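The run-length traversal of Algorithm 1 can be transcribed into Python along the following lines. This is a sketch rather than the authors' implementation: it uses 0-based indexing, initializes the index explicitly, and adds a bounds guard so that the inner loop stops at the end of the sequence. Since Algorithm 2 has the same structure, the same function applies to ALM–ALM run lengths with $T_A$ as the deadband. The example run lengths are made up for illustration.

```python
def remaining_alarm_count(run_lengths, deadband):
    """Alarm count left after applying a deadband to run-length encoded alarm data."""
    n = len(run_lengths)          # initial alarm count
    i = 0
    while i < len(run_lengths):
        temp = run_lengths[i]     # accumulated time since the deadband started
        while deadband >= temp:   # the condition on line 6 of the pseudo code
            n -= 1                # this alarm would not have been raised
            i += 1
            if i >= len(run_lengths):
                break             # guard added here: stop at the end of the data
            temp += run_lengths[i]
        i += 1
    return n

def percentage_reduction(run_lengths, deadband):
    initial = len(run_lengths)
    return 100.0 * (initial - remaining_alarm_count(run_lengths, deadband)) / initial

runs = [3, 2, 10, 1, 1, 20]       # hypothetical RTN-RTN run lengths in samples
for t_na in (0, 5, 12):
    print(t_na, remaining_alarm_count(runs, t_na),
          round(percentage_reduction(runs, t_na), 2))
```

Feeding a range of deadband lengths through `percentage_reduction` yields the kind of reduction curve used to pick a candidate setting.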

    5.2.2. Time-deadband only on alarm state

    In this case operators can tolerate a minimum waiting time only on the alarm state, whereas an alarm should be raised without any delay, i.e., $T_A \neq 0$ and $T_{NA} = 0$. This case is applicable to very critical process variables, where no delay in raising the alarm can be afforded. A design procedure based on ALM–ALM run-length encoded alarm data is proposed for selecting a recommended length of $T_A$. The pseudo code of the procedure is shown in Algorithm 2. Similar to Algorithm 1, the alarm count is reduced if the condition on line 6 of the nested loop is satisfied. The design of the time-deadband based on Algorithm 2 requires historical alarm data encoded as ALM–ALM sequences, and the range of allowable values for $T_A$.

    Fig. 11. Simulation verification of the EDD formula for time-deadbands: (a) Gaussian distributed variable ($T_{NA} = 10$, $T_A = 5$); (b) Gamma distributed variable ($T_{NA} = 20$, $T_A = 12$).

    Fig. 12. Time-trends of a process variable.

    5.2.3. Time-deadband on both alarm and no-alarm states

    This case is applicable to process variables for which operators can afford to have waiting times on both the alarm and no-alarm states. A design procedure for a time-deadband configuration with $T_{NA} \neq 0$ and $T_A \neq 0$ is proposed in Algorithm 3. In this case, a lossless run-length encoding (LRLE) scheme of the alarm data is considered. Two nested while loops are used to calculate the alarm count reduction. The logic behind the two alarm reduction conditions in the two nested loops is similar to the one used in Algorithms 1 and 2. In this algorithm, Mod(·) represents the modulo operation.

    Fig. 13. ROC curves for the time-deadband configurations.

    Algorithm 1 Applying time-deadband only on no-alarm state ($T_{NA} \neq 0$, $T_A = 0$)
    Inputs: RTN–RTN run-length encoded alarm sequence, $T_{NA}$
    Output: Alarm reduction
     1: procedure Alarm Reduction(RTN–RTN, $T_{NA}$)
        Initialization
     2:   n = length(RTN–RTN)              ⊳ initial alarm count
     3:   temp = 0, i = 1                  ⊳ temporary buffer, run index
        Computation
     4:   while i ≤ length(RTN–RTN)
     5:     temp = RTN–RTN(i)
     6:     while $T_{NA}$ ≥ temp
     7:       n = n - 1                    ⊳ decrease in alarm count
     8:       i = i + 1
     9:       temp = temp + RTN–RTN(i)
    10:     end while
    11:     i = i + 1
    12:   end while
    13: end procedure

    Algorithm 2 Applying time-deadband only on alarm state ($T_A \neq 0$, $T_{NA} = 0$)
    Inputs: ALM–ALM run-length encoded alarm sequence, $T_A$
    Output: Alarm reduction
     1: procedure Alarm Reduction(ALM–ALM, $T_A$)
        Initialization
     2:   n = length(ALM–ALM)              ⊳ initial alarm count
     3:   temp = 0, i = 1                  ⊳ temporary buffer, run index
        Computation
     4:   while i ≤ length(ALM–ALM)
     5:     temp = ALM–ALM(i)
     6:     while $T_A$ ≥ temp
     7:       n = n - 1                    ⊳ decrease in alarm count
     8:       i = i + 1
     9:       temp = temp + ALM–ALM(i)
    10:     end while
    11:     i = i + 1
    12:   end while
    13: end procedure

    Algorithm 3 Applying time-deadband on both alarm and no-alarm states ($T_A \neq 0$, $T_{NA} \neq 0$)
    Inputs: Lossless run-length encoded alarm sequence, $T_A$, $T_{NA}$, alarm count
    Output: Alarm reduction
     1: procedure Alarm Reduction(LRLE, $T_A$, $T_{NA}$)
        Initialization
     2:   n = alarm count                  ⊳ initial alarm count
     3:   flag = 0                         ⊳ flag for mode selector
     4:   temp = 0, i = 1                  ⊳ temporary buffer, run index
        Computation
     5:   while i ≤ length(LRLE)
     6:     temp = LRLE(i)
     7:     if flag == 0                   ⊳ alarm data is in no-alarm state
     8:       while $T_{NA}$ ≥ temp
     9:         i = i + 1
    10:         temp = temp + LRLE(i)
    11:         if (Mod(i,2) == 1)
    12:           n = n - 1
    13:           i = i + 1
    14:           temp = temp + LRLE(i)
    15:         end if
    16:         flag = 1
    17:       end while
    18:       i = i + 1
    19:       temp = temp - $T_{NA}$
    20:     else                           ⊳ alarm data is in alarm state
    21:       while $T_A$ ≥ temp
    22:         i = i + 1
    23:         temp = temp + LRLE(i)
    24:         if (Mod(i,2) == 0)
    25:           n = n - 1
    26:           i = i + 1
    27:           temp = temp + LRLE(i)
    28:         end if
    29:         flag = 0
    30:       end while
    31:       i = i + 1
    32:       temp = temp - $T_A$
    33:     end if
    34:   end while
    35: end procedure
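When the raw binary alarm signal is available, the effect of a combined configuration can also be estimated by replaying the latch directly on that signal and counting the remaining alarm activations. The sketch below is a cross-check of that kind and is not a transcription of Algorithm 3. It assumes the same latch behaviour as the Markov model suggests: after an alarm is raised the next $T_A$ samples are ignored, and after it clears the next $T_{NA}$ samples are ignored. The example signal and function names are made up for illustration.

```python
def apply_time_deadband(alarm_signal, t_a, t_na):
    """Replay a time-deadband on a 0/1 alarm signal and return the latched signal."""
    latched, state, hold = [], 0, 0
    for raw in alarm_signal:
        if hold > 0:
            hold -= 1                             # inside a deadband: ignore the raw value
        elif raw != state:
            state = raw                           # follow the raw signal
            hold = t_a if state == 1 else t_na    # start the matching deadband
        latched.append(state)
    return latched

def count_alarms(alarm_signal):
    """Number of alarm activations (0 -> 1 transitions, including an initial 1)."""
    previous, count = 0, 0
    for value in alarm_signal:
        if value == 1 and previous == 0:
            count += 1
        previous = value
    return count

raw = [0, 1, 0, 1, 0, 1, 0, 0, 0, 0, 1, 0]        # hypothetical chattering alarm
latched = apply_time_deadband(raw, t_a=2, t_na=1)
before, after = count_alarms(raw), count_alarms(latched)
print(latched, before, after, 100.0 * (before - after) / before)
```

Sweeping `t_a` and `t_na` over a grid with this replay gives a reduction surface that can be compared with the one produced by the run-length based procedure.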

    For illustration purposes, industrial alarm data from an underlying level process variable was collected over a period of 25 days. A snapshot of a few samples of the alarm sequence is shown in Fig. 14. For the considered alarm sequence, the alarm count was observed to be 1387, and the chattering index was calculated to be 0.2538 alarms/s, which is above the cut-off value of 0.05 alarms/s, indicating severe chattering. All three design scenarios are considered, and the plots of percentage alarm reduction are obtained.

    Fig. 15 shows the percentage alarm reduction over a range of values of $T_{NA}$, with $T_A = 0$. From this figure it can be seen that the alarm count decreases as the length of $T_{NA}$ is increased. The marked point in the figure indicates an alarm count reduction of 83.33% for $T_{NA} = 22$. Similarly, an alternative configuration is obtained by considering only $T_A$ while keeping $T_{NA} = 0$. Fig. 16 shows the percentage alarm count reduction over a range of $T_A$; e.g., a $T_A$ of length 27 results in an 85.79% reduction in the alarm count. For both of these marked cases, the chattering index is calculated to be 0.031 alarms/s, which is acceptable according to the standards.


    Fig. 14. Snapshot of the alarm data from the industrial process.

    Fig. 15. Alarm reduction based on 𝑇𝑁𝐴 (Simulated case based on Algorithm 1).

    Fig. 17 shows the case where the design of both $T_A$ and $T_{NA}$ is considered. The color bar indicates the percentage alarm reduction when waiting times of particular lengths are configured on both the alarm and no-alarm states. Table 3 lists some configurations along with the resulting chattering index and percentage alarm reduction. From this table it can be seen that the configurations ($T_A = 15$, $T_{NA} = 20$) and ($T_A = 24$, $T_{NA} = 27$) show superior performance compared to the other listed configurations.

    It is worth mentioning that the proposed design procedures provide a set of possible time-deadband configurations to consider. Depending on the type of the process variable, its criticality, and its signal-to-noise ratio, one may choose the configuration that best suits the conditions.

    6. A comparison with delay-timers

    The operation of time-deadbands (alarm latches) bears some similarity to that of conventional delay-timers. An on delay-timer of length $n$, which raises an alarm only if $n$ consecutive samples of the process variable go above the threshold, can be compared to the time-deadband configured on the no-alarm state ($T_{NA}$). Similarly, an off delay-timer of length $m$, which clears an alarm only if $m$ consecutive samples fall below the threshold, can be compared to the time-deadband on the alarm state ($T_A$) of the same length.
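For reference, the delay-timer behaviour described above can be sketched as a small state machine operating on a 0/1 signal that marks whether each sample is above the threshold. The function name and the example signal are illustrative assumptions; the logic follows the verbal definition given here (raise after $n$ consecutive samples above the threshold, clear after $m$ consecutive samples below it).

```python
def delay_timer(above_threshold, n_on, m_off):
    """Apply an on delay-timer of length n_on and an off delay-timer of length m_off."""
    output, state, run = [], 0, 0
    for value in above_threshold:
        if value != state:
            run += 1                          # consecutive samples disagreeing with the state
            needed = n_on if state == 0 else m_off
            if run >= needed:
                state, run = value, 0         # enough consecutive samples: switch state
        else:
            run = 0                           # the streak is broken
        output.append(state)
    return output

above = [0, 1, 1, 1, 0, 1, 0, 0, 0]           # hypothetical threshold crossings
timed = delay_timer(above, n_on=3, m_off=2)
print(timed)
```

In this example the alarm is raised only at the third consecutive sample above the threshold, which illustrates why an on delay-timer adds to the detection delay, whereas a time-deadband outside its waiting period reacts to the first crossing.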

    Fig. 16. Alarm reduction based on $T_A$ (Simulated case based on Algorithm 2).

    Fig. 17. Alarm reduction based on both $T_{NA}$ and $T_A$ (Simulated case based on Algorithm 3).

    Table 3
    Performance comparison of time-deadband configurations.

    | Configuration | Alarm count | Chattering index (alarms/s) | Alarm reduction (%) |
    |---|---|---|---|
    | $T_A = 0$, $T_{NA} = 0$ | 1387 | 0.2538 | – |
    | $T_A = 15$, $T_{NA} = 20$ | 118 | 0.0253 | 91.49 |
    | $T_A = 6$, $T_{NA} = 11$ | 277 | 0.0481 | 83.63 |
    | $T_A = 5$, $T_{NA} = 4$ | 464 | 0.0812 | 66.55 |
    | $T_A = 24$, $T_{NA} = 27$ | 106 | 0.0179 | 92.36 |

    A Gaussian distributed process variable with zero mean and unit variance for the normal case is considered. The mean value is shifted to 2 for the abnormal case. For comparison, different combinations of on and off delay-timers and time-deadbands ($T_{NA}$ and $T_A$) are considered, and the accuracy and detection delays of the configurations are examined. Accuracy is related to the false and missed alarm rates: an alarm configuration is considered more accurate if it results in lower values of the false and missed alarm rates. Fig. 18 shows the ROC curves for the range of thresholds and various sets of configurations of delay-timers and time-deadbands. From this figure it can be seen that, in terms of accuracy, delay-timers outperform time-deadbands; in terms of detection delay, however, time-deadbands perform better than delay-timers, as shown in Fig. 19. Similar performance results were obtained for the case of a Gamma distributed process variable with unit scaling factor and shaping factors of 0.5 and 7.5 for the normal and abnormal cases, respectively. These results are shown in Figs. 20 and 21.


    Fig. 18. ROC curves for various configurations of delay-timers and time-deadbands (Gaussian distributed process variable).

    Fig. 19. Expected delays for various configurations of delay-timers and time-deadbands (Gaussian distributed process variable).

    Fig. 20. ROC curves for various configurations of delay-timers and time-deadbands (Gamma distributed process variable).

    Fig. 21. Expected delays for various configurations of delay-timers and time-deadbands (Gamma distributed process variable).

    7. Conclusion

    In this paper, time-deadband configurations for univariate alarm systems have been studied. A Markov process based model was developed under the assumption that the distributions of the process variable are known for both the normal and abnormal operations. The performance indices (the false alarm rate, the missed alarm rate, and the expected detection delay) have been derived by studying the long-term behavior of the Markov model. Design procedures based on the process data and on the alarm data have been developed to help operators achieve acceptable alarm system performance.

    Acknowledgment

    The authors would like to thank NSERC Canada for the financial support.

    References

    Adnan, N. A., Cheng, Y., Izadi, I., & Chen, T. (2013). Study of generalized delay-timers in alarm configuration. Journal of Process Control, 23(3), 382–395.

    Adnan, N. A., Izadi, I., & Chen, T. (2011). On expected detection delays for alarm systems with deadbands and delay-timers. Journal of Process Control, 21(9), 1318–1331.

    Afzal, M. S., & Chen, T. (2017). Analysis and design of multimode delay-timers. Chemical Engineering Research and Design. http://dx.doi.org/10.1016/j.cherd.2017.01.029.

    Basseville, M., & Nikiforov, I. V. (1993). Detection of abrupt changes: theory and application, vol. 104. Prentice Hall, Englewood Cliffs.

    Cheng, Y., Izadi, I., & Chen, T. (2011). On optimal alarm filter design. In International symposium on advanced control of industrial processes (ADCONIP) (pp. 139–145). IEEE.

    Cheng, Y., Izadi, I., & Chen, T. (2013). Optimal alarm signal processing: Filter design and performance analysis. IEEE Transactions on Automation Science and Engineering, 10(2), 446–451.

    Engineering Equipment and Materials Users’ Association (EEMUA) (2007). Alarm systems: A guide to design, management and procurement. EEMUA Publication 191.

    Ghosh, J. K. (2012). Introduction to modeling and analysis of stochastic systems. International Statistical Review, 80(3), 487.

    Hu, W., Wang, J., & Chen, T. (2015). A new method to detect and quantify correlated alarms with occurrence delays. Computers & Chemical Engineering, 80, 189–198.

    Hugo, A. (2009). Estimation of alarm deadbands. In Proceedings of 7th IFAC symposium on fault detection, supervision and safety of technical processes (pp. 663–667).

    Hwang, S.-L., Lin, J.-T., Liang, G.-F., Yau, Y.-J., Yenn, T.-C., & Hsu, C.-C. (2008). Application control chart concepts of designing a pre-alarm system in the nuclear power plant control room. Nuclear Engineering and Design, 238(12), 3522–3527.

    International Society of Automation (2009). Management of alarm systems for the process industries, ANSI/ISA 18.2.

    Jang, G.-S., Suh, S.-M., Kim, S.-K., Suh, Y.-S., & Park, J.-Y. (2013). A proactive alarm reduction method and its human factors validation test for a main control room for SMART. Annals of Nuclear Energy, 51, 125–134.


    Jerhotova, E., Sikora, M., & Stluka, P. (2012). Dynamic alarm management in next generation process control systems. In IFIP international conference on advances in production management systems (pp. 224–231). Springer.

    Kondaveeti, S. R., Izadi, I., Shah, S. L., & Chen, T. (2011). On the use of delay timers and latches for efficient alarm design. In 19th IEEE mediterranean conference on control & automation (MED) (pp. 970–975).

    Kondaveeti, S. R., Izadi, I., Shah, S. L., Shook, D. S., Kadali, R., & Chen, T. (2013). Quantification of alarm chatter based on run length distributions. Chemical Engineering Research and Design, 91(12), 2550–2558.

    Kotz, S., & Johnson, N. L. (1988). Encyclopedia of statistical sciences. Wiley, New York.

    Laberge, J. C., Bullemer, P., Tolsma, M., & Dal Vernon, C. R. (2014). Addressing alarm flood situations in the process industries through alarm summary display design and alarm response strategy. International Journal of Industrial Ergonomics, 44(3), 395–406.

    Lai, S., & Chen, T. (2015). A method for pattern mining in multiple alarm flood sequences. Chemical Engineering Research and Design, 117, 831–839.

    Lawler, G. F. (2006). Introduction to stochastic processes. CRC Press.

    Miao, H., Sforna, M., & Liu, C.-C. (1996). A new logic-based alarm analyzer for on-line operational environment. IEEE Transactions on Power Systems, 11(3), 1600–1606.

    Naghoosi, E., Izadi, I., & Chen, T. (2011). Estimation of alarm chattering. Journal of Process Control, 21(9), 1243–1249.

    Nihlwing, C., & Kaarstad, M. (2012). The development and usability test of a state based alarm system for a nuclear power plant simulator. In Proc. NPIC and HMIT (pp. 22–26).

    Simeu-Abazi, Z., Lefebvre, A., & Derain, J.-P. (2011). A methodology of alarm filtering using dynamic fault trees. Reliability Engineering & System Safety, 96(2), 257–266.

    Tan, W., Sun, Y., Azad, I. I., & Chen, T. (2017). Design of univariate alarm systems via rank order filters. Control Engineering Practice, 59, 55–63.

    Varga, T., Szeifert, F., & Abonyi, J. (2009). Detection of safe operating regions: a novel dynamic process simulator based predictive alarm management approach. Industrial and Engineering Chemistry Research, 49(2), 658–668.

    Wang, J., & Chen, T. (2014). An online method to remove chattering and repeating alarms based on alarm durations and intervals. Computers & Chemical Engineering, 67, 43–52.

    Xu, X., Li, S., Song, X., Wen, C., & Xu, D. (2016). The optimal design of industrial alarm systems based on evidence theory. Control Engineering Practice, 46, 142–156.

    Yu, Y., Zhu, D., Wang, J., & Zhao, Y. (2017). Abnormal data detection for multivariate alarm systems based on correlation directions. Journal of Loss Prevention in the Process Industries, 45, 43–55.

    Zucchini, W., & MacDonald, I. L. (2009). Hidden Markov models for time series: an introduction using R. CRC Press.
