
[IEEE 2011 IEEE Conference on Prognostics and Health Management (PHM) - Denver, CO, USA (2011.06.20-2011.06.23)] 2011 IEEE Conference on Prognostics and Health Management - Methodology


Methodology and Framework for Predicting Rolling Element Helicopter Bearing Failure

David Siegel, Jay Lee
University of Cincinnati, NSF Center for Intelligent Maintenance Systems
Cincinnati, Ohio, United States of America

Canh Ly
Army Research Lab, Sensors and Electronic Devices Directorate
Adelphi, Maryland, United States of America

Abstract— The enhanced ability to predict the remaining useful life of helicopter drive train components offers potential improvement with regard to the safety, maintainability, and reliability of a helicopter fleet. Existing helicopter health and usage monitoring systems provide diagnostic information that indicates when the condition of a drive train component is degraded; however, prediction techniques are not currently used. Although various algorithms exist for providing remaining life predictions, the prognostic techniques have not matured, in part because of the limited number of run-to-failure data sets. This study addresses remaining useful life predictions for the helicopter oil-cooler bearing. The paper proposes a general methodology for performing rolling element bearing prognostics and presents results using a robust regression curve fitting approach. The proposed methodology includes a series of processing steps prior to the prediction routine, including feature extraction, feature selection, and health assessment. This provides a framework for including prediction algorithms in existing health and usage monitoring systems. An oil-cooler bearing test-rig constructed by Impact Technologies LLC is used to facilitate the development of the remaining life prediction techniques. Two data sets are used in this study; in both, the bearing experienced an inner race spall that progressed until the test was stopped due to an unsafe vibration level. The robust regression curve fitting results are promising in that the actual and predicted remaining life estimates converge for the run-to-failure oil-cooler bearing data sets a few hours prior to the stopping of the test. Future work would use the same methodology to compare the accuracy of this prediction method with Bayesian filtering techniques, usage-based methods, and other time series prediction methods.

Keywords-Remaining Useful Life; Robust Regression; Bearing Envelope Analysis; Bearing Failure Prediction

I. INTRODUCTION

Reducing the logistics and maintenance cost for a fleet of helicopters requires improvements not only in drive-train diagnostic capabilities but also in predictive capabilities that can offer insight into when a component is expected to reach an undesirable level of health. Accurate prediction of the remaining useful life (RUL) of critical components on the rotorcraft can allow maintenance to be scheduled and parts to be ordered or shipped hours in advance; however, achieving this scenario requires improvements and validation of the prediction algorithms [1]. A vast number of component remaining life prediction algorithms exist; however, these techniques have not matured for a variety of reasons, including the scarcity of run-to-failure data.

A review of existing prognostic algorithms highlights a few techniques that are gaining popularity among researchers in prognostics and health management. The use of particle filters for remaining life prediction has been examined with much success in a variety of studies, including applications to rolling element bearings, batteries, and turbine engine blades [2-4]. This Bayesian filtering approach can account for uncertainty in the prediction and provide a probability distribution function for the remaining life estimate. However, this approach requires physical equations that describe the progression of the fault. For rolling element bearings, the commonly used physical equations are based on Paris’ law. This prediction method has performed quite well in laboratory settings, but the inputs require a mapping that can relate vibration indicators to a spall or crack size [4]. This would likely require ground-truth information that can relate the processed vibration data to a measured spall size.

Techniques that do not require a physical model of the system for remaining life prediction can be considered less restrictive in terms of the inputs needed for the prediction algorithm; however, whether the accuracy is reduced by not using a physical model has not been fully explored. One unique technique was employed by Wang et al. [5]; this technique matches previous degradation patterns to the current degradation pattern. This similarity-based prediction method was applied to an aircraft engine data set with accurate remaining life estimates; however, it requires multiple historical run-to-failure data sets, which are difficult to obtain in practice. Regression or time-series-based prediction is also a common data-driven approach to remaining life estimation [6].

Even though a few studies evaluate multiple prediction algorithms, it is difficult to conclude whether one particular prediction algorithm performs better in general, for a variety of reasons. One reason is that many of the algorithms require the parameters to be properly tuned or adjusted. Examples include the material property parameters in a Paris’ law equation or the choice of an appropriate regression model for a regression-based technique [7]. Also, given the limited data sets, it is difficult to generalize whether the superior results for one prediction method would hold when additional data sets are included. Fusion of multiple techniques is one option; however, prior to considering this, one must be comfortable with the accuracy and potential disadvantages of each individual technique [8].

[Figure 1 flow chart. Blocks shown: Inputs (Vibration, Loads, AE); Feature Extraction; Feature Selection; Health Assessment; Detection and Prediction Triggering; Remaining Life Prediction; Model Refinement]

Figure 1. Rolling element bearing prognostic development flow chart

978-1-4244-9827-7/11/$26.00 ©2011 IEEE

Having more validated prediction algorithms is only one technical aspect that is lacking for acceptance of this technology; a more general approach is also needed that can highlight how the technology can be applied to other systems or components. This paper proposes an approach for remaining life prediction of rolling element bearings; however, the proposed method is also suitable, with some adjustments, for other mechanical components such as gears or shafts. Section 2 discusses each individual aspect of the methodology, while section 3 highlights results using this approach in a case study for an oil-cooler helicopter bearing. Lastly, conclusions from the case study and potential future work are presented in section 4.

II. ROLLING ELEMENT BEARING PROGNOSTIC METHODOLOGY AND FRAMEWORK

The framework for developing a rolling element bearing remaining life prediction method is highlighted in Fig. 1. Notice that the actual prediction algorithm is only one aspect of the overall framework. This cannot be overemphasized; the prediction step requires substantial prior processing and quality inputs in order to produce an accurate result. The shaded regions represent the three steps in the development process: processing the raw data into features, developing the health model using selected features, and using the prediction module, which includes when to trigger the prediction and a remaining life prediction model. Each aspect is described and discussed below in order to highlight what techniques are available and the advantages and tradeoffs of each particular method.

A. Feature Extraction

For monitoring dynamic components and, in particular, rolling element bearings, the measured input signals for a condition monitoring system typically consist of vibration signals. The inputs could also consist of acoustic emission signals as well as information on the loading conditions or operating regime. A listing of the potential signal processing and feature extraction methods for rolling element bearings is provided in Table I, along with the potential advantages and disadvantages. This list is not exhaustive; however, it provides a guide to what techniques are available and what type of indication each processing method could provide. For development purposes, it is perhaps best to extract any potential features that could be relevant. The feature selection technique could then be used to automatically determine which features are important and should be included in the model.

TABLE I. OVERVIEW OF SIGNAL PROCESSING TECHNIQUES FOR ROLLING ELEMENT BEARINGS

| Signal Processing Technique | Advantages | Disadvantages |
|---|---|---|
| 1. Time Domain Statistical Features | Provides an overall indication of mechanical system health. | Limited root cause information [9]. |
| 2. FFT Bearing Fault Features | Good for detecting bearing damage at late stages. | Not suitable for detecting incipient damage [10]. |
| 3. Bearing Envelope Features | Good for detecting incipient damage [11]. | Requires a high sampling rate. |
| 4. Time Synchronous Averaging | Can enhance vibration synchronous with the shaft speed and reduce noise [12]. | More suited for shaft and gear components. |
| 5. Spectral Kurtosis | Provides an indication of transient impact in different frequency bands [13]. | Requires selection of the appropriate block size. |
| 6. Wavelet Decomposition | Suitable for non-stationary signals and provides energy in frequency bands [14]. | Requires selection of the mother wavelet and decomposition level. |
| 7. Empirical Mode Decomposition | Adaptively decomposes signals. | Long processing time. |
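As an illustration of the first row of Table I, the following minimal sketch computes a few common time domain statistical features from one vibration record. The function name `time_domain_features`, the particular set of statistics, and the synthetic sine record are assumptions made for illustration; the source does not list which statistics were used.

```python
import math

def time_domain_features(x):
    """Return a few common time-domain condition indicators for one record."""
    n = len(x)
    mean = sum(x) / n
    centered = [v - mean for v in x]
    var = sum(v * v for v in centered) / n           # population variance
    rms = math.sqrt(sum(v * v for v in x) / n)       # overall vibration level
    peak = max(abs(v) for v in x)
    kurtosis = sum(v ** 4 for v in centered) / n / var ** 2   # impulsiveness
    return {"rms": rms, "peak": peak,
            "crest_factor": peak / rms, "kurtosis": kurtosis}

# Synthetic record (assumed data): a unit sine sampled over whole periods.
record = [math.sin(2 * math.pi * k / 100) for k in range(1000)]
features = time_domain_features(record)
```

For a pure sine the kurtosis is 1.5 and the crest factor is the square root of 2; impulsive content from a spall would typically raise both, which is why such statistics give an overall health indication but little root cause information.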

An evaluation of the various techniques for rolling element bearings was conducted in a previous study [16]. The results showed that the time domain statistical features and the envelope bearing fault frequency features were the most promising. An example plot of the envelope 2X BPFI feature is provided in Fig. 2; this particularly promising feature trend is from a run-to-failure test on an oil-cooler bearing test-rig. A desirable property of a vibration feature is a monotonic trend; the plotted feature does not strictly exhibit this characteristic over the whole test but does show it towards the end of the test. Section 3 discusses this case study and its results, in which two oil-cooler bearings were tested from a baseline condition until a failure condition.
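To make the envelope fault frequency feature concrete, the sketch below band-passes a signal around an assumed resonance, forms the envelope from the analytic signal, and reads the envelope spectrum amplitude at a chosen frequency. It is only an illustration under stated assumptions: the sampling rate, band limits, and the 60 Hz "fault" frequency are hypothetical and are not the test-rig values, and the small pure-Python FFT stands in for whatever signal processing library would be used in practice.

```python
import cmath
import math

def fft(a):
    """Recursive radix-2 FFT; len(a) must be a power of two."""
    n = len(a)
    if n == 1:
        return list(a)
    even, odd = fft(a[0::2]), fft(a[1::2])
    out = [0j] * n
    for k in range(n // 2):
        t = cmath.exp(-2j * math.pi * k / n) * odd[k]
        out[k], out[k + n // 2] = even[k] + t, even[k] - t
    return out

def ifft(a):
    """Inverse FFT via the conjugate trick."""
    n = len(a)
    return [v.conjugate() / n for v in fft([u.conjugate() for u in a])]

def envelope_amplitude(x, fs, band, f_target):
    """Band-pass x, take its envelope, return the envelope amplitude at f_target."""
    n = len(x)
    df = fs / n
    spec = fft(x)
    # Keep only bins inside the band (band[1] must be below fs / 2) and double
    # them; the inverse transform is the analytic signal of the filtered data.
    analytic_spec = [2 * s if band[0] <= k * df <= band[1] else 0j
                     for k, s in enumerate(spec)]
    envelope = [abs(v) for v in ifft(analytic_spec)]
    env_spec = fft(envelope)
    return 2 * abs(env_spec[round(f_target / df)]) / n   # single-sided amplitude

fs, n = 4096.0, 4096                  # assumed: 1 s record, 1 Hz resolution
f_fault, f_carrier = 60.0, 1000.0     # hypothetical fault and resonance frequencies
signal = []
for k in range(n):
    t = k / fs
    mod = (1.0 + 0.5 * math.cos(2 * math.pi * f_fault * t)
               + 0.3 * math.cos(2 * math.pi * 2 * f_fault * t))
    signal.append(mod * math.cos(2 * math.pi * f_carrier * t))

amp_1x = envelope_amplitude(signal, fs, (700.0, 1300.0), f_fault)
amp_2x = envelope_amplitude(signal, fs, (700.0, 1300.0), 2 * f_fault)
```

In this synthetic case the carrier is modulated at the fault frequency and its second harmonic, so `amp_2x` plays the role of a 2X fault frequency envelope feature. For a real bearing the fault frequency would come from the bearing geometry and shaft speed, which are not given in this section.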


B. Feature Selection

The feature extraction step applies all potentially relevant processing methods to the measured vibration or other signals; however, the resulting list of potential condition indicators can be quite large. It is not feasible to include all the extracted features in a health assessment model; instead, a subset of the features should be included. A listing of some of the common feature selection methods is provided in Table II; this is not a complete listing of all potential methods but does provide an overview of what options are available. Feature selection is typically based on an automated selection routine that uses some metric for ranking the potential features. Expert feature selection assumes that one has a very good knowledge base of which potential features are most suitable, based on years of experience [17]. Expert selection is particularly suitable if the available data set is sparse; in that case an automated feature selection routine might not be suitable, since many of the metrics rely on having both baseline and failure signature data sets.

For an automated selection routine, the metric ideally should be suited to the task at hand. This could imply that a different feature set and a different feature selection method are used for fault classification than for failure prediction. In the prognostic application, the features should provide a trend over time that is correlated with bearing damage. In this case, many of the filter-based selection routines, such as the Fisher criterion, might not select the best features. The reason is that the metric only compares the means and variances between a good and a degraded state. This only implies that the feature can discriminate between a good and a degraded bearing and does not necessarily mean that the feature magnitude increases in a monotonic manner with bearing damage. If an output variable such as bearing spall size is recorded, a correlation metric between this output variable and each individual feature would provide an effective way of ranking the potential features. As an alternative, if the spall size was not measured, the correlation metric can use operating time after a spall was first observed as the output variable.
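The difference between the two ranking metrics can be seen in a small sketch. The data below are synthetic and the feature names are hypothetical; the Fisher criterion is written in its common two-class form (squared mean difference over summed variances), and the trend metric is the absolute Pearson correlation with operating time after the spall.

```python
import math
from statistics import mean, pvariance

def pearson(x, y):
    """Pearson correlation coefficient between two equal-length sequences."""
    mx, my = mean(x), mean(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)

def fisher_criterion(good, degraded):
    """Class separation: squared mean difference over summed variances."""
    return (mean(good) - mean(degraded)) ** 2 / (pvariance(good) + pvariance(degraded))

ripple = [0.1 * (-1) ** i for i in range(20)]     # small deterministic scatter
baseline = [1.0 + r for r in ripple]              # feature values before the spall
hours_after_spall = list(range(20))
features = {                                      # feature values after the spall
    "step_like": [3.0 + r for r in ripple],       # jumps, then stays flat
    "trending": [1.0 + 0.1 * t + r for t, r in zip(hours_after_spall, ripple)],
}

by_fisher = sorted(features, reverse=True,
                   key=lambda k: fisher_criterion(baseline, features[k]))
by_trend = sorted(features, reverse=True,
                  key=lambda k: abs(pearson(hours_after_spall, features[k])))
```

Here the Fisher criterion prefers the step-like feature, which separates the two states well but carries no trend, while the correlation metric prefers the trending feature, which is the behavior wanted for prediction.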

TABLE II. SAMPLE OF AVAILABLE FEATURE SELECTION METHODS

| Feature Selection Method | Advantages and Disadvantages |
|---|---|
| 1. Expert Selection | Relies on experience but can be accurate if one has sufficient knowledge of the system. |
| 2. Filter Methods | Easy to implement and allows one to rank each feature based on a pre-selected metric [18]. |
| 3. Wrapper Methods | More ideal for fault classification, but again would require a label or output variable and pre-selected classification or regression method [19]. |

C. Health Assessment

In many instances, a single condition indicator is not enough to describe the failure progression of a dynamic component such as a rolling element bearing. In the bearing example, one particular condition indicator could be more suitable when the bearing spall damage is more localized; however, the shape and size of the bearing spall changes during the different stages of bearing failure. In this case, it would be advantageous to have a bearing health value that is a function of multiple input features. Various techniques exist for fusing multiple features into a single health value and a few of the methods are highlighted in Table III.

Perhaps the most straightforward approach is to use a weighted sum of individual features [20]. However, determining the weights requires some prior experience, and the benefit of combining multiple features is limited if one particular feature dominates the final output. Various distance metrics are used in the literature that estimate health based on a distance from a normal or baseline region. Mahalanobis distance, Euclidean distance, and PCA monitoring statistics, such as SPE and T-square, are some of the more commonly used distance-based health metrics [21]. The use of self-organizing maps and the minimum quantization error has been shown to be an effective distance-based health method and has been successfully applied to a bearing monitoring application [22–23]. Selecting the best distance-based health method is mostly done based on experience, considering that most prior work in the literature only evaluates one technique.

TABLE III. OVERVIEW OF COMMONLY USED HEALTH ASSESSMENT TECHNIQUES

| Health Assessment Method | Advantages and Disadvantages |
|---|---|
| 1. Weighted Sum of Features | Does not require a baseline but requires knowledge on how to weight the features. |
| 2. Distance From Normal | Only requires defining a baseline region and indicates how far from normal the current condition is; difficult to define a threshold for triggering maintenance action. |
| 3. Regression | Requires an output variable that is related to spall size or damage; however, directly allows one to estimate the damage based on the feature inputs. |
| 4. Neural Network | Similar to regression, with the exception that it is more suited if there is a nonlinear relationship between bearing spall size and the feature values. |

[Figure 2 plot: "Bearing #1 Axial Envelope 2X BPFI Feature Trend"; x-axis: Sample # in 75Hz and 500lb Regime, 0 to 1500; y-axis: Axial Envelope 2X BPFI (g), 0 to 3]

Figure 2. Example of a suitable feature trend for prognostics

A variety of other techniques exist, including regression or neural network based methods; however, these methods require training a model with a known output variable. In the bearing application, the ideal output variable would be the spall size or damage level [24]. Selecting the appropriate health assessment technique is usually driven by what information is available. Distance-based methods are less restrictive since they only require baseline data for training and do not require any ground-truth information, such as bearing spall size.
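A distance-based health value can be sketched with baseline data alone. The example below is a simplified stand-in for the self-organizing map approach: instead of training a map, it uses the scaled baseline samples themselves as codebook vectors and reports the distance to the nearest one, in the spirit of a minimum quantization error. The function names, the two-feature data, and the scaling choice are assumptions for illustration, and each feature is assumed to vary in the baseline so that its standard deviation is nonzero.

```python
import math
from statistics import mean, pstdev

def fit_baseline(baseline_rows):
    """Learn per-feature scaling and a codebook from healthy data only."""
    cols = list(zip(*baseline_rows))
    mu = [mean(c) for c in cols]
    sd = [pstdev(c) for c in cols]     # assumed nonzero for every feature

    def scale(row):
        return [(v - m) / s for v, m, s in zip(row, mu, sd)]

    # Stand-in for trained SOM weight vectors: the scaled baseline samples.
    codebook = [scale(r) for r in baseline_rows]
    return scale, codebook

def health_value(row, scale, codebook):
    """Distance from the scaled sample to its nearest codebook vector."""
    z = scale(row)
    return min(math.dist(z, c) for c in codebook)

baseline = [[0.9, 0.4], [1.1, 0.6], [1.0, 0.5], [1.0, 0.5]]   # healthy samples
scale, codebook = fit_baseline(baseline)
samples = [[1.0, 0.5], [1.3, 0.7], [1.8, 1.0]]                # increasing damage
trend = [health_value(r, scale, codebook) for r in samples]
```

The health value is zero for a sample that coincides with the baseline and grows as the feature vector moves away from it; no spall size or other ground-truth label is needed, which is the practical appeal of this family of methods.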

D. Detection and Prediction Triggering

A prediction algorithm should only be initiated when there are signs of bearing degradation or spalling. Health-based prediction algorithms are based on the trend in the health value or a physical relationship that describes spall progression, such as Paris’ law. If there is no bearing spall, these prediction techniques are not applicable. At this initial stage in the bearing’s life, usage- or reliability-based techniques are most applicable. Prior to using a health-based prediction algorithm, a key step is to determine when the bearing has started to degrade and when the prediction technique should be enabled. This can be referred to as anomaly detection, and Table IV highlights some of the techniques suitable for this task.

TABLE IV. SAMPLE OF AVAILABLE ANOMALY DETECTION METHODS

| Anomaly Detection Method | Advantages and Disadvantages |
|---|---|
| 1. Health Based Threshold | Uses a health value to set a threshold for anomaly detection; setting the threshold is not trivial. |
| 2. One-Class Classifiers | Support vector data description can detect when the condition is not in the normal region; however, which kernel to use requires experience. |
| 3. Bayesian Filtering Approach | Requires a nonlinear mapping between feature value and spall length and a physical model; however, threshold can be based on a physical quantity such as spall size. |

The health indicator value can be used directly for anomaly detection. This would require setting an appropriate threshold for when the health value would represent an anomalous behavior or a bearing with spall initiation. When setting the appropriate threshold, it is important to consider the tradeoff between a late detection of a bearing spall and also a false alarm. Various other approaches exist for anomaly detection. The use of a support vector data description can be considered a one-class classifier that defines a health region and then determines whether the current condition is in the health region or not. This particular technique is similar to support vector machines; however, it only requires baseline data for training. A threshold for detection does not need to be set for this particular method; nevertheless, choosing the appropriate kernel function is not straightforward and requires previous experience with using this algorithm [25]. The use of Bayesian filtering, such as a particle filter, can also be configured for anomaly detection; this method was used for our rolling element bearings and also for other health monitoring applications. The particle filter uses aspects from both data-driven and model-based approaches; however, this method would require a nonlinear mapping between feature value and spall length and a spall progression model [26]. One particular advantage for this method is that the threshold can be based on an actual spall length that is unacceptable, as opposed to setting a threshold for a feature or a calculated health value.

E. Remaining Life Prediction

The prediction algorithm is the last aspect of the remaining life prediction methodology for rolling element bearings. This implies that the prediction results are a function of the health assessment and anomaly detection modules and not solely a function of the prediction algorithm. A variety of techniques exist for performing remaining useful life estimation, and they are typically divided into two categories: data-driven methods and model-based approaches. Table V lists some of the more commonly used prediction techniques. Data-driven prediction methods consist of time series methods, such as auto-regressive integrated moving average (ARIMA), and regression-based methods [27]. Regression-based techniques use the trend in the health or feature value to extrapolate and predict the remaining life of the component [6]. This technique assumes that one can define a failure threshold and that an appropriate regression model can be found that describes the trend in the health value. Non-parametric regression-based methods are also available, such as Gaussian Process Regression (GPR); these techniques do not need a pre-defined regression model [27].
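A regression-based extrapolation can be sketched as follows. This is not the robust regression used in the paper, whose details are not given in this section; it swaps in the Theil-Sen estimator (median of pairwise slopes) as one simple robust option, and it assumes an exponential trend so that the logarithm of the health value is linear in time. The failure threshold of 12 and the synthetic health values are hypothetical.

```python
import math
from statistics import median

def theil_sen(t, y):
    """Robust line fit: median pairwise slope, then median intercept."""
    slopes = [(y[j] - y[i]) / (t[j] - t[i])
              for i in range(len(t)) for j in range(i + 1, len(t))]
    slope = median(slopes)
    intercept = median(yi - slope * ti for yi, ti in zip(y, t))
    return slope, intercept

def remaining_life(t, health, failure_threshold):
    """Extrapolate an assumed exponential trend exp(a + b t) to the threshold."""
    b, a = theil_sen(t, [math.log(h) for h in health])
    if b <= 0:
        return math.inf          # no upward trend, so no crossing is predicted
    t_fail = (math.log(failure_threshold) - a) / b
    return max(t_fail - t[-1], 0.0)

hours = list(range(10))                          # hours since detection (assumed)
health = [3.0 * math.exp(0.1 * h) for h in hours]
health[5] *= 3.0                                 # one spurious spike in the trend
rul = remaining_life(hours, health, failure_threshold=12.0)
```

Because the fit uses medians, the single spike has essentially no effect on the extrapolated crossing time, whereas an ordinary least-squares fit would be pulled toward it. The estimate is only as good as the assumed trend model and failure threshold, which is the caveat stated in the text.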

Perhaps unique to the available data-driven techniques is the use of a similarity-based prediction algorithm. The central concept is to have a library of degradation patterns from previous run-to-failure data sets and estimate the current remaining life by matching the current degradation pattern with the most similar ones in the degradation library [5]. This was previously applied to an aircraft engine application with the caveat that it would require multiple run-to-failure data sets to build up an appropriate degradation library.
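The matching idea can be illustrated with a deliberately simplified sketch; it is not the algorithm of [5]. Each library entry is assumed to be a health trajectory whose last sample is the failure point, the current degradation pattern is a short window of recent health values, and similarity is a plain sum of squared differences. All names and numbers are hypothetical.

```python
def best_match(window, trajectory):
    """Return (error, remaining_samples) for the best alignment of the window."""
    w = len(window)
    best = None
    for start in range(len(trajectory) - w + 1):
        segment = trajectory[start:start + w]
        err = sum((a - b) ** 2 for a, b in zip(window, segment))
        remaining = len(trajectory) - (start + w)   # samples left until failure
        if best is None or err < best[0]:
            best = (err, remaining)
    return best

def similarity_rul(window, library):
    """Remaining life taken from the single closest match in the library."""
    matches = [best_match(window, trajectory) for trajectory in library]
    return min(matches)[1]

library = [
    [1, 1, 1, 2, 3, 5, 8, 12],        # hypothetical run-to-failure health trends
    [1, 1, 2, 2, 3, 4, 6, 9, 13],
]
current_window = [2, 3, 5]            # most recent health values of the unit
rul_samples = similarity_rul(current_window, library)
```

With only two trajectories the estimate is dictated by whichever unit happens to look most similar, which illustrates why the method depends on having multiple run-to-failure data sets in the library.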

TABLE V. OVERVIEW OF COMMONLY USED REMAINING LIFE PREDICTION TECHNIQUES

| Remaining Life Prediction Method | Advantages and Disadvantages |
|---|---|
| 1. Regression-based Methods | Requires setting a failure threshold but the results can be accurate if regression model is chosen appropriately. |
| 2. Similarity-based Prediction | Accurate but requires multiple run-to-failure data sets. |
| 3. Bayesian Filtering Approach | Requires a failure model to describe the fault progression but can provide accurate results and a way to handle uncertainty. |

Bayesian filtering techniques, such as the particle filter, have been applied for remaining life prediction of rolling element bearings. Again, this approach requires a physical model that describes the spall progression. Paris’ law is commonly used to incorporate a physical crack propagation model into the particle filtering prediction method. The advantages include the ability of this method to handle the uncertainty in the remaining life estimation and provide a probability distribution function for the remaining life at different stages during the fault progression [2]. The selection of the appropriate prediction algorithm is a tradeoff among what information is available from the data sets, the ground-truth information from the testing, the level of understanding of the physical system, and the desired accuracy of the prediction algorithm. Without ground-truth information or an understanding of the physics of the fault progression, data-driven approaches are one of the feasible options.
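For readers unfamiliar with Paris’ law, the sketch below integrates its generic textbook form, da/dN = C (ΔK)^m with ΔK = Y Δσ sqrt(π a), from an initial defect size to a critical size. This is the general fatigue crack growth relation, not the specific spall progression model of the cited studies, and every parameter value is hypothetical. The closed-form integral is included only to check the numerical integration.

```python
import math

def cycles_to_critical(a0, a_crit, C, m, delta_sigma, Y=1.0, step=1000):
    """Integrate Paris' law da/dN = C * (dK)^m with forward Euler steps."""
    a, cycles = a0, 0
    while a < a_crit:
        delta_k = Y * delta_sigma * math.sqrt(math.pi * a)   # stress intensity range
        a += C * delta_k ** m * step                          # growth over `step` cycles
        cycles += step
    return cycles

def cycles_closed_form(a0, a_crit, C, m, delta_sigma, Y=1.0):
    """Analytic integral of the same law, valid for m != 2 and constant Y."""
    p = 1.0 - m / 2.0
    k = C * (Y * delta_sigma * math.sqrt(math.pi)) ** m
    return (a_crit ** p - a0 ** p) / (k * p)

# Hypothetical values: sizes in m, stress range in MPa, C in matching units.
params = dict(a0=1e-3, a_crit=1e-2, C=1e-11, m=3.0, delta_sigma=100.0)
n_euler = cycles_to_critical(**params)
n_exact = cycles_closed_form(**params)
```

In a particle filter, a relation of this kind would serve as the state transition model, with C and m among the uncertain parameters; that is why the text notes that the material property parameters must be properly tuned.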

III. OIL COOLER BEARING PROGNOSTIC CASE STUDY

A. Experimental Test-Rig

In order to provide some initial evaluation of the rolling element prediction approach, an experimental test-rig was constructed by Impact Technologies LLC for run-to-failure testing of an oil-cooler helicopter bearing. The overall configuration of the test-rig is provided in Fig. 3. The test-rig consisted of two test cells, each driven by a 10-hp industrial motor. A gearbox in each test cell provided a speed multiplication of 3; the motor shaft was usually run at a nominal speed of 25 Hz, so the bearing shaft was driven at a nominal rotational speed of 75 Hz. The test-rig also included a load cell to measure the axial load and pneumatic regulators to monitor the radial load. The bearings were instrumented with accelerometers in both the axial and radial directions on the bearing housing, and a tachometer pulse was also recorded in order to measure the bearing shaft speed. The data were acquired using a National Instruments-based PXI system, and the accelerometer data were sampled at a rate of 102.4 kHz. Each data file was recorded for a sample length of 102,400 samples; this corresponds to a duration of 1 second per data acquisition file.
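As a quick consistency check on the quoted acquisition settings, the relations below reproduce the bearing shaft speed and record duration from the stated values. The frequency resolution Δf and the number of shaft revolutions per file are derived here and are not stated in the source.

$$
f_{\text{bearing}} = 3 \times 25\ \text{Hz} = 75\ \text{Hz}, \qquad
T = \frac{N_s}{f_s} = \frac{102{,}400}{102{,}400\ \text{Hz}} = 1\ \text{s}, \qquad
\Delta f = \frac{1}{T} = 1\ \text{Hz}, \qquad
f_{\text{bearing}}\, T = 75\ \text{revolutions per file}
$$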

To accelerate the time for incipient spall damage, bearings with different levels of corrosion were tested. The corrosion was performed by IMR test labs and corrosion was placed on the inner and outer raceways of the bearing. Fig. 4 shows corrosion levels 0–3 from left to right, in which a level 0 represents a nominal new bearing without any corrosion and level 3 would indicate the most severe amount of induced corrosion. The results presented for this particular study are from the run-to-failure testing of two bearings with a level 2 corrosion label. Both these bearings resulted in a spall in the inner race and an unacceptable level of vibration at the stoppage of each test.

B. Health Assessment Results and Detection Threshold Setting

In a previous study on diagnostics and health assessment for oil-cooler bearings, a health assessment model based on a self-organizing map distance metric was used to characterize the bearing health value over time [16]. The health model used the top 20 features, selected using a correlation metric with operating hours after a spall was observed. Considering that spall size information was not available during the progression test, a distance-based health assessment method was an appropriate choice. The results from this health assessment method are presented in Fig. 5 and Fig. 6 for each bearing. Bearing #1 displays a health trend that has a noticeable increase during spall initiation and a clear upward trend thereafter. The health results for Bearing #2 also exhibit an increasing trend after a spall is observed; the health assessment results are encouraging and appear suitable for this application.

Figure 3. Oil cooler bearing test-rig

Figure 4. Oil cooler bearings with corrosion levels

Figure 5. Bearing #1 health trend (estimated health value versus test time in minutes)

Page 6: [IEEE 2011 IEEE Conference on Prognostics and Health Management (PHM) - Denver, CO, USA (2011.06.20-2011.06.23)] 2011 IEEE Conference on Prognostics and Health Management - Methodology

Defining an anomaly detection threshold involves a compromise between late detection of a bearing spall or damage and the potential for false alarms. To establish an appropriate detection threshold, a receiver operating characteristic (ROC) curve is used, which is a common way of displaying the true positive and false positive rates as a given threshold is varied. The ROC curve was generated using data from both bearing data sets and is shown in Fig. 7.

To determine the optimum location on the ROC curve, the approach described by Zweig and Campbell [28] is used. This consists of first calculating the slope, as shown in (1), in which P is the number of instances of the true event and N is the number of instances of the false event. For this application, P is the number of samples after a bearing spall was observed and, accordingly, N is the number of samples before a spall was observed; the costs of a false positive and a false negative are assumed to be equal.

$$m = \left(\frac{\text{false-positive cost}}{\text{false-negative cost}}\right)\left(\frac{N}{P}\right) \qquad (1)$$

A line with slope m is then placed at the upper-left corner of the ROC plot, at (0,1), and moved down and to the right until it first intercepts a point on the ROC curve. The intersection point defines the optimum point on the ROC curve and specifies the threshold that should be used, along with the corresponding false positive and true positive rates. Applying this technique resulted in a calculated slope value (m) of 1.79; the line intersected the ROC curve at a true positive rate of 95.76% and a false positive rate of 0.015%, which corresponds to a detection health threshold of 3.06.
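The sliding-line construction can be written compactly: a line of slope m through a point on the ROC curve has intercept TPR - m·FPR, so the first point touched when moving down from the upper-left corner is the one that maximizes that intercept. The sketch below assumes an alarm is raised when the health value exceeds the threshold and that each observed health value is a candidate threshold; the function name and return layout are illustrative.

```python
import numpy as np


def optimal_roc_threshold(health, is_fault, cost_fp=1.0, cost_fn=1.0):
    """Pick the threshold where a line of slope m first touches the ROC curve.

    health:   health values, one per sample
    is_fault: True for samples recorded after a spall was observed
    Returns (threshold, m, false_positive_rate, true_positive_rate).
    """
    health = np.asarray(health, dtype=float)
    is_fault = np.asarray(is_fault, dtype=bool)
    n_pos = is_fault.sum()       # P in (1)
    n_neg = (~is_fault).sum()    # N in (1)
    m = (cost_fp / cost_fn) * (n_neg / n_pos)

    thresholds = np.unique(health)  # candidate thresholds, sorted ascending
    tpr = np.array([(health[is_fault] > t).mean() for t in thresholds])
    fpr = np.array([(health[~is_fault] > t).mean() for t in thresholds])

    # Intercept of the slope-m line through each ROC point; the largest wins.
    best = int(np.argmax(tpr - m * fpr))
    return thresholds[best], m, fpr[best], tpr[best]


# Small synthetic example (not the bearing data).
values = [1, 2, 3, 4, 5, 6, 7, 8]
labels = [False] * 5 + [True] * 3
print(optimal_roc_threshold(values, labels))
```

With equal costs, maximizing TPR - (N/P)·FPR is the same as maximizing the number of true positives minus the number of false positives, which is equivalent to minimizing the total count of misclassified samples.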

Figure 6. Bearing #2 health trend (estimated health value versus test time in minutes)

Figure 7. ROC curve for anomaly detection, bearing data sets 1 and 2 (true positive rate versus false positive rate)

Figure 8. Bearing #1 health trend and detection threshold

Figure 9. Bearing #2 health trend and detection threshold

Plots of the health trend values with this detection threshold are shown in Fig. 8 and Fig. 9 for bearing data sets 1 and 2. The health values in blue represent the bearing health prior to a visually observed spall, while the health values in black represent the bearing health after a spall was visually observed. The threshold based on the ROC curve is shown in red, and it appears to be suitable for detecting whether a bearing has a spall or is healthy. For Bearing #1, the threshold does not result in any false alarms or late detection. For Bearing #2, however, one sample was above the threshold prior to a spall being observed, and a handful of samples were below the threshold after a spall was observed. There is always some tradeoff when setting a threshold, and this particular threshold appears suitable for detecting the bearing spall and triggering the prediction algorithm.

C. Regression Based Prediction Method

Considering that ground-truth information regarding spall size was not available and that only a limited number of run-to-failure data sets existed, a regression-based prediction method was a suitable choice. Within the realm of regression-based prediction algorithms, there are options that can be adjusted or selected to better fit the data set and application at hand. Given the noise in the feature and health values over time and the potential for outliers, there was a concern that outlier samples would significantly skew an ordinary least-squares regression fit away from the true trend in the data. A robust regression fitting method that weights the residuals is known to be less susceptible to outliers and would be more likely to capture the true trend in the health value over time.

There are different options for weighting the residuals when doing a robust regression, and an Andrews’ weighting function is used for this particular application [30]. The weighting function is shown in (2), in which u is a residual normalized by its leverage value and mean absolute error term, and A is a constant that is typically specified at 1.339. Residuals that are large are given little if any weight; this reduces the effect of a few outliers significantly changing the regression curve fit.

$$w(u) = \begin{cases} \dfrac{\sin(u/A)}{u/A}, & |u| \le \pi A \\ 0, & |u| > \pi A \end{cases} \qquad (2)$$
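For reference, the weighting function in (2) can be evaluated directly. The sketch below is a minimal NumPy version; the function name is illustrative, and the default tuning constant is the value of A quoted in the text.

```python
import numpy as np


def andrews_weight(u, A=1.339):
    """Andrews weight from (2): sin(u/A)/(u/A) for |u| <= pi*A, otherwise 0."""
    u = np.asarray(u, dtype=float)
    # np.sinc(x) is sin(pi*x)/(pi*x), so sinc(u/(pi*A)) equals sin(u/A)/(u/A)
    # and evaluates to 1 at u = 0 without a division by zero.
    inside = np.sinc(u / (np.pi * A))
    return np.where(np.abs(u) <= np.pi * A, inside, 0.0)


print(andrews_weight([0.0, 1.0, 2.0, 4.0, 10.0]))
```

Small normalized residuals receive a weight close to 1, the weight falls smoothly toward 0 as |u| approaches πA (about 4.21 for A = 1.339), and anything beyond that is ignored entirely.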

For this application, the advantages of a robust regression compared to a least squares regression can be seen in Fig. 10. The least squares curve fit indicates a much more rapid increase in the health value. The robust curve fit, however, is less influenced by the few high outlier health values and fits a regression curve that is more indicative of the degradation trend. In this case, the prediction method would indicate a conservative estimate, with a predicted life less than the actual life; however, the least squares method would have significantly more error in its remaining life prediction.
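The contrast between the two fits can be reproduced with a short iteratively reweighted least squares (IRLS) sketch using the Andrews weights. This is an illustration under stated assumptions, not the routine used in the study: residuals are adjusted by their leverage and scaled by the median absolute residual divided by 0.6745 (a common robust scale choice, assumed here), a cubic trend is fitted, and the time axis is assumed to be scaled to roughly unit range for numerical conditioning. All function names are hypothetical.

```python
import numpy as np


def andrews_weight(u, A=1.339):
    """Andrews weight: sin(u/A)/(u/A) for |u| <= pi*A, otherwise 0."""
    u = np.asarray(u, dtype=float)
    return np.where(np.abs(u) <= np.pi * A, np.sinc(u / (np.pi * A)), 0.0)


def robust_cubic_fit(x, y, A=1.339, max_iter=50, tol=1e-8):
    """Fit a cubic by IRLS with Andrews weights.

    Returns (robust_coef, ols_coef), highest power first (np.polyval order).
    """
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    X = np.vander(x, 4)  # columns: x^3, x^2, x, 1

    # Leverage values are the diagonal of the hat matrix, obtained from a QR factor.
    Q, _ = np.linalg.qr(X)
    leverage = np.minimum(np.sum(Q ** 2, axis=1), 0.9999)
    adjust = np.sqrt(1.0 - leverage)

    ols_coef = np.linalg.lstsq(X, y, rcond=None)[0]
    coef = ols_coef.copy()
    for _ in range(max_iter):
        resid = (y - X @ coef) / adjust
        # Robust scale estimate (assumption: median absolute residual / 0.6745).
        scale = max(np.median(np.abs(resid)) / 0.6745, 1e-12)
        sqrt_w = np.sqrt(andrews_weight(resid / scale, A))
        new_coef = np.linalg.lstsq(X * sqrt_w[:, None], y * sqrt_w, rcond=None)[0]
        converged = np.max(np.abs(new_coef - coef)) <= tol * max(1.0, np.max(np.abs(coef)))
        coef = new_coef
        if converged:
            break
    return coef, ols_coef
```

On a synthetic cubic trend with a few large positive outliers late in the record, the ordinary least squares coefficients are pulled upward by the outliers, while the reweighted fit assigns them zero weight after the first pass and tracks the underlying trend.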

After the detection threshold is exceeded, the prediction routine is started and the curve fit is performed at each time instant. It is not feasible to show the regression fit for every time instant, so the curve fit is shown for only one particular instant. The regression fits shown in Fig. 11 and Fig. 12 are for the two bearing data sets immediately prior to the stopping of each test.

Figure 10. Comparing least squares and robust regression curve fit (Bearing #1 progression data; health value versus sample number)

Figure 11. Bearing #1 health trend and regression fit (health value, curve fit, upper and lower bounds, and threshold versus test time in minutes)

Figure 12. Bearing #2 health trend and regression fit (health value, curve fit, upper and lower bounds, and threshold versus test time in minutes)

A cubic regression curve was used for both data sets and appears very appropriate for Bearing #1. For Bearing #2, the health value has much more variation and noise. Nevertheless, the regression curve captures the general increasing trend in the health value over time, even though the fit is not quite as good as the one obtained for the first data set. A failure threshold of 10 was used, since this represented the end health value obtained for both data sets at the stoppage of each test. Note that in doing the regression fitting, upper and lower confidence bounds are calculated at each prediction point. The regression fit and the upper and lower bounds are extrapolated until they reach the failure threshold; this defines the mean, upper bound, and lower bound estimates of the remaining life.
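The extrapolation step amounts to finding the first future time at which the fitted polynomial crosses the failure threshold. A minimal sketch follows; the function name is hypothetical, the coefficients are assumed to be in `np.polyval` order (highest power first), and the default threshold is the failure threshold of 10 quoted in the text.

```python
import numpy as np


def remaining_life(coef, t_now, failure_threshold=10.0):
    """Time from t_now until the fitted polynomial first reaches the failure threshold.

    Returns None when the extrapolated curve never reaches the threshold after t_now.
    """
    shifted = np.array(coef, dtype=float)
    shifted[-1] -= failure_threshold  # roots of p(t) - threshold = 0
    roots = np.atleast_1d(np.roots(shifted))
    # Keep roots that are real to within a relative tolerance.
    is_real = np.abs(roots.imag) <= 1e-9 * np.maximum(1.0, np.abs(roots))
    real_roots = roots[is_real].real
    future = real_roots[real_roots >= t_now]
    if future.size == 0:
        return None
    return float(future.min() - t_now)


# Illustrative cubic trend that reaches a health value of 10 at t = 1000 minutes.
print(remaining_life([1e-8, 0.0, 0.0, 0.0], t_now=800.0))
```

If the upper and lower confidence bounds are also represented as polynomials in time (an assumption here), the same function can be applied to each of them to obtain the corresponding bounds on the remaining life estimate.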

D. Prediction Results

The prediction results for Bearing #1 are shown in Fig. 13. The remaining life prediction was performed at each time step after an anomaly was detected and the prediction routine was initiated. This allows the actual remaining life and the estimated remaining life to be compared at many time instances up until the stopping of the bearing test cell. The predicted and actual remaining life appear to converge with 200 minutes remaining in the test. The actual remaining life is within the upper and lower bounds, with the exception of the end of the test, when the prediction algorithm provides a more conservative estimate. However, the conservative estimate is on the order of 30–50 minutes early and is still close to the actual remaining life. Although the predicted and actual remaining life are quite close, the results only converge with 200 minutes remaining in the bearing's operational life, which is not a large window in which to take corrective maintenance action. However, this was an accelerated run-to-failure test in which the oil-cooler bearing failure occurred after only 1200 minutes of operating time. There would likely be a greater maintenance time window for a helicopter in the fleet.
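One way to make the notion of convergence concrete is to compute the actual remaining life as the test end time minus the current time, and then find the point after which the prediction error stays within a chosen tolerance. The sketch below does this; the 50-minute tolerance and the function name are assumptions introduced for illustration and are not a criterion defined in the study.

```python
import numpy as np


def convergence_point(times, predicted_rul, t_end, tol=50.0):
    """Actual remaining life at which predictions start staying within tol of the truth.

    times:         prediction times (same units as t_end)
    predicted_rul: predicted remaining life at each time
    t_end:         time at which the test was stopped
    Returns None if the final prediction is still outside the tolerance.
    """
    times = np.asarray(times, dtype=float)
    predicted_rul = np.asarray(predicted_rul, dtype=float)
    actual_rul = t_end - times
    outside = np.nonzero(np.abs(predicted_rul - actual_rul) > tol)[0]
    start = 0 if outside.size == 0 else int(outside[-1]) + 1
    if start >= times.size:
        return None
    return float(actual_rul[start])


# Synthetic example: predictions settle once 300 minutes remain.
print(convergence_point([0, 100, 200, 300, 400], [900, 100, 310, 190, 105], t_end=500))
```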

The prediction results for Bearing #2 are provided in Fig. 14, in which the actual and predicted remaining life converge with approximately 500 minutes remaining in the bearing life. There is more variation in the remaining life estimate, which reflects the regression curve being fitted to a noisier health value over time. Even so, the actual and predicted remaining life estimates are quite close, indicating that this particular method has potential for predicting the remaining life of helicopter bearings.

IV. CONCLUSIONS AND FUTURE WORK

The paper highlighted the various modules in the prediction framework, including feature extraction and selection, health assessment, anomaly detection, and remaining life prediction. Understanding the potential advantages and disadvantages of each available technique for the different modules provides a guide on which processing tool to select for a given rolling element bearing application. For this particular oil-cooler helicopter bearing prognostic application, the envelope and time-domain vibration features proved to be the most useful. A feature selection routine that ranked the features based on a correlation metric was an appropriate feature selection method for this application, and the selected features were input into a distance-based health assessment routine. These processing steps resulted in a clear increasing health trend and a similar end health value for both run-to-failure data sets considered in this study. A robust regression based curve fitting method was used for predicting the remaining useful life of the bearing at different time instances. The actual and predicted remaining life converged with 200 and 500 minutes remaining, respectively, for the two oil-cooler bearing run-to-failure data sets. The remaining life prediction results were encouraging; benchmarking against other prediction algorithms is being considered for future work.

ACKNOWLEDGMENT

This work was funded under contract number W911NF-07-2-0075; the experimental testing and data collection were performed by Impact Technologies LLC. We would like to acknowledge Romano Patrick, Carl Byington, and other members of the Impact team for their collaboration in this research effort.

Figure 13. Remaining life prediction results for Bearing #1 (estimated RUL with curve fit, RUL upper and lower bounds, and actual RUL, in minutes, versus test time in minutes)

Figure 14. Remaining life prediction results for Bearing #2 (estimated RUL with curve fit, RUL upper and lower bounds, and actual RUL, in minutes, versus test time in minutes)

REFERENCES

[1] E. Bechhoefer, “A method for generalized prognostics of a component using Paris Law,” Proceedings of the American Helicopter Society 64th Annual Forum, Montreal, Canada, May 2008.

[2] M.E. Orchard, “A particle filtering-based framework for on-line fault diagnosis and failure prognosis,” PhD Dissertation, Georgia Institute of Technology, Atlanta, Georgia, 2007.

[3] B. Saha and K. Goebel, “Uncertainty management for diagnostics and prognostics of batteries using Bayesian Techniques,” Proceedings of the IEEE Aerospace Conference, Big Sky, MT, 2008.

[4] B. Zhang, C. Sconyers, R. Patrick, and G. Vachtsevanos, “A multi-fault modeling approach for fault diagnosis and failure prognosis of engineering systems,” Proceedings of the Annual Conference of the Prognostics and Health Management Society, San Diego, CA, 2009.

[5] T. Wang, J. Yu, D. Siegel, and J. Lee, “A similarity-based prognostics approach for remaining useful life estimation of engineered systems,” Proceedings of the 2008 Annual Conference of the Prognostics and Health Management Society, Denver, CO, 2008.

[6] R. Sekhon, H. Bassily, and J. Wagner, “A comparison of two trending strategies for gas turbine performance prediction,” Journal of Engineering for Gas Turbines and Power, vol. 130, pp. 1-10, July 2008.

[7] R. Li, J. Ma, A. Panyala, and D. He, “Hybrid ceramic bearing prognostics using particle filtering,” Proceedings of the Society for Machinery Failure Prevention Technology, Huntsville, AL, April 2010.

[8] K. Goebel and N. Eklund, “Prognostic fusion for uncertainty reduction,” Proceedings of the AIAA Infotech@Aerospace Conference, Reston, VA, 2007.

[9] P. Vecer, M. Kreidl, R. Smid, “Condition indicators for gearbox condition monitoring systems,” Acta Polytechnica, vol. 45, pp. 35-43, 2005.

[10] H. Qiu, H. Luo, N. Eklund, “On-board bearing prognostics in aircraft engine: enveloping analysis or FFT,” Proceedings of the ASME 2009 International Design Engineering Technical Conferences and Computers Information in Engineering, 2009.

[11] P.D. McFadden and J.D. Smith, “Vibration monitoring of rolling element bearings by high frequency resonance technique – a review,” Tribology International, vol. 17, pp. 3-10, 1984.

[12] H.J. Decker and J.J. Zakrajsek, “Comparison of interpolation methods as applied to time synchronous averaging,” Technical Memorandum, NASA/TM – 1999-209086, ARL-TR-1960, Army Research Lab, Cleveland, OH, 1999.

[13] J. Antoni, R.B. Randall, “The spectral kurtosis: application to the vibratory surveillance and diagnostics of rotating machines,” Mechanical Systems and Signal Processing, vol. 20, pp. 308-331, 2006.

[14] H. Ocak, K.A. Loparo, F.M. Discenzo, “Online tracking of bearing wear using wavelet packet decomposition and probabilistic modeling: a method for bearing prognostics,” Journal of Sound and Vibration, vol. 302, pp. 951-961, 2007.

[15] H. Khatri, K. Ranney, K. Tom, and R. Del Rosario, “New features for diagnosis and prognosis of systems based on empirical mode decomposition,” Proceedings of the 2008 Annual Conference of the Prognostics and Health Management Society, Denver, CO, 2008.

[16] D. Siegel, C. Ly, and J. Lee, “Evaluation of vibration based health assessment and diagnostic techniques for helicopter bearing components,” 2011 DSTO International Conference on Health and Usage Monitoring, Melbourne, Australia, in press.

[17] I. Guyon, A. Elisseeff, “An introduction to variable and feature selection,” The Journal of Machine Learning Research, vol. 3, pp. 1157-1182, 2003.

[18] S. Das, “Filters, wrappers and a boosting based hybrid for feature selection,” Proceedings of the Eighteenth International Conference on Machine Learning, pp. 74-81, 2001.

[19] R. Kohavi and G.H. John, “Wrappers for feature subset selection,” Artificial Intelligence, vol. 97, pp. 273-324, 1997.

[20] G.J. Kacprzynski, M.J. Roemer, and R.F. Orsagh, “Assessment of data and knowledge fusion strategies for diagnostics and prognostics,” Society for Machinery Failure Prevention Technology, pp. 341-350, 2001.

[21] X. Wang, U. Kruger, and G.W. Irwin, “Process monitoring approach using fast moving window PCA,” Industrial and Engineering Chemistry Research, vol. 44, pp. 5691-5702, 2005.

[22] M. Cottrell, G. Gaubert, C. Eloy, D. François, G. Hallaux, J. Lacaille, and M. Verleysen, “Fault prediction in aircraft engines using self-organizing maps,” Advances in Self-Organizing Maps, vol. 5629, pp. 37-44, 2009.

[23] H. Qiu, J. Lee, J. Ling, G. Yu, “Robust performance degradation assessment method for enhanced rolling element bearings prognostics,” Journal of Advanced Engineering Informatics, vol. 17, pp. 127-140, 2003.

[24] D. He and E. Bechhoefer, “Development and validation of bearing diagnostic and prognostic tools using HUMS condition indicators,” Proceedings of the IEEE Aerospace Conference, Big Sky, MT, 2008.

[25] D. Tax, A. Ypma, and R. Duin, “Support vector data description applied to machine vibration analysis,” Proceedings of the Fifth Annual Conference of the ASCI, pp. 398-405, 1999.

[26] B. Zhang, C. Sconyers, C.S. Byington, R. Patrick, M.E. Orchard, and G.J. Vachtsevanos, “Anomaly detection: a robust approach for detection of unanticipated faults,” Proceedings of the 2008 Annual Conference of the Prognostics and Health Management Society, Denver, CO, 2008.

[27] K. Goebel, B. Saha, and A. Saxena, “A comparison of three data-driven techniques for prognostics,” Proceedings of the Society for Machinery Failure Prevention Technology, Virginia Beach, VA, April 2010.

[28] M. Zweig and G. Campbell, “Receiver-Operating Characteristics (ROC) plots: a fundamental evaluation tool in clinical medicine,” Clinical Chemistry, vol. 39, pp. 561-577, 1993.

[29] D. Coleman, P. Holland, N. Kaden, V. Klema, and S.C. Peters, “A system of subroutines for iteratively reweighted least squares computations,” ACM Transactions on Mathematical Software, vol. 6, pp. 327-336, 1980.