[IEEE 2011 IEEE AUTOTESTCON - Baltimore, MD, USA (2011.09.12-2011.09.15)] 2011 IEEE AUTOTESTCON - V-22 aircraft flight data mining

NAVAIR Public Release 11-0044 Distribution Statement A – “Approved for public release; distribution is unlimited”

V-22 Aircraft Flight Data Mining

Michael Burger, Naval Air Systems Command, Lakehurst, NJ

Christopher Jaworowski Naval Air Systems


Robert Meseroll Naval Air Systems


Abstract—The Naval Air Systems Command (NAVAIR) produces and supports highly complex aircraft weapons systems which provide advanced capabilities required to defend U.S. freedoms. Supporting said complex systems such as the MV-22/CV-22 aircraft requires being able to troubleshoot and mitigate complex failure modes in dynamic operational environments. Since an aircraft is comprised of multiple systems designed by specialty sub-vendors and subsequently brought together by an aircraft integrator, diagnostics at the aircraft level are usually “good enough” but not capable of 100% fault isolation to a single component. Today’s system components must be highly integrated and are required to communicate via high speed data-bus conduits which require precise synchronization between systems. Failure modes of aircraft are identified via design, analysis and test prior to fielding of the weapon system. However, not all failure modes are typically known at the time of system Initial Operational Capability, but rather are found in the field by maintainers/pilots and then subsequently mitigated with aircraft engineering changes or system replacements. Also, the requirement for increased capabilities can drive the need for new systems to be integrated into an aircraft system that may not have been considered in the initial design and support concept. There is a plethora of maintenance action detail collected by pilots, maintenance officers (MO) and engineers that can and should be used to identify failure mode trends that come to light during the operational phase of an aircraft. New troubleshooting techniques can be developed to address underlying failure modes to increase efficiency of future maintenance actions thus reducing the logistics trail required to support the aircraft. 
The elements available for analysis are maintenance results entered by the MO/pilot (including free-form comments regarding problems and resulting actions), Built-In-Test (BIT) fault codes recorded during a flight, and historical test results from off-aircraft test equipment such as the Consolidated Automated Support System (CASS). The Integrated Support Environment (ISE) is collecting the data required to analyze underlying maintenance trends, which can be identified with specialized data mining software tools, for example by text mining the corrective action and maintainer comment fields of maintenance results. The knowledge extracted from text mining can be correlated back to fault codes recorded during flight and to historical maintenance results to help mitigate broken troubleshooting procedures that cause headaches for our Sailors and Marines in the field. By tagging key phrases from the maintainer’s/pilot’s remarks, insight can be gained into how the aircraft fails in vigorous environments. The premise of this research is to first choose an avionics system on the V-22 aircraft that appears to have a high failure rate, in that it is frequently removed from the aircraft but subsequently found to be fully operational when tested on CASS. The results of this analysis should present potential root causes for “Cannot Duplicate” situations and recommend augmenting diagnostics at the aircraft level to avoid removing and replacing a system that has not failed even though the aircraft diagnostics reported it as bad. This research will utilize the Net-Centric Diagnostics Framework (NCDF) to retrieve past Smart Test Program Set (TPS) results/BIT sequence strings as a variable for identifying trends in V-22 aircraft maintenance actions. The results of the research will be shared with the V-22 avionics Fleet Support Team and the Comprehensive Automated Maintenance Environment Optimized (CAMEO) team for validation of findings before any troubleshooting changes are recommended. If required, the Integrated Diagnostics and Automated Test Systems group will perform an engineering analysis of the problem and suggest an enhanced diagnostic technique to mitigate the issue.

Keywords-V-22, data-mining, text-mining, maintenance actions, diagnostics

I. INTRODUCTION

The Navy V-22 aircraft platform continues to be burdened by weapon replaceable assemblies (WRAs) that malfunction during operation, causing the organizational maintenance personnel to remove the WRA from the aircraft. A considerable number of removed WRAs reach the intermediate maintenance level Consolidated Automated Support System (CASS) station, where CASS is unable to reproduce the malfunction error codes reported at the organizational level (referred to as the O-level). The ISE team at NAVAIR Lakehurst is investigating the use of text mining algorithms to identify the root causes of non-reproducible aircraft WRA malfunction errors. The root causes of these failures and the appropriate maintenance actions will be shared with the O-level maintenance personnel, allowing the work center maintainers to make the necessary repairs or to pass the information down to the next maintenance level via NCDF.

II. RESEARCH AND DEVELOPMENT

The research effort began with collecting the V-22 aircraft platform dataset. The two data sources of interest are the Decision Knowledge Programming for Logistics Analysis and Technical Evaluation (DECKPLATE) system and the V-22 Fleet Support Team (FST) website. DECKPLATE is a Naval Aviation Logistics Data Analysis (NALDA) data warehouse that stores aircraft maintenance data joined together from various data sources, including Optimized Organizational Maintenance Activity (OOMA) [1]. The V-22 FST website contains the flight files produced by the V-22 aircraft onboard computer. These flight files consist of the data stored on the data transfer module (DTM), which the pilot removes post-flight and downloads onto the Aircraft Maintenance Ground Station (AMEGS) [2].

U.S. Government work not protected by U.S. copyright

The target date range for this research effort is fiscal years 2008 through 2011. This date range was selected to produce an ample amount of maintenance records for the selected WRA as well as to include the accompanying WRA maintenance actions that could not be duplicated at the I-level. A larger dataset range spanning earlier years was not included as it may have potentially contained maintenance issues that have already been resolved by engineering change proposals (ECPs) or addressed by the Updated Integrated Electronic Technical Manual (IETM). Performing this research effort on a WRA that has already been or is in the process of being overhauled could potentially render this investigation prematurely obsolete. Therefore in an effort to maximize the value added to the fleet, this smaller date range was chosen.

Deficiencies or fault codes that the O-level maintainer fails to diagnose or correct result in the removal of the suspected WRA or WRAs from the aircraft. These O-level WRA removal maintenance actions are retrieved from DECKPLATE. Once a WRA is removed from an aircraft and the O-level maintainer does not have the capability to resolve the issue, the component is passed to the next level of maintenance, the intermediate level (referred to as the I-level). The I-level maintainer employs the CASS station test program set (TPS) to identify the WRA deficiency. If the CASS station cannot duplicate the O-level deficiency, the I-level maintainer records malfunction code “799”, defined as “NO DEFECT – malfunction could not be duplicated, item checks good”. These I-level “799” maintenance records are retrieved from DECKPLATE [3]. A key point to note is that the maintainer chooses the O- and I-level malfunction error codes: there are no completely impartial criteria or automated systems that assign them. As such, the codes are susceptible to human error. The O- and I-level maintenance datasets are fused together by their common job control number (JCN), aircraft-id (also known as the bureau number, BUNO), and WRA part serial number, as seen in Fig. 1. The JCN is a unique alphanumeric identifier assigned to a maintenance action and remains the same between maintenance levels. The aircraft-id is a unique identification number assigned to each V-22 aircraft [4].
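The fusion of the O- and I-level records on their shared keys can be sketched as a simple keyed join. The following is a minimal illustration only, not the ISE team's implementation: the field names (jcn, buno, serial_no, mal_code, narrative), the sample records, and the non-799 code "070" are hypothetical placeholders, not DECKPLATE column names or real malfunction codes.

```python
from typing import Dict, List, Tuple

Key = Tuple[str, str, str]  # (JCN, BUNO, WRA serial number)


def fuse_maintenance_records(o_level: List[dict], i_level: List[dict]) -> List[dict]:
    """Inner-join O-level removals to I-level "799" records on the shared key."""
    # Index only the I-level records that carry malfunction code 799.
    i_index: Dict[Key, dict] = {}
    for rec in i_level:
        if rec["mal_code"] == "799":
            i_index[(rec["jcn"], rec["buno"], rec["serial_no"])] = rec

    fused = []
    for rec in o_level:
        key = (rec["jcn"], rec["buno"], rec["serial_no"])
        match = i_index.get(key)
        if match is not None:
            fused.append({
                "jcn": key[0],
                "buno": key[1],
                "serial_no": key[2],
                "o_narrative": rec["narrative"],
                "i_narrative": match["narrative"],
            })
    return fused


# Hypothetical sample data for illustration only.
o_level = [
    {"jcn": "A01", "buno": "100001", "serial_no": "S1", "narrative": "rc fail in flight"},
    {"jcn": "A02", "buno": "100002", "serial_no": "S2", "narrative": "conv fault"},
]
i_level = [
    {"jcn": "A01", "buno": "100001", "serial_no": "S1", "mal_code": "799", "narrative": "checks good"},
    {"jcn": "A02", "buno": "100002", "serial_no": "S2", "mal_code": "070", "narrative": "repaired"},
]
fused = fuse_maintenance_records(o_level, i_level)
```

Only the removal whose I-level record carries code 799 survives the join, which mirrors the "could not be duplicated" subset the analysis targets.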

The DECKPLATE dataset is filtered to contain only those WRAs that are among the top five with respect to removals that cannot be duplicated at the I-level. This dataset indicates that the Flight Control Computer (FCC) WRA is the most erroneously removed WRA, as seen in Fig. 2. However, given the extensive NAVAIR resources and planned ECPs already focused on this WRA, the ISE team decided to focus on the next highest WRA, the Regulated Converters (RC). The RC WRA serves a critical function in the V-22 aircraft electrical system by converting the 115 VAC from the onboard AC generators to 28 VDC, powering components connected to the DC bus. Improvements to the maintenance profile of the RC WRA would significantly benefit the fleet. The RC WRA is depicted in Fig. 3 [5].

Figure 1. DataSet Collection Flow Chart

[Figure 1 labels: the DECKPLATE O-level removed part number (Rmvd PartNo) records and the DECKPLATE I-level 799 malfunction code records are combined on Buno, JCN, and PartNo into the DECKPLATE Data Set, which is combined with the V-22 Flight File Data on Buno and DateTime into the final Data Set.]

Figure 2. FY11-FY08 WRA Removals

Figure 3. RC WRA

The V-22 flight files corresponding to the DECKPLATE dataset were extracted from the V-22 FST website. These files are listed by their aircraft-id and date/time information. Unfortunately, there is no automated data feed between the V-22 FST website and the ISE database, making the retrieval of the flight files a tedious, manually intensive effort. Given the time constraints on this research effort, care was taken to select only the flight files that were necessary to conduct our text mining analysis. This limitation may hamper our team’s ability to identify additional outlying text mining patterns or trends of interest. To merge the flight files with the DECKPLATE dataset, an automated script was developed to preprocess the files and correct clearly erroneous data, such as a flight file identified as containing data for a given aircraft-id that contains records in which the aircraft-id is zeroed out. Once the files were cleaned, the V-22 flight files were fused to the DECKPLATE dataset as seen in Fig. 1. The two datasets were joined on their common aircraft-ids and date/time stamps. A maintenance record from the DECKPLATE dataset was considered to match flight file records when their date/time stamps fell within a window of 7 days or less.
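The cleaning and windowed join described above can be sketched as follows. This is a minimal illustration under stated assumptions: the record fields and sample values are hypothetical, records with a zeroed-out aircraft-id are simply dropped (the paper does not say whether they were dropped or repaired), and the 7-day window is applied as an absolute difference because the paper does not state its direction.

```python
from datetime import datetime, timedelta

WINDOW = timedelta(days=7)  # window size taken from the text


def clean_flight_file(records, file_buno):
    """Keep only records whose aircraft-id agrees with the id the file is listed under.

    Zeroed-out ids (e.g. "000000") fail this check and are dropped (assumption).
    """
    return [rec for rec in records if rec["buno"] == file_buno]


def match_within_window(maintenance, flights, window=WINDOW):
    """Pair each maintenance record with flight records for the same aircraft
    whose timestamps differ by no more than the window (absolute difference)."""
    pairs = []
    for m in maintenance:
        for f in flights:
            if m["buno"] == f["buno"] and abs(m["date"] - f["date"]) <= window:
                pairs.append((m["jcn"], f["fault_id"]))
    return pairs


# Hypothetical sample data for illustration only.
raw_flights = [
    {"buno": "100001", "date": datetime(2010, 1, 3), "fault_id": 32},
    {"buno": "000000", "date": datetime(2010, 1, 4), "fault_id": 33},   # zeroed-out id
    {"buno": "100001", "date": datetime(2010, 1, 20), "fault_id": 153},  # outside window
]
maintenance = [{"jcn": "A01", "buno": "100001", "date": datetime(2010, 1, 5)}]

flights = clean_flight_file(raw_flights, "100001")
matches = match_within_window(maintenance, flights)
```

The nested loop is quadratic and is used here only for readability; on a dataset of realistic size, an index keyed by aircraft-id would be the more practical choice.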

The next step was to employ the IBM SPSS Modeler text analytics tool to analyze the resultant dataset. IBM SPSS Modeler is an industry-leading data and text mining/analytics toolset. With the aid of this tool, a text-mining algorithm was applied to the O- and I-level maintenance action descriptive narrative data elements. These fields are where the maintainer records his/her observations and the work performed. The text-mining algorithm extracts recurring text patterns and groups them into concepts by context. It then treats each concept as its own mathematical entity, giving mathematical structure to the free-form text written by the maintainers and making it possible to analyze that text statistically. The algorithm tracks how often each concept appears in the dataset as well as the records in which it occurs, enabling the ISE team to identify not only which concepts correspond to other variables in the dataset (such as fault-id), but also which concepts frequently co-occur with each other. The first major relationship examined was the strength of the relationship between the flight file fault-id and the concepts occurring in the descriptive narrative fields. These data points are best visualized as a web graph, as discussed in the next section. In addition to capturing the strength of the text concept to fault-id relationship, we create an association model tracking the co-occurrences of each major concept identified by the text mining. Another association model will be employed on the fault-ids to determine patterns of fault-ids that co-occur during a flight. (Multiple faults often occur in a single flight. It is reasonable to assume that they could be related, with a failure in one system triggering another failure in the same or a related system.) If time permits, an association model will also be applied to the fault-ids recorded before and after maintenance actions are completed.
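The counting step behind the concept to fault-id and concept to concept relationships can be illustrated with a small sketch. It is not the IBM SPSS Modeler algorithm: concept extraction is replaced here by matching tokens against a hypothetical fixed vocabulary (CONCEPTS), and the records are invented for illustration.

```python
from collections import Counter
from itertools import combinations

# Hypothetical concept vocabulary; the real tool derives concepts from the text.
CONCEPTS = {"flight", "conv", "niu"}


def extract_concepts(narrative):
    """Return the set of known concepts mentioned in a free-form narrative."""
    tokens = narrative.lower().replace(",", " ").replace(".", " ").split()
    return {t for t in tokens if t in CONCEPTS}


def cooccurrence(records):
    """Count (fault_id, concept) pairs and (concept, concept) pairs per record."""
    fault_concept = Counter()
    concept_pairs = Counter()
    for rec in records:
        concepts = extract_concepts(rec["narrative"])
        for c in concepts:
            fault_concept[(rec["fault_id"], c)] += 1
        # Sort so each unordered concept pair is counted under one key.
        for a, b in combinations(sorted(concepts), 2):
            concept_pairs[(a, b)] += 1
    return fault_concept, concept_pairs


# Hypothetical sample records for illustration only.
records = [
    {"fault_id": 32, "narrative": "RC failed in flight, conv replaced."},
    {"fault_id": 32, "narrative": "Fault during flight."},
    {"fault_id": 153, "narrative": "NIU and conv checked."},
]
fault_concept, concept_pairs = cooccurrence(records)
```

In a web graph, each count in fault_concept would set the thickness of the line drawn between that fault-id and that concept.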

III. TEXT AND DATA MINING RESULTS

Following the approach described in the prior section, the text-mining algorithm is applied first to the O-level descriptive narrative data elements, resulting in 151 extracted text concepts. The relationship of the O-level descriptive narrative text concepts to the flight file fault-ids is depicted in the initial web graph in Fig. 4. The thickness of the blue line between a text concept and a fault-id is proportional to their joint occurrence within a given dataset record. In other words, the more often the text concept in question and a fault-id occur together in a dataset record, the thicker the line. To clarify the O-level results and focus on the strongest observed relationships, the web graph is filtered to display only those strongest relationships, as seen in Fig. 5. In Figs. 4 and 5, the fault-ids are the numbers on the top, left, and bottom sides of the graph and the extracted text concepts are on the right. Because the actual fault-id hexadecimal numbers and their corresponding definitions could not be disclosed as public release information given the proprietary nature of the data, they were replaced with the numbers seen in the web graphs in Figs. 4 and 5. An example of how the fault-id numbers would map to their corresponding display messages is given in Table 1. For the same reason that the hexadecimal number data could not be released, a detailed analysis of the actual results cannot be shared in this paper. From a high-level perspective, fault-id numbers 32 and 33 have the strongest relationship to the text phrase “flight.” This is a key relationship to analyze, in addition to the remaining fault-id text phrase relationships.

Figure 4. Initial O-level Text Mining Web Graph

Figure 5. Filtered O-level Text Mining Web Graph

Analogous to the O-level descriptive narrative, the text-mining algorithm was also applied to the I-level descriptive narrative, resulting in 59 text concepts. As is the case with the O-level, the actual hexadecimal numbers could not be released. The relationship between the I-level maintenance action descriptive narratives and the fault-ids is depicted in Fig. 6. The filtered web graph depicting only the strongest relationships is displayed in Fig. 7. In comparison to the O-level text phrase fault-id web graph, the relationships between the I-level text concepts and the fault-ids are considerably weaker. The dataset contains far fewer maintainer-populated I-level descriptive narratives than O-level ones (the I-level narratives consist mostly of a small set of stock responses), resulting in fewer opportunities for meaningful joint text phrase and fault-id occurrences. We observe that the text phrases “niu” and “conv” have strong relationships to fault-id 153. The NIU is the interface that links the mission computer and the non-data bus components [5]. As explained in the prior paragraph, the real fault-ids and a detailed analysis of the data cannot be shared with the public.

An association model was applied to the fault-ids and the O- and I-level descriptive narrative text concepts depicted in the web graphs above. Tables 2 and 3 present the O- and I-level fault-id and text concept association model results. In both tables, the inverse relationship between the support and confidence parameters is evident: the rules with the highest support have the lowest confidence. This implies that many of the fault-ids occur very frequently across the various descriptive narrative text phrases. To improve the confidence of these results, future work is required to include the sub fault-id flight file parameter in the data mining algorithm.
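For readers unfamiliar with the two parameters, the sketch below computes support and confidence for a single rule "fault-id implies concept". It assumes support is measured on the antecedent (the share of records containing the fault-id), which is consistent with rows sharing an antecedent in Table 2 having identical support values; the paper does not state the definition used. The records are hypothetical.

```python
def rule_metrics(records, fault_id, concept):
    """Return (support %, confidence %) for the rule fault_id -> concept.

    support    = share of all records that contain the antecedent fault-id
    confidence = share of those records that also contain the concept
    """
    with_antecedent = [r for r in records if fault_id in r["fault_ids"]]
    with_both = [r for r in with_antecedent if concept in r["concepts"]]
    support = 100.0 * len(with_antecedent) / len(records)
    confidence = (
        100.0 * len(with_both) / len(with_antecedent) if with_antecedent else 0.0
    )
    return support, confidence


# Hypothetical records: fault-id 33 appears in 4 of 5 records,
# but the concept "flight" accompanies it in only 1 of those 4.
records = [
    {"fault_ids": {33}, "concepts": {"flight"}},
    {"fault_ids": {33}, "concepts": {"conv"}},
    {"fault_ids": {33, 32}, "concepts": {"conv"}},
    {"fault_ids": {33}, "concepts": set()},
    {"fault_ids": {153}, "concepts": {"niu"}},
]
support, confidence = rule_metrics(records, 33, "flight")
```

A fault-id that appears in most records regardless of what the maintainer wrote yields high support but spreads its occurrences across many concepts, which lowers the confidence of any single rule.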

Table 1: Sample Fault-ID Display Message

Fault-ID    Display Message
2           Display Message 1
4           Display Message 2
30          Display Message 3
31          Display Message 4
32          Display Message 5
33          Display Message 6
153         Display Message 7
224         Display Message 8

Figure 6. Initial I-level Mining Web Graph

Figure 7. Filtered I-level Text Mining Web Graph

Table 2: O-level Fault-ID Text Concept Association Model Results

Antecedent    Consequent        Support %    Confidence %
33            Concept_flight    2.60         37.3
33            Concept_conv      2.60         23.0
32            Concept_flight    2.55         37.9
32            Concept_conv      2.55         23.0
153           Concept_conv      1.86         32.9
2             Concept_conv      1.29         43.8
31            Concept_conv      0.925        61.9
4             Concept_conv      0.872        65.6
30            Concept_conv      0.872        65.6

The next data mining technique applied to the data was a classification model. A classification model assigns one set of variables in a dataset as predictors and another set of variables as targets. It then attempts to construct a relationship between the predictors and the targets; that is, given a set of values for the predictors, it tries to determine the values of the target variables. A classification model was applied to the V-22 flight file maintenance raw data, which includes the following parameters: outside air temperature (OAT), lateral stick position, nacelle angle, right flaperon position, condition set reset, true air speed, barometric altitude, left flaperon position, longitudinal stick position, rotor rpm, elevator position, rudder position, and magnetic heading. These parameters are used as input fields to the data mining model to predict the known target field of interest, fault-id. Due to the size of the dataset (over 17 thousand complex records) and limited computing resources, the data was partitioned into fiscal years 2008, 2009, 2010, and 2011. The results of this modeling process are displayed in Table 4. As Table 4 shows, fiscal year 2009 has the highest overall predictive model accuracy, 42.6%. With additional computing resources, a more inclusive predictive model could be applied to the complete dataset, potentially improving the target fault-id prediction capability. Employing this predictive model as part of the O-level diagnostic phase may improve efficiency and lead to cost savings when handling the RC WRA.
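The per-fiscal-year evaluation can be sketched as below. The paper does not name the classification algorithm used, so a nearest-centroid classifier stands in purely for illustration; the two features (nacelle angle, true air speed), the toy rows, and the train/test split are hypothetical, and the features are left unscaled for brevity.

```python
from collections import defaultdict


def train_centroids(rows):
    """rows: list of (features, fault_id). Returns fault_id -> mean feature vector."""
    sums = {}
    counts = defaultdict(int)
    for features, label in rows:
        if label not in sums:
            sums[label] = [0.0] * len(features)
        sums[label] = [s + x for s, x in zip(sums[label], features)]
        counts[label] += 1
    return {label: [s / counts[label] for s in vec] for label, vec in sums.items()}


def predict(centroids, features):
    """Return the fault_id whose centroid is closest in squared Euclidean distance."""
    def sq_dist(centroid):
        return sum((a - b) ** 2 for a, b in zip(centroid, features))
    return min(centroids, key=lambda label: sq_dist(centroids[label]))


def accuracy_by_year(train, test):
    """train/test: dict fiscal_year -> rows. Returns fiscal_year -> accuracy %."""
    results = {}
    for year, rows in train.items():
        centroids = train_centroids(rows)
        held_out = test[year]
        correct = sum(predict(centroids, f) == y for f, y in held_out)
        results[year] = 100.0 * correct / len(held_out)
    return results


# Hypothetical toy data: features are (nacelle angle, true air speed).
train = {2009: [((0.0, 200.0), 32), ((5.0, 210.0), 32),
                ((90.0, 20.0), 153), ((85.0, 30.0), 153)]}
test = {2009: [((2.0, 190.0), 32), ((88.0, 25.0), 153),
               ((80.0, 40.0), 32), ((10.0, 180.0), 32)]}
results = accuracy_by_year(train, test)
```

Training one model per fiscal-year partition keeps each run small, at the cost of each model seeing only a fraction of the fault patterns.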

The final part of this effort was an association model applied to the V-22 flight file fault-ids, aggregated by individual flight and partitioned by fiscal year. Unfortunately, the available computing resources could not run the algorithm on the flight file datasets, even when partitioned by fiscal year. In the future, if high performance computing resources become available, this model will be revisited.

Table 3: I-level Fault-ID Text Concept Association Model Results

Antecedent    Consequent      Support %    Confidence %
224           Concept_conv    1.291        23.2
153           Concept_conv    0.925        36.0
153           Concept_niu     0.925        31.7
2             Concept_conv    0.759        38.6

Table 4: Predictive Model Overall Accuracy

Fiscal Year    Overall Accuracy %
2011           35.9
2010           38.4
2009           42.6
2008           37.6

IV. CONCLUSION

The complete findings of the V-22 data mining research will be shared with the avionics FST and CAMEO teams for further discussion and validation. These findings may lead to improved O-level RC WRA diagnostic techniques that correctly identify the faulty component at the O-level rather than needlessly allocating time and resources to running CASS tests at the I-level. This research approach could scale to the top 10 WRA degraders and to additional aircraft, such as the F/A-18 and F-35 platforms. In addition to improving aircraft operational availability, these improved diagnostic techniques can achieve cost avoidances of over 9.9 thousand man-hours and $367 thousand in burdened cost. The ISE Team will continue to pursue access to high performance computing resources to run data mining algorithms on the datasets that proved too complex and large for our workstations. Work to establish automated data feeds from the DECKPLATE system and the V-22 FST website is ongoing and would facilitate additional text and data mining opportunities. The ISE Team would also like to expand the research to include Smart TPS results in the data and text mining techniques.

REFERENCES

[1] L. Clark and A. Gilbert, “DECKPLATE Data Architecture,” in Forum for Logistics Integration, 2011, slide 9.

[2] K. Westervelt, “Transforming A Maintenance Ground Station,” in Proc. Aerospace Conference, 2005, pp. 3723-3731.

[3] COMNAVAIRFORINST 4790.2A, Retrieved 12/2010, from http://www.navair.navy.mil/logistics/4790/index.cfm, pp. E-11-E-18.

[4] Integrated Publishing, Retrieved 5/2011, from http://www.tpub.com/content/aviation/.

[5] J. Werkley, “V-22 Site Activation Familiarization Course,” slide 158.

[6] R. Meseroll, C. Kirkos, R. Shannon, “Data Mining Navy Flight and Maintenance Data to Affect Repair,” in Proc. IEEE AUTOTESTCON, 2007, pp. 476-481.
