Application of the Maximum Entropy Principle to software failure prediction
Wu Ji
Software Engineering Institute
BeiHang University
Agenda
• Introduction
• Problem and focus
• Method and models
• Results
• Conclusions
Introduction
• Failure prediction is one of the key problems for software quality (reliability) estimation.
• Generally, failure prediction can be defined as y = f(x).
– y is the failure-related variable
– x is the foundation on which the prediction works
• As far as we know, x has been set as:
– Software execution time: reliability growth prediction
– Software execution trace: anomaly detection
Introduction (cont.)
• Reliability has long been a major concern for software with high reliability requirements (HRR).
• Reliability engineering is very costly. Reliability testing is seldom done for software without HRR.
• Anomaly detection is usually implemented as a built-in module of software.
Introduction (cont.)
• Generally, all managers are striving for high quality.
• What does a manager really care about in failure prediction?
– Given a usage scenario, will the software survive?
• How to predict software failure from input is still a new problem.
Problem and focus
How to predict failure from software input?
Problem and focus (cont.)
[Figure: execution time line beginning at execution start s. At time t, the left context is what has already been executed, and the failure observation (0/1) is the unknown to be predicted.]
Problem and focus (cont.)
• If we can model the left context (lc), we get the distribution {(lc, fo)}, where fo is the failure observation.
[Diagram: Failure Learning derives the failure law from {(lc, fo)}; Failure Prediction applies the failure law to the software input to produce the failure observation.]
Method and models
• The whole left context is hard to model.
– A probability model: po(y|x)
– x: partial left context, y: failure observation.
• The Maximum Entropy Principle (MEP) is applied to model po(y|x).
$$p_o(y \mid x) = \frac{1}{Z_o(x)} \exp\Big\{ \sum_{1 \le r \le N} \lambda_r f_r(x, y) \Big\}$$

$$Z_o(x) = \sum_{y \in \{0,1\}} \exp\Big\{ \sum_{1 \le r \le N} \lambda_r f_r(x, y) \Big\}$$
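A minimal sketch of how the conditional model po(y|x) could be evaluated for y in {0, 1}, assuming x is a list of input tokens. The two feature functions and their weights are hypothetical and chosen only for illustration; they are not the features used in the experiments.

```python
import math

def maxent_prob(x, weights, features):
    """Return {y: p(y|x)} for y in {0, 1} under a conditional maximum entropy model."""
    # Weighted feature sum for each candidate failure observation y.
    scores = {y: sum(w * f(x, y) for w, f in zip(weights, features)) for y in (0, 1)}
    # Subtract the largest score before exponentiating for numerical stability.
    top = max(scores.values())
    exp_scores = {y: math.exp(s - top) for y, s in scores.items()}
    z = sum(exp_scores.values())  # normalization term Z(x)
    return {y: e / z for y, e in exp_scores.items()}

# Hypothetical features over a partial left context x (assumption, not from the slides).
features = [
    lambda x, y: 1.0 if y == 1 and "open" in x else 0.0,
    lambda x, y: 1.0 if y == 0 and len(x) < 3 else 0.0,
]
weights = [1.2, 0.7]  # illustrative weights, not learned values

probs = maxent_prob(["open", "edit", "save"], weights, features)
print(probs)
```

The two probabilities always sum to 1 because both are divided by the same normalization term.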
Method and models (cont.)
• MEP is a well-known and widely used learning principle:
– Great generalization ability
– Dynamic and open
– Adapts well to data sparseness
Method and models (cont.)
• Structure Viewer: failure cannot be well modeled without modeling the fault.
– Leads to the Structure Model
• Surface Viewer: failure can be well modeled from the input alone and its relations with failures.
– Leads to the Surface Model
Method and models (cont.)
• Surface Model: learns the statistical co-occurrence of the surface information.
• Structure Model: learns the statistical cause-effect (fault-failure) relationship.
Method and models (cont.)
[Figure: the features applied in the surface model. Node labels: SIU-Seg-Ftrs, SIU-Num-Ftrs, Failure-Ftrs, Flr.]
Method and models (cont.)
[Figure: the features applied in the structure model. Node labels: Fault-Ftrs, (Flt -> Flr) Ftrs, Failure-Ftrs, Flr.]
Method and models (cont.)
• Supervised training
• Training data
• Objective: maximize the likelihood function.
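As a rough illustration of the training objective, the sketch below fits the weights of a binary conditional maximum entropy model by plain gradient ascent on the average log-likelihood. The optimizer, the toy training data, and the two feature functions are all assumptions made for this example; the slides do not say which optimization procedure was used.

```python
import math

def cond_probs(x, weights, features):
    """Return [p(0|x), p(1|x)] for the current weights."""
    scores = [sum(w * f(x, y) for w, f in zip(weights, features)) for y in (0, 1)]
    top = max(scores)
    exps = [math.exp(s - top) for s in scores]
    z = sum(exps)
    return [e / z for e in exps]

def log_likelihood(data, weights, features):
    """Conditional log-likelihood of labeled pairs (x, y)."""
    return sum(math.log(cond_probs(x, weights, features)[y]) for x, y in data)

def train(data, features, steps=200, lr=0.5):
    """Gradient ascent: gradient = observed feature count - expected feature count."""
    weights = [0.0] * len(features)
    for _ in range(steps):
        grad = [0.0] * len(features)
        for x, y in data:
            p = cond_probs(x, weights, features)
            for r, f in enumerate(features):
                expected = p[0] * f(x, 0) + p[1] * f(x, 1)
                grad[r] += f(x, y) - expected
        weights = [w + lr * g / len(data) for w, g in zip(weights, grad)]
    return weights

# Hypothetical features and toy supervised data (partial left context, failure observation).
features = [
    lambda x, y: 1.0 if y == 1 and "crash_cmd" in x else 0.0,
    lambda x, y: 1.0 if y == 1 else 0.0,
]
data = [
    (["open", "crash_cmd"], 1),
    (["crash_cmd", "save"], 1),
    (["open", "crash_cmd", "save"], 0),
    (["open", "save"], 0),
    (["edit"], 0),
    (["edit", "save"], 0),
    (["edit", "open"], 1),
]

weights = train(data, features)
print(weights, log_likelihood(data, weights, features))
```

Because the log-likelihood of this model family is concave in the weights, a small enough step size increases it at every iteration; practical implementations typically use faster optimizers.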
Method and models (cont.)
• Model evaluation:
– For a given test case:
• The test engineer runs it and obtains the test_fo_sequence;
• The prediction model returns the predicted pred_fo_sequence.
– Evaluate by the degree of match (precision) between test_fo_sequence and pred_fo_sequence.
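One possible reading of the match degree is the fraction of positions at which the two sequences agree. The sketch below assumes that definition and assumes both sequences are equal-length lists of 0/1 failure observations; the slides do not give the exact formula, and the function name `match_precision` is hypothetical.

```python
def match_precision(test_fo_sequence, pred_fo_sequence):
    """Fraction of positions where observed and predicted failure observations agree."""
    if len(test_fo_sequence) != len(pred_fo_sequence):
        raise ValueError("sequences must have the same length")
    if not test_fo_sequence:
        raise ValueError("sequences must be non-empty")
    matches = sum(1 for t, p in zip(test_fo_sequence, pred_fo_sequence) if t == p)
    return matches / len(test_fo_sequence)

# Illustrative sequences, not data from the experiments.
test_fo_sequence = [0, 0, 1, 0, 1, 0, 0, 1]
pred_fo_sequence = [0, 0, 1, 0, 0, 0, 1, 1]
score = match_precision(test_fo_sequence, pred_fo_sequence)
print(score)  # 6 of 8 positions agree -> 0.75
```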
Results
• Two groups of experiments, involving 5 software systems and 17 tests in total.
• Open test method
– Test data is kept separate from the training data and is never seen during training.
• Surface Model: average precision: 0.876
• Structure Model: average precision: 0.858
Results (cont.)
[Figure: Evaluation Score Distribution. Histogram of Surf_prec and Struc_prec over the precision bins (0.5,0.65], (0.65,0.75], (0.75,0.85], (0.85,1.0]; vertical axis from 0 to 14.]
Results (cont.)
[Figure: Model Performance with TDS. Precision (0.400 to 1.000) against Training Data Size (61, 64, 67, 70, 72, 73, 77, 83, 111, 257, 300, 320, 324, 344, 403, 462, 462) for the Surface Model and the Structure Model.]
Results (cont.)
[Figure: Model Performance with average TDS. Precision (0.400 to 1.000) against Average TDS for SIU, for the Surface Model and the Structure Model.]
Results (cont.)
• Potential applications of the prediction model
– Test case prioritization
– Reliability Estimation
– Reliability Growth Modeling
Conclusions
• A new failure prediction problem
• Apply a statistical learning method to learn the failure law and then predict failure
• Two models: surface model and structure model
• Promising evaluation results (average precision):
– Surface Model: 0.876
– Structure Model: 0.858
Conclusions (cont.)
• Lessons learnt:
– Design and start experiments ASAP to verify the model.
– A complex model does not always perform well; consider model simplification.
– Do NOT make many assumptions about how the data is generated.
Thank you for your attention
Ready for questions!