View
13
Download
0
Category
Preview:
Citation preview
MotivationObjective of the Thesis
Methods usedData declustering
Examples of applicationFramework for modelling extremes of corrosion
Conclusions and recommendations
Extreme-Value Analysis of Corrosion Data
Marcin Glegola
July 16, 2007
Supervisors:Prof. Dr. Ir. Jan M. van NoortwijkMSc Ir. Sebastian KuniewskiDr. Marco Giannitrapani
Marcin Glegola Extreme-Value Analysis of Corrosion Data
MotivationObjective of the Thesis
Methods usedData declustering
Examples of applicationFramework for modelling extremes of corrosion
Conclusions and recommendations
Outline
Motivation
Objective of the Thesis
Methods used
Data declustering
Examples of application
Framework for modelling extremes of corrosion
Conclusions and recommendations
Marcin Glegola Extreme-Value Analysis of Corrosion Data
MotivationObjective of the Thesis
Methods usedData declustering
Examples of applicationFramework for modelling extremes of corrosion
Conclusions and recommendations
Motivation
◮ in the oil industry, hundreds of kilometres ofpipes and other equipment can be affected by
corrosion
◮ it is extreme defect depth/wall loss thatinfluences the system reliability ⇒ extreme-value
methods are sensible for application
◮ only part of the system can be subjected to
inspection ⇒ results extrapolation is needed
Marcin Glegola Extreme-Value Analysis of Corrosion Data
MotivationObjective of the Thesis
Methods usedData declustering
Examples of applicationFramework for modelling extremes of corrosion
Conclusions and recommendations
Motivation
◮ defects/wall losses caused by corrosion are likelyto be locally dependent ⇒ independence
assumption questionable
Marcin Glegola Extreme-Value Analysis of Corrosion Data
MotivationObjective of the Thesis
Methods usedData declustering
Examples of applicationFramework for modelling extremes of corrosion
Conclusions and recommendations
Objective of the Thesis
◮ present statistical methods to model extremes ofcorrosion data, taking into account local defectdependence
◮ spatial extrapolation of the results
◮ examples of application
◮ framework/guideline for modellingextreme-values of corrosion
Marcin Glegola Extreme-Value Analysis of Corrosion Data
MotivationObjective of the Thesis
Methods usedData declustering
Examples of applicationFramework for modelling extremes of corrosion
Conclusions and recommendations
Methods usedThe Generalised Extreme-Value (GEV) distribution (blockmaxima data)
G (z) =
exp
(
−h
1 + ξ“ z − µ
σ
”i
−
1ξ
+
)
, ξ 6= 0
expn
− exph
−“ z − µ
σ
”io
, ξ = 0,
Marcin Glegola Extreme-Value Analysis of Corrosion Data
MotivationObjective of the Thesis
Methods usedData declustering
Examples of applicationFramework for modelling extremes of corrosion
Conclusions and recommendations
Methods used
The Generalised-Pareto (GP) distribution (excess overthreshold data)
Yi = Xi − u, for X > u, i = 1, . . . , nu
0 2 4 6 8 10 12 14 16 18 200
0.5
1
1.5
2
2.5
u
3.5
4
x i
i
Threshold exceedances
H(y) =
1 −
»
1 +ξy
σ̄
–
−1/ξ
+
, ξ 6= 0
1 − exp“
−y
σ̄
”
, ξ = 0
Marcin Glegola Extreme-Value Analysis of Corrosion Data
MotivationObjective of the Thesis
Methods usedData declustering
Examples of applicationFramework for modelling extremes of corrosion
Conclusions and recommendations
Methods used (extrapolation-GEV)◮ return-level method
G (zp) = 1 − p ⇔ Pr{M > zp} = p =1
N
◮ implied distribution of the maximum corresponding to the notinspected area
Pr{XN ≤ z} = GN(z) = G (z)N
Marcin Glegola Extreme-Value Analysis of Corrosion Data
MotivationObjective of the Thesis
Methods usedData declustering
Examples of applicationFramework for modelling extremes of corrosion
Conclusions and recommendations
Methods used (extrapolation-GP)
based on Poisson frequency of threshold exceedances (Poisson-GPmodel)
◮ return-level method
H(yp) = 1 − p ⇔ Pr{Y > yp} = p =1
NE
NE = λGEV × (|S | − k × |A|) - expected number ofexceedances on the not inspected area
Marcin Glegola Extreme-Value Analysis of Corrosion Data
MotivationObjective of the Thesis
Methods usedData declustering
Examples of applicationFramework for modelling extremes of corrosion
Conclusions and recommendations
Methods used (extrapolation-GP)
◮ implied distribution of the maximum corresponding to the notinspected area
Pr{XN ≤ x} = exp
{
−λGEV
(
1 + ξx − u
σ̄
)
−1/ξ
+
}N
Marcin Glegola Extreme-Value Analysis of Corrosion Data
MotivationObjective of the Thesis
Methods usedData declustering
Examples of applicationFramework for modelling extremes of corrosion
Conclusions and recommendations
Methods used-summary
◮ two methods for statistical inference about extreme-valuesof corrosion
- the GEV distribution (block maxima)- the GP distribution (excess over threshold data)
◮ two methods for spatial results extrapolation- return-level- distribution of the maximum corresponding to the not
inspected area
◮ the GEV and GP distributions are closely related andtheoretically, should give the same results
Marcin Glegola Extreme-Value Analysis of Corrosion Data
MotivationObjective of the Thesis
Methods usedData declustering
Examples of applicationFramework for modelling extremes of corrosion
Conclusions and recommendations
Modelling extremes of dependent data
◮ for the stationary data characterised by the limited extendof long-range dependence at extreme levels, theextreme-value methods can be still applied
◮ in corrosion application it is reasonable to assume that pitdepths are locally dependent ⇒ extreme-value methodsare applicable
Marcin Glegola Extreme-Value Analysis of Corrosion Data
MotivationObjective of the Thesis
Methods usedData declustering
Examples of applicationFramework for modelling extremes of corrosion
Conclusions and recommendations
Modelling extremes of dependent data with the GEV
distribution
◮ assuming local data dependence, block maxima ofstationary data (for sufficiently large block sizes) can beconsidered as approximately independent
◮ the GEV distribution is used in its standard form
Marcin Glegola Extreme-Value Analysis of Corrosion Data
MotivationObjective of the Thesis
Methods usedData declustering
Examples of applicationFramework for modelling extremes of corrosion
Conclusions and recommendations
Modelling extremes of dependent data with the GP
distribution
◮ neighbouring exceedances maybe dependent, therefore thechange of practise is needed
◮ one of the most widely adoptedmethod is data declustering -filtering out dependentobservations such that remainingexceedances can be consideredas approximately independent
Marcin Glegola Extreme-Value Analysis of Corrosion Data
MotivationObjective of the Thesis
Methods usedData declustering
Examples of applicationFramework for modelling extremes of corrosion
Conclusions and recommendations
Modelling extremes of dependent data with the GP
distribution-approach
◮ define clusters of exceedances
◮ identify maximum excess withineach cluster
◮ assuming that cluster maximaare independent fit the GPdistribution
Marcin Glegola Extreme-Value Analysis of Corrosion Data
MotivationObjective of the Thesis
Methods usedData declustering
Examples of applicationFramework for modelling extremes of corrosion
Conclusions and recommendations
Data declustering
Estimation of the number of data clusters
Marcin Glegola Extreme-Value Analysis of Corrosion Data
MotivationObjective of the Thesis
Methods usedData declustering
Examples of applicationFramework for modelling extremes of corrosion
Conclusions and recommendations
Data declustering
Extremal index method:
◮ the extent of short-range dependence of extremeevents is captured by the parameter θ, calledextremal index
θ =1
limiting mean cluster size
◮ extremal index measures the degree of clustering ofthe process at extreme levels
Marcin Glegola Extreme-Value Analysis of Corrosion Data
MotivationObjective of the Thesis
Methods usedData declustering
Examples of applicationFramework for modelling extremes of corrosion
Conclusions and recommendations
Data declustering
Extremal index method
θ =1
limiting mean cluster size⇒ Nc = θ × Ne
where Nc - number of clusters, Ne - number of exceedances abovethreshold uθ is estimated by the intervals estimator
Marcin Glegola Extreme-Value Analysis of Corrosion Data
MotivationObjective of the Thesis
Methods usedData declustering
Examples of applicationFramework for modelling extremes of corrosion
Conclusions and recommendations
Data declustering
Finding number of clusters using clustering algorithm
◮ define a validity criteria for the number Nc of foundclusters
◮ run the clustering algorithm for a range of Nc
◮ as proper number of clusters choose the one for which thevalidity criteria are optimised
Marcin Glegola Extreme-Value Analysis of Corrosion Data
MotivationObjective of the Thesis
Methods usedData declustering
Examples of applicationFramework for modelling extremes of corrosion
Conclusions and recommendations
Agglomerative hierarchical clustering algorithm
◮ starts with single points as clusters
◮ at each step the two closest clusters are merged
◮ stops when only one cluster remains
Marcin Glegola Extreme-Value Analysis of Corrosion Data
MotivationObjective of the Thesis
Methods usedData declustering
Examples of applicationFramework for modelling extremes of corrosion
Conclusions and recommendations
Agglomerative hierarchical clustering algorithm
0 1 2 3 4 5 6 70
1
2
3
4
5
6
7
1
2
3
4
5
distance
0 1 2 3 4 5 6 70
1
2
3
4
5
6
7
1
2
3
4
5
distance
6
7
8
9
2 3 1 4 50
0.5
1
1.5
2
Objects
Dis
tanc
e be
twee
n ob
ject
s
Dendrogram
distance=heightof the link
link
Marcin Glegola Extreme-Value Analysis of Corrosion Data
MotivationObjective of the Thesis
Methods usedData declustering
Examples of applicationFramework for modelling extremes of corrosion
Conclusions and recommendations
Data declustering
Validation criteria, that we used, aim at identifying clustersthat are compact and well isolated:
◮ silhouette plot - maximum value indicates optimum
◮ Davies-Bouldin index - minimum indicates optimum
Marcin Glegola Extreme-Value Analysis of Corrosion Data
MotivationObjective of the Thesis
Methods usedData declustering
Examples of applicationFramework for modelling extremes of corrosion
Conclusions and recommendations
Data declustering-summary
◮ two methods to estimate the number of data clusters- the extremal index method- clustering algorithm + validation criteria (Davies-Bouldin
index, silhouette plot)
◮ prior to data declustering perform clustering tendency test
Marcin Glegola Extreme-Value Analysis of Corrosion Data
MotivationObjective of the Thesis
Methods usedData declustering
Examples of applicationFramework for modelling extremes of corrosion
Conclusions and recommendations
Simulated corroded surface
◮ application of the gamma-process model◮ dependence in terms of the product moment correlation
ρ(Xk , Xl) = exp
−d
(
2∑
i=1
|disti |p
)q/p
Xk = (xk , yk ), Xl = (xl , yl ),dist1 = |xk − xl |, dist2 = |yk − yl |,d = 0.3, p = 2, q = 1
0 1 2 3 4 5 6 70
1
2
3
4
5
6
7
1
2
3
4
5
distance
Marcin Glegola Extreme-Value Analysis of Corrosion Data
MotivationObjective of the Thesis
Methods usedData declustering
Examples of applicationFramework for modelling extremes of corrosion
Conclusions and recommendations
Simulated corroded surface
0 20 40 60 80 1000
0.1
0.2
0.3
0.4
0.5
0.6
0.7
0.8
0.9
1
x
Cor
rela
tion
coef
ficie
nt
Matrix 1
Matrix 10Matrix 9Matrix 8Matrix 7
Matrix 5Matrix 4Matrix 3Matrix 2
Matrix 6
Matrix 20Matrix 19Matrix 18Matrix 17Matrix 16
Matrix 15Matrix 14Matrix 13Matrix 12Matrix 11
Marcin Glegola Extreme-Value Analysis of Corrosion Data
MotivationObjective of the Thesis
Methods usedData declustering
Examples of applicationFramework for modelling extremes of corrosion
Conclusions and recommendations
Simulated corroded surface
Marcin Glegola Extreme-Value Analysis of Corrosion Data
MotivationObjective of the Thesis
Methods usedData declustering
Examples of applicationFramework for modelling extremes of corrosion
Conclusions and recommendations
Simulated corroded surface - clustering algorithm
20 40 60 80 100 120 1400.2
0.3
0.4
0.5
0.6
0.7
0.8
0.9
1
1.1
1.2Davies−Bouldin index vs number of clusters
Number of clusters
Dav
ies−
Bou
ldin
inde
x
20 40 60 80 100 120 1400.65
0.7
0.75
0.8
0.85
0.9
0.95
1 Silhouette plot
Number of clusters
Ave
rage
silh
ouet
te c
oeffi
cien
t
Marcin Glegola Extreme-Value Analysis of Corrosion Data
MotivationObjective of the Thesis
Methods usedData declustering
Examples of applicationFramework for modelling extremes of corrosion
Conclusions and recommendations
Simulated corroded surface - results
θ̂ nc ncalg
0.544 107 121
Table: The estimate of extremal index and determined number of clusters
Number of clusters AD2up p − v . KS p − v .
107 0.376 0.204
121 0.549 0.493
Table: Goodness-of-fit test results for different number of clusters
Marcin Glegola Extreme-Value Analysis of Corrosion Data
MotivationObjective of the Thesis
Methods usedData declustering
Examples of applicationFramework for modelling extremes of corrosion
Conclusions and recommendations
ξ̂ ˆ̄σ AD2up p − v. KS p − v.
0.065 1.341 0.647 0.488
Table: GP fit to excess of dependent data
ξ̂ ˆ̄σ AD2up p − v. KS p − v.
-0.0137 1.6839 0.555 0.493
Table: GP fit to excess of declustered data
ξ̂ σ̂ µ̂ AD2up p − v. KS p − v.
-0.007 1.627 5.637 0.621 0.899
Table: GEV fit to block maxima data
Marcin Glegola Extreme-Value Analysis of Corrosion Data
MotivationObjective of the Thesis
Methods usedData declustering
Examples of applicationFramework for modelling extremes of corrosion
Conclusions and recommendations
Real data example
Marcin Glegola Extreme-Value Analysis of Corrosion Data
MotivationObjective of the Thesis
Methods usedData declustering
Examples of applicationFramework for modelling extremes of corrosion
Conclusions and recommendations
Real data example
Marcin Glegola Extreme-Value Analysis of Corrosion Data
MotivationObjective of the Thesis
Methods usedData declustering
Examples of applicationFramework for modelling extremes of corrosion
Conclusions and recommendations
Real data example
20 40 60 80 100 120 140 1600.1
0.2
0.3
0.4
0.5
0.6
0.7
0.8Davies−Bouldin index vs number of clusters
Number of clusters
Dav
ies−
Bou
ldin
inde
x
20 40 60 80 100 120 140 1600.7
0.75
0.8
0.85
0.9
0.95
1 Silhouette plot
Number of clusters
Ave
rage
silh
ouet
te c
oeffi
cien
t
Marcin Glegola Extreme-Value Analysis of Corrosion Data
MotivationObjective of the Thesis
Methods usedData declustering
Examples of applicationFramework for modelling extremes of corrosion
Conclusions and recommendations
ξ̂ σ̂ µ̂ AD2up p − v. KS p − v.
-0.082 0.182 0.757 0.713 0.986
Table: GEV fit to block maxima data
ξ̂ ˆ̄σ AD2up p − v. KS p − v.
-0.008 0.133 0.616 0.074
Table: GP fit to excess of dependent data
ξ̂ ˆ̄σ AD2up p − v. KS p − v.
-0.069 0.172 0.717 0.654
Table: GP fit to excess of declustered data
Marcin Glegola Extreme-Value Analysis of Corrosion Data
MotivationObjective of the Thesis
Methods usedData declustering
Examples of applicationFramework for modelling extremes of corrosion
Conclusions and recommendations
0.5 1 1.5 2 2.5 30
0.5
1
1.5
2
2.5
3
3.5
4
Extreme pit depth [mm]
Den
sity
Extrapolated probability density function
GP−basedGEV
0.5 1 1.5 2 2.5 30
0.5
1
1.5
2
2.5
3
3.5
4
Extreme pit depth [mm]
Den
sity
Extrapolated probability density function
GP−basedGEV
Marcin Glegola Extreme-Value Analysis of Corrosion Data
MotivationObjective of the Thesis
Methods usedData declustering
Examples of applicationFramework for modelling extremes of corrosion
Conclusions and recommendations
Framework for modelling extremes of corrosion with the GEV distribution
Marcin Glegola Extreme-Value Analysis of Corrosion Data
MotivationObjective of the Thesis
Methods usedData declustering
Examples of applicationFramework for modelling extremes of corrosion
Conclusions and recommendations
Framework for modelling extremes of corrosion with the GP distribution
Marcin Glegola Extreme-Value Analysis of Corrosion Data
MotivationObjective of the Thesis
Methods usedData declustering
Examples of applicationFramework for modelling extremes of corrosion
Conclusions and recommendations
Conclusions and recommendations
◮ the two applied distributions are closely related and lead tothe consistent inference about extreme-values of corrosion
◮ data declustering improves the results given by the GPdistribution
◮ the performance of other clustering algorithms could bechecked
◮ in order to take into account corrosion nonstationarity dueto space-varying environmental conditions,covariate-dependent extreme-value models with trendscould be considered
Marcin Glegola Extreme-Value Analysis of Corrosion Data
MotivationObjective of the Thesis
Methods usedData declustering
Examples of applicationFramework for modelling extremes of corrosion
Conclusions and recommendations
THANK YOU FOR ATTENTION
Questions???
Marcin Glegola Extreme-Value Analysis of Corrosion Data
Recommended