A Modified Chi-Squares Test for Improved Bad Data Detection

Murat Göl, Member, IEEE
EEE Department, Middle East Technical University, Ankara, Turkey
[email protected]

Ali Abur, Fellow, IEEE
ECE Department, Northeastern University, Boston, MA, U.S.A.
[email protected]
Abstract—Current state estimators employ the Weighted Least Squares (WLS) estimator to solve the state estimation problem. Once the state estimates are obtained, the Chi-Squares test is commonly used to detect the presence of bad data in the measurement set. Regretfully, this test is not entirely reliable; bad data existing in the measurement set can be missed in certain cases. One reason for this is the approximation used to compute the bad data suspicion threshold, which is set based on an assumed chi-squared distribution for the objective function. In this paper, a modified metric is proposed in order to improve the bad data detection accuracy of the commonly used Chi-Squares test. The bad data detection performance of the proposed test is compared with that of the conventional Chi-Squares test.
Index Terms--Bad data detection, state estimation, chi-squared distribution, measurement residuals, weighted least squares.
I. INTRODUCTION

Power system state estimation is one of the key tools of an
Energy Management System (EMS) [1]. State estimators
provide the best estimates of the system voltage magnitudes
and phase angles using the system model and a redundant
enough measurement set. Those estimates are used in the
economic and control tools of the EMS.
The most common state estimation technique employed in
present systems is the weighted least squares (WLS) method
[1]. WLS is a well-developed and fast method. When applied
to the first order approximation of measurement equations, it
provides the best linear unbiased estimator (BLUE) given
normally distributed measurement errors [2]. In the presence
of Gaussian errors, WLS provides unbiased state estimates.
Unfortunately, the WLS estimator is not robust against bad data; even a single measurement with gross error may significantly bias the estimation results. Therefore, almost all WLS estimators carry out a post-estimation bad data detection test, commonly accomplished by the so-called Chi-Squares test [3], [4]. Although the Chi-Squares test is the most common bad data detection method used in several commercial state estimators, it may not always yield correct results; there are cases where the Chi-Squares test can be shown to fail to detect existing bad data in the measurement set.

(This work made use of Engineering Research Center Shared Facilities supported by the Engineering Research Center Program of the National Science Foundation and the Department of Energy under NSF Award Number EEC-1041877 and the CURENT Industry Partnership Program.)
Missing a bad measurement that is present in the measurement set has dire consequences: the resulting biased estimates will affect all decisions based on them. Therefore, this paper proposes a simple modification that improves the bad data detection capability of existing state estimators. The proposed modification requires calculation of the residual covariance matrix, whose computation uses a subset of the elements in the inverse of the sparse gain matrix. Matrix inversion is known to be a computationally expensive operation and is hence avoided in power system analysis. However, thanks to efficient sparse inverse methods [5] - [7], the computation can be performed at little computational cost. In this paper, the proposed method is compared with the conventional Chi-Squares method in terms of computational performance and bad measurement detection accuracy.
The rest of the paper is organized as follows. Section II explains the conventional Chi-Squares test, while the proposed method is explained in detail in Section III. The simulations and numerical results are presented in Section IV, and Section V concludes the paper.
II. CONVENTIONAL CHI-SQUARES TEST
Consider a random variable Y which has a chi-squared (χ²) distribution with N degrees of freedom, given by the following expression:

$$ Y = \sum_{i=1}^{N} X_i^2 \quad (1) $$

where the random variables X_1, X_2, …, X_N are independent and distributed according to the standard normal distribution.
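This definition can be sanity-checked numerically. The NumPy sketch below is an illustration, not part of the paper (whose simulations use MATLAB); the sample count is an arbitrary choice. It sums squared standard normal draws and confirms that Y has the chi-squared mean N and variance 2N.

```python
import numpy as np

# Empirical check of (1): Y = X_1^2 + ... + X_N^2 with X_i ~ N(0, 1)
# follows a chi-squared distribution with N degrees of freedom,
# whose mean is N and variance is 2N.
rng = np.random.default_rng(0)
N, samples = 10, 200_000          # illustrative sizes
X = rng.standard_normal((samples, N))
Y = (X ** 2).sum(axis=1)          # one chi-squared draw per row

print(Y.mean())  # close to N = 10
print(Y.var())   # close to 2N = 20
```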
In the power system state estimation problem formulation, measurement errors are commonly assumed to have a normal distribution with zero mean and known variance. Using the same assumption, a function f(x) can be defined as given in (2), where f(x) has a chi-squared distribution with at most (m - n) degrees of freedom, m being the number of measurements and n the number of states. Note that in a power system with m measurements and n system states, at most (m - n) errors can be linearly independent, since at least n measurements are required to obtain a solution; thus the degrees of freedom will be at most (m - n).
$$ f(x) = \sum_{i=1}^{m} \frac{e_i^2}{R_{ii}} = \sum_{i=1}^{m} \left( \frac{e_i}{\sqrt{R_{ii}}} \right)^2 = \sum_{i=1}^{m} \left( e_i^N \right)^2 \quad (2) $$
In (2), e_i is the measurement error with normal distribution and R_ii is the variance of the i-th measurement error, where R is the diagonal error covariance matrix. e_i^N is the normalized error, which has a standard normal distribution.
Consider the Chi-squared probability density function plot
given in Fig. 1 [1]. The area below the p.d.f. represents the
probability of finding X in the given region, as shown below.
$$ P\{ X \ge x_t \} = \int_{x_t}^{\infty} \chi^2(u)\, du \quad (3) $$
Eq. (3) represents the probability of X being larger than x_t. This probability decreases as x_t increases, since the tail of the distribution decays. According to Fig. 1, x_t is 25, as shown by the dotted line for the chosen probability of 0.05.
Fig. 1. Chi-squared probability density function [1].
x_t represents the largest value that will not be identified as a bad measurement. If the computed value exceeds this threshold, the presence of a bad measurement will be suspected.
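The threshold can be approximated by Monte Carlo without any special functions. In the illustrative NumPy sketch below, the 15 degrees of freedom are an assumed value, chosen only because the resulting threshold lands near the x_t of about 25 read off Fig. 1 at p = 0.05; the paper does not state the degrees of freedom behind that figure.

```python
import numpy as np

# Monte Carlo approximation of the threshold x_t in (3): the value a
# chi-squared variable exceeds with probability p. dof = 15 is an
# assumption consistent with the x_t ~ 25 shown in Fig. 1.
rng = np.random.default_rng(1)
dof, p, samples = 15, 0.05, 500_000
Y = (rng.standard_normal((samples, dof)) ** 2).sum(axis=1)
x_t = float(np.quantile(Y, 1.0 - p))

print(round(x_t, 1))        # roughly 25.0
print((Y >= x_t).mean())    # approximately p = 0.05 by construction
```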
In order to detect bad data, most commercial state estimators that employ the WLS estimation method use the following metric:

$$ J(\hat{x}) = \sum_{i=1}^{m} \frac{\left( z_i - h_i(\hat{x}) \right)^2}{\sigma_i^2} = \sum_{i=1}^{m} \frac{r_i^2}{\sigma_i^2} \quad (4) $$
where m is the number of measurements, x̂ is the (n × 1) estimated state vector, h_i(x̂), z_i and r_i are the estimated value, the measured value and the residual for the i-th measurement respectively, and σ_i² is the corresponding measurement variance, which is the same as R_ii. The conventional Chi-Squares test will suspect the existence of bad data if the computed metric J(x̂) is larger than $\chi^2_{(m-n),p}$, the bad data suspicion threshold according to a chi-squared distribution for a given probability p and (m - n) degrees of freedom.

Note that a squared normal random variable follows a chi-squared distribution only if it is normalized by its own variance, as in (2). Therefore, (4) is only an approximation of f(x) defined in (2), since the measurement residuals are normalized with respect to the variances of the measurement errors rather than their own variances.
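As a concrete illustration of (4), the NumPy sketch below builds a small linear measurement model z = Hx + e (all dimensions and noise levels are assumed for illustration; the paper's own simulations use MATLAB and a real utility system), solves the WLS problem, and evaluates J(x̂).

```python
import numpy as np

# Illustrative linear model z = H x + e; H, the noise level, and the
# dimensions are assumed values, not taken from the paper.
rng = np.random.default_rng(2)
m, n = 8, 3
H = rng.standard_normal((m, n))
sigma2 = np.full(m, 0.01)                   # measurement error variances
R = np.diag(sigma2)
Rinv = np.linalg.inv(R)

x_true = rng.standard_normal(n)
z = H @ x_true + rng.standard_normal(m) * np.sqrt(sigma2)

G = H.T @ Rinv @ H                          # gain matrix
x_hat = np.linalg.solve(G, H.T @ Rinv @ z)  # WLS estimate
r = z - H @ x_hat                           # residuals
J = float(np.sum(r ** 2 / sigma2))          # metric (4)
print(J)  # compared against the chi-squared threshold with m - n dof
```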
III. PROPOSED APPROACH
The conventional Chi-Squares test assumes that the metric J(x̂) shown in (4) is distributed according to a chi-squared distribution. However, the denominator is not the variance of the corresponding residual appearing in the numerator. This introduces an approximation, which may lead to incorrect results, i.e. existing bad data may not be detected.

According to [2], the key to the analysis of bad data is the residual sensitivity matrix S, which is obtained by linearizing the relation between the measurement vector z, the system state vector x and the measurement error vector e, as follows:
$$ z = Hx + e $$
$$ \hat{x} = (H^T R^{-1} H)^{-1} H^T R^{-1} z $$
$$ r = z - H\hat{x} $$
$$ r = Hx + e - H (H^T R^{-1} H)^{-1} H^T R^{-1} (Hx + e) $$
$$ r = e - H (H^T R^{-1} H)^{-1} H^T R^{-1} e $$
$$ r = \left( I - H (H^T R^{-1} H)^{-1} H^T R^{-1} \right) e \quad (5) $$

$$ S = I - H (H^T R^{-1} H)^{-1} H^T R^{-1} \quad (6) $$
S is the residual sensitivity matrix, R is the measurement error covariance matrix, H is the measurement Jacobian matrix and I is the m × m identity matrix, m being the number of measurements [1]. Note that the derivation is based on the linear measurement model; the details of the derivation of S can be found in [1]. The residual sensitivity matrix S has the following properties [1]:

$$ S \cdot S \cdots S = S, \qquad S \cdot R \cdot S^T = S \cdot R \quad (7) $$
Once the linearized measurement model is assumed, the residual sensitivity matrix S represents the relation between the measurement errors and the measurement residuals [1], as shown below:

$$ r = S e \quad (8) $$

where r is the measurement residual vector and e is the measurement error vector.
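The properties in (7) and the relation (8) are easy to verify numerically. The sketch below uses an assumed small random H and R (illustration only), constructs S per (6), and checks idempotency, S·R·Sᵀ = S·R, and r = Se.

```python
import numpy as np

# Build S per (6) for an assumed small H and R (illustration only).
rng = np.random.default_rng(3)
m, n = 8, 3
H = rng.standard_normal((m, n))
R = np.diag(rng.uniform(0.5, 2.0, m))
Rinv = np.linalg.inv(R)
S = np.eye(m) - H @ np.linalg.inv(H.T @ Rinv @ H) @ H.T @ Rinv  # (6)

print(np.allclose(S @ S, S))            # True: S is idempotent (7)
print(np.allclose(S @ R @ S.T, S @ R))  # True: S R S^T = S R (7)

# (8): WLS residuals equal S applied to the measurement errors.
e = rng.standard_normal(m) * np.sqrt(np.diag(R))
z = H @ rng.standard_normal(n) + e
x_hat = np.linalg.solve(H.T @ Rinv @ H, H.T @ Rinv @ z)
r = z - H @ x_hat
print(np.allclose(r, S @ e))            # True
```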
Using (7) and (8), and the known covariance matrix R of the measurement errors, one can easily derive the expected value and the covariance matrix of the measurement residuals:

$$ E\{r\} = S \cdot E\{e\} = S \cdot 0 = 0 $$
$$ Cov(r) = \Omega = E\{ r r^T \} = S \cdot E\{ e e^T \} \cdot S^T = S R S^T = S R \quad (9) $$

where r = z - h(x̂) and Ω is the residual covariance matrix. Note that, due to the zero-mean normally distributed measurement error assumption, the expected value of the measurement errors is 0.
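Following (9), Ω can be formed directly as S·R. The illustrative NumPy sketch below (assumed H and R values) does so and confirms two of its properties: the residual variances Ω_ii are smaller than the error variances R_ii, and Ω has rank m - n, i.e. it is rank deficient.

```python
import numpy as np

# Omega = S R S^T = S R per (9); assumed H and R for illustration.
rng = np.random.default_rng(4)
m, n = 8, 3
H = rng.standard_normal((m, n))
R = np.diag(np.full(m, 0.04))
Rinv = np.linalg.inv(R)
S = np.eye(m) - H @ np.linalg.inv(H.T @ Rinv @ H) @ H.T @ Rinv
Omega = S @ R

# Residual variances are smaller than the error variances: S filters
# out the part of the error that the model absorbs into x_hat.
print(np.all(np.diag(Omega) < np.diag(R)))   # True
print(np.linalg.matrix_rank(Omega))          # m - n = 5: rank deficient
```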
As seen in (9), Ω differs significantly from the measurement error covariance matrix R. Therefore, this paper proposes to use a modified bad data detection metric, Ψ(x̂), as defined below, where Ω_ii is the variance of the i-th measurement residual:

$$ \Psi(\hat{x}) = \sum_{i=1}^{m} \frac{\left( z_i - h_i(\hat{x}) \right)^2}{\Omega_{ii}} \quad (10) $$

Note that Ω is a rank-deficient matrix and hence not invertible. Therefore, instead of the inverse of Ω, its diagonal entries, which are the measurement residual variances, are employed. In this formulation, the off-diagonal entries of Ω, which represent the correlations among measurement residuals, are neglected and only the diagonal elements are considered. Thus, this metric is still an approximation, albeit a more reliable one compared to (4), since the residuals are normalized using the square roots of the diagonal entries of the residual covariance matrix, which are the measurement residual standard deviations, instead of those of the measurement errors.
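A minimal sketch of the proposed metric (10), again with assumed illustrative values: because Ω_ii is never larger than R_ii, every term of Ψ(x̂) is at least the corresponding term of J(x̂), which is why a gross error that slips under the conventional threshold can still be flagged by the modified metric.

```python
import numpy as np

# Conventional metric J (4) vs proposed metric Psi (10) under a single
# gross error; all numerical values are assumed for illustration.
rng = np.random.default_rng(5)
m, n = 8, 3
H = rng.standard_normal((m, n))
sigma2 = np.full(m, 0.01)
R = np.diag(sigma2)
Rinv = np.linalg.inv(R)
G = H.T @ Rinv @ H
S = np.eye(m) - H @ np.linalg.inv(G) @ H.T @ Rinv
Omega_ii = np.diag(S @ R)                # residual variances

e = rng.standard_normal(m) * np.sqrt(sigma2)
e[0] += 40 * np.sqrt(sigma2[0])          # 40-sigma gross error (cf. Case 2.c)
z = H @ rng.standard_normal(n) + e

x_hat = np.linalg.solve(G, H.T @ Rinv @ z)
r = z - H @ x_hat
J = float(np.sum(r ** 2 / sigma2))       # conventional metric (4)
Psi = float(np.sum(r ** 2 / Omega_ii))   # proposed metric (10)
print(Psi >= J)  # True: Omega_ii <= R_ii term by term
```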
The main computational cost of this approach is the computation of Ω, since a matrix inversion must be performed. However, thanks to the extremely sparse structure of the measurement Jacobian H, efficient sparse inverse methods [4] - [7] can be employed, and the computational burden will not be significant even for large-scale systems. Note also that Ω does not strongly depend on the operating point; therefore, as long as the topology and measurement configuration remain the same, Ω does not have to be updated.
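Only the diagonal of Ω is actually needed, and with diagonal R it satisfies Ω_ii = R_ii - (H G⁻¹ Hᵀ)_ii, so each entry follows from triangular solves against a single factorization of the gain matrix G rather than an explicit inverse. The dense NumPy sketch below illustrates this idea with assumed values; a production implementation would use the sparse inverse methods of [5] - [7].

```python
import numpy as np

# With diagonal R, Omega = R - H G^{-1} H^T, so its diagonal needs only
# solves against the gain matrix G. Assumed values; a real system would
# factorize the sparse G once and reuse it while the topology and
# measurement configuration are unchanged.
rng = np.random.default_rng(6)
m, n = 8, 3
H = rng.standard_normal((m, n))
R = np.diag(np.full(m, 0.04))
Rinv = np.linalg.inv(R)
G = H.T @ Rinv @ H

L = np.linalg.cholesky(G)                # G = L L^T
Y = np.linalg.solve(L, H.T)              # L Y = H^T, so Y = L^{-1} H^T
Omega_ii = np.diag(R) - np.sum(Y ** 2, axis=0)

# Cross-check against the direct construction Omega = S R.
S = np.eye(m) - H @ np.linalg.inv(G) @ H.T @ Rinv
print(np.allclose(Omega_ii, np.diag(S @ R)))  # True
```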
IV. SIMULATION RESULTS
In this section, a real utility system with 265 buses and 340 branches is used to illustrate the benefits of the proposed bad data detection test. The system is monitored by 362 measurements, which ensure high enough measurement redundancy to detect the presence of bad data. Simulations are carried out in the MATLAB R2014a environment using a PC with 4 GB RAM and a Windows operating system.

The first study shows the additional computational burden required for computation of the residual covariance matrix. The second study compares the bad data detection performance of the proposed modified method and the conventional Chi-Squares test.
Case 1: In this study, the solution time of the WLS estimation is compared with the CPU times required for the proposed bad data detection approach and the conventional one. 1500 Monte Carlo simulations are carried out and the mean values of the results are reported. In these simulations, random Gaussian errors are added to the measurement set, and one randomly selected measurement is intentionally corrupted to emulate bad data by changing its sign. Table I shows the CPU times for the WLS state estimation solution as well as for the modified and conventional Chi-Squares tests. The increase in computation time when using the proposed modified test is expected and is primarily caused by the computation of the residual covariance matrix Ω.

TABLE I. MEAN COMPUTATION TIME (MILLISECONDS)

WLS Estimation | Proposed Modified Chi-Squares | Conventional Chi-Squares
7 | 3.4 | 0.1
Case 2: The bad data detection performance of the proposed approach is compared with that of the conventional method. Four different single-bad-data scenarios are studied. Each scenario is repeated 1500 times, each time introducing a randomly selected bad measurement. In these four cases, a certain amount of error, proportional to the standard deviation σ of the considered measurement, is added to the original measurement in order to emulate a bad measurement. The amount of error introduced in each case is given below. In order to make the simulations realistic, Gaussian errors are also added to all measurements.

• Case 2.a: No bad measurement.
• Case 2.b: 3σ.
• Case 2.c: 40σ.
• Case 2.d: 100σ.
Table II shows the bad data detection performance of the proposed method and the conventional approach. The values given in Table II are percentages, which also indicate the bad data detection probability of each method. As evident in Table II, both methods give correct results for very large and very small error values. However, for intermediate error values such as Case 2.c, which can still significantly bias the estimation results, the proposed approach detects bad data that is missed by the conventional Chi-Squares test.
TABLE II. BAD DATA DETECTION PERFORMANCE

Case | Proposed Modified Chi-Squares (%) | Conventional Chi-Squares (%) | Bad Data Present
2.a | 0 | 0 | No
2.b | 0 | 0 | No
2.c | 100 | 68.9 | Yes
2.d | 100 | 100 | Yes
According to Table II, the estimation results of Case 2.b are unbiased, while those of Case 2.c are biased. Fig. 2.a presents the difference between the true states and the estimation results of one randomly selected Monte Carlo run for Case 2.b. Similarly, Fig. 2.b presents the same difference for the same randomly selected Monte Carlo run for Case 2.c, such that both figures consider the same measurement but with different errors. As seen in Fig. 2.b, although the estimation results are biased, the conventional method was not capable of identifying the presence of the gross error. On the other hand, the proposed metric successfully detected the presence of the bad measurement.
[Fig. 2(a), Case 2.b, and Fig. 2(b), Case 2.c: x_true - x_est (×10⁻³) plotted against the state index (0 to 500).]
Fig. 2. Mismatch between estimated and true states.
Finally, it is quite informative to examine the covariance values of the errors and residuals. Fig. 3 presents the variation of the Ω_ii and R_ii values. As seen in Fig. 3, compared to the constant R_ii values, the Ω_ii values are in general much smaller. Therefore, the proposed bad data suspicion threshold will always be smaller than that of the conventional Chi-Squares test.
[Plot of Ω_ii and R_ii (×10⁻⁴, ranging roughly from 6.5 to 10.5) against the measurement index.]
Fig. 3. Variation of Ω_ii and R_ii values.
V. CONCLUSIONS
In this paper, a modified Chi-Squares test is proposed to improve bad data detection accuracy when using the WLS method in state estimation. As seen in the simulations, the proposed metric performs better than the conventional test in detecting the presence of bad data in a given measurement set. Although the proposed test is successful in detecting bad data, identification and removal of the bad measurements still has to be carried out by methods such as the normalized residuals test [8].
Most commercial programs use the Chi-Squares test as a computationally cheap filter to decide whether or not to conduct an identification test. In that sense, this modification may serve a useful purpose by increasing the reliability of this initial filter so that bad data will not be missed.
REFERENCES

[1] A. Abur and A. Gomez-Exposito, Power System State Estimation: Theory and Implementation. Marcel Dekker, 2004.
[2] A. C. Aitken, "On least squares and linear combinations of observations," Proc. Royal Society of Edinburgh, vol. 35, pp. 42-48, 1935.
[3] E. Handschin, F. C. Schweppe, J. Kohlas, and A. Fiechter, "Bad data analysis for power systems state estimation," IEEE Trans. Power App. Syst., vol. 94, pp. 329-337, Mar./Apr. 1975.
[4] A. Monticelli, "Electric power system state estimation," Proceedings of the IEEE, vol. 88, no. 2, Feb. 2000.
[5] K. Takahashi, J. Fagan, and M. Chen, "Formation of a sparse bus impedance matrix and its application to short circuit study," PICA Proceedings, pp. 63-69, May 1973.
[6] Y. E. Campbell and T. A. Davis, "Computing the sparse inverse subset: an inverse multifrontal approach," University of Florida, Technical Report TR-95-021.
[7] B. Bilir and A. Abur, "Bad data processing when using the coupled measurement model and Takahashi's sparse inverse method," IEEE Innovative Smart Grid Technologies Conference - Europe, Istanbul, Turkey, 12-15 Oct. 2014.
[8] A. Monticelli and A. Garcia, "Reliable bad data processing for real-time state estimation," IEEE Trans. Power App. Syst., vol. PAS-102, no. 5, pp. 1126-1139, May 1983.