Contribution of Prof. H. Akaike to Statistical Modeling of Complex Systems
Research Organization of Information and Systems
Genshiro Kitagawa
ISI, Dublin, August 23, 2011
Hirotugu Akaike (1927-2009)
Brief History
1952 Graduated from Math. Dept., Tokyo University
     Researcher of the Institute of Statistical Mathematics
1962 Head of 2nd Section, 1st Division
1973 Director of 5th Division
1985 Director of Dept. of Prediction and Control
1986 Director-General of ISM (-1994)
1988 Member of Science Council of Japan (-1991)
     Head of Dept. of Statistical Science, Graduate University of Advanced Study
1994 Prof. Emeritus, The Inst. Statist. Math.
     Prof. Emeritus, Graduate Univ. for Advanced Study
Prizes
1972 Ishikawa Prize
     (Establishment of statistical analysis and control method for dynamic systems)
1980 Okochi Prize
     (Research and realization of optimal steam temperature control of thermal electric plant)
1989 Asahi Prize
     (Research on statistics, in particular theory and applications of AIC)
     The Purple Ribbon Medal
     (Statistics, in particular time series analysis and its applications)
1996 The 1st Japan Statistical Society Prize
     (Contributions to statistical theory and its applications)
2000 The Order of the Sacred Treasure
2006 Kyoto Prize
     (Major contribution to statistical science and modeling with the development of AIC)
Fellow of ASA, RSS, IMS, IEEE, JSS
Laureate of 22nd Kyoto Prize
"Major contribution to statistical science and modeling with the development of the Akaike Information Criterion (AIC)"
Outline
1. Launching period: Mathematical analysis and structural modeling (1950’s)
2. Frequency domain analysis (1960’s)
3. Time domain time series modeling (1970’s)
4. AIC and statistical modeling (1970’s)
5. Bayes modeling (1980’s)
1. Launching Period (1952-1960)
Mathematical Analysis
• Decision process
• Evaluation of probability distribution
• Monte Carlo method solving linear equation
• Computation of eigenvalues
• Convergence of optimum gradient method

Convergence Analysis of Optimal Gradient Method, AISM (1959)
Analyzed the limiting behavior of the optimal gradient method and showed its poor convergence property, known as the bird cage.
Became the foundation of nonlinear optimization methods such as the Conjugate Gradient method and the Quasi-Newton method
… Kowalik and Osborne (1968)
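The slow convergence of the optimum gradient method can be illustrated with a minimal sketch. It assumes a two-dimensional quadratic f(x) = ½ Σ d_i x_i² with an exact line search at every step; the function name `optimum_gradient`, the test problem, and the starting point are illustrative choices and do not come from the slides.

```python
def optimum_gradient(diag, x0, steps):
    """Steepest descent with exact line search on f(x) = 0.5 * sum(d_i * x_i**2).

    diag  : diagonal entries d_i of the (positive definite) quadratic form
    x0    : starting point
    steps : number of iterations
    Returns the final point and the list of objective values f_0, f_1, ...
    """
    x = list(x0)
    history = [0.5 * sum(d * xi * xi for d, xi in zip(diag, x))]
    for _ in range(steps):
        g = [d * xi for d, xi in zip(diag, x)]            # gradient A x
        gg = sum(gi * gi for gi in g)                     # g^T g
        if gg == 0.0:
            break                                         # already at the minimum
        gAg = sum(d * gi * gi for d, gi in zip(diag, g))  # g^T A g
        alpha = gg / gAg                                  # exact minimizing step length
        x = [xi - alpha * gi for xi, gi in zip(x, g)]
        history.append(0.5 * sum(d * xi * xi for d, xi in zip(diag, x)))
    return x, history


kappa = 100.0                       # condition number of the quadratic (assumed)
x, f_values = optimum_gradient([1.0, kappa], [kappa, 1.0], 20)
ratios = [f_values[i + 1] / f_values[i] for i in range(len(f_values) - 1)]
print(ratios[0], ((kappa - 1.0) / (kappa + 1.0)) ** 2)
```

For this particular starting point the iterates zigzag between two directions and the objective shrinks by the same factor ((κ-1)/(κ+1))² at every step, so a badly conditioned problem converges very slowly.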
Structural Modeling
At this time, he was rather interested in real-world problems. From a very early stage in the 1950’s, he realized that conventional linear stationary models are unrealistic, and he recognized the importance of developing a model that fully takes into account the structure of the process.

Control of filature production process (Joint work with Dr. Shimazaki of the Sericultural Experiment Station)
• Developed a control method based on gap process modeling
• This method provided a reference process to detect abnormalities in the actual reeling process.
• Brought significant innovation for silk production in Japan
2. Frequency Domain Analysis: (1960-70)
• By 1960, he established contacts with engineers
  "It took almost 10 years for me to develop substantial contacts, and this happened largely by coincidence." Statistical Science (1995)
• He realized that linear stationary modeling is utilized and many unsolved important problems are left for statisticians
• Smoothing method for power spectrum estimation: Akaike windows
  (Suspension system of a car, Dr. Kaneshige, Isuzu Motor Co.)
• Estimation of frequency response function (Dr. Yamanouchi, Transportation Tech. Res. Inst.)
  Developed a method of estimating frequency response function from observations under normal steering (without using sinusoidal inputs)
2. Frequency Domain Analysis: (1960-70)
Developed practical method of analyzing complex systems
Organized workshop on practical use of time series method
• Ship’s yawing, rolling (Yamanouchi, Kawashima)
• Rolling of a car (Kawamura)
• Engine of a car (Kaneshige)
• Response of an airplane to side wind (Takeda)
• Hydroelectric power plant (Nakamura)
• Estimation of underground structure through microtremor
• Tsunami (Kinosita)
• EEG analysis (Suhara)
AISM supplement (1964)
Difficulty in Feedback Systems
Cement Rotary Kiln, Chichibu Cement Company
• A rotary kiln is a complicated feedback system, consisting of many variables such as raw material feed, fuel feed, gas damper angle, temperature at various locations, etc.
• Conventional spectral analysis methods were useless to identify the source of fluctuation
• In the frequency domain analysis, we cannot explicitly utilize the physical realizability of the system
[Figure: feedback loop in which blocks A and B connect $y_{1n}$ and $y_{2n}$]
Limitation of frequency domain approach: Madison symposium (1967)
Return to time domain modeling
3. Time Domain Time Series Modeling (1968-)
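Vector time series: $y_n = (y_{n1}, \ldots, y_{n\ell})^T$

Structural model:
\[ y_{ni} = \sum_{j \ne i} \sum_{k=1}^{m} \alpha_{kij}\, y_{n-k,j} + \xi_{ni}, \qquad \xi_{ni} = \sum_{k=1}^{m} \beta_{ki}\, \xi_{n-k,i} + w_{ni} \]

VAR model representation:
\[ y_n = \sum_{k=1}^{m} A_k y_{n-k} + w_n \]

Estimation of $\alpha_{kij}$ and $\beta_{ki}$ through $A_k$

[Figure: two-variable feedback system with blocks A1, B1, A2, B2, outputs $y_{1n}$, $y_{2n}$ and noise inputs $w_{1n}$, $w_{2n}$]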
Spectrum & Power Contribution

VAR model:
\[ y_n = \sum_{k=1}^{m} A_k y_{n-k} + w_n, \qquad w_n \sim N(0, \Sigma) \]

Frequency response function:
\[ B(f) = (b_{ij}(f)) = \Big( I - \sum_{k=1}^{m} A_k \exp(-2\pi i k f) \Big)^{-1} \]

Cross spectrum:
\[ P(f) = B(f)\, \Sigma\, B(f)^{*} \]

Power spectrum of $y_{ni}$, for $\Sigma = \mathrm{diag}(\sigma_1^2, \ldots, \sigma_\ell^2)$:
\[ p_{ii}(f) = \sum_{j} |b_{ij}(f)|^2 \sigma_j^2 \]

Power contribution:
\[ r_{ij}(f) = \frac{|b_{ij}(f)|^2 \sigma_j^2}{p_{ii}(f)} \]
Proportion of the contribution from the noise input $e_j(n)$ in the power spectrum of $y_i(n)$ at frequency $f$.

Order m is unknown! Proper selection of the order is crucial.
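As a rough illustration of the power contribution, the sketch below evaluates r_ij(f) for a bivariate VAR model. It assumes mutually uncorrelated noise inputs (diagonal noise covariance); the function name `power_contribution` and the coefficient values are hypothetical and are not taken from the slides or from TIMSAC.

```python
import cmath


def power_contribution(A_list, sigma2, f):
    """Relative power contribution r_ij(f) for a bivariate VAR model.

    A_list : list of 2x2 coefficient matrices A_1, ..., A_m (nested lists)
    sigma2 : variances of the two mutually uncorrelated noise inputs
    f      : frequency in cycles per sample
    Returns a 2x2 nested list r with r[i][j] = |b_ij(f)|^2 sigma_j^2 / p_ii(f).
    """
    # a(f) = I - sum_k A_k exp(-2 pi i k f)
    a = [[1.0 + 0j, 0j], [0j, 1.0 + 0j]]
    for k, Ak in enumerate(A_list, start=1):
        z = cmath.exp(-2j * cmath.pi * k * f)
        for i in range(2):
            for j in range(2):
                a[i][j] -= Ak[i][j] * z
    # B(f) = a(f)^{-1}, written out explicitly for the 2x2 case
    det = a[0][0] * a[1][1] - a[0][1] * a[1][0]
    b = [[a[1][1] / det, -a[0][1] / det],
         [-a[1][0] / det, a[0][0] / det]]
    r = []
    for i in range(2):
        parts = [abs(b[i][j]) ** 2 * sigma2[j] for j in range(2)]
        p_ii = sum(parts)                    # power spectrum of variable i
        r.append([part / p_ii for part in parts])
    return r


# Hypothetical VAR(1): variable 1 is driven by variable 2, but not the reverse.
A1 = [[0.5, 0.3],
      [0.0, 0.4]]
r = power_contribution([A1], [1.0, 1.0], 0.1)
print(r)
```

In this example the second variable receives no feedback from the first, so all of its power at every frequency is attributed to its own noise input, while the first variable's power is split between the two noise sources.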
Predictive Point of View and FPE
[Figure: conventional fitting viewpoint (true distribution, present data, estimation, model) versus predictive viewpoint (true distribution, present data, estimation, model, prediction, future data, evaluation)]

Scalar AR model:
\[ y_n = \sum_{k=1}^{m} a_k y_{n-k} + w_n, \qquad w_n \sim N(0, \sigma^2) \]

FPE (Final Prediction Error), Akaike (1969):
\[ \mathrm{FPE}_m = \frac{n+m+1}{n-m-1}\, \hat\sigma_m^2 \]
\[ n \log \mathrm{FPE}_m = n \log \hat\sigma_m^2 + n \log \frac{n+m+1}{n-m-1} \approx n \log \hat\sigma_m^2 + 2(m+1) \]

Multivariate case: MFPE, FPEC
FPE is a forerunner of AIC
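Order selection by FPE can be sketched as follows. This is a minimal illustration under stated assumptions: the innovation variances are obtained with the Levinson-Durbin recursion from biased sample autocovariances, and FPE_m = (n+m+1)/(n-m-1) · σ̂_m² is used. The function names, the simulated AR(2) series, and its coefficients are illustrative and not part of the original material.

```python
import random


def autocovariance(y, max_lag):
    """Biased sample autocovariances C_0, ..., C_max_lag (sample mean removed)."""
    n = len(y)
    mean = sum(y) / n
    d = [v - mean for v in y]
    return [sum(d[t] * d[t - k] for t in range(k, n)) / n
            for k in range(max_lag + 1)]


def fpe_by_order(y, max_order):
    """Return [FPE_0, ..., FPE_max_order] for scalar AR models of increasing order."""
    n = len(y)
    c = autocovariance(y, max_order)
    sigma2 = c[0]                 # innovation variance of the order-0 model
    a = []                        # AR coefficients a_1..a_m of the current order
    fpe = [sigma2 * (n + 1) / (n - 1)]
    for m in range(1, max_order + 1):
        # Levinson-Durbin step: partial autocorrelation of order m
        acc = c[m] - sum(a[k] * c[m - 1 - k] for k in range(m - 1))
        rho = acc / sigma2
        a = [a[k] - rho * a[m - 2 - k] for k in range(m - 1)] + [rho]
        sigma2 *= 1.0 - rho * rho
        fpe.append(sigma2 * (n + m + 1) / (n - m - 1))
    return fpe


# Simulated AR(2) series (coefficients chosen for illustration only).
random.seed(0)
y = [0.0, 0.0]
for _ in range(500):
    y.append(1.5 * y[-1] - 0.75 * y[-2] + random.gauss(0.0, 1.0))
y = y[100:]                       # drop the burn-in segment

fpe = fpe_by_order(y, 8)
best_order = min(range(len(fpe)), key=lambda m: fpe[m])
print(best_order, fpe)
```

The residual variance alone always decreases with the order; the factor (n+m+1)/(n-m-1) penalizes additional coefficients, so the minimum of FPE balances fit against the number of estimated parameters.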
Realization of Statistical Control

VARX model:
\[ y_n = \sum_{j=1}^{m} A_j y_{n-j} + \sum_{j=1}^{m} B_j r_{n-j} + w_n \]

State-space representation:
\[ x_n = F x_{n-1} + G r_{n-1} + w_n, \qquad y_n = H x_n \]

Criterion function:
\[ J = E\Big[ \sum_{n=1}^{L} \big( x_n^T Q x_n + r_{n-1}^T R r_{n-1} \big) \Big] \]

Statistical optimal controller:
\[ r_n^{*} = K x_n \]

[Figure: timeline of applications, 1970 to 2000]
Applications: cement rotary kiln, electric power plant, nonlinear control, autopilot, roll-reducing control, noise-adaptive controller
Names and references shown: TIMSAC, Akaike-Nakagawa (1972), Ohtsu-Horigome-Kitagawa (1978), Nakamura-Akaike (1981), Ozaki-Peng, Japan Bailey, Mitsui ship building, Oda, Yokokawa, Kitagawa-Akaike-Ohtsu (2006)
Analysis and Control of Feedback Systems
[Figure: applications of the multivariate AR model and power contribution, on a timeline from 1968 to 2000]
Application areas: metabolism, economics, ship’s dynamics, nuclear power plant, man-machine system, finance (CDS)
Names shown: Wada, Tanaka-Sato, Ohtsu-Horigome, Tsuda, Hitachi (Fukunishi), Ishiguro-Ohya
Akaike’s linear time series modeling starts from multivariate case.
Main Contributions in Time Domain Modeling
1. Developed practical method of VAR modeling
   • Analysis of feedback systems
   • Statistical controller
2. Use of state-space model
   • For optimal controller design
   • ML estimation and identification of VARMA model
   • BAYSEA: Bayesian method for seasonal adjustment
3. Order determination
   • FPE, MFPE, FPEC
   • AIC
4. Developed TIMSAC (Time Series Analysis and Control program package)
4. AIC and Statistical Modeling (1971-)
[Figure: predictive point of view: true distribution, present data, estimation, model, prediction, future data, evaluation]
[Figure labels: MFPE, factor analysis, AIC]
(1) Predictive point of view
(2) Prediction by distribution
(3) Evaluation by K-L information
Predictive Point of View and AIC
$g(y)$: distribution of future data, $f(y)$: model

Kullback-Leibler information:
\[ I(g; f) = E_Y \log \frac{g(Y)}{f(Y)} = E_Y \log g(Y) - E_Y \log f(Y) \]
$E_Y \log g(Y)$: unknown constant

Expected log-likelihood:
\[ E_Y \log f(Y) = \int \log f(y)\, g(y)\, dy \]

Log-likelihood: for $x_1, \ldots, x_n \sim g(x)$,
\[ \frac{1}{n} \sum_{i=1}^{n} \log f(X_i) \;\to\; E_Y \log f(Y) \]

MLE: with $\ell(\theta) = \log f(X \mid \theta)$,
\[ \hat\theta = \hat\theta(X) = \arg\max_{\theta} \ell(\theta) \]

Multi-model situation: bias correction, AIC
Akaike (1973, 1974)
Bias Correction
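\[ b(G) = E_X\Big[ \log f(X \mid \hat\theta(X)) - n E_Y \log f(Y \mid \hat\theta(X)) \Big] = E_X[D] \]
[Figure: gap $D$, with components $D_1$, $D_2$, $D_3$, between the log-likelihood $\log f(X \mid \hat\theta(X))$ and the expected log-likelihood $n E_Y \log f(Y \mid \hat\theta(X))$]

Generic Information Criterion (bias-corrected log-likelihood):
\[ \log f(X \mid \hat\theta(X)) - b(G) \]
\[ b(G) = \mathrm{tr}\{ I(G) J(G)^{-1} \} \approx m \]
$I(G)$: Fisher information
$J(G)$: -(expected Hessian)

\[ \mathrm{AIC} = -2 \log f(X \mid \hat\theta(X)) + 2m = -2 \log L(\hat\theta) + 2\,(\text{dim of } \theta) \]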
Number of papers citing Akaike’s papers, by Thomson (ISI)

Cites   Paper
5142    IEEE-ac (1974)
2907    Akad. Kiado (1973)
1115    Psychometrika (1987)
818     AISM (1969)
431     AISM (1970)
359     Math. Science (1978)
281     Appl. Stat. (1977)
265     Bayesian Stat. (1980)
204     System Ident. (1980)
135     Celebration (1986)
127     AISM (1969)
123     AISM (1974)
13104   total

Cited research area

Cites   Area
2550    Math. Stat.
1987    Medical, Health
1868    Biology
1833    Engineering
1755    Computer, Info. Sci.
1376    Psychology
1070    Economics
785     Social Science
773     Environmental Sci.
673     Physics, Chemistry
610     Geophysics
561     Pharmacology
1428    Others

By Google Scholar

Cites   Paper
14265   IEEE-ac (1974)
8588    Akad. Kiado (1973)
1987    Psychometrika (1987)
1969    AISM (1969)

[Figures: year-by-year cites and cumulative cites, 1973 to 2008, for AK (1973) and IEEE (1974)]
Main Contributions in This Stage
1. Development of AIC
2. Importance of statistical modeling
3. Motivated nonstationary, non-Gaussian, nonlinear modeling
4. Developed TIMSAC-78
5. From AIC to Bayes Modeling
Models: $M_1, M_2, \ldots$ with $\mathrm{AIC}_1, \mathrm{AIC}_2, \ldots$
MAICE (minimum AIC estimate)

Likelihood of the model: $\exp(-\mathrm{AIC}_j / 2)$
Prior probability of order: $\pi_j$
Posterior probability of order:
\[ p(M_j \mid Y) = \frac{\exp(-\mathrm{AIC}_j / 2)\, \pi_j}{\sum_{i} \exp(-\mathrm{AIC}_i / 2)\, \pi_i} \]

Bayes estimate of the model:
\[ p(x \mid Y) = \sum_{j} p(x \mid \hat\theta_j)\, p(M_j \mid Y) \]

Prof. Akaike shifted to Bayesian modeling by 1976, at Harvard (1976).
Model averaging mitigated the instability inherent in model selection.
Akaike (1977a, 1978abc, 1979)
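The posterior probabilities based on exp(-AIC/2) can be computed as in the following minimal sketch. The function name `aic_posterior`, the uniform default prior, and the three AIC values are assumptions made for illustration only.

```python
import math


def aic_posterior(aic_values, prior=None):
    """Posterior model probabilities proportional to exp(-AIC_j / 2) * prior_j."""
    k = len(aic_values)
    if prior is None:
        prior = [1.0 / k] * k            # uniform prior over the models (assumed)
    best = min(aic_values)
    # Subtract the minimum AIC before exponentiating to avoid numerical underflow;
    # the common factor cancels in the normalization.
    w = [math.exp(-(a - best) / 2.0) * p for a, p in zip(aic_values, prior)]
    total = sum(w)
    return [v / total for v in w]


# Hypothetical AIC values of three candidate models.
weights = aic_posterior([100.0, 102.0, 110.0])
print(weights)
```

A difference of 2 in AIC corresponds to a factor of exp(-1) in the weight, so models close to the minimum still receive non-negligible probability. Averaging predictive distributions with these weights, rather than keeping only the minimum-AIC model, is what reduces the sensitivity to small changes in the data.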
Smoothness Prior : Bayesian Seasonal Adjustment
Seasonal Adjustment Problem
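\[ y_n = T_n + S_n + \varepsilon_n, \qquad n = 1, \ldots, N \]
$y_n$: observation, $T_n$: trend, $S_n$: seasonal component, $\varepsilon_n$: irregular component

Penalized least squares:
\[ \min_{T, S} \sum_{n=1}^{N} \Big[ (y_n - T_n - S_n)^2 + d^2 (\Delta^k T_n)^2 + r^2 (S_n - S_{n-12})^2 + z^2 (S_n + S_{n-1} + \cdots + S_{n-11})^2 \Big] \]
The first term measures infidelity to the data; the remaining terms measure infidelity to smoothness. Here $\Delta^k T_n$ denotes a $k$-th order difference of the trend.

Whittaker (1923), Shiller (1973), Good-Gaskins (1980), Akaike (1980)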
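The trade-off between infidelity to the data and infidelity to smoothness can be sketched with a deliberately simplified version of the problem: a trend only (no seasonal component) with a first-order difference penalty, so that the minimizer solves a tridiagonal linear system. This simplification, the function name `smooth_trend`, and the sample data are assumptions for illustration; this is not the BAYSEA procedure.

```python
def smooth_trend(y, d2):
    """Minimize sum (y_n - T_n)^2 + d2 * sum (T_n - T_{n-1})^2 over the trend T.

    The normal equations (I + d2 * D^T D) T = y are tridiagonal
    (D is the first-difference matrix) and are solved by the Thomas algorithm.
    """
    n = len(y)
    if n < 2:
        return list(y)
    diag = [1.0 + d2 * (1.0 if i in (0, n - 1) else 2.0) for i in range(n)]
    off = -d2                                # constant sub- and super-diagonal
    # Forward elimination
    cp = [0.0] * n
    dp = [0.0] * n
    cp[0] = off / diag[0]
    dp[0] = y[0] / diag[0]
    for i in range(1, n):
        denom = diag[i] - off * cp[i - 1]
        cp[i] = off / denom
        dp[i] = (y[i] - off * dp[i - 1]) / denom
    # Back substitution
    t = [0.0] * n
    t[-1] = dp[-1]
    for i in range(n - 2, -1, -1):
        t[i] = dp[i] - cp[i] * t[i + 1]
    return t


def roughness(v):
    """Sum of squared first differences."""
    return sum((v[i] - v[i - 1]) ** 2 for i in range(1, len(v)))


y = [1.0, 3.0, 2.0, 5.0, 4.0, 6.0]           # illustrative data
trend = smooth_trend(y, 4.0)
print(trend)
```

With d2 = 0 the trend reproduces the data exactly, and as d2 grows the trend approaches a constant. The choice of this weight is exactly the trade-off parameter problem that the Bayesian interpretation addresses.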
Determination of Trade-off Parameter via Bayesian Interpretation
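Crucial parameters: $(d^2, r^2, z^2)$
\[ \sum_{n=1}^{N} \Big[ (y_n - T_n - S_n)^2 + d^2 (\Delta^k T_n)^2 + r^2 (S_n - S_{n-12})^2 + z^2 (S_n + S_{n-1} + \cdots + S_{n-11})^2 \Big] \]

Multiply by $-1/(2\sigma^2)$ and exponentiate:
\[ \exp\Big\{ -\frac{1}{2\sigma^2} \sum_{n=1}^{N} (y_n - T_n - S_n)^2 \Big\} \, \exp\Big\{ -\frac{d^2}{2\sigma^2} \sum_{n=1}^{N} (\Delta^k T_n)^2 \Big\} \cdots \]

Bayesian interpretation, with $\lambda = (\sigma^2, d^2, r^2, z^2)$:
\[ \pi(T, S \mid y, \lambda) \propto p(y \mid T, S, \sigma^2)\, \pi(T, S \mid \lambda) \]
$\pi(T, S \mid \lambda)$: smoothness prior

Determination of $\lambda$ by ABIC (Akaike 1980):
\[ \mathrm{ABIC} = -2 \max \log L(\lambda) + 2\,(\text{dim of } \lambda) \]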
Various Practice of Bayes Modeling
Seasonal Adjustment
• BAYSEA
• DECOMP

Earth-Tide Analysis
• BAYTAP-G

Cohort Analysis

State-Space Modeling
• Time-varying AR model
• Extraction of effect of earthquake from groundwater level data
• Nonlinear smoothing
• Data assimilation

Software
• TIMSAC84: Time series analysis programs based on Bayesian modeling

Akaike and Kitagawa (1998), Higuchi (2007).
Impacts of AIC and ABIC
Shift of Statistical Paradigm
Estimation + Test → Statistical Modeling
Data in Information & Knowledge Society
Information Society
Information Technology
• Measurement devices
• Internet communication
• Database

Accumulation of Data
Massive Data
• Life Science: DNA data, micro-array data
• Marketing: POS data
• Finance: High frequency data
• Environmental Science
• Seismology, Meteorology
• Astronomy (whole-sky CCD camera)
• High energy physics

Statistical Modeling
Data-centric Science (Fourth Science)

                               Human inspiration dependent   Cyber-enabled
Deductive (principle driven)   Theoretical science           Computing science (simulation)
Inductive (data driven)        Experimental science          Data-centric science (fourth science)
Main Contributions of Prof. Akaike
1. Time domain multivariate modeling
   Identification method and various practices
2. Information criteria
   Change of statistical paradigm: modeling
[Photo: 1984]
Concluding Remarks: Akaike’s Research Style
• Challenge real problems for problem finding
• Joint works with researchers in various domains
• Occasionally change his policy if necessary
• Develop and open software
[Figure: cycle between real world problems and statistical science: problem finding, problem solving, knowledge transfer]
Epilogue: Analysis of Golf Swing (1994‐2009)
Golf swing motion analysis: An experiment on the use of verbal analysis in statistical reasoning, AISM, Vol.53, No.1, 1-10 (2001).
Blog: http://ameblo.jp/linear/
[Figure: renewal of his blog, 2006.3 to 2009.3]
Evaluation and comparison of verbally expressed psychological or physical image: Plausibility
Probability > Likelihood > Plausibility
Statistical Thinking
Making statistical thinking more productive, to appear in AISM, Vol.62, No.1 (2010).