Mathematics and Computers in Simulation 30 (1988) 33-38
North-Holland
SELF-ADAPTIVE MODELLING ALGORITHMS
D.G. GREEN
Research School of Pacific Studies, Australian National University, Canberra, ACT 2601, Australia
R.E. REICHELT and R.G. BUCK
Australian Institute of Marine Science, Townsville, Queensland 4810, Australia
“Self-adaptive” algorithms include a wide variety of procedures that derive models by “training” with data about the process concerned. Adaptive algorithms typically perform brilliantly under some conditions, but fail badly under others. Flexibility, reliability and speed are prime concerns in designing and using such algorithms.
1. INTRODUCTION
A common problem in developing simulations is lack of knowledge about the system being mod-
elled. Simulations frequently require guesses to be made about processes and variables that
have not been (and sometimes cannot be) determined or measured. Adaptive modelling algorithms
attempt to overcome this problem by substituting empirical knowledge for a lack of theoretical
knowledge. That is, they optimize the model’s performance using observations of the system it-
self.
Most adaptive algorithms mimic biological processes (e.g. learning, evolution). There are
many variations on the theme. Artificial intelligence and other areas of computer science are
producing new algorithms continually. However, scant attention has been given to their potential in simulation. Here we survey a few of the main types and suggest some possible future directions.
2. LEARNING ALGORITHMS
As the name suggests, learning algorithms mimic the learning process. Most of them improve
model performance via a systematic trial-and-error search and “remember” the results of past
trials.
Perhaps the simplest learning algorithm is hill-climbing [18]. For a function f(a1,...,an; x1,...,xn), this algorithm seeks values of parameters a1,...,an that give the best fit of f to a given set of training data [12]. If a1,...,an are known, then the hill-climb seeks values of x1,...,xn that yield local values for Max f or Min f (if they exist). Many optimization algorithms can be considered as hill-climbing, although most retain only the result of the last successful trial. To avoid becoming trapped on a “ridge”, the algorithm must be recursive, so that it checks other variables before abandoning values that initially fail for a particular
variable. Simulations can be calibrated by hill-climbing. For instance, we fitted models of fire spread to field data in this way [12]. However, the process is extremely time-consuming, since it involves multiple runs of whole simulations.
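As a sketch of the scheme just described, the following coordinate-wise hill-climb revisits every variable before giving up, so a value that initially fails for one coordinate can still be rescued via another. The quadratic objective and the step-halving schedule are our own illustrative choices, not part of the original algorithm.

```python
# Coordinate-wise hill-climbing sketch: adjust one parameter at a time,
# sweeping over all parameters before abandoning the search, and remember
# only trials that improve on the best result so far.

def hill_climb(f, params, step=0.5, tol=1e-6, max_iter=10000):
    """Maximize f over a list of parameters by coordinate search."""
    params = list(params)
    best = f(params)
    for _ in range(max_iter):
        improved = False
        for i in range(len(params)):          # check every variable each sweep
            for delta in (step, -step):
                trial = list(params)
                trial[i] += delta
                value = f(trial)
                if value > best:               # keep only improving trials
                    params, best = trial, value
                    improved = True
        if not improved:
            step /= 2.0                        # refine the search grid
            if step < tol:
                break
    return params, best

# Illustrative objective: a smooth peak at (3, -1).
peak = lambda p: -((p[0] - 3.0) ** 2 + (p[1] + 1.0) ** 2)
p, v = hill_climb(peak, [0.0, 0.0])
```

Calibrating a simulation this way simply makes `f` a full simulation run scored against field data, which is why the cost noted above is so high.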
More sophisticated learning algorithms require no prior knowledge of model structure. These algorithms are self-organizing; that is, they alter the form of a model to optimize its performance. This flexibility is made possible by the underlying model structure. A large class of
learning algorithms is associated with neural nets [1]. A neural net [20] consists of interconnected nodes (“synapses”) that combine to simulate the outputs of some system. Nodes in the array contain model elements (e.g. differential equations) and the “wiring” consists of pathways by which nodes exchange inputs and outputs. The nodes, the wiring pattern and the timing of interactions between nodes may all be changed adaptively. This idea, first motivated by the goal of creating an artificial brain, has links with work on both analogue and parallel computers. Early experiments with randomly constructed nets failed for the same reason that the hill-climb fails with complex functions: when there are too many possibilities the model tends to become trapped by false solutions. Recent work has sought to reduce the chaos by giving nets hierarchical structure [5].
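The single-node perceptron [20] is perhaps the simplest concrete instance of such a net: one node adapts the weights on its input pathways whenever its output disagrees with the training data. The AND-gate training set below is purely illustrative.

```python
# Perceptron learning rule: a single node adjusts the weights on its
# input pathways until its thresholded output matches the training data.

def train_perceptron(samples, n_inputs, rate=0.1, epochs=50):
    w = [0.0] * n_inputs
    b = 0.0
    for _ in range(epochs):
        for x, target in samples:
            out = 1 if sum(wi * xi for wi, xi in zip(w, x)) + b > 0 else 0
            err = target - out                 # zero when the node is right
            w = [wi + rate * err * xi for wi, xi in zip(w, x)]
            b += rate * err
    return w, b

def predict(w, b, x):
    return 1 if sum(wi * xi for wi, xi in zip(w, x)) + b > 0 else 0

# Illustrative training set: logical AND, which is linearly separable.
data = [((0, 0), 0), ((0, 1), 0), ((1, 0), 0), ((1, 1), 1)]
w, b = train_perceptron(data, 2)
```

A random tangle of such nodes exhibits exactly the trapping problem described above; a single separable node is the tractable base case.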
A good example of the neural net approach is the GMDH algorithm (“Group Method of Data Handling”) [16]. GMDH forms models consisting of a pyramid of connected polynomial sub-models (Fig. 1). Known values of independent variables are input to sub-models on the bottom tier of the pyramid and the resulting outputs are passed on as inputs to sub-models on tiers above, until a value emerges from the sub-model at the top of the pyramid. The pyramid is formed by a learning procedure, starting at the lowermost tier [6],[13]. GMDH has been applied to many problems, especially in economic and environmental management [1],[7],[11],[13],[22].
Figure 1. Structure of a typical model derived using the GMDH algorithm. Each box in the pyramid receives inputs (X1, X2 say) from boxes on the layer below and sends an output (Y say) to the layer above. Values of independent variables are input to the bottom layer and the resulting value emerges from the top layer. Each box contains a polynomial of the form Y = aX1^2 + bX2^2 + cX1 + dX2 + eX1X2 + f.
Evaluation trials [13] show that GMDH models usually interpolate well, but can be highly unstable when extrapolating. The algorithm is good at eliminating irrelevant variables, but tends to perform poorly when the process being modelled is noisy, highly non-linear or dependent on many variables (Fig. 2). Many of GMDH’s failings can be traced to the way it combines variables initially in pairs. Also, small data sets and collinearity between variables can lead to drastic over-fitting of the training data. Attempts are underway to find new algorithms that overcome these problems [22].
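The fitting step for a single node of the pyramid can be sketched as follows. This is our own minimal illustration: the layer-by-layer selection that assembles the full pyramid, and GMDH’s split of observations into training and checking sets, are omitted, and the data set is synthetic.

```python
# One node of a GMDH pyramid: fit the polynomial of Fig. 1,
#   Y = a*X1^2 + b*X2^2 + c*X1 + d*X2 + e*X1*X2 + f,
# to training data by least squares (normal equations).

def solve(A, rhs):
    """Gaussian elimination with partial pivoting, for small dense systems."""
    n = len(A)
    M = [row[:] + [rhs[i]] for i, row in enumerate(A)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(M[r][col]))
        M[col], M[pivot] = M[pivot], M[col]
        for r in range(col + 1, n):
            factor = M[r][col] / M[col][col]
            for c in range(col, n + 1):
                M[r][c] -= factor * M[col][c]
    x = [0.0] * n
    for i in range(n - 1, -1, -1):
        x[i] = (M[i][n] - sum(M[i][j] * x[j] for j in range(i + 1, n))) / M[i][i]
    return x

def features(x1, x2):
    return [x1 * x1, x2 * x2, x1, x2, x1 * x2, 1.0]

def fit_node(pairs, ys):
    """Least-squares fit of the six polynomial coefficients."""
    rows = [features(x1, x2) for x1, x2 in pairs]
    A = [[sum(r[i] * r[j] for r in rows) for j in range(6)] for i in range(6)]
    rhs = [sum(r[i] * y for r, y in zip(rows, ys)) for i in range(6)]
    return solve(A, rhs)

def node_output(coef, x1, x2):
    return sum(c * f for c, f in zip(coef, features(x1, x2)))

# Synthetic training data generated from a known polynomial.
pairs = [(x1, x2) for x1 in range(-2, 3) for x2 in range(-2, 3)]
target = lambda x1, x2: x1 * x1 + 2.0 * x1 * x2 + 3.0
coef = fit_node(pairs, [target(x1, x2) for x1, x2 in pairs])
```

In the full algorithm, many such nodes are fitted for candidate input pairs and only the best performers on held-out data feed the next tier.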
Figure 2. Decline in performance of GMDH-derived models for source data sets with increasing numbers of independent variables and increasing noise. The models are formed by training with a set of 100 observations (“source” data), then used to predict values of the dependent variable in a second set (“target” data). (Based on [13].)
3. EVOLVING SYSTEMS
Evolving systems are simulations that apply evolution-like procedures to model refinement.
They randomly mutate a whole population of models (or model elements), removing those that perform poorly. For example, in an intriguing series of experiments, Fogel [8] used this approach
to deal with whole populations of models, each of which was represented as a formally defined
Turing machine [15]. By mutation and selection of these models, he tried to evolve models that
would predict outputs from a “black box”. In principle, an evolutionary approach can generate
models of any complexity, but the process is inefficient and slow and suffers from the same
problems afflicting random neural nets.
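The mutate-and-select cycle can be sketched as below. Representing each candidate model as three polynomial coefficients is our own simplification for illustration; Fogel’s populations consisted of finite-state machines, not polynomials.

```python
import random

# Evolution-style model search: a population of candidate models is
# randomly mutated and the worst performers are removed each generation.

random.seed(1)

def black_box(x):                      # hidden process the models must mimic
    return 2.0 * x * x - 3.0 * x + 1.0

TRAIN = [(x / 2.0, black_box(x / 2.0)) for x in range(-8, 9)]

def error(model):
    """Sum of squared prediction errors over the observed data."""
    a, b, c = model
    return sum((a * x * x + b * x + c - y) ** 2 for x, y in TRAIN)

def mutate(model):
    return tuple(g + random.gauss(0.0, 0.05) for g in model)

population = [(0.0, 0.0, 0.0)] * 20
for generation in range(3000):
    population += [mutate(m) for m in population]   # random variation
    population.sort(key=error)                      # selection:
    population = population[:20]                    # keep the fittest

best = population[0]                   # coefficients approach (2, -3, 1)
```

Note that nothing in the procedure inspects the black box’s structure; the surviving model predicts well but need not resemble the mechanism inside.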
Models evolved in the above way may bear no resemblance to the system of interest. They
are of little use if one is trying to understand the system. An important type of adaptation
concerns simulation models in which behavioural properties of the system emerge in the course of running the model. For instance, Hogeweg and Hesper [14] have used models of this sort to show that the social structure of bumble-bee colonies is a simple consequence of the interaction between the environment and the behaviour of individual bees. Green and Ball [in prep.] tested hypotheses about mangrove ecology by allowing populations to evolve.
4. SYNTACTIC ALGORITHMS
Adaptive algorithms need not be restricted to numerical data [2]. Formal language models, which are defined by sets of grammatical rules, are extremely flexible and can capture qualitative features of systems simply and naturally [2],[3],[10]. Syntactic pattern recognition [9], in which observations are expressed in some formal language, points to the need for models that manipulate syntactic, rather than numerical, data and raises the possibility of adaptive algorithms that manipulate grammatical rules, instead of algebraic formulae, as elements in model construction [2],[10].
Many syntactic algorithms exist in the form of parsers, which interpret inputs as expressions in some formal language [15]. Unfortunately, the very flexibility of syntactic models makes the problems of semantics and ambiguity very severe. For example, the trivial sequence “ABABABAB...” may be parsed in at least two ways: the rule “AB->C” yields the result “CCCC...” and the rules “A->D”, “B->D” yield “DDDDDDDD...”. Ambiguities such as this, even in the simplest cases, emphasize the need for syntactic algorithms to incorporate knowledge about the system being modelled, so that they can make the most appropriate choices of model structure.
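The two competing parses can be reproduced with a few lines of string rewriting; the `rewrite` helper is our own minimal stand-in for a parser.

```python
# The two rule sets described above reduce the same observation
# sequence to two different "models" of it.

def rewrite(sequence, rules):
    """Apply rewrite rules repeatedly until no rule matches."""
    changed = True
    while changed:
        changed = False
        for lhs, rhs in rules:
            if lhs in sequence:
                sequence = sequence.replace(lhs, rhs)
                changed = True
    return sequence

observation = "ABABABAB"
parse1 = rewrite(observation, [("AB", "C")])             # -> "CCCC"
parse2 = rewrite(observation, [("A", "D"), ("B", "D")])  # -> "DDDDDDDD"
```

Both reductions are grammatically valid; only knowledge of the system can say which abstraction is the useful one.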
Three ways to incorporate knowledge into modelling algorithms are via expert systems (see
next section), embodiments, and frames. Embodiments [10] are simulations that embody formal
languages by interpreting them as descriptions of systems in the real world. Thus the syntax
of the modelling language can be made very simple and can relate directly to the training data
used in model building. The frame concept [18] largely overcomes the ambiguity problem.
Loosely speaking, a frame is a structure containing both information and procedures. It also
contains information defining when it becomes active and what other frames may be relevant.
The frame concept is a computing analogue of Piaget’s idea of “mental schemas”, which humans
employ in perception and behaviour. For modelling applications, a frame would contain simple
algorithms defining how to construct or combine different model elements using the training
data.
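A frame might be sketched as below. This is our own minimal reading of the concept, with invented activation tests that resolve the “ABABABAB” ambiguity from the previous section; real frame systems are far richer.

```python
# Minimal frame sketch: each frame bundles a procedure with an activation
# test saying when it applies, so an ambiguous input activates only the
# frame whose context matches.

class Frame:
    def __init__(self, name, is_active, build):
        self.name = name
        self.is_active = is_active   # predicate: does this frame apply?
        self.build = build           # procedure: construct a model element

frames = [
    Frame("pairwise", lambda seq: seq.startswith("AB"),
          lambda seq: seq.replace("AB", "C")),
    Frame("uniform", lambda seq: len(set(seq)) == 1,
          lambda seq: seq.replace(seq[0], "D")),
]

def interpret(seq):
    for frame in frames:
        if frame.is_active(seq):     # first matching frame wins
            return frame.name, frame.build(seq)
    return None, seq
```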
5. EXPERT SYSTEMS
As we have seen in earlier sections, there can be practical and aesthetic objections to
building models in ignorance. Many aspects of a system may be unknown, but many others are
usually well-known. Indeed, for many purposes simulations are a means of embodying one’s
knowledge of a system [10]. Viewed in this light, many similarities can be seen between simulation and expert systems as means of representing knowledge [19],[21]. However, there are also marked differences [21]: for instance, simulations are usually numeric, contain explicit solution steps and have rigid structure, whereas expert systems usually contain many symbolic processes, find solutions by pattern searching, and have a flexible structure that can be changed by “learning”. These differences suggest combining the strengths of the two approaches. Such links can take many forms [19], for example: (1) an expert system may provide an intelligent “front end” to set up a simulation [10]; (2) an expert system may be embedded in a simulation in order to make the model self-adaptive or to avoid redundant computations; (3) a simulation may be embedded within an expert system as a means of checking cases not covered by known rules; and (4) an expert system may work in parallel to extract “interesting” observations from the performance of a simulation [14],[17].
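Link (2) might look like the following sketch: a tiny forward-chaining rule base consulted by a simulation as facts about the run accumulate. The rules and facts are invented purely for illustration.

```python
# Forward-chaining sketch: fire rules whose conditions hold until no new
# facts appear, i.e. find conclusions by pattern searching rather than by
# explicit solution steps.

def forward_chain(facts, rules):
    facts = set(facts)
    changed = True
    while changed:
        changed = False
        for conditions, conclusion in rules:
            if conclusion not in facts and all(c in facts for c in conditions):
                facts.add(conclusion)        # a rule has fired
                changed = True
    return facts

# Invented rule base for adapting a running simulation.
rules = [
    ({"output unstable"}, "reduce time step"),
    ({"reduce time step", "run too slow"}, "coarsen spatial grid"),
]
facts = forward_chain({"output unstable", "run too slow"}, rules)
```

The second rule fires only because the first has added its conclusion to the fact base, which is the “flexible structure” contrasted with rigid simulation code above.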
6. CONCLUSION
An important criterion when selecting an adaptive algorithm is the reason for seeking to generate a model. Adaptive algorithms can derive models that give highly accurate predictions, but unlike “normal” simulations, which are built from knowledge of the system being modelled, the structure of these derived models may bear no resemblance at all to the system of interest. Thus if the goal of the exercise is to learn how a system works, rather than prediction, then some of the algorithms discussed here may be unsuitable. Few advanced methods are available “off-the-shelf”; most require a substantial programming exercise to implement. For example, our implementation of GMDH (program KEOPS) occupies 1200 lines of code in the language Pascal.
Several observations can be made about adaptive algorithms in general. First, they tend to
perform brilliantly in some conditions, but fail badly under others. Second, as with all mod-
elling techniques, they make implicit assumptions that they embed in the models they derive.
Third, statistical inference and error estimation are often difficult because the variables may have distributions that are either unknown or else too complex to determine analytically. However, the bootstrap [4], which models distributions empirically by resampling from the available data, provides a way to assess the bias caused by the particular observations used: the algorithm is run several times, using randomly selected subsets of the data each time. Finally, adaptive techniques often make heavy demands on computing resources. GMDH, for instance, can easily swamp both the memory and processing capacity of even the largest computer. However, computing power today is so cheap and plentiful that even such demanding techniques are feasible to use.
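The bootstrap procedure can be sketched as follows. Here the sample mean stands in for a full model-fitting algorithm, and the data values are invented; in practice `statistic` would be the adaptive algorithm itself, run once per resampled data set.

```python
import random

# Bootstrap sketch [4]: re-run the fitting procedure on data sets drawn
# with replacement from the observations, then inspect how much the
# fitted statistic drifts (bias) and scatters (spread).

random.seed(2)

def bootstrap(data, statistic, n_resamples=1000):
    estimates = []
    for _ in range(n_resamples):
        resample = [random.choice(data) for _ in data]  # with replacement
        estimates.append(statistic(resample))
    return estimates

data = [2.1, 2.4, 1.9, 2.8, 2.2, 2.5, 2.0, 2.3]        # invented observations
mean = lambda xs: sum(xs) / len(xs)

estimates = bootstrap(data, mean)
bias = mean(estimates) - mean(data)
spread = mean([(e - mean(estimates)) ** 2 for e in estimates]) ** 0.5
```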
References
[1] Barron, R.L., Mucciardi, A.N., Cook, F.J., Craig, J.N. and Barron, A.R., Adaptive learning networks: development and application in the U.S. of GMDH, in: S.J. Farlow (ed.), Self-Organizing Methods in Modeling (Marcel Dekker, New York, 1984) 25-65.
[2] Bradbury, R.H., Green, D.G. and Reichelt, R.E., Qualitative patterns and processes in coral reef ecology - a conceptual programme, Marine Ecology - Progress Series 29 (1986) 299-304.
[3] Bradbury, R.H. and Loya, Y., A heuristic analysis of spatial pattern of hermatypic corals at Eilat, Red Sea, American Naturalist 112 (1978) 493-507.
[4] Efron, B., Computers and the theory of statistics: thinking the unthinkable, SIAM Review 21 (1979) 460-480.
[5] Eilbert, J.L. and Salter, R.M., Modeling neural networks in Scheme, Simulation 46 (1986) 193-199.
[6] Farlow, S.J., The GMDH algorithm of Ivakhnenko, Amer. Stat. 35 (1981) 210-215.
[7] Farlow, S.J. (ed.), Self-Organizing Methods in Modeling (Marcel Dekker, New York, 1984).
[8] Fogel, L.J., Owens, A.J. and Walsh, M.J., Artificial Intelligence through Simulated Evolution (John Wiley and Sons, New York, 1966).
[9] Fu, K.S., Syntactic Methods in Pattern Recognition (Academic Press, New York, 1974).
[10] Green, D.G., Bradbury, R.H. and Bainbridge, S., Embodiment of formal languages, This volume (1987).
[11] Green, D.G., Bradbury, R.H. and Reichelt, R.E., Patterns of predictability in coral reefs, Coral Reefs (1987) to appear.
[12] Green, D.G., Gill, A.M. and Noble, I.R., Fire shapes and the adequacy of fire-spread models, Ecol. Modelling 20 (1983) 33-45.
[13] Green, D.G., Reichelt, R.E. and Bradbury, R.H., Statistical behaviour of the GMDH algorithm, Biometrics (1987) to appear.
[14] Hogeweg, P. and Hesper, B., The ontogeny of the interaction structure in bumble bee colonies: a MIRROR model, Behav. Ecol. Sociobiol. 12 (1983) 271-283.
[15] Hopcroft, J.E. and Ullman, J.D., Formal Languages and their Relation to Automata (Addison-Wesley, London, 1969).
[16] Ivakhnenko, A.G., Group method of data handling - a rival of the method of stochastic approximation, Sov. Autom. Control 13 (1966) 43-71.
[17] Lenat, D.B., Software for intelligent systems, Sci. Amer. 251 (1984) 152-161.
[18] Minsky, M., Steps towards artificial intelligence, in: E.A. Feigenbaum and J. Feldman (eds.), Computers and Thought (McGraw-Hill, New York, 1963).
[19] O'Keefe, R., Simulation and expert systems - a taxonomy and some examples, Simulation 46 (1986) 10-16.
[20] Rosenblatt, F., The perceptron: a probabilistic model for information storage and organization in the brain, Psychol. Rev. 65 (1958) 368-408.
[21] Shannon, R.E., Mayer, R. and Adelsberger, H., Expert systems and simulation, Simulation 44 (1985) 275-284.
[22] Tamura, H. and Kondo, T., On revised algorithms of GMDH with applications, in: S.J. Farlow (ed.), Self-Organizing Methods in Modeling (Marcel Dekker, New York, 1984) 225-242.
This paper is Australian Institute of Marine Science contribution number 354.