
Mathematics and Computers in Simulation 30 (1988) 33-38

North-Holland


SELF-ADAPTIVE MODELLING ALGORITHMS

D.G. GREEN

Research School of Pacific Studies, Australian National University, Canberra, ACT 2601, Australia

R.E. REICHELT and R.G. BUCK

Australian Institute of Marine Science, Townsville, Queensland 4810, Australia

“Self-adaptive” algorithms include a wide variety of procedures that derive models by “training” with data about the process concerned. Adaptive algorithms typically perform brilliantly under some conditions, but fail badly under others. Flexibility, reliability and speed are prime concerns in designing and using such algorithms.

1. INTRODUCTION

A common problem in developing simulations is lack of knowledge about the system being modelled. Simulations frequently require guesses to be made about processes and variables that have not been (and sometimes cannot be) determined or measured. Adaptive modelling algorithms attempt to overcome this problem by substituting empirical knowledge for a lack of theoretical knowledge. That is, they optimize the model’s performance using observations of the system itself.

Most adaptive algorithms mimic biological processes (e.g. learning, evolution). There are many variations on the theme. Artificial intelligence and other areas of computer science are producing new algorithms continually. However, scant attention has been given to their potential in simulation. Here we survey a few of the main types and suggest some possible future directions.

2. LEARNING ALGORITHMS

As the name suggests, learning algorithms mimic the learning process. Most of them improve model performance via a systematic trial-and-error search and “remember” the results of past trials.

Perhaps the simplest learning algorithm is hill-climbing [18]. For a function f(a_1,...,a_n; x_1,...,x_n), this algorithm seeks values of the parameters a_1,...,a_n that give the best fit of f to a given set of training data [12]. If a_1,...,a_n are known, then the hill-climb seeks values of x_1,...,x_n that yield local values for Max f or Min f (if they exist). Many optimization algorithms can be considered as hill-climbing, although most retain only the result of the last successful trial. To avoid becoming trapped on a “ridge”, the algorithm must be recursive, so that it checks other variables before abandoning values that initially fail for a particular variable.


Simulations can be calibrated by hill-climbing. For instance, we fitted models of fire spread to field data in this way [12]. However, the process is extremely time-consuming, since it involves multiple runs of whole simulations!
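As a rough illustration of the trial-and-error search just described, the following sketch (in Python) climbs one parameter at a time, keeping only moves that improve the fit and shrinking the step when no move helps. The function names, step scheme and toy data are illustrative assumptions, not the calibration code used for the fire-spread models.

    def hill_climb(fit, params, step=0.1, shrink=0.5, tol=1e-6):
        """Maximize fit(params) by perturbing one parameter at a time.

        fit    -- returns a goodness-of-fit score for a parameter vector
        params -- initial guess, as a list of floats
        """
        params = list(params)
        best = fit(params)
        while step > tol:
            improved = False
            for i in range(len(params)):          # try each parameter in turn
                for delta in (step, -step):       # probe both directions
                    trial = list(params)
                    trial[i] += delta
                    score = fit(trial)
                    if score > best:              # keep only improving moves
                        params, best = trial, score
                        improved = True
            if not improved:                      # no direction helped:
                step *= shrink                    # refine the step size
        return params, best

    # Example: calibrate two parameters of a toy linear model against observations.
    obs = [(0.0, 1.0), (1.0, 3.1), (2.0, 4.9)]
    fit = lambda p: -sum((p[0] + p[1] * x - y) ** 2 for x, y in obs)
    print(hill_climb(fit, [0.0, 0.0]))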

More sophisticated learning algorithms require no prior knowledge of model structure. These algorithms are self-organizing, that is, they alter the form of a model to optimize its performance. This flexibility is made possible by the underlying model structure. A large class of learning algorithms is associated with neural nets [1]. A neural net [20] consists of interconnected nodes (“synapses”) that combine to simulate the outputs of some system. Nodes in the array contain model elements (e.g. differential equations) and the “wiring” consists of pathways by which nodes exchange inputs and outputs. The nodes, the wiring pattern and the timing of interactions between nodes may all be changed adaptively. This idea, first motivated by the goal of creating an artificial brain, has links with work on both analogue and parallel computers. Early experiments with randomly constructed nets failed for the same reason that the hill-climb fails with complex functions: when there are too many possibilities the model tends to become trapped by false solutions. Recent work has sought to reduce the chaos by giving nets hierarchical structure [5].
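The sketch below illustrates, under our own simplifying assumptions, the kind of randomly constructed, adaptively adjusted net discussed above: a small two-layer net whose weights are perturbed at random and retained only when the fit to the training data improves. It is not any published neural-net algorithm, and with many weights it exhibits exactly the trapping behaviour noted in the text.

    import numpy as np

    rng = np.random.default_rng(1)
    X = rng.normal(size=(50, 3))                   # training inputs
    y = np.tanh(X @ np.array([1.0, -2.0, 0.5]))    # observed outputs of the "system"

    W1 = rng.normal(size=(3, 5))                   # random initial wiring
    W2 = rng.normal(size=(5,))

    def predict(W1, W2):
        return np.tanh(X @ W1) @ W2                # a small two-layer net

    def error(W1, W2):
        return np.mean((predict(W1, W2) - y) ** 2)

    best = error(W1, W2)
    for _ in range(5000):                          # random perturbation trials
        dW1 = 0.05 * rng.normal(size=W1.shape)
        dW2 = 0.05 * rng.normal(size=W2.shape)
        e = error(W1 + dW1, W2 + dW2)
        if e < best:                               # keep only improving changes
            W1, W2, best = W1 + dW1, W2 + dW2, e
    print("final training error:", best)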

A good example of the neural net approach is the GMDH algorithm (“Group Method of Data Handling”) [16]. GMDH forms models consisting of a pyramid of connected polynomial sub-models (Fig. 1). Known values of independent variables are input to sub-models on the bottom tier of the pyramid and the resulting outputs are passed on as inputs to sub-models on tiers above until a value emerges from the sub-model at the top of the pyramid. The pyramid is formed by a learning procedure, starting at the lowermost tier [6],[13]. GMDH has been applied to many problems, especially in economic and environmental management [1],[7],[11],[13],[22].

Figure 1. Structure of a typical model derived using the GMDH algorithm. Each box in the pyramid receives inputs (X1, X2 say) from boxes on the layer below and sends an output (Y say) to the layer above. Values of independent variables are input to the bottom layer and the resulting value emerges from the top layer. Each box contains a polynomial of the form Y = aX1^2 + bX2^2 + cX1 + dX2 + eX1X2 + f.
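A minimal sketch of one GMDH-style layer follows, assuming NumPy is available. Each candidate sub-model fits the quadratic polynomial of Fig. 1 to a pair of inputs by least squares, and the best few sub-models supply the inputs to the layer above. The pair-selection rule and stopping criterion are simplified assumptions; this is not the KEOPS program referred to in the conclusion.

    import numpy as np
    from itertools import combinations

    def fit_pair(x1, x2, y):
        """Fit Y = a*x1^2 + b*x2^2 + c*x1 + d*x2 + e*x1*x2 + f by least squares."""
        design = np.column_stack([x1**2, x2**2, x1, x2, x1 * x2, np.ones_like(x1)])
        coeffs, *_ = np.linalg.lstsq(design, y, rcond=None)
        pred = design @ coeffs
        rmse = np.sqrt(np.mean((pred - y) ** 2))
        return pred, rmse

    def gmdh_layer(X, y, keep=4):
        """Fit a sub-model for every pair of columns of X and keep the best few.

        Returns the kept sub-models' outputs (the inputs for the layer above)
        and their errors. A full GMDH would score on separate selection data.
        """
        candidates = []
        for i, j in combinations(range(X.shape[1]), 2):
            pred, rmse = fit_pair(X[:, i], X[:, j], y)
            candidates.append((rmse, pred))
        candidates.sort(key=lambda c: c[0])
        kept = candidates[:keep]
        return np.column_stack([p for _, p in kept]), [e for e, _ in kept]

    # Example: two layers built over four independent variables with mild noise.
    rng = np.random.default_rng(0)
    X = rng.normal(size=(100, 4))
    y = X[:, 0] * X[:, 1] + 0.5 * X[:, 2] + rng.normal(scale=0.1, size=100)
    layer1, errs1 = gmdh_layer(X, y)
    layer2, errs2 = gmdh_layer(layer1, y)
    print("best error per layer:", min(errs1), min(errs2))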


Evaluation trials [13] show that GMDH models usually interpolate well, but can be highly unstable when extrapolating. The algorithm is good at eliminating irrelevant variables, but tends to perform poorly when the process being modelled is noisy, highly non-linear or dependent on many variables (Fig. 2). Many of GMDH’s failings can be traced to the way it combines variables initially in pairs. Also, small data sets and collinearity between variables can lead to drastic over-fitting of the training data. Attempts are underway to find new algorithms that overcome these problems [22].


Figure 2. Decline in performance of GMDH-derived models for source data sets with increasing numbers of independent variables and increasing noise. The models are formed by training with a set of 100 observations (“source” data), then used to predict values of the dependent variable in a second set (“target” data). (Based on [13].)

3. EVOLVING SYSTEMS

Evolving systems are simulations that apply evolution-like procedures to model refinement. They randomly mutate a whole population of models (or model elements), removing those that perform poorly. For example, in an intriguing series of experiments, Fogel [8] used this approach to deal with whole populations of models, each of which was represented as a formally defined Turing machine [15]. By mutation and selection of these models, he tried to evolve models that would predict outputs from a “black box”. In principle, an evolutionary approach can generate models of any complexity, but the process is inefficient and slow and suffers from the same problems afflicting random neural nets.
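The following sketch shows the mutate-and-select loop in its simplest form: a population of candidate models (here just coefficient triples, not Turing machines) is mutated at random and the worst performers are discarded each generation. The black box, model form and mutation scale are invented for illustration.

    import random

    def black_box(x):                      # the system to be predicted (unknown to the models)
        return 3.0 * x * x - 2.0 * x + 1.0

    data = [(x / 10.0, black_box(x / 10.0)) for x in range(-20, 21)]

    def fitness(model):                    # model = (a, b, c) for a*x^2 + b*x + c
        a, b, c = model
        return -sum((a * x * x + b * x + c - y) ** 2 for x, y in data)

    def mutate(model):
        return tuple(p + random.gauss(0.0, 0.1) for p in model)

    population = [tuple(random.uniform(-1, 1) for _ in range(3)) for _ in range(20)]

    for generation in range(200):
        offspring = [mutate(m) for m in population]            # each model breeds a mutant
        population = sorted(population + offspring,            # rank parents and offspring
                            key=fitness, reverse=True)[:20]    # discard the worst half

    print("best evolved model:", max(population, key=fitness))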

Models evolved in the above way may bear no resemblance to the system of interest. They are of little use if one is trying to understand the system. An important type of adaptation concerns simulation models in which behavioural properties of the system emerge in the course of running the model. For instance, Hogeweg and Hesper [14] have used models of this sort to show that the social structure of bumble-bee colonies is a simple consequence of the interaction between the environment and the behaviour of individual bees. Green and Ball [in prep.] tested hypotheses about mangrove ecology by allowing populations to evolve.

4. SYNTACTIC ALGORITHMS

Adaptive algorithms need not be restricted to numerical data [2]. Formal language models, which are defined by sets of grammatical rules, are incredibly flexible and can capture qualitative features of systems simply and naturally [2],[3],[10]. Syntactic pattern recognition [9], in which observations are expressed in some formal language, points to the need for models that manipulate syntactic, rather than numerical, data and raises the possibility of adaptive algorithms that manipulate grammatical rules, instead of algebraic formulae, as elements in model construction [2],[10].

Many syntactic algorithms exist in the form of parsers, which interpret inputs as expressions in some formal language [15]. Unfortunately, the very flexibility of syntactic models makes the problems of semantics and ambiguity very severe. For example, the trivial sequence “ABABABAB...” may be parsed in at least two ways: the rule “AB->C” yields the result “CCCC...” and the rules “A->D”, “B->D” yield “DDDDDDDD...”. Ambiguities such as this, even in the simplest cases, emphasize the need for syntactic algorithms to incorporate knowledge about the system being modelled, so that they can make the most appropriate choices of model structure.
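The ambiguity can be reproduced with a few lines of string rewriting; the helper below is purely illustrative and stands in for a real parser.

    def rewrite(sequence, rules):
        """Repeatedly apply string-rewriting rules until nothing changes."""
        changed = True
        while changed:
            changed = False
            for lhs, rhs in rules:
                if lhs in sequence:
                    sequence = sequence.replace(lhs, rhs)
                    changed = True
        return sequence

    print(rewrite("ABABABAB", [("AB", "C")]))             # -> "CCCC"
    print(rewrite("ABABABAB", [("A", "D"), ("B", "D")]))  # -> "DDDDDDDD"

Without knowledge of the system, neither result can be preferred over the other.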

Three ways to incorporate knowledge into modelling algorithms are via expert systems (see next section), embodiments, and frames. Embodiments [10] are simulations that embody formal languages by interpreting them as descriptions of systems in the real world. Thus the syntax of the modelling language can be made very simple and can relate directly to the training data used in model building. The frame concept [18] largely overcomes the ambiguity problem. Loosely speaking, a frame is a structure containing both information and procedures. It also contains information defining when it becomes active and what other frames may be relevant. The frame concept is a computing analogue of Piaget’s idea of “mental schemas”, which humans employ in perception and behaviour. For modelling applications, a frame would contain simple algorithms defining how to construct or combine different model elements using the training data.
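A minimal sketch of such a frame for model building, with all slot names invented for illustration, might look like this: data slots, a condition saying when the frame applies, a procedure for building a model element, and links to related frames.

    from dataclasses import dataclass, field
    from typing import Any, Callable, Dict, List

    @dataclass
    class Frame:
        name: str
        slots: Dict[str, Any]                     # factual information
        is_active: Callable[[dict], bool]         # when does this frame apply?
        build: Callable[[dict], Any]              # how to build a model element
        related: List[str] = field(default_factory=list)   # other relevant frames

    growth_frame = Frame(
        name="exponential_growth",
        slots={"variable": "population"},
        is_active=lambda data: data.get("trend") == "accelerating",
        build=lambda data: f"dN/dt = r*N fitted to {len(data['series'])} points",
        related=["logistic_growth"],
    )

    data = {"trend": "accelerating", "series": [1, 2, 4, 8, 16]}
    if growth_frame.is_active(data):
        print(growth_frame.build(data))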

5. EXPERT SYSTEMS

As we have seen in earlier sections, there can be practical and aesthetic objections to

building models in ignorance. Many aspects of a system may be unknown, but many others are

usually well-known. Indeed, for many purposes simulations are a means of embodying one’s


knowledge of a system [10]. Viewed in this light, many similarities can be seen between simulation and expert systems as means of representing knowledge [19],[21]. However, there are also marked differences [21]: for instance, simulations are usually numeric, contain explicit solution steps and have rigid structure, whereas expert systems usually contain many symbolic processes, find solutions by pattern searching, and have a flexible structure that can be changed by “learning”. These differences suggest combining the strengths of the two approaches. Such links can take many forms [19], for example: (1) an expert system may provide an intelligent “front end” to set up a simulation [10]; (2) an expert system may be embedded in a simulation in order to make the model self-adaptive or to avoid redundant computations; (3) a simulation may be embedded within an expert system as a means of checking cases not covered by known rules; and (4) an expert system may work in parallel to extract “interesting” observations from the performance of a simulation [14],[17].
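As a purely illustrative example of option (1), a rule-based front end might map qualitative answers about a scenario onto simulation settings. The rules and setting names below are invented and are not drawn from any of the systems cited.

    # Ordered if-then rules: the first matching condition supplies the settings.
    RULES = [
        (lambda a: a["terrain"] == "steep" and a["wind"] == "high",
         {"spread_model": "elliptical", "time_step": 0.1}),
        (lambda a: a["terrain"] == "flat",
         {"spread_model": "circular", "time_step": 0.5}),
    ]

    def configure_simulation(answers):
        """Return simulation settings from the first rule whose condition matches."""
        for condition, settings in RULES:
            if condition(answers):
                return settings
        return {"spread_model": "circular", "time_step": 1.0}   # default fallback

    print(configure_simulation({"terrain": "steep", "wind": "high"}))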

6. CONCLUSION

An important criterion when selecting an adaptive algorithm is to ask the reason for seeking to generate a model. Adaptive algorithms can derive models that give highly accurate predictions, but unlike “normal” simulations, which are built from knowledge of the system being modelled, the structure of these derived models may bear no resemblance at all to the system of interest. Thus if the goal of the exercise is to learn how a system works, rather than prediction, then some of the algorithms discussed here may be unsuitable. Few advanced methods are available “off-the-shelf”, so a substantial programming exercise is usually required to implement them. For example, our implementation of GMDH (program KEOPS) occupies 1200 lines of code in the language Pascal.

Several observations can be made about adaptive algorithms in general. First, they tend to perform brilliantly under some conditions, but fail badly under others. Second, as with all modelling techniques, they make implicit assumptions that they embed in the models they derive. Third, statistical inference and error estimation are often difficult because the variables may have distributions that are either unknown or else too complex to determine analytically. However, the bootstrap [4], which models distributions empirically by sampling from the available data, provides a way to assess the bias caused by the particular observations used. That is, the algorithm is run several times, using randomly selected subsets of the data each time.
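A minimal sketch of this resampling idea follows, using an arbitrary fitted quantity (a slope through the origin) in place of a full adaptive algorithm; in practice the whole model-building procedure would be rerun on each resample.

    import random

    data = [(x, 2.0 * x + random.gauss(0.0, 1.0)) for x in range(20)]

    def fit_slope(sample):
        """Least-squares slope through the origin for (x, y) pairs."""
        return sum(x * y for x, y in sample) / sum(x * x for x, y in sample)

    estimates = []
    for _ in range(1000):
        resample = [random.choice(data) for _ in data]   # sample with replacement
        estimates.append(fit_slope(resample))

    estimates.sort()
    print("slope on full data:", fit_slope(data))
    print("bootstrap 90% interval:", estimates[50], "-", estimates[950])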

Finally, adaptive techniques often make heavy demands on computing resources. GMDH, for instance, can easily swamp both the memory and processing capacity of even the largest computer. However, computing power today is so cheap and plentiful that even such demanding techniques are feasible to use.


References


[1] Barron, R.L., Mucciardi, A.N., Cook, F.J., Craig, J.N., and Barron, A.R., Adaptive learning networks: development and application in the U.S. of GMDH, in: S.J. Farlow (ed.), Self-Organizing Methods in Modeling (Marcel Dekker, New York, 1984) 25-65.
[2] Bradbury, R.H., Green, D.G. and Reichelt, R.E., Qualitative patterns and processes in coral reef ecology - a conceptual programme, Marine Ecology - Progress Series 29 (1986) 299-304.
[3] Bradbury, R.H. and Loya, Y., A heuristic analysis of spatial pattern of hermatypic corals at Eilat, Red Sea, American Naturalist 112 (1978) 493-507.
[4] Efron, B., Computers and the theory of statistics: thinking the unthinkable, SIAM Review 21 (1979) 460-480.
[5] Eilbert, J.L. and Salter, R.M., Modeling neural networks in Scheme, Simulation 46 (1986) 193-199.
[6] Farlow, S.J., The GMDH algorithm of Ivakhnenko, Amer. Stat. 35 (1981) 210-215.
[7] Farlow, S.J. (ed.), Self-Organizing Methods in Modeling (Marcel Dekker, New York, 1984).
[8] Fogel, L.J., Owens, A.J. and Walsh, M.J., Artificial Intelligence through Simulated Evolution (John Wiley and Sons, New York, 1966).
[9] Fu, K.S., Syntactic Methods in Pattern Recognition (Academic Press, New York, 1974).
[10] Green, D.G., Bradbury, R.H. and Bainbridge, S., Embodiment of formal languages, this volume (1987).
[11] Green, D.G., Bradbury, R.H. and Reichelt, R.E., Patterns of predictability in coral reefs, Coral Reefs (1987) to appear.
[12] Green, D.G., Gill, A.M. and Noble, I.R., Fire shapes and the adequacy of fire-spread models, Ecol. Modelling 20 (1983) 33-45.
[13] Green, D.G., Reichelt, R.E. and Bradbury, R.H., Statistical behaviour of the GMDH algorithm, Biometrics (1987) to appear.
[14] Hogeweg, P. and Hesper, B., The ontogeny of the interaction structure in bumble bee colonies: a MIRROR model, Behav. Ecol. Sociobiol. 12 (1983) 271-283.
[15] Hopcroft, J.E. and Ullman, J.D., Formal Languages and their Relation to Automata (Addison-Wesley, London, 1969).
[16] Ivakhnenko, A.G., Group method of data handling - a rival of the method of stochastic approximation, Sov. Autom. Control 13 (1966) 43-71.
[17] Lenat, D.B., Software for intelligent systems, Sci. Amer. 251 (1984) 152-161.
[18] Minsky, M., Steps towards artificial intelligence, in: E.A. Feigenbaum and J. Feldman (eds.), Computers and Thought (McGraw-Hill, New York, 1963).
[19] O'Keefe, R., Simulation and expert systems - a taxonomy and some examples, Simulation 46 (1986) 10-16.
[20] Rosenblatt, F., The perceptron: a probabilistic model for information storage and organization in the brain, Psychol. Rev. 65 (1958) 368-408.
[21] Shannon, R.E., Mayer, R. and Adelsberger, H., Expert systems and simulation, Simulation 44 (1985) 275-284.
[22] Tamura, H. and Kondo, T., On revised algorithms of GMDH with applications, in: S.J. Farlow (ed.), Self-Organizing Methods in Modeling (Marcel Dekker, New York, 1984) 225-242.

This paper is Australian Institute of Marine Science contribution number 354.