
Genetic Algorithms and Their Applications



Modeling is a common but important technique for signal characterization. With the advent of computational power, many problems that were considered to be unsolvable in the past can now be tackled with ease. Successful applications in this area include time-delay estimation modeled as a finite impulse response (FIR) filter [1] for sonar and radar systems; speech coding using linear predictive coding [2-4]; wavelets for speech and image coding and recognition [5-8]; fractals for image compression and recognition systems [9-11]; and the delayed-X filter [12-14] for active noise control [15], to name but a few.

An efficient model for signal processing is not easy to come by and is often obtained with the aid of an optimization scheme. The accuracy of the model is generally governed by a set of variables or parameters that is optimized in the


searching domain. Sometimes the optimization has to operate within a restrictively bounded area in which the optimization algorithm alone is not adequate to achieve its task. A typical example is the constraint on the coefficients of an infinite impulse response (IIR) filter for stability. As the objective function can be linear or nonlinear, equality- or inequality-constrained, smooth or nonsmooth, the solution is not unique, because the objective function is often problem oriented.

This article introduces the genetic algorithm (GA) as an emerging optimization algorithm for signal processing. After a brief discussion of traditional optimization techniques, this article reviews the fundamental operations of a simple GA and discusses procedures to improve its functionality. The properties of the GA that relate to signal processing are summarized, and a number of applications that have been successfully implemented are described. In the authors' view, this paper should contain sufficient material to stimulate an interest in GAs within the signal processing community.

1. GA cycle.

What are Genetic Algorithms?

Optimization Algorithms

Traditionally, there are two major classes of optimization algorithms in use, classified as calculus-based techniques and enumerative techniques. Calculus-based optimization techniques employ a gradient-directed searching mechanism to solve for the error surface or differentiable surface of an objective function [16]. However, for an ill-defined or multimodal objective function, local optima are frequently obtained. In signal processing, objective functions in this category are common, since the signal can be noisy, fuzzy, vague, and discontinuous. Although dynamic programming (DP) is capable of handling the local-optima problem and is considered one of the major enumerative techniques in operations research [17], its simplicity, robustness, and popularity are offset by its high computational cost. In addition, DP may break down on complex problems of moderate size, a situation that is widely known as the "curse of dimensionality."

In 1975, Holland introduced another optimization procedure that is much different from the two described above. It is a mechanism that mimics the processes observed in natural evolution and is known as the GA [18-20]. This technique of optimization is related to its associated algorithms, simulated annealing [21], evolutionary strategies [22], and evolutionary programming [23, 24], which are classified as guided random techniques. The GA operates as an entirely different optimization procedure and provides further flexibility and robustness that are unique for signal processing. Because of its simple implementation procedure, the GA can be used as an optimization tool for designing AI-hybrid systems for real-world applications [25-30]. Despite the usefulness of the GA and the volume of literature published on the subject, the uses of the GA in signal processing are few in number; but, as an area of application, the potential use of the GA in signal processing is immeasurably wide.

Genetic Algorithm Cycle

The GA is a searching process based on the laws of natural selection and genetics. Usually, a simple GA consists of three operations: Selection, Genetic Operation, and Replacement. A typical GA cycle is shown in Fig. 1.

The population comprises a group of chromosomes from which candidates can be selected for the solution of a problem. Initially, a population is generated randomly. The fitness values of all chromosomes are evaluated by calculating the objective function in a decoded form (phenotype). A particular group of chromosomes (parents) is selected from the population to generate offspring by the defined genetic operations. The fitness of the offspring is evaluated in a similar fashion to that of their parents. The chromosomes in the current population are then replaced by their offspring, based on a certain replacement strategy.

Such a GA cycle is repeated until a desired termination criterion is reached (for example, a predefined number of generations is produced). If all goes well throughout this process of simulated evolution, the best chromosome in the final population can become a highly evolved solution to the problem. A top-level description of a simple GA is shown in Fig. 2. In the following paragraphs, we describe various techniques that are employed in the GA process for encoding, fitness evaluation, parent selection, genetic operation, and replacement.

2. Top-level description of a simple GA.
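To make the cycle concrete, the following is a minimal sketch in Python of the loop in Fig. 2, assuming a binary chromosome, fitness-proportionate selection, one-point crossover, bit mutation, and generational replacement; the fitness function used in the example (counting ones) is an illustrative placeholder, not one from the article.

```python
import random

def simple_ga(fitness, n_bits=16, pop_size=20, p_c=0.8, p_m=0.01, generations=100):
    # Initial population: random bit strings (genotypes).
    pop = [[random.randint(0, 1) for _ in range(n_bits)] for _ in range(pop_size)]
    for _ in range(generations):
        # Fitness evaluation (must be positive for proportionate selection).
        scores = [fitness(c) for c in pop]
        # Selection: fitness-proportionate (roulette wheel).
        parents = random.choices(pop, weights=scores, k=pop_size)
        # Genetic operation: one-point crossover and bit mutation.
        offspring = []
        for a, b in zip(parents[::2], parents[1::2]):
            if random.random() < p_c:
                cut = random.randrange(1, n_bits)
                a, b = a[:cut] + b[cut:], b[:cut] + a[cut:]
            offspring += [[bit ^ (random.random() < p_m) for bit in c] for c in (a, b)]
        # Replacement: the offspring replace the entire old population.
        pop = offspring
    return max(pop, key=fitness)

# Example: maximize the number of 1s in the string (+1 keeps all weights positive).
print(simple_ga(lambda c: sum(c) + 1))
```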


Encoding Scheme

The encoding scheme is a key issue in any GA because it can severely limit the window of information that is observed from the system [31]. To enhance the performance of the algorithm, a chromosome representation that stores problem-specific information is desired. In general, the GA evolves a multiset of chromosomes. It should be noted that each chromosome x_i (i = 1, 2, ..., N) represents a trial solution to the problem setting. The chromosome is usually expressed as a string of variables, each element of which is called a gene. The variable can be represented in binary, real-number, or other forms, and its range is usually defined by the problem specified.

Bit-string encoding [19] is the most classic approach used by GA researchers due to its simplicity and tractability. However, a string-based representation may pose difficulties for, and sometimes unnatural obstacles to, some optimization problems, e.g., the graph coloring problem. The use of other encoding techniques, such as real-number representation [32, 33], order-based representation [34] (for bin-packing, graph coloring), embedded lists [35] (for factory scheduling problems), variable element lists [35] (for semiconductor layout), and even LISP S-expressions [36], has therefore been explored.
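As a simple illustration of bit-string encoding, the sketch below maps a bounded real variable to and from a fixed-resolution binary gene; the linear mapping and the 16-bit resolution are illustrative assumptions, not a scheme prescribed by the article.

```python
def encode(value, lo, hi, n_bits=16):
    """Map a real value in [lo, hi] to an n-bit binary string (the gene)."""
    step = (hi - lo) / (2**n_bits - 1)
    index = round((value - lo) / step)
    return format(index, f"0{n_bits}b")

def decode(bits, lo, hi):
    """Map a binary string back to its real value (the phenotype)."""
    step = (hi - lo) / (2**len(bits) - 1)
    return lo + int(bits, 2) * step

chrom = encode(0.25, -1.0, 1.0)        # e.g., a filter coefficient in [-1, 1]
print(chrom, decode(chrom, -1.0, 1.0)) # round-trips to ~0.25
```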

Fitness Techniques

The objective function (or evaluation function) is the main source providing the mechanism for evaluating the status of each chromosome. It is an important link between the GA and the system. It takes a chromosome (or phenotype) as input and produces a number or list of numbers (objective value) as a measure of the chromosome's performance. However, its range of values varies from problem to problem. To maintain uniformity over various problem domains, a fitness function is needed to map the objective value to a fitness value. There are a number of methods, known as fitness techniques, used to perform this mapping. Two commonly used techniques are given as follows:

1) Windowing: Assuming that the objective value of the worst chromosome in the population is V_w, each chromosome can be assigned a fitness value f_i proportional to the "cost difference" between chromosome i and the worst chromosome. Mathematically, it is expressed as

f_i = ±(V_i − V_w) + c    (1)

where V_i is the objective value of chromosome i and c is a constant.

If a maximization problem is encountered, the positive sign is adopted in Eq. (1). On the other hand, the negative sign is adopted if minimization is required.

2) Linear normalization: The chromosomes are ranked in descending or ascending order of objective value according to whether the objective function is to be maximized or minimized. Giving the best chromosome a raw fitness f_1, the fitness of the i-th chromosome in the ordered list is computed by the linear function

f_i = f_1 − (i − 1)d    (2)

where d is the decrement rate. This technique ensures that the average objective value of the population is mapped into the average fitness.

3. Roulette wheel parent selection.

Parent Selection

Parent selection emulates the survival-of-the-fittest mechanism in nature. It is expected that a fitter chromosome receives a higher number of offspring and thus has a higher chance of surviving in the subsequent generation. There are many ways to achieve effective selection, including ranking, tournament, and proportionate schemes [20, 37], but the key assumption is to give preference to fitter individuals.

For example, in the proportionate scheme, a chromosome x with a fitness value f(x,t) has a growth rate g(x,t) defined as

g(x,t) = f(x,t) / F(t)    (3)

where F(t) is the average fitness of the population. Figure 3 explains the procedure of Roulette Wheel Selection [20], which is commonly used to implement the proportionate scheme.
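An explicit sketch of the wheel of Fig. 3: each chromosome occupies a slice of the wheel proportional to its fitness, and a random spin picks the parent.

```python
import random

def roulette_wheel(population, fitness_values):
    """Select one chromosome with probability proportional to its fitness."""
    total = sum(fitness_values)
    spin = random.uniform(0.0, total)      # where the "ball" lands
    cumulative = 0.0
    for chromosome, f in zip(population, fitness_values):
        cumulative += f                    # each slot's width is its fitness
        if spin <= cumulative:
            return chromosome
    return population[-1]                  # guard against rounding error

pop = ["A", "B", "C"]
fit = [1.0, 3.0, 6.0]                      # "C" should win about 60% of the spins
counts = {c: 0 for c in pop}
for _ in range(10000):
    counts[roulette_wheel(pop, fit)] += 1
print(counts)
```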

Genetic Operation

Crossover is a recombination operator that combines subparts of two parent chromosomes to produce offspring that contain some parts of both parents' genetic material. A probability term, p_c, is set to determine the operation rate. Many GA practitioners consider the crossover operator to be the determining factor that distinguishes the GA from all other optimization algorithms.

A number of variations on the crossover operation have been proposed, and the simplest form is single-point crossover. An example is shown in Fig. 4. The parents are randomly selected based on the above-mentioned selection scheme. A crossover point is randomly selected, and the portions of the two chromosomes beyond this point are exchanged to form the offspring.

4. Example of one-point crossover.


5. Example of multipoint crossover (m=3).

6. Bit mutation on the fourth bit of the old chromosome.

Multipoint crossover is similar to single-point crossover, except that m crossover positions are chosen at random with no duplication. An example of this operation is depicted in Fig. 5.

Single-point and multipoint crossover define cross points where the chromosome can be split. Uniform crossover generalizes the scheme to make every locus a potential crossover point. A random binary string with the same length as the chromosome indicates which parent will supply the child with the associated bit. At each location, the corresponding bits of the parents are exchanged if the random string contains a "1" at that location. If the random bit is "0," no exchange takes place.
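The three crossover variants in a short sketch, operating on equal-length parent lists; the parent values and m=3 are illustrative.

```python
import random

def one_point(a, b):
    cut = random.randrange(1, len(a))
    return a[:cut] + b[cut:], b[:cut] + a[cut:]

def multipoint(a, b, m=3):
    # m distinct cut points; alternate which parent supplies each segment.
    cuts = sorted(random.sample(range(1, len(a)), m)) + [len(a)]
    child1, child2, swap, start = [], [], False, 0
    for cut in cuts:
        seg_a, seg_b = a[start:cut], b[start:cut]
        child1 += seg_b if swap else seg_a
        child2 += seg_a if swap else seg_b
        swap, start = not swap, cut
    return child1, child2

def uniform(a, b):
    # A random binary mask decides which parent supplies each bit.
    mask = [random.randint(0, 1) for _ in a]
    child1 = [y if m else x for x, y, m in zip(a, b, mask)]
    child2 = [x if m else y for x, y, m in zip(a, b, mask)]
    return child1, child2

p1, p2 = [0] * 8, [1] * 8
print(one_point(p1, p2), multipoint(p1, p2), uniform(p1, p2))
```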

Mutation is an operator that introduces variations into the chromosome. This variation can be global or local. The operation occurs occasionally (usually with a small probability p_m) but randomly alters the value of a string position. Each bit of a bit string is replaced by a randomly generated bit if a probability test is passed. An example of mutation on the fourth bit is shown in Fig. 6. The bit string [10011010] is changed to [10001010] if it passes the probability test and the randomly generated bit is "0." No change takes place if the randomly generated bit is "1."

Some GA practitioners use standard mutation to flip bits. With this variant, a "1" is replaced by a "0," or vice versa, if the probability test is passed. This approach results in an effective rate of mutation that is twice as high as the previous one.
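Both mutation variants in a short sketch, where p_m is the per-bit probability of passing the test; the comments restate the effective-rate relationship noted above.

```python
import random

def replace_mutation(chromosome, p_m=0.05):
    # Each bit that passes the probability test is replaced by a random bit,
    # which actually changes its value only half of the time.
    return [random.randint(0, 1) if random.random() < p_m else bit
            for bit in chromosome]

def flip_mutation(chromosome, p_m=0.05):
    # Each bit that passes the probability test is inverted, so the effective
    # mutation rate is twice that of replace_mutation at the same p_m.
    return [bit ^ 1 if random.random() < p_m else bit
            for bit in chromosome]

print(flip_mutation([1, 0, 0, 1, 1, 0, 1, 0], p_m=0.25))
```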

Replacement Strategies

After generating the subpopulation (offspring), two representative strategies can be adopted for old-generation replacement:

Generational Replacement: Each population of size n generates an equal number of new chromosomes to replace the entire old population. This strategy may cause the best member of the population to fail to reproduce offspring in the next generation, so the method is usually combined with an elitist strategy, in which one or a few of the best chromosomes are copied into the succeeding generation. The elitist strategy may increase the speed of domination of a population by a super chromosome, but on balance it appears to improve performance.

Steady-State Reproduction: In this strategy, only a few chromosomes are replaced at a time to produce the succeeding generation. Usually the worst chromosomes are replaced when new chromosomes are inserted into the population. The number of new chromosomes is determined by the strategy; in practice, only one or two new chromosomes are used in steady-state reproduction.
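A sketch of the two strategies, assuming fitness is to be maximized; `fitness` is any callable that scores a chromosome (here simply `sum`).

```python
def generational_with_elitism(population, offspring, fitness, n_elite=1):
    # Keep the n_elite best parents; the offspring fill the remaining slots.
    elite = sorted(population, key=fitness, reverse=True)[:n_elite]
    return elite + offspring[: len(population) - n_elite]

def steady_state(population, offspring, fitness):
    # Insert the new chromosomes (usually one or two) over the worst members.
    ranked = sorted(population, key=fitness, reverse=True)
    return ranked[: len(population) - len(offspring)] + offspring

pop = [[1, 1, 0], [0, 0, 0], [1, 0, 0]]
kids = [[1, 1, 1]]
print(steady_state(pop, kids, fitness=sum))   # the worst member [0,0,0] is dropped
```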

Schema Theory

The design methodology of the GA relies heavily on Holland's notion of schemata [20, 35]. It simply states that schemata are sets of strings that have one or more features in common. A schema is built by introducing a "don't care" symbol, "#," into the alphabet of genes, e.g., #1101#0. A schema represents all strings (a hyperplane, or subset, of the search space) that match it on all positions other than "#." It is clear that every schema matches exactly 2^r strings, where r is the number of don't care symbols, "#," in the schema template. For example, the set matched by the schema #1101#0 is {1110110, 1110100, 0110110, 0110100}.
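The following sketch illustrates these definitions: schema matching against strings, plus the order o(S) and defining length sigma(S) used in the survival probabilities below.

```python
def matches(schema, string):
    return all(s == "#" or s == b for s, b in zip(schema, string))

def order(schema):                 # o(S): number of fixed positions
    return sum(c != "#" for c in schema)

def defining_length(schema):       # sigma(S): span of the outermost fixed positions
    fixed = [i for i, c in enumerate(schema) if c != "#"]
    return fixed[-1] - fixed[0]

S = "#1101#0"
print([s for s in ["1110110", "1110100", "0110110", "0110100"] if matches(S, s)])
print(order(S), defining_length(S))   # 5 fixed positions, defining length 5
```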

Effect of Selection

Since a schema represents a set of strings, we can associate a fitness value f(S,t) with schema S; the average fitness of the schema, f(S,t), is then determined by all the matched strings in the population. If proportional selection is used in the reproduction phase, we can estimate the number of matched strings of a schema S in the next generation.

Let c(S,t) be the number of strings matched by schema S at the current generation. The probability of its selection (in a single string selection) is equal to f(S,t)/F(t), where F(t) is the average fitness of the current population. The expected number of occurrences of S in the next generation is

c(S, t+1) = c(S,t) f(S,t) / F(t)    (4)

Let

ε = [f(S,t) − F(t)] / F(t)    (5)

If ε > 0, the schema has an above-average fitness, and vice versa.

Substituting Eq. (5) into Eq. (4) shows that an "above average" schema receives an exponentially increasing number of strings in the next generations:

c(S, t+1) = c(S,t)(1 + ε) = c(S,0)(1 + ε)^(t+1)    (6)

Effect of Crossover

During the evolution of a GA, the genetic operations are disruptive to current schemata; therefore, their effects


7. Global GA.

should be considered. Assume that the length of the chromosome is L and one-point crossover is applied; in general, a crossover point is selected uniformly among L − 1 possible positions.

This implies that the probability of destruction of a schema S is

p_d(S) = σ(S) / (L − 1)    (7)

or the probability of schema survival is

p_s(S) = 1 − σ(S) / (L − 1)    (8)

where σ(S) is the defining length of the schema S, defined as the distance between the outermost fixed positions. It defines the compactness of the information contained in a schema. For example, the defining length of #000# is 2, while the defining length of 1#00# is 3.

Assuming the operation rate of crossover is p_c, the probability of schema survival is

p_s(S) ≥ 1 − p_c σ(S) / (L − 1)    (9)

Effect of Mutation

If the bit mutation probability is p_m, then the probability of a single bit surviving is 1 − p_m. Defining the order of schema S (denoted by o(S)) as the number of fixed positions (i.e., positions with 0 or 1) present in the schema, the probability of a schema S surviving a mutation (i.e., a sequence of one-bit mutations) is

p_s(S) = (1 − p_m)^o(S)    (10)

Since p_m << 1, this probability can be approximated by

p_s(S) ≈ 1 − o(S) p_m    (11)

Schema Growth Equation

Combining the effects of selection, crossover, and mutation, we have a new form of the reproductive schema growth equation:

c(S, t+1) ≥ c(S,t) [f(S,t)/F(t)] [1 − p_c σ(S)/(L − 1) − o(S) p_m]    (12)

Based on Eq. (12), it can be concluded that a high average fitness value alone is not sufficient for a high growth rate. Indeed, short, low-order, above-average schemata receive exponentially increasing trials in subsequent generations of a GA.

Genetic Optimizer for Signal Processing

GAs are considered powerful optimizers in many areas. In order to explore the application of GAs in the signal processing field, it is important to introduce some of their major features.

Parallelism

A GA-based signal processing system may be parallelized in a number of ways to increase the computation speed [38]. The methods of parallelization can be classified as global,


8. Ring migration topology.

9. Neighborhood migration topology.

migration, and diffusion. These categories reflect the different ways in which parallelism is exploited in the GA and the nature of the population structure and recombination mechanisms used.

A global GA (Fig. 7) treats the entire population as a single breeding unit. Based on the master-slave architecture, the internal parallelism of the GA [20] is exploited.

Migration GAs divide the population into a number of subpopulations, each of which is treated as a separate breeding unit under the control of a conventional GA. To encourage the proliferation of good genetic material throughout the whole population, individuals migrate between the subpopulations from time to time [39-43]. Figures 8-10 show three different migration topologies. Figure 8 shows the ring migration topology, where individuals are transferred between directionally adjacent subpopulations. A similar strategy, known as neighborhood migration, is shown in Fig. 9. Migration is made only between nearest neighbors, but migration may occur in either direction between subpopulations. The unrestricted migration topology is depicted in Fig. 10. Here, individuals may migrate from any subpopulation to any other. The individual migrants are then determined according to the appropriate selection strategy.
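A minimal sketch of the ring topology of Fig. 8, assuming bit-string individuals whose fitness is simply the number of ones; each island sends its best member to its directional neighbor, replacing that neighbor's worst member.

```python
import random

def ring_migration(subpops, n_migrants=1):
    """Move the best individuals of each subpopulation to its ring neighbor."""
    # Pick all migrants first so the exchange is simultaneous.
    migrants = [sorted(sp, key=sum, reverse=True)[:n_migrants] for sp in subpops]
    for i, sp in enumerate(subpops):
        incoming = migrants[(i - 1) % len(subpops)]  # from the previous island
        sp.sort(key=sum)                             # worst members first
        sp[:n_migrants] = incoming                   # replace the worst members
    return subpops

# Four islands of random 8-bit chromosomes, fitness = number of 1s (sum).
islands = [[[random.randint(0, 1) for _ in range(8)] for _ in range(10)]
           for _ in range(4)]
islands = ring_migration(islands)
```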

The migration model of the GA is well suited to parallel implementation on multiple-instruction-multiple-data (MIMD) machines. Given the range of possible population topologies and migration paths between them, efficient communications networks should be possible on most parallel

10. Unrestricted migration topology.

11. Diffusion GA.

architectures, from small multiprocessor platforms to clusters of networked workstations.

The diffusion GA, as indicated in Fig. 11, considers the population as a single continuous structure. Each individual is assigned a geographic location on the population surface and is allowed to breed with individuals contained in a small local neighborhood. This neighborhood is usually chosen from immediately adjacent individuals on the population surface and is motivated by the practical communication restrictions of parallel computers [44-48].

Robustness

There are many instances in signal processing where it is necessary for the characteristics of a system to be variable, adaptive to dynamic signal behavior, and able to sustain environmental disturbance. This requires an adaptive algorithm to track time-dependent optima, which is difficult for a standard GA. When using a standard GA, the diversity of the population is quickly eliminated as it seeks out a global optimum. Should the environment change, it is often unable to redirect its search to a different part of the space due to the bias of the chromosomes. To improve the convergence of the standard GA in changing environments, two basic strategies have been developed.

The first strategy expands the memory of the GA in order to build up a repertoire of ready responses to environmental conditions. Two typical examples in this group are the triallelic representation [49] and the structured GA [50]. The triallelic repre-


12. Feasible region on magnitude vs. delay error plot.

13. A multimodal problem.

sentation consists of a diploid chromosome and a third allelic structure for deciding dominance. In the structured GA, the chromosome is represented in a hierarchical structure. The higher-level nodes in the structure regulate the activation or deactivation of lower-level genes.

The random immigrants mechanism [51], the triggered hypermutation mechanism [52, 53], and statistical process control [54] are grouped as another type of strategy. This approach increases diversity in the population to compensate for the changes encountered in the environment. The random immigrants mechanism replaces a fraction of a standard GA's population. It works well in environments where there are occasional, large changes in the location of the optimum. An adaptive mutation-based mechanism, known as the triggered hypermutation mechanism, has been developed to adapt to environmental change. The mechanism temporarily increases the mutation rate to a high value whenever the best time-average performance of the population deteriorates. Statistical process control can be applied to monitor the best performance of the population so that the GA-based optimization system adapts to a continuous, time-dependent nonstationary environment.

Multiple Objectives

It is very common to have more than one objective for optimization in signal processing applications. The design of an IIR filter is a typical example. Given the magnitude and group-delay specifications, IIR filter design can be easily converted into a multiple-objective problem. Expressions for the magnitude error e_m and delay error e_d with respect to a given specification may be formulated in terms of the IIR filter coefficients and the gain. The feasible set of filters, representing all stable filters of a given order, is depicted in Fig. 12.

The simultaneous optimization of the competing objective functions e_m and e_d seldom admits a single, perfect solution. Instead, a Pareto-optimal set of filters having optimal trade-offs of magnitude and delay performance is usually obtained, illustrated as curve ABC. The GA has been demonstrated to be a powerful method for this multiobjective problem [55], enabling one to obtain the Pareto-optimal set instead of a single solution. Decision makers can progressively articulate their preferences while learning about the problem's trade-offs [56].

14. Maximum finding using a GA on a multimodal surface.

Other examples of multiobjective problems are illustrated in [57], where the GA is used to search the Pareto-optimal set of the best compromises between magnitude response error (root-sum-weighted-squared error) and added cost (total number of additions/subtractions required for implementation). An overview of the GA in multiple-objective problems is given in [58], where different approaches are reviewed and their similarities and differences are discussed.
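A sketch of extracting a Pareto-optimal set for two minimized objectives such as (e_m, e_d); the candidate values are illustrative, not data from the cited designs.

```python
def dominates(a, b):
    """True if solution a is no worse in every objective and better in one."""
    return all(x <= y for x, y in zip(a, b)) and any(x < y for x, y in zip(a, b))

def pareto_set(solutions):
    return [s for s in solutions
            if not any(dominates(other, s) for other in solutions)]

# (magnitude error, delay error) pairs for candidate IIR filters
filters = [(0.10, 0.90), (0.20, 0.40), (0.50, 0.20), (0.40, 0.60), (0.90, 0.15)]
print(pareto_set(filters))   # the trade-off set, cf. curve ABC in Fig. 12
```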

Multimodality

Multimodal functions are prevalent in signal processing. The multimodal error surface of the IIR filter is well known and is particularly difficult for conventional optimization algorithms. Algorithms that use gradient descent may become "stuck" in local minima on the error surface. Another example is the time-delay problem [59], whose mean-square error (MSE) surface is expressed in terms of


the true time delay D, the estimated time delay D̂, and the signal power and noise power, σ_s² and σ_n², respectively. The error surface is multimodal, with the global minimum appearing at D̂ = D. Therefore, in order to avoid lock-up at local minima, [59] suggests restricting the initial range of D̂ so that the surface is unimodal within that range.

15. System identification with IIR using the GA.

To illustrate the robustness and the ability of the GA to escape

from the local minimum point, a simulation was conducted on the MATLAB GA Toolbox to locate the global maximum of a multimodal surface (Fig. 13) with a searching space of 2^16 × 2^16 points. The chromosome is formed by a 32-bit binary string representing the x and y coordinates, with 16-bit resolution each. The best chromosome during the searching process is depicted in Fig. 14, which clearly shows the ability of the GA to escape from local maxima and to find the global maximum point. However, improper design of a GA may cause genetic drift and reduce the probability of finding the global optimum. Fitness sharing [60], crowding [20], and other techniques have been developed to prevent genetic drift from occurring.

16. IIR model optimization using the GA.

Number Representation

It is common to have a real-valued searching space in signal processing problems. Genes used in the GA optimization process can be handled directly through binary or even n-ary encoding. In recent research, the direct manipulation of real-valued chromosomes has raised a lot of interest. An experiment undertaken by [32] indicates that the floating-point representation is faster and more consistent from run to run. On the other hand, [61] has suggested that real-coded GAs can be blocked from further progress in some situations, although many problems have been solved using real-coded GAs. Nonetheless, there is insufficient consensus to draw any conclusion about which representation is better.

Constraints

In general, signal processing problems are constrained in several ways. A typical example is stable IIR filter design. A GA can manage the constraints in two different ways.

Consider the transfer function of the N-th order IIR filter, realized as a cascade of second-order sections:

H(z) = K ∏_i (1 + a_i z^(-1) + b_i z^(-2)) / (1 + c_i z^(-1) + d_i z^(-2))    (15)

The searching space of c_i and d_i can be defined by the stability triangle [62] in order to ensure the stability of H(z), i.e.,

|d_i| < 1,  |c_i| < 1 + d_i    (16)

Such constraints can be embedded in the system by confining the searching space of the chromosome. This approach guarantees that only stable filters are generated in each generation and ensures that the optimal solution is stable. Another approach is to set up a penalty scheme for invalid chromosomes such that they become low performers. However, appropriate penalty functions for a particular problem are not necessarily easy to design, since they may considerably affect the efficiency of the genetic search [63].
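As a concrete illustration of the two approaches, here is a small sketch assuming second-order sections with denominator 1 + c z^(-1) + d z^(-2), as in Eqs. (15)-(16); the penalty value is an arbitrary illustrative choice.

```python
def is_stable(c, d):
    # Stability triangle for a denominator 1 + c*z^-1 + d*z^-2:
    # both poles lie inside the unit circle iff |d| < 1 and |c| < 1 + d.
    return abs(d) < 1.0 and abs(c) < 1.0 + d

def penalized_fitness(mse, c, d, penalty=1e6):
    # Alternative approach: let the GA generate unstable filters, but make
    # them low performers by adding a large penalty to the (minimized) MSE.
    return mse + (0.0 if is_stable(c, d) else penalty)

print(is_stable(0.5, 0.3), is_stable(2.1, 0.3))   # True False
print(penalized_fitness(0.01, 2.1, 0.3))          # 1000000.01
```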

Signal Processing Applications of the GA

Being a powerful optimization tool, the GA has found a large number of applications in signal processing. It is impossible to cover all of these, since the list is always growing. In this section, we pinpoint some of the most successful applications of GAs in the signal processing area.


17. Chromosome coding.

IIR Adaptive Filtering

Many problems in signal processing, like noise cancellation, equalization, time-delay estimation, etc., can be characterized as system identification problems, where one must gather data from a system whose structure is initially unknown. The resulting model can then be used for prediction and control of the system. The application of IIR filters in system identification has been widely studied recently. This process is accomplished by successively adjusting the parameters of the adaptive filter until the difference between its output and that of the unknown system is minimized. Due to the multimodality of the IIR error surface and the multiple criteria for optimization, the GA is well suited for optimizing the filter coefficients in the search for global optima. Figure 15 shows a general system in which the GA finds application [55, 57, 64, 65].

The IIR filter can be realized in lattice form or cascaded direct form. By constraining the range of the filter coefficients, stability can be guaranteed. The filter coefficients are encoded in binary strings to formulate the chromosome structure. The fitness function is defined as the estimation MSE of the IIR filter, which is to be minimized by the GA.

A simulation was conducted to illustrate the effectiveness of the GA for IIR filter optimization. The IIR filter was estimated in the form of a zero and double poles. The input signal was randomly generated. Figure 16 shows the estimated poles of the IIR filter within the first 300 generations. All the filter coefficients were represented by 16-bit strings.
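A minimal sketch of the Fig. 15 setup, assuming a second-order unknown system and using the negative estimation MSE as the fitness to be maximized; the filter structure, coefficients, and signal length are illustrative, not those of the reported simulation.

```python
import random

def iir2(x, b0, c, d):
    """Second-order IIR section: y[n] = b0*x[n] - c*y[n-1] - d*y[n-2]."""
    y = [0.0, 0.0]
    for n in range(2, len(x)):
        y.append(b0 * x[n] - c * y[-1] - d * y[-2])
    return y

# "Unknown" system to be identified, driven by a random input signal.
x = [random.gauss(0.0, 1.0) for _ in range(500)]
target = iir2(x, b0=1.0, c=-0.9, d=0.5)

def fitness(candidate):
    """Negative estimation MSE between the candidate filter and the system."""
    y = iir2(x, *candidate)
    return -sum((a - b) ** 2 for a, b in zip(y, target)) / len(y)

print(fitness((1.0, -0.9, 0.5)), fitness((1.0, 0.2, 0.1)))  # -0.0 vs. worse
```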

Nonlinear Model Selection

As the performance of a GA does not really rely on the shape of the error surface of a cost function, [66] proposed to demonstrate its effectiveness in selecting the nonlinear model terms of the NARMAX model (Nonlinear Auto-Regressive Moving Average model with eXogenous inputs). For example, a model with seven terms of degree up to three and given maximum lags has 6,124,520 possible cases.

Although terms such as u(t−1)y(t−4) and u(t−2)y(t−4), for example, may appear conceptually similar, their contributions to the final model can be drastically different. Hence, each term was coded by an index, as in Fig. 17, instead of using binary encoding of the term lags. The chromosome was formulated as an integer string in which each integer represented a particular term.

A GA with a population size of 40 was run for 90 generations in order to search for models that had low residual variance. The original system has the form

y(t) = a_1 y(t−1) + a_2 u(t−1) + a_3 y(t−2)² + a_4 u(t−3)³ + a_5 y(t−2) u(t−1)² + e(t)    (17)

where u(t) and e(t) are two independent pseudo-random sequences, uniformly and normally distributed, respectively. The signal-to-noise ratio was set at 20 dB.

Within the optimization process, the first 500 data points were used to search for the model with low residual variance. The remaining 500 points were divided into two sets and used for determination of the one-step-ahead prediction error (OSAPE) and the long-term prediction error (LTPE) of the best models, where OSAPE and LTPE are computed as the mean-squared prediction error

e = [1/(N − L_y)] Σ_{t=L_y+1}^{N} [y(t) − ŷ(t)]²    (18)

where N is the total number of data points in the set, L_y is the maximum lag of the estimated model, y(t) is the measured system output, and ŷ(t) is the predicted output from the model.

For the case of OSAPE, ŷ(t) is defined as ŷ(t) = f̂(u(t−1), ..., y(t−1), ...), whereas for LTPE, ŷ(t) is defined as ŷ(t) = f̂(u(t−1), ..., ŷ(t−1), ...).


18. Block diagram of the genetic time-delay estimation system.

Results of the best four models are presented in Table 1.

A similar application is the implementation of a GA within GMDH (group method of data handling) [67]. GMDH is a procedure that attempts to model an unknown system by iteratively connecting layers of nodes that compute polynomial functions; [67] introduces a genetic-based self-organizing network (GBSON) as a procedure for constructing the polynomial network in GMDH. The GBSON proceeds as follows:

Generate GA structures that represent new network nodes. Each node is represented by a bit string of eight fields. The first two fields identify the connections, and the last six fields represent the coefficients of the following output function:

a + b z_i + c z_j + d z_i z_j + e z_i² + f z_j²

where z_i and z_j are the outputs of the connected nodes in the previous layer.

Calculate the description length of the function represented by the new node.

l = 0.5 n log σ_f² + 0.5 m log n

where σ_f² is the model's MSE, m is the number of parameters, and n is the number of nodes.

Search the space of possible nodes for the current layer with a simple GA using tournament selection, single-point crossover, and point mutation.

Select the peak nodes to form the new network layer.

The process repeats for subsequent layers until the GA converges to a layer with a single node. The resulting network is taken as a model of the input data. Testing was run on chaotic data obtained by solving the following Glass-Mackey delay equation [68]:

dx/dt = a x(t−τ) / [1 + x¹⁰(t−τ)] − b x(t)

where τ = 30, a = 0.2, and b = 0.1. The result in [67] demonstrates the success in short-term

prediction of the chaotic system dynamics by the model generated by the GBSON.
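For reference, the chaotic test data can be generated by integrating the delay equation numerically; the sketch below uses a simple Euler scheme with the parameters quoted above (the step size and constant initial history are illustrative assumptions).

```python
def mackey_glass(n_samples, a=0.2, b=0.1, tau=30, dt=1.0, x0=1.2):
    """Euler integration of dx/dt = a*x(t-tau)/(1 + x(t-tau)**10) - b*x(t)."""
    history = int(tau / dt)
    x = [x0] * (history + 1)              # constant initial history
    for _ in range(n_samples):
        x_tau = x[-history - 1]           # delayed state x(t - tau)
        dx = a * x_tau / (1.0 + x_tau ** 10) - b * x[-1]
        x.append(x[-1] + dt * dx)
    return x[history + 1:]

series = mackey_glass(1000)
print(min(series), max(series))           # the series is chaotic for tau = 30
```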


Time-Delay Estimation

Time-delay estimation (TDE) is usually solved by changing the one-dimensional delay problem into a multidimensional problem with FIR filter modeling. Adaptive filtering techniques have been successfully applied in this area by relying on the unimodal property of the FIR error surface. However, due to the large number of parameters requiring estimation, biased estimates are often obtained. This is particularly apparent when operating in a noisy estimating environment, although an improved version that can directly estimate the delay has been reported in [59]. A GA can be applied to tackle such on-line time-delay estimation problems [54]. Modeling of the gain factor and the delay is not required, and direct estimates of both parameters are optimally obtained through the genetic optimization procedure.

The fact that the associated delay and gain in this case can be directly represented by a binary string, i.e., a chromosome, without modeling of the delay is a significant advantage. This chromosome is defined in Eq. (20):

z = (b_1, b_2, ..., b_n)    (20)

where b_i ∈ B = {0, 1}.

19. Comparison of different algorithms (SNR = 0 dB).

The block diagram of the genetic time-delay estimation system is depicted in Fig. 18. Comparison is made with the traditional LMSTDE and the constrained LMS algorithm [59]. The simulation results for time-invariant and time-variant cases in a noisy situation are shown in Figs. 19 and 20.

Active Noise Control

Active Noise Control (ANC) [15] is a technique that uses secondary acoustic sources to generate sound waves for canceling an undesired noise. Active systems designed for the global control of sound or vibration over a region of space generally employ multiple secondary actuators (loudspeakers) and multiple error sensors (microphones); see Fig. 21. The system can adjust the secondary sources such that the resultant noise received by the error sensors is reduced.

A GA can be applied to find optimal controlled secondary signals such that minimum noise levels are obtained at the error sensors. [69] presents a genetic active noise control system (GANCS) that optimizes the error microphones' outputs as multiple objective functions using the GA. The block diagram of GANCS is depicted in Fig. 22.

The basic structure of this system consists of four fundamental units, namely an acoustic path estimation process (APEP), a genetic control design process (GCDP), a statistical monitoring process (SMP), and a decision maker (DM).


20. Time-delay tracking of different algorithms (SNR = 0 dB).


These units have been designed to govern the on-line, real-time performance. The dynamics of the acoustic paths are estimated in the APEP, while the controller is derived in the GCDP. The purpose of the SMP is to monitor the system performance and detect environmental change so as to ensure the system's robustness. Since the multiobjective problem induces multiple nondominated solutions, a multiple objective GA (MOGA) is utilized to find the optimal FIR controller design C(q⁻¹) so as to minimize an objective vector

min_{C(q⁻¹) ∈ Θ} f    (21)

where Θ is the defined searching domain for C(q⁻¹). The multiobjective vector f is defined as

f = [f_1, f_2, ..., f_n]ᵀ    (22)

where f_i is the MSE of the i-th error sensor within a sample window and n is the number of error sensors. Real-number representation [32] is applied.



21. Multiple-channel active noise control system.

22. Genetic active noise control system.


The objective goal is preset and defined as the desirable residual noise power level at each error microphone. The DM can select one of the solutions for the ANC based on this preset goal. In order to embed the goal-attainment method into the MOGA, a preferability approach is applied for chromosome comparison [56]. Experimental results demonstrate effective attenuation of noise in 3D problems.

In addition to the controller design, the system configuration (such as the positions of the secondary loudspeakers) is also important for ANC system performance. The variation of attenuation with the locations of the optimally adjusted secondary sources is definitely not a quadratic optimization problem, and it appears difficult to solve using conventional optimization techniques. Evolutionary algorithms for finding the optimal positions of eight secondary sources from a possible 32 locations in a model cabin, for which there are more than 10⁷ possible combinations, are successfully applied in [70]. A binary coding method has been used, with each possible location being indicated by a bit position in a binary string. A value of 0 or 1 indicates the absence or presence of a source. A comparison can also be made against the random search technique. It can be concluded that the GA provides an efficient search procedure for these problems.

Speech Processing

Speech recognition is an active research area in speech processing, and the technique of dynamic time warping (DTW) is commonly used to evaluate the similarity of two different speech utterances. This can normally be achieved by minimizing

D(X, Y) = min_φ (1/M_φ) Σ_{k=1}^{T} d(φ_x(k), φ_y(k)) m(k)    (23)

where X = (x_1, x_2, ..., x_M) and Y = (y_1, y_2, ..., y_N) represent two speech patterns, while x_i and y_j are the parameter vectors of the short-time acoustic features, such as the LPC coefficients or cepstrum parameters; φ = (φ_x, φ_y) is the time-warping function relating the indices, i_x and i_y, of the two speech patterns to the "normal" time axis k; d(φ_x(k), φ_y(k)) is a short-time spectral distortion defined for x_{φx(k)} and y_{φy(k)}; m(k) is a non-negative weighting coefficient; and M_φ is the normalizing factor, usually defined as M_φ = Σ_{k=1}^{T} m(k). In practical applications, there are some limitations on the use of DTW:

(1) the exact endpoint registration of utterances, and

(2) the use of a constant normalization factor M_φ instead of the actual one.

The former presents the problem of robustness in speech recognition, and the latter concerns the accuracy of the algorithm used. In view of these, a GA-based time-warping algorithm (GTW) [71] has been proposed to improve the global searching ability of DTW and resolve these problems.

Let us consider a time-warping path P, starting at (i_0x, i_0y) and ending at (i_ex, i_ey), which represents a moving sequence. It is specified by a sequence of pairs of coordinate increments:

P = ((p_1, q_1), (p_2, q_2), ..., (p_K, q_K))    (24)


where (i_0x, i_0y) and (i_ex, i_ey) can be fixed or varied within a range determined by the constraints on endpoint uncertainty, on which several assumptions of endpoint pairs can be made; and

(p_i, q_i) ∈ L_s    (25)

where L_s is the set of permissible local steps, determined by the local continuity constraints of the problem.

23. Comparison of DTW and GTW.

The time-warping path is now treated as a chromosome, as defined in Eq. (24). A population of chromosomes (30 chromosomes) is first generated. The genes of each individual chromosome represent a randomly selected local step from L_s. The probability of selecting a step is inversely proportional to the

slope weight m(k) of the local step. The initial slope weights are determined in a heuristic manner similar to that suggested in [72]. It should be noted that the global continuity constraint of the problem should also be satisfied by this approach of gene selection.

For each permitted pair of starting points (i_0x, i_0y) and ending points (i_ex, i_ey), it is obvious that there are many time-warping paths between them, and the required numbers of steps are not necessarily the same. In other words, every chromosome has its own length, which is fundamentally different from the conventional fixed-length chromosome. Because the warping paths are directly stored as chromosomes, this naturally leads to the possibility of n-best warping-path solutions being obtained without extra computation time, although the solutions may not necessarily be optimal.
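A sketch of how such a variable-length chromosome might be decoded and scored, in the spirit of Eq. (23); the toy feature sequences, local steps, and unit slope weights are illustrative assumptions, not the scheme of [71].

```python
def decode_path(start, genes):
    """A GTW chromosome: a start point plus a variable-length list of steps."""
    path = [start]
    for p, q in genes:
        path.append((path[-1][0] + p, path[-1][1] + q))
    return path

def path_distortion(X, Y, path, weight):
    """Accumulated weighted distortion along the warping path, cf. Eq. (23)."""
    d = sum(abs(X[ix] - Y[iy]) * weight(k) for k, (ix, iy) in enumerate(path))
    m_phi = sum(weight(k) for k in range(len(path)))   # normalizing factor
    return d / m_phi

X = [0.0, 1.0, 2.0, 3.0, 2.0]                      # two toy feature sequences
Y = [0.0, 1.0, 1.0, 2.0, 3.0, 2.0]
genes = [(0, 1), (1, 1), (1, 1), (1, 1), (1, 1)]   # steps from a local-step set
path = decode_path((0, 0), genes)
print(path_distortion(X, Y, path, weight=lambda k: 1.0))
```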

In order to demonstrate the matching ability of the GA-based DTW algorithm, experimental investigations similar to those of [73] were carried out. The test bed consisted of 10 acoustically distinct Chinese words, each having 101 utterances, and 100 other randomly chosen words. It should be noted that all these words were recorded from one particular person and were speaker dependent only; the results obtained cannot be used quantitatively for the purpose of generalization.

The performance of the algorithm was evaluated by a normal distribution model of the distortions between reference and testing patterns [73]. As in the decision-making scheme suggested by L.R. Rabiner [73], each word w_k in the vocabulary of the recognizer was associated with a distortion threshold γ_k, which is defined as follows:

Let x be the mean distortion between the test word and the templates of the reference word w_k, such that if x > γ_k, the decision is made that the reference and test words are different, while if x ≤ γ_k, the decision is made that the reference and test words are the same. With such a decision rule, the probability of a matching miss P_M is

P_M = ∫_{γ_k}^{∞} [1/(√(2π) σ_k)] e^{−(x − M_k)²/(2σ_k²)} dx    (26)

where M_k is the mean distortion and σ_k is the standard deviation of the distortions between the reference patterns and test patterns obtained from the same word, whereas the corresponding M_e and σ_e are the mean distortion and standard deviation for reference and test patterns from different words, respectively. This integral can be expressed in terms of the standard complementary error function, erfc, and (1 − P_M) is the recognition rate for the reference word w_k.

The recognition rates for the 10 words are plotted in Fig. 23. Based on the results, GTW performed better than DTW for most of the tested words. Furthermore, GTW was found capable of finding the n-best paths (in this case, a minimum of three possible paths were found to be very comparable in matching performance), although it requires a little more (10%) computational time than DTW.

To further improve the GA approach, the use of parallel genetic algorithms (PGAs) may be possible because of the intrinsic parallelism of the GA, as indicated earlier in the "Parallelism" subsection. However, the use of a PGA is no different from other parallel systems, in which the computational performance is largely dependent upon the design of the system topology and the communication overheads. These two factors must be seriously taken into consideration in order to maximize the throughput of the PGA architecture so that the inherent parallelism of the GA can be fully exploited.

The other advantage of using GTW is its processing accuracy, which is governed by the dynamically adjustable normalization factor

M_φ = Σ_{k=1}^{T} m(k)    (27)

where m(k) is determined based on the time ratio of the reference pattern and the test pattern.

In DTW, the M_φ is usually defined as a constant value only, since a computationally intensive algorithm would otherwise be needed to determine its actual value. Therefore, the use of GTW provides a means to locate different


normalization factors, which in the end reflects the identification accuracy of the actual path and makes the calculation in Eq. (25) more reasonable and robust.

Discussion and Conclusion

Although the GA is a powerful optimization tool, it does have certain weaknesses in comparison with other optimization techniques. A number of barriers have yet to be overcome before it can be applied to some real-world implementations. Due to the randomness of GA operations, it is difficult to predict its performance, a factor that is crucial for hard-deadline, real-time applications. The source of the problem lies in the diversity of the chromosomes, which causes the on-line system performance to be unpredictable.

However, there are large classes of problems that appear to be more amenable to solution by GAs than by any other available optimization techniques. These tasks often involve multiple objectives, such as ANC problems, or optimization with constraints, like the stability assurance in IIR filter design. Moreover, since the GA can jump out of local optima, it is more desirable for multimodal problems like direct time-delay estimation. Perhaps the most encouraging areas of application are the impending AI-hybrid systems. The use of GAs with neural networks (NNs) and fuzzy logic is expected to receive more attention in the future. For example, GAs may be used to optimize the membership functions of a fuzzy system [25] or to assist NN operation through the determination of suitable NN structures [27-30]. It is foreseen that more hybrid systems will be launched for signal processing applications, and GAs are an important component of these developments.

Acknowledgment

Part of this research was conducted in the Department of Automatic Control and Systems Engineering at the University of Sheffield, UK. The authors would like to thank Professor Peter Fleming and his research group for their useful advice and for providing the opportunity to work with them in this area.

K.S. Tang, K.F. Man and S. Kwong are with the City University of Hong Kong, Hong Kong. Q. He is now with South China University of Technology, Guangzhou, China.

References

1. Y.T. Chan, J.M. Riley and J.B. Plant, "Modeling of Time Delay and its Application to Estimation of Nonstationary Delays," IEEE Trans. Acoust., Speech, Signal Processing, vol. ASSP-29, pp. 577-581, June 1981.

2. N.S. Jayant and P. Noll, Digital Coding of Waveforms, Prentice-Hall, Englewood Cliffs, NJ, 1984.

3. J.D. Markel and A.H. Gray Jr., Linear Prediction of Speech, Springer-Verlag, NY, 1976.

4. L.R. Rabiner and R.W. Schafer, Digital Processing of Speech Signals, Prentice-Hall, Englewood Cliffs, NJ, 1978.

5. S. Kadambe and G.F. Boudreaux-Bartels, "Applications of the Wavelet Transform for Pitch Detection of Speech Signals," IEEE Trans. Inform. Theory, Special Issue on Wavelet Transforms and Multiresolution Signal Analysis, I. Daubechies, S. Mallat and A.S. Willsky, editors, 38(2), pp. 917-924, 1992.

6. M. Antonini, M. Barlaud, P. Mathieu and I. Daubechies, "Image Coding Using Wavelet Transform," IEEE Trans. on Image Proc., 1(2), pp. 205-220, Apr 1992.

7. J.S. Lienard and C. d'Alessandro, "Wavelets and Granular Analysis of Speech," in Wavelets: Time-Frequency Methods and Phase Space, J.M. Combes, A. Grossmann and Ph. Tchamitchian (editors), second edition, Springer-Verlag, New York, 1989.

8. R. Wilson, A.D. Calway and E.R.S. Pearson, "A Generalized Wavelet Transform for Fourier Analysis: The Multiresolution Fourier Transform and its Application to Image and Audio Signal Analysis," IEEE Trans. Inform. Theory, Special Issue on Wavelet Transforms and Multiresolution Signal Analysis, I. Daubechies, S. Mallat and A.S. Willsky, editors, 38(2), pp. 674-690, 1992.

9. T. Bedford, F.M. Dekking, M. Breeuwer, M.S. Keane and D. van Schooneveld, "Fractal Coding of Monochrome Images," Signal Processing: Image Communication, 6, pp. 405-419, 1994.

10. A.E. Jacquin, "Image Coding Based on a Fractal Theory of Iterated Contractive Image Transformations," IEEE Trans. Image Processing, Vol. 1, pp. 18-30, 1992.

11. J.M. Beaumont, "Advances in Block Based Fractal Coding of Still Pictures," in Proc. IEE Colloquium: The Application of Fractal Techniques in Image Processing, pp. 3/1-3/6, 1990.

12. D.R. Morgan, "An Analysis of Multiple Correlation Cancellation Loops with a Filter in the Auxiliary Path," IEEE Trans. Acoustics, Speech and Signal Processing, ASSP-28, pp. 454-467, 1980.

13. Y. Park and H. Kim, "Delayed-X Algorithm for a Long Duct System," Inter-Noise 93, Leuven, Belgium, August 1993.

14. L.J. Eriksson, "Development of the Filtered-U Algorithm for Active Noise Control," J. Acoust. Soc. Am., 89(1), Jan 1991.

15. S.J. Elliott and P.A. Nelson, "Active Noise Control," IEEE Signal Processing Magazine, Oct 1993.

16. P.E. Gill, W. Murray and M.H. Wright, Practical Optimization, Academic Press, 1981.

17. R.E. Bellman, Dynamic Programming, Princeton University Press, Princeton, New Jersey, USA, 1957.

18. D.E. Goldberg, "Genetic and Evolutionary Algorithms Come of Age," Communications of the ACM, Vol. 37, No. 3, pp. 113-119, Mar 1994.

19. J.H. Holland, Adaptation in Natural and Artificial Systems, Ann Arbor: The University of Michigan Press, 1975.

20. D.E. Goldberg, Genetic Algorithms in Search, Optimization, and Machine Learning, Addison-Wesley Publishing Company, 1989.

21. S. Kirkpatrick, C.D. Gelatt Jr. and M.P. Vecchi, "Optimization by Simulated Annealing," Science, Vol. 220, No. 4598, pp. 671-680, 1983.

22. H.P. Schwefel, Numerical Optimization of Computer Models, John Wiley, Chichester, 1981.

23. D.B. Fogel, System Identification through Simulated Evolution: A Machine Learning Approach to Modeling, Ginn Press, Needham Heights, MA, 1991.

24. L.J. Fogel, A.J. Owens and M.J. Walsh, Artificial Intelligence through Simulated Evolution, John Wiley & Sons, New York, 1966.

25. K.S. Tang, K.F. Man and C.Y. Chan, "Fuzzy Control of Water Pressure Using Genetic Algorithm," IFAC Workshop on Safety, Reliability and Applications of Emerging Intelligent Control Technologies, pp. 15-20, Hong Kong, Dec 1994.

26. C.L. Karr, "Genetic Algorithms for Fuzzy Controllers," AI Expert, Vol. 6, No. 2, pp. 26-33, 1991.

27. V. Maniezzo, "Genetic Evolution of the Topology and Weight Distribution of Neural Networks," IEEE Trans. Neural Networks, Vol. 5, No. 1, pp. 39-53, Jan 1994.

28. P.J. Angeline, G.M. Saunders and J.B. Pollack, "An Evolutionary Algorithm that Constructs Recurrent Neural Networks," IEEE Trans. Neural Networks, Vol. 5, No. 1, pp. 54-65, Jan 1994.

29. S.L. Hung and H. Adeli, "A Parallel Genetic/Neural Network Learning Algorithm for MIMD Shared Memory Machines," IEEE Trans. Neural Networks, Vol. 5, No. 6, pp. 900-909, Nov 1994.

30. K.S. Tang, K.F. Man and C.Y. Chan, "Genetic Structure for NN Topology and Weight Optimization," 1st IEE/IEEE Int. Conf. on GAs in Engineering Systems: Innovations and Applications, pp. 250-255, 12-14 Sept 1995.

31. J.R. Koza, "Genetic Programming: A Paradigm for Genetically Breeding Populations of Computer Programs to Solve Problems," Report No. STAN-CS-90-1314, Stanford University, 1990.

32. C.Z. Janikow and Z. Michalewicz, "An Experimental Comparison of Binary and Floating Point Representations in Genetic Algorithms," Proc. 4th Int. Conf. Genetic Algorithms, pp. 31-36, 1991.

33. A.H. Wright, "Genetic Algorithms for Real Parameter Optimization," Foundations of Genetic Algorithms, J.E. Rawlins (Ed.), Morgan Kaufmann, pp. 205-218, 1991.

34. L. Davis, Handbook of Genetic Algorithms, Van Nostrand Reinhold, 1991.

35. Z. Michalewicz, Genetic Algorithms + Data Structures = Evolution Programs, Springer-Verlag, NY, 1992.

36. J. Koza, "Evolution and Co-evolution of Computer Programs to Control Independently-Acting Agents," From Animals to Animats, J.A. Meyer and S.W. Wilson (Eds.), Cambridge, MA: MIT Press/Bradford Books, 1991.

37. D. Whitley, "Using Reproductive Evaluation to Improve Genetic Search and Heuristic Discovery," Proc. 2nd Int. Conf. Genetic Algorithms, pp. 108-115, 1987.

38. A.J. Chipperfield and P.J. Fleming, "Parallel Genetic Algorithms: A Survey," ACSE Research Report No. 518, University of Sheffield, May 1994.

39. C.B. Pettey, M.R. Leuze and J.J. Grefenstette, "A Parallel Genetic Algorithm," Proc. 2nd Int. Conf. Genetic Algorithms, pp. 155-161, 1987.

40. R. Tanese, "Parallel Genetic Algorithm for a Hypercube," Proc. 2nd Int. Conf. Genetic Algorithms, pp. 177-183, 1987.

41. R. Tanese, "Distributed Genetic Algorithms," Proc. 3rd Int. Conf. Genetic Algorithms, pp. 434-439, 1989.

42. T. Starkweather, D. Whitley and K. Mathias, "Optimization Using Distributed Genetic Algorithms," Proc. Parallel Problem Solving from Nature I, pp. 176-185, Springer-Verlag, 1990.

43. J.P. Cohoon, W.N. Martin and D.S. Richards, "A Multi-population Genetic Algorithm for Solving the K-Partition Problem on Hyper-cubes," Proc. 4th Int. Conf. Genetic Algorithms, pp. 244-248, 1991.

44. G. Robertson, "Parallel Implementation of Genetic Algorithms in a Classifier System," Genetic Algorithms and Simulated Annealing, L. Davis (Ed.), pp. 129-140, Pitman, London, 1987.

45. B. Manderick and P. Spiessens, "Fine-Grained Parallel Genetic Algorithms," Proc. 3rd Int. Conf. Genetic Algorithms, pp. 428-433, 1989.

46. H. Muhlenbein, "Parallel Genetic Algorithms, Population Genetics and Combinatorial Optimization," Parallelism, Learning, Evolution, pp. 398-406, Springer-Verlag, 1989.

47. M. Gorges-Schleuter, "ASPARAGOS: An Asynchronous Parallel Genetic Optimization Strategy," Proc. 3rd Int. Conf. Genetic Algorithms, pp. 422-427, 1989.

48. H.-M. Voigt, I. Santibanez-Koref and J. Born, "Hierarchically Structured Distributed Genetic Algorithms," Parallel Problem Solving from Nature 2, pp. 145-154, Amsterdam: North-Holland, 1992.

49. D.E. Goldberg and R.E. Smith, "Nonstationary Function Optimization Using Genetic Dominance and Diploidy," Proc. 2nd Int. Conf. Genetic Algorithms, pp. 59-68, 1987.

50. D. Dasgupta and D.R. McGregor, "Nonstationary Function Optimization Using the Structured Genetic Algorithm," Parallel Problem Solving from Nature 2, pp. 145-154, Amsterdam: North-Holland, 1992.

51. J.J. Grefenstette, "Genetic Algorithms for Changing Environments," Parallel Problem Solving from Nature 2, pp. 137-144, Amsterdam: North-Holland, 1992.

52. H.G. Cobb, "An Investigation into the Use of Hypermutation as an Adaptive Operator in Genetic Algorithms Having Continuous, Time-dependent Nonstationary Environments," NRL Memorandum Report 6760, 1990.

53. H.G. Cobb and J.J. Grefenstette, "Genetic Algorithms for Tracking Changing Environments," Proc. 5th Int. Conf. Genetic Algorithms, pp. 523-530, 1993.

54. K.S. Tang, K.F. Man and S. Kwong, "GA Approach to Time-Variant Delay Estimation," International Conference on Control and Information, pp. 173-175, 5-9 June 1995, Hong Kong.

55. L.J. Nicolson and B.M.G. Cheetham, "Simulated Annealing Applied to the Design of IIR Digital Filters by Multiple Criterion Optimisation," Workshop on Natural Algorithms in Signal Processing, pp. 6/1-6/7, Chelmsford, Essex, 14-16 Nov 1993.

56. C.M. Fonseca and P.J. Fleming, "Genetic Algorithms for Multiobjective Optimization: Formulation, Discussion and Generalization," ACSE Research Report No. 466, 1993.

57. P.B. Wilson and M.D. Macleod, "Low Implementation Cost IIR Digital Filter Design Using Genetic Algorithms," Workshop on Natural Algorithms in Signal Processing, pp. 4/1-4/8, Chelmsford, Essex, 14-16 Nov 1993.

58. C.M. Fonseca and P.J. Fleming, "An Overview of Evolutionary Algorithms in Multiobjective Optimization," ACSE Research Report No. 527, July 1994.

59. H.C. So, P.C. Ching and Y.T. Chan, "A New Algorithm for Explicit Adaptation of Time Delay," IEEE Trans. Signal Processing, Vol. 42, No. 7, pp. 1816-1820, July 1994.

60. K. Deb and D.E. Goldberg, "An Investigation of Niche and Species Formation in Genetic Function Optimization," Proc. 3rd Int. Conf. Genetic Algorithms, pp. 42-50, 1989.

61. D.E. Goldberg, "Real-Coded Genetic Algorithms, Virtual Alphabets, and Blocking," University of Illinois, Technical Report No. 90001, Sept 1990.

62. J.J. Shynk, "Adaptive IIR Filtering," IEEE ASSP Magazine, pp. 4-21, Apr 1989.

63. J.T. Richardson, M.R. Palmer, G. Liepins and M. Hilliard, "Some Guidelines for Genetic Algorithms with Penalty Functions," Proc. 3rd Int. Conf. Genetic Algorithms, pp. 191-197, 1989.

64. R. Nambiar and P. Mars, "Adaptive IIR Filtering Using Natural Algorithms," Workshop on Natural Algorithms in Signal Processing, pp. 20/1-20/10, Chelmsford, Essex, 14-16 Nov 1993.

65. M.S. White and S.J. Flockton, "A Comparative Study of Natural Algorithms for Adaptive IIR Filtering," Workshop on Natural Algorithms in Signal Processing, pp. 22/1-22/8, Chelmsford, Essex, 14-16 Nov 1993.

66. C.M. Fonseca, E.M. Mendes, P.J. Fleming and S.A. Billings, "Non-linear Model Term Selection with Genetic Algorithms," Workshop on Natural Algorithms in Signal Processing, pp. 27/1-27/8, Chelmsford, Essex, 14-16 Nov 1993.

67. H. Kargupta and R.E. Smith, "System Identification with Evolving Polynomial Networks," Proc. 4th Int. Conf. Genetic Algorithms, pp. 370-376, 1991.

68. M. Mackey and L. Glass, "Oscillation and Chaos in Physiological Control Systems," Science, 197, pp. 287-289, 1977.

69. K.S. Tang, K.F. Man, S. Kwong and P.J. Fleming, "GA Approach to Multiple Objective Optimization for Active Noise Control," Algorithms and Architectures for Real-Time Control 95, pp. 13-19, Belgium, 31 May-2 June 1995.

70. K.H. Baek and S.J. Elliott, "Natural Algorithms for Choosing Source Locations in Active Control Systems," Workshop on Natural Algorithms in Signal Processing, pp. 25/1-25/10, Chelmsford, Essex, 14-16 Nov 1993.

71. S. Kwong, Q. He and K.F. Man, "Genetic Time Warping for Isolated Word Recognition," International Journal of Pattern Recognition and Artificial Intelligence (to be published).

72. H. Sakoe and S. Chiba, "Dynamic Programming Algorithm Optimization for Spoken Word Recognition," IEEE Trans. Acoustics, Speech, Signal Proc., ASSP-26(1), pp. 43-49, Feb 1978.

73. L.R. Rabiner, A.E. Rosenberg and S.E. Levinson, "Considerations in Dynamic Time Warping Algorithms for Discrete Word Recognition," IEEE Trans. on ASSP, Vol. 26, No. 6, Dec 1978.

74. J. Heitkoetter and D. Beasley (Eds.), "The Hitch-Hiker's Guide to Evolutionary Computation: A List of Frequently Asked Questions (FAQ)," USENET: comp.ai.genetic, 1994.

75. A.J. Chipperfield, P.J. Fleming and H. Pohlheim, "A Genetic Algorithm Toolbox for MATLAB," Proc. International Conference on Systems Engineering, Coventry, UK, 6-8 Sept 1994.

76. MATHWORKS, MATLAB User's Guide, The MathWorks, Inc., 1991.

77. J.J. Grefenstette, "A User's Guide to GENESIS v5.0," Naval Research Laboratory, Washington, D.C., 1990.

78. J.A. Smith, "Designing Biomorphs with an Interactive Genetic Algorithm," Proc. 4th Int. Conf. Genetic Algorithms, 1991.

79. B. Thomas, "Users Guide for GENEsYs," System Analysis Research Group, Dept. of Computer Science, University of Dortmund, 1992.

80. Y.C. Tang, "Tolkien Reference Manual," Dept. of Computer Science, The Chinese University of Hong Kong, 1994.
