34
Discovering Interesting Patterns for Investment Decision Making with GLOWER-A Genetic Learner Overlaid With Entropy Reduction Advisor Dr. Hsu Graduate Min-Hong Lin IDSL seminar 2001/12/11

Discovering Interesting Patterns for Investment Decision Making with GLOWER-A Genetic Learner Overlaid With Entropy Reduction Advisor : Dr. Hsu Graduate

Embed Size (px)

DESCRIPTION

Motivation Prediction in financial domains is difficult to model. The dimensionality is high. Relationships among variables are weak and nonlinear. Variable interactions can be significant.

Citation preview

Page 1: Discovering Interesting Patterns for Investment Decision Making with GLOWER-A Genetic Learner Overlaid With Entropy Reduction Advisor : Dr. Hsu Graduate

Discovering Interesting Patterns for Investment Decision Making with GLOWER-A Genetic Learner

Overlaid With Entropy Reduction

Advisor : Dr. HsuGraduate : Min-Hong Lin

IDSL seminar 2001/12/11

Page 2: Discovering Interesting Patterns for Investment Decision Making with GLOWER-A Genetic Learner Overlaid With Entropy Reduction Advisor : Dr. Hsu Graduate

Outline Motivation Objective Genetic Search for Rules Genetic Rule Discovery Comparison of GLOWER to Tree and Rule

Induction Conclusions Comments

Page 3: Discovering Interesting Patterns for Investment Decision Making with GLOWER-A Genetic Learner Overlaid With Entropy Reduction Advisor : Dr. Hsu Graduate

Motivation Prediction in financial domains is difficult

to model. The dimensionality is high. Relationships among variables are weak and

nonlinear. Variable interactions can be significant.

Page 4: Discovering Interesting Patterns for Investment Decision Making with GLOWER-A Genetic Learner Overlaid With Entropy Reduction Advisor : Dr. Hsu Graduate

Objective Describe and evaluate several variations of a new

genetic learning algorithm (GLOWER) on a variety of data sets.

Comparison of GLOWER variants. Comparison of GLOWER to Tree and Rule

Induction.

Page 5: Discovering Interesting Patterns for Investment Decision Making with GLOWER-A Genetic Learner Overlaid With Entropy Reduction Advisor : Dr. Hsu Graduate

Genetic Search for Rules Genetic algorithms (Packard, 1989; Goldberg,

1989, Holland, 1992) have been shown to be well suited to learning rule-like patterns.

They have the ability to search large spaces for pattern without resorting to heuristics that are biased against term interactions.

Page 6: Discovering Interesting Patterns for Investment Decision Making with GLOWER-A Genetic Learner Overlaid With Entropy Reduction Advisor : Dr. Hsu Graduate

Limitations of Genetic Search Lack of speed Randomness in creating the initial

population They can be myopic after find one good

solution

Page 7: Discovering Interesting Patterns for Investment Decision Making with GLOWER-A Genetic Learner Overlaid With Entropy Reduction Advisor : Dr. Hsu Graduate

Benefits of Genetic Search Scour a search space thoroughly Allow arbitrary fitness functions in the

search

Page 8: Discovering Interesting Patterns for Investment Decision Making with GLOWER-A Genetic Learner Overlaid With Entropy Reduction Advisor : Dr. Hsu Graduate

The failings of greedy search Tree Induction algorithms are currently

among the most widely used techniques in data mining.

But greedy search conducted by TI algorithms can overlook good patterns. See next example

Page 9: Discovering Interesting Patterns for Investment Decision Making with GLOWER-A Genetic Learner Overlaid With Entropy Reduction Advisor : Dr. Hsu Graduate

An example

Page 10: Discovering Interesting Patterns for Investment Decision Making with GLOWER-A Genetic Learner Overlaid With Entropy Reduction Advisor : Dr. Hsu Graduate

An example(Cont’d)

Page 11: Discovering Interesting Patterns for Investment Decision Making with GLOWER-A Genetic Learner Overlaid With Entropy Reduction Advisor : Dr. Hsu Graduate

Entropy Reduction Tree induction algorithms typically determi

ne split points based on a heuristic such as entropy reduction

Page 12: Discovering Interesting Patterns for Investment Decision Making with GLOWER-A Genetic Learner Overlaid With Entropy Reduction Advisor : Dr. Hsu Graduate

Entropy Reduction(cont’d) The entropy of a cluster i can be computed

using the standard formula:

The gain from a split is computed based on the difference between the entropy of the parent cluster and the entropy of the child clusters resulting from the split.

Page 13: Discovering Interesting Patterns for Investment Decision Making with GLOWER-A Genetic Learner Overlaid With Entropy Reduction Advisor : Dr. Hsu Graduate

An example(Cont’d)

Page 14: Discovering Interesting Patterns for Investment Decision Making with GLOWER-A Genetic Learner Overlaid With Entropy Reduction Advisor : Dr. Hsu Graduate

Evaluation of partial models Two commonly used metrics used to

measure the goodness of individual rules are confidence and support.

If N is the total number of examples in a data set, then, for a rule form A à B:

Page 15: Discovering Interesting Patterns for Investment Decision Making with GLOWER-A Genetic Learner Overlaid With Entropy Reduction Advisor : Dr. Hsu Graduate

Genetic Rule Discovery Representation: Gene and Chromosome

Page 16: Discovering Interesting Patterns for Investment Decision Making with GLOWER-A Genetic Learner Overlaid With Entropy Reduction Advisor : Dr. Hsu Graduate

Genetic Rule Discovery Process An initial population of patterns (chromosomes) is

created randomly. Each chromosome is evaluated and ranked. The higher-ranked chromosomes are selected to

participate in “mating” and “mutation” to produce new offspring. Mating involves exchanging parts of chromosomes

(genes) between pairs. (Crossover) Mutating involves changing the value of a gene

Page 17: Discovering Interesting Patterns for Investment Decision Making with GLOWER-A Genetic Learner Overlaid With Entropy Reduction Advisor : Dr. Hsu Graduate

Genetic Rule Discovery Process(Cont’d) By repeating these steps over and over, the search

converges on populations of better patterns, as measured using some fitness function.

Page 18: Discovering Interesting Patterns for Investment Decision Making with GLOWER-A Genetic Learner Overlaid With Entropy Reduction Advisor : Dr. Hsu Graduate

Genetic Rule Discovery (Cont’d)

Selection(or reproduction)

Crossover

MutationCrossover

Selection

Selection

Probability ofsurvival

Page 19: Discovering Interesting Patterns for Investment Decision Making with GLOWER-A Genetic Learner Overlaid With Entropy Reduction Advisor : Dr. Hsu Graduate

Genetic Rule Discovery (Cont’d)

Page 20: Discovering Interesting Patterns for Investment Decision Making with GLOWER-A Genetic Learner Overlaid With Entropy Reduction Advisor : Dr. Hsu Graduate

Genetic Rule Discovery (Cont’d) Two problems

A1 is the dominant pattern in the data set, chromosomes typically gather in the neighborhood of A1.

If such a pattern dominates the early populations, the algorithm is likely to overlook the other patterns.

Approach(inductive strengthening ) Sequential Niching:SN Data Reduction:DR

Page 21: Discovering Interesting Patterns for Investment Decision Making with GLOWER-A Genetic Learner Overlaid With Entropy Reduction Advisor : Dr. Hsu Graduate

Sequential Niching:SN Niching schemes have been described for dealing with mul

timodal problems (Goldberg and Richardson (1987), Oei et al. (1991).

inductive strengthening (Provost & Buchanan, 1992): placing stronger restrictions on the search based on the rules that have already been induced.

Genetic search can perform inductive strengthening by niching sequentially.

After the search converges on a high-fitness schema, the evaluation function can be modified to penalize patterns corresponding to it.

Page 22: Discovering Interesting Patterns for Investment Decision Making with GLOWER-A Genetic Learner Overlaid With Entropy Reduction Advisor : Dr. Hsu Graduate

Sequential Niching:SN(Cont’d)

Page 23: Discovering Interesting Patterns for Investment Decision Making with GLOWER-A Genetic Learner Overlaid With Entropy Reduction Advisor : Dr. Hsu Graduate

Data Reduction:DR The most commonly used heuristic for inductive

strengthening is the covering heuristic. we can apply the covering heuristic to genetic

search. Instead of penalizing an area as classified when

calculating the fitness function, the data corresponding to the area are removed from further consideration.

Page 24: Discovering Interesting Patterns for Investment Decision Making with GLOWER-A Genetic Learner Overlaid With Entropy Reduction Advisor : Dr. Hsu Graduate

Compare DR to SN

Page 25: Discovering Interesting Patterns for Investment Decision Making with GLOWER-A Genetic Learner Overlaid With Entropy Reduction Advisor : Dr. Hsu Graduate

Compare DR to SN(Cont’d) The SN is more likely to produce non-overlapping rules

whereas the DR produces a hierarchy of rules

Page 26: Discovering Interesting Patterns for Investment Decision Making with GLOWER-A Genetic Learner Overlaid With Entropy Reduction Advisor : Dr. Hsu Graduate

Compare DR to SN(Cont’d)

Page 27: Discovering Interesting Patterns for Investment Decision Making with GLOWER-A Genetic Learner Overlaid With Entropy Reduction Advisor : Dr. Hsu Graduate

Comparison of GLOWER variants Sampling Design

GLOWER Variants sequential niching sequential niching plus entropy reduction data reduction plus entropy reduction

Page 28: Discovering Interesting Patterns for Investment Decision Making with GLOWER-A Genetic Learner Overlaid With Entropy Reduction Advisor : Dr. Hsu Graduate

Results

Page 29: Discovering Interesting Patterns for Investment Decision Making with GLOWER-A Genetic Learner Overlaid With Entropy Reduction Advisor : Dr. Hsu Graduate

Comparison of GLOWER to Tree and Rule Induction

To predict “earnings surprises” Date set : a history of earnings forecasts and actual

reported earnings for S&P500 stocks between 1990/1~1998/9

60 independent variables were chosen based on fundamental and technical analysis.

Page 30: Discovering Interesting Patterns for Investment Decision Making with GLOWER-A Genetic Learner Overlaid With Entropy Reduction Advisor : Dr. Hsu Graduate

Results

Page 31: Discovering Interesting Patterns for Investment Decision Making with GLOWER-A Genetic Learner Overlaid With Entropy Reduction Advisor : Dr. Hsu Graduate

Results(Cont’d) GLOWER is more thorough in its search than RL which is

more thorough than TI GLOWER outperforms in both confidence and support for

predicting positive surprises. For negative surprises, the confidence levels are similar,

but GLOWER is better in support. GA is much more suited to capturing symmetry in the

problem space TI and RL find it harder to predict a negative earnings

surprise than a positive one.

Page 32: Discovering Interesting Patterns for Investment Decision Making with GLOWER-A Genetic Learner Overlaid With Entropy Reduction Advisor : Dr. Hsu Graduate

Discussion We are able to combine the speed of the traditional

methods with the more comprehensive search of the GA.

GA employ useful heuristics for achieving higher levels of support while maintaining high accuracy.

The GA is about two or three orders of magnitude slower than a TI algorithm.

Explainability is an important consideration in using knowledge discovery methods.

Page 33: Discovering Interesting Patterns for Investment Decision Making with GLOWER-A Genetic Learner Overlaid With Entropy Reduction Advisor : Dr. Hsu Graduate

Conclusions Genetic learning algorithms are more thorough in

their search than other rule learning algorithms for hard problems.

Genetic algorithms have the flexibility to accommodate variable fitness functions.

It provides comparison and of the heuristics used by different rule-mining techniques.

Page 34: Discovering Interesting Patterns for Investment Decision Making with GLOWER-A Genetic Learner Overlaid With Entropy Reduction Advisor : Dr. Hsu Graduate

Comments GLOWER has the ability to uncover interesting

patterns for difficult problems. Maybe the time complexity should be considered

in the experiment.