Inducing Process Models from Continuous Data

Pat Langley
Institute for the Study of Learning and Expertise

Javier Sanchez
CSLI / Stanford University

Ljupco Todorovski
Saso Dzeroski
Jozef Stefan Institute

Supported by NTT Communication Science Laboratories, by Grant NCC 2-1220 from NASA Ames Research Center, and by EU Grant IST-2000-26469.



Page 2: Exploratory Research in Machine Learning

Dietterich (1990) claims an exploratory research report should:

- define a challenging new problem for machine learning;
- show that established methods cannot solve the problem;
- present an initial approach that addresses the new task; and
- outline an agenda for future research efforts in the area.

In this talk, we explore the problem of inducing process models from continuous data.

Page 3: Inductive Process Modeling

[Diagram: training data and background knowledge feed into Induction, which produces learned knowledge]

background knowledge:

process exponential_growth
  variables: P {population}
  equations: d[P,t] = [0, 1, ∞] * P

process logistic_growth
  variables: P {population}
  equations: d[P,t] = [0, 1, ∞] * P * (1 - P / [0, 1, ∞])

process constant_inflow
  variables: I {inorganic_nutrient}
  equations: d[I,t] = [0, 1, ∞]

process consumption
  variables: P1 {population}, P2 {population}, nutrient_P2
  equations: d[P1,t] = [0, 1, ∞] * P1 * nutrient_P2,
             d[P2,t] = -[0, 1, ∞] * P1 * nutrient_P2

process no_saturation
  variables: P {number}, nutrient_P {number}
  equations: nutrient_P = P

process saturation
  variables: P {number}, nutrient_P {number}
  equations: nutrient_P = P / (P + [0, 1, ∞])

learned knowledge:

model AquaticEcosystem

variables: nitro, phyto, zoo, nutrient_nitro, nutrient_phyto
observables: nitro, phyto, zoo

process phyto_exponential_growth
  equations: d[phyto,t] = 0.1 * phyto

process zoo_logistic_growth
  equations: d[zoo,t] = 0.1 * zoo / (1 - zoo / 1.5)

process phyto_nitro_consumption
  equations: d[nitro,t] = -1 * phyto * nutrient_nitro,
             d[phyto,t] = 1 * phyto * nutrient_nitro

process phyto_nitro_no_saturation
  equations: nutrient_nitro = nitro

process zoo_phyto_consumption
  equations: d[phyto,t] = -1 * zoo * nutrient_phyto,
             d[zoo,t] = 1 * zoo * nutrient_phyto

process zoo_phyto_saturation
  equations: nutrient_phyto = phyto / (phyto + 0.5)

Page 4: Inductive Process Modeling

[Diagram: training data and background knowledge feed into Induction, which produces the learned model]

training data: Observed values for a set of continuous variables as they vary over time or situations

background knowledge: Generic processes that characterize causal relationships among variables in terms of conditional equations

learned model: A specific process model that explains the observed values and predicts future data accurately

Page 5: A Process Model of an Ice-Water System

model WaterPhaseChange

variables: temp, heat, ice_mass, water_mass
observables: temp, heat, ice_mass, water_mass

process ice-warming
  conditions: ice_mass > 0, temp < 0
  equations: d[temp,t] = heat / (0.00206 * ice_mass)

process ice-melting
  conditions: ice_mass > 0, temp == 0
  equations: d[ice_mass,t] = -(18 * heat) / 6.02,
             d[water_mass,t] = (18 * heat) / 6.02

process water-warming
  conditions: ice_mass == 0, water_mass > 0, temp >= 0, temp < 100
  equations: d[temp,t] = heat / (0.004184 * water_mass)

[Figure: temp, ice_mass, and water_mass plotted against Time]
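The conditions on each process decide which one is active at a given moment. A minimal Python sketch of that switching follows. It assumes forward-Euler steps, a constant positive heat input, and clamping at the phase boundaries so that the equality conditions (temp == 0, ice_mass == 0) can be met exactly. The function name `step`, the step size, and the starting values are illustrative assumptions and do not come from the slides.

```python
def step(state, heat, dt):
    """Advance the WaterPhaseChange model by one Euler step of size dt."""
    temp = state["temp"]
    ice = state["ice_mass"]
    water = state["water_mass"]

    if ice > 0 and temp < 0:
        # ice-warming: d[temp,t] = heat / (0.00206 * ice_mass)
        # Clamp at 0 (assumption) so the melting condition temp == 0 can fire.
        temp = min(0.0, temp + dt * heat / (0.00206 * ice))
    elif ice > 0 and temp == 0:
        # ice-melting: ice_mass falls and water_mass rises by the same amount.
        # Never melt more ice than is left (assumption).
        melted = min(ice, dt * (18 * heat) / 6.02)
        ice -= melted
        water += melted
    elif ice == 0 and water > 0 and 0 <= temp < 100:
        # water-warming: d[temp,t] = heat / (0.004184 * water_mass)
        temp = min(100.0, temp + dt * heat / (0.004184 * water))

    return {"temp": temp, "ice_mass": ice, "water_mass": water}


# Hypothetical starting point: 100 units of ice at temp = -10, no liquid water.
state = {"temp": -10.0, "ice_mass": 100.0, "water_mass": 0.0}
trajectory = [state]
for _ in range(60):
    state = step(state, heat=1.0, dt=1.0)
    trajectory.append(state)
```

With these values the trajectory passes through all three regimes in order: temp rises to 0, stays there while ice_mass drains into water_mass, then rises again once the ice is gone.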

Page 6: Why Are Process Models Interesting?

Process models are a crucial target for machine learning because:

- they incorporate scientific formalisms rather than AI notations;
- that are easily communicable to scientists and engineers;
- they move beyond descriptive generalization to explanation;
- while retaining the modularity needed to support induction.

These reasons point to process models as an ideal representation for scientific and engineering knowledge.

Process models are an important alternative to formalisms used currently in machine learning.

Page 7: Challenges of Inductive Process Modeling

Process model induction differs from typical learning tasks in that:

- process models characterize behavior of dynamical systems;
- variables are mainly continuous and data are unsupervised;
- observations are not independently and identically distributed;
- process models contain unobservable processes and variables;
- multiple processes can interact to produce complex behavior.

Compensating factors include a focus on deterministic systems and the availability of background knowledge.

Page 8: Can Existing Methods Induce Process Models?

[Figure: example outputs of established learning methods; surviving labels and fragments follow]

equation discovery:
  d[ice_mass,t] = -(18 * heat) / 6.02
  d[water_mass,t] = (18 * heat) / 6.02

regression trees:
  [tree with tests B>6, C>0, C>4 and leaf values 14.3, 18.7, 11.5, 16.9]

hidden Markov models:
  [state diagram with probabilities 0.3, 0.7, 1.0, 1.0; node labels x=12, x=1; y=18, x=2; x=12, x=1; y=10, x=2; x=16, x=2; y=13, x=1; x=19, x=1; y=11, x=2]

explanation-based learning

inductive logic programming:
  gcd(X,X,X).
  gcd(X,Y,D) :- X<Y, Z is Y-X, gcd(X,Z,D).
  gcd(X,Y,D) :- Y<X, gcd(Y,X,D).

Page 9: Facets of Inductive Process Modeling

To describe a system that learns process models, we must specify:

- characteristics of the data (observations to be explained);
- a representation for background knowledge (generic processes);
- a representation for learned knowledge (process models);
- a performance element that makes predictions (a simulator);
- a learning method that induces process models.

We will use an example from population dynamics to illustrate an initial approach to inductive process modeling.

Page 10: Data for an Aquatic Ecosystem

[Figure: data for the aquatic ecosystem]

Page 11: Generic Processes for Population Dynamics

process exponential_growth
  variables: P {population}
  equations: d[P,t] = [0, 1, ∞] * P

process exponential_decay
  variables: P {population}
  equations: d[P,t] = -[0, 1, ∞] * P

process logistic_growth
  variables: P {population}
  equations: d[P,t] = [0, 1, ∞] * P * (1 - P / [0, 1, ∞])

process constant_inflow
  variables: I {inorganic_nutrient}
  equations: d[I,t] = [0, 1, ∞]

process consumption
  variables: P1 {population}, P2 {population}, nutrient_P2 {number}
  equations: d[P1,t] = [0, 1, ∞] * P1 * nutrient_P2,
             d[P2,t] = -[0, 1, ∞] * P1 * nutrient_P2

process no_saturation
  variables: P {number}, nutrient_P {number}
  equations: nutrient_P = P

process saturation
  variables: P {number}, nutrient_P {number}
  equations: nutrient_P = P / (P + [0, 1, ∞])

Page 12: Process Model for an Aquatic Ecosystem

model AquaticEcosystem

variables: nitro, phyto, zoo, nutrient_nitro, nutrient_phyto
observables: nitro, phyto, zoo

process phyto_exponential_growth
  equations: d[phyto,t] = 0.1 * phyto

process zoo_logistic_growth
  equations: d[zoo,t] = 0.1 * zoo / (1 - zoo / 1.5)

process phyto_nitro_consumption
  equations: d[nitro,t] = -1 * phyto * nutrient_nitro,
             d[phyto,t] = 1 * phyto * nutrient_nitro

process phyto_nitro_no_saturation
  equations: nutrient_nitro = nitro

process zoo_phyto_consumption
  equations: d[phyto,t] = -1 * zoo * nutrient_phyto,
             d[zoo,t] = 1 * zoo * nutrient_phyto

process zoo_phyto_saturation
  equations: nutrient_phyto = phyto / (phyto + 0.5)

Page 13: Making Predictions with Process Models

1. Specify initial values for input variables and the size for time steps
2. On each time step, check conditions to decide which processes are active
3. Solve algebraic and differential equations with known values
4. Propagate values and recurse to solve other equations
5. Add the effects of different processes on each variable
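The prediction steps above can be sketched as a small simulator. This is a minimal Python illustration under stated assumptions, not the authors' implementation: it uses forward-Euler integration, evaluates algebraic equations in a single pass (so they may depend only on state variables, not on each other), and checks conditions before the algebraic variables are computed. The names `Process`, `euler_step`, and `simulate`, and the step size, are hypothetical. The demonstration uses three processes taken from the AquaticEcosystem model.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List

State = Dict[str, float]
Equation = Callable[[State], float]


@dataclass
class Process:
    name: str
    condition: Callable[[State], bool]   # when the process is active
    algebraic: Dict[str, Equation]       # var = f(state)
    derivatives: Dict[str, Equation]     # d[var,t] = f(state)


def euler_step(state: State, processes: List[Process], dt: float) -> State:
    values = dict(state)
    # Decide which processes are active on this time step.
    active = [p for p in processes if p.condition(values)]
    # Solve algebraic equations first so derivatives can use their results.
    for p in active:
        for var, eq in p.algebraic.items():
            values[var] = eq(values)
    # Effects of different processes on the same variable are added.
    rates: Dict[str, float] = {}
    for p in active:
        for var, eq in p.derivatives.items():
            rates[var] = rates.get(var, 0.0) + eq(values)
    for var, rate in rates.items():
        values[var] += dt * rate
    return values


def simulate(state: State, processes: List[Process], dt: float, steps: int) -> List[State]:
    trajectory = [dict(state)]
    for _ in range(steps):
        trajectory.append(euler_step(trajectory[-1], processes, dt))
    return trajectory


def always(_state: State) -> bool:
    return True


model = [
    Process("phyto_exponential_growth", always, {},
            {"phyto": lambda s: 0.1 * s["phyto"]}),
    Process("phyto_nitro_no_saturation", always,
            {"nutrient_nitro": lambda s: s["nitro"]}, {}),
    Process("phyto_nitro_consumption", always, {},
            {"nitro": lambda s: -1.0 * s["phyto"] * s["nutrient_nitro"],
             "phyto": lambda s: 1.0 * s["phyto"] * s["nutrient_nitro"]}),
]

trajectory = simulate({"nitro": 1.0, "phyto": 0.01}, model, dt=0.1, steps=50)
```

In this example phyto receives two contributions on every step, one from growth and one from consumption, which is the additive combination described in the last step.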

Page 14: The IPM Method for Process Model Induction

1. Find all ways to instantiate known generic processes with specific variables
2. Combine subsets of instantiated processes into generic models
3. Remove candidates that are too complex or not connected graphs
4. For each generic model, search for good parameter values
5. Return parameterized model with the smallest error
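The first three steps (instantiate, combine, filter) can be illustrated with a short Python sketch. It is only an approximation of the idea: the type assignments in `VAR_TYPES` are assumptions (nitro is allowed to fill a population slot so that consumption can bind to it, as in the AquaticEcosystem model), only four of the generic processes are included, and "connected" is interpreted here as all variables ending up in one component when each process links the variables it mentions. All function names are hypothetical.

```python
from itertools import combinations, permutations

# Generic process name -> types required for its variable slots.
GENERIC = {
    "exponential_growth": ("population",),
    "logistic_growth": ("population",),
    "constant_inflow": ("inorganic_nutrient",),
    "consumption": ("population", "population"),
}

# Assumed typing of the observed variables (illustrative only).
VAR_TYPES = {
    "nitro": {"inorganic_nutrient", "population"},
    "phyto": {"population"},
    "zoo": {"population"},
}


def instantiate(generic, var_types):
    """Bind each generic process to every type-compatible tuple of variables."""
    instances = []
    for name, slots in generic.items():
        for combo in permutations(var_types, len(slots)):
            if all(t in var_types[v] for v, t in zip(combo, slots)):
                instances.append((name, combo))
    return instances


def is_connected(structure, variables):
    """True if the processes link all variables into a single component."""
    parent = {v: v for v in variables}

    def find(v):
        while parent[v] != v:
            parent[v] = parent[parent[v]]
            v = parent[v]
        return v

    for _, bound in structure:
        for other in bound[1:]:
            parent[find(other)] = find(bound[0])
    return len({find(v) for v in variables}) == 1


def candidate_structures(instances, variables, max_processes):
    """Yield connected subsets of instantiated processes up to a size limit."""
    for k in range(1, max_processes + 1):
        for subset in combinations(instances, k):
            if is_connected(subset, variables):
                yield subset


instances = instantiate(GENERIC, VAR_TYPES)
structures = list(candidate_structures(instances, list(VAR_TYPES), max_processes=2))
```

The size limit stands in for the "too complex" filter. Raising `max_processes` makes the candidate space grow quickly, which is why such limits matter before the more expensive parameter search begins.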

Page 15: Initial Evaluation of IPM Algorithm

To demonstrate IPM's functionality at inducing process models, we ran it on synthetic data for a known system.

1. We used the aquatic ecosystem model to generate data for 100 time steps, setting nitrogen = 1.0, phyto = 0.01, zoo = 0.01;
2. We replaced each ‘true’ value x with x * (1 + r * 0.05), where r came from a Gaussian distribution (μ = 0 and σ = 1);
3. We ran IPM on these noisy data, giving it type constraints and generic processes as background knowledge.

The IPM algorithm examined a space of 2196 generic models, each with an embedded parameter optimization.
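The noise step in this evaluation is simple enough to show directly. A minimal Python sketch, assuming the multiplicative form x * (1 + r * 0.05) with r drawn from a standard Gaussian; the function name `add_noise`, the seed, and the sample values are illustrative and not taken from the original experiment.

```python
import random


def add_noise(values, level=0.05, rng=None):
    """Replace each value x with x * (1 + r * level), where r ~ N(0, 1)."""
    rng = rng or random.Random()
    return [x * (1 + rng.gauss(0.0, 1.0) * level) for x in values]


# Hypothetical clean trajectory for one variable.
clean = [1.0, 0.5, 0.25, 0.125]
noisy = add_noise(clean, rng=random.Random(0))
```

Because the noise is multiplicative, its absolute size scales with the value itself, so small populations such as phyto = 0.01 receive proportionally small perturbations.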

Page 16: Predictions from IPM’s Induced Model

[Figure: predictions from IPM’s induced model]

Page 17: Process Model Generated by IPM

model AquaticEcosystem

variables: nitro, phyto, zoo, nutrient_nitro_1, nutrient_nitro_2, nutrient_phyto
observables: nitro, phyto, zoo

process phyto_exponential_growth
  equations: d[phyto,t] = 0.089 * phyto

process zoo_logistic_growth
  equations: d[zoo,t] = 0.013 * zoo / (1 - zoo / 0.469)

process phyto_nitro_consumption
  equations: d[nitro,t] = -1.174 * phyto * nutrient_nitro_1,
             d[phyto,t] = 1.058 * phyto * nutrient_nitro_1

process phyto_nitro_no_saturation
  equations: nutrient_nitro_1 = nitro

process zoo_phyto_consumption
  equations: d[phyto,t] = -0.986 * zoo * nutrient_phyto,
             d[zoo,t] = 1.089 * zoo * nutrient_phyto

process zoo_phyto_saturation
  equations: nutrient_phyto = phyto / (phyto + 0.487)

Page 18: Process Model Generated by IPM (continued)

process nitro_constant_inflow
  equations: d[nitro,t] = 0.067

process zoo_nitro_consumption
  equations: d[nitro,t] = -0.470 * zoo * nutrient_nitro_2,
             d[zoo,t] = 1.089 * zoo * nutrient_nitro_2

process zoo_nitro_saturation
  equations: nutrient_nitro_2 = nitro / (nitro + 0.020)

These extra processes complicate the model but have little effect on its behavior or its predictive accuracy.

Page 19: A Proposed Research Agenda

Future research on process modeling should explore methods that:

- reduce variance and overfitting (e.g., through pruning);
- determine the conditions on processes from training data;
- associate variables with physical entities to constrain search;
- use a taxonomy of process types to organize and limit search;
- use knowledge of dimensions and conservation to limit search;
- support the induction of qualitative process models; and
- revise existing process models rather than construct them.

This work should draw on traditional induction methods, which have many relevant ideas.

Page 20: Evaluation of Process Models

Research on this new class of problems should follow the accepted standards; thus, papers should:

- make explicit claims about an induction method's abilities;
- support these claims with experimental or theoretical evidence;
- study behavior on natural data sets to ensure relevance;
- utilize synthetic data sets to vary dimensions of interest; and
- incorporate ideas from other tasks and utilize existing methods whenever sensible.

In addition, the focus on communicability and use of background knowledge suggests collaborations with domain experts.

Page 21: Concluding Remarks

In this exploratory research contribution, we have:

- proposed a new problem that involves induction of process models from components to explain observations;
- argued that this task does not lend itself to established methods;
- proposed a formalism for models and background knowledge;
- presented an initial system that induces such process models;
- demonstrated its functionality in a population dynamics domain;
- outlined an agenda for future research in this new area.

Process model induction has great potential to aid development of models in science and engineering.

Page 22: In Memoriam

Early last year, computational scientific discovery lost two of its founding fathers:

Herbert A. Simon (1916 – 2001)

Jan M. Zytkow (1945 – 2001)

Both contributed to the field in many ways: posing new problems, inventing methods, training students, and organizing meetings.

Moreover, both were interdisciplinary researchers who contributed to computer science, psychology, philosophy, and statistics.

Herb Simon and Jan Zytkow were excellent role models that we should all aim to emulate.

Page 23

Page 24: The LaGramge Discovery System

Our approach to inductive process modeling builds on LaGramge (Todorovski & Dzeroski, 1997), a discovery system that:

- specifies a space of abstract numeric equations in terms of a context-free grammar;
- searches exhaustively through this space, to a given depth, to generate candidate abstract equations;
- calls on established optimization techniques to determine the parameters for each equation; and
- uses either squared error or minimum description length to select its final equations.

LaGramge has rediscovered an impressive class of differential and algebraic equations from noisy data.
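The idea of exhaustively searching a grammar-defined equation space to a given depth can be illustrated with a toy enumerator. This Python sketch is not LaGramge's algorithm or grammar: the grammar `GRAMMAR`, the function `expand`, and the use of `const` as a placeholder for a parameter to be fitted later are all assumptions made for illustration. Depth here means the maximum height of the derivation tree.

```python
# Toy context-free grammar: nonterminal -> list of productions.
# Symbols that are not keys of GRAMMAR are treated as terminals.
GRAMMAR = {
    "E": [["T"], ["E", "+", "T"]],
    "T": [["const", "*", "V"]],
    "V": [["x"], ["y"]],
}


def expand(symbol, depth):
    """Return all terminal strings derivable from symbol within the depth limit."""
    if symbol not in GRAMMAR:
        return [symbol]          # terminal: nothing left to expand
    if depth == 0:
        return []                # nonterminal, but no depth budget left
    results = []
    for production in GRAMMAR[symbol]:
        partials = [""]
        for sym in production:
            subs = expand(sym, depth - 1)
            # Combine every partial string with every expansion of this symbol.
            partials = [(p + " " + s).strip() for p in partials for s in subs]
        results.extend(partials)
    return results


shallow = expand("E", 3)   # single-term equations only
deeper = expand("E", 4)    # adds two-term sums
```

Each enumerated string is an abstract equation; a separate optimization step would then assign a value to every `const` before the candidates are compared by error or description length.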

Page 25: Making Predictions with Process Models

To simulate a given process model’s behavior over time, we can:

- specify initial values for input variables and time step size;
- on each time step, determine which processes are active;
- solve active algebraic/differential equations with known values;
- propagate values and recursively solve other active equations;
- when multiple processes influence the same variable, assume their effects are additive.

This performance element makes specific predictions that we can compare to observations.

Page 26: A Method for Process Model Induction

We have implemented IPM, an algorithm that constructs process models from generic components in four stages:

1. Find all ways to instantiate known generic processes with specific variables;
2. Combine subsets of instantiated processes into generic models, each specifying an explanatory structure;
   2a. Ensure that each candidate consists of a connected graph;
   2b. Limit the maximum number of processes that can connect any two variables and the total number of processes;
3. Translate the candidate into a context-free grammar and invoke LaGramge to search for good parameter values;
4. Return the model with the least error produced by LaGramge.
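Stages 3 and 4 amount to fitting parameters for every candidate structure and keeping the best fit. The Python sketch below illustrates that loop on a single variable. It deliberately swaps in a coarse grid search over one parameter in place of LaGramge's grammar-based search and optimization, and it uses forward-Euler simulation with summed squared error. The two candidate structures, the grid, and all function names are hypothetical.

```python
def simulate(rate_fn, x0, dt, steps):
    """Forward-Euler trajectory for d[x,t] = rate_fn(x)."""
    xs = [x0]
    for _ in range(steps):
        xs.append(xs[-1] + dt * rate_fn(xs[-1]))
    return xs


def sse(predicted, observed):
    """Summed squared error between two trajectories."""
    return sum((p - o) ** 2 for p, o in zip(predicted, observed))


# Candidate structures: each maps a parameter c to a rate function.
STRUCTURES = {
    "exponential_growth": lambda c: (lambda x: c * x),   # d[x,t] = c * x
    "constant_inflow": lambda c: (lambda x: c),          # d[x,t] = c
}


def fit(observed, dt, grid):
    """Return (error, structure name, parameter) with the smallest error."""
    best = None
    for name, make_rate in STRUCTURES.items():
        for c in grid:
            predicted = simulate(make_rate(c), observed[0], dt, len(observed) - 1)
            err = sse(predicted, observed)
            if best is None or err < best[0]:
                best = (err, name, c)
    return best


# Noise-free data from exponential growth with rate 0.1 (illustrative).
observed = simulate(lambda x: 0.1 * x, 0.01, 1.0, 30)
grid = [i / 100 for i in range(1, 101)]
err, name, c = fit(observed, 1.0, grid)
```

On this noise-free data the search recovers the generating structure and rate exactly because 0.1 lies on the grid. With noisy data the selected parameter would only approximate the true value, and a more complex structure could win on error alone, which is the overfitting risk the research agenda mentions.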