Luca M. Ghiringhelli
On-line course on Big Data and Artificial Intelligence in Materials Sciences
Compressed sensing meets symbolic regression: SISSO
- Part 1 -
Reminder: a few bits of taxonomy
Artificial intelligence ⊃ machine learning ⊃ representation learning
Representation learning: learning algorithms that learn their representation as well as the predictive model, e.g.
● symbolic regression
● deep learning
Compressed sensing meets symbolic regression
● Symbolic regression: traditionally realized via evolutionary programming
● Compressed sensing: a route to feature selection/identification
● Their combination: SISSO, sure-independence screening combined with sparsifying operator
Linear regression vs. symbolic regression
Linear regression vs. kernel regression vs. one-hidden-layer perceptron vs. symbolic regression
[Figure slides comparing the functional forms of these model classes.]
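For reference (standard textbook forms, not reproduced from the slides; c, k, σ, w, b denote the usual coefficients, kernel, activation function, weights and biases), these model classes can be written as

\begin{align*}
\text{linear regression:} \quad & f(\mathbf{x}) = \sum_i c_i\, x_i \\
\text{kernel regression:} \quad & f(\mathbf{x}) = \sum_j c_j\, k(\mathbf{x}, \mathbf{x}_j) \\
\text{one-hidden-layer perceptron:} \quad & f(\mathbf{x}) = \sum_j c_j\, \sigma(\mathbf{w}_j \cdot \mathbf{x} + b_j) \\
\text{symbolic regression:} \quad & f(\mathbf{x}) = \text{any expression built from the } x_i \text{ and a set of operators.}
\end{align*}

In the first three cases only the coefficients (and possibly weights) are fitted; in symbolic regression the functional form itself is part of the search.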
Systematic construction of candidates
Primary features: Energy1, Energy2, Length1, Length2
Operators: x + y, x·y, | x - y |, x / y, x^n, x^3, exp(x), exp(-x), ln(x), arctan(x)
Candidate descriptors are built as expression trees by applying the operators to the primary features and, recursively, to previously constructed candidates, e.g. | Energy2 - Energy1 | / exp(-Length1 / Length2).
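A minimal sketch (my own illustration, not from the slides; the numerical values are made up) of how such a candidate pool can be grown by repeatedly applying the operators:

import numpy as np

# Hypothetical primary features for three materials (made-up values)
primary = {
    "Energy1": np.array([1.2, 0.8, 1.5]),
    "Energy2": np.array([2.1, 1.1, 1.9]),
    "Length1": np.array([2.4, 2.9, 2.2]),
    "Length2": np.array([3.1, 2.7, 2.5]),
}

unary_ops = {
    "exp(-{})": lambda a: np.exp(-a),
    "ln({})":   lambda a: np.log(np.abs(a)),   # abs() guards against negative arguments
}
binary_ops = {
    "({}+{})": lambda a, b: a + b,
    "|{}-{}|": lambda a, b: np.abs(a - b),
    "({}/{})": lambda a, b: a / b,
}

def grow(pool):
    """One construction step: apply every operator to every current candidate."""
    new = dict(pool)
    names = list(pool)
    for n in names:
        for label, op in unary_ops.items():
            new[label.format(n)] = op(pool[n])
    for i, n1 in enumerate(names):
        for n2 in names[i + 1:]:
            for label, op in binary_ops.items():
                new[label.format(n1, n2)] = op(pool[n1], pool[n2])
    return new

pool = grow(primary)   # candidates of length 2
pool = grow(pool)      # length 3 and beyond; the pool grows combinatorially
print(len(pool), "candidate features")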
Symbolic regression: evolutionary/genetic algorithm
Individuals are represented as binary genes, e.g.
0 0 1 1 0 1 1 0 1 0 0
1 0 0 1 0 0 0 1 0 1 1
1 0 0 1 0 0 0 0 0 1 1
1 1 0 1 0 1 0 1 1 1 1
1 0 1 1 0 0 0 1 0 0 0
1 0 0 1 1 0 1 1 0 1 1
The algorithm iterates over the following steps:
1. Initialize the population.
2. Rank the individuals with respect to the fitness function (for the population above, e.g., fitness values 0.89, 0.55, 0.34, 0.21, 0.13, 0.08).
3. Randomly select parents, “fittest first”, e.g. the two top-ranked genes
   0 0 1 1 0 1 1 0 1 0 0   and   1 0 0 1 0 0 0 1 0 1 1
4. Crossover: exchange segments of the two parents (here a single-point crossover after bit 7), giving
   1 0 0 1 0 0 0 0 1 0 0   and   0 0 1 1 0 1 1 1 0 1 1
5. Mutation: flip a few randomly chosen bits of the offspring, e.g.
   1 0 0 1 0 1 0 0 1 1 0   and   0 0 1 1 0 1 1 1 0 1 0
6. Rank the new population with respect to the fitness function. Happy? If no, go back to step 3; if yes, end.
E.g., crossover for molecules/clusters: [figure]
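A minimal sketch of this loop for binary genomes (illustrative only; the fitness function below is a hypothetical stand-in, in symbolic regression it would score how well the decoded expression fits the data):

import random

GENOME_LEN, POP_SIZE, N_GENERATIONS = 11, 6, 50

def fitness(genome):
    # Hypothetical stand-in: reward genomes with many 1-bits.
    return sum(genome) / len(genome)

def crossover(a, b):
    # Single-point crossover: exchange the tails of the two parents.
    point = random.randrange(1, GENOME_LEN)
    return a[:point] + b[point:], b[:point] + a[point:]

def mutate(genome, rate=0.05):
    # Flip each bit with a small probability.
    return [bit ^ 1 if random.random() < rate else bit for bit in genome]

# Initialize the population
population = [[random.randint(0, 1) for _ in range(GENOME_LEN)] for _ in range(POP_SIZE)]

for _ in range(N_GENERATIONS):
    # Rank with respect to the fitness function
    population.sort(key=fitness, reverse=True)
    # Randomly select parents, "fittest first" (here: only from the top half)
    parents = population[: POP_SIZE // 2]
    offspring = []
    while len(offspring) < POP_SIZE:
        a, b = random.sample(parents, 2)
        for child in crossover(a, b):
            offspring.append(mutate(child))
    population = offspring[:POP_SIZE]

best = max(population, key=fitness)
print("best genome:", best, "fitness:", fitness(best))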
Evolutionary programming: example of crossover between symbolic trees
Parent 1: | Energy2 - Energy1 | / exp(-Length1 / Length2)
Parent 2: (Energy2 + Energy1) / ln(Length1 / Length2)
Swapping subtrees between the two parents yields offspring expressions, e.g. (Energy2 + Energy1) / exp(-Length1 / Length2).
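A minimal sketch of subtree crossover on expression trees represented as nested tuples (my own illustration; the node labels mirror the slide's example):

import random

# Expressions as nested tuples: (operator, child, ...) or a leaf (feature name).
parent1 = ("/", ("abs-", "Energy2", "Energy1"), ("exp-", ("/", "Length1", "Length2")))
parent2 = ("/", ("+", "Energy2", "Energy1"),    ("ln",   ("/", "Length1", "Length2")))

def nodes(tree, path=()):
    """Enumerate (path, subtree) pairs for every node of the tree."""
    yield path, tree
    if isinstance(tree, tuple):
        for i, child in enumerate(tree[1:], start=1):
            yield from nodes(child, path + (i,))

def replace(tree, path, subtree):
    """Return a copy of `tree` with the node at `path` replaced by `subtree`."""
    if not path:
        return subtree
    i = path[0]
    return tree[:i] + (replace(tree[i], path[1:], subtree),) + tree[i + 1:]

def crossover(t1, t2):
    """Swap one randomly chosen subtree between the two parents."""
    p1, s1 = random.choice(list(nodes(t1)))
    p2, s2 = random.choice(list(nodes(t2)))
    return replace(t1, p1, s2), replace(t2, p2, s1)

child1, child2 = crossover(parent1, parent2)
print(child1)
print(child2)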
Model selection: Pareto front
[Scatter plots of candidate models: objective 1 vs. objective 2; here, complexity (depth of the tree) vs. accuracy (RMSE).]
Multi-objective optimization: points on the Pareto front are such that no other point can be found that simultaneously improves all the objective functions.
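A minimal sketch of how the non-dominated (Pareto-optimal) models can be extracted from a list of (complexity, RMSE) pairs, both to be minimized (the numbers are made up):

# Candidate models as (complexity, RMSE) pairs; both objectives are minimized.
models = [(1, 0.90), (2, 0.55), (3, 0.60), (4, 0.30), (5, 0.32), (6, 0.28)]

def dominates(a, b):
    """a dominates b if a is no worse in every objective and strictly better in at least one."""
    return all(x <= y for x, y in zip(a, b)) and any(x < y for x, y in zip(a, b))

pareto_front = [m for m in models if not any(dominates(other, m) for other in models)]
print(sorted(pareto_front))   # [(1, 0.9), (2, 0.55), (4, 0.3), (6, 0.28)]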
A famous example: EUREQA
Distilling Free-Form Natural Laws from Experimental Data, M. Schmidt and H. Lipson, Science 324, 5923 (2009). EUREQA: genetic-programming software.
EUREQA: Pareto front
In general, with symbolic regression:
● If the exact equation is within reach of the searching/optimizing algorithm, it is found. For other powerful ML methods (e.g., kernel regression, regression trees and forests, deep learning), this is not the case.
● The few fitting parameters yield stability with respect to noise (low complexity → no overfitting).
Compressing signals
Compressed sensing
Feature selection/identification vs. feature extraction
Feature selection: selection of a subset among the given features (a code sketch contrasting the first and third approach follows below)
● Filters: univariate ranking, i.e., each feature is ranked individually against the property
● Wrappers: search strategies over feature subsets, e.g., genetic algorithms
● Embedded: (non-stochastic) optimization of an objective function, e.g., regularized regression, decision trees
Feature extraction: new (fewer) features are functions (e.g., linear combinations) of potentially all given features.
● Dimension reduction
● Autoencoders
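A minimal sketch contrasting a filter (univariate ranking) with an embedded method (L1-regularized regression) on synthetic data, using scikit-learn (illustrative only; all values are made up):

import numpy as np
from sklearn.feature_selection import f_regression
from sklearn.linear_model import Lasso

rng = np.random.default_rng(1)
X = rng.normal(size=(100, 20))                                    # 100 samples, 20 candidate features
y = 3.0 * X[:, 2] - 1.5 * X[:, 7] + 0.1 * rng.normal(size=100)    # only features 2 and 7 matter

# Filter: rank each feature individually against the property
scores, _ = f_regression(X, y)
print("top-ranked features (filter):", np.argsort(scores)[::-1][:2])

# Embedded: L1-regularized regression selects features while fitting the model
lasso = Lasso(alpha=0.1).fit(X, y)
print("selected features (embedded):", np.flatnonzero(np.abs(lasso.coef_) > 1e-3))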
Compressed sensing
D.L. Donoho, IEEE Trans. Inf. Theory (2006), DOI: 10.1109/TIT.2006.871582
E.J. Candès, J. Romberg, T. Tao, IEEE Trans. Inf. Theory (2006), DOI: 10.1109/TIT.2005.862083
R. Tibshirani, J. Royal Stat. Soc. B (1996), DOI: 10.1111/j.2517-6161.1996.tb02080.x
LASSO: Least Absolute Shrinkage and Selection Operator
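As a reminder (standard form, not reproduced from the slide), LASSO finds the coefficient vector \mathbf{c} that minimizes

\[
\underset{\mathbf{c}}{\operatorname{argmin}}\;
\left\lVert \mathbf{P} - \mathbf{D}\,\mathbf{c} \right\rVert_2^2
\;+\; \lambda \left\lVert \mathbf{c} \right\rVert_1 ,
\]

where \mathbf{P} is the vector of property values, \mathbf{D} the matrix of candidate features, and \lambda the regularization strength that controls the sparsity of \mathbf{c}.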
Compressed sensing
D.L. Donoho, IEEE Trans. Inf. Theory (2006), DOI: 10.1109/TIT.2006.871582
E.J. Candès, J. Romberg, T. Tao, IEEE Trans. Inf. Theory (2006), DOI: 10.1109/TIT.2005.862083
Recovery is possible when the number of observations M is sufficiently large compared with the sparsity Ω of the signal expressed in a basis of N features (N: #features, M: #observations, Ω: sparsity).
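The slide shows the condition as a formula, which is not reproduced here; in the standard compressed-sensing literature (Donoho; Candès, Romberg, Tao) it reads, up to a constant C,

\[
M \;\geq\; C\,\Omega\,\log\!\left(\frac{N}{\Omega}\right),
\]

i.e., the number of measurements needs to grow only linearly with the sparsity Ω and logarithmically with the number of features N.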
Compressed sensing, or “sparse recovery”, enables the recovery of a sparse signal from very few, non-adaptive measurements.
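A minimal numerical sketch of this statement (illustrative; it uses scikit-learn's Lasso as the sparse-recovery solver, and all dimensions and values are made up):

import numpy as np
from sklearn.linear_model import Lasso

rng = np.random.default_rng(0)
N, M = 200, 40                              # many candidate components, few measurements
D = rng.normal(size=(M, N))                 # random, non-adaptive measurement (sensing) matrix
c_true = np.zeros(N)
c_true[[5, 42, 117]] = [1.5, -2.0, 0.7]     # sparse signal: only three non-zero components
y = D @ c_true                              # the few measurements

lasso = Lasso(alpha=0.05, max_iter=50_000).fit(D, y)
top = np.argsort(np.abs(lasso.coef_))[::-1][:3]
print("true support:                ", [5, 42, 117])
print("largest recovered components:", sorted(top.tolist()))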
Bonus slide: suggested literature
Notable examples of other-than-SISSO compressed sensing applied to materials science. Strictly speaking, in these examples LASSO is the applied method; LASSO and compressed sensing are often taken to be equivalent, whereas compressed sensing is the broader framework and includes LASSO as one possible solution protocol.
V. Ozoliņš, R. Lai, R. Caflisch, S. Osher, PNAS (2013), DOI: 10.1073/pnas.1318679110
L.J. Nelson, G.L.W. Hart, F. Zhou, V. Ozoliņš, PRB (2013), DOI: 10.1103/PhysRevB.87.035125
L.J. Nelson, V. Ozoliņš, C.S. Reese, F. Zhou, G.L.W. Hart, PRB (2013), DOI: 10.1103/PhysRevB.88.155105
F. Zhou, W. Nielson, Y. Xia, V. Ozoliņš, PRL (2014), DOI: 10.1103/PhysRevLett.113.185501