Efficient Hyperparameter Optimization of Deep Learning Algorithms using Deterministic RBF Surrogates

Ilija Ilievski (a), Taimoor Akhtar (b), Jiashi Feng (c), Christine Annette Shoemaker (b)
[email protected] [email protected] [email protected] [email protected]
a) Graduate School for Integrative Sciences and Engineering; b) Industrial and Systems Engineering; c) Electrical and Computer Engineering

➢ In short
Deep learning algorithms are powerful but very sensitive to their many hyperparameters: the number of layers and nodes, the learning rate, the weight initialization, and so on. Optimizing the validation error with respect to the hyperparameters involves minimizing a highly multimodal and expensive function in high dimensions. We propose an algorithm that matches the performance of state-of-the-art hyperparameter optimization algorithms while using up to 6 times fewer evaluations.

➢ Details
We circumvent the expensive evaluation with a deterministic RBF surrogate defined as:
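A plausible form of this surrogate, assuming the cubic RBF interpolant with a linear polynomial tail that is standard in RBF surrogate optimization (the kernel choice and tail are assumptions, not recovered from the poster):

    S_n(x) = \sum_{i=1}^{n} \lambda_i \, \phi(\lVert x - x_i \rVert) + b^\top x + a,  \qquad \phi(r) = r^3,

where the coefficients \lambda_i, b, a are obtained by solving the standard RBF interpolation system on the n evaluated points (x_i, f(x_i)).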
We tackle the high dimensionality by reducing the probability φ of searching along each dimension:
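A plausible schedule for this probability, assuming HORD inherits it unchanged from DYCORS (Regis and Shoemaker, 2013); the exact constants are an assumption:

    \varphi_n = \varphi_0 \left[ 1 - \frac{\ln(n - n_0 + 1)}{\ln(N_{max} - n_0)} \right],  \qquad \varphi_0 = \min(20/D, 1),

where D is the number of hyperparameters, n the current evaluation count, n_0 the number of initial evaluations, and N_{max} the evaluation budget.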
We escape local minima by computing a compound score for each candidate point. The score is a dynamically weighted average of a metric based on the distance from the best found solution and a metric based on the candidate's surrogate value (the distance metric, surrogate value metric, and final candidate score below).
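A plausible reconstruction of the three scoring formulas listed on the poster, assuming the DYCORS-style scaling this family of methods uses; the exact normalizations are assumptions:

    \Delta(t_i) = \min_{x \in X_n} \lVert t_i - x \rVert  (distance to already evaluated points)
    Distance metric:  V^{dm}(t_i) = (\Delta_{max} - \Delta(t_i)) / (\Delta_{max} - \Delta_{min})
    Surrogate value metric:  V^{ev}(t_i) = (S_n(t_i) - s_{min}) / (s_{max} - s_{min})
    Final candidate score:  W_n(t_i) = w_n V^{ev}(t_i) + (1 - w_n) V^{dm}(t_i),

with s_{min}, s_{max} (respectively \Delta_{min}, \Delta_{max}) the extreme surrogate values (distances) over the m candidates, and the weight w_n varied over iterations so the search alternates between exploiting the surrogate and exploring far from evaluated points.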
➢ Results
[Four plots: optimization of 6 MLP hyperparameters, 8 CNN hyperparameters, 15 CNN hyperparameters, and 19 CNN hyperparameters.]
➢ Algorithm
HORD: Hyperparameter Optimization using deterministic RBF surrogate and DYCORS
Input: n0 initial evaluations, Nmax maximum evaluations, m candidate points
  Sample n0 points X_n0 := {x_i}
  Populate A_n0 := {X_n0, f(X_n0)}
  while n < Nmax:
      update surrogate S_n with A_n := {X_n, f(X_n)}
      set x_best := argmin{f(X_n)}
      sample m candidate points t_i around x_best according to probabilities φ_n
      compute V^ev, V^dm, and W_n for all t_i
      set x* := argmin{W_n(t_i)}
      set A_(n+1) := A_n ∪ {(x*, f(x*))}
  end while
  return x_best
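A minimal runnable sketch of this loop in Python. The surrogate here is scipy's cubic RBFInterpolator; the perturbation size (0.2 of each range), the weight cycle, and the toy objective are illustrative assumptions, not the authors' exact settings:

import numpy as np
from scipy.interpolate import RBFInterpolator

def hord(f, lb, ub, n0=8, n_max=50, m=100, seed=0):
    # lb, ub: numpy arrays with the lower/upper bound of each hyperparameter.
    rng = np.random.default_rng(seed)
    d = len(lb)
    # Initial design: n0 random points (the paper uses a space-filling design).
    X = rng.uniform(lb, ub, size=(n0, d))
    y = np.array([f(x) for x in X])
    weights = [0.3, 0.5, 0.8, 0.95]  # assumed cycle for the weight w_n
    for n in range(n0, n_max):
        # Fit the deterministic RBF surrogate S_n on all evaluated points.
        surrogate = RBFInterpolator(X, y, kernel="cubic", degree=1)
        x_best = X[np.argmin(y)]
        # DYCORS-style probability of perturbing each dimension, decaying in n.
        phi = min(20 / d, 1.0) * (1 - np.log(n - n0 + 1) / np.log(n_max - n0))
        mask = rng.random((m, d)) < max(phi, 1.0 / d)
        # m candidates: Gaussian perturbations of x_best on the chosen dimensions.
        step = rng.normal(0.0, 0.2 * (ub - lb), size=(m, d))
        T = np.clip(np.where(mask, x_best + step, x_best), lb, ub)
        # Score candidates: scaled surrogate value vs. scaled distance
        # from previously evaluated points, combined with weight w_n.
        s = surrogate(T)
        v_ev = (s - s.min()) / (np.ptp(s) + 1e-12)
        dist = np.min(np.linalg.norm(T[:, None] - X[None], axis=2), axis=1)
        v_dm = (dist.max() - dist) / (np.ptp(dist) + 1e-12)
        w = weights[(n - n0) % len(weights)]
        x_star = T[np.argmin(w * v_ev + (1 - w) * v_dm)]
        # One expensive evaluation per iteration; grow the archive A_n.
        X = np.vstack([X, x_star])
        y = np.append(y, f(x_star))
    return X[np.argmin(y)], y.min()

# Toy usage: a cheap quadratic stands in for the expensive validation error.
lb, ub = np.zeros(6), np.ones(6)
x_opt, f_opt = hord(lambda x: float(np.sum((x - 0.3) ** 2)), lb, ub)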
➢ Comparison
Evaluations required by HORD to match the best error found by state-of-the-art hyperparameter optimization algorithms.
➢ More: bit.ly/hord-aaai. Supplement and more at: ilija139.github.io