Transcript
Page 1: Efficient Hyperparameter Optimization of Deep Learning Algorithms using Deterministic RBF Surrogates

We circumvent the expensive evaluation with a deterministic RBF surrogate defined as:

S_n(x) = Σ_{i=1..n} λ_i ‖x − x_i‖³ + bᵀx + a

i.e., a cubic RBF interpolant with a linear polynomial tail, fit to the n points evaluated so far.
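A minimal sketch of fitting and evaluating such a surrogate with numpy, assuming the cubic kernel and linear tail above (function names are illustrative, not from the authors' code):

import numpy as np

def fit_rbf_surrogate(X, y):
    # Fit S_n(x) = sum_i lam_i * ||x - x_i||^3 + b.x + a to the
    # n evaluated points X (n, d) with observed values y (n,).
    n, d = X.shape
    Phi = np.linalg.norm(X[:, None, :] - X[None, :, :], axis=-1) ** 3
    P = np.hstack([X, np.ones((n, 1))])  # linear polynomial tail block
    # Standard RBF interpolation system: [Phi P; P^T 0] [lam; c] = [y; 0]
    A = np.block([[Phi, P], [P.T, np.zeros((d + 1, d + 1))]])
    rhs = np.concatenate([y, np.zeros(d + 1)])
    sol = np.linalg.solve(A, rhs)
    return sol[:n], sol[n:]  # lam (n,), c = [b, a] (d+1,)

def eval_rbf_surrogate(T, X, lam, c):
    # Evaluate the surrogate at candidate points T (m, d).
    Phi = np.linalg.norm(T[:, None, :] - X[None, :, :], axis=-1) ** 3
    return Phi @ lam + T @ c[:-1] + c[-1]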

We tackle the high dimensionality by reducing the probability φ_n of perturbing each dimension as the search progresses:

φ_n = φ_0 · [1 − ln(n − n_0 + 1) / ln(N_max − n_0)], with φ_0 = min(20/D, 1) for D hyperparameters
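A sketch of this decay schedule and of candidate generation by perturbing a random subset of the dimensions of x_best, in the spirit of DYCORS (the constants and names here are assumptions, not the authors' code):

import numpy as np

def perturbation_probability(n, n0, n_max, dim):
    # phi_n starts at phi_0 = min(20/D, 1) and decays logarithmically,
    # so later candidates differ from x_best in fewer dimensions.
    phi_0 = min(20.0 / dim, 1.0)
    return phi_0 * (1.0 - np.log(n - n0 + 1.0) / np.log(n_max - n0))

def sample_candidates(x_best, phi_n, sigma, m, bounds, rng):
    # Perturb each coordinate of x_best with probability phi_n.
    dim = x_best.size
    mask = rng.random((m, dim)) < phi_n
    none = ~mask.any(axis=1)  # guarantee at least one perturbed dimension
    mask[none, rng.integers(0, dim, none.sum())] = True
    cand = x_best + mask * rng.normal(0.0, sigma, (m, dim))
    return np.clip(cand, bounds[:, 0], bounds[:, 1])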

We escape local minima by computing a compound score for each candidate point. The score is a dynamic weighted average of a metric based on the distance from the best found solution and a metric based on the candidate's surrogate value (both defined below).
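A sketch of this compound score; the metric formulas appear in full at the end of the transcript, and the names here are illustrative:

import numpy as np

def compound_score(s_vals, cand, X_eval, w_n):
    # Surrogate value metric V_ev: predictions normalized to [0, 1].
    s_min, s_max = s_vals.min(), s_vals.max()
    v_ev = (s_vals - s_min) / (s_max - s_min) if s_max > s_min else np.ones_like(s_vals)
    # Distance metric V_dm: low for candidates far from evaluated points,
    # which rewards exploration and helps escape local minima.
    delta = np.linalg.norm(cand[:, None, :] - X_eval[None, :, :], axis=-1).min(axis=1)
    d_min, d_max = delta.min(), delta.max()
    v_dm = (d_max - delta) / (d_max - d_min) if d_max > d_min else np.ones_like(delta)
    # Dynamic weighted average; the next point to evaluate is the argmin.
    return w_n * v_ev + (1.0 - w_n) * v_dm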

Efficient Hyperparameter Optimization of Deep Learning Algorithms using Deterministic RBF Surrogates

Ilija Ilievski (a), Taimoor Akhtar (b), Jiashi Feng (c), Christine Annette Shoemaker (b)


➢ In short

➢ More: bit.ly/hord-aaai · Supplement and more at: ilija139.github.io

a) Graduate School for Integrative Sciences and Engineering  b) Industrial and Systems Engineering  c) Electrical and Computer Engineering

➢ Details

➢ Results

[Plots: Optimization of 19 CNN hyperparameters · Optimization of 8 CNN hyperparameters · Optimization of 15 CNN hyperparameters · Optimization of 6 MLP hyperparameters]

➢ Algorithm

HORD: Hyperparameter Optimization using deterministic RBF surrogate and DYCORS
Input: n_0 initial evaluations, N_max maximum evaluations, m candidates per iteration

Sample n_0 points X_{n_0} ≔ {x_i}
Populate A_{n_0} ≔ {X_{n_0}, f(X_{n_0})}
while n < N_max:
    update surrogate S_n with A_n ≔ {X_n, f(X_n)}
    set x_best ≔ argmin{f(X_n)}
    sample m candidate points t_i around x_best according to probabilities φ_n
    compute V_ev, V_dm, and W_n for all t_i
    set x* ≔ argmin{W_n(t_i)}
    set A_{n+1} ≔ A_n ⋃ {(x*, f(x*))}
end while
return x_best
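Putting the pieces together, a compact sketch of the loop above, reusing the illustrative helpers from the earlier sketches (f is the expensive train-and-validate objective; the weight pattern and sigma are assumptions):

import numpy as np

def hord(f, bounds, n0, n_max, m, sigma=0.2, weights=(0.3, 0.5, 0.8, 0.95)):
    dim = bounds.shape[0]
    rng = np.random.default_rng()
    X = rng.uniform(bounds[:, 0], bounds[:, 1], (n0, dim))  # initial design
    y = np.array([f(x) for x in X])
    for n in range(n0, n_max):
        lam, c = fit_rbf_surrogate(X, y)                  # update surrogate S_n
        x_best = X[np.argmin(y)]
        phi_n = perturbation_probability(n, n0, n_max, dim)
        cand = sample_candidates(x_best, phi_n, sigma, m, bounds, rng)
        s_vals = eval_rbf_surrogate(cand, X, lam, c)
        w_n = weights[(n - n0) % len(weights)]            # cycle the score weight
        x_star = cand[np.argmin(compound_score(s_vals, cand, X, w_n))]
        X = np.vstack([X, x_star])                        # A_{n+1} = A_n U {(x*, f(x*))}
        y = np.append(y, f(x_star))
    return X[np.argmin(y)], y.min()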

➢ Comparison

[Table: Evaluations required by HORD to match the best error found by state-of-the-art hyperparameter optimization algorithms]

Deep learning algorithms are powerful but are very sensitive to their many hyperparameters: number of layers and nodes, learning rate, weight initialization, and so on. Optimizing the validation error with respect to the hyperparameters requires minimizing a highly multimodal and expensive function in high dimensions.

We propose an algorithm that matches the performance of state-of-the-art hyperparameter optimization algorithms while using up to 6 times fewer evaluations.

Distance metric: V_dm(t) = (Δ_max − Δ(t)) / (Δ_max − Δ_min), where Δ(t) is the minimum distance from candidate t to the previously evaluated points

Surrogate value metric: V_ev(t) = (S_n(t) − S_min) / (S_max − S_min), the surrogate value normalized over the m candidates

Final candidate score: W_n(t) = w_n · V_ev(t) + (1 − w_n) · V_dm(t), with the weight w_n cycled through a fixed pattern such as ⟨0.3, 0.5, 0.8, 0.95⟩
