Reading Report on “Online Optimization with Gradual Variations”, the Best Student Paper from COLT 2012, by C. Chiang, T. Yang, et al. Mengfei Cao, April 16.


Page 1:

Reading Report on “Online Optimization with Gradual Variations”, the Best Student Paper from COLT 2012, by C. Chiang, T. Yang, et al.

Mengfei Cao

April 16

Page 2:

Introduction

• An example of an online convex optimization model

Page 3:

Introduction

• An example of an online convex optimization model (continued)

Page 4:

Introduction

• The General Online Convex Optimization Model

Minimize: regret $= \sum_{t=1}^{T} f_t(x_t) - \min_{x \in X} \sum_{t=1}^{T} f_t(x)$

Regret Bound: using the Gradient Descent algorithm, achieve $O(\sqrt{TN})$, where $N$ is the dimensionality of the convex set.

Online Convex Optimization (OCO)

Input: a convex set $X$
for $t = 1, 2, \dots$
    predict a vector $x_t$ from $X$
    receive a convex loss function $f_t: X \to \mathbb{R}$
    suffer loss $f_t(x_t)$
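To make the protocol concrete, here is a minimal sketch of projected online gradient descent, the algorithm behind the bound above; the L2-ball feasible set, the step size, and all names are illustrative assumptions, not the paper's specification.

```python
import numpy as np

def project(x, radius=1.0):
    """Euclidean projection onto an L2 ball of the given radius
    (a stand-in for a general convex set X)."""
    norm = np.linalg.norm(x)
    return x if norm <= radius else x * (radius / norm)

def online_gradient_descent(grad_fns, dim, eta=0.1):
    """Projected OGD: x_{t+1} = Proj_X(x_t - eta * grad f_t(x_t)).
    grad_fns: one gradient oracle per round t."""
    x = np.zeros(dim)              # x_1: any point in X
    history = []
    for grad_f in grad_fns:
        history.append(x.copy())   # predict x_t
        g = grad_f(x)              # observe f_t, query its gradient at x_t
        x = project(x - eta * g)   # descend, then project back into X
    return history
```

With a step size on the order of $1/\sqrt{T}$, this scheme attains the square-root-in-$T$ regret quoted above.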

Page 5:

Scenario No. 1:

• Online Linear Optimization

$f_t(x_t) = \langle x_t, z_t \rangle$, for some $z_t$

Minimize: regret $= \sum_{t=1}^{T} \langle x_t, z_t \rangle - \min_{x \in X} \sum_{t=1}^{T} \langle x, z_t \rangle$

Regret Bound: using the Gradient Descent algorithm, achieve $O(\sqrt{TN})$, where $N$ is the dimension of $x_t$.

Online Linear Optimization

Input: a convex set $X$
for $t = 1, 2, \dots$
    predict a vector $x_t$ from $X$
    receive a linear loss function $f_t(x) = \langle x, z_t \rangle$
    suffer loss $f_t(x_t)$

Page 6:

Scenario No. 2:

• Prediction with Expert Advice

$f_t(x_t) = \langle x_t, z_t \rangle$, for some $z_t$, and $\forall x \in X: \|x\|_1 = 1$, $x \ge 0$ (i.e., $X$ is the probability simplex over the experts)

Minimize: regret $= \sum_{t=1}^{T} \langle x_t, z_t \rangle - \min_{x \in X} \sum_{t=1}^{T} \langle x, z_t \rangle$

Regret Bound: using the Multiplicative Update algorithm, achieve $O(\sqrt{T \ln N})$, where $N$ is the number of experts.

Prediction with Expert Advice

Input: the probability simplex $X$
for $t = 1, 2, \dots$
    predict a vector $x_t$ from $X$
    receive a loss vector $z_t$
    suffer loss $\langle x_t, z_t \rangle$
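A minimal sketch of the multiplicative update (Hedge-style) rule used in this scenario; the learning rate and all names are illustrative assumptions.

```python
import numpy as np

def multiplicative_weights(loss_vectors, n_experts, eta=0.1):
    """Multiplicative update: w_{t+1,i} = w_{t,i} * exp(-eta * z_{t,i}).
    The prediction x_t is the normalized weight vector on the simplex."""
    w = np.ones(n_experts)                 # uniform initial weights
    history = []
    for z in loss_vectors:
        x = w / w.sum()                    # predict x_t on the simplex
        history.append(x)
        w *= np.exp(-eta * np.asarray(z))  # penalize each expert by its loss
    return history
```

Choosing $\eta$ on the order of $\sqrt{\ln N / T}$ yields the $O(\sqrt{T \ln N})$ bound quoted above.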

Page 7:

Scenario No. 3:

• Online Convex Optimization*: each loss function is λ-smooth, i.e., its gradient is λ-Lipschitz.

Minimize: regret $= \sum_{t=1}^{T} f_t(x_t) - \min_{x \in X} \sum_{t=1}^{T} f_t(x)$

Regret Bound: using the Gradient Descent algorithm, achieve $O(\sqrt{TN})$, where $N$ is the dimension of $x_t$.

Online Convex Optimization*

Input: a convex set $X$
for $t = 1, 2, \dots$
    predict a vector $x_t$ from $X$
    receive a λ-smooth convex loss function $f_t: X \to \mathbb{R}$
    suffer loss $f_t(x_t)$

Page 8:

Scenario No. 4:

• Online Strictly Convex Optimization: β-convex loss functions

Minimize: regret $= \sum_{t=1}^{T} f_t(x_t) - \min_{x \in X} \sum_{t=1}^{T} f_t(x)$

Regret Bound: using the Online Newton Step algorithm, achieve $O(N \log T)$, where $N$ is the dimension of $x_t$.

Online Strictly Convex Optimization

Input: a convex set $X$
for $t = 1, 2, \dots$
    predict a vector $x_t$ from $X$
    receive a β-convex loss function $f_t: X \to \mathbb{R}$
    suffer loss $f_t(x_t)$
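For reference, a bare-bones sketch of Online Newton Step, the algorithm behind the $O(N \log T)$ bound; the parameter defaults and the omission of the generalized ($A_t$-norm) projection back onto $X$ are simplifying assumptions.

```python
import numpy as np

def online_newton_step(grad_fns, dim, beta=1.0, eps=1.0):
    """Online Newton Step sketch:
      A_t = A_{t-1} + g_t g_t^T,  x_{t+1} = x_t - (1/beta) * A_t^{-1} g_t.
    The full algorithm projects x_{t+1} onto X in the A_t-norm; this sketch
    assumes the iterates stay feasible."""
    x = np.zeros(dim)
    A = eps * np.eye(dim)                     # regularized second-moment matrix
    history = []
    for grad_f in grad_fns:
        history.append(x.copy())              # predict x_t
        g = grad_f(x)
        A += np.outer(g, g)                   # rank-one update of A_t
        x = x - np.linalg.solve(A, g) / beta  # Newton-style step
    return history
```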

Page 9:

Motivation: Variations in cost functions can be small

• What else could the regret bound depend on, other than $T$?

• Whatever that quantity is, it should be small in benign environments, so that the bound is tight and easy to control.

• Variation-bounded regret (previous work, Hazan & Kale): for online linear optimization, regret $O(\sqrt{V_T})$, where the variation is $V_T = \sum_{t=1}^{T} \|z_t - \mu_T\|^2$ with $\mu_T = \frac{1}{T}\sum_{t=1}^{T} z_t$ the mean of the loss vectors.

Page 10:

Motivation: Variations May Not Be Reliable

• Example: in some cases the consecutive changes between loss functions are smooth and small, yet the loss vectors drift far from their overall mean, so the variation $V_T$ stays large even though each step is tiny.

• Deviation-bounded regret (proposed work): bound the regret by the deviation $D_T = \sum_{t=2}^{T} \max_{x \in X} \|\nabla f_t(x) - \nabla f_{t-1}(x)\|^2$, the total squared change between consecutive loss functions.

Page 11:

Motivation: Deviations Can be Very Small

• Example (Online Linear Optimization): if the loss vectors drift gradually, say $\|z_t - z_{t-1}\| \le \delta$ for small $\delta$, then $D_T \le \delta^2 T$, while the variation $V_T$ around the mean can still grow like $T$; a deviation-based bound $O(\sqrt{D_T}) = O(\delta \sqrt{T})$ then improves on both $O(\sqrt{T})$ and $O(\sqrt{V_T})$.

Page 12:

A Unified Algorithm

The unified meta-algorithm is a mirror-descent scheme built on a Bregman divergence, $B_R(x, y) = R(x) - R(y) - \langle \nabla R(y), x - y \rangle$, induced by a strictly convex regularizer $R$: taking $R = \frac{1}{2}\|\cdot\|_2^2$ recovers gradient-descent-style updates, and taking $R$ = negative entropy recovers multiplicative updates. A sketch of its two-step update follows.
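This is a minimal sketch of the two-step, extra-gradient-style update in the Euclidean case ($B_R(x,y) = \frac{1}{2}\|x-y\|^2$): play optimistically using the previous round's gradient as a guess, then correct an auxiliary point with the observed gradient. The feasible set, step size, and names are illustrative assumptions.

```python
import numpy as np

def project(x, radius=1.0):
    """Euclidean projection onto an L2 ball (a stand-in for X)."""
    n = np.linalg.norm(x)
    return x if n <= radius else x * (radius / n)

def meta_euclidean(grad_fns, dim, eta=0.1):
    """Two-step update with the squared-Euclidean Bregman divergence:
      x_t     = Proj_X(z_hat - eta * g_{t-1})   # optimistic step, reuse last gradient
      z_hat_t = Proj_X(z_hat - eta * g_t)       # correction with the true gradient
    The regret of this scheme scales with sum_t ||g_t - g_{t-1}||^2,
    the deviation, rather than with T."""
    z_hat = np.zeros(dim)
    g_prev = np.zeros(dim)                  # no gradient exists before round 1
    history = []
    for grad_f in grad_fns:
        x = project(z_hat - eta * g_prev)   # predict x_t
        history.append(x)
        g = grad_f(x)                       # observe f_t's gradient at x_t
        z_hat = project(z_hat - eta * g)    # update the auxiliary point
        g_prev = g
    return history
```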

Page 13:

Proof for Regret Bound in Scenarios No. 1 and No. 2 (linear)
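The proof on this slide was presented as figures. As a hedged reconstruction following the standard analysis of this two-step update, the final inequality for the linear case has the following shape.

```latex
% Linear case f_t(x) = <x, z_t>, Euclidean divergence B(x, y) = ||x - y||^2 / 2.
% Against any comparator u in X, the two-step update satisfies roughly:
\mathrm{Regret}_T \;\le\; \frac{\|u - \hat z_0\|^2}{2\eta}
   \;+\; \frac{\eta}{2} \sum_{t=1}^{T} \|z_t - z_{t-1}\|^2 .
% Tuning eta on the order of 1/sqrt(D_T), with D_T = sum_t ||z_t - z_{t-1}||^2,
% gives regret O(sqrt(D_T)); the ln N factor in Scenario No. 2 comes from
% replacing the Euclidean divergence with the entropic one on the simplex.
```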

Page 14:

Proof for Regret Bound in Scenario No. 3 (λ-smooth)
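The proof itself was a figure; for context, λ-smoothness is the property the analysis relies on to compare consecutive gradients evaluated at different points.

```latex
% λ-smoothness of f_t: the gradient is λ-Lipschitz on X.
\|\nabla f_t(x) - \nabla f_t(y)\| \;\le\; \lambda \,\|x - y\| \qquad \forall\, x, y \in X
```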

Page 15:

Proof for Regret Bound in Scenario No. 4 (β-convex, λ-smooth)
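Again the proof was a figure; as a reminder (stated as in the usual Online Newton Step literature, an assumption about this paper's exact definition), β-convexity is the curvature condition below, and it is what upgrades square-root regret to logarithmic regret.

```latex
% β-convexity, as commonly used in Online Newton Step analyses:
f_t(x) \;\ge\; f_t(y) + \langle \nabla f_t(y),\, x - y \rangle
       + \frac{\beta}{2}\, \langle \nabla f_t(y),\, x - y \rangle^2
\qquad \forall\, x, y \in X
```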

Page 16:

Conclusion

• By introducing the notion of Lp-deviations, the work derives tighter regret bounds, proved for four specific online learning scenarios via a single unified algorithm, META. At a high level, the bounds improve whenever the environment follows a stable pattern, that is, when the adversary is gentle.

• Technically, the application of Bregman projections is vital.
