
ACM 113: Lecture 1 - Stanford University


Agenda

- What is a mathematical optimization problem?
- Linear and nonlinear programming
- First examples
- Least squares
- Convex optimization
- Nonlinear optimization
- What is this course about?


What is a mathematical optimization problem?

A mathematical optimization problem (or simply optimization problem) has the form

(P)   minimize   f0(x)
      subject to fi(x) ≤ bi,   i = 1, …, m

- x ∈ R^n is the optimization variable
- f0 : R^n → R is the objective function
- the functions fi are the constraints

A vector x⋆ is a solution to (P) if, for all x such that

fi(x) ≤ bi   for all i,

we have f0(x⋆) ≤ f0(x).
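To make the definition concrete, here is a small numerical sketch. The problem instance and all names are invented for illustration: a two-variable instance of (P) with one constraint, plus a sampling spot-check that a candidate x⋆ satisfies f0(x⋆) ≤ f0(x) over feasible points.

```python
import numpy as np

# Toy instance of (P), with invented data:
# minimize f0(x) = (x1 - 1)^2 + (x2 - 2)^2 subject to f1(x) = x1 + x2 <= 2.
def f0(x):
    return (x[0] - 1.0) ** 2 + (x[1] - 2.0) ** 2

def is_feasible(x):
    return x[0] + x[1] <= 2.0

# Candidate solution: the projection of the unconstrained minimizer (1, 2)
# onto the half-plane x1 + x2 <= 2.
x_star = np.array([0.5, 1.5])

# Spot-check the definition: f0(x_star) <= f0(x) for sampled feasible x.
rng = np.random.default_rng(0)
samples = rng.uniform(-3.0, 3.0, size=(10_000, 2))
feasible = samples[samples.sum(axis=1) <= 2.0]
assert all(f0(x_star) <= f0(x) + 1e-12 for x in feasible)
```

Sampling of course only falsifies, it does not prove optimality; the point is just to exercise the definition.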



Linear and nonlinear programming

Linear programming: the objective and the constraints are linear. A function f is linear if

f(x + λy) = f(x) + λ f(y)

for all x, y ∈ R^n and λ ∈ R.

Nonlinear programming: the objective or one of the constraints is not linear.

Convex programming: the objective functional and the constraint functionals are convex. A function f is convex if

f(λx + (1 − λ)y) ≤ λ f(x) + (1 − λ) f(y)

for all x, y ∈ R^n and 0 ≤ λ ≤ 1.

In this course, the emphasis is on convex programming/optimization.
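Both definitions are easy to test numerically. In this sketch (the function choices are mine, for illustration), a linear f satisfies the convexity inequality with equality, while a convex nonlinear f satisfies it with slack:

```python
import numpy as np

def chord_gap(f, x, y, lam):
    # Convexity requires f(lam*x + (1-lam)*y) <= lam*f(x) + (1-lam)*f(y);
    # this gap is nonnegative exactly when the inequality holds.
    return lam * f(x) + (1 - lam) * f(y) - f(lam * x + (1 - lam) * y)

def f_linear(x):
    # Linear, hence also convex: the inequality holds with equality.
    return 3.0 * x

rng = np.random.default_rng(1)
for _ in range(1000):
    x, y = rng.uniform(-5.0, 5.0, size=2)
    lam = rng.uniform(0.0, 1.0)
    assert chord_gap(np.exp, x, y, lam) >= -1e-12   # exp is convex, not linear
    assert abs(chord_gap(f_linear, x, y, lam)) < 1e-9
```

This is the sense in which convexity is more general than linearity: linear functions sit exactly on the boundary of the convex inequality.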



Convexity is more general than linearity

[Figure: graph of a convex function f, showing the chord value λ f(x) + (1 − λ) f(y) lying above the function value f(λx + (1 − λ)y) between the points (x, f(x)) and (y, f(y)).]


Applications of mathematical optimization

Nearly all fields of science and engineering

Examples:
- Portfolio management
- Data fitting
- Regularization in signal processing


Examples of optimization problems

Least Squares

- No constraints
- Objective functional

  f0(x) = ‖b − Ax‖_2^2 = Σ_{i=1}^k (a_i^T x − b_i)^2

  where A ∈ R^{k×n} (typically n ≤ k) with rows a_1^T, …, a_k^T

- Solution

  (A^T A) x = A^T b  ⇒  x⋆ = (A^T A)^{-1} A^T b

  assuming A^T A is invertible (if not?)

- Many methods for solving least squares problems: fast and stable
- Mature technology
- Important and straightforward to recognize a least squares problem
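A minimal NumPy sketch of both solution routes, on random data chosen only for illustration: forming the normal equations directly, and calling `numpy.linalg.lstsq`, which uses an orthogonal-factorization route and answers the "if not?" question by returning a minimum-norm least squares solution when A^T A is singular.

```python
import numpy as np

rng = np.random.default_rng(0)
k, n = 50, 3                           # typically n <= k (overdetermined)
A = rng.standard_normal((k, n))
b = rng.standard_normal(k)

# Normal equations: (A^T A) x = A^T b -- valid when A^T A is invertible,
# i.e. when A has full column rank.
x_normal = np.linalg.solve(A.T @ A, A.T @ b)

# SVD-based solver: numerically more stable, and if A^T A is singular it
# returns the minimum-norm least squares solution instead of failing.
x_lstsq, *_ = np.linalg.lstsq(A, b, rcond=None)

assert np.allclose(x_normal, x_lstsq)
```

In practice the factorization-based solver is preferred over explicitly forming A^T A, which squares the condition number.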



Another example:

f0(x) = ‖b − Ax‖_2^2 + λ‖x‖_2^2

- Connected with statistical regularization (ridge regression)
- Trade-off between fit and the size of the x_j's
- E.g. a prior distribution on parameters in Bayesian statistics
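A sketch of the closed-form solution, on the same kind of invented data as before: adding λ‖x‖_2^2 changes the normal equations to (A^T A + λI)x = A^T b, which is invertible for every λ > 0, and larger λ shrinks the solution toward 0.

```python
import numpy as np

rng = np.random.default_rng(0)
k, n = 50, 3
A = rng.standard_normal((k, n))
b = rng.standard_normal(k)

def ridge(A, b, lam):
    # Minimizer of ||b - Ax||_2^2 + lam * ||x||_2^2 solves
    # (A^T A + lam * I) x = A^T b; for lam > 0 the matrix is always
    # invertible, even when A^T A alone is singular.
    return np.linalg.solve(A.T @ A + lam * np.eye(A.shape[1]), A.T @ b)

# The fit-vs-size trade-off: larger lam shrinks the solution norm.
norms = [np.linalg.norm(ridge(A, b, lam)) for lam in (0.0, 1.0, 100.0)]
assert norms[0] >= norms[1] >= norms[2]
```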


Linear programming

minimize   c^T x
subject to a_i^T x ≤ b_i,   i = 1, …, m

- No analytical solution
- Mature technology: fast and stable solvers
  - Simplex method (Dantzig, 1947)
  - Interior point methods (Karmarkar, 1984)

Using linear programming:
- Many problems can be recast as LPs
- Not as easy to recognize as least squares problems
- Example:

  minimize ‖b − Ax‖_∞ = minimize max_i |a_i^T x − b_i|
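"No analytical solution" does not mean no solution in practice; mature solvers are a library call away. A sketch using `scipy.optimize.linprog` on an invented two-variable LP (note that linprog imposes x ≥ 0 unless told otherwise, so the bounds are relaxed to match the formulation above):

```python
import numpy as np
from scipy.optimize import linprog

# Invented data for: minimize c^T x subject to a_i^T x <= b_i.
c = np.array([-1.0, -2.0])
A_ub = np.array([[1.0, 1.0],
                 [1.0, 0.0],
                 [0.0, 1.0]])
b_ub = np.array([3.0, 2.0, 2.0])

# linprog defaults to x >= 0; make the variables free to match (P).
res = linprog(c, A_ub=A_ub, b_ub=b_ub, bounds=[(None, None)] * 2)
assert res.success
x_star, optimal_value = res.x, res.fun
```

The solver (HiGHS under the hood in recent SciPy) returns the optimal vertex and value; no formula for them exists, only algorithms.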



minimize ‖b − Ax‖_∞ = minimize max_i |a_i^T x − b_i|

- Objective is the maximum absolute deviation, not the mean squared deviation
- Objective functional is not differentiable

Recast as

minimize   t
subject to a_i^T x − b_i ≤ t
           a_i^T x − b_i ≥ −t

Important to recognize LPs:

minimize   Σ_{j=1}^n |x_j|
subject to a_i^T x ≤ b_i,   i = 1, …, m

Is this an LP?
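The epigraph recast above is mechanical to implement. A sketch on random data: stack the variables as z = (x, t) and feed the 2k inequalities to `scipy.optimize.linprog`.

```python
import numpy as np
from scipy.optimize import linprog

rng = np.random.default_rng(0)
k, n = 30, 2
A = rng.standard_normal((k, n))
b = rng.standard_normal(k)

# Variables z = (x, t).  The recast constraints a_i^T x - b_i <= t and
# a_i^T x - b_i >= -t become A x - t*1 <= b and -A x - t*1 <= -b.
c = np.zeros(n + 1)
c[-1] = 1.0                        # objective: minimize t
ones = np.ones((k, 1))
A_ub = np.vstack([np.hstack([A, -ones]),
                  np.hstack([-A, -ones])])
b_ub = np.concatenate([b, -b])

res = linprog(c, A_ub=A_ub, b_ub=b_ub, bounds=[(None, None)] * (n + 1))
assert res.success
x, t = res.x[:n], res.x[-1]
# The optimal t equals the actual l-infinity residual of the optimal x.
assert np.isclose(t, np.abs(b - A @ x).max(), atol=1e-6)
```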



Convex optimization

minimize   f0(x)
subject to x ∈ C or fi(x) ≤ bi

with f0 and fi convex (or C a convex set)

Solving convex problems:
- In general, no analytical solution
- Many reliable and efficient algorithms exist
- An almost mature technology

Using convex optimization:
- Convex problems often go unrecognized
- Many problems can be formulated as convex problems
- Important to learn the skills to cast nonlinear programs into convex programs
- Important to distinguish between convex and nonconvex problems



Solving optimization problems

General optimization problem:
- Very difficult to solve
- Typical methods involve a compromise; e.g. extremely long (prohibitive) computation time, or a solution is not found

Exceptions:
- Least squares problems
- LPs
- Convex optimization problems

“In fact, the great watershed in optimization isn’t between linearity and nonlinearity, but convexity and nonconvexity.”

— Rockafellar, SIAM Review, 1993



Nonlinear optimization

1. Nonconvex objective

[Figure: a nonconvex function f(x) with several local minima.]


2. Nonconvex constraints

minimize   c^T x
subject to Ax ≤ b

- Easy
- Feasible for n ∼ 100,000

minimize   c^T x
subject to Ax ≤ b and x_j ∈ {0, 1}

- Extremely hard
- Infeasible for n ≥ 40?

Not much is known about general nonlinear problems; more of an ‘art’ than a science and technology.
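The gap between the two problems can be seen on a tiny invented instance: dropping x_j ∈ {0, 1} to 0 ≤ x_j ≤ 1 gives an easy LP whose optimal value lower-bounds the 0/1 problem, which here is small enough to brute-force. (The data below are illustrative only.)

```python
import numpy as np
from itertools import product
from scipy.optimize import linprog

# Invented data for: minimize c^T x s.t. Ax <= b, x_j in {0, 1}.
c = np.array([-2.0, -3.0, -1.0])
A_ub = np.array([[3.0, 4.0, 2.0]])
b_ub = np.array([5.0])

# Easy: the LP with 0 <= x_j <= 1 instead of x_j in {0, 1}.
relaxed = linprog(c, A_ub=A_ub, b_ub=b_ub, bounds=[(0, 1)] * 3)

# Hard in general: brute force over all 2^n binary vectors (fine for
# n = 3, hopeless for n >= 40).
best = min(c @ np.array(x) for x in product((0, 1), repeat=3)
           if (A_ub @ np.array(x) <= b_ub).all())

# The relaxation's optimal value lower-bounds the 0/1 optimum.
assert relaxed.success and relaxed.fun <= best + 1e-9
```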



Methods

Local optimization:
- Finds a local minimum
- Often gives no idea whether this is a global minimum, or how far it is from a global minimum
- Requires an initial guess
- Usually efficient methods

Global optimization:
- Worst-case complexity is exponential in the problem size
- Even small problems may not be practical
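A sketch of the local-method failure mode (the function is chosen by me for illustration): plain gradient descent on a one-dimensional nonconvex function converges to different local minima depending on the initial guess, with no warning that a better minimum exists.

```python
def f(x):
    return x**4 - 3 * x**2 + x     # nonconvex: two local minima

def fprime(x):
    return 4 * x**3 - 6 * x + 1

def gradient_descent(x0, step=0.01, iters=2000):
    # Local method: efficient, but the answer depends on x0.
    x = x0
    for _ in range(iters):
        x -= step * fprime(x)
    return x

left = gradient_descent(-2.0)      # lands in one basin
right = gradient_descent(2.0)      # lands in the other
assert abs(left - right) > 0.5     # different starts, different answers
assert f(left) < f(right)          # only one of them is global
```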



Insights from convex optimization

- Initialization for local optimization methods
- Convex heuristics for nonconvex problems

Example:

minimize   #{j : x_j ≠ 0}
subject to Ax = b

Combinatorial problem (practically impossible)

minimize   Σ_{j=1}^n |x_j|
subject to Ax = b

Convex problem. Sometimes, same solution (Candes et al., 2004)

- Bounds for global optimization: the dual of an optimization problem is always convex → gives a lower bound on the optimal value
- Relaxation: nonconvex constraints replaced with looser convex constraints
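The ℓ1 substitute is itself an LP. A sketch on random data (the split x = u − v with u, v ≥ 0 is a standard recast, not taken from the slides):

```python
import numpy as np
from scipy.optimize import linprog

rng = np.random.default_rng(0)
m, n = 10, 30                     # underdetermined system Ax = b
A = rng.standard_normal((m, n))
x_sparse = np.zeros(n)
x_sparse[[2, 11, 25]] = [1.5, -2.0, 0.7]   # a 3-sparse solution
b = A @ x_sparse

# Recast min sum_j |x_j| s.t. Ax = b as an LP: write x = u - v with
# u, v >= 0, so that sum(u + v) equals sum_j |x_j| at the optimum.
c = np.ones(2 * n)
res = linprog(c, A_eq=np.hstack([A, -A]), b_eq=b,
              bounds=[(0, None)] * (2 * n))
assert res.success
x_hat = res.x[:n] - res.x[n:]

# x_sparse is feasible, so the l1 optimum can only be smaller; whether
# x_hat recovers x_sparse exactly depends on the instance (Candes et al.).
assert np.abs(x_hat).sum() <= np.abs(x_sparse).sum() + 1e-6
```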



This course

- Recognize and formulate convex problems
- Develop algorithms and efficient code
- Learn about the useful theory

Topics:
- Convexity: convex sets, functions, optimization
- Algorithms
- Examples and applications
