ACM 113: Lecture 1
Agenda
What is a mathematical optimization problem?
Linear and nonlinear programming
First examples
Least squares
Convex optimization
Nonlinear optimization
What is this course about?
What is a mathematical optimization problem?
A mathematical optimization problem, or just optimization problem, has the form

    (P)  minimize    f0(x)
         subject to  fi(x) ≤ bi,   i = 1, . . . , m

x ∈ Rn is the optimization variable
f0 : Rn → R is the objective function
the functions fi are the constraint functions

A vector x⋆ is a solution of (P) if, for every x satisfying fi(x) ≤ bi for all i, we have f0(x⋆) ≤ f0(x).
Linear and nonlinear programming
Linear programming: the objective and the constraints are linear.
A function f is linear if

    f(x + λy) = f(x) + λ f(y)

for all x, y ∈ Rn and λ ∈ R.

Nonlinear programming: the objective or one of the constraints is not linear.

Convex programming: the objective functional and the constraint functionals are convex.
A function f is convex if

    f(λx + (1 − λ)y) ≤ λ f(x) + (1 − λ) f(y)

for all x, y ∈ Rn and 0 ≤ λ ≤ 1.

In this course, the emphasis is on convex programming/optimization.
Convexity is more general than linearity
[Figure: graph of a convex function f through the points (x, f(x)) and (y, f(y)); the chord value λ f(x) + (1 − λ) f(y) lies above the function value f(λx + (1 − λ)y).]
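As a quick numerical illustration of the convexity inequality above (added here, not part of the original slides), the sketch below checks the inequality for the convex function f(x) = ‖x‖₂² at randomly drawn points; the choice of function and test points is mine.

```python
import numpy as np

# f(x) = ||x||_2^2 is convex, so f(l*x + (1-l)*y) <= l*f(x) + (1-l)*f(y)
# must hold for all x, y and all 0 <= l <= 1.
def f(x):
    return float(np.dot(x, x))

rng = np.random.default_rng(0)
for _ in range(1000):
    x, y = rng.standard_normal(5), rng.standard_normal(5)
    lam = rng.uniform(0.0, 1.0)
    lhs = f(lam * x + (1 - lam) * y)        # value at the convex combination
    rhs = lam * f(x) + (1 - lam) * f(y)     # corresponding chord value
    assert lhs <= rhs + 1e-12               # inequality holds (up to round-off)
print("convexity inequality verified on 1000 random samples")
```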
Applications of mathematical optimization
Nearly all fields of science and engineering
Examples:
Portfolio management
Data fitting
Regularization in signal processing
Examples of optimization problems
Least Squares
No constraints.
Objective functional:

    f0(x) = ‖b − Ax‖₂² = ∑_{i=1}^{k} (aiᵀx − bi)²

where A ∈ Rk×n (typically n ≤ k) and a1ᵀ, . . . , akᵀ are the rows of A.

Solution:

    (AᵀA)x = Aᵀb  ⇒  x⋆ = (AᵀA)⁻¹Aᵀb

assuming AᵀA is invertible (if not?)

Many methods for solving least squares problems: fast and stable.
Mature technology.
Important and straightforward to recognize a least squares problem.
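A minimal numerical sketch of the least squares solution above (added for illustration; the random problem data are made up). It compares the normal-equations formula with NumPy's built-in least squares solver.

```python
import numpy as np

# Random overdetermined problem: A is k x n with n <= k (illustrative data).
rng = np.random.default_rng(0)
k, n = 50, 5
A = rng.standard_normal((k, n))
b = rng.standard_normal(k)

# Normal equations: (A^T A) x = A^T b.  In practice a QR/SVD-based routine such
# as np.linalg.lstsq is preferred for numerical stability; the explicit formula
# is shown only to mirror the slide.
x_normal = np.linalg.solve(A.T @ A, A.T @ b)
x_lstsq, *_ = np.linalg.lstsq(A, b, rcond=None)

print(np.allclose(x_normal, x_lstsq))   # True: both give the least squares solution
```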
Another example:

    f0(x) = ‖b − Ax‖₂² + λ‖x‖₂²

Connected with statistical regularization (ridge regression)
Trade-off between fit and the size of the xj's
E.g. prior distribution on parameters in Bayesian statistics
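A short sketch of this regularized problem (added for illustration, with made-up data and a made-up value of λ): setting the gradient to zero gives the closed form x⋆ = (AᵀA + λI)⁻¹Aᵀb.

```python
import numpy as np

rng = np.random.default_rng(1)
k, n = 50, 5
A = rng.standard_normal((k, n))
b = rng.standard_normal(k)
lam = 0.1   # regularization weight (illustrative choice)

# Ridge regression: minimize ||b - Ax||_2^2 + lam * ||x||_2^2.
# The gradient condition is (A^T A + lam * I) x = A^T b.
x_ridge = np.linalg.solve(A.T @ A + lam * np.eye(n), A.T @ b)

# Larger lam shrinks the coefficients toward zero: trading fit for the size of x.
print(np.linalg.norm(x_ridge))
```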
Linear programming
    minimize    cᵀx
    subject to  aiᵀx ≤ bi,   i = 1, . . . , m

No analytical solution
Mature technology: fast and stable solvers
Simplex method (Dantzig, 1947)
Interior point methods (Karmarkar, 1984)

Using linear programming:
Many problems can be recast as LPs
Not as easy to recognize as least squares problems
Example:

    minimize ‖b − Ax‖∞ = minimize maxᵢ |aiᵀx − bi|
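Before turning to the recast of this example below, here is a minimal sketch of handing an LP in the generic form above to a standard solver such as scipy.optimize.linprog (added for illustration; the data c, A, b are made up).

```python
import numpy as np
from scipy.optimize import linprog

# Tiny LP:  minimize c^T x  subject to  a_i^T x <= b_i  (made-up data).
c = np.array([1.0, 2.0])
A_ub = np.array([[-1.0,  0.0],
                 [ 0.0, -1.0],
                 [ 1.0,  1.0]])
b_ub = np.array([0.0, 0.0, 1.0])   # encodes x1 >= 0, x2 >= 0, x1 + x2 <= 1

# bounds=(None, None) so that every constraint comes from A_ub, b_ub.
res = linprog(c, A_ub=A_ub, b_ub=b_ub, bounds=(None, None))
print(res.x, res.fun)              # optimal point and optimal value
```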
    minimize ‖b − Ax‖∞ = minimize maxᵢ |aiᵀx − bi|

Objective is maximum absolute deviation, not mean square absolute deviation
Objective functional is not differentiable

Recast as

    minimize    t
    subject to  aiᵀx − bi ≤ t
                aiᵀx − bi ≥ −t

Important to recognize LPs:

    minimize    ∑_{j=1}^{n} |xj|
    subject to  aiᵀx ≤ bi,   i = 1, . . . , m

Is this an LP?
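A sketch of the recast in code (added for illustration, with made-up A and b): the optimization variables are (x, t), the objective is simply t, and each bound |aiᵀx − bi| ≤ t becomes two linear inequalities.

```python
import numpy as np
from scipy.optimize import linprog

rng = np.random.default_rng(2)
k, n = 30, 3
A = rng.standard_normal((k, n))
b = rng.standard_normal(k)

# Variables z = (x, t).  Minimize t subject to  Ax - b <= t*1  and  -(Ax - b) <= t*1.
c = np.concatenate([np.zeros(n), [1.0]])        # objective picks out t
A_ub = np.block([[ A, -np.ones((k, 1))],        #  Ax - t <=  b
                 [-A, -np.ones((k, 1))]])       # -Ax - t <= -b
b_ub = np.concatenate([b, -b])

res = linprog(c, A_ub=A_ub, b_ub=b_ub, bounds=(None, None))
x_cheb, t_opt = res.x[:n], res.x[-1]
print(t_opt, np.max(np.abs(A @ x_cheb - b)))    # the two values agree at the optimum
```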
Convex optimization
    minimize    f0(x)
    subject to  x ∈ C   or   fi(x) ≤ bi

with f0 and fi convex (or C a convex set)

Solving convex problems:
In general, no analytical solution
Many reliable and efficient algorithms exist
Almost a mature technology

Using convex optimization:
Convex problems often go unrecognized
Many problems can be formulated as convex problems
Important to learn the skills to cast nonlinear programs into convex programs
Important to distinguish between convex and nonconvex problems
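As an illustration of stating a convex problem in a modeling package (added here; CVXPY is one such package, and the problem data below are made up), a constrained least squares problem can be written almost verbatim in the form above.

```python
import cvxpy as cp
import numpy as np

rng = np.random.default_rng(3)
k, n = 30, 5
A = rng.standard_normal((k, n))
b = rng.standard_normal(k)

# Convex problem: minimize ||Ax - b||_2^2 subject to a simple convex set C.
x = cp.Variable(n)
objective = cp.Minimize(cp.sum_squares(A @ x - b))
constraints = [x >= 0, cp.sum(x) <= 1]   # both constraints are convex
prob = cp.Problem(objective, constraints)
prob.solve()

print(prob.value, x.value)               # optimal value and an optimal point
```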
Solving optimization problems
General optimization problem:
Very difficult to solve
Typical methods involve a compromise, e.g. extremely long (prohibitive) computation time, or the solution is not always found

Exceptions:
Least squares problems
LPs
Convex optimization problems

“In fact, the great watershed in optimization isn't between linearity and nonlinearity, but convexity and nonconvexity”
Rockafellar, SIAM Review, 1993
Nonlinear optimization
1. Nonconvex objective

[Figure: graph of a nonconvex objective f(x).]
2. Nonconvex constraints

    minimize    cᵀx
    subject to  Ax ≤ b

Easy
Feasible for n ∼ 100,000

    minimize    cᵀx
    subject to  Ax ≤ b and xj ∈ {0, 1}

Extremely hard
Infeasible for n ≥ 40?

Not much is known about general nonlinear problems
‘Art’ more than a science and technology
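To make the contrast concrete (an added illustration, not from the slides), the only fully general way to solve the 0/1-constrained version is essentially to enumerate candidate assignments, and their number grows as 2^n. The toy brute-force sketch below, with made-up data, shows why this becomes hopeless long before n ≈ 40.

```python
import itertools
import numpy as np

rng = np.random.default_rng(4)
n, m = 12, 8                            # 2**12 = 4096 candidates; 2**40 is ~10^12
c = rng.standard_normal(n)
A = rng.standard_normal((m, n))
b = A @ (rng.random(n) > 0.5) + 1.0     # pick b so at least one 0/1 point is feasible

best_val, best_x = np.inf, None
# Brute force over every x with entries in {0, 1}: exponential in n.
for bits in itertools.product([0.0, 1.0], repeat=n):
    x = np.array(bits)
    if np.all(A @ x <= b) and c @ x < best_val:
        best_val, best_x = c @ x, x

print(best_val, best_x)
```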
Methods
Local optimization:
Finds a local minimum
Often gives no idea whether this is a global minimum, or how far it is from a global minimum
Requires an initial guess
Usually efficient methods

Global optimization:
Worst-case complexity grows exponentially with problem size
Even small problems may be impractical
Insights from convex optimization
Initialization for local optimization methods
Convex heuristics for nonconvex problems

Example:

    minimize    #{j : xj ≠ 0}
    subject to  Ax = b

Combinatorial problem (practically impossible)

    minimize    ∑_{j=1}^{n} |xj|
    subject to  Ax = b

Convex problem. Sometimes, same solution (Candes et al., 2004)

Bounds for global optimization: the dual of an optimization problem is always convex → gives a lower bound on the optimal value

Relaxation: nonconvex constraints are replaced with looser convex constraints
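A sketch of the convex surrogate above (added for illustration; the sparse test signal and the matrix are made up, and CVXPY is used as the solver): minimizing ∑|xj| subject to Ax = b can recover the sparse solution exactly.

```python
import cvxpy as cp
import numpy as np

rng = np.random.default_rng(5)
m, n = 40, 100
A = rng.standard_normal((m, n))

# Made-up sparse ground truth with 5 nonzero entries.
x_true = np.zeros(n)
x_true[rng.choice(n, size=5, replace=False)] = rng.standard_normal(5)
b = A @ x_true

# Convex surrogate for the combinatorial problem: minimize sum_j |x_j| s.t. Ax = b.
x = cp.Variable(n)
prob = cp.Problem(cp.Minimize(cp.norm1(x)), [A @ x == b])
prob.solve()

print(np.linalg.norm(x.value - x_true))   # small when the l1 solution matches x_true
```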
This course
Recognize and formulate convex problems
Develop algorithms and efficient code
Learn about the useful theory

Topics:
Convexity: convex sets, functions, optimization
Algorithms
Examples and applications