187
Submodular Functions, Optimization, and Applications to Machine Learning — Spring Quarter, Lecture 3 — http://www.ee.washington.edu/people/faculty/bilmes/classes/ee596b_spring_2016/ Prof. JeBilmes University of Washington, Seattle Department of Electrical Engineering http://melodi.ee.washington.edu/ ~ bilmes Apr 4th, 2016 + f (A)+ f (B) f (A [ B) =f (Ar ) + f ( C ) + f ( B r ) =f (A\B) f (A \ B) =f (Ar ) +2 f ( C ) + f ( B r ) Prof. JeBilmes EE596b/Spring 2016/Submodularity - Lecture 3 - Apr 4th, 2016 F1/63 (pg.1/187)

Submodular Functions, Optimization, and Applications to ......Finals Week: June 6th-10th, 2016. Prof. Je Bilmes EE596b/Spring 2016/Submodularity - Lecture 3 - Apr 4th, 2016 F4/63(pg.4/187)

  • Upload
    others

  • View
    2

  • Download
    0

Embed Size (px)

Citation preview

  • Submodular Functions, Optimization,and Applications to Machine Learning

    — Spring Quarter, Lecture 3 —http://www.ee.washington.edu/people/faculty/bilmes/classes/ee596b_spring_2016/

    Prof. Je↵ Bilmes

    University of Washington, SeattleDepartment of Electrical Engineering

    http://melodi.ee.washington.edu/

    ~

    bilmes

    Apr 4th, 2016

    +f (A) + f (B) f (A [ B)= f (Ar ) +f (C ) + f (Br )

    �= f (A \ B)

    f (A \ B)= f (Ar ) + 2f (C ) + f (Br )

    Clockwise from top left:vLásló Lovász

    Jack EdmondsSatoru Fujishige

    George NemhauserLaurence Wolsey

    András FrankLloyd ShapleyH. NarayananRobert Bixby

    William CunninghamWilliam TutteRichard Rado

    Alexander SchrijverGarrett BirkhoffHassler Whitney

    Richard Dedekind

    Prof. Je↵ Bilmes EE596b/Spring 2016/Submodularity - Lecture 3 - Apr 4th, 2016 F1/63 (pg.1/187)

  • Logistics Review

    Cumulative Outstanding Reading

    Read chapter 1 from Fujishige’s book.

    Prof. Je↵ Bilmes EE596b/Spring 2016/Submodularity - Lecture 3 - Apr 4th, 2016 F2/63 (pg.2/187)

  • Logistics Review

    Announcements, Assignments, and Reminders

    Homework 1 is now available at our assignment dropbox(https://canvas.uw.edu/courses/1039754/assignments), due(electronically) Friday at 5:00pm.

    Weekly O�ce Hours: Mondays, 3:30-4:30, or by skype or googlehangout (set up meeting via our our discussion board (https://canvas.uw.edu/courses/1039754/discussion_topics)).

    Prof. Je↵ Bilmes EE596b/Spring 2016/Submodularity - Lecture 3 - Apr 4th, 2016 F3/63 (pg.3/187)

    -

  • Logistics Review

    Class Road Map - IT-I

    L1(3/28): Motivation, Applications, &

    Basic Definitions

    L2(3/30): Machine Learning Apps

    (diversity, complexity, parameter, learning

    target, surrogate).

    L3(4/4): Info theory exs, more apps,

    definitions, graph/combinatorial examples,

    matrix rank example, visualization

    L4(4/6):

    L5(4/11):

    L6(4/13):

    L7(4/18):

    L8(4/20):

    L9(4/25):

    L10(4/27):

    L11(5/2):

    L12(5/4):

    L13(5/9):

    L14(5/11):

    L15(5/16):

    L16(5/18):

    L17(5/23):

    L18(5/25):

    L19(6/1):

    L20(6/6): Final Presentations

    maximization.

    Finals Week: June 6th-10th, 2016.

    Prof. Je↵ Bilmes EE596b/Spring 2016/Submodularity - Lecture 3 - Apr 4th, 2016 F4/63 (pg.4/187)

  • Logistics Review

    Two Equivalent Submodular Definitions

    Definition 3.2.1 (submodular concave)

    A function f : 2V ! R is submodular if for any A,B ✓ V , we have that:

    f(A) + f(B) � f(A [B) + f(A \B) (3.8)

    An alternate and (as we will soon see) equivalent definition is:

    Definition 3.2.2 (diminishing returns)

    A function f : 2V ! R is submodular if for any A ✓ B ⇢ V , andv 2 V \B, we have that:

    f(A [ {v})� f(A) � f(B [ {v})� f(B) (3.9)

    The incremental “value”, “gain”, or “cost” of v decreases (diminishes) asthe context in which v is considered grows from A to B.

    Prof. Je↵ Bilmes EE596b/Spring 2016/Submodularity - Lecture 3 - Apr 4th, 2016 F5/63 (pg.5/187)

  • Logistics Review

    Two Equivalent Supermodular Definitions

    Definition 3.2.1 (supermodular)

    A function f : 2V ! R is supermodular if for any A,B ✓ V , we have that:

    f(A) + f(B) f(A [B) + f(A \B) (3.8)

    Definition 3.2.2 (supermodular (improving returns))

    A function f : 2V ! R is supermodular if for any A ✓ B ⇢ V , andv 2 V \B, we have that:

    f(A [ {v})� f(A) f(B [ {v})� f(B) (3.9)Incremental “value”, “gain”, or “cost” of v increases (improves) as thecontext in which v is considered grows from A to B.A function f is submodular i↵ �f is supermodular.If f both submodular and supermodular, then f is said to be modular,and f(A) = c+

    P

    a2A f(a) (often c = 0).

    Prof. Je↵ Bilmes EE596b/Spring 2016/Submodularity - Lecture 3 - Apr 4th, 2016 F6/63 (pg.6/187)

  • Logistics Review

    Submodularity’s utility in ML

    A model of a physical process :When maximizing, submodularity naturally models: diversity, coverage,span, and information.When minimizing, submodularity naturally models: cooperative costs,complexity, roughness, and irregularity.vice-versa for supermodularity.

    A submodular function can act as a parameter for a machine learningstrategy (active/semi-supervised learning, discrete divergence,structured sparse convex norms for use in regularization).Itself, as an object or function to learn , based on data.

    A surrogate or relaxation strategy for optimization or analysisAn alternate to factorization, decomposition, or sum-product basedsimplification (as one typically finds in a graphical model). I.e., a meanstowards tractable surrogates for graphical models.Also, we can “relax” a problem to a submodular one where it can bee�ciently solved and o↵er a bounded quality solution.Non-submodular problems can be analyzed via submodularity.

    Prof. Je↵ Bilmes EE596b/Spring 2016/Submodularity - Lecture 3 - Apr 4th, 2016 F7/63 (pg.7/187)

  • Bit More Notation Info Theory Examples Monge More Definitions Graph & Combinatorial Examples Other Examples

    Ground set: E or V ?

    Submodular functions are functions defined on subsets of some finite set,called the ground set .

    It is common in the literature to use either E or V as the ground set— we will at di↵erent times use both (there should be no confusion).

    The terminology ground set comes from lattice theory, where V arethe ground elements of a lattice (just above 0).

    Prof. Je↵ Bilmes EE596b/Spring 2016/Submodularity - Lecture 3 - Apr 4th, 2016 F8/63 (pg.8/187)

    °

  • Bit More Notation Info Theory Examples Monge More Definitions Graph & Combinatorial Examples Other Examples

    Ground set: E or V ?

    Submodular functions are functions defined on subsets of some finite set,called the ground set .

    It is common in the literature to use either E or V as the ground set— we will at di↵erent times use both (there should be no confusion).

    The terminology ground set comes from lattice theory, where V arethe ground elements of a lattice (just above 0).

    Prof. Je↵ Bilmes EE596b/Spring 2016/Submodularity - Lecture 3 - Apr 4th, 2016 F8/63 (pg.9/187)

  • Bit More Notation Info Theory Examples Monge More Definitions Graph & Combinatorial Examples Other Examples

    Notation RE

    What does x 2 RE mean?

    RE = {x = (xj

    2 R : j 2 E)} (3.1)

    RE+

    = {x = (xj

    : j 2 E) : x � 0} (3.2)

    Any vector x 2 RE can be treated as a normalized modular function, andvice verse. That is

    x(A) =X

    a2Axa

    (3.3)

    Note that x is said to be normalized since x(;) = 0.

    Prof. Je↵ Bilmes EE596b/Spring 2016/Submodularity - Lecture 3 - Apr 4th, 2016 F9/63 (pg.10/187)

    O 1122 R "R|E=HEl - - - - E=

    E and

    HE{ Cia ,tgA ): Xa ER

    abet ,HEIR }

    AGE

    0

  • Bit More Notation Info Theory Examples Monge More Definitions Graph & Combinatorial Examples Other Examples

    characteristic vectors of sets & modular functions

    Given an A ✓ E, define the vector 1A

    2 RE+

    to be

    1A

    (j) =

    (

    1 if j 2 A;0 if j /2 A

    (3.4)

    Sometimes this will be written as �A

    ⌘ 1A

    .

    Thus, given modular function x 2 RE , we can write x(A) in a varietyof ways, i.e.,

    x(A) = x · 1A

    =

    X

    i2Ax(i) (3.5)

    Prof. Je↵ Bilmes EE596b/Spring 2016/Submodularity - Lecture 3 - Apr 4th, 2016 F10/63 (pg.11/187)

    O-

  • Bit More Notation Info Theory Examples Monge More Definitions Graph & Combinatorial Examples Other Examples

    characteristic vectors of sets & modular functions

    Given an A ✓ E, define the vector 1A

    2 RE+

    to be

    1A

    (j) =

    (

    1 if j 2 A;0 if j /2 A

    (3.4)

    Sometimes this will be written as �A

    ⌘ 1A

    .

    Thus, given modular function x 2 RE , we can write x(A) in a varietyof ways, i.e.,

    x(A) = x · 1A

    =

    X

    i2Ax(i) (3.5)

    Prof. Je↵ Bilmes EE596b/Spring 2016/Submodularity - Lecture 3 - Apr 4th, 2016 F10/63 (pg.12/187)

  • Bit More Notation Info Theory Examples Monge More Definitions Graph & Combinatorial Examples Other Examples

    characteristic vectors of sets & modular functions

    Given an A ✓ E, define the vector 1A

    2 RE+

    to be

    1A

    (j) =

    (

    1 if j 2 A;0 if j /2 A

    (3.4)

    Sometimes this will be written as �A

    ⌘ 1A

    .

    Thus, given modular function x 2 RE , we can write x(A) in a varietyof ways, i.e.,

    x(A) = x · 1A

    =

    X

    i2Ax(i) (3.5)

    Prof. Je↵ Bilmes EE596b/Spring 2016/Submodularity - Lecture 3 - Apr 4th, 2016 F10/63 (pg.13/187)

    .

  • Bit More Notation Info Theory Examples Monge More Definitions Graph & Combinatorial Examples Other Examples

    Other Notation: singletons and sets

    When A is a set and k is a singleton (i.e., a single item), the union isproperly written as A [ {k}, but sometimes we will write just A+ k.

    Prof. Je↵ Bilmes EE596b/Spring 2016/Submodularity - Lecture 3 - Apr 4th, 2016 F11/63 (pg.14/187)

  • Bit More Notation Info Theory Examples Monge More Definitions Graph & Combinatorial Examples Other Examples

    What does ST mean when S and T are arbitrary sets?

    Let S and T be two arbitrary sets (either of which could be countable,or uncountable).

    We define the notation ST to be the set of all functions that map fromT to S. That is, if f 2 ST , then f : T ! S.Hence, given a finite set E, RE is the set of all functions that mapfrom elements of E to the reals R, and such functions are identical toa vector in a vector space with axes labeled as elements of E (i.e., ifm 2 RE , then for all e 2 E, m(e) 2 R).Often “2” is shorthand for the set {0, 1}. I.e., R2 where 2 ⌘ {0, 1}.Similarly, 2E is the set of all functions from E to “two” — so 2E isshorthand for {0, 1}E

    — hence, 2E is the set of all functions that mapfrom elements of E to {0, 1}, equivalent to all binary vectors withelements indexed by elements of E, equivalent to subsets of E. Hence,if A 2 2E then A ✓ E.

    What might 3E mean?

    Prof. Je↵ Bilmes EE596b/Spring 2016/Submodularity - Lecture 3 - Apr 4th, 2016 F12/63 (pg.15/187)

    fi 289112

  • Bit More Notation Info Theory Examples Monge More Definitions Graph & Combinatorial Examples Other Examples

    What does ST mean when S and T are arbitrary sets?

    Let S and T be two arbitrary sets (either of which could be countable,or uncountable).

    We define the notation ST to be the set of all functions that map fromT to S. That is, if f 2 ST , then f : T ! S.

    Hence, given a finite set E, RE is the set of all functions that mapfrom elements of E to the reals R, and such functions are identical toa vector in a vector space with axes labeled as elements of E (i.e., ifm 2 RE , then for all e 2 E, m(e) 2 R).Often “2” is shorthand for the set {0, 1}. I.e., R2 where 2 ⌘ {0, 1}.Similarly, 2E is the set of all functions from E to “two” — so 2E isshorthand for {0, 1}E

    — hence, 2E is the set of all functions that mapfrom elements of E to {0, 1}, equivalent to all binary vectors withelements indexed by elements of E, equivalent to subsets of E. Hence,if A 2 2E then A ✓ E.

    What might 3E mean?

    Prof. Je↵ Bilmes EE596b/Spring 2016/Submodularity - Lecture 3 - Apr 4th, 2016 F12/63 (pg.16/187)

  • Bit More Notation Info Theory Examples Monge More Definitions Graph & Combinatorial Examples Other Examples

    What does ST mean when S and T are arbitrary sets?

    Let S and T be two arbitrary sets (either of which could be countable,or uncountable).

    We define the notation ST to be the set of all functions that map fromT to S. That is, if f 2 ST , then f : T ! S.Hence, given a finite set E, RE is the set of all functions that mapfrom elements of E to the reals R, and such functions are identical toa vector in a vector space with axes labeled as elements of E (i.e., ifm 2 RE , then for all e 2 E, m(e) 2 R).

    Often “2” is shorthand for the set {0, 1}. I.e., R2 where 2 ⌘ {0, 1}.Similarly, 2E is the set of all functions from E to “two” — so 2E isshorthand for {0, 1}E

    — hence, 2E is the set of all functions that mapfrom elements of E to {0, 1}, equivalent to all binary vectors withelements indexed by elements of E, equivalent to subsets of E. Hence,if A 2 2E then A ✓ E.

    What might 3E mean?

    Prof. Je↵ Bilmes EE596b/Spring 2016/Submodularity - Lecture 3 - Apr 4th, 2016 F12/63 (pg.17/187)

    :HE m :E 1RMEIR

    " nE{ 44 ... ,n }*

  • Bit More Notation Info Theory Examples Monge More Definitions Graph & Combinatorial Examples Other Examples

    What does ST mean when S and T are arbitrary sets?

    Let S and T be two arbitrary sets (either of which could be countable,or uncountable).

    We define the notation ST to be the set of all functions that map fromT to S. That is, if f 2 ST , then f : T ! S.Hence, given a finite set E, RE is the set of all functions that mapfrom elements of E to the reals R, and such functions are identical toa vector in a vector space with axes labeled as elements of E (i.e., ifm 2 RE , then for all e 2 E, m(e) 2 R).Often “2” is shorthand for the set {0, 1}. I.e., R2 where 2 ⌘ {0, 1}.

    Similarly, 2E is the set of all functions from E to “two” — so 2E isshorthand for {0, 1}E

    — hence, 2E is the set of all functions that mapfrom elements of E to {0, 1}, equivalent to all binary vectors withelements indexed by elements of E, equivalent to subsets of E. Hence,if A 2 2E then A ✓ E.

    What might 3E mean?

    Prof. Je↵ Bilmes EE596b/Spring 2016/Submodularity - Lecture 3 - Apr 4th, 2016 F12/63 (pg.18/187)

    FER

  • Bit More Notation Info Theory Examples Monge More Definitions Graph & Combinatorial Examples Other Examples

    What does ST mean when S and T are arbitrary sets?

    Let S and T be two arbitrary sets (either of which could be countable,or uncountable).

    We define the notation ST to be the set of all functions that map fromT to S. That is, if f 2 ST , then f : T ! S.Hence, given a finite set E, RE is the set of all functions that mapfrom elements of E to the reals R, and such functions are identical toa vector in a vector space with axes labeled as elements of E (i.e., ifm 2 RE , then for all e 2 E, m(e) 2 R).Often “2” is shorthand for the set {0, 1}. I.e., R2 where 2 ⌘ {0, 1}.Similarly, 2E is the set of all functions from E to “two” — so 2E isshorthand for {0, 1}E

    — hence, 2E is the set of all functions that mapfrom elements of E to {0, 1}, equivalent to all binary vectors withelements indexed by elements of E, equivalent to subsets of E. Hence,if A 2 2E then A ✓ E.What might 3E mean?

    Prof. Je↵ Bilmes EE596b/Spring 2016/Submodularity - Lecture 3 - Apr 4th, 2016 F12/63 (pg.19/187)

    = E 99913

  • Bit More Notation Info Theory Examples Monge More Definitions Graph & Combinatorial Examples Other Examples

    What does ST mean when S and T are arbitrary sets?

    Let S and T be two arbitrary sets (either of which could be countable,or uncountable).

    We define the notation ST to be the set of all functions that map fromT to S. That is, if f 2 ST , then f : T ! S.Hence, given a finite set E, RE is the set of all functions that mapfrom elements of E to the reals R, and such functions are identical toa vector in a vector space with axes labeled as elements of E (i.e., ifm 2 RE , then for all e 2 E, m(e) 2 R).Often “2” is shorthand for the set {0, 1}. I.e., R2 where 2 ⌘ {0, 1}.Similarly, 2E is the set of all functions from E to “two” — so 2E isshorthand for {0, 1}E — hence, 2E is the set of all functions that mapfrom elements of E to {0, 1}, equivalent to all binary vectors withelements indexed by elements of E, equivalent to subsets of E. Hence,if A 2 2E then A ✓ E.

    What might 3E mean?

    Prof. Je↵ Bilmes EE596b/Spring 2016/Submodularity - Lecture 3 - Apr 4th, 2016 F12/63 (pg.20/187)

  • Bit More Notation Info Theory Examples Monge More Definitions Graph & Combinatorial Examples Other Examples

    What does ST mean when S and T are arbitrary sets?

    Let S and T be two arbitrary sets (either of which could be countable,or uncountable).

    We define the notation ST to be the set of all functions that map fromT to S. That is, if f 2 ST , then f : T ! S.Hence, given a finite set E, RE is the set of all functions that mapfrom elements of E to the reals R, and such functions are identical toa vector in a vector space with axes labeled as elements of E (i.e., ifm 2 RE , then for all e 2 E, m(e) 2 R).Often “2” is shorthand for the set {0, 1}. I.e., R2 where 2 ⌘ {0, 1}.Similarly, 2E is the set of all functions from E to “two” — so 2E isshorthand for {0, 1}E — hence, 2E is the set of all functions that mapfrom elements of E to {0, 1}, equivalent to all binary vectors withelements indexed by elements of E, equivalent to subsets of E. Hence,if A 2 2E then A ✓ E.What might 3E mean?

    Prof. Je↵ Bilmes EE596b/Spring 2016/Submodularity - Lecture 3 - Apr 4th, 2016 F12/63 (pg.21/187)

    { 91,23£

  • Bit More Notation Info Theory Examples Monge More Definitions Graph & Combinatorial Examples Other Examples

    Example Submodular: Entropy from Information Theory

    Entropy is submodular. Let V be the index set of a set of randomvariables, then the function

    f(A) = H(XA

    ) = �X

    xA

    p(xA

    ) log p(xA

    ) (3.6)

    is submodular.

    Proof: (further) conditioning reduces entropy. With A ✓ B and v /2 B,

    H(Xv

    |XB

    ) = H(XB+v

    )�H(XB

    ) (3.7)

    H(XA+v

    )�H(XA

    ) = H(Xv

    |XA

    ) (3.8)

    We say “further” due to B \A not nec. empty.

    Prof. Je↵ Bilmes EE596b/Spring 2016/Submodularity - Lecture 3 - Apr 4th, 2016 F13/63 (pg.22/187)

    Harr ) 7 Haul At) ZHGR It )

  • Bit More Notation Info Theory Examples Monge More Definitions Graph & Combinatorial Examples Other Examples

    Example Submodular: Entropy from Information Theory

    Alternate Proof: Conditional mutual Information is always non-negative.

    Given A,B ✓ V , consider conditional mutual information quantity:

    I(XA\B;XB\A|XA\B) =

    X

    xA[B

    p(xA[B) log

    p(xA\B, xB\A|xA\B)

    p(xA\B|xA\B)p(xB\A|xA\B)

    =

    X

    xA[B

    p(xA[B) log

    p(xA[B)p(xA\B)

    p(xA

    )p(xB

    )

    � 0 (3.9)

    then

    I(XA\B;XB\A|XA\B)

    = H(XA

    ) +H(XB

    )�H(XA[B)�H(XA\B) � 0 (3.10)

    so entropy satisfies

    H(XA

    ) +H(XB

    ) � H(XA[B) +H(XA\B) (3.11)

    Prof. Je↵ Bilmes EE596b/Spring 2016/Submodularity - Lecture 3 - Apr 4th, 2016 F14/63 (pg.23/187)

    A€€B

  • Bit More Notation Info Theory Examples Monge More Definitions Graph & Combinatorial Examples Other Examples

    Information Theory: Block Coding

    Given a set of random variables {Xi

    }i2V indexed by set V , how do we

    partition them so that we can best block-code them within each block.

    I.e., how do we form S ✓ V such that I(XS

    ;XV \S) is as small as

    possible, where I(XA

    ;XB

    ) is the mutual information between randomvariables X

    A

    and XB

    , i.e.,

    I(XA

    ;XB

    ) = H(XA

    ) +H(XB

    )�H(XA

    , XB

    ) (3.12)

    and H(XA

    ) = �P

    xAp(x

    A

    ) log p(xA

    ) is the joint entropy of the setX

    A

    of random variables.

    Prof. Je↵ Bilmes EE596b/Spring 2016/Submodularity - Lecture 3 - Apr 4th, 2016 F15/63 (pg.24/187)

  • Bit More Notation Info Theory Examples Monge More Definitions Graph & Combinatorial Examples Other Examples

    Information Theory: Block Coding

    Given a set of random variables {Xi

    }i2V indexed by set V , how do we

    partition them so that we can best block-code them within each block.

    I.e., how do we form S ✓ V such that I(XS

    ;XV \S) is as small as

    possible, where I(XA

    ;XB

    ) is the mutual information between randomvariables X

    A

    and XB

    , i.e.,

    I(XA

    ;XB

    ) = H(XA

    ) +H(XB

    )�H(XA

    , XB

    ) (3.12)

    and H(XA

    ) = �P

    xAp(x

    A

    ) log p(xA

    ) is the joint entropy of the setX

    A

    of random variables.

    Prof. Je↵ Bilmes EE596b/Spring 2016/Submodularity - Lecture 3 - Apr 4th, 2016 F15/63 (pg.25/187)

  • Bit More Notation Info Theory Examples Monge More Definitions Graph & Combinatorial Examples Other Examples

    Example Submodular: Mutual Information

    Also, symmetric mutual information is submodular,

    f(A) = I(XA

    ;XV \A) = H(XA) +H(XV \A)�H(XV ) (3.13)

    Note that f(A) = H(XA

    ) and ¯f(A) = H(XV \A), and adding

    submodular functions preserves submodularity (which we will see quitesoon).

    Prof. Je↵ Bilmes EE596b/Spring 2016/Submodularity - Lecture 3 - Apr 4th, 2016 F16/63 (pg.26/187)

    0 000

  • Bit More Notation Info Theory Examples Monge More Definitions Graph & Combinatorial Examples Other Examples

    Information Theory: Network CommunicationX

    1

    , Y1

    X2

    , Y2

    X3

    , Y3

    X4

    , Y4

    . . .

    Xm

    , Ym

    A network of senders/receivers

    Each sender Xi

    is trying tocommunicate simultaneouslywith each receiver Y

    i

    (i.e., for alli, X

    i

    is sending to {Yi

    }i

    The Xi

    are not necessarilyindependent.

    Communication rates from i to j are R(i!j) to send message

    W (i!j) 2n

    1, 2, . . . , 2nR(i!j)

    o

    .

    Goal: necessary and su�cient conditions for achievability.I.e., can we find functions f such that any rates must satisfy

    8S ✓ V,X

    i2S,j2V \S

    R(i!j) f(S) (3.14)

    Special cases MAC (Multi-Access Channel) for communication overp(y|x

    1

    , x2

    ) and Slepian-Wolf compression (independent compression ofX and Y but at joint rate H(X,Y )).

    Prof. Je↵ Bilmes EE596b/Spring 2016/Submodularity - Lecture 3 - Apr 4th, 2016 F17/63 (pg.27/187)

  • Bit More Notation Info Theory Examples Monge More Definitions Graph & Combinatorial Examples Other Examples

    Monge Matrices

    m⇥ n matrices C = [cij

    ]

    ij

    are called Monge matrices if they satisfythe Monge property, namely:

    cij

    + crs

    cis

    + crj

    (3.15)

    for all 1 i < r m and 1 j < s n.

    Prof. Je↵ Bilmes EE596b/Spring 2016/Submodularity - Lecture 3 - Apr 4th, 2016 F18/63 (pg.28/187)

  • Bit More Notation Info Theory Examples Monge More Definitions Graph & Combinatorial Examples Other Examples

    Monge Matrices

    m⇥ n matrices C = [cij

    ]

    ij

    are called Monge matrices if they satisfythe Monge property, namely:

    cij

    + crs

    cis

    + crj

    (3.15)

    for all 1 i < r m and 1 j < s n.Equivalently, for all 1 i, r m, 1 j, s n,

    cmin(i,r),min(j,s)

    + cmax(i,r),max(j,s)

    cis

    + crj

    (3.16)

    Prof. Je↵ Bilmes EE596b/Spring 2016/Submodularity - Lecture 3 - Apr 4th, 2016 F18/63 (pg.29/187)

    ¥5

  • Bit More Notation Info Theory Examples Monge More Definitions Graph & Combinatorial Examples Other Examples

    Monge Matrices

    m⇥ n matrices C = [cij

    ]

    ij

    are called Monge matrices if they satisfythe Monge property, namely:

    cij

    + crs

    cis

    + crj

    (3.15)

    for all 1 i < r m and 1 j < s n.Equivalently, for all 1 i, r m, 1 j, s n,

    cmin(i,r),min(j,s)

    + cmax(i,r),max(j,s)

    cis

    + crj

    (3.16)

    Consider four elements of the m⇥ n matrix:

    cij

    crs

    cis

    crjm

    n

    i

    j

    r

    s

    cij

    crs

    cis

    crjm

    n

    i

    j

    r

    s

    A

    B

    C

    D

    cij

    = A+B, crj

    = B, crs

    = B +D, cis

    = A+B + C +D.Prof. Je↵ Bilmes EE596b/Spring 2016/Submodularity - Lecture 3 - Apr 4th, 2016 F18/63 (pg.30/187)

    z

  • Bit More Notation Info Theory Examples Monge More Definitions Graph & Combinatorial Examples Other Examples

    Monge Matrices, where useful

    Useful for speeding up many transportation, dynamic programming,flow, search, lot-sizing and many other problems.

    Example, Hitchcock transportation problem: Given m⇥ n cost matrixC = [c

    ij

    ]

    ij

    , a non-negative supply vector a 2 Rm+

    , a non-negativedemand vector b 2 Rn

    +

    withP

    m

    i=1

    a(i) =P

    n

    j=1

    bj

    , we wish tooptimally solve the following linear program:

    minimizeX2Rm⇥n

    m

    X

    i=1

    n

    X

    j=1

    cij

    xij

    (3.17)

    subject tom

    X

    i=1

    xij

    = bj

    8j = 1, . . . , n (3.18)

    n

    X

    j=1

    xij

    = ai

    8i = 1, . . . ,m (3.19)

    xi,j

    � 0 8i, j (3.20)

    Prof. Je↵ Bilmes EE596b/Spring 2016/Submodularity - Lecture 3 - Apr 4th, 2016 F19/63 (pg.31/187)

  • Bit More Notation Info Theory Examples Monge More Definitions Graph & Combinatorial Examples Other Examples

    Monge Matrices, where useful

    Useful for speeding up many transportation, dynamic programming,flow, search, lot-sizing and many other problems.Example, Hitchcock transportation problem: Given m⇥ n cost matrixC = [c

    ij

    ]

    ij

    , a non-negative supply vector a 2 Rm+

    , a non-negativedemand vector b 2 Rn

    +

    withP

    m

    i=1

    a(i) =P

    n

    j=1

    bj

    , we wish tooptimally solve the following linear program:

    minimizeX2Rm⇥n

    m

    X

    i=1

    n

    X

    j=1

    cij

    xij

    (3.17)

    subject tom

    X

    i=1

    xij

    = bj

    8j = 1, . . . , n (3.18)

    n

    X

    j=1

    xij

    = ai

    8i = 1, . . . ,m (3.19)

    xi,j

    � 0 8i, j (3.20)

    Prof. Je↵ Bilmes EE596b/Spring 2016/Submodularity - Lecture 3 - Apr 4th, 2016 F19/63 (pg.32/187)

  • Bit More Notation Info Theory Examples Monge More Definitions Graph & Combinatorial Examples Other Examples

    Monge Matrices, Hitchcock transportation

    a1

    a2

    a3

    b1 b2 b3 b4

    C

    0 1 3 310

    14940

    1 4 7215

    3 2 1 2

    Producers,Sources,

    or Supply

    Consumers, Sinks, orDemand

    Solving the linear program can be done easily and optimally using the“North West Corner Rule” in only O(m+ n) if the matrix C is Monge!

    Prof. Je↵ Bilmes EE596b/Spring 2016/Submodularity - Lecture 3 - Apr 4th, 2016 F20/63 (pg.33/187)

  • Bit More Notation Info Theory Examples Monge More Definitions Graph & Combinatorial Examples Other Examples

    Monge Matrices and Convex Polygons

    Can generate a Monge matrix from a convex polygon - delete twosegments, then separately number vertices on each chain. Distancescij

    satisfy Monge property (or quadrangle inequality).

    Prof. Je↵ Bilmes EE596b/Spring 2016/Submodularity - Lecture 3 - Apr 4th, 2016 F21/63 (pg.34/187)

  • Bit More Notation Info Theory Examples Monge More Definitions Graph & Combinatorial Examples Other Examples

    Monge Matrices and Convex Polygons

    Can generate a Monge matrix from a convex polygon - delete twosegments, then separately number vertices on each chain. Distancescij

    satisfy Monge property (or quadrangle inequality).

    Prof. Je↵ Bilmes EE596b/Spring 2016/Submodularity - Lecture 3 - Apr 4th, 2016 F21/63 (pg.35/187)

  • Bit More Notation Info Theory Examples Monge More Definitions Graph & Combinatorial Examples Other Examples

    Monge Matrices and Convex Polygons

    Can generate a Monge matrix from a convex polygon - delete twosegments, then separately number vertices on each chain. Distancescij

    satisfy Monge property (or quadrangle inequality).

    p1

    p2

    p3

    p4p5p6

    q1q2

    q3

    q4

    Prof. Je↵ Bilmes EE596b/Spring 2016/Submodularity - Lecture 3 - Apr 4th, 2016 F21/63 (pg.36/187)

  • Bit More Notation Info Theory Examples Monge More Definitions Graph & Combinatorial Examples Other Examples

    Monge Matrices and Convex Polygons

    Can generate a Monge matrix from a convex polygon - delete twosegments, then separately number vertices on each chain. Distancescij

    satisfy Monge property (or quadrangle inequality).

    p1

    p2

    p3

    p4p5p6

    q1q2

    q3

    q4

    d(p2

    , q3

    ) + d(p3

    , q4

    ) d(p2

    , q4

    ) + d(p3

    , q3

    ) (3.21)

    Prof. Je↵ Bilmes EE596b/Spring 2016/Submodularity - Lecture 3 - Apr 4th, 2016 F21/63 (pg.37/187)

  • Bit More Notation Info Theory Examples Monge More Definitions Graph & Combinatorial Examples Other Examples

    Monge Matrices and Submodularity

    A submodular function has the form: f : 2V ! R which can be seenas f : {0, 1}V ! R

    We can generalize this to f : {0,K}V ! R for some constant K 2 Z+

    .

    We may define submodularity as: for all x, y 2 {0,K}V , we have

    f(x) + f(y) � f(x _ y) + f(x ^ y) (3.22)

    x _ y is the (join) element-wise min of each element, that is(x _ y)(v) = min(x(v), y(v)) for v 2 V .x ^ y is the (meet) element-wise min of each element, that is,(x ^ y)(v) = max(x(v), y(v)) for v 2 V .With K = 1, then this is the standard definition of submodularity.

    With |V | = 2, and K + 1 the side-dimension of the matrix, we get aMonge property (on square matrices).

    Prof. Je↵ Bilmes EE596b/Spring 2016/Submodularity - Lecture 3 - Apr 4th, 2016 F22/63 (pg.38/187)

  • Bit More Notation Info Theory Examples Monge More Definitions Graph & Combinatorial Examples Other Examples

    Monge Matrices and Submodularity

    A submodular function has the form: f : 2V ! R which can be seenas f : {0, 1}V ! RWe can generalize this to f : {0,K}V ! R for some constant K 2 Z

    +

    .

    We may define submodularity as: for all x, y 2 {0,K}V , we have

    f(x) + f(y) � f(x _ y) + f(x ^ y) (3.22)

    x _ y is the (join) element-wise min of each element, that is(x _ y)(v) = min(x(v), y(v)) for v 2 V .x ^ y is the (meet) element-wise min of each element, that is,(x ^ y)(v) = max(x(v), y(v)) for v 2 V .With K = 1, then this is the standard definition of submodularity.

    With |V | = 2, and K + 1 the side-dimension of the matrix, we get aMonge property (on square matrices).

    Prof. Je↵ Bilmes EE596b/Spring 2016/Submodularity - Lecture 3 - Apr 4th, 2016 F22/63 (pg.39/187)

  • Bit More Notation Info Theory Examples Monge More Definitions Graph & Combinatorial Examples Other Examples

    Monge Matrices and Submodularity

    A submodular function has the form: f : 2V ! R which can be seenas f : {0, 1}V ! RWe can generalize this to f : {0,K}V ! R for some constant K 2 Z

    +

    .

    We may define submodularity as: for all x, y 2 {0,K}V , we have

    f(x) + f(y) � f(x _ y) + f(x ^ y) (3.22)

    x _ y is the (join) element-wise min of each element, that is(x _ y)(v) = min(x(v), y(v)) for v 2 V .x ^ y is the (meet) element-wise min of each element, that is,(x ^ y)(v) = max(x(v), y(v)) for v 2 V .With K = 1, then this is the standard definition of submodularity.

    With |V | = 2, and K + 1 the side-dimension of the matrix, we get aMonge property (on square matrices).

    Prof. Je↵ Bilmes EE596b/Spring 2016/Submodularity - Lecture 3 - Apr 4th, 2016 F22/63 (pg.40/187)

  • Bit More Notation Info Theory Examples Monge More Definitions Graph & Combinatorial Examples Other Examples

    Monge Matrices and Submodularity

    A submodular function has the form: f : 2V ! R which can be seenas f : {0, 1}V ! RWe can generalize this to f : {0,K}V ! R for some constant K 2 Z

    +

    .

    We may define submodularity as: for all x, y 2 {0,K}V , we have

    f(x) + f(y) � f(x _ y) + f(x ^ y) (3.22)

    x _ y is the (join) element-wise min of each element, that is(x _ y)(v) = min(x(v), y(v)) for v 2 V .

    x ^ y is the (meet) element-wise min of each element, that is,(x ^ y)(v) = max(x(v), y(v)) for v 2 V .With K = 1, then this is the standard definition of submodularity.

    With |V | = 2, and K + 1 the side-dimension of the matrix, we get aMonge property (on square matrices).

    Prof. Je↵ Bilmes EE596b/Spring 2016/Submodularity - Lecture 3 - Apr 4th, 2016 F22/63 (pg.41/187)

    000

  • Bit More Notation Info Theory Examples Monge More Definitions Graph & Combinatorial Examples Other Examples

    Monge Matrices and Submodularity

    A submodular function has the form: f : 2V ! R which can be seenas f : {0, 1}V ! RWe can generalize this to f : {0,K}V ! R for some constant K 2 Z

    +

    .

    We may define submodularity as: for all x, y 2 {0,K}V , we have

    f(x) + f(y) � f(x _ y) + f(x ^ y) (3.22)

    x _ y is the (join) element-wise min of each element, that is(x _ y)(v) = min(x(v), y(v)) for v 2 V .x ^ y is the (meet) element-wise min of each element, that is,(x ^ y)(v) = max(x(v), y(v)) for v 2 V .

    With K = 1, then this is the standard definition of submodularity.

    With |V | = 2, and K + 1 the side-dimension of the matrix, we get aMonge property (on square matrices).

    Prof. Je↵ Bilmes EE596b/Spring 2016/Submodularity - Lecture 3 - Apr 4th, 2016 F22/63 (pg.42/187)

  • Bit More Notation Info Theory Examples Monge More Definitions Graph & Combinatorial Examples Other Examples

    Monge Matrices and Submodularity

    A submodular function has the form: f : 2V ! R which can be seenas f : {0, 1}V ! RWe can generalize this to f : {0,K}V ! R for some constant K 2 Z

    +

    .

    We may define submodularity as: for all x, y 2 {0,K}V , we have

    f(x) + f(y) � f(x _ y) + f(x ^ y) (3.22)

    x _ y is the (join) element-wise min of each element, that is(x _ y)(v) = min(x(v), y(v)) for v 2 V .x ^ y is the (meet) element-wise min of each element, that is,(x ^ y)(v) = max(x(v), y(v)) for v 2 V .With K = 1, then this is the standard definition of submodularity.

    With |V | = 2, and K + 1 the side-dimension of the matrix, we get aMonge property (on square matrices).

    Prof. Je↵ Bilmes EE596b/Spring 2016/Submodularity - Lecture 3 - Apr 4th, 2016 F22/63 (pg.43/187)

  • Bit More Notation Info Theory Examples Monge More Definitions Graph & Combinatorial Examples Other Examples

    Monge Matrices and Submodularity

    A submodular function has the form: f : 2V ! R which can be seenas f : {0, 1}V ! RWe can generalize this to f : {0,K}V ! R for some constant K 2 Z

    +

    .

    We may define submodularity as: for all x, y 2 {0,K}V , we have

    f(x) + f(y) � f(x _ y) + f(x ^ y) (3.22)

    x _ y is the (join) element-wise min of each element, that is(x _ y)(v) = min(x(v), y(v)) for v 2 V .x ^ y is the (meet) element-wise min of each element, that is,(x ^ y)(v) = max(x(v), y(v)) for v 2 V .With K = 1, then this is the standard definition of submodularity.

    With |V | = 2, and K + 1 the side-dimension of the matrix, we get aMonge property (on square matrices).

    Prof. Je↵ Bilmes EE596b/Spring 2016/Submodularity - Lecture 3 - Apr 4th, 2016 F22/63 (pg.44/187)

  • Bit More Notation Info Theory Examples Monge More Definitions Graph & Combinatorial Examples Other Examples

    Two Equivalent Submodular Definitions

    Definition 3.6.1 (submodular concave)

    A function f : 2V ! R is submodular if for any A,B ✓ V , we have that:

    f(A) + f(B) � f(A [B) + f(A \B) (3.8)

    An alternate and (as we will soon see) equivalent definition is:

    Definition 3.6.2 (diminishing returns)

    A function f : 2V ! R is submodular if for any A ✓ B ⇢ V , andv 2 V \B, we have that:

    f(A [ {v})� f(A) � f(B [ {v})� f(B) (3.9)

    The incremental “value”, “gain”, or “cost” of v decreases (diminishes) asthe context in which v is considered grows from A to B.

    Prof. Je↵ Bilmes EE596b/Spring 2016/Submodularity - Lecture 3 - Apr 4th, 2016 F24/63 (pg.45/187)

  • Bit More Notation Info Theory Examples Monge More Definitions Graph & Combinatorial Examples Other Examples

    Submodular on Hypercube Verticies

    Test submodularity via values on verticies of hypercube.

    Prof. Je↵ Bilmes EE596b/Spring 2016/Submodularity - Lecture 3 - Apr 4th, 2016 F25/63 (pg.46/187)

  • Bit More Notation Info Theory Examples Monge More Definitions Graph & Combinatorial Examples Other Examples

    Submodular on Hypercube Verticies

    Test submodularity via values on verticies of hypercube.

    Example: with |V | = n = 2, this iseasy:

    00 01

    1110

    Prof. Je↵ Bilmes EE596b/Spring 2016/Submodularity - Lecture 3 - Apr 4th, 2016 F25/63 (pg.47/187)

    v= { 4r}

    f ( b)-1¥-

    f( 0 ) tf ( { v.v } )

    far ) tfd ) zftv ) TFCOD

  • Bit More Notation Info Theory Examples Monge More Definitions Graph & Combinatorial Examples Other Examples

    Submodular on Hypercube Verticies

    Test submodularity via values on verticies of hypercube.

    Example: with |V | = n = 2, this iseasy:

    00 01

    1110

    With |V | = n = 3, a bit harder.

    000

    001100 010

    011101110

    111

    How many inequalities?Prof. Je↵ Bilmes EE596b/Spring 2016/Submodularity - Lecture 3 - Apr 4th, 2016 F25/63 (pg.48/187)

    IOOVOH

    :001

  • Bit More Notation Info Theory Examples Monge More Definitions Graph & Combinatorial Examples Other Examples

    Subadditive Definitions

    Definition 3.6.1 (subadditive)

    A function f : 2V ! R is subadditive if for any A,B ✓ V , we have that:

    f(A) + f(B) � f(A [B) (3.23)

    This means that the “whole” is less than the sum of the parts.

    Prof. Je↵ Bilmes EE596b/Spring 2016/Submodularity - Lecture 3 - Apr 4th, 2016 F26/63 (pg.49/187)

  • Bit More Notation Info Theory Examples Monge More Definitions Graph & Combinatorial Examples Other Examples

    Two Equivalent Supermodular Definitions

    Definition 3.6.1 (supermodular)

    A function f : 2V ! R is supermodular if for any A,B ✓ V , we have that:

    f(A) + f(B) f(A [B) + f(A \B) (3.8)

    Definition 3.6.2 (supermodular (improving returns))

    A function f : 2V ! R is supermodular if for any A ✓ B ⇢ V , andv 2 V \B, we have that:

    f(A [ {v})� f(A) f(B [ {v})� f(B) (3.9)Incremental “value”, “gain”, or “cost” of v increases (improves) as thecontext in which v is considered grows from A to B.A function f is submodular i↵ �f is supermodular.If f both submodular and supermodular, then f is said to be modular,and f(A) = c+

    P

    a2A f(a) (often c = 0).

    Prof. Je↵ Bilmes EE596b/Spring 2016/Submodularity - Lecture 3 - Apr 4th, 2016 F27/63 (pg.50/187)

  • Bit More Notation Info Theory Examples Monge More Definitions Graph & Combinatorial Examples Other Examples

    Superadditive Definitions

    Definition 3.6.2 (superadditive)

    A function f : 2V ! R is superadditive if for any A,B ✓ V , we have that:

    f(A) + f(B) f(A [B) (3.24)

    This means that the “whole” is greater than the sum of the parts.

    In general, submodular and subadditive (and supermodular andsuperadditive) are di↵erent properties.

    Ex: Let 0 < k < |V |, and consider f : 2V ! R+

    where:

    f(A) =

    (

    1 if |A| k0 else

    (3.25)

    This function is subadditive but not submodular.

    Prof. Je↵ Bilmes EE596b/Spring 2016/Submodularity - Lecture 3 - Apr 4th, 2016 F28/63 (pg.51/187)

  • Bit More Notation Info Theory Examples Monge More Definitions Graph & Combinatorial Examples Other Examples

    Superadditive Definitions

    Definition 3.6.2 (superadditive)

    A function f : 2V ! R is superadditive if for any A,B ✓ V , we have that:

    f(A) + f(B) f(A [B) (3.24)

    This means that the “whole” is greater than the sum of the parts.

    In general, submodular and subadditive (and supermodular andsuperadditive) are di↵erent properties.

    Ex: Let 0 < k < |V |, and consider f : 2V ! R+

    where:

    f(A) =

    (

    1 if |A| k0 else

    (3.25)

    This function is subadditive but not submodular.

    Prof. Je↵ Bilmes EE596b/Spring 2016/Submodularity - Lecture 3 - Apr 4th, 2016 F28/63 (pg.52/187)

  • Bit More Notation Info Theory Examples Monge More Definitions Graph & Combinatorial Examples Other Examples

    Superadditive Definitions

    Definition 3.6.2 (superadditive)

    A function f : 2V ! R is superadditive if for any A,B ✓ V , we have that:

    f(A) + f(B) f(A [B) (3.24)

    This means that the “whole” is greater than the sum of the parts.

    In general, submodular and subadditive (and supermodular andsuperadditive) are di↵erent properties.

    Ex: Let 0 < k < |V |, and consider f : 2V ! R+

    where:

    f(A) =

    (

    1 if |A| k0 else

    (3.25)

    This function is subadditive but not submodular.

    Prof. Je↵ Bilmes EE596b/Spring 2016/Submodularity - Lecture 3 - Apr 4th, 2016 F28/63 (pg.53/187)

  • Bit More Notation Info Theory Examples Monge More Definitions Graph & Combinatorial Examples Other Examples

    Superadditive Definitions

    Definition 3.6.2 (superadditive)

    A function f : 2V ! R is superadditive if for any A,B ✓ V , we have that:

    f(A) + f(B) f(A [B) (3.24)

    This means that the “whole” is greater than the sum of the parts.

    In general, submodular and subadditive (and supermodular andsuperadditive) are di↵erent properties.

    Ex: Let 0 < k < |V |, and consider f : 2V ! R+

    where:

    f(A) =

    (

    1 if |A| k0 else

    (3.25)

    This function is subadditive but not submodular.

    Prof. Je↵ Bilmes EE596b/Spring 2016/Submodularity - Lecture 3 - Apr 4th, 2016 F28/63 (pg.54/187)

    @

  • Bit More Notation Info Theory Examples Monge More Definitions Graph & Combinatorial Examples Other Examples

    Modular Definitions

    Definition 3.6.3 (modular)

    A function that is both submodular and supermodular is called modular

    If f is a modular function, than for any A,B ✓ V , we have

    f(A) + f(B) = f(A \B) + f(A [B) (3.26)

    In modular functions, elements do not interact (or cooperate, or compete,or influence each other), and have value based only on singleton values.

    Proposition 3.6.4

    If f is modular, it may be written as

    f(A) = f(;) +X

    a2A

    f({a})� f(;)⌘

    = c+X

    a2Af 0(a) (3.27)

    which has only |V |+ 1 parameters.

    Prof. Je↵ Bilmes EE596b/Spring 2016/Submodularity - Lecture 3 - Apr 4th, 2016 F29/63 (pg.55/187)

    a

  • Bit More Notation Info Theory Examples Monge More Definitions Graph & Combinatorial Examples Other Examples

    Modular Definitions

    Proof.

    We inductively construct the value for A = {a1

    , a2

    , . . . , ak

    }.For k = 2,

    f(a1

    ) + f(a2

    ) = f(a1

    , a2

    ) + f(;) (3.28)implies f(a

    1

    , a2

    ) = f(a1

    )� f(;) + f(a2

    )� f(;) + f(;) (3.29)

    then for k = 3,

    f(a1

    , a2

    ) + f(a3

    ) = f(a1

    , a2

    , a3

    ) + f(;) (3.30)implies f(a

    1

    , a2

    , a3

    ) = f(a1

    , a2

    )� f(;) + f(a3

    )� f(;) + f(;) (3.31)

    = f(;) +3

    X

    i=1

    f(ai

    )� f(;)�

    (3.32)

    and so on . . .

    Prof. Je↵ Bilmes EE596b/Spring 2016/Submodularity - Lecture 3 - Apr 4th, 2016 F30/63 (pg.56/187)

    CITE

  • Bit More Notation Info Theory Examples Monge More Definitions Graph & Combinatorial Examples Other Examples

    Complement function

    Given a function f : 2V ! R, we can find a complement function¯f : 2V ! R as ¯f(A) = f(V \A) for any A.

    Proposition 3.6.5

    ¯f is submodular if f is submodular.

    Proof.

    ¯f(A) + ¯f(B) � ¯f(A [B) + ¯f(A \B) (3.33)

    follows from

    f(V \A) + f(V \B) � f(V \ (A [B)) + f(V \ (A \B)) (3.34)

    which is true because V \ (A [B) = (V \A) \ (V \B) andV \ (A \B) = (V \A) [ (V \B) (De Morgan’s laws for sets).

    Prof. Je↵ Bilmes EE596b/Spring 2016/Submodularity - Lecture 3 - Apr 4th, 2016 F31/63 (pg.57/187)

    ±

  • Bit More Notation Info Theory Examples Monge More Definitions Graph & Combinatorial Examples Other Examples

    Undirected Graphs

    Let G = (V,E) be a graph with vertices V = V (G) and edgesE = E(G) ✓ V ⇥ V .

    If G is undirected, define

    E(X,Y ) = {{x, y} 2 E(G) : x 2 X \ Y, y 2 Y \X} (3.35)as the edges strictly between X and Y .Nodes define cuts, define the cut function �(X) = E(X,V \X).

    Prof. Je↵ Bilmes EE596b/Spring 2016/Submodularity - Lecture 3 - Apr 4th, 2016 F32/63 (pg.58/187)

  • Bit More Notation Info Theory Examples Monge More Definitions Graph & Combinatorial Examples Other Examples

    Undirected Graphs

    Let G = (V,E) be a graph with vertices V = V (G) and edgesE = E(G) ✓ V ⇥ V .If G is undirected, define

    E(X,Y ) = {{x, y} 2 E(G) : x 2 X \ Y, y 2 Y \X} (3.35)as the edges strictly between X and Y .

    Nodes define cuts, define the cut function �(X) = E(X,V \X).

    Prof. Je↵ Bilmes EE596b/Spring 2016/Submodularity - Lecture 3 - Apr 4th, 2016 F32/63 (pg.59/187)

    info

  • Bit More Notation Info Theory Examples Monge More Definitions Graph & Combinatorial Examples Other Examples

    Undirected Graphs

    Let G = (V,E) be a graph with vertices V = V (G) and edgesE = E(G) ✓ V ⇥ V .If G is undirected, define

    E(X,Y ) = {{x, y} 2 E(G) : x 2 X \ Y, y 2 Y \X} (3.35)as the edges strictly between X and Y .Nodes define cuts, define the cut function �(X) = E(X,V \X).

    Prof. Je↵ Bilmes EE596b/Spring 2016/Submodularity - Lecture 3 - Apr 4th, 2016 F32/63 (pg.60/187)

    Ot

  • Bit More Notation Info Theory Examples Monge More Definitions Graph & Combinatorial Examples Other Examples

    Undirected Graphs

    Let G = (V,E) be a graph with vertices V = V (G) and edgesE = E(G) ✓ V ⇥ V .If G is undirected, define

    E(X,Y ) = {{x, y} 2 E(G) : x 2 X \ Y, y 2 Y \X} (3.35)as the edges strictly between X and Y .Nodes define cuts, define the cut function �(X) = E(X,V \X).

    G = (V ,E )

    S={a,b,c} �G (S) = {{u, v}2 E : u 2 S , v 2 V \ S}.

    a

    b

    c

    ef

    h

    g

    d

    = {{a,d},{b,d},{b,e},{c,e},{c,f}}Prof. Je↵ Bilmes EE596b/Spring 2016/Submodularity - Lecture 3 - Apr 4th, 2016 F32/63 (pg.61/187)

    0

  • Bit More Notation Info Theory Examples Monge More Definitions Graph & Combinatorial Examples Other Examples

    Directed graphs, and cuts and flowsIf G is directed, define

    E+(X,Y ) , {(x, y) 2 E(G) : x 2 X \ Y, y 2 Y \X} (3.36)as the edges directed strictly from X towards Y .

    Nodes define cuts and flows. Define edges leaving X (out-flow) as

    �+(X) , E+(X,V \X) (3.37)and edges entering X (in-flow) as

    ��(X) , E+(V \X,X) (3.38)

    Prof. Je↵ Bilmes EE596b/Spring 2016/Submodularity - Lecture 3 - Apr 4th, 2016 F33/63 (pg.62/187)

    -

  • Bit More Notation Info Theory Examples Monge More Definitions Graph & Combinatorial Examples Other Examples

    Directed graphs, and cuts and flowsIf G is directed, define

    E+(X,Y ) , {(x, y) 2 E(G) : x 2 X \ Y, y 2 Y \X} (3.36)as the edges directed strictly from X towards Y .Nodes define cuts and flows. Define edges leaving X (out-flow) as

    �+(X) , E+(X,V \X) (3.37)and edges entering X (in-flow) as

    ��(X) , E+(V \X,X) (3.38)

    Prof. Je↵ Bilmes EE596b/Spring 2016/Submodularity - Lecture 3 - Apr 4th, 2016 F33/63 (pg.63/187)

  • Bit More Notation Info Theory Examples Monge More Definitions Graph & Combinatorial Examples Other Examples

    Directed graphs, and cuts and flowsIf G is directed, define

    E+(X,Y ) , {(x, y) 2 E(G) : x 2 X \ Y, y 2 Y \X} (3.36)as the edges directed strictly from X towards Y .Nodes define cuts and flows. Define edges leaving X (out-flow) as

    �+(X) , E+(X,V \X) (3.37)and edges entering X (in-flow) as

    ��(X) , E+(V \X,X) (3.38)

    S

    ={a

    ,

    b

    ,

    c}

    a

    b

    c

    ef

    h

    g

    d

    �G (S) = {(u, v ) 2 E : u 2 S , v 2 V \ S}. = {(b,e) ,(c,f)}

    +

    = {(d,a) ,(d,b) ,(e,c)}�G (S) = {(v , u) 2 E : u 2 S , v 2 V \ S}.-

    Prof. Je↵ Bilmes EE596b/Spring 2016/Submodularity - Lecture 3 - Apr 4th, 2016 F33/63 (pg.64/187)

  • Bit More Notation Info Theory Examples Monge More Definitions Graph & Combinatorial Examples Other Examples

    The Neighbor function in undirected graphs

    Given a set X ✓ V , the neighbor function of X is defined as

    �(X) , {v 2 V (G) \X : E(X, {v}) 6= ;} (3.39)

    Example:

    Prof. Je↵ Bilmes EE596b/Spring 2016/Submodularity - Lecture 3 - Apr 4th, 2016 F34/63 (pg.65/187)

    - -

  • Bit More Notation Info Theory Examples Monge More Definitions Graph & Combinatorial Examples Other Examples

    The Neighbor function in undirected graphs

    Given a set X ✓ V , the neighbor function of X is defined as

    �(X) , {v 2 V (G) \X : E(X, {v}) 6= ;} (3.39)

    Example:

    a

    b

    c

    ef

    h

    g

    d

    G = (V,E)

    S = {a, b, c}

    �(S) = {d, e, f}

    Prof. Je↵ Bilmes EE596b/Spring 2016/Submodularity - Lecture 3 - Apr 4th, 2016 F34/63 (pg.66/187)

  • Bit More Notation Info Theory Examples Monge More Definitions Graph & Combinatorial Examples Other Examples

    Directed Cut function: property

    Lemma 3.7.1

    For a digraph G = (V,E) and any X,Y ✓ V : we have

    |�+(X)|+ |�+(Y )|= |�+(X \ Y )|+ |�+(X [ Y )|+ |E+(X,Y )|+ |E+(Y,X)| (3.40)

    and

    |��(X)|+ |��(Y )|= |��(X \ Y )|+ |��(X [ Y )|+ |E�(X,Y )|+ |E�(Y,X)| (3.41)

    Prof. Je↵ Bilmes EE596b/Spring 2016/Submodularity - Lecture 3 - Apr 4th, 2016 F35/63 (pg.67/187)

  • Bit More Notation Info Theory Examples Monge More Definitions Graph & Combinatorial Examples Other Examples

    Directed Cut function: proof of propertyProof.

    We can prove this using a simple geometric counting argument (��(X) issimilar)

    X

    V \ X

    Y

    V \ Y

    X

    V \ X

    Y

    V \ Y

    X

    V \ X

    Y

    V \ Y

    X

    V \ X

    Y

    V \ Y

    (e)

    (e)

    (b)(a)

    (a)

    (b)

    (b)(b)

    (c)

    (c)

    (f )

    (f )

    (g)

    (g)

    (d)

    (d)

    X

    V \ X

    Y

    V \ Y

    X

    V \ X

    Y

    V \ Y

    |�+(X )| |�+(Y )|

    |�+(X \ Y )| |�+(X [ Y )|

    |E+(X ,Y )| |E+(Y ,X )|

    Prof. Je↵ Bilmes EE596b/Spring 2016/Submodularity - Lecture 3 - Apr 4th, 2016 F36/63 (pg.68/187)

  • Bit More Notation Info Theory Examples Monge More Definitions Graph & Combinatorial Examples Other Examples

    Directed cut/flow functions: submodular

    Lemma 3.7.2

    For a digraph G = (V,E) and any X,Y ✓ V : both functions |�+(X)| and|��(X)| are submodular.

    Proof.

    |E+(X,Y )| � 0 and |E�(X,Y )| � 0.

    More generally, in the non-negative weighted case, both in-flow andout-flow are submodular on subsets of the vertices.

    Prof. Je↵ Bilmes EE596b/Spring 2016/Submodularity - Lecture 3 - Apr 4th, 2016 F37/63 (pg.69/187)

    IAK ajgwca ) wcakl

  • Bit More Notation Info Theory Examples Monge More Definitions Graph & Combinatorial Examples Other Examples

    Undirected Cut/Flow & the Neighbor function: submodularLemma 3.7.3

    For an undirected graph G = (V,E) and any X,Y ✓ V : we have that boththe undirected cut (or flow) function |�(X)| and the neighbor function|�(X)| are submodular. I.e.,

    |�(X)|+ |�(Y )| = |�(X \ Y )|+ |�(X [ Y )|+ 2|E(X,Y )| (3.42)

    and

    |�(X)|+ |�(Y )| � |�(X \ Y )|+ |�(X [ Y )| (3.43)

    Proof.

    Eq. (3.42) follows from Eq. (3.40): we replace each undirected edge{u, v} with two oppositely-directed directed edges (u, v) and (v, u).Then we use same counting argument.

    Eq. (3.43) follows as shown in the following page.

    . . .Prof. Je↵ Bilmes EE596b/Spring 2016/Submodularity - Lecture 3 - Apr 4th, 2016 F38/63 (pg.70/187)

  • Bit More Notation Info Theory Examples Monge More Definitions Graph & Combinatorial Examples Other Examples

    Undirected Cut/Flow & the Neighbor function: submodularLemma 3.7.3

    For an undirected graph G = (V,E) and any X,Y ✓ V : we have that boththe undirected cut (or flow) function |�(X)| and the neighbor function|�(X)| are submodular. I.e.,

    |�(X)|+ |�(Y )| = |�(X \ Y )|+ |�(X [ Y )|+ 2|E(X,Y )| (3.42)

    and

    |�(X)|+ |�(Y )| � |�(X \ Y )|+ |�(X [ Y )| (3.43)

    Proof.

    Eq. (3.42) follows from Eq. (3.40): we replace each undirected edge{u, v} with two oppositely-directed directed edges (u, v) and (v, u).Then we use same counting argument.

    Eq. (3.43) follows as shown in the following page.

    . . .Prof. Je↵ Bilmes EE596b/Spring 2016/Submodularity - Lecture 3 - Apr 4th, 2016 F38/63 (pg.71/187)

  • Bit More Notation Info Theory Examples Monge More Definitions Graph & Combinatorial Examples Other Examples

    Undirected Neighbor functioncont.

    X Y(a) (b)

    (c)

    (f )

    (g)

    (h)(e)

    (d)

    Graphically, we can count and see that

    �(X) = (a) + (c) + (f) + (g) + (d) (3.44)

    �(Y ) = (b) + (c) + (e) + (h) + (d) (3.45)

    �(X [ Y ) = (a) + (b) + (c) + (d) (3.46)�(X \ Y ) = (c) + (g) + (h) (3.47)

    so

    |�(X)|+ |�(Y )| = (a) + (b) + 2(c) + 2(d) + (e) + (f) + (g) + (h)� (a) + (b) + 2(c) + (d) + (g) + (h) = |�(X [ Y )|+ |�(X \ Y )| (3.48)

    Prof. Je↵ Bilmes EE596b/Spring 2016/Submodularity - Lecture 3 - Apr 4th, 2016 F39/63 (pg.72/187)

  • Bit More Notation Info Theory Examples Monge More Definitions Graph & Combinatorial Examples Other Examples

    Undirected Neighbor functions

    Therefore, the undirected cut function |�(A)| and the neighbor function|�(A)| of a graph G are both submodular.

    Prof. Je↵ Bilmes EE596b/Spring 2016/Submodularity - Lecture 3 - Apr 4th, 2016 F40/63 (pg.73/187)

  • Bit More Notation Info Theory Examples Monge More Definitions Graph & Combinatorial Examples Other Examples

    Undirected cut/flow is submodular: alternate proofAnother simple proof shows that |�(X)| is submodular.

    Define a graph Guv

    = ({u, v}, {e}, w) with two nodes u, v and oneedge e = {u, v} with non-negative weight w(e) 2 R

    +

    .Cut weight function over those two nodes: w(�

    u,v

    (·)) has valuation:w(�

    u,v

    (;)) = w(�u,v

    ({u, v})) = 0 (3.49)and

    w(�u,v

    ({u})) = w(�u,v

    ({v})) = w � 0 (3.50)Thus, w(�

    u,v

    (·)) is submodular sincew(�

    u,v

    ({u})) + w(�u,v

    ({v})) � w(�u,v

    ({u, v})) + w(�u,v

    (;)) (3.51)General non-negative weighted graph G = (V,E,w), define w(�(·)):

    f(X) = w(�(X)) =X

    (u,v)2E(G)

    w(�u,v

    (X \ {u, v})) (3.52)

    This is easily shown to be submodular using properties we will soon see(namely, submodularity closed under summation and restriction).

    Prof. Je↵ Bilmes EE596b/Spring 2016/Submodularity - Lecture 3 - Apr 4th, 2016 F41/63 (pg.74/187)

  • Bit More Notation Info Theory Examples Monge More Definitions Graph & Combinatorial Examples Other Examples

    Undirected cut/flow is submodular: alternate proofAnother simple proof shows that |�(X)| is submodular.Define a graph G

    uv

    = ({u, v}, {e}, w) with two nodes u, v and oneedge e = {u, v} with non-negative weight w(e) 2 R

    +

    .

    Cut weight function over those two nodes: w(�u,v

    (·)) has valuation:w(�

    u,v

    (;)) = w(�u,v

    ({u, v})) = 0 (3.49)and

    w(�u,v

    ({u})) = w(�u,v

    ({v})) = w � 0 (3.50)Thus, w(�

    u,v

    (·)) is submodular sincew(�

    u,v

    ({u})) + w(�u,v

    ({v})) � w(�u,v

    ({u, v})) + w(�u,v

    (;)) (3.51)General non-negative weighted graph G = (V,E,w), define w(�(·)):

    f(X) = w(�(X)) =X

    (u,v)2E(G)

    w(�u,v

    (X \ {u, v})) (3.52)

    This is easily shown to be submodular using properties we will soon see(namely, submodularity closed under summation and restriction).

    Prof. Je↵ Bilmes EE596b/Spring 2016/Submodularity - Lecture 3 - Apr 4th, 2016 F41/63 (pg.75/187)

  • Bit More Notation Info Theory Examples Monge More Definitions Graph & Combinatorial Examples Other Examples

    Undirected cut/flow is submodular: alternate proofAnother simple proof shows that |�(X)| is submodular.Define a graph G

    uv

    = ({u, v}, {e}, w) with two nodes u, v and oneedge e = {u, v} with non-negative weight w(e) 2 R

    +

    .Cut weight function over those two nodes: w(�

    u,v

    (·)) has valuation:w(�

    u,v

    (;)) = w(�u,v

    ({u, v})) = 0 (3.49)and

    w(�u,v

    ({u})) = w(�u,v

    ({v})) = w � 0 (3.50)

    Thus, w(�u,v

    (·)) is submodular sincew(�

    u,v

    ({u})) + w(�u,v

    ({v})) � w(�u,v

    ({u, v})) + w(�u,v

    (;)) (3.51)General non-negative weighted graph G = (V,E,w), define w(�(·)):

    f(X) = w(�(X)) =X

    (u,v)2E(G)

    w(�u,v

    (X \ {u, v})) (3.52)

    This is easily shown to be submodular using properties we will soon see(namely, submodularity closed under summation and restriction).

    Prof. Je↵ Bilmes EE596b/Spring 2016/Submodularity - Lecture 3 - Apr 4th, 2016 F41/63 (pg.76/187)

  • Bit More Notation Info Theory Examples Monge More Definitions Graph & Combinatorial Examples Other Examples

    Undirected cut/flow is submodular: alternate proofAnother simple proof shows that |�(X)| is submodular.Define a graph G

    uv

    = ({u, v}, {e}, w) with two nodes u, v and oneedge e = {u, v} with non-negative weight w(e) 2 R

    +

    .Cut weight function over those two nodes: w(�

    u,v

    (·)) has valuation:w(�

    u,v

    (;)) = w(�u,v

    ({u, v})) = 0 (3.49)and

    w(�u,v

    ({u})) = w(�u,v

    ({v})) = w � 0 (3.50)Thus, w(�

    u,v

    (·)) is submodular sincew(�

    u,v

    ({u})) + w(�u,v

    ({v})) � w(�u,v

    ({u, v})) + w(�u,v

    (;)) (3.51)

    General non-negative weighted graph G = (V,E,w), define w(�(·)):

    f(X) = w(�(X)) =X

    (u,v)2E(G)

    w(�u,v

    (X \ {u, v})) (3.52)

    This is easily shown to be submodular using properties we will soon see(namely, submodularity closed under summation and restriction).

    Prof. Je↵ Bilmes EE596b/Spring 2016/Submodularity - Lecture 3 - Apr 4th, 2016 F41/63 (pg.77/187)

  • Bit More Notation Info Theory Examples Monge More Definitions Graph & Combinatorial Examples Other Examples

    Undirected cut/flow is submodular: alternate proofAnother simple proof shows that |�(X)| is submodular.Define a graph G

    uv

    = ({u, v}, {e}, w) with two nodes u, v and oneedge e = {u, v} with non-negative weight w(e) 2 R

    +

    .Cut weight function over those two nodes: w(�

    u,v

    (·)) has valuation:w(�

    u,v

    (;)) = w(�u,v

    ({u, v})) = 0 (3.49)and

    w(�u,v

    ({u})) = w(�u,v

    ({v})) = w � 0 (3.50)Thus, w(�

    u,v

    (·)) is submodular sincew(�

    u,v

    ({u})) + w(�u,v

    ({v})) � w(�u,v

    ({u, v})) + w(�u,v

    (;)) (3.51)General non-negative weighted graph G = (V,E,w), define w(�(·)):

    f(X) = w(�(X)) =X

    (u,v)2E(G)

    w(�u,v

    (X \ {u, v})) (3.52)

    This is easily shown to be submodular using properties we will soon see(namely, submodularity closed under summation and restriction).

    Prof. Je↵ Bilmes EE596b/Spring 2016/Submodularity - Lecture 3 - Apr 4th, 2016 F41/63 (pg.78/187)

  • Bit More Notation Info Theory Examples Monge More Definitions Graph & Combinatorial Examples Other Examples

    Undirected cut/flow is submodular: alternate proofAnother simple proof shows that |�(X)| is submodular.Define a graph G

    uv

    = ({u, v}, {e}, w) with two nodes u, v and oneedge e = {u, v} with non-negative weight w(e) 2 R

    +

    .Cut weight function over those two nodes: w(�

    u,v

    (·)) has valuation:w(�

    u,v

    (;)) = w(�u,v

    ({u, v})) = 0 (3.49)and

    w(�u,v

    ({u})) = w(�u,v

    ({v})) = w � 0 (3.50)Thus, w(�

    u,v

    (·)) is submodular sincew(�

    u,v

    ({u})) + w(�u,v

    ({v})) � w(�u,v

    ({u, v})) + w(�u,v

    (;)) (3.51)General non-negative weighted graph G = (V,E,w), define w(�(·)):

    f(X) = w(�(X)) =X

    (u,v)2E(G)

    w(�u,v

    (X \ {u, v})) (3.52)

    This is easily shown to be submodular using properties we will soon see(namely, submodularity closed under summation and restriction).

    Prof. Je↵ Bilmes EE596b/Spring 2016/Submodularity - Lecture 3 - Apr 4th, 2016 F41/63 (pg.79/187)

  • Bit More Notation Info Theory Examples Monge More Definitions Graph & Combinatorial Examples Other Examples

    Other graph functions that are submodular/supermodular

    These come from Narayanan’s book 1997. Let G be an undirected graph.

    Let V (X) be the vertices adjacent to some edge in X ✓ E(G), then|V (X)| (the vertex function) is submodular.

    Let E(S) be the edges with both vertices in S ✓ V (G). Then |E(S)|(the interior edge function) is supermodular.Let I(S) be the edges with at least one vertex in S ✓ V (G). Then|I(S)| (the incidence function) is submodular.Recall |�(S)|, is the set size of edges with exactly one vertex inS ✓ V (G) is submodular (cut size function). Thus, we haveI(S) = E(S) [ �(S) and E(S) \ �(S) = ;, and thus that|I(S)| = |E(S)|+ |�(S)|.

    So we can get a submodular function bysumming a submodular and a supermodular function. If you had toguess, is this always the case?

    Consider f(A) = |�+(A)|� |�+(V \A)|. Guess, submodular,supermodular, modular, or neither? Exercise: determine which one andprove it.

    Prof. Je↵ Bilmes EE596b/Spring 2016/Submodularity - Lecture 3 - Apr 4th, 2016 F42/63 (pg.80/187)

  • Bit More Notation Info Theory Examples Monge More Definitions Graph & Combinatorial Examples Other Examples

    Other graph functions that are submodular/supermodular

    These come from Narayanan’s book 1997. Let G be an undirected graph.

    Let V (X) be the vertices adjacent to some edge in X ✓ E(G), then|V (X)| (the vertex function) is submodular.Let E(S) be the edges with both vertices in S ✓ V (G). Then |E(S)|(the interior edge function) is supermodular.

    Let I(S) be the edges with at least one vertex in S ✓ V (G). Then|I(S)| (the incidence function) is submodular.Recall |�(S)|, is the set size of edges with exactly one vertex inS ✓ V (G) is submodular (cut size function). Thus, we haveI(S) = E(S) [ �(S) and E(S) \ �(S) = ;, and thus that|I(S)| = |E(S)|+ |�(S)|.

    So we can get a submodular function bysumming a submodular and a supermodular function. If you had toguess, is this always the case?

    Consider f(A) = |�+(A)|� |�+(V \A)|. Guess, submodular,supermodular, modular, or neither? Exercise: determine which one andprove it.

    Prof. Je↵ Bilmes EE596b/Spring 2016/Submodularity - Lecture 3 - Apr 4th, 2016 F42/63 (pg.81/187)

    p.Ess

    ' ¥€#.

  • Bit More Notation Info Theory Examples Monge More Definitions Graph & Combinatorial Examples Other Examples

    Other graph functions that are submodular/supermodular

    These come from Narayanan’s book 1997. Let G be an undirected graph.

    Let V (X) be the vertices adjacent to some edge in X ✓ E(G), then|V (X)| (the vertex function) is submodular.Let E(S) be the edges with both vertices in S ✓ V (G). Then |E(S)|(the interior edge function) is supermodular.Let I(S) be the edges with at least one vertex in S ✓ V (G). Then|I(S)| (the incidence function) is submodular.

    Recall |�(S)|, is the set size of edges with exactly one vertex inS ✓ V (G) is submodular (cut size function). Thus, we haveI(S) = E(S) [ �(S) and E(S) \ �(S) = ;, and thus that|I(S)| = |E(S)|+ |�(S)|.

    So we can get a submodular function bysumming a submodular and a supermodular function. If you had toguess, is this always the case?

    Consider f(A) = |�+(A)|� |�+(V \A)|. Guess, submodular,supermodular, modular, or neither? Exercise: determine which one andprove it.

    Prof. Je↵ Bilmes EE596b/Spring 2016/Submodularity - Lecture 3 - Apr 4th, 2016 F42/63 (pg.82/187)

    QE

  • Bit More Notation Info Theory Examples Monge More Definitions Graph & Combinatorial Examples Other Examples

    Other graph functions that are submodular/supermodular

    These come from Narayanan’s book 1997. Let G be an undirected graph.

    Let V (X) be the vertices adjacent to some edge in X ✓ E(G), then|V (X)| (the vertex function) is submodular.Let E(S) be the edges with both vertices in S ✓ V (G). Then |E(S)|(the interior edge function) is supermodular.Let I(S) be the edges with at least one vertex in S ✓ V (G). Then|I(S)| (the incidence function) is submodular.Recall |�(S)|, is the set size of edges with exactly one vertex inS ✓ V (G) is submodular (cut size function). Thus, we haveI(S) = E(S) [ �(S) and E(S) \ �(S) = ;, and thus that|I(S)| = |E(S)|+ |�(S)|.

    So we can get a submodular function bysumming a submodular and a supermodular function. If you had toguess, is this always the case?Consider f(A) = |�+(A)|� |�+(V \A)|. Guess, submodular,supermodular, modular, or neither? Exercise: determine which one andprove it.

    Prof. Je↵ Bilmes EE596b/Spring 2016/Submodularity - Lecture 3 - Apr 4th, 2016 F42/63 (pg.83/187)

  • Bit More Notation Info Theory Examples Monge More Definitions Graph & Combinatorial Examples Other Examples

    Other graph functions that are submodular/supermodular

    These come from Narayanan’s book 1997. Let G be an undirected graph.

    Let V (X) be the vertices adjacent to some edge in X ✓ E(G), then|V (X)| (the vertex function) is submodular.Let E(S) be the edges with both vertices in S ✓ V (G). Then |E(S)|(the interior edge function) is supermodular.Let I(S) be the edges with at least one vertex in S ✓ V (G). Then|I(S)| (the incidence function) is submodular.Recall |�(S)|, is the set size of edges with exactly one vertex inS ✓ V (G) is submodular (cut size function). Thus, we haveI(S) = E(S) [ �(S) and E(S) \ �(S) = ;, and thus that|I(S)| = |E(S)|+ |�(S)|. So we can get a submodular function bysumming a submodular and a supermodular function.

    If you had toguess, is this always the case?Consider f(A) = |�+(A)|� |�+(V \A)|. Guess, submodular,supermodular, modular, or neither? Exercise: determine which one andprove it.

    Prof. Je↵ Bilmes EE596b/Spring 2016/Submodularity - Lecture 3 - Apr 4th, 2016 F42/63 (pg.84/187)

  • Bit More Notation Info Theory Examples Monge More Definitions Graph & Combinatorial Examples Other Examples

    Other graph functions that are submodular/supermodular

    These come from Narayanan’s book 1997. Let G be an undirected graph.

    Let V (X) be the vertices adjacent to some edge in X ✓ E(G), then|V (X)| (the vertex function) is submodular.Let E(S) be the edges with both vertices in S ✓ V (G). Then |E(S)|(the interior edge function) is supermodular.Let I(S) be the edges with at least one vertex in S ✓ V (G). Then|I(S)| (the incidence function) is submodular.Recall |�(S)|, is the set size of edges with exactly one vertex inS ✓ V (G) is submodular (cut size function). Thus, we haveI(S) = E(S) [ �(S) and E(S) \ �(S) = ;, and thus that|I(S)| = |E(S)|+ |�(S)|. So we can get a submodular function bysumming a submodular and a supermodular function. If you had toguess, is this always the case?

    Consider f(A) = |�+(A)|� |�+(V \A)|. Guess, submodular,supermodular, modular, or neither? Exercise: determine which one andprove it.

    Prof. Je↵ Bilmes EE596b/Spring 2016/Submodularity - Lecture 3 - Apr 4th, 2016 F42/63 (pg.85/187)

  • Bit More Notation Info Theory Examples Monge More Definitions Graph & Combinatorial Examples Other Examples

    Other graph functions that are submodular/supermodular

    These come from Narayanan’s book 1997. Let G be an undirected graph.

    Let V (X) be the vertices adjacent to some edge in X ✓ E(G), then|V (X)| (the vertex function) is submodular.Let E(S) be the edges with both vertices in S ✓ V (G). Then |E(S)|(the interior edge function) is supermodular.Let I(S) be the edges with at least one vertex in S ✓ V (G). Then|I(S)| (the incidence function) is submodular.Recall |�(S)|, is the set size of edges with exactly one vertex inS ✓ V (G) is submodular (cut size function). Thus, we haveI(S) = E(S) [ �(S) and E(S) \ �(S) = ;, and thus that|I(S)| = |E(S)|+ |�(S)|. So we can get a submodular function bysumming a submodular and a supermodular function. If you had toguess, is this always the case?Consider f(A) = |�+(A)|� |�+(V \A)|. Guess, submodular,supermodular, modular, or neither? Exercise: determine which one andprove it.

    Prof. Je↵ Bilmes EE596b/Spring 2016/Submodularity - Lecture 3 - Apr 4th, 2016 F42/63 (pg.86/187)

  • Bit More Notation Info Theory Examples Monge More Definitions Graph & Combinatorial Examples Other Examples

    Number of connected components in a graph via edges

    Recall, f : 2V ! R is submodular, then so is ¯f : 2V ! R defined as¯f(S) = f(V \ S).

    Hence, if f : 2V ! R is supermodular, then so is ¯f : 2V ! R definedas ¯f(S) = f(V \ S).Given a graph G = (V,E), for each A ✓ E(G), let c(A) denote thenumber of connected components of the (spanning) subgraph(V (G), A), with c : 2E ! R

    +

    .c(A) is monotone non-increasing, c(A+ a)� c(A) 0 .Then c(A) is supermodular, i.e.,

    c(A+ a)� c(A) c(B + a)� c(B) (3.53)with A ✓ B ✓ E \ {a}.Intuition: an edge is “more” (no less) able to bridge separatecomponents (and reduce the number of conected components) whenedge is added in a smaller context than when added in a larger context.c̄(A) = c(E \A) is the number of connected components in G whenwe remove A, so is also supermodular, but monotone non-decreasing.

    Prof. Je↵ Bilmes EE596b/Spring 2016/Submodularity - Lecture 3 - Apr 4th, 2016 F43/63 (pg.87/187)

  • Bit More Notation Info Theory Examples Monge More Definitions Graph & Combinatorial Examples Other Examples

    Number of connected components in a graph via edges

    Recall, f : 2V ! R is submodular, then so is ¯f : 2V ! R defined as¯f(S) = f(V \ S).Hence, if f : 2V ! R is supermodular, then so is ¯f : 2V ! R definedas ¯f(S) = f(V \ S).

    Given a graph G = (V,E), for each A ✓ E(G), let c(A) denote thenumber of connected components of the (spanning) subgraph(V (G), A), with c : 2E ! R

    +

    .c(A) is monotone non-increasing, c(A+ a)� c(A) 0 .Then c(A) is supermodular, i.e.,

    c(A+ a)� c(A) c(B + a)� c(B) (3.53)with A ✓ B ✓ E \ {a}.Intuition: an edge is “more” (no less) able to bridge separatecomponents (and reduce the number of conected components) whenedge is added in a smaller context than when added in a larger context.c̄(A) = c(E \A) is the number of connected components in G whenwe remove A, so is also supermodular, but monotone non-decreasing.

    Prof. Je↵ Bilmes EE596b/Spring 2016/Submodularity - Lecture 3 - Apr 4th, 2016 F43/63 (pg.88/187)

  • Bit More Notation Info Theory Examples Monge More Definitions Graph & Combinatorial Examples Other Examples

    Number of connected components in a graph via edges

    Recall, f : 2V ! R is submodular, then so is ¯f : 2V ! R defined as¯f(S) = f(V \ S).Hence, if f : 2V ! R is supermodular, then so is ¯f : 2V ! R definedas ¯f(S) = f(V \ S).Given a graph G = (V,E), for each A ✓ E(G), let c(A) denote thenumber of connected components of the (spanning) subgraph(V (G), A), with c : 2E ! R

    +

    .

    c(A) is monotone non-increasing, c(A+ a)� c(A) 0 .Then c(A) is supermodular, i.e.,

    c(A+ a)� c(A) c(B + a)� c(B) (3.53)with A ✓ B ✓ E \ {a}.Intuition: an edge is “more” (no less) able to bridge separatecomponents (and reduce the number of conected components) whenedge is added in a smaller context than when added in a larger context.c̄(A) = c(E \A) is the number of connected components in G whenwe remove A, so is also supermodular, but monotone non-decreasing.

    Prof. Je↵ Bilmes EE596b/Spring 2016/Submodularity - Lecture 3 - Apr 4th, 2016 F43/63 (pg.89/187)

    o# ⇐ ¥at

  • Bit More Notation Info Theory Examples Monge More Definitions Graph & Combinatorial Examples Other Examples

    Number of connected components in a graph via edges

    Recall, f : 2V ! R is submodular, then so is ¯f : 2V ! R defined as¯f(S) = f(V \ S).Hence, if f : 2V ! R is supermodular, then so is ¯f : 2V ! R definedas ¯f(S) = f(V \ S).Given a graph G = (V,E), for each A ✓ E(G), let c(A) denote thenumber of connected components of the (spanning) subgraph(V (G), A), with c : 2E ! R

    +

    .c(A) is monotone non-increasing, c(A+ a)� c(A) 0 .

    Then c(A) is supermodular, i.e.,

    c(A+ a)� c(A) c(B + a)� c(B) (3.53)with A ✓ B ✓ E \ {a}.Intuition: an edge is “more” (no less) able to bridge separatecomponents (and reduce the number of conected components) whenedge is added in a smaller context than when added in a larger context.c̄(A) = c(E \A) is the number of connected components in G whenwe remove A, so is also supermodular, but monotone non-decreasing.

    Prof. Je↵ Bilmes EE596b/Spring 2016/Submodularity - Lecture 3 - Apr 4th, 2016 F43/63 (pg.90/187)

  • Bit More Notation Info Theory Examples Monge More Definitions Graph & Combinatorial Examples Other Examples

    Number of connected components in a graph via edges

    Recall, f : 2V ! R is submodular, then so is ¯f : 2V ! R defined as¯f(S) = f(V \ S).Hence, if f : 2V ! R is supermodular, then so is ¯f : 2V ! R definedas ¯f(S) = f(V \ S).Given a graph G = (V,E), for each A ✓ E(G), let c(A) denote thenumber of connected components of the (spanning) subgraph(V (G), A), with c : 2E ! R

    +

    .c(A) is monotone non-increasing, c(A+ a)� c(A) 0 .Then c(A) is supermodular, i.e.,

    c(A+ a)� c(A) c(B + a)� c(B) (3.53)with A ✓ B ✓ E \ {a}.

    Intuition: an edge is “more” (no less) able to bridge separatecomponents (and reduce the number of conected components) whenedge is added in a smaller context than when added in a larger context.c̄(A) = c(E \A) is the number of connected components in G whenwe remove A, so is also supermodular, but monotone non-decreasing.

    Prof. Je↵ Bilmes EE596b/Spring 2016/Submodularity - Lecture 3 - Apr 4th, 2016 F43/63 (pg.91/187)

    @-

  • Bit More Notation Info Theory Examples Monge More Definitions Graph & Combinatorial Examples Other Examples

    Number of connected components in a graph via edges

    Recall, f : 2V ! R is submodular, then so is ¯f : 2V ! R defined as¯f(S) = f(V \ S).Hence, if f : 2V ! R is supermodular, then so is ¯f : 2V ! R definedas ¯f(S) = f(V \ S).Given a graph G = (V,E), for each A ✓ E(G), let c(A) denote thenumber of connected components of the (spanning) subgraph(V (G), A), with c : 2E ! R

    +

    .c(A) is monotone non-increasing, c(A+ a)� c(A) 0 .Then c(A) is supermodular, i.e.,

    c(A+ a)� c(A) c(B + a)� c(B) (3.53)with A ✓ B ✓ E \ {a}.Intuition: an edge is “more” (no less) able to bridge separatecomponents (and reduce the number of conected components) whenedge is added in a smaller context than when added in a larger context.

    c̄(A) = c(E \A) is the number of connected components in G whenwe remove A, so is also supermodular, but monotone non-decreasing.

    Prof. Je↵ Bilmes EE596b/Spring 2016/Submodularity - Lecture 3 - Apr 4th, 2016 F43/63 (pg.92/187)

  • Bit More Notation Info Theory Examples Monge More Definitions Graph & Combinatorial Examples Other Examples

    Number of connected components in a graph via edges

    Recall, f : 2V ! R is submodular, then so is ¯f : 2V ! R defined as¯f(S) = f(V \ S).Hence, if f : 2V ! R is supermodular, then so is ¯f : 2V ! R definedas ¯f(S) = f(V \ S).Given a graph G = (V,E), for each A ✓ E(G), let c(A) denote thenumber of connected components of the (spanning) subgraph(V (G), A), with c : 2E ! R

    +

    .c(A) is monotone non-increasing, c(A+ a)� c(A) 0 .Then c(A) is supermodular, i.e.,

    c(A+ a)� c(A) c(B + a)� c(B) (3.53)with A ✓ B ✓ E \ {a}.Intuition: an edge is “more” (no less) able to bridge separatecomponents (and reduce the number of conected components) whenedge is added in a smaller context than when added in a larger context.c̄(A) = c(E \A) is the number of connected components in G whenwe remove A, so is also supermodular, but monotone non-decreasing.

    Prof. Je↵ Bilmes EE596b/Spring 2016/Submodularity - Lecture 3 - Apr 4th, 2016 F43/63 (pg.93/187)

  • Bit More Notation Info Theory Examples Monge More Definitions Graph & Combinatorial Examples Other Examples

    Graph Strength

    So c̄(A) = c(E \A) is the number of connected components in Gwhen we remove A, is supermodular.

    Maximizing c̄(A) might seem as a goal for a network attacker — manyconnected components means that many points in the network havelost connectivity to many other points (unprotected network).

    If we can remove a small set A and shatter the graph into manyconnected components, then the graph is weak.

    An attacker wishes to choose a small number of edges (since it ischeap) to shatter the graph into as many components as possible.

    Let G = (V,E,w) with w : E ! R+ be a weighted graph withnon-negative weights.

    For (u, v) = e 2 E, let w(e) be a measure of the strength of theconnection between vertices u and v (strength meaning the di�cultyof cutting the edge e).

    Prof. Je↵ Bilmes EE596b/Spring 2016/Submodularity - Lecture 3 - Apr 4th, 2016 F44/63 (pg.94/187)

  • Bit More Notation Info Theory Examples Monge More Definitions Graph & Combinatorial Examples Other Examples

    Graph Strength

    So c̄(A) = c(E \A) is the number of connected components in Gwhen we remove A, is supermodular.

    Maximizing c̄(A) might seem as a goal for a network attacker — manyconnected components means that many points in the network havelost connectivity to many other points (unprotected network).

    If we can remove a small set A and shatter the graph into manyconnected components, then the graph is weak.

    An attacker wishes to choose a small number of edges (since it ischeap) to shatter the graph into as many components as possible.

    Let G = (V,E,w) with w : E ! R+ be a weighted graph withnon-negative weights.

    For (u, v) = e 2 E, let w(e) be a measure of the strength of theconnection between vertices u and v (strength meaning the di�cultyof cutting the edge e).

    Prof. Je↵ Bilmes EE596b/Spring 2016/Submodularity - Lecture 3 - Apr 4th, 2016 F44/63 (pg.95/187)

  • Bit More Notation Info Theory Examples Monge More Definitions Graph & Combinatorial Examples Other Examples

    Graph Strength

    So c̄(A) = c(E \A) is the number of connected components in Gwhen we remove A, is supermodular.

    Maximizing c̄(A) might seem as a goal for a network attacker — manyconnected components means that many points in the network havelost connectivity to many other points (unprotected network).

    If we can remove a small set A and shatter the graph into manyconnected components, then the graph is weak.

    An attacker wishes to choose a small number of edges (since it ischeap) to shatter the graph into as many components as possible.

    Let G = (V,E,w) with w : E ! R+ be a weighted graph withnon-negative weights.

    For (u, v) = e 2 E, let w(e) be a measure of the strength of theconnection between vertices u and v (strength meaning the di�cultyof cutting the edge e).

    Prof. Je↵ Bilmes EE596b/Spring 2016/Submodularity - Lecture 3 - Apr 4th, 2016 F44/63 (pg.96/187)

  • Bit More Notation Info Theory Examples Monge More Definitions Graph & Combinatorial Examples Other Examples

    Graph Strength

    So c̄(A) = c(E \A) is the number of connected components in Gwhen we remove A, is supermodular.

    Maximizing c̄(A) might seem as a goal for a network attacker — manyconnected components means that many points in the network havelost connectivity to many other points (unprotected network).

    If we can remove a small set A and shatter the graph into manyconnected components, then the graph is weak.

    An attacker wishes to choose a small number of edges (since it ischeap) to shatter the graph into as many components as possible.

    Let G = (V,E,w) with w : E ! R+ be a weighted graph withnon-negative weights.

    For (u, v) = e 2 E, let w(e) be a measure of the strength of theconnection between vertices u and v (strength meaning the di�cultyof cutting the edge e).

    Prof. Je↵ Bilmes EE596b/Spring 2016/Submodularity - Lecture 3 - Apr 4th, 2016 F44/63 (pg.97/187)

  • Bit More Notation Info Theory Examples Monge More Definitions Graph & Combinatorial Examples Other Examples

    Graph Strength

    So c̄(A) = c(E \A) is the number of connected components in Gwhen we remove A, is supermodular.

    Maximizing c̄(A) might seem as a goal for a network attacker — manyconnected components means that many points in the network havelost connectivity to many other points (unprotected network).

    If we can remove a small set A and shatter the graph into manyconnected components, then the graph is weak.

    An attacker wishes to choose a small number of edges (since it ischeap) to shatter the graph into as many components as possible.

    Let G = (V,E,w) with w : E ! R+ be a weighted graph withnon-negative weights.

    For (u, v) = e 2 E, let w(e) be a measure of the strength of theconnection between vertices u and v (strength meaning the di�cultyof cutting the edge e).

    Prof. Je↵ Bilmes EE596b/Spring 2016/Submodularity - Lecture 3 - Apr 4th, 2016 F44/63 (pg.98/187)

  • Bit More Notation Info Theory Examples Monge More Definitions Graph & Combinatorial Examples Other Examples

    Graph Strength

    So c̄(A) = c(E \A) is the number of connected components in Gwhen we remove A, is supermodular.

    Maximizing c̄(A) might seem as a goal for a network attacker — manyconnected components means that many points in the network havelost connectivity to many other points (unprotected network).

    If we can remove a small set A and shatter the graph into manyconnected components, then the graph is weak.

    An attacker wishes to choose a small number of edges (since it ischeap) to shatter the graph into as many components as possible.

    Let G = (V,E,w) with w : E ! R+ be a weighted graph withnon-negative weights.

    For (u, v) = e 2 E, let w(e) be a measure of the strength of theconnection between vertices u and v (strength meaning the di�cultyof cutting the edge e).

    Prof. Je↵ Bilmes EE596b/Spring 2016/Submodularity - Lecture 3 - Apr 4th, 2016 F44/63 (pg.99/187)

  • Bit More Notation Info Theory Examples Monge More Definitions Graph & Combinatorial Examples Other Examples

    Graph Strength

    Then w(A) for A ✓ E is a modular function

    w(A) =X

    e2Awe

    (3.54)

    so that w(E(G[S])) is the “internal strength” of the vertex set S.Notation: S is a set of nodes, G[S] is the vertex-induced subgraph of G induced byvertices S, E(G[S]) are the edges contained within this induced subgraph, andw(E(G[S])) is the weight of these edges.

    Suppose removing A shatters G into a graph with c̄(A) > 1components — then w(A)/(c̄(A)� 1) is like the “e↵ort per achievedcomponent” for a network attacker.A form of graph strength can then be defined as the following:

    strength(G,w) = minA✓E(G):c̄(A)>1

    w(A)

    c̄(A)� 1 (3.55)

    Graph strength is like the minimum e↵ort per component. An attackerwould use the argument of the min to choose which edges to attack. Anetwork designer would maximize, over G and/or w, the graphstrength, strength(G,w).Since submodularity, problems have strongly-poly-time solutions.

    Prof. Je↵ Bilmes EE596b/Spring 2016/Submodularity - Lecture 3 - Apr 4th, 2016 F45/63 (pg.100/187)

  • Bit More Notation Info Theory Examples Monge More Definitions Graph & Combinatorial Examples Other Examples

    Graph Strength

    Then w(A) for A ✓ E is a modular function

    w(A) =X

    e2Awe

    (3.54)

    so that w(E(G[S])) is the “internal strength” of the vertex set S.Suppose removing A shatters G into a graph with c̄(A) > 1components — then w(A)/(c̄(A)� 1) is like the “e↵ort per achievedcomponent” for a network attacker.

    A form of graph strength can then be defined as the following:

    strength(G,w) = minA✓E(G):c̄(A)>1

    w(A)

    c̄(A)� 1 (3.55)

    Graph strength is like the minimum e↵ort per component. An attackerwould use the argument of the min to choose which edges to attack. Anetwork designer would maximize, over G and/or w, the graphstrength, strength(G,w).Since submodularity, problems have strongly-poly-time solutions.

    Prof. Je↵ Bilmes EE596b/Spring 2016/Submodularity - Lecture 3 - Apr 4th, 2016 F45/63 (pg.101/187)

  • Bit More Notation Info Theory Examples Monge More Definitions Graph & Combinatorial Examples Other Examples

    Graph Strength

    Then w(A) for A ✓ E is a modular function

    w(A) =X

    e2Awe

    (3.54)

    so that w(E(G[S])) is the “internal strength” of the vertex set S.Suppose removing A shatters G into a graph with c̄(A) > 1components — then w(A)/(c̄(A)� 1) is like the “e↵ort per achievedcomponent” for a network attacker.A form of graph strength can then be defined as the following:

    strength(G,w) = minA✓E(G):c̄(A)>1

    w(A)

    c̄(A)� 1 (3.55)

    Graph strength is like the minimum e↵ort per component. An attackerwould use the argument of the min to choose which edges to attack. Anetwork designer would maximize, over G and/or w, the graphstrength, strength(G,w).Since submodularity, problems have strongly-poly-time solutions.

    Prof. Je↵ Bilmes EE596b/Spring 2016/Submodularity - Lecture 3 - Apr 4th, 2016 F45/63 (pg.102/187)

  • Bit More Notation Info Theory Examples Monge More Definitions Graph & Combinatorial Examples Other Examples

    Graph Strength

    Then w(A) for A ✓ E is a modular function

    w(A) =X

    e2Awe

    (3.54)

    so that w(E(G[S])) is the “internal stren