Upload
others
View
1
Download
0
Embed Size (px)
Citation preview
Michael Robinson
Measuring and Exploiting Data-Model Fit
with Sheaves
Michael Robinson
Acknowledgements● Collaborators:
– Cliff Joslyn, Katy Nowak, Brenda Praggastis, Emilie Purvine (PNNL)– Chris Capraro, Grant Clarke, Janelle Henrich (SRC)– Jason Summers, Charlie Gaumond (ARiA)– J Smart, Dave Bridgeland (Georgetown)
● Students:– Eyerusalem Abebe– Olivia Chen– Philip Coyle– Brian DiZio
● Recent funding: DARPA, ONR, AFRL
– Jen Dumiak– Samara Fantie– Sean Fennell– Robby Green
– Noah Kim– Fangfei Lan– Metin Toksoz-Exley– Jackson Williams– Greg Young
Michael Robinson
The middle way
Statistics
Topology
Geometry
Michael Robinson
Key points● Systems can be encoded as sheaves● Datasets are assignments to a sheaf model of a system● Consistency radius measures compatibility between
system and dataset● Statistical inference minimizes consistency radius
– Regression (parameter estimation, learning, model selection) varies the sheaf
– Data fusion (prediction) varies the assignment● Topological properties of the minimization problem
may aid model selection and/or data cleaning
Michael Robinson
Process flow diagramSystem model
Sheaf
Observations
Assignment
Consistency radiusPaired (Sheaf,Assignment)
Regression
Data fusion
“Syntax”
“Semantics”
Michael Robinson
CategoriallySystem model
Shv
Observations
Pseud
Consistency radiusShvFPA
Regression
Data fusion
“Syntax”
“Semantics”
(Faithful)
(Faithful)
(Forgetful)
(Forgetful)
Michael Robinson
Geometrically/topologicallySystem model
Shv
Observations
Pseud
ℝShvFPA
Regression
Data fusion
“Syntax”
“Semantics”
(Faithful)
(Faithful)
(Continuous)
(Continuous)
(Continuous)
(Note: all of this within an isomorphism class of sheaves)
Michael Robinson
What is a sheaf?
A sheaf of _____________ on a ______________(data type) (topological space)
Michael Robinson
What is a sheaf?
A sheaf of _____________ on a ______________(data type) (topological space)
pseudometric spaces partial order
Michael Robinson
Topologizing a partial order
Open sets are unionsof up-sets
Michael Robinson
Topologizing a partial order
Intersectionsof up-sets are alsoup-sets
Michael Robinson
( )( )
A sheaf on a poset is...
A set assigned to each element, calleda stalk, and …
ℝ
ℝ2 ℝ2
ℝ3
0 1 11 0 1
1 0 10 1 1
(1 0)(0 1)
ℝ ℝ2
ℝ2
(1 -1) ( )0 11 0
( )-3 3-4 4
This is a sheaf of vector spaces on a partial order
ℝ3
( )0 1 11 0 1
ℝ
(-2 1)
(The stalk on an element in the posetis better thought of beingassociated to the up-set)
231
2 -23 -31 -1
Michael Robinson
( )( )
A sheaf on a poset is...
… restriction functions between stalks, following the order relation…
ℝ
ℝ2 ℝ2
ℝ3
0 1 11 0 1
1 0 10 1 1
(1 0)(0 1)
ℝ ℝ2
ℝ2
(1 -1)
231
( )0 11 0
( )-3 3-4 4
2 -23 -31 -1
This is a sheaf of vector spaces on a partial order
ℝ3
( )0 1 11 0 1
ℝ
(-2 1)
(“Restriction” because it goes frombigger up-sets to smaller ones)
Michael Robinson
( )( )
A sheaf on a poset is...ℝ
ℝ2 ℝ2
ℝ3
0 1 11 0 1
1 0 10 1 1
(1 0)(0 1)
ℝ ℝ2
ℝ2
(1 -1) ( )0 11 0
( )-3 3-4 4
This is a sheaf of vector spaces on a partial order
ℝ3
( )0 1 11 0 1
ℝ
(-2 1)
(1 -1) =
( )0 1 11 0 1(1 0) = (0 1) ( )1 0 1
0 1 1
( )0 11 0( )-3 3
-4 4( )1 0 10 1 1
2 -23 -31 -1
=
… so that the diagramcommutes!
231
231
2 -23 -31 -1
2 -23 -31 -1
Michael Robinson
( )( )
An assignment is...
… the selection of avalue on some open sets
0 1 11 0 1
1 0 10 1 1
(1 0)(0 1)
(1 -1)
231
( )0 11 0
( )-3 3-4 4
2 -23 -31 -1 ( )0 1 1
1 0 1
(-2 1)
(-1)
( )23
( )-4-3
( )-3-4
( )32
(-4)
(-4)
230
-2-3-1
The term serration is more common, but perhaps more opaque.
Michael Robinson
( )( )
A global section is...
… an assignmentthat is consistent with the restrictions
0 1 11 0 1
1 0 10 1 1
(1 0)(0 1)
(1 -1)
231
( )0 11 0
( )-3 3-4 4
2 -23 -31 -1 ( )0 1 1
1 0 1
(-2 1)
(-1)
( )23
( )-4-3
( )-3-4
( )32
(-4)
(-4)
230
-2-3-1
Michael Robinson
( )( )
Some assignments aren’t consistent
… but they mightbe partially consistent
0 1 11 0 1
1 0 10 1 1
(1 0)(0 1)
(1 -1)
231
( )0 11 0
( )-3 3-4 4
2 -23 -31 -1 ( )0 1 1
1 0 1
(-2 1)
(+1)
( )23
( )-4-3
( )-3-4
( )32
(-4)
(-4)
231
-2-3-1
Michael Robinson
( )( )
Consistency radius is...… the maximum (or some other norm)distance between the value in a stalk and the values propagated along the restrictions
0 1 11 0 1
1 0 10 1 1
(1 0)(0 1)
(1 -1)
231
( )0 11 0
( )-3 3-4 4
2 -23 -31 -1 ( )0 1 1
1 0 1
(-2 1)
(+1)
( )23
( )-4-3
( )-3-4
( )32
(-4)
(-4)
231
-2-3-1
231( )0 1 1
1 0 1 ( )32- = 2
( )23(1 -1) - 1 = 2
(+1) - 231
-2-3 = 2 14-1 MAX ≥ 2 14
Note: lots more restrictions to check!
Michael Robinson
( )( )
Consistency radius is monotonic
Lemma: Consistency radius on an open set U is computed by only considering open sets V1 ⊆ V2 ⊆ U
0 1 11 0 1
1 0 10 1 1
(1 0)(0 1)
(1 -1)
231
( )0 11 0
( )-3 3-4 4
2 -23 -31 -1 ( )0 1 1
1 0 1
(-2 1)
(+1)
( )23
( )-4-3
( )-3-4
( )32
(-4)
(-4)
231
-2-3-1
Note: lots more restrictions to check!
ConsistencyRadius = 0
Michael Robinson
( )( )
Consistency radius is monotonicProposition: If U ⊆ V then c(U) ≤ c(V)
0 1 11 0 1
1 0 10 1 1
(1 0)(0 1)
(1 -1)
231
( )0 11 0
( )-3 3-4 4
2 -23 -31 -1 ( )0 1 1
1 0 1
(-2 1)
(+1)
( )23
( )-4-3
( )-3-4
( )32
(-4)
(-4)
231
-2-3-1
Note: lots more restrictions to check!
ConsistencyRadius = 2 14
Consistencyradius restricted toan open set U
Michael Robinson
Consistency radius optimization
ℝ ℝ
ℝ
ℝ(2)
(1)
(1)
(1)
(½)
This is a sheaf on a small poset
A B
C
Michael Robinson
Consistency radius optimization
0 1
?
?(2)
(1)
(1)
(1)
(½)
Here is an assignment supported on part of it
A B
C
Michael Robinson
Consistency radius optimization
0 1
?(2)
(1)
(1)
(1)
(½)½
Minimizing the consistency radius when extending globally
A B
C
Michael Robinson
Consistency radius optimization
⅓
?(2)
(1)
(1)
(1)
(½)⅓
Here is the closest global section (everything can be changed)
⅔A B
C
Michael Robinson
Consistency radius is not a measure
ℝ ℝ
ℝ
ℝ(2)
(1)
(1)
(1)
(½)
This is the full sheaf diagram including all open sets in theAlexandrov topology, not just the base
{A,C} {B,C}
{C}
{A,B,C}
Michael Robinson
Consistency radius is not a measure
0 1
?
?(2)
(1)
(1)
(1)
(½)
{A,C} {B,C}
{C}
{A,B,C}
Michael Robinson
Consistency radius is not a measure
0 1
½
⅓(2)
(1)
(1)
(1)
(½)
Minimizing the consistency radius when extendingThe value on the intersection is no longer unique!
This value can be anything between ⅓ and ⅔
{A,C} {B,C}
{C}
{A,B,C}
Michael Robinson
Consistency radius is not a measure
0 1
½
⅓(2)
(1)
(1)
(1)
(½)
Consistency radius of this open set = 0
Lemma: Consistency radius on an open set U is computed by only considering open sets V1 ⊆ V2 ⊆ U
{A,C} {B,C}
{C}
{A,B,C}
Michael Robinson
Consistency radius is not a measure
0 1
½
⅓(2)
(1)
(1)
(1)
(½)
c(U) = ½
Lemma: Consistency radius on an open set U is computed by only considering open sets V1 ⊆ V2 ⊆ U
Michael Robinson
Consistency radius is not a measure
0 1
½
⅓(2)
(1)
(1)
(1)
(½)
c(U) = ½ c(V) = ½
Lemma: Consistency radius on an open set U is computed by only considering open sets V1 ⊆ V2 ⊆ U
Michael Robinson
Consistency radius is not a measure
0 1
½
⅓(2)
(1)
(1)
(1)
(½)
c(U) = ½ c(V) = ½
c(U ∩ V) = 0
c(U ∪ V) = ⅔ ≠ c(U) + c(V) – c(U ∩ V)
Michael Robinson
The consistency filtration
1
⅓
(1)
(1)
Consistencythreshold 0 ½ ⅔
0 1
½
⅓(2)
(1)
(1)
(1)
(½)
0 1
½
⅓(2)
(1)
(1)
(1)
(½)
0 1
½
⅓(2)
(1)
(1)
(1)
(½)
Consistency radius = ⅔
● … assigns the set of open sets with consistency less than a given threshold
● Lemma: consistency filtration is a sheaf of collections of open sets on (ℝ,≤). Restrictions in this sheaf are coarsenings.
coarsen coarsen
Michael Robinson
The consistency filtration
1
⅓
(1)
(1)
0 ½ ⅔
0 1
½
⅓(2)
(1)
(1)
(1)
(½)
0 1
½
⅓(2)
(1)
(1)
(1)
(½)
0 1
½
⅓(2)
(1)
(1)
(1)
(½)
Consistency radius = ⅔
● Theorem: Consistency filtration is a functor
CF : ShvFPA→ Shv(ℝ,≤)● As a practical matter, CF transforms an assignment into
persistent Čech cohomology
Consistencythreshold
coarsen coarsen
Michael Robinson
The consistency filtration
1
⅓
(1)
(1)
0 ½ ⅔
0 1
½
⅓(2)
(1)
(1)
(1)
(½)
0 1
½
⅓(2)
(1)
(1)
(1)
(½)
0 1
½
⅓(2)
(1)
(1)
(1)
(½)
Consistency radius = ⅔
● Theorem: Consistency filtration is continuous under an appropriate interleaving distance, so– Generators of PH0(CF) with are highly consistent subsets– Generators of PHk(CF) for k > 0 are robust obstructions to
consistency
Consistencythreshold
coarsen coarsen
Michael Robinson
Geometrically/topologicallySystem model
Shv
Observations
Pseud
ℝShvFPA
Regression
Data fusion
“Syntax”
“Semantics”
(Faithful)
(Faithful)
(Continuous)
(Continuous)
(Continuous)
(Note: all of this within an isomorphism class of sheaves)
Michael Robinson
Inference and consistencyStatistical inference minimizes various risks:
● Intrinsic risk– Noise in the data
● Parameter risk– Errors in estimation
● Model risk– Using the wrong
model
Minimize consistency radius by:
Varying the assignment
Varying the sheaf isomorphism class
Varying the sheaf within an isomorphism class
Michael Robinson
Chris Capraro, Janelle Henrich
Separating sinusoids from noise● Consider a signal formed from N sinusoids
– Each sinusoid has a (real) frequency ω– Each sinusoid has a (complex) amplitude a
● Task: Recover these parameters from M samples f is an arbitrary
known functionPossibilities:● Magnitude● Phase● Identity function● Quantizer output● Signal dispersion
Gaussian noiseSample time
Sinusoid frequency
Sinusoid amplitude
Michael Robinson
Chris Capraro, Janelle Henrich
Separating sinusoids from noise● Consider a signal formed from N sinusoids
– Each sinusoid has a (real) frequency ω– Each sinusoid has a (complex) amplitude a
● Model the situation as a sheaf over a poset...
ℂ ℂ ℂ…
Parameter spaces become stalks
Signal models become restrictions
ℝN×ℂN
Michael Robinson
Chris Capraro, Janelle Henrich
Separating sinusoids from noise● Consider a signal formed from N sinusoids
– Each sinusoid has a (real) frequency ω– Each sinusoid has a (complex) amplitude a
● The samples become an assignment to part of the sheaf
ℝN×ℂN
x1 ∈ ℂ x2 ∈ ℂ xM ∈ ℂ Observed…
Michael Robinson
Chris Capraro, Janelle Henrich
Separating sinusoids from noise● Consider a signal formed from N sinusoids
– Each sinusoid has a (real) frequency ω– Each sinusoid has a (complex) amplitude a
● Find the unknown parameters by minimizing consistency radius
(ω1,…,ωN,a1,…,aN) ∈ ℝN×ℂN
x1 ∈ ℂ x2 ∈ ℂ xM ∈ ℂ Observed
Inferred
…
Michael Robinson
Chris Capraro, Janelle Henrich
Sheaves deliver excellent performance
8x improvement over state-of-the-art in heavy noise!
Sheaf result
Michael Robinson
The future● Preprint: https://arxiv.org/abs/1805.08927● Computational sheaf theory
– Small examples can be put together ad hoc– Larger ones require a software library
● PySheaf: a software library for sheaves– https://github.com/kb1dds/pysheaf– Includes several examples you can play with!
● Connections to statistical models need to be explored– Is consistency radius related to likelihood functions?
● Extensive testing on various datasets and scenarios