131
Non-Adaptive Adaptive Sampling on Turnstile Streams Sepideh Mahabadi TTIC Ilya Razenshteyn MSR Redmond Samson Zhou CMU David Woodruff CMU

Non-Adaptive Adaptive Sampling on Turnstile Streamsmahabadi/slides/Stream-Adaptive-Sampling.pdf1. Simulate adaptive sampling in 1 pass turnstile stream β€’ 𝐿𝐿𝑝𝑝,2sampling

  • Upload
    others

  • View
    11

  • Download
    0

Embed Size (px)

Citation preview

Page 1: Non-Adaptive Adaptive Sampling on Turnstile Streamsmahabadi/slides/Stream-Adaptive-Sampling.pdf1. Simulate adaptive sampling in 1 pass turnstile stream β€’ 𝐿𝐿𝑝𝑝,2sampling

Non-Adaptive Adaptive Sampling on Turnstile Streams

Sepideh MahabadiTTIC

Ilya RazenshteynMSR Redmond

Samson ZhouCMU

David WoodruffCMU

Page 2: Non-Adaptive Adaptive Sampling on Turnstile Streamsmahabadi/slides/Stream-Adaptive-Sampling.pdf1. Simulate adaptive sampling in 1 pass turnstile stream β€’ 𝐿𝐿𝑝𝑝,2sampling

An algorithmic paradigm for solving many data summarization tasks.

Adaptive Sampling

Page 3: Non-Adaptive Adaptive Sampling on Turnstile Streamsmahabadi/slides/Stream-Adaptive-Sampling.pdf1. Simulate adaptive sampling in 1 pass turnstile stream β€’ 𝐿𝐿𝑝𝑝,2sampling

An algorithmic paradigm for solving many data summarization tasks.

Adaptive Sampling

Given: 𝑛𝑛 vectors in ℝ𝒅𝒅

β€’ Sample a vector w.p. proportional to its normβ€’ Project all vectors away from the selected subspaceβ€’ Repeat on the residuals

Page 4: Non-Adaptive Adaptive Sampling on Turnstile Streamsmahabadi/slides/Stream-Adaptive-Sampling.pdf1. Simulate adaptive sampling in 1 pass turnstile stream β€’ 𝐿𝐿𝑝𝑝,2sampling

Adaptive Sampling Example

Page 5: Non-Adaptive Adaptive Sampling on Turnstile Streamsmahabadi/slides/Stream-Adaptive-Sampling.pdf1. Simulate adaptive sampling in 1 pass turnstile stream β€’ 𝐿𝐿𝑝𝑝,2sampling

Adaptive Sampling Example

Page 6: Non-Adaptive Adaptive Sampling on Turnstile Streamsmahabadi/slides/Stream-Adaptive-Sampling.pdf1. Simulate adaptive sampling in 1 pass turnstile stream β€’ 𝐿𝐿𝑝𝑝,2sampling

Adaptive Sampling Example

Page 7: Non-Adaptive Adaptive Sampling on Turnstile Streamsmahabadi/slides/Stream-Adaptive-Sampling.pdf1. Simulate adaptive sampling in 1 pass turnstile stream β€’ 𝐿𝐿𝑝𝑝,2sampling

Adaptive Sampling Example

Page 8: Non-Adaptive Adaptive Sampling on Turnstile Streamsmahabadi/slides/Stream-Adaptive-Sampling.pdf1. Simulate adaptive sampling in 1 pass turnstile stream β€’ 𝐿𝐿𝑝𝑝,2sampling

Adaptive Sampling Example

Page 9: Non-Adaptive Adaptive Sampling on Turnstile Streamsmahabadi/slides/Stream-Adaptive-Sampling.pdf1. Simulate adaptive sampling in 1 pass turnstile stream β€’ 𝐿𝐿𝑝𝑝,2sampling

Adaptive Sampling Example

Page 10: Non-Adaptive Adaptive Sampling on Turnstile Streamsmahabadi/slides/Stream-Adaptive-Sampling.pdf1. Simulate adaptive sampling in 1 pass turnstile stream β€’ 𝐿𝐿𝑝𝑝,2sampling

Adaptive Sampling Example

Page 11: Non-Adaptive Adaptive Sampling on Turnstile Streamsmahabadi/slides/Stream-Adaptive-Sampling.pdf1. Simulate adaptive sampling in 1 pass turnstile stream β€’ 𝐿𝐿𝑝𝑝,2sampling

Adaptive Sampling Example

Page 12: Non-Adaptive Adaptive Sampling on Turnstile Streamsmahabadi/slides/Stream-Adaptive-Sampling.pdf1. Simulate adaptive sampling in 1 pass turnstile stream β€’ 𝐿𝐿𝑝𝑝,2sampling

Adaptive Sampling Example

Page 13: Non-Adaptive Adaptive Sampling on Turnstile Streamsmahabadi/slides/Stream-Adaptive-Sampling.pdf1. Simulate adaptive sampling in 1 pass turnstile stream β€’ 𝐿𝐿𝑝𝑝,2sampling

Adaptive Sampling Example

Page 14: Non-Adaptive Adaptive Sampling on Turnstile Streamsmahabadi/slides/Stream-Adaptive-Sampling.pdf1. Simulate adaptive sampling in 1 pass turnstile stream β€’ 𝐿𝐿𝑝𝑝,2sampling

Data Summarization TasksGiven: β€’ 𝑛𝑛 by 𝑑𝑑 matrix 𝑨𝑨 ∈ ℝ𝒏𝒏×𝒅𝒅

β€’ parameter π‘˜π‘˜Goal: β€’ Find a representation (of β€œsize π‘˜π‘˜β€) for the dataβ€’ Optimize a predefined function

Rows correspond to 𝑛𝑛 data points

e.g. feature vectors of objects in a dataset

Page 15: Non-Adaptive Adaptive Sampling on Turnstile Streamsmahabadi/slides/Stream-Adaptive-Sampling.pdf1. Simulate adaptive sampling in 1 pass turnstile stream β€’ 𝐿𝐿𝑝𝑝,2sampling

Given: β€’ 𝑛𝑛 by 𝑑𝑑 matrix 𝑨𝑨 ∈ ℝ𝒏𝒏×𝒅𝒅

β€’ parameter π‘˜π‘˜Goal: β€’ Find a representation (of β€œsize π‘˜π‘˜β€) for the dataβ€’ Optimize a predefined functionInstances:β€’ Row/Column subset selectionβ€’ Subspace approximationβ€’ Projective clusteringβ€’ Volume sampling/maximization

Data Summarization Tasks

β€’ Find a subset 𝑆𝑆 of π‘˜π‘˜ rows minimizing the squared distance of all rows to the subspace of 𝑆𝑆

𝐴𝐴 βˆ’ 𝑃𝑃𝑃𝑃𝑃𝑃𝑗𝑗𝑆𝑆 𝐴𝐴 𝐹𝐹

Best set of representatives

Page 16: Non-Adaptive Adaptive Sampling on Turnstile Streamsmahabadi/slides/Stream-Adaptive-Sampling.pdf1. Simulate adaptive sampling in 1 pass turnstile stream β€’ 𝐿𝐿𝑝𝑝,2sampling

Data Summarization TasksGiven: β€’ 𝑛𝑛 by 𝑑𝑑 matrix 𝑨𝑨 ∈ ℝ𝒏𝒏×𝒅𝒅

β€’ parameter π‘˜π‘˜Goal: β€’ Find a representation (of β€œsize π‘˜π‘˜β€) for the dataβ€’ Optimize a predefined functionInstances:β€’ Row/Column subset selectionβ€’ Subspace approximationβ€’ Projective clusteringβ€’ Volume sampling/maximization

β€’ Find a subspace 𝐻𝐻 of dimension π‘˜π‘˜ minimizing the squared distance of all rows to 𝐻𝐻

𝐴𝐴 βˆ’ 𝑃𝑃𝑃𝑃𝑃𝑃𝑗𝑗𝐻𝐻 𝐴𝐴 𝐹𝐹

Best approximation with a subspace

Page 17: Non-Adaptive Adaptive Sampling on Turnstile Streamsmahabadi/slides/Stream-Adaptive-Sampling.pdf1. Simulate adaptive sampling in 1 pass turnstile stream β€’ 𝐿𝐿𝑝𝑝,2sampling

Data Summarization TasksGiven: β€’ 𝑛𝑛 by 𝑑𝑑 matrix 𝑨𝑨 ∈ ℝ𝒏𝒏×𝒅𝒅

β€’ parameter π‘˜π‘˜Goal: β€’ Find a representation (of β€œsize π‘˜π‘˜β€) for the dataβ€’ Optimize a predefined functionInstances:β€’ Row/Column subset selectionβ€’ Subspace approximationβ€’ Projective clusteringβ€’ Volume sampling/maximization

β€’ Find 𝑠𝑠 subspaces 𝐻𝐻1, … ,𝐻𝐻𝑠𝑠each of dimension π‘˜π‘˜minimizing

βˆ‘π‘–π‘–=1𝑛𝑛 𝑑𝑑 𝐴𝐴𝑖𝑖 ,𝐻𝐻 2

Best approximation with several subspaces

Page 18: Non-Adaptive Adaptive Sampling on Turnstile Streamsmahabadi/slides/Stream-Adaptive-Sampling.pdf1. Simulate adaptive sampling in 1 pass turnstile stream β€’ 𝐿𝐿𝑝𝑝,2sampling

Data Summarization TasksGiven: β€’ 𝑛𝑛 by 𝑑𝑑 matrix 𝑨𝑨 ∈ ℝ𝒏𝒏×𝒅𝒅

β€’ parameter π‘˜π‘˜Goal: β€’ Find a representation (of β€œsize π‘˜π‘˜β€) for the dataβ€’ Optimize a predefined functionInstances:β€’ Row/Column subset selectionβ€’ Subspace approximationβ€’ Projective clusteringβ€’ Volume sampling/maximization

β€’ Find a subset 𝑆𝑆 of π‘˜π‘˜ rows that maximizes the volume of the parallelepiped spanned by 𝑆𝑆

Notion for capturing diversityMaximizing diversity

Page 19: Non-Adaptive Adaptive Sampling on Turnstile Streamsmahabadi/slides/Stream-Adaptive-Sampling.pdf1. Simulate adaptive sampling in 1 pass turnstile stream β€’ 𝐿𝐿𝑝𝑝,2sampling

Given: β€’ 𝑛𝑛 by 𝑑𝑑 matrix 𝑨𝑨 ∈ ℝ𝒏𝒏×𝒅𝒅

β€’ parameter π‘˜π‘˜Goal: β€’ Find a representation (of β€œsize π‘˜π‘˜β€) for the dataβ€’ Optimize a predefined functionInstances:β€’ Row/Column subset selectionβ€’ Subspace approximationβ€’ Projective clusteringβ€’ Volume sampling/maximization

Data Summarization Tasks

Adaptive sampling is used to derive algorithms for all these tasks

Page 20: Non-Adaptive Adaptive Sampling on Turnstile Streamsmahabadi/slides/Stream-Adaptive-Sampling.pdf1. Simulate adaptive sampling in 1 pass turnstile stream β€’ 𝐿𝐿𝑝𝑝,2sampling

Adaptive Sampling[DeshpandeVempala06, DeshpandeVaradarajan07, DeshpandeRademacherVempalaWang06]

β€’ Sample row 𝑖𝑖 w.p. proportional to distance squared 𝐴𝐴𝑖𝑖 22

β€’ Given: 𝑛𝑛 by 𝑑𝑑 matrix 𝑨𝑨 ∈ ℝ𝒏𝒏×𝒅𝒅, parameter π‘˜π‘˜

β€’ Sample a row 𝐴𝐴𝑖𝑖 with probability 𝐴𝐴𝑖𝑖 2

2

𝐴𝐴 𝐹𝐹2

Page 21: Non-Adaptive Adaptive Sampling on Turnstile Streamsmahabadi/slides/Stream-Adaptive-Sampling.pdf1. Simulate adaptive sampling in 1 pass turnstile stream β€’ 𝐿𝐿𝑝𝑝,2sampling

Adaptive Sampling[DeshpandeVempala06, DeshpandeVaradarajan07, DeshpandeRademacherVempalaWang06]

β€’ Sample row 𝑖𝑖 w.p. proportional to distance squared 𝐴𝐴𝑖𝑖 22

β€’ Given: 𝑛𝑛 by 𝑑𝑑 matrix 𝑨𝑨 ∈ ℝ𝒏𝒏×𝒅𝒅, parameter π‘˜π‘˜

β€’ Sample a row 𝐴𝐴𝑖𝑖 with probability 𝐴𝐴𝑖𝑖 2

2

𝐴𝐴 𝐹𝐹2

Frobenius norm:

𝐴𝐴 𝐹𝐹 = βˆ‘π‘–π‘– βˆ‘π‘—π‘— 𝐴𝐴𝑖𝑖,𝑗𝑗2

Page 22: Non-Adaptive Adaptive Sampling on Turnstile Streamsmahabadi/slides/Stream-Adaptive-Sampling.pdf1. Simulate adaptive sampling in 1 pass turnstile stream β€’ 𝐿𝐿𝑝𝑝,2sampling

Adaptive Sampling[DeshpandeVempala06, DeshpandeVaradarajan07, DeshpandeRademacherVempalaWang06]

β€’ Sample row 𝑖𝑖 w.p. proportional to 𝐴𝐴𝑖𝑖 𝐼𝐼 βˆ’ 𝑀𝑀+𝑀𝑀 22

β€’ Given: 𝑛𝑛 by 𝑑𝑑 matrix 𝑨𝑨 ∈ ℝ𝒏𝒏×𝒅𝒅, parameter π‘˜π‘˜β€’ 𝑀𝑀 ← βˆ…β€’ For π‘˜π‘˜ rounds,

β€’ Sample a row 𝐴𝐴𝑖𝑖 with probability 𝐴𝐴𝑖𝑖 πΌπΌβˆ’π‘€π‘€

+𝑀𝑀 22

𝐴𝐴 πΌπΌβˆ’π‘€π‘€+𝑀𝑀 𝐹𝐹2

β€’ Append 𝐴𝐴𝑖𝑖 to 𝑀𝑀

Project away from sampled subspace𝑴𝑴+ :Moore-Penrose Pseudoinverse

Page 23: Non-Adaptive Adaptive Sampling on Turnstile Streamsmahabadi/slides/Stream-Adaptive-Sampling.pdf1. Simulate adaptive sampling in 1 pass turnstile stream β€’ 𝐿𝐿𝑝𝑝,2sampling

Adaptive Sampling[DeshpandeVempala06, DeshpandeVaradarajan07, DeshpandeRademacherVempalaWang06]

β€’ Sample row 𝑖𝑖 w.p. proportional to 𝐴𝐴𝑖𝑖 𝐼𝐼 βˆ’ 𝑀𝑀+𝑀𝑀 22

β€’ Given: 𝑛𝑛 by 𝑑𝑑 matrix 𝑨𝑨 ∈ ℝ𝒏𝒏×𝒅𝒅, parameter π‘˜π‘˜β€’ 𝑀𝑀 ← βˆ…β€’ For π‘˜π‘˜ rounds,

β€’ Sample a row 𝐴𝐴𝑖𝑖 with probability 𝐴𝐴𝑖𝑖 πΌπΌβˆ’π‘€π‘€

+𝑀𝑀 22

𝐴𝐴 πΌπΌβˆ’π‘€π‘€+𝑀𝑀 𝐹𝐹2

β€’ Append 𝐴𝐴𝑖𝑖 to 𝑀𝑀

Seems inherently sequential

Project away from sampled subspace𝑴𝑴+ :Moore-Penrose Pseudoinverse

Page 24: Non-Adaptive Adaptive Sampling on Turnstile Streamsmahabadi/slides/Stream-Adaptive-Sampling.pdf1. Simulate adaptive sampling in 1 pass turnstile stream β€’ 𝐿𝐿𝑝𝑝,2sampling

Question:

Can we implement Adaptive Sampling in one pass (non-adaptively)?

Page 25: Non-Adaptive Adaptive Sampling on Turnstile Streamsmahabadi/slides/Stream-Adaptive-Sampling.pdf1. Simulate adaptive sampling in 1 pass turnstile stream β€’ 𝐿𝐿𝑝𝑝,2sampling

Streaming AlgorithmsMotivation: Data is huge and cannot be stored in the main memory

Streaming algorithms: Given sequential access to the data, make one or several passes over input

β€’ Solve the problem on the fly

β€’ Use sub-linear storage

Parameters: Space, number of passes, approximation

Page 26: Non-Adaptive Adaptive Sampling on Turnstile Streamsmahabadi/slides/Stream-Adaptive-Sampling.pdf1. Simulate adaptive sampling in 1 pass turnstile stream β€’ 𝐿𝐿𝑝𝑝,2sampling

Streaming AlgorithmsMotivation: Data is huge and cannot be stored in the main memory

Streaming algorithms: Given sequential access to the data, make one or several passes over input

β€’ Solve the problem on the fly

β€’ Use sub-linear storage

Parameters: Space, number of passes, approximation

Models:

β€’ Row Arrival: rows of 𝐴𝐴 arrive one by one

β€’ Turnstile: we receive updates to the entries of the matrix i.e., (𝑖𝑖, 𝑗𝑗,Ξ”) means 𝐴𝐴𝑖𝑖,𝑗𝑗 ← 𝐴𝐴𝑖𝑖,𝑗𝑗 + Ξ”

Page 27: Non-Adaptive Adaptive Sampling on Turnstile Streamsmahabadi/slides/Stream-Adaptive-Sampling.pdf1. Simulate adaptive sampling in 1 pass turnstile stream β€’ 𝐿𝐿𝑝𝑝,2sampling

Streaming AlgorithmsMotivation: Data is huge and cannot be stored in the main memory

Streaming algorithms: Given sequential access to the data, make one or several passes over input

β€’ Solve the problem on the fly

β€’ Use sub-linear storage

Parameters: Space, number of passes, approximation

Models:

β€’ Row Arrival: rows of 𝐴𝐴 arrive one by one

β€’ Turnstile: we receive updates to the entries of the matrix i.e., (𝑖𝑖, 𝑗𝑗,Ξ”) means 𝐴𝐴𝑖𝑖,𝑗𝑗 ← 𝐴𝐴𝑖𝑖,𝑗𝑗 + Ξ”

Focus on the row arrival model for the talk

Page 28: Non-Adaptive Adaptive Sampling on Turnstile Streamsmahabadi/slides/Stream-Adaptive-Sampling.pdf1. Simulate adaptive sampling in 1 pass turnstile stream β€’ 𝐿𝐿𝑝𝑝,2sampling

Streaming AlgorithmsMotivation: Data is huge and cannot be stored in the main memory

Streaming algorithms: Given sequential access to the data, make one or several passes over input

β€’ Solve the problem on the fly

β€’ Use sub-linear storage

Parameters: Space, number of passes, approximation

Models:

β€’ Row Arrival: rows of 𝐴𝐴 arrive one by one

β€’ Turnstile: we receive updates to the entries of the matrix i.e., (𝑖𝑖, 𝑗𝑗,Ξ”) means 𝐴𝐴𝑖𝑖,𝑗𝑗 ← 𝐴𝐴𝑖𝑖,𝑗𝑗 + Ξ”

Focus on the row arrival model for the talk

Our goal: Simulate π‘˜π‘˜ rounds of adaptive sampling in 1 pass of streaming Data Summarization tasks were considered in the streaming models in earlier works that used

adaptive sampling [e.g. DV’06, DR’10, DRVW’06]

Page 29: Non-Adaptive Adaptive Sampling on Turnstile Streamsmahabadi/slides/Stream-Adaptive-Sampling.pdf1. Simulate adaptive sampling in 1 pass turnstile stream β€’ 𝐿𝐿𝑝𝑝,2sampling

Outline of Results

1. Simulate adaptive sampling in 1 pass turnstile streamβ€’ 𝐿𝐿𝑝𝑝,2 sampling with post processing matrix 𝑃𝑃

2. Applications in turnstile streamβ€’ Row/column subset selectionβ€’ Subspace approximationβ€’ Projective clusteringβ€’ Volume Maximization

3. Volume maximization lower bounds

4. Volume maximization in row arrival

Page 30: Non-Adaptive Adaptive Sampling on Turnstile Streamsmahabadi/slides/Stream-Adaptive-Sampling.pdf1. Simulate adaptive sampling in 1 pass turnstile stream β€’ 𝐿𝐿𝑝𝑝,2sampling

Outline of Results

1. Simulate adaptive sampling in 1 pass turnstile streamβ€’ 𝑳𝑳𝒑𝒑,𝟐𝟐 sampling with post processing matrix 𝑷𝑷

2. Applications in turnstile streamβ€’ Row/column subset selectionβ€’ Subspace approximationβ€’ Projective clusteringβ€’ Volume Maximization

3. Volume maximization lower bounds

4. Volume maximization in row arrival

Page 31: Non-Adaptive Adaptive Sampling on Turnstile Streamsmahabadi/slides/Stream-Adaptive-Sampling.pdf1. Simulate adaptive sampling in 1 pass turnstile stream β€’ 𝐿𝐿𝑝𝑝,2sampling

Results: π‘³π‘³πŸπŸ,𝟐𝟐 Sampling with Post-Processing

Input:β€’ 𝐴𝐴 ∈ ℝ𝑛𝑛×𝑑𝑑 as a (turnstile) streamβ€’ a post-processing 𝑷𝑷 ∈ ℝ𝑑𝑑×𝑑𝑑

Output: samples an index 𝑖𝑖 ∈ [𝑛𝑛] w.p. π‘¨π‘¨π’Šπ’Šπ‘·π‘· 22

𝑨𝑨𝑷𝑷 𝐹𝐹2

Page 32: Non-Adaptive Adaptive Sampling on Turnstile Streamsmahabadi/slides/Stream-Adaptive-Sampling.pdf1. Simulate adaptive sampling in 1 pass turnstile stream β€’ 𝐿𝐿𝑝𝑝,2sampling

Results: π‘³π‘³πŸπŸ,𝟐𝟐 Sampling with Post-Processing

Input:β€’ 𝐴𝐴 ∈ ℝ𝑛𝑛×𝑑𝑑 as a (turnstile) streamβ€’ a post-processing 𝑷𝑷 ∈ ℝ𝑑𝑑×𝑑𝑑

Output: samples an index 𝑖𝑖 ∈ [𝑛𝑛] w.p. π‘¨π‘¨π’Šπ’Šπ‘·π‘· 22

𝑨𝑨𝑷𝑷 𝐹𝐹2

𝑷𝑷 corresponds to the projection matrix (𝐼𝐼 βˆ’ 𝑀𝑀+𝑀𝑀)

Page 33: Non-Adaptive Adaptive Sampling on Turnstile Streamsmahabadi/slides/Stream-Adaptive-Sampling.pdf1. Simulate adaptive sampling in 1 pass turnstile stream β€’ 𝐿𝐿𝑝𝑝,2sampling

Results: π‘³π‘³πŸπŸ,𝟐𝟐 Sampling with Post-Processing

Input:β€’ 𝐴𝐴 ∈ ℝ𝑛𝑛×𝑑𝑑 as a (turnstile) streamβ€’ a post-processing 𝑷𝑷 ∈ ℝ𝑑𝑑×𝑑𝑑

Output: samples an index 𝑖𝑖 ∈ [𝑛𝑛] w.p. 1 Β± πœ–πœ– π‘¨π‘¨π’Šπ’Šπ‘·π‘· 22

𝑨𝑨𝑷𝑷 𝐹𝐹2 + 1

𝑝𝑝𝑝𝑝𝑝𝑝𝑝𝑝(𝑛𝑛) In one pass 𝑝𝑝𝑃𝑃𝑝𝑝𝑝𝑝(𝑑𝑑, πœ–πœ–βˆ’1, log𝑛𝑛) space

Page 34: Non-Adaptive Adaptive Sampling on Turnstile Streamsmahabadi/slides/Stream-Adaptive-Sampling.pdf1. Simulate adaptive sampling in 1 pass turnstile stream β€’ 𝐿𝐿𝑝𝑝,2sampling

Results: π‘³π‘³πŸπŸ,𝟐𝟐 Sampling with Post-Processing

Impossible to return entire row instead of index in sublinear space A long stream of small updates + an arbitrarily large update

Input:β€’ 𝐴𝐴 ∈ ℝ𝑛𝑛×𝑑𝑑 as a (turnstile) streamβ€’ a post-processing 𝑷𝑷 ∈ ℝ𝑑𝑑×𝑑𝑑

Output: samples an index 𝑖𝑖 ∈ [𝑛𝑛] w.p. 1 Β± πœ–πœ– π‘¨π‘¨π’Šπ’Šπ‘·π‘· 22

𝑨𝑨𝑷𝑷 𝐹𝐹2 + 1

𝑝𝑝𝑝𝑝𝑝𝑝𝑝𝑝(𝑛𝑛) In one pass 𝑝𝑝𝑃𝑃𝑝𝑝𝑝𝑝(𝑑𝑑, πœ–πœ–βˆ’1, log𝑛𝑛) space

Page 35: Non-Adaptive Adaptive Sampling on Turnstile Streamsmahabadi/slides/Stream-Adaptive-Sampling.pdf1. Simulate adaptive sampling in 1 pass turnstile stream β€’ 𝐿𝐿𝑝𝑝,2sampling

Results: 𝑳𝑳𝒑𝒑,𝟐𝟐 Sampling with Post-Processing

Impossible to return entire row instead of index in sublinear space A long stream of small updates + an arbitrarily large update

Input:β€’ 𝐴𝐴 ∈ ℝ𝑛𝑛×𝑑𝑑 as a (turnstile) stream, 𝒑𝒑 ∈ {𝟏𝟏,𝟐𝟐}β€’ a post-processing 𝑷𝑷 ∈ ℝ𝑑𝑑×𝑑𝑑

Output: samples an index 𝑖𝑖 ∈ [𝑛𝑛] w.p. 1 Β± πœ–πœ– π‘¨π‘¨π’Šπ’Šπ‘·π‘· 2𝒑𝒑

𝑨𝑨𝑷𝑷 𝒑𝒑,πŸπŸπ’‘π’‘ + 1

𝑝𝑝𝑝𝑝𝑝𝑝𝑝𝑝(𝑛𝑛)

In one pass 𝑝𝑝𝑃𝑃𝑝𝑝𝑝𝑝(𝑑𝑑, πœ–πœ–βˆ’1, log𝑛𝑛) space

Page 36: Non-Adaptive Adaptive Sampling on Turnstile Streamsmahabadi/slides/Stream-Adaptive-Sampling.pdf1. Simulate adaptive sampling in 1 pass turnstile stream β€’ 𝐿𝐿𝑝𝑝,2sampling

Outline of Results

1. Simulate adaptive sampling in 1 pass turnstile streamβ€’ 𝐿𝐿𝑝𝑝,2 sampling with post processing matrix 𝑃𝑃

2. Applications in turnstile streamβ€’ Row/column subset selectionβ€’ Subspace approximationβ€’ Projective clusteringβ€’ Volume Maximization

3. Volume maximization lower bounds

4. Volume maximization in row arrival

Page 37: Non-Adaptive Adaptive Sampling on Turnstile Streamsmahabadi/slides/Stream-Adaptive-Sampling.pdf1. Simulate adaptive sampling in 1 pass turnstile stream β€’ 𝐿𝐿𝑝𝑝,2sampling

Results: Adaptive Sampling

Input: 𝐴𝐴 ∈ ℝ𝑛𝑛×𝑑𝑑 as a (turnstile) streamOutput: Return each set 𝑺𝑺 βŠ‚π’Œπ’Œ [𝒏𝒏] of π‘˜π‘˜ indices w.p. 𝒑𝒑𝑺𝑺 s.t.

βˆ‘π‘†π‘† 𝒑𝒑𝑺𝑺 βˆ’ 𝒒𝒒𝑺𝑺 ≀ πœ–πœ–β€’ 𝒒𝒒𝑺𝑺: prob. of selecting 𝑺𝑺 via adaptive samplingβ€’ w.r.t. either distance or squared distance (i.e., 𝑝𝑝 ∈ {1,2})

Page 38: Non-Adaptive Adaptive Sampling on Turnstile Streamsmahabadi/slides/Stream-Adaptive-Sampling.pdf1. Simulate adaptive sampling in 1 pass turnstile stream β€’ 𝐿𝐿𝑝𝑝,2sampling

Results: Adaptive Sampling

Input: 𝐴𝐴 ∈ ℝ𝑛𝑛×𝑑𝑑 as a (turnstile) streamOutput: Return each set 𝑺𝑺 βŠ‚π’Œπ’Œ [𝒏𝒏] of π‘˜π‘˜ indices w.p. 𝒑𝒑𝑺𝑺 s.t.

βˆ‘π‘†π‘† 𝒑𝒑𝑺𝑺 βˆ’ 𝒒𝒒𝑺𝑺 ≀ πœ–πœ–β€’ 𝒒𝒒𝑺𝑺: prob. of selecting 𝑺𝑺 via adaptive samplingβ€’ w.r.t. either distance or squared distance (i.e., 𝑝𝑝 ∈ {1,2})

In one pass

𝑝𝑝𝑃𝑃𝑝𝑝𝑝𝑝(𝑑𝑑,π‘˜π‘˜, πœ–πœ–βˆ’1, log𝑛𝑛) space

Page 39: Non-Adaptive Adaptive Sampling on Turnstile Streamsmahabadi/slides/Stream-Adaptive-Sampling.pdf1. Simulate adaptive sampling in 1 pass turnstile stream β€’ 𝐿𝐿𝑝𝑝,2sampling

Results: Adaptive Sampling

Input: 𝐴𝐴 ∈ ℝ𝑛𝑛×𝑑𝑑 as a (turnstile) streamOutput: Return each set 𝑺𝑺 βŠ‚π’Œπ’Œ [𝒏𝒏] of π‘˜π‘˜ indices w.p. 𝒑𝒑𝑺𝑺 s.t.

βˆ‘π‘†π‘† 𝒑𝒑𝑺𝑺 βˆ’ 𝒒𝒒𝑺𝑺 ≀ πœ–πœ–β€’ 𝒒𝒒𝑺𝑺: prob. of selecting 𝑺𝑺 via adaptive samplingβ€’ w.r.t. either distance or squared distance (i.e., 𝑝𝑝 ∈ {1,2})

In one pass

𝑝𝑝𝑃𝑃𝑝𝑝𝑝𝑝(𝑑𝑑,π‘˜π‘˜, πœ–πœ–βˆ’1, log𝑛𝑛) space

Besides indices S, a noisy set of rows 𝑃𝑃1, … , π‘ƒπ‘ƒπ‘˜π‘˜ are returned β€’ Each 𝑃𝑃𝑖𝑖 is close to the corresponding 𝐴𝐴𝑖𝑖 (w.r.t. residual)

Page 40: Non-Adaptive Adaptive Sampling on Turnstile Streamsmahabadi/slides/Stream-Adaptive-Sampling.pdf1. Simulate adaptive sampling in 1 pass turnstile stream β€’ 𝐿𝐿𝑝𝑝,2sampling

Results: Adaptive Sampling

Impossible to return the row accurately in sublinear space A long stream of small updates + an arbitrarily large update

Input: 𝐴𝐴 ∈ ℝ𝑛𝑛×𝑑𝑑 as a (turnstile) streamOutput: Return each set 𝑺𝑺 βŠ‚π’Œπ’Œ [𝒏𝒏] of π‘˜π‘˜ indices w.p. 𝒑𝒑𝑺𝑺 s.t.

βˆ‘π‘†π‘† 𝒑𝒑𝑺𝑺 βˆ’ 𝒒𝒒𝑺𝑺 ≀ πœ–πœ–β€’ 𝒒𝒒𝑺𝑺: prob. of selecting 𝑺𝑺 via adaptive samplingβ€’ w.r.t. either distance or squared distance (i.e., 𝑝𝑝 ∈ {1,2})

In one pass

𝑝𝑝𝑃𝑃𝑝𝑝𝑝𝑝(𝑑𝑑,π‘˜π‘˜, πœ–πœ–βˆ’1, log𝑛𝑛) space

Besides indices S, a noisy set of rows 𝑃𝑃1, … , π‘ƒπ‘ƒπ‘˜π‘˜ are returned β€’ Each 𝑃𝑃𝑖𝑖 is close to the corresponding 𝐴𝐴𝑖𝑖 (w.r.t. residual)

Page 41: Non-Adaptive Adaptive Sampling on Turnstile Streamsmahabadi/slides/Stream-Adaptive-Sampling.pdf1. Simulate adaptive sampling in 1 pass turnstile stream β€’ 𝐿𝐿𝑝𝑝,2sampling

Outline of Results

1. Simulate adaptive sampling in 1 pass turnstile streamβ€’ 𝐿𝐿𝑝𝑝,2 sampling with post processing matrix 𝑃𝑃

2. Applications in turnstile streamβ€’ Row/column subset selectionβ€’ Subspace approximationβ€’ Projective clusteringβ€’ Volume Maximization

3. Volume maximization lower bounds

4. Volume maximization in row arrival

Page 42: Non-Adaptive Adaptive Sampling on Turnstile Streamsmahabadi/slides/Stream-Adaptive-Sampling.pdf1. Simulate adaptive sampling in 1 pass turnstile stream β€’ 𝐿𝐿𝑝𝑝,2sampling

Applications: Row Subset Selection

Input: 𝐴𝐴 ∈ ℝ𝑛𝑛×𝑑𝑑 and an integer π’Œπ’Œ > 0

Output: π’Œπ’Œ rows of 𝐴𝐴 to form 𝑴𝑴 to minimize 𝐴𝐴 βˆ’ 𝐴𝐴𝑀𝑀+𝑀𝑀 𝐹𝐹

Page 43: Non-Adaptive Adaptive Sampling on Turnstile Streamsmahabadi/slides/Stream-Adaptive-Sampling.pdf1. Simulate adaptive sampling in 1 pass turnstile stream β€’ 𝐿𝐿𝑝𝑝,2sampling

Applications: Row Subset Selection

Our Result: finds M such that,Pr[ 𝐴𝐴 βˆ’ 𝐴𝐴𝑴𝑴+𝑴𝑴 𝐹𝐹

2 ≀ 𝟏𝟏𝟏𝟏 π’Œπ’Œ + 𝟏𝟏 ! 𝐴𝐴 βˆ’ π΄π΄π‘˜π‘˜ 𝐹𝐹2 ] β‰₯ 2/3

β€’ π΄π΄π‘˜π‘˜: best rank-k approximation of 𝐴𝐴‒ first one pass turnstile streaming algorithmβ€’ 𝑝𝑝𝑃𝑃𝑝𝑝𝑝𝑝(𝑑𝑑,π‘˜π‘˜, log𝑛𝑛) space

Input: 𝐴𝐴 ∈ ℝ𝑛𝑛×𝑑𝑑 and an integer π’Œπ’Œ > 0

Output: π’Œπ’Œ rows of 𝐴𝐴 to form 𝑴𝑴 to minimize 𝐴𝐴 βˆ’ 𝐴𝐴𝑀𝑀+𝑀𝑀 𝐹𝐹

Page 44: Non-Adaptive Adaptive Sampling on Turnstile Streamsmahabadi/slides/Stream-Adaptive-Sampling.pdf1. Simulate adaptive sampling in 1 pass turnstile stream β€’ 𝐿𝐿𝑝𝑝,2sampling

Applications: Row Subset Selection

Our Result: finds M such that,Pr[ 𝐴𝐴 βˆ’ 𝐴𝐴𝑴𝑴+𝑴𝑴 𝐹𝐹

2 ≀ 𝟏𝟏𝟏𝟏 π’Œπ’Œ + 𝟏𝟏 ! 𝐴𝐴 βˆ’ π΄π΄π‘˜π‘˜ 𝐹𝐹2 ] β‰₯ 2/3

β€’ π΄π΄π‘˜π‘˜: best rank-k approximation of 𝐴𝐴‒ first one pass turnstile streaming algorithmβ€’ 𝑝𝑝𝑃𝑃𝑝𝑝𝑝𝑝(𝑑𝑑,π‘˜π‘˜, log𝑛𝑛) space

Previous works: centralized setting [e.g. DRVW06, BMD09, GS’12] and row arrival [e.g., CMM’17, GP’14 , BDMMUWZ’18]

Input: 𝐴𝐴 ∈ ℝ𝑛𝑛×𝑑𝑑 and an integer π’Œπ’Œ > 0

Output: π’Œπ’Œ rows of 𝐴𝐴 to form 𝑴𝑴 to minimize 𝐴𝐴 βˆ’ 𝐴𝐴𝑀𝑀+𝑀𝑀 𝐹𝐹

Page 45: Non-Adaptive Adaptive Sampling on Turnstile Streamsmahabadi/slides/Stream-Adaptive-Sampling.pdf1. Simulate adaptive sampling in 1 pass turnstile stream β€’ 𝐿𝐿𝑝𝑝,2sampling

Applications: Subspace Approximation

Input: 𝐴𝐴 ∈ ℝ𝑛𝑛×𝑑𝑑 and an integer π’Œπ’Œ > 0Output: π’Œπ’Œ-dim subspace 𝑯𝑯 to minimize βˆ‘π‘–π‘–=1𝑛𝑛 𝑑𝑑 𝐴𝐴𝑖𝑖 ,𝐻𝐻 𝑝𝑝 1/𝑝𝑝

β€’ 𝑝𝑝 ∈ 1,2β€’ 𝑑𝑑 𝐴𝐴𝑖𝑖 ,𝐻𝐻 = 𝐴𝐴𝑖𝑖 𝕀𝕀 βˆ’ 𝐻𝐻+𝐻𝐻 2

Page 46: Non-Adaptive Adaptive Sampling on Turnstile Streamsmahabadi/slides/Stream-Adaptive-Sampling.pdf1. Simulate adaptive sampling in 1 pass turnstile stream β€’ 𝐿𝐿𝑝𝑝,2sampling

Applications: Subspace Approximation

Input: 𝐴𝐴 ∈ ℝ𝑛𝑛×𝑑𝑑 and an integer π’Œπ’Œ > 0Output: π’Œπ’Œ-dim subspace 𝑯𝑯 to minimize βˆ‘π‘–π‘–=1𝑛𝑛 𝑑𝑑 𝐴𝐴𝑖𝑖 ,𝐻𝐻 𝑝𝑝 1/𝑝𝑝

β€’ 𝑝𝑝 ∈ 1,2β€’ 𝑑𝑑 𝐴𝐴𝑖𝑖 ,𝐻𝐻 = 𝐴𝐴𝑖𝑖 𝕀𝕀 βˆ’ 𝐻𝐻+𝐻𝐻 2

Our Result I: finds H (which is π’Œπ’Œ noisy rows of 𝑨𝑨) s.t.,

Pr[ βˆ‘π‘–π‘–=1𝑛𝑛 𝑑𝑑 𝐴𝐴𝑖𝑖 ,𝑯𝑯 𝑝𝑝 1/𝑝𝑝 ≀ πŸ’πŸ’ π’Œπ’Œ + 𝟏𝟏 ! βˆ‘π‘–π‘–=1𝑛𝑛 𝑑𝑑 𝐴𝐴𝑖𝑖 ,π΄π΄π‘˜π‘˜ 𝑝𝑝 1/𝑝𝑝] β‰₯ 23

β€’ π΄π΄π‘˜π‘˜: best rank-k approximation of Aβ€’ 𝑝𝑝𝑃𝑃𝑝𝑝𝑝𝑝(𝑑𝑑,π‘˜π‘˜, log𝑛𝑛) space

Page 47: Non-Adaptive Adaptive Sampling on Turnstile Streamsmahabadi/slides/Stream-Adaptive-Sampling.pdf1. Simulate adaptive sampling in 1 pass turnstile stream β€’ 𝐿𝐿𝑝𝑝,2sampling

Applications: Subspace Approximation

Input: 𝐴𝐴 ∈ ℝ𝑛𝑛×𝑑𝑑 and an integer π’Œπ’Œ > 0Output: π’Œπ’Œ-dim subspace 𝑯𝑯 to minimize βˆ‘π‘–π‘–=1𝑛𝑛 𝑑𝑑 𝐴𝐴𝑖𝑖 ,𝐻𝐻 𝑝𝑝 1/𝑝𝑝

β€’ 𝑝𝑝 ∈ 1,2β€’ 𝑑𝑑 𝐴𝐴𝑖𝑖 ,𝐻𝐻 = 𝐴𝐴𝑖𝑖 𝕀𝕀 βˆ’ 𝐻𝐻+𝐻𝐻 2

Our Result I: finds H (which is π’Œπ’Œ noisy rows of 𝑨𝑨) s.t.,

Pr[ βˆ‘π‘–π‘–=1𝑛𝑛 𝑑𝑑 𝐴𝐴𝑖𝑖 ,𝑯𝑯 𝑝𝑝 1/𝑝𝑝 ≀ πŸ’πŸ’ π’Œπ’Œ + 𝟏𝟏 ! βˆ‘π‘–π‘–=1𝑛𝑛 𝑑𝑑 𝐴𝐴𝑖𝑖 ,π΄π΄π‘˜π‘˜ 𝑝𝑝 1/𝑝𝑝] β‰₯ 23

β€’ π΄π΄π‘˜π‘˜: best rank-k approximation of Aβ€’ 𝑝𝑝𝑃𝑃𝑝𝑝𝑝𝑝(𝑑𝑑,π‘˜π‘˜, log𝑛𝑛) spaceβ€’ First relative error on turnstile streams that returns noisy rows of A [Levin, Sevekari, Woodruff’18]

+(1 + πœ–πœ–)-approximation –larger number of rows –rows are not from A

Page 48: Non-Adaptive Adaptive Sampling on Turnstile Streamsmahabadi/slides/Stream-Adaptive-Sampling.pdf1. Simulate adaptive sampling in 1 pass turnstile stream β€’ 𝐿𝐿𝑝𝑝,2sampling

Applications: Subspace Approximation

Input: 𝐴𝐴 ∈ ℝ𝑛𝑛×𝑑𝑑 and an integer π’Œπ’Œ > 0Output: π’Œπ’Œ-dim subspace 𝑯𝑯 to minimize βˆ‘π‘–π‘–=1𝑛𝑛 𝑑𝑑 𝐴𝐴𝑖𝑖 ,𝐻𝐻 𝑝𝑝 1/𝑝𝑝

β€’ 𝑝𝑝 ∈ 1,2β€’ 𝑑𝑑 𝐴𝐴𝑖𝑖 ,𝐻𝐻 = 𝐴𝐴𝑖𝑖 𝕀𝕀 βˆ’ 𝐻𝐻+𝐻𝐻 2

Our Result II: finds H (which is 𝒑𝒑𝒑𝒑𝒑𝒑𝒑𝒑(π’Œπ’Œ,𝟏𝟏/𝝐𝝐) noisy rows of 𝑨𝑨) s.t.,

Pr[ βˆ‘π‘–π‘–=1𝑛𝑛 𝑑𝑑 𝐴𝐴𝑖𝑖 ,𝑯𝑯 𝑝𝑝 1/𝑝𝑝 ≀ 𝟏𝟏 + 𝝐𝝐 βˆ‘π‘–π‘–=1𝑛𝑛 𝑑𝑑 𝐴𝐴𝑖𝑖 ,π΄π΄π‘˜π‘˜ 𝑝𝑝 1/𝑝𝑝] β‰₯ 23

β€’ π΄π΄π‘˜π‘˜: best rank-k approximation of Aβ€’ 𝒑𝒑𝒑𝒑𝒑𝒑𝒑𝒑(𝒅𝒅,π’Œπ’Œ,𝟏𝟏/𝝐𝝐, π₯π₯π₯π₯π₯π₯𝒏𝒏) space

[Levin, Sevekari, Woodruff’18] –𝑝𝑝𝑃𝑃𝑝𝑝𝑝𝑝(log 𝑛𝑛𝑑𝑑 ,π‘˜π‘˜, 1/πœ–πœ–) rows –rows are not from A

Page 49: Non-Adaptive Adaptive Sampling on Turnstile Streamsmahabadi/slides/Stream-Adaptive-Sampling.pdf1. Simulate adaptive sampling in 1 pass turnstile stream β€’ 𝐿𝐿𝑝𝑝,2sampling

Applications: Projective Clustering

Input: 𝐴𝐴 ∈ ℝ𝑛𝑛×𝑑𝑑, target dim π’Œπ’Œ and target number of subspaces 𝒔𝒔Output: 𝒔𝒔 π’Œπ’Œ-dim subspaces π‘―π‘―πŸπŸ, … ,𝑯𝑯𝒔𝒔 to minimize βˆ‘π‘–π‘–=1𝑛𝑛 𝑑𝑑 𝐴𝐴𝑖𝑖 ,𝑯𝑯 𝑝𝑝 1/𝑝𝑝

β€’ 𝑯𝑯 = π‘―π‘―πŸπŸ βˆͺβ‹―βˆͺ𝑯𝑯𝒔𝒔 and 𝑝𝑝 ∈ 1,2β€’ 𝑑𝑑 𝐴𝐴𝑖𝑖 ,𝐻𝐻 = min

π‘—π‘—βˆˆ 𝑠𝑠𝐴𝐴𝑖𝑖 𝕀𝕀 βˆ’ 𝐻𝐻𝑗𝑗+𝐻𝐻𝑗𝑗 2

Page 50: Non-Adaptive Adaptive Sampling on Turnstile Streamsmahabadi/slides/Stream-Adaptive-Sampling.pdf1. Simulate adaptive sampling in 1 pass turnstile stream β€’ 𝐿𝐿𝑝𝑝,2sampling

Applications: Projective Clustering

Our Result: finds S (which is 𝒑𝒑𝒑𝒑𝒑𝒑𝒑𝒑(π’Œπ’Œ, 𝒔𝒔,𝟏𝟏/𝝐𝝐) noisy rows of 𝑨𝑨),which contains a union T of 𝑠𝑠 π’Œπ’Œ-dim subspaces s.t.,

Pr[ βˆ‘π‘–π‘–=1𝑛𝑛 𝑑𝑑 𝐴𝐴𝑖𝑖 ,𝑻𝑻 𝑝𝑝 1/𝑝𝑝 ≀ 𝟏𝟏 + 𝝐𝝐 βˆ‘π‘–π‘–=1𝑛𝑛 𝑑𝑑 𝐴𝐴𝑖𝑖 ,𝑯𝑯 𝑝𝑝 1/𝑝𝑝] β‰₯ 2/3β€’ 𝐻𝐻: optimal solution to projective clusteringβ€’ first one pass turnstile streaming algorithm with sublinear spaceβ€’ 𝑝𝑝𝑃𝑃𝑝𝑝𝑝𝑝(𝑑𝑑,π‘˜π‘˜, log𝑛𝑛 , 𝑠𝑠, 1/πœ–πœ–) space [BHI’02, HM’04, Che09, FMSW’10] based on coresets, works in row arrival [KR’15] turnstile but linear in number of points

Input: 𝐴𝐴 ∈ ℝ𝑛𝑛×𝑑𝑑, target dim π’Œπ’Œ and target number of subspaces 𝒔𝒔Output: 𝒔𝒔 π’Œπ’Œ-dim subspaces π‘―π‘―πŸπŸ, … ,𝑯𝑯𝒔𝒔 to minimize βˆ‘π‘–π‘–=1𝑛𝑛 𝑑𝑑 𝐴𝐴𝑖𝑖 ,𝑯𝑯 𝑝𝑝 1/𝑝𝑝

β€’ 𝑯𝑯 = π‘―π‘―πŸπŸ βˆͺβ‹―βˆͺ𝑯𝑯𝒔𝒔 and 𝑝𝑝 ∈ 1,2β€’ 𝑑𝑑 𝐴𝐴𝑖𝑖 ,𝐻𝐻 = min

π‘—π‘—βˆˆ 𝑠𝑠𝐴𝐴𝑖𝑖 𝕀𝕀 βˆ’ 𝐻𝐻𝑗𝑗+𝐻𝐻𝑗𝑗 2

Page 51: Non-Adaptive Adaptive Sampling on Turnstile Streamsmahabadi/slides/Stream-Adaptive-Sampling.pdf1. Simulate adaptive sampling in 1 pass turnstile stream β€’ 𝐿𝐿𝑝𝑝,2sampling

Applications: Volume Maximization

Input: 𝐴𝐴 ∈ ℝ𝑛𝑛×𝑑𝑑 and an integer π’Œπ’ŒOutput: π’Œπ’Œ rows π’“π’“πŸπŸ, … , π’“π’“π’Œπ’Œ of 𝐴𝐴, 𝑴𝑴, with maximum volume

Page 52: Non-Adaptive Adaptive Sampling on Turnstile Streamsmahabadi/slides/Stream-Adaptive-Sampling.pdf1. Simulate adaptive sampling in 1 pass turnstile stream β€’ 𝐿𝐿𝑝𝑝,2sampling

Applications: Volume Maximization

Input: 𝐴𝐴 ∈ ℝ𝑛𝑛×𝑑𝑑 and an integer π’Œπ’ŒOutput: π’Œπ’Œ rows π’“π’“πŸπŸ, … , π’“π’“π’Œπ’Œ of 𝐴𝐴, 𝑴𝑴, with maximum volume

Volume of the parallelepiped spanned by those vectors

π’Œπ’Œ = 𝟐𝟐

Page 53: Non-Adaptive Adaptive Sampling on Turnstile Streamsmahabadi/slides/Stream-Adaptive-Sampling.pdf1. Simulate adaptive sampling in 1 pass turnstile stream β€’ 𝐿𝐿𝑝𝑝,2sampling

Applications: Volume Maximization

Input: 𝐴𝐴 ∈ ℝ𝑛𝑛×𝑑𝑑 and an integer π’Œπ’ŒOutput: π’Œπ’Œ rows π’“π’“πŸπŸ, … , π’“π’“π’Œπ’Œ of 𝐴𝐴, 𝑴𝑴, with maximum volume

Our Result (Upper Bound I): for an approximation factor 𝜢𝜢, finds S(set of π’Œπ’Œ noisy rows of 𝑨𝑨) s.t.,

Pr[π›Όπ›Όπ‘˜π‘˜ π‘˜π‘˜! Vol 𝐒𝐒 β‰₯ Vol(𝐌𝐌)] β‰₯ 2/3β€’ first one pass turnstile streaming algorithmβ€’ �𝑂𝑂( β„π‘›π‘›π‘‘π‘‘π‘˜π‘˜2 𝛼𝛼2) space

Page 54: Non-Adaptive Adaptive Sampling on Turnstile Streamsmahabadi/slides/Stream-Adaptive-Sampling.pdf1. Simulate adaptive sampling in 1 pass turnstile stream β€’ 𝐿𝐿𝑝𝑝,2sampling

Applications: Volume Maximization

Input: 𝐴𝐴 ∈ ℝ𝑛𝑛×𝑑𝑑 and an integer π’Œπ’ŒOutput: π’Œπ’Œ rows π’“π’“πŸπŸ, … , π’“π’“π’Œπ’Œ of 𝐴𝐴, 𝑴𝑴, with maximum volume

Our Result (Upper Bound I): for an approximation factor 𝜢𝜢, finds S(set of π’Œπ’Œ noisy rows of 𝑨𝑨) s.t.,

Pr[π›Όπ›Όπ‘˜π‘˜ π‘˜π‘˜! Vol 𝐒𝐒 β‰₯ Vol(𝐌𝐌)] β‰₯ 2/3β€’ first one pass turnstile streaming algorithmβ€’ �𝑂𝑂( β„π‘›π‘›π‘‘π‘‘π‘˜π‘˜2 𝛼𝛼2) space [Indyk, M, Oveis Gharan, Rezaei, β€˜19 β€˜20] coreset based �𝑂𝑂 π‘˜π‘˜ π‘˜π‘˜/πœ–πœ– approx. and �𝑂𝑂(π‘›π‘›πœ–πœ–π‘˜π‘˜π‘‘π‘‘) space for row-arrival streams

Page 55: Non-Adaptive Adaptive Sampling on Turnstile Streamsmahabadi/slides/Stream-Adaptive-Sampling.pdf1. Simulate adaptive sampling in 1 pass turnstile stream β€’ 𝐿𝐿𝑝𝑝,2sampling

Outline of Results

1. Simulate adaptive sampling in 1 pass turnstile streamβ€’ 𝐿𝐿𝑝𝑝,2 sampling with post processing matrix 𝑃𝑃

2. Applications in turnstile streamβ€’ Row/column subset selectionβ€’ Subspace approximationβ€’ Projective clusteringβ€’ Volume Maximization

3. Volume maximization lower bounds

4. Volume maximization in row arrival

Page 56: Non-Adaptive Adaptive Sampling on Turnstile Streamsmahabadi/slides/Stream-Adaptive-Sampling.pdf1. Simulate adaptive sampling in 1 pass turnstile stream β€’ 𝐿𝐿𝑝𝑝,2sampling

Volume Maximization Lower Bounds

Input: 𝐴𝐴 ∈ ℝ𝑛𝑛×𝑑𝑑 and an integer π’Œπ’ŒOutput: π’Œπ’Œ rows π’“π’“πŸπŸ, … , π’“π’“π’Œπ’Œ of 𝐴𝐴, 𝑴𝑴, with maximum volume

Our Result (Lower Bound I): for 𝜢𝜢, any 𝒑𝒑-pass algorithm that finds πœΆπœΆπ‘˜π‘˜-approximation w.p. β‰₯ ⁄63 64 in turnstile-arrival requires Ξ©( ⁄𝑛𝑛 π‘˜π‘˜π’‘π’‘πœΆπœΆ2) space.

β€’ Our previous upper bound is matches the upper bound up to a factor of π’Œπ’ŒπŸ‘πŸ‘π’…π’… in space and π’Œπ’Œ! in the approximation factor.

Page 57: Non-Adaptive Adaptive Sampling on Turnstile Streamsmahabadi/slides/Stream-Adaptive-Sampling.pdf1. Simulate adaptive sampling in 1 pass turnstile stream β€’ 𝐿𝐿𝑝𝑝,2sampling

Volume Maximization Lower Bounds

Input: 𝐴𝐴 ∈ ℝ𝑛𝑛×𝑑𝑑 and an integer π’Œπ’ŒOutput: π’Œπ’Œ rows π’“π’“πŸπŸ, … , π’“π’“π’Œπ’Œ of 𝐴𝐴, 𝑴𝑴, with maximum volume

Our Result (Lower Bound II): for a fixed constant π‘ͺπ‘ͺ, any one-pass algorithm that finds π‘ͺπ‘ͺπ’Œπ’Œ-approximation w.p. β‰₯ ⁄63 64 in random order row-arrival requires Ξ©(𝑛𝑛) space

Page 58: Non-Adaptive Adaptive Sampling on Turnstile Streamsmahabadi/slides/Stream-Adaptive-Sampling.pdf1. Simulate adaptive sampling in 1 pass turnstile stream β€’ 𝐿𝐿𝑝𝑝,2sampling

Volume Maximization – Row Arrival

Input: 𝐴𝐴 ∈ ℝ𝑛𝑛×𝑑𝑑 and an integer π’Œπ’ŒOutput: π’Œπ’Œ rows π’“π’“πŸπŸ, … , π’“π’“π’Œπ’Œ of 𝐴𝐴, 𝑴𝑴, with maximum volume

Our Result (Upper Bound II): for an approximation factor π‘ͺπ‘ͺ < ⁄(log 𝑛𝑛) π‘˜π‘˜, finds S (set of π’Œπ’Œ rows of 𝑨𝑨) s.t.β€’ approximation factor �𝑂𝑂 π‘ͺπ‘ͺπ‘˜π‘˜ π‘˜π‘˜/2 with high probabilityβ€’ one pass row-arrival streaming algorithmβ€’ �𝑂𝑂(𝑛𝑛𝑂𝑂(1/π‘ͺπ‘ͺ)𝑑𝑑) space

Page 59: Non-Adaptive Adaptive Sampling on Turnstile Streamsmahabadi/slides/Stream-Adaptive-Sampling.pdf1. Simulate adaptive sampling in 1 pass turnstile stream β€’ 𝐿𝐿𝑝𝑝,2sampling

Volume Maximization – Row Arrival

Input: 𝐴𝐴 ∈ ℝ𝑛𝑛×𝑑𝑑 and an integer π’Œπ’ŒOutput: π’Œπ’Œ rows π’“π’“πŸπŸ, … , π’“π’“π’Œπ’Œ of 𝐴𝐴, 𝑴𝑴, with maximum volume

Our Result (Upper Bound II): for an approximation factor π‘ͺπ‘ͺ < ⁄(log 𝑛𝑛) π‘˜π‘˜, finds S (set of π’Œπ’Œ rows of 𝑨𝑨) s.t.β€’ approximation factor �𝑂𝑂 π‘ͺπ‘ͺπ‘˜π‘˜ π‘˜π‘˜/2 with high probabilityβ€’ one pass row-arrival streaming algorithmβ€’ �𝑂𝑂(𝑛𝑛𝑂𝑂(1/π‘ͺπ‘ͺ)𝑑𝑑) space

[Indyk, M, Oveis Gharan, Rezaei, β€˜19 β€˜20] coreset based �𝑂𝑂 π‘˜π‘˜ π‘ͺπ‘ͺπ’Œπ’Œ/𝟐𝟐 approx. and �𝑂𝑂(π‘›π‘›πŸπŸ/π‘ͺπ‘ͺπ‘˜π‘˜π‘‘π‘‘)space for row-arrival streams

Page 60: Non-Adaptive Adaptive Sampling on Turnstile Streamsmahabadi/slides/Stream-Adaptive-Sampling.pdf1. Simulate adaptive sampling in 1 pass turnstile stream β€’ 𝐿𝐿𝑝𝑝,2sampling

𝑳𝑳𝒑𝒑,𝟐𝟐 Sampler1. Simulate adaptive sampling in 1 pass

β€’ 𝐿𝐿𝑝𝑝,2 sampling with post processing matrix 𝑃𝑃

2. Applications in turnstile streamβ€’ Row/column subset selectionβ€’ Subspace approximationβ€’ Projective clusteringβ€’ Volume Maximization

3. Volume maximization lower bounds

4. Volume maximization in row arrival

Page 61: Non-Adaptive Adaptive Sampling on Turnstile Streamsmahabadi/slides/Stream-Adaptive-Sampling.pdf1. Simulate adaptive sampling in 1 pass turnstile stream β€’ 𝐿𝐿𝑝𝑝,2sampling

π‘³π‘³πŸπŸ,𝟐𝟐 Sampler with Post-Processing Matrix

Input: matrix A as a data stream, a post-processing matrix P

Output: index 𝑖𝑖 of a row of AP sampled w.p. ~ 𝐴𝐴𝑖𝑖𝑃𝑃 22

𝐴𝐴𝑃𝑃 𝐹𝐹2

Page 62: Non-Adaptive Adaptive Sampling on Turnstile Streamsmahabadi/slides/Stream-Adaptive-Sampling.pdf1. Simulate adaptive sampling in 1 pass turnstile stream β€’ 𝐿𝐿𝑝𝑝,2sampling

π‘³π‘³πŸπŸ,𝟐𝟐 Sampler with Post-Processing Matrix

Extension of π‘³π‘³πŸπŸ Sampler [Andoni et al.’10][Monemizadeh, Woodruff’10][Jowhari et al.’11][Jayaram, Woodruff’18]

Input: matrix A as a data stream, a post-processing matrix P

Output: index 𝑖𝑖 of a row of AP sampled w.p. ~ 𝐴𝐴𝑖𝑖𝑃𝑃 22

𝐴𝐴𝑃𝑃 𝐹𝐹2

Input: vector f as a data stream

Output: index 𝑖𝑖 of a coordinate of f sampled w.p. ~ 𝑓𝑓𝑖𝑖2

𝐟𝐟 22

Page 63: Non-Adaptive Adaptive Sampling on Turnstile Streamsmahabadi/slides/Stream-Adaptive-Sampling.pdf1. Simulate adaptive sampling in 1 pass turnstile stream β€’ 𝐿𝐿𝑝𝑝,2sampling

π‘³π‘³πŸπŸ,𝟐𝟐 Sampler with Post-Processing Matrix

Extension of π‘³π‘³πŸπŸ Sampler [Andoni et al.’10][Monemizadeh, Woodruff’10][Jowhari et al.’11][Jayaram, Woodruff’18]

Input: matrix A as a data stream, a post-processing matrix P

Output: index 𝑖𝑖 of a row of AP sampled w.p. ~ 𝐴𝐴𝑖𝑖𝑃𝑃 22

𝐴𝐴𝑃𝑃 𝐹𝐹2

Input: vector f as a data stream

Output: index 𝑖𝑖 of a coordinate of f sampled w.p. ~ 𝑓𝑓𝑖𝑖2

𝐟𝐟 22

What is new:1. Generalizing vectors to matrices 2. Handling the post processing matrix 𝑃𝑃

Page 64: Non-Adaptive Adaptive Sampling on Turnstile Streamsmahabadi/slides/Stream-Adaptive-Sampling.pdf1. Simulate adaptive sampling in 1 pass turnstile stream β€’ 𝐿𝐿𝑝𝑝,2sampling

π‘³π‘³πŸπŸ,𝟐𝟐 SamplerInput: matrix A as a data stream

Output: index 𝑖𝑖 of a row of A sampled w.p. ~ 𝐴𝐴𝑖𝑖 22

𝐀𝐀 𝐹𝐹2

Ignore 𝑃𝑃 for now

Page 65: Non-Adaptive Adaptive Sampling on Turnstile Streamsmahabadi/slides/Stream-Adaptive-Sampling.pdf1. Simulate adaptive sampling in 1 pass turnstile stream β€’ 𝐿𝐿𝑝𝑝,2sampling

Step 1. β€’ pick 𝑑𝑑𝑖𝑖 ∈ [0,1] uniformly at random

π‘³π‘³πŸπŸ,𝟐𝟐 Sampler

Page 66: Non-Adaptive Adaptive Sampling on Turnstile Streamsmahabadi/slides/Stream-Adaptive-Sampling.pdf1. Simulate adaptive sampling in 1 pass turnstile stream β€’ 𝐿𝐿𝑝𝑝,2sampling

Step 1. β€’ pick 𝑑𝑑𝑖𝑖 ∈ [0,1] uniformly at randomβ€’ set 𝐡𝐡𝑖𝑖 ≔

πŸπŸπ’•π’•π’Šπ’Š

Γ— 𝐴𝐴𝑖𝑖

π‘³π‘³πŸπŸ,𝟐𝟐 Sampler

Page 67: Non-Adaptive Adaptive Sampling on Turnstile Streamsmahabadi/slides/Stream-Adaptive-Sampling.pdf1. Simulate adaptive sampling in 1 pass turnstile stream β€’ 𝐿𝐿𝑝𝑝,2sampling

Step 1. β€’ pick 𝑑𝑑𝑖𝑖 ∈ [0,1] uniformly at randomβ€’ set 𝐡𝐡𝑖𝑖 ≔

πŸπŸπ’•π’•π’Šπ’Š

Γ— 𝐴𝐴𝑖𝑖Pr[ 𝐡𝐡𝑖𝑖 2

2 β‰₯ 𝐴𝐴 𝐹𝐹2 ] = Pr[ 𝐴𝐴𝑖𝑖 2

2

𝐴𝐴 𝐹𝐹2 β‰₯ 𝑑𝑑𝑖𝑖] = π‘¨π‘¨π’Šπ’Š 𝟐𝟐

𝟐𝟐

𝑨𝑨 π‘­π‘­πŸπŸ

π‘³π‘³πŸπŸ,𝟐𝟐 Sampler

Page 68: Non-Adaptive Adaptive Sampling on Turnstile Streamsmahabadi/slides/Stream-Adaptive-Sampling.pdf1. Simulate adaptive sampling in 1 pass turnstile stream β€’ 𝐿𝐿𝑝𝑝,2sampling

Step 1. β€’ pick 𝑑𝑑𝑖𝑖 ∈ [0,1] uniformly at randomβ€’ set 𝐡𝐡𝑖𝑖 ≔

πŸπŸπ’•π’•π’Šπ’Š

Γ— 𝐴𝐴𝑖𝑖

Return 𝑖𝑖 that satisfies π‘©π‘©π’Šπ’Š 𝟐𝟐𝟐𝟐 β‰₯ 𝑨𝑨 𝑭𝑭

𝟐𝟐

Pr[ 𝐡𝐡𝑖𝑖 22 β‰₯ 𝐴𝐴 𝐹𝐹

2 ] = Pr[ 𝐴𝐴𝑖𝑖 22

𝐴𝐴 𝐹𝐹2 β‰₯ 𝑑𝑑𝑖𝑖] = π‘¨π‘¨π’Šπ’Š 𝟐𝟐

𝟐𝟐

𝑨𝑨 π‘­π‘­πŸπŸ

π‘³π‘³πŸπŸ,𝟐𝟐 Sampler

Page 69: Non-Adaptive Adaptive Sampling on Turnstile Streamsmahabadi/slides/Stream-Adaptive-Sampling.pdf1. Simulate adaptive sampling in 1 pass turnstile stream β€’ 𝐿𝐿𝑝𝑝,2sampling

Step 1. β€’ pick 𝑑𝑑𝑖𝑖 ∈ [0,1] uniformly at randomβ€’ set 𝐡𝐡𝑖𝑖 ≔

πŸπŸπ’•π’•π’Šπ’Š

Γ— 𝐴𝐴𝑖𝑖

Return 𝑖𝑖 that satisfies π‘©π‘©π’Šπ’Š 𝟐𝟐𝟐𝟐 β‰₯ 𝑨𝑨 𝑭𝑭

𝟐𝟐

Pr[ 𝐡𝐡𝑖𝑖 22 β‰₯ 𝐴𝐴 𝐹𝐹

2 ] = Pr[ 𝐴𝐴𝑖𝑖 22

𝐴𝐴 𝐹𝐹2 β‰₯ 𝑑𝑑𝑖𝑖] = π‘¨π‘¨π’Šπ’Š 𝟐𝟐

𝟐𝟐

𝑨𝑨 π‘­π‘­πŸπŸ

π‘³π‘³πŸπŸ,𝟐𝟐 Sampler

Issues:1. Multiple rows passing the threshold

2. Don’t have access to exact values of π‘©π‘©π’Šπ’Š 𝟐𝟐𝟐𝟐 and 𝑨𝑨 𝑭𝑭

𝟐𝟐

Page 70: Non-Adaptive Adaptive Sampling on Turnstile Streamsmahabadi/slides/Stream-Adaptive-Sampling.pdf1. Simulate adaptive sampling in 1 pass turnstile stream β€’ 𝐿𝐿𝑝𝑝,2sampling

π‘³π‘³πŸπŸ,𝟐𝟐 Sampler

Step 1. β€’ pick 𝑑𝑑𝑖𝑖 ∈ [0,1] uniformly at randomβ€’ set 𝐡𝐡𝑖𝑖 ≔

πŸπŸπ’•π’•π’Šπ’Š

Γ— 𝐴𝐴𝑖𝑖 Ideally, return the only π’Šπ’Š that satisfies π‘©π‘©π’Šπ’Š 𝟐𝟐

𝟐𝟐 β‰₯ 𝑨𝑨 π‘­π‘­πŸπŸ

𝜸𝜸𝟐𝟐: = π‘ͺπ‘ͺ π₯π₯π₯π₯π₯π₯ 𝒏𝒏𝝐𝝐

Pr[ 𝐡𝐡𝑖𝑖 22 β‰₯ 𝜸𝜸𝟐𝟐 β‹… 𝐴𝐴 𝐹𝐹

2 ] = 𝟏𝟏𝜸𝜸𝟐𝟐

Γ— π‘¨π‘¨π’Šπ’Š 𝟐𝟐𝟐𝟐

𝑨𝑨 π‘­π‘­πŸπŸ

Issue 1: Multiple rows passing the threshold Set the threshold higher

Page 71: Non-Adaptive Adaptive Sampling on Turnstile Streamsmahabadi/slides/Stream-Adaptive-Sampling.pdf1. Simulate adaptive sampling in 1 pass turnstile stream β€’ 𝐿𝐿𝑝𝑝,2sampling

π‘³π‘³πŸπŸ,𝟐𝟐 Sampler

Step 1. β€’ pick 𝑑𝑑𝑖𝑖 ∈ [0,1] uniformly at randomβ€’ set 𝐡𝐡𝑖𝑖 ≔

πŸπŸπ’•π’•π’Šπ’Š

Γ— 𝐴𝐴𝑖𝑖 Ideally, return the only π’Šπ’Š that satisfies π‘©π‘©π’Šπ’Š 𝟐𝟐

𝟐𝟐 β‰₯ 𝑨𝑨 π‘­π‘­πŸπŸ

𝜸𝜸𝟐𝟐: = π‘ͺπ‘ͺ π₯π₯π₯π₯π₯π₯ 𝒏𝒏𝝐𝝐

Pr[squared norm of at least one row exceeds 𝜸𝜸𝟐𝟐 β‹… 𝐴𝐴 𝐹𝐹2 ] = Ξ©( 1

𝛾𝛾2)

Pr[squared norms of more than one row exceed 𝜸𝜸𝟐𝟐 β‹… 𝐴𝐴 𝐹𝐹2 ] = O( 1

𝛾𝛾4)

Pr[ 𝐡𝐡𝑖𝑖 22 β‰₯ 𝜸𝜸𝟐𝟐 β‹… 𝐴𝐴 𝐹𝐹

2 ] = 𝟏𝟏𝜸𝜸𝟐𝟐

Γ— π‘¨π‘¨π’Šπ’Š 𝟐𝟐𝟐𝟐

𝑨𝑨 π‘­π‘­πŸπŸ

Success prob: Ξ©( πœ–πœ–log 𝑛𝑛

)

Issue 1: Multiple rows passing the threshold Set the threshold higher

Page 72: Non-Adaptive Adaptive Sampling on Turnstile Streamsmahabadi/slides/Stream-Adaptive-Sampling.pdf1. Simulate adaptive sampling in 1 pass turnstile stream β€’ 𝐿𝐿𝑝𝑝,2sampling

π‘³π‘³πŸπŸ,𝟐𝟐 Sampler

Step 1. β€’ pick 𝑑𝑑𝑖𝑖 ∈ [0,1] uniformly at randomβ€’ set 𝐡𝐡𝑖𝑖 ≔

πŸπŸπ’•π’•π’Šπ’Š

Γ— 𝐴𝐴𝑖𝑖 Ideally, return the only π’Šπ’Š that satisfies π‘©π‘©π’Šπ’Š 𝟐𝟐

𝟐𝟐 β‰₯ 𝑨𝑨 π‘­π‘­πŸπŸ

𝜸𝜸𝟐𝟐: = π‘ͺπ‘ͺ π₯π₯π₯π₯π₯π₯ 𝒏𝒏𝝐𝝐

To succeed, repeat �𝑢𝑢(𝟏𝟏/𝝐𝝐)

Pr[ 𝐡𝐡𝑖𝑖 22 β‰₯ 𝜸𝜸𝟐𝟐 β‹… 𝐴𝐴 𝐹𝐹

2 ] = 𝟏𝟏𝜸𝜸𝟐𝟐

Γ— π‘¨π‘¨π’Šπ’Š 𝟐𝟐𝟐𝟐

𝑨𝑨 π‘­π‘­πŸπŸ

Success prob: Ξ©( πœ–πœ–log 𝑛𝑛

)

Issue 1: Multiple rows passing the threshold Set the threshold higher

Page 73: Non-Adaptive Adaptive Sampling on Turnstile Streamsmahabadi/slides/Stream-Adaptive-Sampling.pdf1. Simulate adaptive sampling in 1 pass turnstile stream β€’ 𝐿𝐿𝑝𝑝,2sampling

π‘³π‘³πŸπŸ,𝟐𝟐 Sampler

Step 1. β€’ pick 𝑑𝑑𝑖𝑖 ∈ [0,1] uniformly at randomβ€’ set 𝐡𝐡𝑖𝑖 ≔

πŸπŸπ’•π’•π’Šπ’Š

Γ— 𝐴𝐴𝑖𝑖 Return π’Šπ’Š that satisfies π‘©π‘©π’Šπ’Š 𝟐𝟐 β‰₯ 𝜸𝜸 β‹… 𝑨𝑨 𝑭𝑭

Issue 2: Don’t have access to exact values of π‘©π‘©π’Šπ’Š and 𝑨𝑨 F

estimate π‘©π‘©π’Šπ’Š 𝟐𝟐 and 𝑨𝑨 𝑭𝑭

Page 74: Non-Adaptive Adaptive Sampling on Turnstile Streamsmahabadi/slides/Stream-Adaptive-Sampling.pdf1. Simulate adaptive sampling in 1 pass turnstile stream β€’ 𝐿𝐿𝑝𝑝,2sampling

π‘³π‘³πŸπŸ,𝟐𝟐 Sampler

Step 1. β€’ pick 𝑑𝑑𝑖𝑖 ∈ [0,1] uniformly at randomβ€’ set 𝐡𝐡𝑖𝑖 ≔

πŸπŸπ’•π’•π’Šπ’Š

Γ— 𝐴𝐴𝑖𝑖 Return π’Šπ’Š that satisfies π‘©π‘©π’Šπ’Š 𝟐𝟐 β‰₯ 𝜸𝜸 β‹… 𝑨𝑨 𝑭𝑭

Issue 2: Don’t have access to exact values of π‘©π‘©π’Šπ’Š and 𝑨𝑨 F

estimate π‘©π‘©π’Šπ’Š 𝟐𝟐 and 𝑨𝑨 𝑭𝑭

Estimate norm of A using AMS

Find heaviest row using CountSketch

Page 75: Non-Adaptive Adaptive Sampling on Turnstile Streamsmahabadi/slides/Stream-Adaptive-Sampling.pdf1. Simulate adaptive sampling in 1 pass turnstile stream β€’ 𝐿𝐿𝑝𝑝,2sampling

Estimate π‘©π‘©π’Šπ’Š 𝟐𝟐 for rows with large norms

Count Sketch

Page 76: Non-Adaptive Adaptive Sampling on Turnstile Streamsmahabadi/slides/Stream-Adaptive-Sampling.pdf1. Simulate adaptive sampling in 1 pass turnstile stream β€’ 𝐿𝐿𝑝𝑝,2sampling

Given a stream of items, estimate frequency of each item (i.e., coordinates in a vector)

#rows r =𝑂𝑂(log𝑛𝑛)#buckets/row b = 𝑂𝑂(1/πœ–πœ–2)

Count Sketch

𝑓𝑓𝑖𝑖

+π‘“π‘“π‘–π‘–β„Ž1(𝑖𝑖)

β€’ Hash β„Žπ‘—π‘—: 𝑛𝑛 β†’ [𝑏𝑏]β€’ Sign πœŽπœŽπ‘—π‘—: 𝑛𝑛 β†’ {βˆ’1, +1}

β€’ Update: 𝐢𝐢[𝑗𝑗,β„Žπ‘—π‘—(𝑖𝑖)] += πœŽπœŽπ‘—π‘—(𝑖𝑖) β‹… 𝑓𝑓𝑖𝑖

β€’

Page 77: Non-Adaptive Adaptive Sampling on Turnstile Streamsmahabadi/slides/Stream-Adaptive-Sampling.pdf1. Simulate adaptive sampling in 1 pass turnstile stream β€’ 𝐿𝐿𝑝𝑝,2sampling

Given a stream of items, estimate frequency of each item (i.e., coordinates in a vector)

#rows r =𝑂𝑂(log𝑛𝑛)#buckets/row b = 𝑂𝑂(1/πœ–πœ–2)

Count Sketch

𝑓𝑓𝑖𝑖

+𝑓𝑓𝑖𝑖

β€’ Hash β„Žπ‘—π‘—: 𝑛𝑛 β†’ [𝑏𝑏]β€’ Sign πœŽπœŽπ‘—π‘—: 𝑛𝑛 β†’ {βˆ’1, +1}

β€’ Update: 𝐢𝐢[𝑗𝑗,β„Žπ‘—π‘—(𝑖𝑖)] += πœŽπœŽπ‘—π‘—(𝑖𝑖) β‹… 𝑓𝑓𝑖𝑖

β€’βˆ’π‘“π‘“π‘–π‘–β„Ž2(𝑖𝑖)

Page 78: Non-Adaptive Adaptive Sampling on Turnstile Streamsmahabadi/slides/Stream-Adaptive-Sampling.pdf1. Simulate adaptive sampling in 1 pass turnstile stream β€’ 𝐿𝐿𝑝𝑝,2sampling

Given a stream of items, estimate frequency of each item (i.e., coordinates in a vector)

#rows r =𝑂𝑂(log𝑛𝑛)#buckets/row b = 𝑂𝑂(1/πœ–πœ–2)

Count Sketch

𝑓𝑓𝑖𝑖

+𝑓𝑓𝑖𝑖

β€’ Hash β„Žπ‘—π‘—: 𝑛𝑛 β†’ [𝑏𝑏]β€’ Sign πœŽπœŽπ‘—π‘—: 𝑛𝑛 β†’ {βˆ’1, +1}

βˆ’π‘“π‘“π‘–π‘–+π‘“π‘“π‘–π‘–β„Ž3(𝑖𝑖)

β€’ Update: 𝐢𝐢[𝑗𝑗,β„Žπ‘—π‘—(𝑖𝑖)] += πœŽπœŽπ‘—π‘—(𝑖𝑖) β‹… 𝑓𝑓𝑖𝑖

β€’

Page 79: Non-Adaptive Adaptive Sampling on Turnstile Streamsmahabadi/slides/Stream-Adaptive-Sampling.pdf1. Simulate adaptive sampling in 1 pass turnstile stream β€’ 𝐿𝐿𝑝𝑝,2sampling

Given a stream of items, estimate frequency of each item (i.e., coordinates in a vector)

#rows r =𝑂𝑂(log𝑛𝑛)#buckets/row b = 𝑂𝑂(1/πœ–πœ–2)

Count Sketch

𝑓𝑓𝑖𝑖

+𝑓𝑓𝑖𝑖

β€’ Hash β„Žπ‘—π‘—: 𝑛𝑛 β†’ [𝑏𝑏]β€’ Sign πœŽπœŽπ‘—π‘—: 𝑛𝑛 β†’ {βˆ’1, +1}

β€’ Update: 𝐢𝐢[𝑗𝑗,β„Žπ‘—π‘—(𝑖𝑖)] += πœŽπœŽπ‘—π‘—(𝑖𝑖) β‹… 𝑓𝑓𝑖𝑖

β€’ Estimate 𝑓𝑓𝑖𝑖 ≔ π‘šπ‘šπ‘šπ‘šπ‘‘π‘‘π‘–π‘–π‘šπ‘šπ‘›π‘›π‘—π‘— πœŽπœŽπ‘—π‘—πΆπΆ[𝑗𝑗,β„Žπ‘—π‘—(𝑖𝑖)]βˆ’π‘“π‘“π‘–π‘–

+𝑓𝑓𝑖𝑖

Page 80: Non-Adaptive Adaptive Sampling on Turnstile Streamsmahabadi/slides/Stream-Adaptive-Sampling.pdf1. Simulate adaptive sampling in 1 pass turnstile stream β€’ 𝐿𝐿𝑝𝑝,2sampling

Given a stream of items, estimate frequency of each item (i.e., coordinates in a vector)

#rows r =𝑂𝑂(log𝑛𝑛)#buckets/row b = 𝑂𝑂(1/πœ–πœ–2)

Count Sketch

𝑓𝑓𝑖𝑖

+π‘“π‘“π‘–π‘–βˆ’π‘“π‘“π‘–π‘–

+𝑓𝑓𝑖𝑖

Estimation guarantee𝑓𝑓𝑖𝑖 βˆ’ 𝑓𝑓𝑖𝑖 ≀ πœ–πœ– β‹… 𝐟𝐟 2

β€’ Update: 𝐢𝐢[𝑗𝑗,β„Žπ‘—π‘—(𝑖𝑖)] += πœŽπœŽπ‘—π‘—(𝑖𝑖) β‹… 𝑓𝑓𝑖𝑖

β€’ Estimate 𝑓𝑓𝑖𝑖 ≔ π‘šπ‘šπ‘šπ‘šπ‘‘π‘‘π‘–π‘–π‘šπ‘šπ‘›π‘›π‘—π‘— πœŽπœŽπ‘—π‘—πΆπΆ[𝑗𝑗,β„Žπ‘—π‘—(𝑖𝑖)]

Page 81: Non-Adaptive Adaptive Sampling on Turnstile Streamsmahabadi/slides/Stream-Adaptive-Sampling.pdf1. Simulate adaptive sampling in 1 pass turnstile stream β€’ 𝐿𝐿𝑝𝑝,2sampling

Estimate π‘©π‘©π’Šπ’Š 𝟐𝟐 for rows with large norms

#rows r =𝑂𝑂(log𝑛𝑛)#buckets/row b = 𝑂𝑂(1/πœ–πœ–2)

Count Sketch

𝐡𝐡𝑖𝑖

+π΅π΅π‘–π‘–βˆ’π΅π΅π‘–π‘–

+𝐡𝐡𝑖𝑖

Estimation guarantee

𝐡𝐡𝑖𝑖 2 βˆ’ �𝐡𝐡𝑖𝑖 2 ≀ πœ–πœ– β‹… 𝐡𝐡 𝐹𝐹

Space usage:

𝑂𝑂 log𝑛𝑛 Γ—1πœ–πœ–2

Γ— 𝑑𝑑

Page 82: Non-Adaptive Adaptive Sampling on Turnstile Streamsmahabadi/slides/Stream-Adaptive-Sampling.pdf1. Simulate adaptive sampling in 1 pass turnstile stream β€’ 𝐿𝐿𝑝𝑝,2sampling

Step 1. β€’ pick 𝑑𝑑𝑖𝑖 ∈ [0,1] uniformly at randomβ€’ set 𝐡𝐡𝑖𝑖 ≔

πŸπŸπ’•π’•π’Šπ’Š

Γ— 𝐴𝐴𝑖𝑖

π‘³π‘³πŸπŸ,𝟐𝟐 Sampler

Goal: 𝐡𝐡𝑖𝑖 2 β‰₯ 𝜸𝜸 β‹… 𝐴𝐴 𝐹𝐹

Page 83: Non-Adaptive Adaptive Sampling on Turnstile Streamsmahabadi/slides/Stream-Adaptive-Sampling.pdf1. Simulate adaptive sampling in 1 pass turnstile stream β€’ 𝐿𝐿𝑝𝑝,2sampling

Step 1. β€’ pick 𝑑𝑑𝑖𝑖 ∈ [0,1] uniformly at randomβ€’ set 𝐡𝐡𝑖𝑖 ≔

πŸπŸπ’•π’•π’Šπ’Š

Γ— 𝐴𝐴𝑖𝑖

Step 2. β€’ οΏ½π‘©π‘©π’Šπ’Š 𝟐𝟐is an estimate of π‘©π‘©π’Šπ’Š 𝟐𝟐 by modified Countsketchβ€’ �𝑭𝑭 is an estimate of 𝑨𝑨 𝑭𝑭 by modified AMS

π‘³π‘³πŸπŸ,𝟐𝟐 Sampler

Goal: 𝐡𝐡𝑖𝑖 2 β‰₯ 𝜸𝜸 β‹… 𝐴𝐴 𝐹𝐹

Test: �𝐡𝐡𝑖𝑖 2 β‰₯ 𝜸𝜸 β‹… �𝐹𝐹

Page 84: Non-Adaptive Adaptive Sampling on Turnstile Streamsmahabadi/slides/Stream-Adaptive-Sampling.pdf1. Simulate adaptive sampling in 1 pass turnstile stream β€’ 𝐿𝐿𝑝𝑝,2sampling

Step 1. β€’ pick 𝑑𝑑𝑖𝑖 ∈ [0,1] uniformly at randomβ€’ set 𝐡𝐡𝑖𝑖 ≔

πŸπŸπ’•π’•π’Šπ’Š

Γ— 𝐴𝐴𝑖𝑖

Step 2. β€’ οΏ½π‘©π‘©π’Šπ’Š 𝟐𝟐is an estimate of π‘©π‘©π’Šπ’Š 𝟐𝟐 by modified Countsketchβ€’ �𝑭𝑭 is an estimate of 𝑨𝑨 𝑭𝑭 by modified AMS

The test succeeds w.p. πœ–πœ–, the estimate of largest row exceeds the threshold

π‘³π‘³πŸπŸ,𝟐𝟐 Sampler

Goal: 𝐡𝐡𝑖𝑖 2 β‰₯ 𝜸𝜸 β‹… 𝐴𝐴 𝐹𝐹

Test: �𝐡𝐡𝑖𝑖 2 β‰₯ 𝜸𝜸 β‹… �𝐹𝐹

Page 85: Non-Adaptive Adaptive Sampling on Turnstile Streamsmahabadi/slides/Stream-Adaptive-Sampling.pdf1. Simulate adaptive sampling in 1 pass turnstile stream β€’ 𝐿𝐿𝑝𝑝,2sampling

Handling Post-Processing Matrix

Input: matrix A as a data stream, a post-processing matrix P

Output: index 𝑖𝑖 of a row of AP sampled w.p. ~ 𝐴𝐴𝑖𝑖𝑃𝑃 22

𝐴𝐴𝑃𝑃 𝐹𝐹2

Page 86: Non-Adaptive Adaptive Sampling on Turnstile Streamsmahabadi/slides/Stream-Adaptive-Sampling.pdf1. Simulate adaptive sampling in 1 pass turnstile stream β€’ 𝐿𝐿𝑝𝑝,2sampling

Handling Post-Processing Matrix

Run proposed algorithm on A, then multiply by P:β€’ CountSketch and AMS both are linear transformations

A is mapped to SA S (AP) = (SA) P

Input: matrix A as a data stream, a post-processing matrix P

Output: index 𝑖𝑖 of a row of AP sampled w.p. ~ 𝐴𝐴𝑖𝑖𝑃𝑃 22

𝐴𝐴𝑃𝑃 𝐹𝐹2

Page 87: Non-Adaptive Adaptive Sampling on Turnstile Streamsmahabadi/slides/Stream-Adaptive-Sampling.pdf1. Simulate adaptive sampling in 1 pass turnstile stream β€’ 𝐿𝐿𝑝𝑝,2sampling

Handling Post-Processing Matrix

Run proposed algorithm on A, then multiply by P:β€’ CountSketch and AMS both are linear transformations

A is mapped to SA S (AP) = (SA) P

Total space for sampler: 𝑂𝑂( π‘‘π‘‘πœ–πœ–2

log2 𝑛𝑛) bits

Input: matrix A as a data stream, a post-processing matrix P

Output: index 𝑖𝑖 of a row of AP sampled w.p. ~ 𝐴𝐴𝑖𝑖𝑃𝑃 22

𝐴𝐴𝑃𝑃 𝐹𝐹2

Page 88: Non-Adaptive Adaptive Sampling on Turnstile Streamsmahabadi/slides/Stream-Adaptive-Sampling.pdf1. Simulate adaptive sampling in 1 pass turnstile stream β€’ 𝐿𝐿𝑝𝑝,2sampling

π‘³π‘³πŸπŸ,𝟐𝟐 sampling with post processingInput:β€’ 𝐴𝐴 ∈ ℝ𝑛𝑛×𝑑𝑑 as a (turnstile) streamβ€’ a post-processing 𝑷𝑷 ∈ ℝ𝑑𝑑×𝑑𝑑

Output: samples an index 𝑖𝑖 ∈ [𝑛𝑛] w.p. 1 Β± πœ–πœ– π‘¨π‘¨π’Šπ’Šπ‘·π‘· 22

𝑨𝑨𝑷𝑷 𝐹𝐹2 + 1

𝑝𝑝𝑝𝑝𝑝𝑝𝑝𝑝(𝑛𝑛) In one pass 𝑝𝑝𝑃𝑃𝑝𝑝𝑝𝑝(𝑑𝑑, πœ–πœ–βˆ’1, log𝑛𝑛) space

Page 89: Non-Adaptive Adaptive Sampling on Turnstile Streamsmahabadi/slides/Stream-Adaptive-Sampling.pdf1. Simulate adaptive sampling in 1 pass turnstile stream β€’ 𝐿𝐿𝑝𝑝,2sampling

Adaptive Sampler1. Simulate adaptive sampling in 1 pass

β€’ 𝐿𝐿𝑝𝑝,2 sampling with post processing matrix 𝑃𝑃

2. Applications in turnstile streamβ€’ Row/column subset selectionβ€’ Subspace approximationβ€’ Projective clusteringβ€’ Volume Maximization

3. Volume maximization lower bounds

4. Volume maximization in row arrival

Page 90: Non-Adaptive Adaptive Sampling on Turnstile Streamsmahabadi/slides/Stream-Adaptive-Sampling.pdf1. Simulate adaptive sampling in 1 pass turnstile stream β€’ 𝐿𝐿𝑝𝑝,2sampling

Algorithm Using π‘³π‘³πŸπŸ,𝟐𝟐 SamplerMaintain π‘˜π‘˜ instances of 𝐿𝐿2,2 sampler with post processing: π‘Ίπ‘ΊπŸπŸ, … ,π‘Ίπ‘Ίπ’Œπ’Œπ‘€π‘€ ← βˆ…For round 𝑖𝑖 = 1 to π‘˜π‘˜,

β€’ Set 𝑃𝑃 ← 𝐼𝐼 βˆ’π‘€π‘€+𝑀𝑀‒ Use π‘Ίπ‘Ίπ’Šπ’Š to sample a noisy row 𝑃𝑃𝑗𝑗 of 𝐴𝐴 with post processing matrix 𝑃𝑃‒ Append 𝑃𝑃𝑗𝑗 to 𝑀𝑀

Page 91: Non-Adaptive Adaptive Sampling on Turnstile Streamsmahabadi/slides/Stream-Adaptive-Sampling.pdf1. Simulate adaptive sampling in 1 pass turnstile stream β€’ 𝐿𝐿𝑝𝑝,2sampling

Algorithm Using π‘³π‘³πŸπŸ,𝟐𝟐 SamplerMaintain π‘˜π‘˜ instances of 𝐿𝐿2,2 sampler with post processing: π‘Ίπ‘ΊπŸπŸ, … ,π‘Ίπ‘Ίπ’Œπ’Œπ‘€π‘€ ← βˆ…For round 𝑖𝑖 = 1 to π‘˜π‘˜,

β€’ Set 𝑃𝑃 ← 𝐼𝐼 βˆ’π‘€π‘€+𝑀𝑀‒ Use π‘Ίπ‘Ίπ’Šπ’Š to sample a noisy row 𝑃𝑃𝑗𝑗 of 𝐴𝐴 with post processing matrix 𝑃𝑃‒ Append 𝑃𝑃𝑗𝑗 to 𝑀𝑀

Issues:X Noisy perturbation of rows (unavoidable) Sample 𝑗𝑗, 𝑃𝑃𝑗𝑗 = A𝑗𝑗𝑃𝑃 + v where v has small norm v < πœ–πœ– 𝐴𝐴𝑗𝑗𝑃𝑃 thus 𝑃𝑃𝑗𝑗 β‰ˆ 𝐴𝐴𝑗𝑗𝑃𝑃

Page 92: Non-Adaptive Adaptive Sampling on Turnstile Streamsmahabadi/slides/Stream-Adaptive-Sampling.pdf1. Simulate adaptive sampling in 1 pass turnstile stream β€’ 𝐿𝐿𝑝𝑝,2sampling

Algorithm Using π‘³π‘³πŸπŸ,𝟐𝟐 SamplerMaintain π‘˜π‘˜ instances of 𝐿𝐿2,2 sampler with post processing: π‘Ίπ‘ΊπŸπŸ, … ,π‘Ίπ‘Ίπ’Œπ’Œπ‘€π‘€ ← βˆ…For round 𝑖𝑖 = 1 to π‘˜π‘˜,

β€’ Set 𝑃𝑃 ← 𝐼𝐼 βˆ’π‘€π‘€+𝑀𝑀‒ Use π‘Ίπ‘Ίπ’Šπ’Š to sample a noisy row 𝑃𝑃𝑗𝑗 of 𝐴𝐴 with post processing matrix 𝑃𝑃‒ Append 𝑃𝑃𝑗𝑗 to 𝑀𝑀

Issues:X Noisy perturbation of rows (unavoidable) Sample 𝑗𝑗, 𝑃𝑃𝑗𝑗 = A𝑗𝑗𝑃𝑃 + v where v has small norm v < πœ–πœ– 𝐴𝐴𝑗𝑗𝑃𝑃 thus 𝑃𝑃𝑗𝑗 β‰ˆ 𝐴𝐴𝑗𝑗𝑃𝑃

X This can drastically change the probabilities: may zero out probabilities of some rows

Page 93: Non-Adaptive Adaptive Sampling on Turnstile Streamsmahabadi/slides/Stream-Adaptive-Sampling.pdf1. Simulate adaptive sampling in 1 pass turnstile stream β€’ 𝐿𝐿𝑝𝑝,2sampling

Bad Example

π‘¨π‘¨πŸπŸ = (𝑴𝑴,𝟎𝟎)

π‘¨π‘¨πŸπŸ = (𝟎𝟎,𝟏𝟏)

Page 94: Non-Adaptive Adaptive Sampling on Turnstile Streamsmahabadi/slides/Stream-Adaptive-Sampling.pdf1. Simulate adaptive sampling in 1 pass turnstile stream β€’ 𝐿𝐿𝑝𝑝,2sampling

Bad Example

π‘¨π‘¨πŸπŸ

π‘¨π‘¨πŸπŸ π’“π’“πŸπŸ

Page 95: Non-Adaptive Adaptive Sampling on Turnstile Streamsmahabadi/slides/Stream-Adaptive-Sampling.pdf1. Simulate adaptive sampling in 1 pass turnstile stream β€’ 𝐿𝐿𝑝𝑝,2sampling

Bad Example

π’“π’“πŸπŸ

Page 96: Non-Adaptive Adaptive Sampling on Turnstile Streamsmahabadi/slides/Stream-Adaptive-Sampling.pdf1. Simulate adaptive sampling in 1 pass turnstile stream β€’ 𝐿𝐿𝑝𝑝,2sampling

Bad Example

π‘¨π‘¨πŸπŸ(𝑰𝑰 βˆ’π‘΄π‘΄+𝑴𝑴)

π‘¨π‘¨πŸπŸ(𝑰𝑰 βˆ’π‘΄π‘΄+𝑴𝑴) π’“π’“πŸπŸ

Noisy row sampling: π‘¨π‘¨πŸπŸ(𝑰𝑰 βˆ’π‘΄π‘΄+𝑴𝑴) β‰₯ π‘¨π‘¨πŸπŸ(𝑰𝑰 βˆ’π‘΄π‘΄+𝑴𝑴)

Page 97: Non-Adaptive Adaptive Sampling on Turnstile Streamsmahabadi/slides/Stream-Adaptive-Sampling.pdf1. Simulate adaptive sampling in 1 pass turnstile stream β€’ 𝐿𝐿𝑝𝑝,2sampling

Bad Example

π‘¨π‘¨πŸπŸ(𝑰𝑰 βˆ’π‘΄π‘΄+𝑴𝑴)

π‘¨π‘¨πŸπŸ(𝑰𝑰 βˆ’π‘΄π‘΄+𝑴𝑴) π’“π’“πŸπŸ

Sample one row again and again

Noisy row sampling: π‘¨π‘¨πŸπŸ(𝑰𝑰 βˆ’π‘΄π‘΄+𝑴𝑴) β‰₯ π‘¨π‘¨πŸπŸ(𝑰𝑰 βˆ’π‘΄π‘΄+𝑴𝑴)

Page 98: Non-Adaptive Adaptive Sampling on Turnstile Streamsmahabadi/slides/Stream-Adaptive-Sampling.pdf1. Simulate adaptive sampling in 1 pass turnstile stream β€’ 𝐿𝐿𝑝𝑝,2sampling

Bad Example

Noisy row sampling: π‘¨π‘¨πŸπŸ(𝑰𝑰 βˆ’π‘΄π‘΄+𝑴𝑴) β‰₯ π‘¨π‘¨πŸπŸ(𝑰𝑰 βˆ’π‘΄π‘΄+𝑴𝑴)

True row sampling: π‘¨π‘¨πŸπŸ(𝑰𝑰 βˆ’π‘΄π‘΄+𝑴𝑴) = 0

Page 99: Non-Adaptive Adaptive Sampling on Turnstile Streamsmahabadi/slides/Stream-Adaptive-Sampling.pdf1. Simulate adaptive sampling in 1 pass turnstile stream β€’ 𝐿𝐿𝑝𝑝,2sampling

Bad Example

x We cannot hope for a multiplicative bound on probabilities.

Page 100: Non-Adaptive Adaptive Sampling on Turnstile Streamsmahabadi/slides/Stream-Adaptive-Sampling.pdf1. Simulate adaptive sampling in 1 pass turnstile stream β€’ 𝐿𝐿𝑝𝑝,2sampling

Bad Example

x We cannot hope for a multiplicative bound on probabilities.

Lemma: Not only the norm of 𝑣𝑣 is small in compare to 𝐴𝐴𝑗𝑗 but also its norm projected away from 𝐴𝐴𝑗𝑗 is small

Page 101: Non-Adaptive Adaptive Sampling on Turnstile Streamsmahabadi/slides/Stream-Adaptive-Sampling.pdf1. Simulate adaptive sampling in 1 pass turnstile stream β€’ 𝐿𝐿𝑝𝑝,2sampling

Bad Example

x We cannot hope for a multiplicative bound on probabilities.

Lemma: Not only the norm of 𝑣𝑣 is small in compare to 𝐴𝐴𝑗𝑗 but also its norm projected away from 𝐴𝐴𝑗𝑗 is small:

β€’ 𝑃𝑃𝑗𝑗 = 𝐴𝐴𝑗𝑗𝑃𝑃 + 𝑣𝑣

β€’ where 𝒗𝒗𝑸𝑸 ≀ πœ–πœ– 𝐴𝐴𝑗𝑗𝑃𝑃 β‹… 𝑨𝑨𝑷𝑷𝑸𝑸 𝑭𝑭𝑨𝑨𝑷𝑷 𝑭𝑭

for any projection matrix 𝑸𝑸

Page 102: Non-Adaptive Adaptive Sampling on Turnstile Streamsmahabadi/slides/Stream-Adaptive-Sampling.pdf1. Simulate adaptive sampling in 1 pass turnstile stream β€’ 𝐿𝐿𝑝𝑝,2sampling

Bad Example

x We cannot hope for a multiplicative bound on probabilities.

Lemma: Not only the norm of 𝑣𝑣 is small in compare to 𝐴𝐴𝑗𝑗 but also its norm projected away from 𝐴𝐴𝑗𝑗 is small:

β€’ 𝑃𝑃𝑗𝑗 = 𝐴𝐴𝑗𝑗𝑃𝑃 + 𝑣𝑣

β€’ where 𝒗𝒗𝑸𝑸 ≀ πœ–πœ– 𝐴𝐴𝑗𝑗𝑃𝑃 β‹… 𝑨𝑨𝑷𝑷𝑸𝑸 𝑭𝑭𝑨𝑨𝑷𝑷 𝑭𝑭

for any projection matrix 𝑸𝑸

Page 103: Non-Adaptive Adaptive Sampling on Turnstile Streamsmahabadi/slides/Stream-Adaptive-Sampling.pdf1. Simulate adaptive sampling in 1 pass turnstile stream β€’ 𝐿𝐿𝑝𝑝,2sampling

Bad Example

x We cannot hope for a multiplicative bound on probabilities.

Lemma: Not only the norm of 𝑣𝑣 is small in compare to 𝐴𝐴𝑗𝑗 but also its norm projected away from 𝐴𝐴𝑗𝑗 is small:

β€’ 𝑃𝑃𝑗𝑗 = 𝐴𝐴𝑗𝑗𝑃𝑃 + 𝑣𝑣

β€’ where 𝒗𝒗𝑸𝑸 ≀ πœ–πœ– 𝐴𝐴𝑗𝑗𝑃𝑃 β‹… 𝑨𝑨𝑷𝑷𝑸𝑸 𝑭𝑭𝑨𝑨𝑷𝑷 𝑭𝑭

for any projection matrix 𝑸𝑸

Bound the additive error of sampling probabilities in subsequent rounds

Page 104: Non-Adaptive Adaptive Sampling on Turnstile Streamsmahabadi/slides/Stream-Adaptive-Sampling.pdf1. Simulate adaptive sampling in 1 pass turnstile stream β€’ 𝐿𝐿𝑝𝑝,2sampling

Overview of How to Bound the ErrorSuppose indices reported by our algorithm are 𝑗𝑗1, … , π‘—π‘—π‘˜π‘˜Consider two bases 𝑼𝑼 and 𝑾𝑾‒ 𝑼𝑼 follows True rows: π‘ˆπ‘ˆ = 𝑒𝑒1, … ,𝑒𝑒𝑑𝑑 s.t. {𝑒𝑒1, … ,𝑒𝑒𝑖𝑖} spans {𝐴𝐴𝑗𝑗1 , … ,𝐴𝐴𝑗𝑗𝑖𝑖}β€’ 𝑾𝑾 follows Noisy rows: π‘Šπ‘Š = 𝑀𝑀1, … ,𝑀𝑀𝑑𝑑 s.t. {𝑀𝑀1, … ,𝑀𝑀𝑖𝑖} spans {𝑃𝑃𝑗𝑗1 , … , 𝑃𝑃𝑗𝑗𝑖𝑖}

Page 105: Non-Adaptive Adaptive Sampling on Turnstile Streamsmahabadi/slides/Stream-Adaptive-Sampling.pdf1. Simulate adaptive sampling in 1 pass turnstile stream β€’ 𝐿𝐿𝑝𝑝,2sampling

Overview of How to Bound the Error

For row 𝐴𝐴π‘₯π‘₯:

β€’ 𝐴𝐴π‘₯π‘₯ = βˆ‘π‘–π‘–=1𝑑𝑑 πœ†πœ†π‘₯π‘₯,𝑖𝑖𝑒𝑒𝑖𝑖‒ 𝐴𝐴π‘₯π‘₯ = βˆ‘π‘–π‘–=1𝑑𝑑 πœ‰πœ‰π‘₯π‘₯,𝑖𝑖𝑀𝑀𝑖𝑖

Suppose indices reported by our algorithm are 𝑗𝑗1, … , π‘—π‘—π‘˜π‘˜Consider two bases 𝑼𝑼 and 𝑾𝑾‒ 𝑼𝑼 follows True rows: π‘ˆπ‘ˆ = 𝑒𝑒1, … ,𝑒𝑒𝑑𝑑 s.t. {𝑒𝑒1, … ,𝑒𝑒𝑖𝑖} spans {𝐴𝐴𝑗𝑗1 , … ,𝐴𝐴𝑗𝑗𝑖𝑖}β€’ 𝑾𝑾 follows Noisy rows: π‘Šπ‘Š = 𝑀𝑀1, … ,𝑀𝑀𝑑𝑑 s.t. {𝑀𝑀1, … ,𝑀𝑀𝑖𝑖} spans {𝑃𝑃𝑗𝑗1 , … , 𝑃𝑃𝑗𝑗𝑖𝑖}

Page 106: Non-Adaptive Adaptive Sampling on Turnstile Streamsmahabadi/slides/Stream-Adaptive-Sampling.pdf1. Simulate adaptive sampling in 1 pass turnstile stream β€’ 𝐿𝐿𝑝𝑝,2sampling

Overview of How to Bound the Error

For row 𝐴𝐴π‘₯π‘₯:

β€’ 𝐴𝐴π‘₯π‘₯ = βˆ‘π‘–π‘–=1𝑑𝑑 πœ†πœ†π‘₯π‘₯,𝑖𝑖𝑒𝑒𝑖𝑖‒ 𝐴𝐴π‘₯π‘₯ = βˆ‘π‘–π‘–=1𝑑𝑑 πœ‰πœ‰π‘₯π‘₯,𝑖𝑖𝑀𝑀𝑖𝑖

Sampling probs in terms of π‘ˆπ‘ˆ and π‘Šπ‘Š in t-th round

β€’ The correct probability: βˆ‘π‘–π‘–=𝑑𝑑𝑑𝑑 πœ†πœ†π‘₯π‘₯,𝑖𝑖

2

βˆ‘π‘¦π‘¦=1𝑛𝑛 βˆ‘π‘–π‘–=𝑑𝑑𝑑𝑑 πœ†πœ†π‘¦π‘¦,𝑖𝑖

2

β€’ What we sample from: βˆ‘π‘–π‘–=𝑑𝑑𝑑𝑑 πœ‰πœ‰π‘₯π‘₯,𝑖𝑖

2

βˆ‘π‘¦π‘¦=1𝑛𝑛 βˆ‘π‘–π‘–=𝑑𝑑𝑑𝑑 πœ‰πœ‰π‘¦π‘¦,𝑖𝑖

2

Suppose indices reported by our algorithm are 𝑗𝑗1, … , π‘—π‘—π‘˜π‘˜Consider two bases 𝑼𝑼 and 𝑾𝑾‒ 𝑼𝑼 follows True rows: π‘ˆπ‘ˆ = 𝑒𝑒1, … ,𝑒𝑒𝑑𝑑 s.t. {𝑒𝑒1, … ,𝑒𝑒𝑖𝑖} spans {𝐴𝐴𝑗𝑗1 , … ,𝐴𝐴𝑗𝑗𝑖𝑖}β€’ 𝑾𝑾 follows Noisy rows: π‘Šπ‘Š = 𝑀𝑀1, … ,𝑀𝑀𝑑𝑑 s.t. {𝑀𝑀1, … ,𝑀𝑀𝑖𝑖} spans {𝑃𝑃𝑗𝑗1 , … , 𝑃𝑃𝑗𝑗𝑖𝑖}

Page 107: Non-Adaptive Adaptive Sampling on Turnstile Streamsmahabadi/slides/Stream-Adaptive-Sampling.pdf1. Simulate adaptive sampling in 1 pass turnstile stream β€’ 𝐿𝐿𝑝𝑝,2sampling

Overview of How to Bound the Error

For row 𝐴𝐴π‘₯π‘₯:

β€’ 𝐴𝐴π‘₯π‘₯ = βˆ‘π‘–π‘–=1𝑑𝑑 πœ†πœ†π‘₯π‘₯,𝑖𝑖𝑒𝑒𝑖𝑖‒ 𝐴𝐴π‘₯π‘₯ = βˆ‘π‘–π‘–=1𝑑𝑑 πœ‰πœ‰π‘₯π‘₯,𝑖𝑖𝑀𝑀𝑖𝑖

Sampling probs in terms of π‘ˆπ‘ˆ and π‘Šπ‘Š in t-th round

β€’ The correct probability: βˆ‘π‘–π‘–=𝑑𝑑𝑑𝑑 πœ†πœ†π‘₯π‘₯,𝑖𝑖

2

βˆ‘π‘¦π‘¦=1𝑛𝑛 βˆ‘π‘–π‘–=𝑑𝑑𝑑𝑑 πœ†πœ†π‘¦π‘¦,𝑖𝑖

2

β€’ What we sample from: βˆ‘π‘–π‘–=𝑑𝑑𝑑𝑑 πœ‰πœ‰π‘₯π‘₯,𝑖𝑖

2

βˆ‘π‘¦π‘¦=1𝑛𝑛 βˆ‘π‘–π‘–=𝑑𝑑𝑑𝑑 πœ‰πœ‰π‘¦π‘¦,𝑖𝑖

2

Suppose indices reported by our algorithm are 𝑗𝑗1, … , π‘—π‘—π‘˜π‘˜Consider two bases 𝑼𝑼 and 𝑾𝑾‒ 𝑼𝑼 follows True rows: π‘ˆπ‘ˆ = 𝑒𝑒1, … ,𝑒𝑒𝑑𝑑 s.t. {𝑒𝑒1, … ,𝑒𝑒𝑖𝑖} spans {𝐴𝐴𝑗𝑗1 , … ,𝐴𝐴𝑗𝑗𝑖𝑖}β€’ 𝑾𝑾 follows Noisy rows: π‘Šπ‘Š = 𝑀𝑀1, … ,𝑀𝑀𝑑𝑑 s.t. {𝑀𝑀1, … ,𝑀𝑀𝑖𝑖} spans {𝑃𝑃𝑗𝑗1 , … , 𝑃𝑃𝑗𝑗𝑖𝑖}

Difference between the correct prob and our algorithm sampling prob over all rows is πœ–πœ– for one round

β€’ Change of basis matrix β‰ˆ Identity matrixβ€’ Bound total variation distance by πœ–πœ–

Error in each round gets propagated π‘˜π‘˜ times

Total error is O(π‘˜π‘˜2πœ–πœ–)

Page 108: Non-Adaptive Adaptive Sampling on Turnstile Streamsmahabadi/slides/Stream-Adaptive-Sampling.pdf1. Simulate adaptive sampling in 1 pass turnstile stream β€’ 𝐿𝐿𝑝𝑝,2sampling

Theorem:Our algorithm reports a set of π‘˜π‘˜ indices such that with high probability β€’ the total variation distance between the probability distribution output by the

algorithm and the probability distribution of adaptive sampling is at most 𝑂𝑂(πœ–πœ–)

β€’ The algorithm uses space 𝑝𝑝𝑃𝑃𝑝𝑝𝑝𝑝(π‘˜π‘˜, 1πœ–πœ–

,𝑑𝑑, log𝑛𝑛)

Page 109: Non-Adaptive Adaptive Sampling on Turnstile Streamsmahabadi/slides/Stream-Adaptive-Sampling.pdf1. Simulate adaptive sampling in 1 pass turnstile stream β€’ 𝐿𝐿𝑝𝑝,2sampling

Applications1. Simulate adaptive sampling in 1 pass

β€’ 𝐿𝐿𝑝𝑝,2 sampling with post processing matrix 𝑃𝑃

2. Applications in turnstile streamβ€’ Row/column subset selectionβ€’ Subspace approximationβ€’ Projective clusteringβ€’ Volume Maximization

3. Volume maximization lower bounds

4. Volume maximization in row arrival

Page 110: Non-Adaptive Adaptive Sampling on Turnstile Streamsmahabadi/slides/Stream-Adaptive-Sampling.pdf1. Simulate adaptive sampling in 1 pass turnstile stream β€’ 𝐿𝐿𝑝𝑝,2sampling

ApplicationsMain Challenge: it suffices to get a noisy perturbation of the rows

Page 111: Non-Adaptive Adaptive Sampling on Turnstile Streamsmahabadi/slides/Stream-Adaptive-Sampling.pdf1. Simulate adaptive sampling in 1 pass turnstile stream β€’ 𝐿𝐿𝑝𝑝,2sampling

Applications: Row Subset Selection

Input: 𝐴𝐴 ∈ ℝ𝑛𝑛×𝑑𝑑 and an integer π’Œπ’Œ > 0

Output: π’Œπ’Œ rows of 𝐴𝐴 to form 𝑴𝑴 to minimize 𝐴𝐴 βˆ’ 𝐴𝐴𝑀𝑀+𝑀𝑀 𝐹𝐹

Page 112: Non-Adaptive Adaptive Sampling on Turnstile Streamsmahabadi/slides/Stream-Adaptive-Sampling.pdf1. Simulate adaptive sampling in 1 pass turnstile stream β€’ 𝐿𝐿𝑝𝑝,2sampling

Applications: Row Subset SelectionAdaptive Sampling provides a π’Œπ’Œ + 𝟏𝟏 ! approximation for subset selection

[DRVW’06]: Volume Sampling provides a π‘˜π‘˜ + 1 factor approximation to row subset selection with constant probability.

[DV’06]: Sampling probabilities for any π‘˜π‘˜-set 𝑆𝑆 produced by Adaptive Sampling is at most π‘˜π‘˜! of its sampling probability with respect to volume sampling.

Page 113: Non-Adaptive Adaptive Sampling on Turnstile Streamsmahabadi/slides/Stream-Adaptive-Sampling.pdf1. Simulate adaptive sampling in 1 pass turnstile stream β€’ 𝐿𝐿𝑝𝑝,2sampling

Applications: Row Subset SelectionAdaptive Sampling provides a π’Œπ’Œ + 𝟏𝟏 ! approximation for subset selection

[DRVW’06]: Volume Sampling provides a π‘˜π‘˜ + 1 factor approximation to row subset selection with constant probability.

[DV’06]: Sampling probabilities for any π‘˜π‘˜-set 𝑆𝑆 produced by Adaptive Sampling is at most π‘˜π‘˜! of its sampling probability with respect to volume sampling.

Non-adaptive Adaptive Sampling provides a good approximation to Adaptive Sampling

Page 114: Non-Adaptive Adaptive Sampling on Turnstile Streamsmahabadi/slides/Stream-Adaptive-Sampling.pdf1. Simulate adaptive sampling in 1 pass turnstile stream β€’ 𝐿𝐿𝑝𝑝,2sampling

Applications: Row Subset SelectionAdaptive Sampling provides a π’Œπ’Œ + 𝟏𝟏 ! approximation for subset selection

[DRVW’06]: Volume Sampling provides a π‘˜π‘˜ + 1 factor approximation to row subset selection with constant probability.

[DV’06]: Sampling probabilities for any π‘˜π‘˜-set 𝑆𝑆 produced by Adaptive Sampling is at most π‘˜π‘˜! of its sampling probability with respect to volume sampling.

Non-adaptive Adaptive Sampling provides a good approximation to Adaptive Sampling

1. For a set of indices 𝑱𝑱 output by our algorithm, 𝐴𝐴 𝐼𝐼 βˆ’ 𝑅𝑅+𝑅𝑅 F ≀ (1 + πœ–πœ–) 𝐴𝐴 𝐼𝐼 βˆ’π‘€π‘€+𝑀𝑀 F, w.h.p.

β€’ 𝑅𝑅: the set of noisy rows corresponding to 𝑱𝑱‒ 𝑀𝑀: the set of true rows corresponding to 𝑱𝑱

Page 115: Non-Adaptive Adaptive Sampling on Turnstile Streamsmahabadi/slides/Stream-Adaptive-Sampling.pdf1. Simulate adaptive sampling in 1 pass turnstile stream β€’ 𝐿𝐿𝑝𝑝,2sampling

Applications: Row Subset SelectionAdaptive Sampling provides a π’Œπ’Œ + 𝟏𝟏 ! approximation for subset selection

[DRVW’06]: Volume Sampling provides a π‘˜π‘˜ + 1 factor approximation to row subset selection with constant probability.

[DV’06]: Sampling probabilities for any π‘˜π‘˜-set 𝑆𝑆 produced by Adaptive Sampling is at most π‘˜π‘˜! of its sampling probability with respect to volume sampling.

Non-adaptive Adaptive Sampling provides a good approximation to Adaptive Sampling

1. For a set of indices 𝑱𝑱 output by our algorithm, 𝐴𝐴 𝐼𝐼 βˆ’ 𝑅𝑅+𝑅𝑅 F ≀ (1 + πœ–πœ–) 𝐴𝐴 𝐼𝐼 βˆ’π‘€π‘€+𝑀𝑀 F, w.h.p.

β€’ 𝑅𝑅: the set of noisy rows corresponding to 𝑱𝑱‒ 𝑀𝑀: the set of true rows corresponding to 𝑱𝑱

2. For most π‘˜π‘˜-sets 𝑱𝑱, its prob. by adaptive sampling is within 𝑂𝑂(1) factor of Non-adaptive Sampling.

Page 116: Non-Adaptive Adaptive Sampling on Turnstile Streamsmahabadi/slides/Stream-Adaptive-Sampling.pdf1. Simulate adaptive sampling in 1 pass turnstile stream β€’ 𝐿𝐿𝑝𝑝,2sampling

Applications: Row Subset Selection

Our Result: finds M such that,Pr[ 𝐴𝐴 βˆ’ 𝐴𝐴𝑴𝑴+𝑴𝑴 𝐹𝐹

2 ≀ 16 π‘˜π‘˜ + 1 ! 𝐴𝐴 βˆ’ π΄π΄π‘˜π‘˜ 𝐹𝐹2 ] β‰₯ 2/3

β€’ π΄π΄π‘˜π‘˜: best rank-k approximation of 𝐴𝐴‒ first one pass turnstile streaming algorithmβ€’ 𝑝𝑝𝑃𝑃𝑝𝑝𝑝𝑝(𝑑𝑑,π‘˜π‘˜, log𝑛𝑛) space

Input: 𝐴𝐴 ∈ ℝ𝑛𝑛×𝑑𝑑 and an integer π’Œπ’Œ > 0

Output: π’Œπ’Œ rows of 𝐴𝐴 to form 𝑴𝑴 to minimize 𝐴𝐴 βˆ’ 𝐴𝐴𝑀𝑀+𝑀𝑀 𝐹𝐹

Page 117: Non-Adaptive Adaptive Sampling on Turnstile Streamsmahabadi/slides/Stream-Adaptive-Sampling.pdf1. Simulate adaptive sampling in 1 pass turnstile stream β€’ 𝐿𝐿𝑝𝑝,2sampling

Applications: Volume Maximization

Input: 𝐴𝐴 ∈ ℝ𝑛𝑛×𝑑𝑑 and an integer π’Œπ’ŒOutput: π’Œπ’Œ rows π’“π’“πŸπŸ, … , π’“π’“π’Œπ’Œ of 𝐴𝐴, 𝑴𝑴, with maximum volume

Page 118: Non-Adaptive Adaptive Sampling on Turnstile Streamsmahabadi/slides/Stream-Adaptive-Sampling.pdf1. Simulate adaptive sampling in 1 pass turnstile stream β€’ 𝐿𝐿𝑝𝑝,2sampling

Applications: Volume Maximization

Input: 𝐴𝐴 ∈ ℝ𝑛𝑛×𝑑𝑑 and an integer π’Œπ’ŒOutput: π’Œπ’Œ rows π’“π’“πŸπŸ, … , π’“π’“π’Œπ’Œ of 𝐴𝐴, 𝑴𝑴, with maximum volume

Volume of the parallelepiped spanned by those vectors

π’Œπ’Œ = 𝟐𝟐

Page 119: Non-Adaptive Adaptive Sampling on Turnstile Streamsmahabadi/slides/Stream-Adaptive-Sampling.pdf1. Simulate adaptive sampling in 1 pass turnstile stream β€’ 𝐿𝐿𝑝𝑝,2sampling

Applications: Volume Maximization[Civril, Magdon’09] Greedy Algorithm Provides a π’Œπ’Œ! approximation to Volume Maximization

Greedy

β€’ For π‘˜π‘˜ rounds, pick the vector that is farthest away from the current subspace.

π’Œπ’Œ = 𝟐𝟐

Page 120: Non-Adaptive Adaptive Sampling on Turnstile Streamsmahabadi/slides/Stream-Adaptive-Sampling.pdf1. Simulate adaptive sampling in 1 pass turnstile stream β€’ 𝐿𝐿𝑝𝑝,2sampling

Applications: Volume Maximization[Civril, Magdon’09] Greedy Algorithm Provides a π’Œπ’Œ! approximation to Volume Maximization

Greedy

β€’ For π‘˜π‘˜ rounds, pick the vector that is farthest away from the current subspace.

π’Œπ’Œ = 𝟐𝟐

Page 121: Non-Adaptive Adaptive Sampling on Turnstile Streamsmahabadi/slides/Stream-Adaptive-Sampling.pdf1. Simulate adaptive sampling in 1 pass turnstile stream β€’ 𝐿𝐿𝑝𝑝,2sampling

Applications: Volume Maximization[Civril, Magdon’09] Greedy Algorithm Provides a π’Œπ’Œ! approximation to Volume Maximization

Greedy

β€’ For π‘˜π‘˜ rounds, pick the vector that is farthest away from the current subspace.

π’Œπ’Œ = 𝟐𝟐

Page 122: Non-Adaptive Adaptive Sampling on Turnstile Streamsmahabadi/slides/Stream-Adaptive-Sampling.pdf1. Simulate adaptive sampling in 1 pass turnstile stream β€’ 𝐿𝐿𝑝𝑝,2sampling

Applications: Volume Maximization[Civril, Magdon’09] Greedy Algorithm Provides a π’Œπ’Œ! approximation to Volume Maximization

Greedy

β€’ For π‘˜π‘˜ rounds, pick the vector that is farthest away from the current subspace.

π’Œπ’Œ = 𝟐𝟐

Page 123: Non-Adaptive Adaptive Sampling on Turnstile Streamsmahabadi/slides/Stream-Adaptive-Sampling.pdf1. Simulate adaptive sampling in 1 pass turnstile stream β€’ 𝐿𝐿𝑝𝑝,2sampling

Applications: Volume Maximization[Civril, Magdon’09] Greedy Algorithm Provides a π’Œπ’Œ! approximation to Volume Maximization

Greedy

β€’ For π‘˜π‘˜ rounds, pick the vector that is farthest away from the current subspace.

π’Œπ’Œ = 𝟐𝟐

Page 124: Non-Adaptive Adaptive Sampling on Turnstile Streamsmahabadi/slides/Stream-Adaptive-Sampling.pdf1. Simulate adaptive sampling in 1 pass turnstile stream β€’ 𝐿𝐿𝑝𝑝,2sampling

Applications: Volume Maximization[Civril, Magdon’09] Greedy Algorithm Provides a π’Œπ’Œ! approximation to Volume Maximization

Simulate Greedy

β€’ Maintain π‘˜π‘˜ instances of CountSketch, AMS and π‘³π‘³πŸπŸ,𝟐𝟐 Sampler

Page 125: Non-Adaptive Adaptive Sampling on Turnstile Streamsmahabadi/slides/Stream-Adaptive-Sampling.pdf1. Simulate adaptive sampling in 1 pass turnstile stream β€’ 𝐿𝐿𝑝𝑝,2sampling

Applications: Volume Maximization[Civril, Magdon’09] Greedy Algorithm Provides a π’Œπ’Œ! approximation to Volume Maximization

If the largest row exceeds the threshold, then it is correctly found by CountSketch w.h.p.

Simulate Greedy

β€’ Maintain π‘˜π‘˜ instances of CountSketch, AMS and π‘³π‘³πŸπŸ,𝟐𝟐 Sampler

β€’ For π‘˜π‘˜ rounds,β€’ Let 𝒓𝒓 be the row of 𝐴𝐴𝑃𝑃 with largest norm //by CountSketch

Page 126: Non-Adaptive Adaptive Sampling on Turnstile Streamsmahabadi/slides/Stream-Adaptive-Sampling.pdf1. Simulate adaptive sampling in 1 pass turnstile stream β€’ 𝐿𝐿𝑝𝑝,2sampling

Applications: Volume Maximization[Civril, Magdon’09] Greedy Algorithm Provides a π’Œπ’Œ! approximation to Volume Maximization

If the largest row exceeds the threshold, then it is correctly found by CountSketch w.h.p.

Otherwise, there are enough large rows and sampler chooses one of them w.h.p.

Simulate Greedy

β€’ Maintain π‘˜π‘˜ instances of CountSketch, AMS and π‘³π‘³πŸπŸ,𝟐𝟐 Sampler

β€’ For π‘˜π‘˜ rounds,β€’ Let 𝒓𝒓 be the row of 𝐴𝐴𝑃𝑃 with largest norm //by CountSketch

β€’ If 𝒓𝒓 𝟐𝟐 < 𝛼𝛼2

4π‘›π‘›π‘˜π‘˜π‘¨π‘¨π‘·π‘· 𝑭𝑭

𝟐𝟐, instead sample row 𝒓𝒓 according to norms of rows

Page 127: Non-Adaptive Adaptive Sampling on Turnstile Streamsmahabadi/slides/Stream-Adaptive-Sampling.pdf1. Simulate adaptive sampling in 1 pass turnstile stream β€’ 𝐿𝐿𝑝𝑝,2sampling

Applications: Volume Maximization[Civril, Magdon’09] Greedy Algorithm Provides a π’Œπ’Œ! approximation to Volume Maximization

If the largest row exceeds the threshold, then it is correctly found by CountSketch w.h.p.

Otherwise, there are enough large rows and sampler chooses one of them w.h.p.

Simulate Greedy

β€’ Maintain π‘˜π‘˜ instances of CountSketch, AMS and π‘³π‘³πŸπŸ,𝟐𝟐 Sampler

β€’ For π‘˜π‘˜ rounds,β€’ Let 𝒓𝒓 be the row of 𝐴𝐴𝑃𝑃 with largest norm //by CountSketch

β€’ If 𝒓𝒓 𝟐𝟐 < 𝛼𝛼2

4π‘›π‘›π‘˜π‘˜π‘¨π‘¨π‘·π‘· 𝑭𝑭

𝟐𝟐, instead sample row 𝒓𝒓 according to norms of rows

β€’ Add 𝑃𝑃 to the solution, and update the postprocessing matrix 𝑃𝑃

Page 128: Non-Adaptive Adaptive Sampling on Turnstile Streamsmahabadi/slides/Stream-Adaptive-Sampling.pdf1. Simulate adaptive sampling in 1 pass turnstile stream β€’ 𝐿𝐿𝑝𝑝,2sampling

Applications: Volume Maximization

Input: 𝐴𝐴 ∈ ℝ𝑛𝑛×𝑑𝑑 and an integer π’Œπ’ŒOutput: π’Œπ’Œ rows π’“π’“πŸπŸ, … , π’“π’“π’Œπ’Œ of 𝐴𝐴, 𝑴𝑴, with maximum volume

Our Result: for an approximation factor 𝜢𝜢, finds S (set of π’Œπ’Œ noisy rows of 𝑨𝑨) s.t.,

Pr[π›Όπ›Όπ‘˜π‘˜ π‘˜π‘˜! Vol 𝐒𝐒 β‰₯ Vol(𝐌𝐌)] β‰₯ 2/3β€’ first one pass turnstile streaming algorithmβ€’ �𝑂𝑂( β„π‘›π‘›π‘‘π‘‘π‘˜π‘˜2 𝛼𝛼2) space

Page 129: Non-Adaptive Adaptive Sampling on Turnstile Streamsmahabadi/slides/Stream-Adaptive-Sampling.pdf1. Simulate adaptive sampling in 1 pass turnstile stream β€’ 𝐿𝐿𝑝𝑝,2sampling

Problem Model Approximation/error space Comments

𝑳𝑳𝒑𝒑,𝟐𝟐 Sampler

turnstile

(𝟏𝟏 + 𝝐𝝐) relative + πŸπŸπ’‘π’‘π’‘π’‘π’‘π’‘π’‘π’‘ 𝒏𝒏 𝒑𝒑𝒑𝒑𝒑𝒑𝒑𝒑(𝒅𝒅, ππβˆ’πŸπŸ, π₯π₯π₯π₯π₯π₯𝒏𝒏)

Adaptive Sampling 𝑢𝑢(𝝐𝝐) total variation distance 𝒑𝒑𝒑𝒑𝒑𝒑𝒑𝒑(𝒅𝒅,π’Œπ’Œ, ππβˆ’πŸπŸ, π₯π₯π₯π₯π₯π₯𝒏𝒏)Row Subset Selection 𝑢𝑢( π’Œπ’Œ + 𝟏𝟏 !) 𝒑𝒑𝒑𝒑𝒑𝒑𝒑𝒑(𝒅𝒅,π’Œπ’Œ, π₯π₯π₯π₯π₯π₯𝒏𝒏)

Subspace Approximation

𝑢𝑢( π’Œπ’Œ + 𝟏𝟏 !) 𝒑𝒑𝒑𝒑𝒑𝒑𝒑𝒑(𝒅𝒅,π’Œπ’Œ, π₯π₯π₯π₯π₯π₯𝒏𝒏)(𝟏𝟏 + 𝝐𝝐) 𝒑𝒑𝒑𝒑𝒑𝒑𝒑𝒑(𝒅𝒅,π’Œπ’Œ, π₯π₯π₯π₯π₯π₯𝒏𝒏 ,𝟏𝟏/𝝐𝝐) 𝒑𝒑𝒑𝒑𝒑𝒑𝒑𝒑(π’Œπ’Œ,𝟏𝟏/𝝐𝝐) rows

Projective Clustering (𝟏𝟏 + 𝝐𝝐) 𝒑𝒑𝒑𝒑𝒑𝒑𝒑𝒑(𝒅𝒅,π’Œπ’Œ, π₯π₯π₯π₯π₯π₯𝒏𝒏 , 𝒔𝒔,𝟏𝟏/𝝐𝝐) 𝒑𝒑𝒑𝒑𝒑𝒑𝒑𝒑(π’Œπ’Œ, 𝒔𝒔,𝟏𝟏/𝝐𝝐) rows

Volume Maximization

πœΆπœΆπ’Œπ’Œ π’Œπ’Œ! �𝑢𝑢( β„π’π’π’…π’…π’Œπ’ŒπŸπŸ 𝜢𝜢𝟐𝟐)πœΆπœΆπ’Œπ’Œ 𝛀𝛀( ⁄𝒏𝒏 π’Œπ’Œπ’‘π’‘πœΆπœΆπŸπŸ) 𝒑𝒑 pass

Row Arrival

π‘ͺπ‘ͺπ’Œπ’Œ 𝛀𝛀(𝒏𝒏) Random Order�𝑢𝑢 π‘ͺπ‘ͺπ’Œπ’Œ π’Œπ’Œ/𝟐𝟐 �𝑢𝑢(𝒏𝒏𝑢𝑢(𝟏𝟏/π‘ͺπ‘ͺ)𝒅𝒅) π‘ͺπ‘ͺ < ⁄(π₯π₯π₯π₯π₯π₯ 𝒏𝒏) π’Œπ’Œ

Page 130: Non-Adaptive Adaptive Sampling on Turnstile Streamsmahabadi/slides/Stream-Adaptive-Sampling.pdf1. Simulate adaptive sampling in 1 pass turnstile stream β€’ 𝐿𝐿𝑝𝑝,2sampling

Open problems

β€’ Get tight dependence on the parametersβ€’ Further applications of non-adaptive adaptive sampling

β€’ Result on Volume Maximization in row arrival model is not tight, i.e., can we get 𝑂𝑂(π‘˜π‘˜)π‘˜π‘˜ approximation without dependence on 𝑛𝑛?

Problem Model Approximation/error space Comments

𝑳𝑳𝒑𝒑,𝟐𝟐 Sampler

turnstile

(𝟏𝟏 + 𝝐𝝐) relative + πŸπŸπ’‘π’‘π’‘π’‘π’‘π’‘π’‘π’‘ 𝒏𝒏 𝒑𝒑𝒑𝒑𝒑𝒑𝒑𝒑(𝒅𝒅, ππβˆ’πŸπŸ, π₯π₯π₯π₯π₯π₯𝒏𝒏)

Adaptive Sampling 𝑢𝑢(𝝐𝝐) total variation distance 𝒑𝒑𝒑𝒑𝒑𝒑𝒑𝒑(𝒅𝒅,π’Œπ’Œ, ππβˆ’πŸπŸ, π₯π₯π₯π₯π₯π₯𝒏𝒏)Row Subset Selection 𝑢𝑢( π’Œπ’Œ + 𝟏𝟏 !) 𝒑𝒑𝒑𝒑𝒑𝒑𝒑𝒑(𝒅𝒅,π’Œπ’Œ, π₯π₯π₯π₯π₯π₯𝒏𝒏)

Subspace Approximation

𝑢𝑢( π’Œπ’Œ + 𝟏𝟏 !) 𝒑𝒑𝒑𝒑𝒑𝒑𝒑𝒑(𝒅𝒅,π’Œπ’Œ, π₯π₯π₯π₯π₯π₯𝒏𝒏)(𝟏𝟏 + 𝝐𝝐) 𝒑𝒑𝒑𝒑𝒑𝒑𝒑𝒑(𝒅𝒅,π’Œπ’Œ, π₯π₯π₯π₯π₯π₯𝒏𝒏 ,𝟏𝟏/𝝐𝝐) 𝒑𝒑𝒑𝒑𝒑𝒑𝒑𝒑(π’Œπ’Œ,𝟏𝟏/𝝐𝝐) rows

Projective Clustering (𝟏𝟏 + 𝝐𝝐) 𝒑𝒑𝒑𝒑𝒑𝒑𝒑𝒑(𝒅𝒅,π’Œπ’Œ, π₯π₯π₯π₯π₯π₯𝒏𝒏 , 𝒔𝒔,𝟏𝟏/𝝐𝝐) 𝒑𝒑𝒑𝒑𝒑𝒑𝒑𝒑(π’Œπ’Œ, 𝒔𝒔,𝟏𝟏/𝝐𝝐) rows

Volume Maximization

πœΆπœΆπ’Œπ’Œ π’Œπ’Œ! �𝑢𝑢( β„π’π’π’…π’…π’Œπ’ŒπŸπŸ 𝜢𝜢𝟐𝟐)πœΆπœΆπ’Œπ’Œ 𝛀𝛀( ⁄𝒏𝒏 π’Œπ’Œπ’‘π’‘πœΆπœΆπŸπŸ) 𝒑𝒑 pass

Row Arrival

π‘ͺπ‘ͺπ’Œπ’Œ 𝛀𝛀(𝒏𝒏) Random Order�𝑢𝑢 π‘ͺπ‘ͺπ’Œπ’Œ π’Œπ’Œ/𝟐𝟐 �𝑢𝑢(𝒏𝒏𝑢𝑢(𝟏𝟏/π‘ͺπ‘ͺ)𝒅𝒅) π‘ͺπ‘ͺ < ⁄(π₯π₯π₯π₯π₯π₯ 𝒏𝒏) π’Œπ’Œ

Page 131: Non-Adaptive Adaptive Sampling on Turnstile Streamsmahabadi/slides/Stream-Adaptive-Sampling.pdf1. Simulate adaptive sampling in 1 pass turnstile stream β€’ 𝐿𝐿𝑝𝑝,2sampling

Problem Model Approximation/error space Comments

𝑳𝑳𝒑𝒑,𝟐𝟐 Sampler

turnstile

(𝟏𝟏 + 𝝐𝝐) relative + πŸπŸπ’‘π’‘π’‘π’‘π’‘π’‘π’‘π’‘ 𝒏𝒏 𝒑𝒑𝒑𝒑𝒑𝒑𝒑𝒑(𝒅𝒅, ππβˆ’πŸπŸ, π₯π₯π₯π₯π₯π₯𝒏𝒏)

Adaptive Sampling 𝑢𝑢(𝝐𝝐) total variation distance 𝒑𝒑𝒑𝒑𝒑𝒑𝒑𝒑(𝒅𝒅,π’Œπ’Œ, ππβˆ’πŸπŸ, π₯π₯π₯π₯π₯π₯𝒏𝒏)Row Subset Selection 𝑢𝑢( π’Œπ’Œ + 𝟏𝟏 !) 𝒑𝒑𝒑𝒑𝒑𝒑𝒑𝒑(𝒅𝒅,π’Œπ’Œ, π₯π₯π₯π₯π₯π₯𝒏𝒏)

Subspace Approximation

𝑢𝑢( π’Œπ’Œ + 𝟏𝟏 !) 𝒑𝒑𝒑𝒑𝒑𝒑𝒑𝒑(𝒅𝒅,π’Œπ’Œ, π₯π₯π₯π₯π₯π₯𝒏𝒏)(𝟏𝟏 + 𝝐𝝐) 𝒑𝒑𝒑𝒑𝒑𝒑𝒑𝒑(𝒅𝒅,π’Œπ’Œ, π₯π₯π₯π₯π₯π₯𝒏𝒏 ,𝟏𝟏/𝝐𝝐) 𝒑𝒑𝒑𝒑𝒑𝒑𝒑𝒑(π’Œπ’Œ,𝟏𝟏/𝝐𝝐) rows

Projective Clustering (𝟏𝟏 + 𝝐𝝐) 𝒑𝒑𝒑𝒑𝒑𝒑𝒑𝒑(𝒅𝒅,π’Œπ’Œ, π₯π₯π₯π₯π₯π₯𝒏𝒏 , 𝒔𝒔,𝟏𝟏/𝝐𝝐) 𝒑𝒑𝒑𝒑𝒑𝒑𝒑𝒑(π’Œπ’Œ, 𝒔𝒔,𝟏𝟏/𝝐𝝐) rows

Volume Maximization

πœΆπœΆπ’Œπ’Œ π’Œπ’Œ! �𝑢𝑢( β„π’π’π’…π’…π’Œπ’ŒπŸπŸ 𝜢𝜢𝟐𝟐)πœΆπœΆπ’Œπ’Œ 𝛀𝛀( ⁄𝒏𝒏 π’Œπ’Œπ’‘π’‘πœΆπœΆπŸπŸ) 𝒑𝒑 pass

Row Arrival

π‘ͺπ‘ͺπ’Œπ’Œ 𝛀𝛀(𝒏𝒏) Random Order�𝑢𝑢 π‘ͺπ‘ͺπ’Œπ’Œ π’Œπ’Œ/𝟐𝟐 �𝑢𝑢(𝒏𝒏𝑢𝑢(𝟏𝟏/π‘ͺπ‘ͺ)𝒅𝒅) π‘ͺπ‘ͺ < ⁄(π₯π₯π₯π₯π₯π₯ 𝒏𝒏) π’Œπ’Œ

Open problems

β€’ Get tight dependence on the parametersβ€’ Further applications of non-adaptive adaptive sampling

β€’ Result on Volume Maximization in row arrival model is not tight, i.e., can we get 𝑂𝑂(π‘˜π‘˜)π‘˜π‘˜ approximation without dependence on 𝑛𝑛?

Thank You!