Uncertainty-aware Multidimensional Ensemble Data Visualization and Exploration -- Haidong Chen (Zhejiang University), Song Zhang (Mississippi State University), Wei Chen, Honghui Mei, Jiawei Zhang (Zhejiang University), Andrew Mercer (Mississippi State University), Ronghua Liang (Zhejiang University) and Huamin Qu (Hong Kong University of Science and Technology). ~Presented By: Subhashis Hazarika (The Ohio State University)

Goal

• Devise a projection scheme for multidimensional (high-dimensional) data that are uncertain, in this case ensemble data.

• The naïve approach computes the ensemble means and projects them into a low-dimensional space, but this loses the distributional information of the ensemble objects.

• Applying MDS techniques to a dissimilarity matrix created from the distributional distance field is computationally intensive for large datasets.

• This work aims to balance the accuracy and efficiency of projection for large multidimensional ensemble data.

Motivation

Key Contributions

• A novel uncertainty-aware multidimensional projection approach. Key factors:

– A new dissimilarity measure for the ensemble data objects.

– An enhanced Laplacian-based projection scheme.

• Augment the users’ ability to visually study the ensemble dataset with a suite of visual exploration widgets.

Problem Formulation

• n ensemble data objects, U = {U1, ..., Un}

• Each object Ui has m d-dimensional ensemble members

• The goal is to build an l-dimensional representation that preserves the relationships among the data objects in terms of both the ensemble mean and the ensemble distribution
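The data layout in this formulation can be sketched as a nested list of shape n × m × d. The generator below is purely illustrative: `make_ensemble` and `ensemble_mean` are hypothetical helper names, and the synthetic Gaussian members are an assumption, not the datasets used in the talk.

```python
import random

def make_ensemble(n, m, d, seed=0):
    """Return n ensemble objects; each is a list of m members with d values.

    ASSUMPTION: members are drawn from an isotropic Gaussian around a random
    center, only to have something to project in later sketches.
    """
    rng = random.Random(seed)
    data = []
    for _ in range(n):
        center = [rng.uniform(-5.0, 5.0) for _ in range(d)]
        spread = rng.uniform(0.1, 1.0)
        data.append([[rng.gauss(c, spread) for c in center] for _ in range(m)])
    return data

def ensemble_mean(members):
    """Per-dimension mean of one ensemble object's members."""
    m = len(members)
    return [sum(column) / m for column in zip(*members)]

U = make_ensemble(n=4, m=50, d=3)
means = [ensemble_mean(U_i) for U_i in U]
```

Projecting only `means` is the naïve approach; the remaining slides are about keeping the information in the m members that the mean discards.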

Approach Overview

• Two-step multidimensional projection:

– A small set of control points is selected from U and projected to a 2D space using a conventional MDS method.

– All the other objects in U are then projected to the 2D space with an enhanced Laplacian system that combines the influences of both the control points and the other points.

• The distance between two data objects uses both the Euclidean distance between their ensemble means and the Jensen-Shannon divergence (JSD) between their ensemble distributions.

• Overall Steps:

– Create Prob. Distributions for Ensemble Data Objects

– Dissimilarity Estimate

– Enhanced Projection Scheme

Ensemble Data Objects & Prob. Distribution

• To reconstruct the continuous ensemble distribution for each data object, a multidimensional Kernel Density Estimate (KDE) method that considers the dimensional correlations is employed.

• A normal (Gaussian) kernel is used. The choice of the kernel K(.) matters less than the bandwidth matrix H in terms of its influence on the estimate.

• Choices for H:

– Scaled identity matrix: $H = h^2 I$

– Diagonal matrix: $H = \mathrm{diag}(h_1^2, \dots, h_d^2)$

– Generic symmetric positive definite matrix.

• Silverman’s rule of thumb:
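For a diagonal bandwidth matrix, Silverman's rule of thumb is commonly stated as $h_i = \left(\frac{4}{d+2}\right)^{1/(d+4)} m^{-1/(d+4)} \sigma_i$, where $\sigma_i$ is the standard deviation of dimension $i$ over the $m$ members. The sketch below applies that rule and evaluates a Gaussian product-kernel KDE. The function names are hypothetical, and using the sample standard deviation is an assumption.

```python
import math
import statistics

def silverman_bandwidths(members):
    """Per-dimension bandwidths h_i for one ensemble object (m members, d dims).

    ASSUMPTION: sigma_i is the sample standard deviation of dimension i.
    """
    m, d = len(members), len(members[0])
    factor = (4.0 / (d + 2)) ** (1.0 / (d + 4)) * m ** (-1.0 / (d + 4))
    return [factor * statistics.stdev(column) for column in zip(*members)]

def kde_diagonal(x, members, h):
    """Density at x from a Gaussian KDE with H = diag(h_1^2, ..., h_d^2)."""
    total = 0.0
    for member in members:
        kernel = 1.0
        for x_i, m_i, h_i in zip(x, member, h):
            z = (x_i - m_i) / h_i
            kernel *= math.exp(-0.5 * z * z) / (h_i * math.sqrt(2.0 * math.pi))
        total += kernel
    return total / len(members)
```

A dimension with zero variance gives $h_i = 0$ and a division by zero here, so real data would need a small floor on the bandwidth.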

Ensemble Data Objects & Prob. Distribution

• To keep the simplicity of the diagonal bandwidth matrix while still accounting for correlations among dimensions, KDE is performed not in the original data space but in the space defined by the principal component transformation.

• Space transformation:

– Mean-center the ensemble members.

– Apply PCA to the centered members, which yields a transformation matrix.

– Transform each ensemble member into the new space with that matrix.

– Because the bases of the new space are eigenvectors that are orthogonal to each other (so the transformed components are uncorrelated), the diagonal bandwidth matrix from eq. (2) can be used for the KDE.
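One way to write the three transformation steps for a single ensemble object is given below. The symbols are assumptions rather than the slide's own notation: $x_t$ is the $t$-th member, $\bar{x}$ the ensemble mean, $C$ the sample covariance, and $W$ the $d \times d$ matrix whose columns are the eigenvectors of $C$.

```latex
\begin{aligned}
\tilde{x}_t &= x_t - \bar{x}, \qquad \bar{x} = \frac{1}{m}\sum_{t=1}^{m} x_t \\
C &= \frac{1}{m-1}\sum_{t=1}^{m} \tilde{x}_t \tilde{x}_t^{\top} = W \Lambda W^{\top} \\
y_t &= W^{\top} \tilde{x}_t
\end{aligned}
```

The covariance of the transformed members $y_t$ is the diagonal matrix $\Lambda$, so their components are uncorrelated. This is what makes a diagonal bandwidth matrix a reasonable choice in the new space.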

Dissimilarity Estimation

• Jensen Shannon Divergence:

• Dissimilarity between two distributions:
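The Jensen-Shannon divergence of two distributions P and Q is $\mathrm{JSD}(P \| Q) = \tfrac{1}{2}\mathrm{KL}(P \| M) + \tfrac{1}{2}\mathrm{KL}(Q \| M)$ with $M = \tfrac{1}{2}(P + Q)$. The sketch below computes it for discretized distributions (for example, KDE values sampled on a common grid and normalized). How the JSD is combined with the Euclidean distance of the ensemble means is not recoverable from the slide, so `combined_dissimilarity` and its weight `alpha` are hypothetical placeholders, not the authors' formula.

```python
import math

def kl_divergence(p, q):
    """Kullback-Leibler divergence in bits; terms with p_i == 0 contribute 0."""
    return sum(p_i * math.log2(p_i / q_i) for p_i, q_i in zip(p, q) if p_i > 0)

def jsd(p, q):
    """Jensen-Shannon divergence of two discrete distributions (base 2, in [0, 1])."""
    mix = [0.5 * (p_i + q_i) for p_i, q_i in zip(p, q)]
    return 0.5 * kl_divergence(p, mix) + 0.5 * kl_divergence(q, mix)

def combined_dissimilarity(mean_i, mean_j, p_i, p_j, alpha=0.5):
    """ASSUMPTION: a plain weighted sum of mean distance and JSD.

    The slide only says both terms are used; this weighting is illustrative.
    """
    return alpha * math.dist(mean_i, mean_j) + (1.0 - alpha) * jsd(p_i, p_j)
```

With base-2 logarithms the JSD is bounded by 1, which keeps the distributional term on a fixed scale regardless of the data units.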

Enhanced Laplacian Based Projection

• Inspired by Least Square Projection (LSP), a two-step local technique. Basic idea:

– First, a subset of the data objects is projected to the visual space.

– Then the rest of the data objects are interpolated according to the K-nearest-neighbor graph.

• Approach:

– Select the initial control points using the “K-center algorithm”. If we don’t have any prior information about the data then we select points.

– We then calculate a set of K nearest neighbors Ni for each ensemble object Ui. To avoid pairwise distributional difference calculations, the control points and the nearest neighbors are selected based only on the ensemble mean.

– Because only the ensemble mean information is used, Ni might not hold the true K nearest neighbors of Ui. The second projection step identifies such neighbors and assigns them to a random set Ri, which is an extension of Ni.
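Control-point selection and neighbor search on the ensemble means can be sketched as follows. The slide names a "K-center algorithm" without giving the variant; the greedy farthest-first traversal used here is an assumption, as are the function names and the brute-force neighbor search.

```python
import math

def k_center(points, k, first=0):
    """Greedy farthest-first traversal: indices of k centers among points.

    ASSUMPTION: starts from index `first` and always adds the point that is
    farthest from the centers chosen so far.
    """
    centers = [first]
    dist_to_center = [math.dist(p, points[first]) for p in points]
    while len(centers) < k:
        nxt = max(range(len(points)), key=lambda i: dist_to_center[i])
        centers.append(nxt)
        for i, p in enumerate(points):
            dist_to_center[i] = min(dist_to_center[i], math.dist(p, points[nxt]))
    return centers

def k_nearest(points, i, k):
    """Indices of the k points closest to points[i], excluding i itself."""
    others = (j for j in range(len(points)) if j != i)
    return sorted(others, key=lambda j: math.dist(points[i], points[j]))[:k]
```

Here `points` would be the list of ensemble means, so neither function touches the ensemble distributions.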

Enhanced Laplacian Based Projection

• Now project the control points using an iterative majorization algorithm called Scaling by Majorizing a Complicated Function (SMACOF), a kind of MDS technique, which gives their low-dimensional positions.

• Laplacian-based projection schemes rely on the theory of convex combination: the low-dimensional representation of each high-dimensional data object can be regarded as a linear combination of its neighbors in the visual space.

– Let Vi be the projection of ensemble data object Ui. According to the convex combination theory, Vi can be written as $V_i = \sum_{U_j \in N_i} \alpha_{ij} V_j$, with $\alpha_{ij} \ge 0$ and $\sum_j \alpha_{ij} = 1$.
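The convex-combination idea can be illustrated with a deliberately simplified solver. This is not the enhanced scheme of the talk: it assumes uniform weights over the neighbors, keeps control points fixed as hard constraints (LSP itself solves a least-squares system), and ignores the random set Ri. All names are hypothetical.

```python
def interpolate_by_convex_combination(n, neighbors, control, iters=10000, tol=1e-10):
    """Place free points at the average of their neighbors' 2D positions.

    n         -- number of objects
    neighbors -- mapping: free index -> list of neighbor indices
    control   -- mapping: control index -> fixed (x, y) position
    ASSUMPTION: uniform weights 1/|N_i|; every free point must reach a
    control point through the neighbor graph for the iteration to settle.
    """
    pos = [list(control.get(i, (0.0, 0.0))) for i in range(n)]
    for _ in range(iters):
        max_shift = 0.0
        for i in range(n):
            if i in control:
                continue  # control points never move
            nb = neighbors[i]
            new = [sum(pos[j][c] for j in nb) / len(nb) for c in range(2)]
            max_shift = max(max_shift,
                            abs(new[0] - pos[i][0]), abs(new[1] - pos[i][1]))
            pos[i] = new
        if max_shift < tol:
            break
    return pos
```

For a chain of five objects with the two ends as control points, the three free points settle at evenly spaced positions on the segment between them.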

Enhanced Laplacian Based Projection

Uncertainty Quantification

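• The overall uncertainty Oi of the ensemble data object Ui is the sum of the standard deviations over all dimensions: $O_i = \sum_{k=1}^{d} \sigma_{i,k}$, where $\sigma_{i,k}$ is the standard deviation of the members of Ui in dimension k.

• The deviation of the t-th ensemble member of Ui is defined as its Euclidean distance to the ensemble mean of Ui.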
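Both quantities follow directly from the definitions above and can be sketched in a few lines. The function names are hypothetical, and the choice of the population standard deviation (`pstdev`) over the sample one is an assumption, since the slide does not say which is used.

```python
import math
import statistics

def overall_uncertainty(members):
    """O_i: sum over dimensions of the standard deviation of the members.

    ASSUMPTION: population standard deviation (divides by m, not m - 1).
    """
    return sum(statistics.pstdev(column) for column in zip(*members))

def member_deviations(members):
    """Euclidean distance of each ensemble member to the ensemble mean."""
    m = len(members)
    mean = [sum(column) / m for column in zip(*members)]
    return [math.dist(member, mean) for member in members]
```

The per-member deviations are the kind of values a per-object display such as the Ensemble Bar could color-code, while `overall_uncertainty` gives one scalar per object.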

From Quantification to Visualization

• Ensemble Bar: a color-bar-based representation that depicts the uncertainty of each data object.

Visual Exploration and Interaction

Synthetic Data

NBA Players’ Statistics

Numerical Weather Simulation Dataset

Thank You