Uncovering Clusters in Crowded Parallel Coordinates Visualizations. Almir Olivette Artero, Maria Cristina Ferreira de Oliveira, Haim Levkowitz. Information Visualization 2004.

Page 1: Uncovering Clusters in Crowded Parallel Coordinates Visualizations

Uncovering Clusters in Crowded Parallel Coordinates Visualizations

Almir Olivette Artero, Maria Cristina Ferreira de Oliveira, Haim Levkowitz

Information Visualization 2004

Page 2: Uncovering Clusters in Crowded Parallel Coordinates Visualizations

Abstract

• The idea is inspired by traditional image processing techniques such as grayscale manipulation.

• The approach reduces visual clutter, allowing the analyst to observe relevant patterns in parallel coordinates.

Page 3: Uncovering Clusters in Crowded Parallel Coordinates Visualizations

Introduction

• The strong overlapping of graphical markers hampers the user’s ability to identify patterns in the data when the number of records and the dimensionality of the data set are high.

• It is important to avoid displaying irrelevant information and to enhance the presentation of useful information.

Page 4: Uncovering Clusters in Crowded Parallel Coordinates Visualizations

Introduction

• The paper tackles this problem with a strategy that computes frequency and density information and uses them in parallel coordinates visualizations to filter the information presented to the user.

Page 5: Uncovering Clusters in Crowded Parallel Coordinates Visualizations

Frequency Information

• The frequency function for an n-dimensional variable x is defined as:

where h is the bin size, σ is the number of records that fall in the same bin as x, and m is the total number of records.
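
The formula on this slide was an image and did not survive extraction. As a hedged reconstruction only, assuming the standard multivariate histogram estimator (an assumption, not a transcription of the slide), a form consistent with the symbols h, σ, m and n defined here would be:

```latex
\hat{f}(x) = \frac{\sigma}{m \, h^{n}}
```

Under this reading, σ/m is the fraction of records in the bin containing x, and dividing by the bin volume h^n turns that fraction into a density.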

Page 6: Uncovering Clusters in Crowded Parallel Coordinates Visualizations

Frequency Information

• A two-dimensional matrix is generated to store the frequency of each pair of attribute values, which is then used to draw the polygonal lines for the records in the data set.

• For a data set with n attributes, n-1 frequency matrices are generated, one for each pair of adjacent attributes (neighbouring axes).
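
The construction of the n-1 frequency matrices can be sketched as follows. This is a minimal illustration, not the authors' implementation: the function name `frequency_matrices`, the `bins` parameter, and the equal-width binning over each attribute's observed range are all assumptions made for the example.

```python
def frequency_matrices(records, bins):
    """Build one bins x bins count matrix per pair of adjacent attributes.

    records: list of equal-length numeric sequences (m records, n attributes).
    bins: number of equal-width bins per axis (hypothetical parameter).
    """
    n = len(records[0])
    lows = [min(r[j] for r in records) for j in range(n)]
    highs = [max(r[j] for r in records) for j in range(n)]

    def bin_index(value, j):
        span = highs[j] - lows[j]
        if span == 0:
            return 0
        # Clamp so that the maximum value falls in the last bin.
        return min(bins - 1, int((value - lows[j]) * bins / span))

    # matrices[j][a][b] counts records in bin a on axis j and bin b on axis j+1.
    matrices = [[[0] * bins for _ in range(bins)] for _ in range(n - 1)]
    for r in records:
        idx = [bin_index(r[j], j) for j in range(n)]
        for j in range(n - 1):
            matrices[j][idx[j]][idx[j + 1]] += 1
    return matrices


records = [(0, 0, 10), (1, 1, 10), (9, 9, 0), (10, 10, 0)]
mats = frequency_matrices(records, bins=2)
```

With three attributes the sketch produces two matrices. Each non-zero cell corresponds to one line segment between two neighbouring axes, and the cell's count is the frequency attached to that segment.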

Page 7: Uncovering Clusters in Crowded Parallel Coordinates Visualizations

Frequency Information

• Each non-zero matrix element generates a line segment in the visualization, and its value determines the pixel intensity used to draw that segment.

• Each line segment is drawn with the Bresenham algorithm:
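
The listing shown on this slide was not preserved. As a stand-in, here is a common all-integer formulation of Bresenham's line algorithm. The function name `bresenham` and the choice to return a list of pixels are assumptions for illustration. The slide's own variant may differ in detail.

```python
def bresenham(x0, y0, x1, y1):
    """Return the integer pixel coordinates on the line from (x0, y0) to (x1, y1)."""
    points = []
    dx = abs(x1 - x0)
    dy = -abs(y1 - y0)
    sx = 1 if x0 < x1 else -1   # step direction along x
    sy = 1 if y0 < y1 else -1   # step direction along y
    err = dx + dy               # integer error term
    while True:
        points.append((x0, y0))
        if x0 == x1 and y0 == y1:
            break
        e2 = 2 * err
        if e2 >= dy:            # error allows a step in x
            err += dy
            x0 += sx
        if e2 <= dx:            # error allows a step in y
            err += dx
            y0 += sy
    return points


diagonal = bresenham(0, 0, 3, 3)
shallow = bresenham(0, 0, 4, 2)
```

Only integer additions and comparisons are used, so each line segment between two axes can be rasterized without floating-point arithmetic.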

Page 8: Uncovering Clusters in Crowded Parallel Coordinates Visualizations

Interactive Parallel Coordinates Frequency and Density plots

• The intensity of the pixel with coordinates (q,p) is given by:

• A square wave smoothing filter is applied to each pixel:

Page 9: Uncovering Clusters in Crowded Parallel Coordinates Visualizations

Interactive Parallel Coordinates Frequency and Density plots

• S is a scaling factor.
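
The intensity and filter formulas on these two slides were images and are not recoverable here. The sketch below only illustrates the general idea of a square wave (box) smoothing filter followed by scaling. The window radius, the zero padding at the borders, and the way the factor `s` multiplies the smoothed value before clamping to 0..255 are all assumptions, not the paper's definitions.

```python
def box_smooth(image, radius=1):
    """Average each pixel over a (2*radius+1)^2 window, treating pixels outside the image as 0."""
    height, width = len(image), len(image[0])
    window = (2 * radius + 1) ** 2
    out = [[0.0] * width for _ in range(height)]
    for p in range(height):
        for q in range(width):
            total = 0
            for dp in range(-radius, radius + 1):
                for dq in range(-radius, radius + 1):
                    pp, qq = p + dp, q + dq
                    if 0 <= pp < height and 0 <= qq < width:
                        total += image[pp][qq]
            out[p][q] = total / window
    return out


def to_intensity(image, s):
    """Multiply by the scaling factor s (assumed role of S) and clamp to 0..255."""
    return [[min(255, round(s * v)) for v in row] for row in image]


# A single bright accumulated pixel in the middle of a 5 x 5 image.
image = [[0] * 5 for _ in range(5)]
image[2][2] = 9
smoothed = box_smooth(image)
scaled = to_intensity(smoothed, s=100)
```

In this toy input the single peak is spread evenly over its 3 x 3 neighbourhood, which is the blurring effect that makes dense regions read as continuous bands rather than isolated bright pixels.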

Page 10: Uncovering Clusters in Crowded Parallel Coordinates Visualizations

Density Information

• The density function for an n-dimensional variable x is defined as:

where d_i is the i-th record of the data set, K is the kernel function, and a smoothing parameter (the bandwidth) controls how widely each record's contribution is spread.
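
The density formula itself is missing from the slide. A minimal sketch of the standard kernel density estimate, which matches the quantities named here, is given below. The Gaussian kernel, the bandwidth name `h`, and the function names are assumptions, since the slide does not say which kernel or symbol the authors use.

```python
import math


def gaussian_kernel(u):
    """Standard n-dimensional Gaussian kernel evaluated at vector u (assumed kernel choice)."""
    n = len(u)
    return math.exp(-0.5 * sum(c * c for c in u)) / (2 * math.pi) ** (n / 2)


def density(x, data, h):
    """Kernel density estimate at x: (1 / (m * h**n)) * sum_i K((x - d_i) / h)."""
    m = len(data)
    n = len(x)
    total = 0.0
    for d in data:
        u = [(xj - dj) / h for xj, dj in zip(x, d)]
        total += gaussian_kernel(u)
    return total / (m * h ** n)


data = [(0.0, 0.0), (0.1, 0.0), (0.0, 0.1), (5.0, 5.0)]
near = density((0.0, 0.0), data, h=0.5)   # inside the tight group of three records
far = density((2.5, 2.5), data, h=0.5)    # between the group and the outlier
```

A larger `h` smooths the estimate and merges nearby groups, while a smaller `h` keeps them separate at the cost of a noisier picture.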

Page 11: Uncovering Clusters in Crowded Parallel Coordinates Visualizations

Visualizations of the Pollen data: (a) frequency plot, (b) density plot.

Page 12: Uncovering Clusters in Crowded Parallel Coordinates Visualizations

Interactive high-dimensional clustering with IPC plot

Page 13: Uncovering Clusters in Crowded Parallel Coordinates Visualizations

Interactive high-dimensional clustering with IPC plot

Page 14: Uncovering Clusters in Crowded Parallel Coordinates Visualizations

Interactive high-dimensional clustering with IPC plot

Page 15: Uncovering Clusters in Crowded Parallel Coordinates Visualizations

Interactive high-dimensional clustering with IPC plot

Page 16: Uncovering Clusters in Crowded Parallel Coordinates Visualizations

Interactive high-dimensional clustering with IPC plot

Page 17: Uncovering Clusters in Crowded Parallel Coordinates Visualizations

Performance

• Running times in seconds for the proposed algorithm with different numbers of records (m) and attributes (n).

Page 18: Uncovering Clusters in Crowded Parallel Coordinates Visualizations

Conclusions

• The new plots support interactive data exploration of large and high-dimensional data sets, allowing users to remove noise and highlight areas with high concentration of data.

• The proposed algorithms use only integer arithmetic to compute the frequency matrices.
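
The noise removal mentioned above can be illustrated with a simple threshold on a frequency matrix. This is a sketch under assumptions: the function names and the `low`/`high` bounds are hypothetical, and the slides do not show how the interactive control is actually implemented.

```python
def filter_matrix(matrix, low, high=None):
    """Zero out cells whose count lies outside [low, high]."""
    return [
        [c if c >= low and (high is None or c <= high) else 0 for c in row]
        for row in matrix
    ]


def segments(matrix):
    """List (i, j, count) for every non-zero cell, i.e. every segment still drawn."""
    return [
        (i, j, c)
        for i, row in enumerate(matrix)
        for j, c in enumerate(row)
        if c
    ]


matrix = [[5, 1, 0], [0, 7, 1], [1, 0, 6]]
kept = segments(filter_matrix(matrix, low=2))
```

Raising `low` drops sparsely populated segments, so only the high-concentration bands remain. Both steps work on integer counts, in line with the integer-only frequency matrices.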