VisDB: Database Exploration Using Multidimensional Visualization Maithili Narasimha 4/24/2001


VisDB

Scientific and geographic databases tend to hold large amounts of data.

Some of the challenges in dealing with these databases are:

Mining these databases for useful information is difficult because of the sheer volume of data.

Users do not know exactly what they are looking for.

With traditional query specification languages, it is not possible to specify vague queries and thus not possible to get approximate results.

There is no feedback: the result set may contain too few or too many data items.


Requirements for a good visualization system to explore large databases:

Flexible query specification

Good query feedback

Interactive system

Users should also be able to view as many data points as possible in order to see patterns and clusters.

Display size and resolution are limiting factors.

It is also necessary to display the interdependencies between data attributes, as well as hotspots (anomalies).

The goal of the VisDB system is to address two tasks, visualizing the results and incrementally refining the query, so as to provide an effective way to find interesting data properties.

The approach:

Use each pixel of the screen to visualize the results.

Present not only the data items that fulfill the query exactly, but also those that match it approximately.

Approximate results are determined by a relevance factor.

The relevance factor of a data item is obtained by calculating a distance for each selection predicate and combining these distances.

The greater the combined distance, the lower the relevance of the data item.

Calculating the Relevance Factor

Calculate the distance. Simple for quantitative data. Nominal and ordinal?

Combining distances. User and query dependent.

Weighting factor for each attribute.

Normalizing.

Arithmetic mean for AND and geometric mean for OR when combining different condition parts.

The relevance factor is the inverse of the combined distance.
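The steps on this slide can be sketched in Python as follows. This is a minimal illustration, not the VisDB implementation: the function names, the example predicate, and the choice of 1 / (1 + d) as the "inverse" are assumptions made here so that a distance of zero does not cause a division by zero. Weights are assumed to be positive.

```python
import math

def distance_greater_than(value, threshold):
    """Distance of a quantitative value from the predicate 'attribute > threshold'."""
    return max(0.0, threshold - value)

def normalize(distances):
    """Scale non-negative distances to [0, 1] so attributes are comparable."""
    d_max = max(distances)
    return [d / d_max if d_max > 0 else 0.0 for d in distances]

def combine_and(distances, weights):
    """Weighted arithmetic mean for AND-connected condition parts."""
    return sum(w * d for w, d in zip(weights, distances)) / sum(weights)

def combine_or(distances, weights):
    """Weighted geometric mean for OR-connected condition parts."""
    if any(d == 0 for d in distances):
        return 0.0  # one fully satisfied part satisfies the whole OR
    log_sum = sum(w * math.log(d) for w, d in zip(weights, distances))
    return math.exp(log_sum / sum(weights))

def relevance(combined_distance):
    """Assumed form of the inverse: larger distance gives lower relevance."""
    return 1.0 / (1.0 + combined_distance)

# Hypothetical item with normalized distances 0.2 and 0.4 to two predicates.
d_and = combine_and([0.2, 0.4], weights=[1, 1])
d_or = combine_or([0.2, 0.4], weights=[1, 1])
print(relevance(d_and), relevance(d_or))
```

The geometric mean suits OR because a single distance of zero pulls the combined distance to zero, whereas the arithmetic mean requires every part to be close for the AND to score well.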

Basic Visualization Technique

Sort the data points according to their relevance, with respect to the query.

Assign colors depending on the relevance.

Plot the sorted, colored points starting from the center of the screen moving outwards in a rectangular spiral fashion.

Figure: Overall result plotting
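The sort-and-spiral arrangement described above can be sketched as follows. This is an illustrative reading of the slide, not VisDB code: `spiral_offsets` and `layout` are hypothetical names, the spiral direction is an arbitrary choice, and the grid is assumed to be square with an odd side length so that it has a single center cell.

```python
from itertools import islice

def spiral_offsets():
    """Yield (dx, dy) offsets from the center in a rectangular spiral."""
    x = y = 0
    yield x, y
    step = 1
    while True:
        # Two runs of length `step`, then two runs of length `step + 1`.
        for dx, dy in ((1, 0), (0, 1)):
            for _ in range(step):
                x, y = x + dx, y + dy
                yield x, y
        step += 1
        for dx, dy in ((-1, 0), (0, -1)):
            for _ in range(step):
                x, y = x + dx, y + dy
                yield x, y
        step += 1

def layout(relevances, size):
    """Place item indices on a size x size grid, most relevant at the center."""
    order = sorted(range(len(relevances)),
                   key=lambda i: relevances[i], reverse=True)
    grid = [[None] * size for _ in range(size)]
    center = size // 2
    for idx, (dx, dy) in zip(order, spiral_offsets()):
        row, col = center + dy, center + dx
        if 0 <= row < size and 0 <= col < size:
            grid[row][col] = idx  # items falling outside the grid are dropped
    return grid

for row in layout([0.1, 0.9, 0.5, 0.7, 0.3], size=3):
    print(row)
```

Each cell holds the index of a data item; coloring the cell by that item's relevance gives the overall result window.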

To relate the visualization of the overall result to the visualization of the individual selection predicates, a separate window is created for each selection predicate of the query and shown alongside the overall result window.

The position of a data item in each of the other windows is determined by its position in the overall result window.

Figure: Arrangement of windows for 5D data
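Because every window reuses the positions from the overall result window, the per-predicate windows can be derived from a single position grid. The sketch below assumes a grid of item indices (with `None` for empty cells) and a hypothetical dictionary of per-predicate distances; neither structure comes from the slides.

```python
def predicate_windows(position_grid, predicate_distances):
    """Build one grid per predicate, reusing the overall window's positions.

    position_grid: 2D list of item indices, None for empty cells.
    predicate_distances: dict mapping predicate name -> distances per item.
    """
    return {
        name: [[None if idx is None else dists[idx] for idx in row]
               for row in position_grid]
        for name, dists in predicate_distances.items()
    }

# Hypothetical 2 x 2 overall window and two predicates.
overall = [[None, 1],
           [0, 2]]
windows = predicate_windows(overall, {
    "temperature": [0.0, 0.5, 1.0],
    "humidity": [0.3, 0.0, 0.2],
})
print(windows["temperature"])
print(windows["humidity"])
```

Coloring each per-predicate grid by distance then shows, at the same screen position, how well a given item satisfies each individual predicate.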

Mapping two dimensions to the axes

The user can assign two attributes to the axes; the system then arranges the relevance factors according to the direction of each data point's distance from the selection predicate.

With this method it is possible to provide better feedback to the user.

However, the display may not be used efficiently in some cases (i.e., some quadrants may not be fully used while others are saturated).

Figure: 2D representation
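The uneven use of the display can be illustrated by counting how many items fall into each direction pair. This sketch assumes each item has a signed distance per axis attribute (negative when the value lies below the predicate's range, positive when above, zero when the predicate is met); the signed representation and the function names are assumptions for illustration.

```python
from collections import Counter

def sign(value):
    """Return -1, 0 or +1 according to the sign of value."""
    return (value > 0) - (value < 0)

def quadrant_counts(signed_distances):
    """Count items per (horizontal, vertical) direction pair."""
    return Counter((sign(dx), sign(dy)) for dx, dy in signed_distances)

# Hypothetical signed distances for four items.
counts = quadrant_counts([(1.0, 2.0), (3.0, 1.0), (-1.0, 2.0), (0.0, 0.0)])
for direction, n in sorted(counts.items()):
    print(direction, n)
```

A strongly skewed count means one quadrant saturates while others stay largely empty, which is the inefficiency noted above.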

Grouping the dimensions

The pixels corresponding to the different dimensions of one data item are placed in one area instead of being distributed across different windows.

This requires more pixels per dimension per data item.

It may provide more useful visualizations for data sets with higher dimensionality.

Figure: Grouping multidimensional data

Interactive data exploration

Users initially specify their queries using a query language.

Inside the VisDB interactive query and visualization interface, users can view the visualizations and modify the query.

The system provides sliders for modifying selection predicates, weighting factors and other options.


Conclusion

Useful for identifying and isolating clusters, correlations and hotspots in large databases.

Good query specification system.

No zoom or pan for the visualizations.

Recommended.