
Service-Oriented Local And Global Visualization with Sorting On-demand for Climate Data


Xusheng Xiao

Huge amounts of climate simulation data are collected from different areas (e.g., cities, countries). Climate scientists try to predict climate trends both locally and globally. Exploratory visualization from data mining (e.g., histograms) is used increasingly often to get a general view before predicting. Climate experts want to analyze data by navigating among levels of data ranging from the most summarized (drill-up) to the most detailed (drill-down); Figure 1 shows a drill-down.

Table 1 [1]

Solution 1: Service-Oriented Histogram [2]

Solution 2: On-demand Sorting [3]

http://csc.ncsu.edu/ NCSU Computer Science

  • Cache data and parameters (min, max, count) locally
  • Index data with break number (e.g., 0.5 is in the break [0, 1])
  • Check whether the data in the requested breaks are sorted or not
  • If sorted, transfer data directly
  • If data is not sorted, sort only the data in the corresponding break and mark the break as sorted
  • Transfer local histogram data (min, max, count) for global computation
  • Merge data from different sources
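The on-demand sorting steps above can be sketched roughly as follows. This is a minimal illustration, not the poster's implementation: the class name `OnDemandSortedCache`, the `break_width` parameter, and the choice of half-open breaks such as [0, 1) are all assumptions made for the example.

```python
import math


class OnDemandSortedCache:
    """Hypothetical local cache: groups values by break, sorts a break only when requested."""

    def __init__(self, values, break_width=1.0):
        self.break_width = break_width  # assumed fixed width for every break
        # Cache the summary parameters locally.
        self.minimum = min(values)
        self.maximum = max(values)
        self.count = len(values)
        self.breaks = {}            # break number -> unsorted or sorted list of values
        self.sorted_breaks = set()  # break numbers that have already been sorted
        for value in values:
            self.breaks.setdefault(self.break_number(value), []).append(value)

    def break_number(self, value):
        # With break_width = 1.0, the value 0.5 is indexed into break 0.
        return math.floor(value / self.break_width)

    def request(self, low, high):
        """Return the values of every break that overlaps [low, high], in sorted order."""
        result = []
        for number in range(self.break_number(low), self.break_number(high) + 1):
            bucket = self.breaks.get(number)
            if bucket is None:
                continue
            if number not in self.sorted_breaks:
                bucket.sort()                   # sort only the requested break
                self.sorted_breaks.add(number)  # mark it so it is never sorted again
            result.extend(bucket)
        return result


cache = OnDemandSortedCache([0.5, -0.2, 3.7, 0.1, -0.9, 1.5])
print(cache.request(-1, 0.9))
```

Because breaks are visited in increasing order and each one is sorted before use, the concatenated result is sorted overall, while breaks that were never requested (here, the one holding 3.7) stay unsorted.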

Table 2

Result

Challenge

References

1. http://www.esrl.noaa.gov/psd/psd3/cruises/
2. Felix Halim, Panagiotis Karras, and Roland H.C. Yap. 2009. Fast and effective histogram construction. ACM, New York, NY, USA, 1167-1176.
3. C. A. R. Hoare. Quicksort. The Computer Journal, 5(1):10–16, January 1962.

Zhe Zhang

Ye Jin

Transferring all data globally causes problems:
  • Time-consuming (see Table 1)
  • Packet loss during data transfer (see Table 1)

Frequent drill-up and drill-down navigation of data consumes computation resources (e.g., scanning the same data set multiple times; see Table 2).

Motivation

Local And Global Visualization

  • Each data source locally computes its min, max, and count
  • The local min, max, and count are transmitted to compute the global min, max, and count
  • Each data source computes its histogram based on the global min, max, and count
  • Only the computed histogram data is transferred, which is much smaller than the full climate data
  • The transmitted histograms are merged to show the global histogram
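The local and global steps above might look like the following sketch. It assumes equal-width bins over the global range and plain in-memory lists standing in for remote data sources; the function names and the `num_bins` parameter are hypothetical and not taken from the poster.

```python
def local_summary(values):
    """Computed at each data source: (min, max, count) of its own data."""
    return min(values), max(values), len(values)


def global_summary(summaries):
    """Combine the transmitted local summaries into a global (min, max, count)."""
    mins, maxs, counts = zip(*summaries)
    return min(mins), max(maxs), sum(counts)


def local_histogram(values, global_min, global_max, num_bins):
    """Computed at each data source, using the shared global range (equal-width bins assumed)."""
    counts = [0] * num_bins
    width = (global_max - global_min) / num_bins
    for value in values:
        if width == 0:
            index = 0
        else:
            # The global maximum would land one past the last bin, so clamp it.
            index = min(int((value - global_min) / width), num_bins - 1)
        counts[index] += 1
    return counts


def merge_histograms(histograms):
    """Add the transmitted per-source bin counts to get the global histogram."""
    return [sum(column) for column in zip(*histograms)]


# Two illustrative sources; only summaries and bin counts would cross the network.
sources = [[0.0, 1.0, 2.0], [3.0, 4.0, 8.0]]
g_min, g_max, g_count = global_summary([local_summary(s) for s in sources])
merged = merge_histograms([local_histogram(s, g_min, g_max, 4) for s in sources])
print(g_min, g_max, g_count, merged)
```

Since every source bins against the same global range, the per-source counts line up bin by bin and can simply be added, so only a handful of integers per source is transferred instead of the raw values.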

Figure 1: Drill-down to interval [-1,1]

The raw data already collected in multiple domains are listed below; the latest data sets are all from 2008.

Data Domain     Single data set size   Number of data sets   Total Size   Collecting Time (Best Case)
VOCALS 2008     ~70000 KB              56                    ~3920 MB     ~10 Hrs
ASCOS 2008      ~140000 KB             25                    ~3500 MB     ~10 Hrs
AEROSE 2008     ~80000 KB              36                    ~2880 MB     ~7 Hrs
STRATUS 2007    ~70000 KB              21                    ~1470 MB     ~5 Hrs

Data Size   Run Once Histogram   Discovery Histogram (Run log(n) Times)   User Specified (30 Times)
~1500 MB    2 Mins               ~17 * 2 = 34 Mins                        60 Mins
~3000 MB    4 Mins               ~18 * 2 = 36 Mins                        120 Mins
~4500 MB    6 Mins               ~19 * 2 = 38 Mins                        180 Mins

Given the total time needed to discover meaningful or user-specified visualization parameters, we need to speed up these visualization algorithms.

[Histogram plots: seven panels, each with x-value on the horizontal axis and y-counts on the vertical axis]

Figure 2: System Framework