Workshop on Exploratory Data Analysis and Visualization, Statistics Austria, May 27–28, 2010. Statistics Austria, Guglgasse 13, 1110 Vienna. Eurostat grant No. 61001.2009.004-2009.858.




List of Participants

Outi Ahti-Miettinen, Statistics Finland
A. Alfons, Vienna University of Technology, Austria
W. E. Baaske, Studienzentrum für int. Analysen (STUDIA)
A. Boesch, Swiss Federal Statistical Office FSO
T. Brandmuller, Eurostat Unit E4
Dobszay, ECOSTAT Research Institute for Economy and Society
Carlo De Gregorio, ISTAT, Italy
Peter Filzmoser, Vienna University of Technology, Austria
Monique Graf, Swiss Federal Statistical Office FSO
M. Greenacre, Universitat Pompeu Fabra, Catalunya, Spain
W. Hacking, Statistics Netherlands
Alois Haslinger, Statistics Austria
Beat Hulliger, University of Applied Sciences N-W Switzerland
Mikael Jern, Linkoping University, Sweden
E. de Jonge, Statistics Netherlands
I. Kaminger, Statistics Austria
G. Katzlberger, Statistics Austria
J. Kehrer, University of Bergen, Norway
A. Kowarik, Statistics Austria
J. Malmdin, Statistics Sweden
Emilio Di Meglio, Eurostat
B. Meindl, Statistics Austria
Milano, Statistics Belgium
Mueller, Johannes Kepler University Linz
R. Muthmann, Eurostat
Rainer, Statistics Austria
H. Piringer, VRVis (Zentrum für Virtual Reality und Visualisierung)
Adrian Redmond, Central Statistics Office Ireland
Rejec, Statistical Office of the Republic of Slovenia
M. Ribe, Statistics Sweden
S. Rogers, Office for National Statistics, UK
Seidou Sanda, United Nations Economic Commission for Europe
Andreja Smukavec, Statistical Office of the Republic of Slovenia
Steiwer, STATEC Luxembourg
Maria Sustakova, Statistical Office of the Slovak Republic
Talers, Central Statistical Bureau of Latvia
Matthias Templ, Statistics Austria & Vienna University of Technology
Andrej Vallo, Statistical Office of the Slovak Republic
Vesselinov, National Statistical Institute Bulgaria
Weisz, Hungarian Central Statistical Office
S. Zechner, Vienna University of Technology, Austria


Main-Organizers

Matthias Templ, Methods Unit, Statistics Austria & Vienna University of Technology

Alois Haslinger, Methods Unit, Statistics Austria

Local Organizers

Alois Haslinger, Elena Hoschek, Matthias Templ
Tel.: +43 71128

Scientific Committee

Peter Filzmoser, Vienna University of Technology
Monique Graf, Swiss Federal Statistical Office
Beat Hulliger, University of Applied Sciences N-W Switzerland
Alois Haslinger, Statistics Austria
Emilio Di Meglio, Eurostat
Matthias Templ, Statistics Austria & Vienna University of Technology

Sponsors

This conference is supported by Eurostat grant No. 61001.2009.004-2009.858.

List of sponsors:

European Commission

Statistics Austria

Conference Website

http://www.statistik.tuwien.ac.at/edavis


Aim and Scope

Aims of the workshop:

Exploratory data analysis and visualization (EDAVIS) are becoming increasingly important as tools for analysing microdata and for presenting aggregated information for end users and policy needs.

EDAVIS is a wide field, ranging from basic summary measures and simple graphical displays to advanced statistical tools. While EDAVIS is widely used in statistics in general, its application in Official Statistics is still limited. Nevertheless, it has great potential to improve processes, quality and communication, and it is becoming more popular and important.

It has to be recognized that exploratory methods are suited to generating hypotheses rather than to testing and confirming them.

One of the main features of exploratory data analysis is the key role of graphical displays. The human eye is a very powerful pattern recognition tool, and "graphical excellence is that which gives to the viewer the greatest number of ideas in the shortest time with the least ink in the smallest space" (Edward R. Tufte).

For this reason the focus is on an exploratory and graphical approach. This approach is useful for learning about the structure of the underlying microdata, for evaluating model estimations with graphical diagnostic tools, and for displaying key figures about the data to the public.

A workshop on EDAVIS should extend these ideas and make them applicable to a broader audience.

Scope:

The purpose of this workshop is to allow participants to share their experience in implementing data analysis and visualisation techniques in official statistics, review these actions, discuss good practices and identify ways to progress in this field at ESS level. Furthermore, it can also serve as a framework for presenting an overview of the methodological and technological developments in data analysis and visualisation, and for disseminating and demonstrating state-of-the-art results in this field.

Leading scientists, experienced researchers and practitioners will come together to exchange knowledge and to build scientific contacts. The number of participants is restricted to 40.

Topics:

All aspects of Data Analysis and Visualisation, including (but not limited to) developments in the areas listed below:

1. Data Analysis and Visualisation in the data production process: Exploratory Data Analysis; Visualisation of Categorical Data; Diagnostics for Models, Measurement Errors and Influential Observations, Missing Values and Imputed Values; Multivariate Methods

2. Visualisation for a Better Understanding by the End User: Visualisation of Aggregated Data; Visualisation of Indicators; Visualisation for Policy Needs; Mapping, GIS

3. Software and Data-driven Computational Methods: Software Development and (Interactive) Tools for Data Analysis and Visualisation; Data-driven Computational Methods


Scientific Programme

Thursday, May 27, 2010:

Opening

9.00-9.10: K. Pesendorfer (Director General, Statistics Austria): Welcome Address

9.10-9.20: R. Muthmann (Eurostat): Welcome Address or Short Introduction

9.20-9.30: A. Haslinger, M. Templ (Organizers): Organisational Issues

Chair: M. Templ

9.30-10.15: (KEYNOTE) M. Greenacre: Dynamic Graphics in Exploratory Data Analysis: the New Frontier

10.15-10.45: J. Kehrer: Selected Opportunities for Integrating Statistics and Visualization in Multi-dimensional Data Exploration

10.45-11.15: A. Alfons, M. Templ, P. Filzmoser: Exploring Microdata with Missing Information

11.15-11.45: Coffee break

Chair: B. Hulliger

11.45-12.30: (KEYNOTE) P. Filzmoser, M. Templ: Data Analysis with Robust Statistical Methods

12.30-13.00: W. Hacking: Applying Macro Editing in MacroView

13.00-14.15: Break


Panel Session Robustness and Editing:

14.15-14.45: Discussants P. Filzmoser, B. Hulliger

Chair: M. Graf

14.45-15.15: I. Kaminger, G. Katzelberger: Statistical Grids and Cartographic Products in Statistics Austria

15.15-15.45: W. Baaske: Explorative data analysis in policy counseling

15.45-16.15: B. Meindl, A. Kowarik, S. Zechner: Creation of Graphical Tables for Websites and Documents

16.15-16.40: Coffee break

Panel Session on Data Analysis and Official Statistics:

16.40-17.15: Discussants M. Graf, M. Greenacre, E. de Jonge


Friday, May 28, 2010:

Chair: A. Haslinger

9.00-9.45: (KEYNOTE) S. Rogers: Talk about Revolutions: The changing face of data dissemination in the 21st Century

9.45-10.15: T. Brandmuller: Map-Based Data Visualization on the web

10.15-10.45: J. Malmdin and M. Ribe: Plans and tools at Statistics Sweden for standardized graphics and interactive maps

10.45-11.15: Coffee break

Chair: E. Di Meglio

11.15-12.00: (KEYNOTE) E. de Jonge: Visualizing migration using network analysis

12.00-12.30: A. Boesch: Visual aggregation of the Swiss Sustainable Development Indicators System (MONET)

12.30-13.00: M. Jern, L. Thygesen: Explore, Collaborate and Publish Official Statistics

13.00-14.15: Break

Chair: P. Filzmoser

14.15-14.45: H. Piringer: Visplore - a Versatile Approach to the Interactive Visual Analysis of Large Multivariate Data

14.45-15.15: B. Hulliger, D. Lussmann, M. Templ: Visualisation in the AMELI Project

15.15-15.45: C. De Gregorio: The dynamics of HICP components and their comparability across countries

Panel Session on Future Directions in Data Analysis and Visualization:

15.45-16.15: Discussants E. Di Meglio, S. Rogers, M. Templ


Abstracts

A. Alfons, M. Templ, P. Filzmoser: Exploring Microdata with Missing Information

W.E. Baaske: Explorative data analysis in policy counseling

A. Boesch: Visual aggregation of the Swiss Sustainable Development Indicators System (MONET)

T. Brandmuller: Map-Based Data Visualization on the web

C. De Gregorio: The dynamics of HICP components and their comparability across countries

P. Filzmoser, M. Templ: Data Analysis with Robust Statistical Methods

M. Greenacre: Dynamic Graphics in Exploratory Data Analysis: the New Frontier

W. Hacking: Applying Macro Editing in MacroView

B. Hulliger, D. Lussmann, M. Templ: Visualisation in the AMELI Project

I. Kaminger, G. Katzelberger: Statistical Grids and Cartographic Products in Statistics Austria

J. Kehrer: Selected Opportunities for Integrating Statistics and Visualization in Multi-dimensional Data Exploration

M. Jern, L. Thygesen: Explore, Collaborate and Publish Official Statistics

E. de Jonge: Visualizing migration using network analysis

J. Malmdin and M. Ribe: Plans and tools at Statistics Sweden for standardized graphics and interactive maps

B. Meindl, A. Kowarik and S. Zechner: Creation of Graphical Tables for Websites and Documents

H. Piringer: Visplore - a Versatile Approach to the Interactive Visual Analysis of Large Multivariate Data

S. Rogers: Talk about Revolutions: The changing face of data dissemination in the 21st Century


Exploring Microdata with Missing Information

A. Alfons1, M. Templ1,2, P. Filzmoser1

1 Department of Statistics and Probability Theory, Vienna University of Technology, Wiedner Hauptstrasse 7, 1040 Wien

2 Methods Unit, Statistics Austria

Keywords: Missing Values, Visualization, R, Graphical User Interface

1 Abstract

Visualizing missing values makes it possible to explore the data and the structure of the missing values simultaneously. This may help to identify the mechanism generating the missing data, which is required knowledge for estimating the missing values in a reliable manner.

The main goal of this contribution is to stress the importance of visualizing missing values before imputation. A collection of visualization techniques for missing values is presented, all of which are implemented in the R package VIM. A graphical user interface allows easy use of the plot methods. Furthermore, VIM can be used for data from essentially any field. If spatial coordinates are available, information about missing values can also be displayed in maps.
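As a rough illustration of what it means to explore the structure of missing values before imputation, the following Python sketch tabulates how many values are missing per variable and which combinations of variables are missing together. It does not use or reproduce the VIM package; the function names, the use of `None` as the missing-value marker and the toy records are all assumptions made for this example.

```python
from collections import Counter


def missing_summary(rows, columns):
    """Return (missing count per variable, Counter of missingness patterns)."""
    per_variable = {c: 0 for c in columns}
    patterns = Counter()
    for row in rows:
        # True marks a missing value; None stands in for NA in this sketch.
        pattern = tuple(row.get(c) is None for c in columns)
        patterns[pattern] += 1
        for c, is_missing in zip(columns, pattern):
            per_variable[c] += int(is_missing)
    return per_variable, patterns


def pattern_label(pattern):
    """Render a pattern compactly: 'x' = missing, '.' = observed."""
    return "".join("x" if m else "." for m in pattern)


# Hypothetical toy microdata, invented for illustration only.
columns = ["age", "income", "region"]
rows = [
    {"age": 34, "income": 2100, "region": "AT1"},
    {"age": 51, "income": None, "region": "AT2"},
    {"age": None, "income": None, "region": "AT1"},
    {"age": 28, "income": 1800, "region": None},
    {"age": 45, "income": None, "region": "AT3"},
]

per_variable, patterns = missing_summary(rows, columns)

for pattern, count in patterns.most_common():
    # One crude text bar per pattern, most frequent combination first.
    print(pattern_label(pattern), "#" * count, count)
```

A table like this is only a starting point: a pattern in which one variable is missing far more often within certain groups of another variable would hint that the values are not missing completely at random.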


Explorative data analysis in policy counseling

W.E. Baaske1

1 Studienzentrum für internationale Analysen (STUDIA), Panoramaweg 1, 4553 Schlierbach, Austria. ZVR: 742926122. http://www.studia-austria.com

Keywords: Policy counseling, regional development, explorative data analysis

1 Abstract

As opposed to the natural sciences, socio-economics and politics have a nearly unlimited number of influential factors. On the other hand, clients often demand research results that are determined before research has taken place. The researcher faces a dilemma: promise results and risk not achieving them, or lose the client's confidence. Research into the client's world should increase the client's range of action and provide promising options. Explorative methods are a tool for achieving this.

I have experience in research on regional policies, and co-operate with a counselor for local development who has collected data from 20,000 questionnaires in 60 municipalities in Germany and Austria. This data base is steadily increasing, due to new municipalities demanding support. The data have been used not only for creating awareness in each single municipality, but also for cross comparison of municipalities, creating benchmarks for success or deriving consequences for societal development and agricultural policies.

Descriptions of well-accepted ideas have opened up minds for research questions. The researcher and regional developer referred to the values of the clients, the municipalities and their political representatives, such as development goals, commitment to frameworks (Common Agricultural Policy, Kyoto, ...) or participation of the civil society. The counseling institution's own experiences and success stories create confidence in it. This opens up minds for verifying hypotheses, seeking evidence and developing new insights.

The research process is linked loosely to the client's demands. Several valid indicators should represent the hypotheses, but they constitute just a small part of a broad data base, in which official statistics are also included. Descriptive and explorative methods are used to learn how the data behave and how they interact. Interpreting results will demand diverse views from the research team and external practitioners. The client may also be involved to ensure a proper understanding of interdependencies and causality.

Final presentations focus first on consequences, and to a lesser degree on the results or the explorative process. The process will be described at a management level and will not focus on unimportant details. Descriptive statistics help to get an understanding of the facts, but the data presented have to be reduced. Models (like regression models) are presented only insofar as they depict the main action options for the client.

Explorative data analysis is definitely a crucial tool for policy counseling, but it is often left unmentioned and is therefore underestimated and underfinanced. It should be made better known that explorative methods yield important insight into the clients' world and provide them with reliable decision tools. This may be best done when important consequences for the client's action have been achieved by using data and appropriate analysis tools.


Visual aggregation of the Swiss Sustainable Development Indicators System (MONET)

A. Boesch1

1 Federal Statistical Office FSO, Espace de l’Europe 10, CH-2010 Neuchatel, Switzerland

Keywords: Visual aggregation, Sustainable Development, Indicators, Visualisation, Dashboard

1 Abstract

What is the best way to synthesize the information provided by an indicator system of more than 50 indicators? What is the best way to provide an overview that is easily understandable by the general public and by policymakers without losing transparency? This is the challenge faced by an indicator system such as MONET (German acronym for “Monitoring Sustainable Development”).

There are several ways to aggregate data. The approach used for MONET is a visual aggregation method called Dashboard (or Cockpit), in analogy with a car dashboard or aircraft cockpit. This simple method allows the aggregation of indicators with various units (by means of the assessment of their trend) and provides a synoptic view of the whole system as well as an overall assessment of the situation. At the same time, the Dashboard gives access to each individual indicator, thus allowing detailed information to be displayed and achieving transparency.

The aim of the presentation is to provide information about the experience gained with the Dashboard of Sustainable Development at the FSO. Among other things, this includes: taking due account of the requirement for transparency in official statistics; the prerequisites for an indicator to be included in the Dashboard; and the way the assessments of each individual indicator are merged into a broader assessment.


Map-Based Data Visualization on the web

T. Brandmuller1

1 Eurostat

1 Abstract

As stated in the 5-year Community Statistical Programme 2008 to 2012: “The rapid evolution in the capacity and availability of the Internet will make it the prime tool for the dissemination of statistical data in the future. ... The Internet will, however, also introduce significant new challenges for user-friendly presentation of data that help users to find, display and understand statistics. ... The growing awareness of the potential of the combination of geographic with statistical and thematic information increases the demand for mapping, analyses and applications.”

The presentation will examine to what extent the portals of Statistical Institutes have met the challenges mentioned above. In particular, it will focus on the visualization tools that combine geographic and statistical information.

In the first part, a conceptual framework will be established that combines user needs and functionalities. We classify users into tourists, farmers and miners following the typology established by [Inmon, et al., 2001] and identify their needs. These needs should be reflected in the functional requirements included in the conceptual framework.

In the second part, selected portals will be shown and assessed following the established framework. The selected portals include the Eurostat portal and the OECD eXplorer. The objective is to compare and evaluate the mapping tools used.

References

[EUROPEAN PARLIAMENT, 2007] Decision No 1578/2007/EC of the European Parliament and of the Council of 11 December 2007 on the Community Statistical Programme 2008 to 2012, 2007.

[Inmon, et al., 2001] W.H. Inmon, C. Imhoff, and R. Sousa. Corporate Information Factory, Wiley, 2001.


The dynamics of HICP components and their comparability across countries

C. De Gregorio1

1 ISTAT via Cesare Balbo 16, 00184, Rome, Italy.

Keywords: Consumer price indices, Harmonisation, Multivariate data analysis, Principal components, Cluster analysis, Variability, Sampling in HICP

1 Abstract

This contribution proposes a method to classify the behaviour of HICP sub-indices and to evaluate their comparability across countries. As a consequence of this analysis, some suggestions are made concerning possible improvements that would make the use of sub-indices viable as a tool for policy purposes. Notwithstanding its detailed aggregative structure, the consumer price index (CPI) has traditionally been seen in the literature as a typical macroeconomic indicator, used mainly as a tool to target monetary policies, while much less importance has been given to its potential uses for less aggregated policies concerning industrial, competition or consumer protection subjects. These uses nevertheless seem to have been gaining ground in recent years, at least in the European Union, partly owing to the presence of a harmonised CPI (the HICP) produced by 30 countries in the area according to a common legal basis. They seem to pave the way for further extensions of the scope of the potential uses of the CPI well beyond the needs of monetary policy.

Two main elements are stressed in this contribution. Firstly, the HICP (like any CPI) derives from the aggregation of sub-indices with possibly very diverse price dynamics (flat or smoothly time-linear, erratic or seasonal fluctuations, sharp rises or declines, etc.): it appears useful to quantify this heterogeneity and to derive a classification of the behaviours of the whole set of sub-indices. Secondly, it may be reasonable to expect that the sub-indices referring to the same elementary aggregate behave according to similar rules across EU countries, or at least that they can be compared in order to give an economic explanation for different behaviours of the same market in different countries.

If considered at a very detailed level, the HICP sub-indices may behave very heterogeneously across the EU. In several cases, it is reasonable to assume that these differences are not generated by genuine underlying differences in the structure of national markets: they can be explained by the heterogeneity of the methodological approaches adopted nationally to produce the estimates. This fact alone might affect sector analysis, especially if conducted at EU level and for policy reasons. In recent years the harmonisation process has focused more thoroughly on the methodological approach to estimating the HICP. EC Regulation 1334/2007 is in fact a major step towards a unitary approach to the process of estimating price sub-indices based on a coherent definition of the target population and on explicit sampling design.

This paper suggests a methodology to identify the areas where the harmonisation of sampling approaches is most urgently needed, and provides some suggestions for possible intervention in specific areas. For this purpose, a comparative study of the elementary indices by country is proposed in order to provide an overall evaluation of their dynamics and a measurement of cross-country heterogeneity. In particular, the paper provides a classification of the dynamic behaviour of more than 1,200 HICP series, examined over the period 2004-2008 and broken down by country and COICOP class. This classification is derived from the values of a set of indicators calculated on each series, which are analysed by the sequential use of principal components and cluster analysis (Ward algorithm). On the one hand, the classification provides a tool to describe the range of behaviours that characterise elementary HICPs; on the other hand, it helps to identify the sub-indices which show lower degrees of homogeneity across countries and which may thus be deemed to need harmonisation. A measurement of such heterogeneities is proposed, with the main objective of highlighting the possible areas of methodological harmonisation; some case studies may be reported.


Data Analysis with Robust Statistical Methods

P. Filzmoser1, M. Templ1, 2

1 Department of Statistics and Probability Theory, Vienna University of Technology, Wiedner Hauptstraße 8-10, 1040 Wien

2 Methods Unit, Statistics Austria

Keywords: Outliers, Indicators, Robust Estimation, Visual Exploration

1 Abstract

Data outliers or other data inhomogeneities can be highly influential to traditional statistical methods and estimators. Robust versions, in contrast, can cope with certain deviations from idealized statistical assumptions, and they still deliver reliable results in such situations.

Basic concepts

The most important concepts from robust statistics, the influence function and the breakdown point, are introduced and visually explained. Diagnostic tools like the sensitivity curve are presented, which are helpful in understanding the effect that a single observation can have on an estimator.
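The sensitivity curve mentioned above can be made concrete with a small Python sketch. It uses the common textbook definition SC(x) = n * (T(x_1, ..., x_{n-1}, x) - T(x_1, ..., x_{n-1})), where T is the estimator and x is one added observation. The sample values and the function name are assumptions made for this example, not material from the talk.

```python
import statistics


def sensitivity_curve(estimator, sample, x):
    """Scaled change in the estimate when one observation x is added."""
    n = len(sample) + 1
    return n * (estimator(sample + [x]) - estimator(sample))


# Hypothetical sample, invented for illustration only.
sample = [2.0, 3.0, 5.0, 7.0, 11.0]

for x in (100.0, 1000.0):
    sc_mean = sensitivity_curve(statistics.mean, sample, x)
    sc_median = sensitivity_curve(statistics.median, sample, x)
    # The mean's value grows without bound in x; the median's stays fixed.
    print(f"x={x:>7}: mean SC={sc_mean:8.1f}  median SC={sc_median:4.1f}")
```

For this sample the curve of the mean grows linearly with the added value, while the curve of the median stays constant once the added value lies beyond the bulk of the data, which is the bounded-influence behaviour that robust estimators aim for.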

Reducing the influence of outliers

Robust estimators are characterized by limiting the effect of outlying observations. This is demonstrated with point and variance estimators. When estimating totals, robustness often leads to a small bias of the estimates, but their variances may decrease considerably. Using close-to-reality simulations, we show that for the estimation of indicators, too, the robust versions lead to more reliable results in the presence of data contamination.

Outlier detection

Outlier detection methods are helpful to detect measurement errors, but they can also identify observations that are highly influential on non-robust estimations. While traditional editing rules are deterministic and not data-driven, outlier detection based on statistical estimations accounts for the structure of the underlying data. Reliable outlier identification is only possible if robust estimators, which are insensitive to these outliers, are used. For survey data, however, it is not trivial how robust estimators should be applied to data including missing values, zeros and skewed distributions. We will discuss these issues and provide possible approaches.
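The claim that reliable outlier identification needs robust estimators can be sketched in Python with a simple univariate rule. The example compares a classical z-score (mean and standard deviation) with a robust one built from the median and the median absolute deviation (MAD). The cutoff of 3, the consistency factor 1.4826 for normal data and the toy values are assumptions chosen for illustration; the abstract does not state which detection methods the authors use.

```python
import statistics


def classical_flags(values, cutoff=3.0):
    """Indices whose z-score based on mean and standard deviation exceeds cutoff."""
    centre = statistics.mean(values)
    scale = statistics.stdev(values)
    return [i for i, v in enumerate(values) if abs(v - centre) / scale > cutoff]


def robust_flags(values, cutoff=3.0):
    """Indices whose z-score based on median and scaled MAD exceeds cutoff."""
    centre = statistics.median(values)
    mad = statistics.median(abs(v - centre) for v in values)
    if mad == 0:
        raise ValueError("MAD is zero; this simple rule is not applicable")
    scale = 1.4826 * mad  # makes the MAD comparable to a standard deviation
    return [i for i, v in enumerate(values) if abs(v - centre) / scale > cutoff]


# Hypothetical measurements with one gross error at the end.
values = [10, 12, 11, 13, 12, 11, 95]

print("classical:", classical_flags(values))
print("robust:   ", robust_flags(values))
```

In this toy case the gross error inflates the mean and standard deviation so much that the classical rule does not flag it, whereas the median and MAD are barely affected and the robust rule does. Real survey data with missing values, zeros and skewed distributions would need the more careful treatment that the abstract refers to.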


Dynamic Graphics in Exploratory Data Analysis: the New Frontier

M. Greenacre

1 Universitat Pompeu Fabra, Ramon Trias Fargas 25–27, Barcelona, 08005 Spain

Keywords: Dynamic graphics, Embedding, Video, Visualization.

1 Abstract

In his book Sight, Sound, Motion: Applied Media Aesthetics, Zettl (2005) describes the five principal aesthetic fields in media production: light/color; the two-dimensional field of area and screens within screens; the three-dimensional field of space; and extensions to the four-dimensional field of time/motion and the five-dimensional field of sound. Scientific graphics hardly venture beyond the two-dimensional field, and only occasionally into the third dimension using interactive computer graphics. In this talk we consider the “fourth-dimensional” use of motion in data visualization and exploration.

Dynamic graphics introduce an additional dimension in the form of a timeline. The simplest timeline is, of course, time, when the data are observed in a natural time series. Hans Rosling’s well-known TED talk on the web gives striking examples of these. Other examples of timeline variables are:

• a changing angle of a three-dimensional graphic;

• a conditioning variable;

• a user-defined parameter which links two alternative ways of visualizing a data set.

As scientific publishing goes increasingly online, the inclusion of dynamic graphics in published articles is becoming a real possibility and will, in our opinion, become a commonplace feature. Videos and three-dimensional objects can already be embedded in PDF files. Greenacre and Hastie (2010) is, to our knowledge, a first attempt at publishing a complete article with embedded animations. The challenge is to prepare general tools that enable users to produce such animations; we describe some existing examples using R.

References

M. Greenacre and T. Hastie. Dynamic visualization of statistical learning in the context of high-dimensional textual data. Journal of Web Semantics, to appear, 2010.

H. Zettl. Sight, Sound, Motion: Applied Media Aesthetics. 4th Edition. Wadsworth, California, 2005.


Applying Macro Editing in MacroView

W. Hacking1

1 Statistics Netherlands

Keywords: macro editing, output editing, statistical editing, graphical editing

1 Abstract

Macro-editing or selective editing is a processing step that should be performed after automatic corrections and derivations have been applied to micro-data [T. De Waal, D. Haziza 2008]. In macro-editing, records are selected and compared at a macro-level and (only) a few of those records are corrected manually. During macro-editing the idea is to select (groups of) records that:

i) have suspicious values

ii) and are influential, e.g. they contribute considerably to the value in the aggregated (output) tables

As a result of the last criterion, a large reduction in the effort put into the manual correction of micro-data is to be expected, while the quality of the output will not decrease. Some simulation studies (e.g., [L. Granquist 1994]) have reported an efficiency gain of 35% to 80% over micro-editing by using the macro selection approach. Although some specific applications have been implemented [L. Granquist 1994], to the best of our knowledge no general tool has been specifically developed for macro-editing. Therefore, at Statistics Netherlands a dedicated tool called MacroView is currently being developed. This macro-editing tool uses a script language allowing for a very flexible specification of the macro-selection process and its parts. In addition to an efficiency gain, this approach can also improve the quality of the data through comparison with other sources of data.
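The two selection criteria can be illustrated with a minimal Python sketch. It is not MacroView's script language, and every name and threshold in it (`Record`, `select_for_review`, `ratio_limit`, `share_limit`) is a hypothetical choice made for illustration: a record counts as suspicious when its value is far from the median of its output cell, and as influential when it carries a large share of the weighted cell total.

```python
from collections import defaultdict
from dataclasses import dataclass
from statistics import median


@dataclass
class Record:
    unit_id: str
    cell: str      # output-table cell the record contributes to
    value: float   # reported value (assumed non-negative)
    weight: float  # survey weight


def select_for_review(records, ratio_limit=3.0, share_limit=0.10):
    """Return the records that are both suspicious and influential."""
    by_cell = defaultdict(list)
    for rec in records:
        by_cell[rec.cell].append(rec)

    selected = []
    for cell_records in by_cell.values():
        cell_median = median(r.value for r in cell_records)
        cell_total = sum(r.weight * r.value for r in cell_records)
        for rec in cell_records:
            # i) suspicious: far above or below the typical value in the cell
            ratio = rec.value / cell_median if cell_median else float("inf")
            suspicious = ratio > ratio_limit or ratio < 1.0 / ratio_limit
            # ii) influential: large share of the weighted cell total
            share = rec.weight * rec.value / cell_total if cell_total else 0.0
            influential = share >= share_limit
            if suspicious and influential:
                selected.append(rec)
    return selected


records = [
    Record("u1", "A", 1.0, 1.0),    # suspicious but not influential
    Record("u2", "A", 9.0, 1.0),
    Record("u3", "A", 10.0, 1.0),
    Record("u4", "A", 11.0, 1.0),
    Record("u5", "A", 12.0, 1.0),
    Record("u6", "A", 100.0, 1.0),  # suspicious and influential
]
flagged = [r.unit_id for r in select_for_review(records)]
```

Only the second criterion keeps the small outlier `u1` out of the manual workload, which is where the expected saving comes from.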

References

[L. Granquist 1994] Macro editing: A Review of some Methods for Rationalizing the Editing of Survey Data. Statistical Data Editing, Volume No. 1: Methods and Techniques, 1994.

[T. De Waal, D. Haziza 2008] Statistical editing and imputation. Handbook of Statistics, Volume 29, Sample Surveys: Theory, Methods and Inference, Editors: C.R. Rao and D. Pfeffermann, 2008.

Visualisation in the AMELI Project

B. Hulliger1, D. Lussmann1, M. Templ2,3

1 University of Applied Sciences Northwestern Switzerland, School of Business, 4600 Olten, Switzerland

2 Department of Statistics and Probability Theory, Vienna University of Technology, Wiedner Hauptstrasse 7, 1040 Wien

3 Methods Unit, Statistics Austria

Keywords: Visualisation of Indicators, Evaluation-Plot, Sparkevals, Mapping, Grid-map

1 Abstract

The aim of the FP7 project AMELI (Advanced Methodology for European Laeken Indicators) is to develop new and improved methodologies for estimating indicators, i.e. to increase the precision when estimating indicators and their variances, which should make it possible to report high-quality figures to policy makers. This presentation highlights some of the developments of AMELI in visualisation, where we concentrate on the visualisation of indicators.

Evaluation of Indicators:

The evaluation of indicators depends on the targets which policy has set for the indicator. The various possible targets are summarised into the concept of a target course. When comparing the target course and the indicator, the relevance of a deviation and the uncertainty in the estimation have to be taken into account. These elements are condensed into an evaluation plot. The evaluation plot shows, by traffic light colouring and boundary limits, when an indicator is likely to go off course.
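One way to combine a target course, a relevance threshold and the estimation uncertainty into a traffic-light colour is sketched below in Python. This is an illustrative rule under stated assumptions, not the AMELI evaluation plot itself: the function names, the tolerance band, the normal-approximation factor `z` and the three-colour rule are all hypothetical.

```python
def traffic_light(estimate, target, std_error, tolerance, z=1.96):
    """Classify one time point of an indicator against its target.

    tolerance: largest deviation from the target still considered irrelevant.
    z * std_error: half-width of an approximate confidence interval.
    """
    deviation = abs(estimate - target)
    margin = z * std_error
    if deviation + margin <= tolerance:
        return "green"  # the whole interval lies inside the tolerance band
    if deviation - margin > tolerance:
        return "red"    # the whole interval lies outside the tolerance band
    return "amber"      # the interval overlaps the boundary


def evaluate_course(estimates, target_course, std_errors, tolerance):
    """Apply the rule to every time point of a target course."""
    return [
        traffic_light(e, t, s, tolerance)
        for e, t, s in zip(estimates, target_course, std_errors)
    ]


colours = evaluate_course(
    estimates=[10.2, 11.0, 13.0],
    target_course=[10.0, 10.0, 10.0],
    std_errors=[0.1, 0.5, 0.5],
    tolerance=1.0,
)
```

Under these assumptions a large deviation alone is not enough for a red light; the uncertainty of the estimate has to rule out that the indicator is still within the tolerated band.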

Projection of Maps:

European maps are presented using a specific coordinate system. However, when providing functionality to zoom into smaller regions, the presentation of the zoomed objects is distorted unless the corresponding region is projected into another suitable coordinate system. We present a solution where the projection and the calculation of the necessary projection parameters are done automatically and hidden from the user.

Presentation of Indicators in Maps:

Whenever time series of indicators are available for each region, it may be interesting to estimate correlations between them. It is then possible to provide functionality that allows the user to click on one region and automatically display, in maps using a continuous colour scale, the correlation of each region with the selected region. In addition, further information can be provided by plotting sparklines or evaluation plots for each region in the map and by providing graphical tables that include evaluation plots, both as plots and in LaTeX format.
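The click-and-correlate idea can be sketched in a few lines of Python. The sketch assumes one equally long time series per region, uses the Pearson correlation, and maps it linearly to a grey level; the function names and the grey scale are hypothetical choices for illustration, not the implementation used in the project.

```python
from math import sqrt


def pearson(x, y):
    """Pearson correlation of two equally long series (nan if one is constant)."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        return float("nan")
    return sxy / sqrt(sxx * syy)


def correlations_to(selected, series):
    """Correlation of every other region's series with the selected region."""
    reference = series[selected]
    return {
        region: pearson(reference, values)
        for region, values in series.items()
        if region != selected
    }


def grey_level(r):
    """Map a correlation in [-1, 1] to a grey level in 0..255."""
    return round((r + 1) / 2 * 255)


series = {
    "AT": [1.0, 2.0, 3.0, 4.0],
    "DE": [2.0, 4.0, 6.0, 8.0],
    "IT": [4.0, 3.0, 2.0, 1.0],
}
corr = correlations_to("AT", series)
shades = {region: grey_level(r) for region, r in corr.items()}
```

A click handler in a map viewer would call `correlations_to` with the clicked region and recolour all other regions with the resulting shades.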

Grid map:

The map of Europe is a very intuitive tool to guide users to the results for their own country and for neighbouring countries that are interesting for comparison purposes. Unfortunately, the area of European countries is very variable and, in addition, the area is only loosely correlated with the population size. Grid maps display diagrams in a grid arrangement where each country has the same area for display. At the same time, grid maps retain some of the desirable properties of a map: the geographical location and neighbourhood relations. Thus a grid map always displays a country at the same location, close to its geographical position. In addition, the neighbouring countries can be found easily, too.

Statistical Grids and Cartographic Products in Statistics Austria

I. Kaminger1, G. Katzlberger1

1 Statistics Austria, Guglgasse 13, 1110 Vienna.

Keywords: GIS, Raster Maps

1 Abstract

Statistical grids:

The spatial dimension of statistical data is no longer limited to administrative units; through the link to the building register (xy-coordinates), data can be aggregated to any customer-defined area. Over the past few years Statistics Austria has developed a system of standardised grids with cells as small as 100 m × 100 m and offers a set of specific variables on the basis of these grids for spatial analysis.

Small area statistics are also becoming more and more important for transnational spatial analysis. Therefore the ESSNet project GeoStat aims to harmonise the various European national grid systems into one European grid system. Another aim is to provide the data from the coming 2010/2011 census for all of Europe on the basis of grids. Two methods for establishing the grids will be combined: the aggregation or “bottom-up” method for countries where point-based data are available, and a disaggregation or “top-down” method for the remaining countries. The infrastructure for the dissemination of results, as well as methods and possibly tools for analysis, will also be provided through the project.
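The “bottom-up” aggregation of point-based data into grid cells can be sketched as follows in Python. The sketch assumes that building coordinates are given in metres in a projected coordinate system and identifies each cell by its lower-left corner; the function names and the input layout are hypothetical and do not describe the production system at Statistics Austria.

```python
from collections import Counter
from math import floor


def grid_cell(x, y, cell_size=100):
    """Lower-left corner of the grid cell that contains the point (x, y)."""
    return (floor(x / cell_size) * cell_size, floor(y / cell_size) * cell_size)


def aggregate(points, cell_size=100):
    """Sum a variable (here: persons) over all points falling into each cell.

    points: iterable of (x, y, persons) tuples, coordinates in metres.
    """
    totals = Counter()
    for x, y, persons in points:
        totals[grid_cell(x, y, cell_size)] += persons
    return dict(totals)


buildings = [
    (120.5, 80.0, 3),
    (199.9, 0.0, 2),
    (200.0, 100.0, 4),
    (-1.0, 5.0, 1),
]
cells_100m = aggregate(buildings, cell_size=100)
cells_1km = aggregate(buildings, cell_size=1000)
```

Because cell corners are multiples of the cell size, coarser grids (for example 1 km) nest exactly on top of the finer ones, so the same point data can serve several resolutions.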

Cartographic Products in Statistics Austria:

Besides traditional cartographic output in print, i.MAP, an application for visualising maps, was developed at Statistics Austria. This tool enables interactive thematic and topographic maps on a variety of subject areas to be made available on the website. The main target group is not the expert user but the general public, so it will be used for press conferences and press releases to illustrate the presented data in an efficient way.

Selected Opportunities for Integrating Statistics and Visualization in Multi-dimensional Data Exploration

J. Kehrer1

1 Visualization Group, Dept. of Informatics, University of Bergen, Norway

1 Abstract

Visualization and statistics both facilitate the understanding of complex data characteristics, and there is a long history of relations between the two fields. Traditional approaches for data analysis often consider passive visualizations of statistical data properties. Interactive visual analysis, however, as addressed in this talk, allows the iterative exploration and analysis of data in a guided human-computer dialog. Graphical representations of the data and well-proven interaction mechanisms are used to concurrently show, explore, and analyze complex (i.e., time-dependent, multivariate, and/or multi-dimensional) data. Interesting subsets of the data are interactively selected (brushed) directly on the screen, and the relations are investigated in other linked views (including 2D scatterplots, histograms, function graph views, parallel coordinates, but also 3D views of volumetric data).

In recent work, we have studied the integration of large amounts of locally aggregated statistical data properties as well as measures of outlyingness in an interactive visual analysis process. The approach is demonstrated on the visual analysis of multi-dimensional climate data. A discussion of possibilities explains how a further combination of interactive statistical plots and proven interaction schemes from visualization research shows great potential for future research.

Explore, Collaborate and Publish Official Statistics

M. Jern1, L. Thygesen2

1 National Center for Visual Analytics NCVA, Linköping University, Sweden

2 Statistics Denmark

Keywords: GeoAnalytics, explorative statistics data visualization, collaborative visualization, web visualization

1 Abstract

Official statistics such as demographics, environment, health, social-economy and education from national and sub-national sources are a rich and important source of information for many important aspects of life and deserve to be more widely used and acknowledged in education. A seamlessly integrated statistics exploration, collaboration and publication process is introduced, facilitating storytelling aimed at producing statistical news content in support of an automatic authoring process. The author can simply press a button to publish gained knowledge that efficiently and clearly visualizes spatio-temporal statistical data. The toolkit involves a novel storytelling technology that advances research critical to official statistical production and publishing. It delivers this research into a web-enabled toolkit for the generation, management and publication of embedded dynamic visualization, with the analytics sensemaking metadata joined together and publishable in any HTML web page such as blogs, wikis etc. Publishing official statistics through assisted content creation with emphasis on visualization and aesthetics represents another key advantage of our storytelling and could in many ways change the terms and structures for learning.

While the benefits of geovisual analytics (GeoAnalytics) tools are many [G. Andrienko, N. Andrienko 2005], it has been a challenge to adapt these tools to the Internet and reach a broader user community. Research has focused on tools that explore data, while methods that publish gained knowledge have not received the same attention. Publication is the part of the analytical process that is visible to the consumers, and the visual sparks it generates could take on new value in a social setting and become a catalyst for discussion. GeoAnalytics tools are introduced that address challenges in support of both the editorial and the related authoring process, with the goal of advancing research critical to publishing. Seamless integration of exploration, collaboration and publication is required according to [P. Keel 2006]. The storytelling mechanism enables the transition of tedious statistics data into heterogeneous, open and communicative sensemaking news entities with integrated contextual metadata that emphasize content creation aspects such as aesthetics analysis or “infosthetics”, and where dynamic embedded temporal visualization could engage the user.

We build upon previous research by [M. Jern, J. Rogstadius, T. Astrom and A. Ynnerman 2008] and our web-enabled applications, e.g. “OECD eXplorer” in collaboration with OECD [M. Jern, L. Thygesen, M. Brezzi 2009] and the Open eXplorer platform that is emerging as a de facto standard in the statistics community for exploring and communicating statistics data. A storytelling mechanism is available for the author to first access and explore spatiotemporal statistical data from official databases, then orchestrate and describe metadata, collaborate with colleagues to reach consensus and finally publish essential gained insight and knowledge as a Vislet (figure 2(a)) embedded into blogs or wikis. The conceptual approach to our authoring and publishing concept is based around three complementary characteristics: eXplorer, storytelling and Vislet (widget):

Authoring (eXplorer): data provider (spreadsheet and SDMX), data manager, motion visual representations including choropleth map, scatter plot, table lens, parallel axes chart and time graph, data grid, coordinated views, map layers, analytic tools (dynamic query, filter, regional categorization, profiles, highlight), dynamic colour scale and legend, create HTML code for Vislet.

Storytelling: snapshot mechanism, metadata with hyperlinks, story and chapters, edit, capture, save, export story, publish story.

Vislet: embeddable interactive motion visual representations including choropleth map, scatter plot, parallel axes chart, table lens and metadata for publishing in blogs, wikis etc.

FIGURE 1. The analyst uses eXplorer to 1) import regional statistical data, 2) explore and make discoveries through trends and patterns and derive insight - gained knowledge is the foundation for 3) creating a story that can be 4) shared with colleagues to reach consensus and trust. The visual discoveries are captured into snapshots together with descriptive metadata and hyperlinks in relation to the analytics reasoning. The author gets feedback from colleagues, adapts the story and 5) finally publishes “tell-a-story” to the community using a “Vislet” that is embedded in blogs or wikis.

The eXplorer platform is customized from our Web-enabled GAV Flash class library, programmed in Adobe’s object-oriented language ActionScript, and includes a collection of common geo- and information visualization representations (figure 2(b)). Statistical data are analysed through the use of multiple-linked motion views. Complex patterns can be detected through a number of different visual representations simultaneously, each of which is best suited to highlight different statistical patterns and can help stimulate the analytical visual thinking process so characteristic of GeoAnalytics reasoning. All graphs are time-linked, which is important in the synthesis of animation within explorative statistical data analysis. Interactive features that support a spatial analytical reasoning process are exposed, such as tooltips, brushing, highlight, visual inquiry, and conditioned statistics filter mechanisms that can discover outliers and simultaneously update all views. Of particular interest are the common information visualization methods table lens and parallel coordinates, to a great extent unknown to the statistics community, extended with special features that are important to statistics exploration, for example comparing the profiles of selected regions, motion to see these profiles change over time, frequency histograms and filter operations based on percentile statistics. The Flash-based parallel axis chart and table lens (figure 2(b)) have gradually proven to be not only functional but also productive for analysing patterns in multi-dimensional statistical data (6-12 indicators).

Storytelling is achieved through a mechanism in GAV Flash that supports the storage of interactive events in an analytical reasoning process through “memorized interactive visualization views” or “snapshots” that can be captured at any time during an explorative data analysis process; capturing them becomes an important task of the storytelling authoring process. The author selects relevant indicators, regions-of-interest, colour schema, visual inquiries and filter conditions focusing on the data-of-interest, and finally highlights the “discovery” from the most appropriate visual representations and includes reasoning text. The Vislet (figure 2(a)) is a standalone Flash application (widget) assembled from the GAV Flash class library (figure 2(b)) and Flex GUI tools. It integrates selected statistical indicators supported by highly interactive visualization with descriptive metadata and can be embedded into blogs, wikis or any HTML document.

(a) Time-linked views in a Vislet showing ageing population for Europe 1990-2008.

(b) eXplorer and Vislets are developed from GAV Flash components customized and optimized to sustain real-time coordinated time-linked views that are simultaneously updated with changing regional statistics data for every new time step.

FIGURE 2. Vislet views (a) and eXplorer and Vislet (b)

Open eXplorer is freely available and could increase the interest among specialist as well as non-specialist users and at the same time encourage the practical use of more advanced GeoAnalytics technologies.

References

[G. Andrienko, N. Andrienko 2005] Visual Exploration of Spatial Distribution of Temporal Behaviors. In Proceedings of IEEE IV2005.

[D. Guo, J. Chen, A.M. MacEachren, K. Liao 2006] A visualization system for space-time and multivariate patterns. IEEE Visualization and Computer Graphics, Vol 12, No 6, 2006.

[M. Jern, J. Rogstadius, T. Astrom and A. Ynnerman 2008] Visual Analytics presentation tools applied in HTML Documents. Reviewed proceedings, IV08, London, July 2008, published by IEEE Computer Society.

[M. Jern, L. Thygesen, M. Brezzi 2009] A web-enabled Geovisual Analytics tool applied to OECD Regional Data. Reviewed Proceedings in Eurographics 2009, Munchen, March 2009.

[P. Keel 2006] Collaborative Visual Analytics: Inferring from the Spatial Organisation and Collaborative use of information. VAST 2006, pp. 137-144, IEEE.

Visualizing migration using network analysis

E. de Jonge1

1 Statistics Netherlands (CBS), Henri Faasdreef 312, 2492 JP Den Haag, The Netherlands

Keywords: Consumer price indices, Harmonisation, Multivariate data analysis, Principal components, Cluster analysis, Variability, Sampling in HICP

1 Abstract

Many official statistics, like migration and trade, are flow data. Flow data are observations of quantities moving from a source to a destination. A natural visualization for unidirectional and simple flow data is a flow diagram. Such a diagram shows the main flows at a glance, thereby giving the user insight into the data set. Analyzing and visualizing bidirectional flow data with thousands of flows, however, is no simple task. An example of such a data set is internal migration between cities. The presentation discusses the problems and solutions in the visual analysis of internal migration in the Netherlands. It shows examples of Google Earth flow maps generated with R scripts. The flow maps and analysis improve if network analysis is used to find clusters in the graph. The presentation introduces network analysis and shows how modularity can be used to show important patterns for internal migration in the Netherlands.
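To make the role of modularity concrete, the following Python sketch scores a given grouping of cities. It assumes that the bidirectional flows are first symmetrised by adding both directions, and then evaluates Newman's modularity Q = Σ_c [ L_c / m − (d_c / 2m)² ], where m is the total edge weight, L_c the weight inside cluster c and d_c the summed degree of its cities. The function name, the input layout and the toy data are hypothetical; the presentation itself works with R scripts.

```python
from collections import defaultdict


def modularity(flows, community):
    """Newman modularity of a partition for symmetrised flow data.

    flows: dict mapping (origin, destination) to the number of movers.
    community: dict mapping every city to a cluster label.
    """
    # Symmetrise: the undirected weight of a pair is the sum of both directions.
    weight = defaultdict(float)
    for (origin, destination), movers in flows.items():
        if origin != destination:
            weight[frozenset((origin, destination))] += movers

    m = sum(weight.values())
    if m == 0:
        return 0.0

    degree = defaultdict(float)
    internal = defaultdict(float)
    for pair, w in weight.items():
        a, b = tuple(pair)
        degree[a] += w
        degree[b] += w
        if community[a] == community[b]:
            internal[community[a]] += w

    degree_sum = defaultdict(float)
    for city, d in degree.items():
        degree_sum[community[city]] += d

    return sum(internal[c] / m - (degree_sum[c] / (2 * m)) ** 2 for c in degree_sum)


# Toy data: two tightly connected triples of cities joined by one weaker link.
pairs = [("A", "B"), ("B", "C"), ("A", "C"), ("D", "E"), ("E", "F"), ("D", "F"), ("C", "D")]
flows = {}
for a, b in pairs:
    flows[(a, b)] = 1
    flows[(b, a)] = 1

good = {"A": 0, "B": 0, "C": 0, "D": 1, "E": 1, "F": 1}
bad = {"A": 0, "D": 0, "B": 1, "E": 1, "C": 2, "F": 2}
q_good = modularity(flows, good)
q_bad = modularity(flows, bad)
```

A clustering algorithm searches for the partition with the highest score; the resulting clusters can then be used to colour or bundle the flows on the map.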

Plans and tools at Statistics Sweden for standardized graphics and interactive maps

J. Malmdin1 and M. Ribe2

1 Statistics Sweden, Process Department, Method Unit, Box 24 300, SE-104 51 Stockholm, Sweden

2 Statistics Sweden, Process Department, Management, Box 24 300, SE-104 51 Stockholm, Sweden

Keywords: Statistics presentation, graphic presentation, SCB eXplorer, exploratory data analysis, interactive software tool.

Statistics Sweden (SCB) is developing standard formats for graphic presentation of statistics on its web site. The work covers both graphs in press releases and interactive graphs using a statistical data base. The standardization should optimize graphics in supplementing tables and text for conveying statistical information to users. Standard templates will also make the production of graphs efficient. A project group is drafting the standard, which is presented in the paper along with a review of the potential software tools considered.

SCB is further involved in work on enhanced use of maps and Geographic Information Systems (GIS) technology for effective presentation of regional statistics. The software product SCB eXplorer is developed in cooperation between SCB and the National Centre for Visual Analytics (NCVA) at Linköping University, Sweden. Visualization with SCB eXplorer gives the end user a tool to analyze data and find clusters, trends, and which regions are alike in various aspects. The OECD eXplorer, a sister application to the SCB eXplorer, is a visualization tool for regional statistics in OECD countries.

BJ Map is an existing map server which can display thematic maps based on data bases (in dBase or PC-AXIS format). It uses stored background maps for different zoom levels and plots symbols on them. The result is returned to the Internet user (client) as an HTML file containing a reference to a generated map image. Subject-matter areas covered include income, agriculture, elections, environment, enterprises, municipalities (taxation, finance), social services, education, urban areas, and international statistics.

The paper also briefly discusses how to resolve problems in standardized use of data from different sources.

Creation of Graphical Tables for Websites and Documents

B. Meindl1 and A. Kowarik1 and S. Zechner2

1 Statistics Austria, Guglgasse 13, A-1110 Vienna, Austria

2 Vienna University of Technology, Wiedner Hauptstr. 8-10, A-1040 Vienna, Austria

Keywords: Sparklines, Sparkbar, EDAVIS Workshop 2010.

1 Abstract

With the R package sparklines, tables presenting quantitative information can be enhanced by including sparklines and sparkbars, proposed by Tufte (2001). Sparklines and sparkbars are simple, intense and illustrative graphs, small enough to be fitted in a single line. Therefore they can easily enrich tables and continuous texts with additional information in a comprehensive visual way.

The aim is to provide an easy and fast way to create output for websites, presentations and documents. The core functionality is provided in a flexible way so that it is possible to use it in a custom setting, such as including more sophisticated sparkline plots as suggested in Hulliger et al. (2008), for example.

The usage of the package is explained with real-world applications.
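The idea of a word-sized graph inside a table row can be illustrated without the R package. The following Python sketch is not the package's API: it simply rescales a series to eight Unicode block characters, and the function names `sparkline` and `table_row` are hypothetical.

```python
BLOCKS = "▁▂▃▄▅▆▇█"


def sparkline(values):
    """Render a numeric series as a one-line string of block characters."""
    lowest, highest = min(values), max(values)
    if highest == lowest:
        return BLOCKS[0] * len(values)  # a constant series is drawn flat
    scale = (len(BLOCKS) - 1) / (highest - lowest)
    return "".join(BLOCKS[round((v - lowest) * scale)] for v in values)


def table_row(label, values):
    """One table row: label, sparkline of the series, and the latest value."""
    return f"{label:<12}{sparkline(values)}  {values[-1]:.1f}"


ramp = sparkline([0, 1, 2, 3, 4, 5, 6, 7])
row = table_row("AT", [0, 7])
```

Printing one such row per region gives a plain-text table in which each series is readable at a glance next to its most recent figure.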

References

B. Hulliger, D. Lussmann, F. Kohler, A.-M. Mayerat, and A. de Montmollin. Evaluation of indicators on environment and sustainability. European Conference on Quality in Official Statistics, Rome, 2008.

E. R. Tufte. The Visual Display of Quantitative Information. Graphics Press, 2001.

Visplore - a Versatile Approach to the Interactive Visual Analysis of Large Multivariate Data

H. Piringer1

1 VRVis Zentrum fuer Virtual Reality und Visualisierung, Forschungs-GmbH, Donau-City-Straße 1, A-1220 Wien, Austria

Keywords: exploratory data analysis, interactive visualization, multiple coordinated views, linking+brushing, interactive optimization, black-box analysis

1 Abstract

Visplore is a powerful approach to VISually exPLORE large and complex data in many diverse application scenarios (e.g., engineering, simulation, science, finance, business, telecommunication, infrastructure, and many more). Rather than using visualization only for presentation of known facts, Visplore helps users to gain new insight into their data by a highly innovative combination of visualization, computation, and interaction techniques. In particular, it is possible to relate dozens of data dimensions simultaneously and to select arbitrary subsets (e.g., patterns like clusters or outliers) for highlighted visualization and statistical quantification. Visplore also greatly facilitates optimization tasks involving multiple criteria and constraints, as they arise in many applications. In general, users benefit by getting valuable information out of their data within a short time, which helps to identify potentials for improvement and to find reasons explaining observed facts.

(a) Optimizing the design of a car engine by analyzing several hundred simulation runs for different choices of design parameters. All simulation runs are highlighted which are Pareto-optimal with respect to four different simulation results as shown in the parallel coordinates plot.

(b) Analyzing relationships between multiple financial time series (i.e., prices of commodities).

FIGURE 1. Examples from Visplore.
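The Pareto-optimal runs mentioned in caption (a) can be identified with a short filter. The Python sketch below is an illustration only, not Visplore's implementation; it assumes that every simulation result is to be minimised, and the function names and the toy data are hypothetical.

```python
def dominates(a, b):
    """True if run a is at least as good as run b in every result
    and strictly better in at least one (all results are minimised)."""
    return all(x <= y for x, y in zip(a, b)) and any(x < y for x, y in zip(a, b))


def pareto_front(runs):
    """Keep the runs that are not dominated by any other run."""
    return [run for run in runs if not any(dominates(other, run) for other in runs)]


# Each tuple holds two results of one simulation run, e.g. (fuel use, emissions).
runs = [(1, 5), (2, 2), (5, 1), (3, 3), (2, 6)]
front = pareto_front(runs)
```

This pairwise comparison is quadratic in the number of runs, which is unproblematic for the several hundred runs mentioned in the caption.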

Talk about Revolutions: The changing face of data dissemination in the 21st Century

S. Rogers1

1 ONS

1 Abstract

The shift from traditional paper formats to a web-based medium is a new and challenging revolution in a history of revolutions that have shaped the way National Statistical Institutes deliver statistical information. While inventions such as time-line charts [J. Priestley 1765], line and bar charts [W. Playfair 1768], and the steam press [F.G. Koenig 1814] opened the door to the presentation of data to a wide audience in a meaningful way, the World Wide Web [T. Berners-Lee 1989] and Web 2.0 [T. O’Reilly 2004] mark the beginning of new and unrealised opportunities to unlock and present even more information latent in data. Guided by other revolutions such as our understanding of how people perceive information in both static and interactive images [E. Hering 1878]; [J.J. Gibson 1966], we present here some of the ways the Office for National Statistics is contributing to the changing face of data dissemination in the 21st Century.

References

[E.G. Carmines and R.A. Zeller 1979] Reliability and Validity Assessment. London: Sage Publications, 1979.

[M. Friendly 2000] Visualizing Categorical Data. SAS Institute Inc., 2000.

[R.J. Light 1971] Measures of response agreement for qualitative data: generalizations and alternatives. Psychological Bulletin, 76, pages 365-377, 1971.

[T. Berners-Lee 1989] A proposal for the World Wide Web. Accessed online, March 2010. http://info.cern.ch/Proposal.html

[J.J. Gibson 1966] The senses considered as perceptual systems. Allen & Unwin, London, 1966.

[E. Hering 1878] On the theory of sensibility to light. In Hochberg, J. E. Perception. Prentice-Hall Inc, New Jersey, 1964.

[F.G. Koenig 1814] The high-speed printing press. Accessed online, March 2010. http://www.victorianweb.org/technology/print/3.html

[T. O’Reilly 2004] What is Web 2.0?. Accessed online, March 2010. http://oreilly.com/web2/archive/what-is-web-20.html

[W. Playfair 1768] The commercial and political atlas. Accessed online, March 2010. http://www.absoluteastronomy.com/topics/William Playfair

[J. Priestley 1765] Essay on a course of liberal education for civil and active life. Printed for C. Henderson under the Royal Exchange; T. Becket and De Hondt in the Strand; and by J. Johnson and Davenport, in Pater-Noster-Row, London, 1765.

Social Program

The conference dinner is funded by Statistics Austria. It takes place in a typical Viennese wine restaurant (Heuriger). Details will be posted on the conference website shortly before the workshop starts.

Venue of the Workshop

Arriving at the Vienna International Airport (VIE):

Take the CAT (City-Airport-Train, direct connection, more expensive) or the regular city train (Schnellbahn, S7, more stops, lower price) to reach the city centre. The train stop for the city centre is WIEN-MITTE/Landstrasse. From there it is possible to change to several public transport lines (e.g. underground line U3 or U4). To go to Statistics Austria take U3 in direction “Simmering” and get off at the stop “Gasometer” (see above).

Arriving by train:

The main railway station WIEN WEST has a direct connection to different public transport lines (e.g. underground lines U3 and U6). When arriving at the southern terminal WIEN SÜD, it is possible to change to several public transport lines (e.g. S-Bahn (city train) to Wien Mitte/Landstrasse). To go to Statistics Austria, again take U3 in direction “Simmering” and get off at the stop “Gasometer”. Of course there is also the possibility to take a taxi.

Detailed maps of the conference venue can be found on the conference website:

http://www.statistik.tuwien.ac.at/edavis or by clicking on these links:

• MAPS

• Statistics Austria in Google Maps

Refunding

For each NSI at least one person will get the travel costs refunded (allowances and costs for accommodation are not paid by the organizers). Travel expenses incurred for the purposes of a mission are reimbursed exclusively on the basis of the cost of the most appropriate and cost-effective means of transport (e.g., by using “low cost” carriers) between the place of employment and the place of the mission. When travelling by air with less than four hours continuous flying time, economy class or equivalent has to be used at the lowest available rates. The use of a car is authorized where, in view of the specific features of the mission, it improves the cost effectiveness of travel and/or of the mission itself, particularly where the vehicle is shared by a number of colleagues. We strongly advise staff against using their own cars when going on mission. We encourage the use of public transport. Airport transfers are reimbursed on request at the price of the shuttle service or on presentation of supporting documents. Taxis may only be used for transfers to airports or stations at the place of employment or the place of mission where public transport is not a suitable alternative. Travel expenses by rail are reimbursed on presentation of supporting documents on the basis of the first- or second-class rail fare, including the cost of seat reservations and any supplements.

To get the money refunded you have to provide us with an invoice from the travel agency stating the cost of your travel to the conference, together with a form with your bank account details, which can be found here: http://www.statistik.tuwien.ac.at/edavis/EDAVIS reimburstment.doc.

Please note that, when you apply for refunding, it is also necessary to show us your passport during the conference, of which we will take a copy.

Time Table

Thursday, May 27, 2010:

9.00-11.15: presentations
11.15-11.45: coffee break
11.45-13.15: presentations
13.15-14.15: break
14.15-14.45: panel session
14.45-16.15: presentations
16.15-16.45: coffee break
16.45-17.15: panel session
19.00-22.30: conference dinner

Friday, May 28, 2010:

9.00-10.45: presentations
10.45-11.15: coffee break
11.15-13.00: presentations
13.00-14.15: break
14.15-15.45: presentations
15.45-16.15: panel session