Enhancing OLAP Analysis with Web Cubes

Lorena Etcheverry (1), Alejandro A. Vaisman (2)

(1) lorenae@fing.edu.uy, Instituto de Computación, Universidad de la República, Uruguay
(2) [email protected], Université Libre de Bruxelles, Belgium

9th Extended Semantic Web Conference, Crete, 2012

Outline

1 Motivation and Contribution

2 Preliminaries
    Multidimensional Model
    OLAP Operators

3 The Open Cube Vocabulary
    Representing the Model in OC
    Implementing OLAP Operators in OC

4 Related Work

5 Conclusion and Future Work
    Conclusion
    Future Work

Motivation

OLAP (On-Line Analytical Processing) allows analyzing huge amounts of data for decision-making.

Multidimensional data are seen as data cubes (DC).

An ETL (Extract, Transform, Load) process initially loads the data warehouse (DW); the data is then refreshed periodically.

ETL is costly and resource-consuming.

[Figure: DW architecture, Malinowski & Zimányi, 2008]

Research Problem

Sometimes we cannot afford to wait until the next DW refresh. We need data now, as it is.

This represents a paradigm shift from traditional ETL.

Semantic Web and OLAP tools and theories can be put together to solve this problem.

Idea: use temporary cubes of “Good Enough” web data to enhance the analysis of traditional DCs. After being used, “Good Enough” cubes can be either dropped or loaded into the DW in the usual way.

The main research questions are:

How can we get data for decision-making from the Web, ready for analysis, in the shortest possible time-frame?

Where do we store these data?

Motivation

Use Case

A business user analyzes data stored in a DW using OLAP tools.

She wants to obtain data from the web (e.g., from current online offers of certain products).

But she does not have time to incorporate these new data into the ETL process (she may not even need these data after the analysis).

Solution: build a “Web Cube” and analyze its content together with local cubes. We can:

Get data from the Semantic Web (SW) and export it into a traditional DC.

Build cubes using SW technology.

A Possible Architecture

[Fusion cubes: Towards self-service business intelligence, Abelló et al., Dagstuhl seminar 2011 “Data Warehousing: from Occasional OLAP to Real-time Business Intelligence”]

Contribution

An RDF vocabulary that fully supports the classical multidimensional model.

A set of OLAP operators implemented as SPARQL queries.

Algorithms that automatically build the SPARQL queries that implement OLAP operators.

Multidimensional Model

Instance Example

TIME: year1 / month1 / date1, date2

PRODUCT                 GEOGRAPHY        date1              date2
                                         price   qtySold    price   qtySold
cat1   model1   prod1   state1   city1   200     20         200     5
                prod2   state2   city2   245     10         230     10
                                 city3   260     15         265     10

OLAP Operators

(a) Cube C

                  prod1   prod2
month1   date1    1       3
         date2    3       2
month2   date3    4       3

(b) RollUp(C, Time, month, sum)

         prod1   prod2
month1   4       5
month2   4       3

(c) Slice(C, Product, sum)

month1   date1    4
         date2    5
month2   date3    7

(d) Dice(C, Time, date > date1)

                  prod1   prod2
month1   date2    3       2
month2   date3    4       3
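
The three operators can be checked against cube C with a few lines of Python. This is only an illustrative sketch: the dictionary layout and the function names `roll_up_to_month`, `slice_product` and `dice_time` are assumptions made for this example and do not come from the talk. It reproduces panels (b), (c) and (d) from panel (a).

```python
from collections import defaultdict

# Cube C from panel (a): (month, date, product) -> measure value.
CUBE = {
    ("month1", "date1", "prod1"): 1, ("month1", "date1", "prod2"): 3,
    ("month1", "date2", "prod1"): 3, ("month1", "date2", "prod2"): 2,
    ("month2", "date3", "prod1"): 4, ("month2", "date3", "prod2"): 3,
}


def roll_up_to_month(cube):
    """RollUp(C, Time, month, sum): aggregate dates into their month."""
    out = defaultdict(int)
    for (month, _date, prod), value in cube.items():
        out[(month, prod)] += value
    return dict(out)


def slice_product(cube):
    """Slice(C, Product, sum): drop Product by summing over its members."""
    out = defaultdict(int)
    for (month, date, _prod), value in cube.items():
        out[(month, date)] += value
    return dict(out)


def dice_time(cube, predicate):
    """Dice(C, Time, condition): keep cells whose date satisfies the condition."""
    return {key: value for key, value in cube.items() if predicate(key[1])}


rolled = roll_up_to_month(CUBE)
sliced = slice_product(CUBE)
# String comparison stands in for date order; it only works for these labels.
diced = dice_time(CUBE, lambda date: date > "date1")
```

Roll-up and slice both aggregate; they differ in whether a dimension is coarsened or removed. Dice only filters, so its cells keep their original values.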

The Open Cube Vocabulary

Schema Examples

Dimension Schema

eg:products a oc:Dimension ;
    oc:dimHasLevel eg:product ;
    oc:dimHasLevel eg:model ;
    oc:dimHasLevel eg:category .

eg:product a oc:Level .
eg:model a oc:Level .
eg:category a oc:Level .
eg:product oc:parentLevel eg:model .
eg:model oc:parentLevel eg:category .

Fact Schema

eg:sales a oc:FactSchema ;
    oc:hasLevel eg:product ;
    oc:hasLevel eg:city ;
    oc:hasLevel eg:date ;
    oc:hasMeasure eg:price ;
    oc:hasMeasure eg:qtySold .

eg:price a oc:Measure ;
    oc:hasAggFunction eg:avg .

eg:qtySold a oc:Measure ;
    oc:hasAggFunction eg:sum .
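
To see what the schema triples make available to an OLAP tool, the sketch below loads a subset of them as plain Python tuples and derives two things: the level hierarchy of a dimension and the aggregate function bound to each measure. The helper names `objects`, `level_path` and `measure_aggregates` are hypothetical, and the code assumes a linear hierarchy (each level has at most one parent level), which holds for this example but is not stated as a general property of OC.

```python
# Subset of the schema triples from the Dimension Schema and Fact Schema examples.
TRIPLES = [
    ("eg:products", "oc:dimHasLevel", "eg:product"),
    ("eg:products", "oc:dimHasLevel", "eg:model"),
    ("eg:products", "oc:dimHasLevel", "eg:category"),
    ("eg:product", "oc:parentLevel", "eg:model"),
    ("eg:model", "oc:parentLevel", "eg:category"),
    ("eg:sales", "oc:hasMeasure", "eg:price"),
    ("eg:sales", "oc:hasMeasure", "eg:qtySold"),
    ("eg:price", "oc:hasAggFunction", "eg:avg"),
    ("eg:qtySold", "oc:hasAggFunction", "eg:sum"),
]


def objects(subject, predicate):
    """Return every object o such that (subject, predicate, o) is a triple."""
    return [o for s, p, o in TRIPLES if s == subject and p == predicate]


def level_path(dimension):
    """Order the levels of a dimension from finest to coarsest.

    Assumes a linear hierarchy: one bottom level, one parent per level.
    """
    levels = objects(dimension, "oc:dimHasLevel")
    parents = {p for lvl in levels for p in objects(lvl, "oc:parentLevel")}
    bottom = [lvl for lvl in levels if lvl not in parents][0]
    path = [bottom]
    while objects(path[-1], "oc:parentLevel"):
        path.append(objects(path[-1], "oc:parentLevel")[0])
    return path


def measure_aggregates(fact_schema):
    """Map each measure of a fact schema to its declared aggregate function."""
    return {
        measure: objects(measure, "oc:hasAggFunction")[0]
        for measure in objects(fact_schema, "oc:hasMeasure")
    }


path = level_path("eg:products")
aggregates = measure_aggregates("eg:sales")
```

Because the aggregate function is attached to the measure in the schema, a roll-up can choose AVG for `eg:price` and SUM for `eg:qtySold` without asking the user.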

Instance Examples

Dimension Instance

egI:prod1 oc:inLevel eg:product ;
    oc:parentLevelMember egI:model1 .

egI:model1 oc:inLevel eg:model ;
    oc:parentLevelMember egI:cat1 .

egI:cat1 oc:inLevel eg:category .

Fact Instance

egI:sales_i1 rdf:type oc:FactInstance ;
    oc:hasSchema eg:sales ;
    eg:product egI:prod1 ;
    eg:date egI:date1 ;
    eg:city egI:city1 ;
    eg:price "200"^^xsd:integer ;
    eg:qtySold "20"^^xsd:integer .
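
The instance triples are what a roll-up navigates: starting from the member referenced by a fact, it follows `oc:parentLevelMember` until it reaches a member of the target level. The sketch below mirrors that walk over plain dictionaries. The names `ancestor_at` and `roll_up_member` are hypothetical, and the dictionaries are a hand-made encoding of the two instance examples, not an RDF store.

```python
# Dimension instance: member -> level, and member -> parent member.
IN_LEVEL = {
    "egI:prod1": "eg:product",
    "egI:model1": "eg:model",
    "egI:cat1": "eg:category",
}
PARENT_MEMBER = {
    "egI:prod1": "egI:model1",
    "egI:model1": "egI:cat1",
}

# Fact instance egI:sales_i1, with typed literals reduced to Python ints.
FACT = {
    "oc:hasSchema": "eg:sales",
    "eg:product": "egI:prod1",
    "eg:date": "egI:date1",
    "eg:city": "egI:city1",
    "eg:price": 200,
    "eg:qtySold": 20,
}


def ancestor_at(member, target_level):
    """Climb oc:parentLevelMember links until a member of target_level is found.

    Returns None when target_level is not at or above the member's level.
    """
    current = member
    while current is not None:
        if IN_LEVEL.get(current) == target_level:
            return current
        current = PARENT_MEMBER.get(current)
    return None


def roll_up_member(fact, dimension_property, target_level):
    """Find the target-level ancestor of the member a fact points to."""
    return ancestor_at(fact[dimension_property], target_level)


category = roll_up_member(FACT, "eg:product", "eg:category")
```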

Roll-Up Example

RollUp(Sales, Time, month, sum, avg)

CONSTRUCT {
  ?id oc:hasSchema eg:salesMonth . ?id eg:product ?prod .
  ?id eg:city ?city . ?id eg:month ?mon .
  ?id eg:price ?priceMonth . ?id eg:qtySold ?qtyMonth .
}
WHERE {
  { SELECT ?prod ?city ?mon
           (AVG(?price) AS ?priceMonth)
           (SUM(?qty) AS ?qtyMonth)
           (iri(fn:concat("http://example.org/salesInstances#", "sales", "_",
                fn:substring-after(?prod, "http://example.org/salesInstances#"), "_",
                fn:substring-after(?city, "http://example.org/salesInstances#"), "_",
                fn:substring-after(?mon, "http://example.org/salesInstances#"))) AS ?id)
    WHERE {
      ?i oc:hasSchema eg:sales . ?i eg:product ?prod .
      ?i eg:city ?city . ?i eg:date ?date .
      ?i eg:price ?price . ?i eg:qtySold ?qty .
      ?date oc:parentLevelMember ?mon . ?mon oc:inLevel eg:month .
    }
    GROUP BY ?prod ?city ?mon
  }
}
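
The roll-up query does three things: it joins each fact with the month of its date, groups by product, city and month, and mints a new IRI per group from the local names of the group keys. The Python sketch below emulates those steps without a SPARQL engine. The six fact rows are read from the Instance Example table (assuming its columns are price and qtySold for date1, then for date2), and the names `local_name` and `roll_up_to_month` are hypothetical.

```python
from collections import defaultdict

NS = "http://example.org/salesInstances#"

# ?date oc:parentLevelMember ?mon, for the two dates of the Instance Example.
DATE_TO_MONTH = {NS + "date1": NS + "month1", NS + "date2": NS + "month1"}

# (product, city, date, price, qtySold), as read from the Instance Example table.
FACTS = [
    (NS + "prod1", NS + "city1", NS + "date1", 200, 20),
    (NS + "prod1", NS + "city1", NS + "date2", 200, 5),
    (NS + "prod2", NS + "city2", NS + "date1", 245, 10),
    (NS + "prod2", NS + "city2", NS + "date2", 230, 10),
    (NS + "prod2", NS + "city3", NS + "date1", 260, 15),
    (NS + "prod2", NS + "city3", NS + "date2", 265, 10),
]


def local_name(iri):
    """Counterpart of fn:substring-after(iri, NS): text after the namespace."""
    return iri.partition(NS)[2]


def roll_up_to_month(facts):
    """Group facts by (product, city, month); AVG the price, SUM the quantity."""
    groups = defaultdict(list)
    for prod, city, date, price, qty in facts:
        groups[(prod, city, DATE_TO_MONTH[date])].append((price, qty))

    result = {}
    for (prod, city, mon), rows in groups.items():
        # Same shape as the fn:concat call: NS + "sales_<prod>_<city>_<mon>".
        parts = ["sales", local_name(prod), local_name(city), local_name(mon)]
        new_id = NS + "_".join(parts)
        result[new_id] = {
            "eg:product": prod,
            "eg:city": city,
            "eg:month": mon,
            "eg:price": sum(p for p, _ in rows) / len(rows),
            "eg:qtySold": sum(q for _, q in rows),
        }
    return result


monthly = roll_up_to_month(FACTS)
```

Deriving the identifier from the group keys makes the minted IRIs deterministic, so running the roll-up twice over the same data yields the same fact instances.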

Related Work

The RDF Data Cube Vocabulary (QB) [Cyganiak et al. 2012] (W3C Working Draft) does not directly support the classical multidimensional model for OLAP:

It is oriented to statistical data analysis.

It does not represent dimension structure.

It does not bind measures to aggregate functions.

It does not account for dimension hierarchies directly.

Consequence: OLAP operators are difficult to define over QB (see Kämpgen et al., ILD, ESWC 2012).

Conclusion

An RDF vocabulary for representing the classical multidimensional model such that:

The ANSI architecture is supported (conceptual, logical and physical levels clearly identified).

OLAP applications and operators can be implemented naturally and easily maintained and extended.

A set of OLAP operators implemented as SPARQL queries.

Algorithms that automatically build the SPARQL queries that implement such OLAP operators.

Preliminary tests over a proof-of-concept prototype.

Future Work

Extend the operator set (e.g., Drill-Across).

Perform stress tests.

Query processing and optimization.

Incorporate all of these into the general framework.

Thanks for your attention.

Questions?

Contact:

Lorena Etcheverry, lorenae@fing.edu.uy
Alejandro A. Vaisman, [email protected]

References

A. Abelló, J. Darmont, L. Etcheverry, M. Golfarelli, J.-N. Mazón, F. Naumann, T. B. Pedersen, S. Rizzi, J. Trujillo, P. Vassiliadis, and G. Vossen. Fusion cubes: Towards self-service business intelligence. Submitted to IJDWM, 2012.

R. Cyganiak and D. Reynolds. The RDF Data Cube Vocabulary, March 2012.

S. Gómez, L. Gómez, and A. A. Vaisman. A Generic Data Model and Query Language for Spatiotemporal OLAP Cube Analysis. In EDBT 2012, 2012.

C. A. Hurtado, A. O. Mendelzon, and A. A. Vaisman. Maintaining Data Cubes under Dimension Updates. In ICDE ’99, pages 346–355, Washington, DC, USA, 1999. IEEE Computer Society.

B. Kämpgen and A. Harth. Transforming statistical linked data for use in OLAP systems. In I-Semantics ’11, pages 33–40, New York, NY, USA, 2011. ACM.

B. Kämpgen, S. O’Riain, and A. Harth. Interacting with Statistical Linked Data via OLAP Operations. In Proceedings of the International Workshop on Interacting with Linked Data (ESWC). CEUR-WS.org, 2012.