Mondrian and OLAP Overview

Mondrian and OLAP Frontends

RTP Pentaho User Group, Q1 2012 Meetup

Pentaho Overview

BI Server (Frontend for tools)

Report Designer (canned report designer)

Mondrian (Schema Workbench, Aggregate Designer)

Data Integrator

Other ad-hoc tools (reporting)

Weka (predictive analytics)

Enterprise Extras

Analyzer

Interactive Reporting

Dashboard Designer

Data Integration Scheduler

Support

OLAP?

On-line Analytical Processing

Designed for Analytics, not transactions

ROLAP (Mondrian)

Relational OLAP

MOLAP (Palo)

Multidimensional OLAP

ROLAP

Benefits

Data is stored in a Kimball-style star schema

Usable by all other tools (reporting, dashboards, etc.)

Cube is cached in memory

Cons

Slow queries while the cube is being cached

Performance depends on the backend database
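As a rough illustration of what "relational OLAP over a star schema" means, the sketch below builds a tiny fact table with two dimension tables in an in-memory SQLite database and runs the kind of join-and-group-by query a ROLAP engine would send to its backend. The table and column names (sales_fact, dim_product, dim_time) are hypothetical, and SQLite only stands in for whatever database is actually used.

```python
import sqlite3

# Hypothetical star schema: one fact table, two dimension tables.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE dim_product (product_id INTEGER PRIMARY KEY, family TEXT);
CREATE TABLE dim_time (time_id INTEGER PRIMARY KEY, year INTEGER, quarter TEXT);
CREATE TABLE sales_fact (product_id INTEGER, time_id INTEGER, unit_sales INTEGER);
INSERT INTO dim_product VALUES (1, 'Food'), (2, 'Drink');
INSERT INTO dim_time VALUES (1, 2011, 'Q4'), (2, 2012, 'Q1');
INSERT INTO sales_fact VALUES (1, 1, 10), (1, 2, 20), (2, 1, 5), (2, 2, 7);
""")


def sales_by_family_and_year(connection):
    """Join the fact table to its dimensions and aggregate the measure."""
    return connection.execute("""
        SELECT p.family, t.year, SUM(f.unit_sales)
        FROM sales_fact f
        JOIN dim_product p ON p.product_id = f.product_id
        JOIN dim_time t ON t.time_id = f.time_id
        GROUP BY p.family, t.year
        ORDER BY p.family, t.year
    """).fetchall()


rows = sales_by_family_and_year(conn)
```

Because the data stays in ordinary relational tables, reporting and dashboard tools can query the same tables directly, and the speed of queries like this one depends on the backend database.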

MOLAP

Benefits

Data stored in multidimensional format

Usually highly compressed

Cons

Potentially long processing times to handle permutations

Higher cardinality (dimensions with millions of records) increases processing time
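A back-of-the-envelope sketch of why permutations and cardinality matter: the number of ways to group a cube doubles with every dimension, and the upper bound on cells for one grouping is the product of the dimension cardinalities. The dimension names and sizes below are made up for illustration, and this is not a description of how Palo or any other MOLAP engine works internally (real cubes are sparse, so actual cell counts are usually far lower).

```python
from itertools import combinations
from math import prod

# Hypothetical dimension cardinalities (number of distinct members).
cardinalities = {"product": 1000, "customer": 2_000_000, "day": 365}


def grouping_sets(dimensions):
    """Every subset of dimensions a cube could be pre-aggregated by."""
    names = sorted(dimensions)
    return [
        combo
        for size in range(len(names) + 1)
        for combo in combinations(names, size)
    ]


def max_cells(dimensions, combo):
    """Upper bound on cells for one grouping: product of cardinalities."""
    return prod(dimensions[name] for name in combo)


sets = grouping_sets(cardinalities)
worst_case = max_cells(cardinalities, tuple(cardinalities))
```

With three dimensions there are 2^3 = 8 groupings; adding a fourth dimension doubles that, and a single high-cardinality dimension such as customer dominates the worst-case cell count.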

Mondrian Development Life-cycle

ROLAP Optimizations

Columnar data stores

Built for huge datasets in a conformed dimension format

Highly compressed and scalable

Examples: LucidDB, Infobright, InfiniDB
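To make the idea of a columnar layout concrete, the sketch below pivots a few row-oriented records into columns and applies run-length encoding to one of them. Run-length encoding is just one common compression technique, chosen here for illustration; the sketch makes no claim about how LucidDB, Infobright, or InfiniDB store data internally, and the sample rows are invented.

```python
from itertools import groupby

# Hypothetical fact rows: (region, product_family, unit_sales).
rows = [
    ("East", "Food", 10),
    ("East", "Food", 12),
    ("East", "Drink", 3),
    ("West", "Drink", 4),
    ("West", "Drink", 6),
]


def to_columns(records):
    """Pivot row-oriented records into one list per column."""
    return [list(column) for column in zip(*records)]


def run_length_encode(values):
    """Collapse consecutive repeats into (value, count) pairs."""
    return [(value, len(list(group))) for value, group in groupby(values)]


region, family, unit_sales = to_columns(rows)
encoded_region = run_length_encode(region)

# An aggregate only needs to read the one column it touches.
total_sales = sum(unit_sales)
```

Low-cardinality dimension columns repeat the same values many times, which is why storing each column contiguously tends to compress well, and analytic queries that touch only a few columns can skip the rest.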

Mondrian Specific Optimizations

Aggregate Designer

Performs cost/benefit analysis on all permutations of data

Generates SQL that can be loaded into ETL jobs or plugins (as with LucidDB) and run on a schedule

Cons

Aggregate data has to be refreshed as new data comes in; it will get stale otherwise!

Refresh time depends on the data set
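The staleness problem can be sketched with a hand-written aggregate table. The example below is not output from Aggregate Designer; it is a minimal stand-in using an in-memory SQLite database, with hypothetical table and column names, that rebuilds a summary table from the fact table and shows what happens when new facts arrive between refreshes.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE sales_fact (product_family TEXT, year INTEGER, unit_sales INTEGER);
INSERT INTO sales_fact VALUES
    ('Food', 2012, 10), ('Food', 2012, 20), ('Drink', 2012, 5);
""")


def refresh_aggregate(connection):
    """Rebuild the (hypothetical) summary table from the fact table."""
    connection.executescript("""
        DROP TABLE IF EXISTS agg_sales_by_family_year;
        CREATE TABLE agg_sales_by_family_year AS
        SELECT product_family, year,
               SUM(unit_sales) AS unit_sales,
               COUNT(*) AS fact_count
        FROM sales_fact
        GROUP BY product_family, year;
    """)


def read_aggregate(connection):
    """Read the pre-computed totals without touching the fact table."""
    return connection.execute(
        "SELECT product_family, unit_sales, fact_count "
        "FROM agg_sales_by_family_year ORDER BY product_family"
    ).fetchall()


refresh_aggregate(conn)
before = read_aggregate(conn)

# New fact rows arrive; the aggregate is stale until it is refreshed again.
conn.execute("INSERT INTO sales_fact VALUES ('Drink', 2012, 7)")
stale = read_aggregate(conn)

refresh_aggregate(conn)
after = read_aggregate(conn)
```

A full rebuild like this is the simplest refresh strategy, and its cost grows with the fact table, which is why the refresh would typically be scheduled in an ETL job rather than run on every load.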

MDX

MultiDimensional Expressions

SQL for OLAP

Open standard developed by Microsoft
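To give a feel for the "SQL for OLAP" comparison, the sketch below places a small MDX query next to roughly equivalent SQL. The MDX is written in the style of Mondrian's FoodMart sample cube, but the cube, dimension, and measure names are assumptions for this example and the MDX string is shown only for comparison, not executed. The SQL runs against a hypothetical star schema in an in-memory SQLite database.

```python
import sqlite3

# MDX: unit sales for each product family. Names are assumed, not taken
# from the slides; this string is never executed in this sketch.
MDX_QUERY = """
SELECT {[Measures].[Unit Sales]} ON COLUMNS,
       {[Product].[Product Family].Members} ON ROWS
FROM [Sales]
"""

# Roughly equivalent SQL against a hypothetical star schema.
SQL_QUERY = """
SELECT p.product_family, SUM(f.unit_sales)
FROM sales_fact f
JOIN dim_product p ON p.product_id = f.product_id
GROUP BY p.product_family
ORDER BY p.product_family
"""

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE dim_product (product_id INTEGER PRIMARY KEY, product_family TEXT);
CREATE TABLE sales_fact (product_id INTEGER, unit_sales INTEGER);
INSERT INTO dim_product VALUES (1, 'Food'), (2, 'Drink');
INSERT INTO sales_fact VALUES (1, 10), (1, 20), (2, 5), (2, 7);
""")

result = conn.execute(SQL_QUERY).fetchall()
```

The MDX version names members and measures of the cube and places them on axes, while the SQL version has to spell out the joins and grouping; translating the former into the latter is the job of a ROLAP engine such as Mondrian.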

MDX

Source: http://sqlblogcasts.com/blogs/drjohn/archive/2008/09/27/mdx-and-sql-combining-relational-and-multi-dimensional-data-into-one-query-result-set.aspx