Transcript
Page 1: Business Intelligence Open Source

Business Intelligence

Page 2: Business Intelligence Open Source

www.robertomarchetto.com

History

● The term "Business Intelligence" was first used in 1958 by Hans Peter Luhn, an IBM researcher

● Automatic method to provide current awareness services to scientists and engineers

● Current definition: Business Intelligence is a combination of processes and technologies for gathering, storing, analyzing and providing access to information, to help enterprise users make informed decisions

Page 3: Business Intelligence Open Source

Main concept

● Collect data from different sources

● Integrate and clean up data in a common, easy-to-analyze repository

● Provide business-related analysis for managers and decision makers

● Focus on business, data integration, data presentation
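
The collect, integrate and analyze steps listed on this slide can be sketched in a few lines of Python. This is only an illustration: the two sources (`crm_rows`, `erp_rows`), their field names and the cleaning rules are hypothetical and do not come from any specific BI tool.

```python
# Hypothetical records from two source systems with inconsistent formats.
crm_rows = [
    {"customer": " Acme ", "amount": "120.50", "country": "it"},
    {"customer": "Globex", "amount": "80", "country": "US"},
]
erp_rows = [
    {"cust_name": "ACME", "total": 200.0, "nation": "IT"},
    {"cust_name": "Initech", "total": None, "nation": "US"},  # incomplete record
]


def clean(customer, amount, country):
    """Normalize one record into the common repository format."""
    if amount is None:
        return None  # drop records that cannot be analyzed
    return {
        "customer": customer.strip().title(),
        "amount": float(amount),
        "country": country.strip().upper(),
    }


def integrate(crm, erp):
    """Collect rows from both sources and load them into one clean list."""
    candidates = [clean(r["customer"], r["amount"], r["country"]) for r in crm]
    candidates += [clean(r["cust_name"], r["total"], r["nation"]) for r in erp]
    return [row for row in candidates if row is not None]


def sales_by_country(repository):
    """A simple business-level analysis on the integrated data."""
    totals = {}
    for row in repository:
        totals[row["country"]] = totals.get(row["country"], 0.0) + row["amount"]
    return totals


repository = integrate(crm_rows, erp_rows)
report = sales_by_country(repository)
```

With the sample data, `repository` holds three cleaned records (the incomplete one is dropped) and `report` maps each country code to its total amount.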

Page 4: Business Intelligence Open Source

Data warehouse

● Bill Inmon: a collection of data in support of the decision-making process

● End-user oriented

● Collected from different sources

● Time-dependent

● Data is not editable

● In theory, the term refers to a group of processes

● In practice, it is often used to mean the database itself

Page 5: Business Intelligence Open Source

OLTP: On-Line Transaction Processing

● Commonly used in ERP, CRM systems and database applications

● Focus on the transaction level (one invoice, one sales order, a search query, etc.)

● Updates and insertions are frequent

● Relational model with many tables, using normalization rules

Page 6: Business Intelligence Open Source

OLAP: On-Line Analytical Processing

● A system designed for analysis purposes

● Focused on exploring the data as a whole

● Once added, data changes much less frequently

● 13 (12+0) rules of Dr. Codd (1993)

● Multidimensional view

● Intuitive data manipulation

● Dimensions, Facts, Hierarchy levels, Cardinality

Page 7: Business Intelligence Open Source

On-Line Analytical Processing

Page 8: Business Intelligence Open Source

Relational OLAP

● Uses relational database schemas and SQL to store and access OLAP cubes

● Reuse of RDBMS technology

● Many tools and vendors available

● SQL can be used directly by many tools

● Scalability
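
As a rough illustration of the ROLAP idea, the sketch below stores a tiny cube as one fact table plus two dimension tables and aggregates it with plain SQL. It uses Python's built-in `sqlite3` module; the table names, columns and values are hypothetical and chosen only for the example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    -- Dimension tables describe the axes of analysis.
    CREATE TABLE dim_date  (date_id INTEGER PRIMARY KEY, year INTEGER, month INTEGER);
    CREATE TABLE dim_store (store_id INTEGER PRIMARY KEY, country TEXT, state TEXT);
    -- The fact table holds the measures, keyed by the dimensions.
    CREATE TABLE fact_sales (
        date_id     INTEGER REFERENCES dim_date(date_id),
        store_id    INTEGER REFERENCES dim_store(store_id),
        store_sales REAL
    );
    INSERT INTO dim_date  VALUES (1, 2002, 6), (2, 2003, 6);
    INSERT INTO dim_store VALUES (1, 'USA', 'CA'), (2, 'USA', 'WA');
    INSERT INTO fact_sales VALUES (1, 1, 100.0), (1, 2, 50.0), (2, 1, 70.0), (2, 1, 30.0);
""")

# Aggregate the measure along two dimensions with ordinary SQL.
rows = conn.execute("""
    SELECT d.year, s.state, SUM(f.store_sales)
    FROM fact_sales AS f
    JOIN dim_date  AS d ON d.date_id  = f.date_id
    JOIN dim_store AS s ON s.store_id = f.store_id
    GROUP BY d.year, s.state
    ORDER BY d.year, s.state
""").fetchall()
```

Each tuple in `rows` is one cell of the cube (year, state, total sales). Because the storage is an ordinary relational database, any SQL-capable tool could run the same query.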

Page 9: Business Intelligence Open Source

Star schema

Page 10: Business Intelligence Open Source

Memory OLAP, Hybrid OLAP

● Memory OLAP uses optimized multidimensional arrays

● Requires pre-computation and storage of the cube (processing)

● Often performs better than ROLAP: better caching, multidimensional indexing

● Compression techniques, statistical indexes

● Less scalable than ROLAP on high volumes of data, fewer tools and vendors available

● Hybrid OLAP (HOLAP) is the combination of ROLAP and MOLAP

Page 11: Business Intelligence Open Source

Slowly Changing Dimensions

● In some Business Intelligence implementations data is always added and almost never modified

● This makes it possible to go back in the timeline

● For example, if an employee was hired in a given time period, you can analyze the data as of that period, counting exactly the number of employees at that time

● A common approach to implementing Slowly Changing Dimensions is to add special fields to the database records, giving each record a time-related validity
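
A minimal sketch of the validity-field approach, assuming two hypothetical fields `valid_from` and `valid_to` on each dimension record (this pattern is commonly called a Type 2 slowly changing dimension). The employee names, departments and dates are invented for the example.

```python
from datetime import date

# Hypothetical employee dimension: each record carries a validity interval.
# valid_to is None while the record is still current. A change (Anna moving
# to IT) closes the old record and adds a new one instead of updating in place.
employee_dim = [
    {"employee": "Anna", "dept": "Sales",
     "valid_from": date(2009, 1, 1), "valid_to": date(2010, 6, 30)},
    {"employee": "Anna", "dept": "IT",
     "valid_from": date(2010, 7, 1), "valid_to": None},
    {"employee": "Marco", "dept": "Sales",
     "valid_from": date(2010, 3, 1), "valid_to": None},
]


def as_of(records, day):
    """Return the records that were valid on the given day."""
    return [
        r for r in records
        if r["valid_from"] <= day and (r["valid_to"] is None or day <= r["valid_to"])
    ]


def headcount(records, day):
    """Count distinct employees on the given day."""
    return len({r["employee"] for r in as_of(records, day)})


count_mid_2009 = headcount(employee_dim, date(2009, 6, 1))
count_end_2010 = headcount(employee_dim, date(2010, 12, 1))
```

Because old records are never overwritten, the same data can answer both "how many employees were there in mid-2009?" and "which department was Anna in at the end of 2010?".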

Page 12: Business Intelligence Open Source

MDX

● Multidimensional Expressions (MDX) is a query language for OLAP databases

● MDX is to OLAP databases what SQL is to OLTP databases

● Powerful for computing indexes and navigating through OLAP dimensions

● Example:

SELECT {[Measures].[Store Sales]} ON COLUMNS,
       {[Date].[2002], [Date].[2003]} ON ROWS
FROM Sales
WHERE ([Store].[USA].[CA])
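
To make the roles of the two axes and the slicer more concrete, here is a rough Python imitation of what the query above asks for. It is not an MDX engine: the cell values, the `sales` dictionary and the `query` helper are hypothetical and exist only to show the shape of the result.

```python
# Hypothetical cells of a "Sales" cube: (year, country, state) -> Store Sales.
sales = {
    ("2002", "USA", "CA"): 100.0,
    ("2002", "USA", "WA"): 50.0,
    ("2003", "USA", "CA"): 100.0,
    ("2004", "USA", "CA"): 90.0,
}


def query(cells, years, country, state):
    """Imitate the MDX query: one row per requested year (ROWS axis),
    one column for the Store Sales measure (COLUMNS axis),
    restricted to a single store member (the WHERE slicer)."""
    result = []
    for year in years:
        total = sum(
            value
            for (y, c, s), value in cells.items()
            if (y, c, s) == (year, country, state)
        )
        result.append((year, total))
    return result


rows = query(sales, ["2002", "2003"], "USA", "CA")
```

The result has one row per year listed on the ROWS axis and a single measure column; cells outside the slicer (other states, other years) do not contribute.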

Page 13: Business Intelligence Open Source

Features for a BI platform

● Data storage, data management

● Data integration, process scheduling

● Querying and reporting

● On-Line Analytical Processing (OLAP)

● Document management, versioning

● Statistical computations

● Microsoft Office or OpenOffice support

● Ease of use and self-service document creation by end users (independence from developers)

Page 14: Business Intelligence Open Source

Dashboards, KPIs

Page 15: Business Intelligence Open Source

Geoanalysis

Page 16: Business Intelligence Open Source

Data Mining

● Requires a strong background in computational statistics

Page 17: Business Intelligence Open Source

What-if analysis

Page 18: Business Intelligence Open Source

Open Source offers

● Reporting

● OLAP

● Charts

● Portal containers

● Data integration tools

● Libraries, CMS, scheduler

● Databases

Page 19: Business Intelligence Open Source

SpagoBI (BI Suite)

● Engineering Informatica (Italy)

● Integration of components using drivers

● Comprehensive

● Fully Open Source

Page 20: Business Intelligence Open Source

Pentaho (BI Suite)

● Pentaho (USA)

● Acquisition instead of integration

● Strong marketing

● Commercial and Open Source

Page 21: Business Intelligence Open Source

JasperServer (BI Suite)

● JasperSoft (USA)

● Famous for JasperReports

● Easy to use

● Commercial and Open Source

Page 22: Business Intelligence Open Source

Palo (In memory OLAP)

● Jedox (Germany)

● Interesting technology (M-OLAP, GPU)

● Excel and OpenOffice plugins

● Web spreadsheet and reporting

● Open Source and Commercial support

Page 23: Business Intelligence Open Source

Talend (Data Integration)

● Talend (France)

● Gartner "Cool Vendor" for Data Integration

● Data Integration, Data Quality, Data Management, ESB

● Open Source and Commercial support

