Business Intelligence Open Source

DESCRIPTION

A course on open source Business Intelligence: theory and principal vendors.

1. Business Intelligence

2. History
  • The term "Business Intelligence" first appeared in 1958, used by Hans Peter Luhn, an IBM researcher
  • An automatic method to provide current-awareness services to scientists and engineers
  • Current definition: Business Intelligence is a combination of processes and technologies for gathering, storing, analyzing, and providing access to information, to help enterprise users make informed decisions

www.robertomarchetto.com

3. Main concept
  • Collect data from different sources
  • Integrate and clean up data in a common, easy-to-analyze repository
  • Provide business-related analysis for managers and decision makers
  • Focus on business, data integration, and data presentation

4. Data warehouse
  • Bill Inmon: a collection of data in support of the decision-making process
  • End-user oriented
  • Collected from different sources
  • Time dependent
  • Data is not editable
  • In theory, the term means a group of processes
  • In the real world, it is often used for the database itself

5. OLTP: On-Line Transaction Processing
  • Commonly used in ERP and CRM systems and in database applications
  • Focus on the transaction level (one invoice, one sales order, a search query, etc.)
  • Updates and insertions are frequent
  • Relational model with many tables, using normalization rules

6. OLAP: On-Line Analytical Processing
  • A system designed for analysis purposes
  • Focused on exploring the data as a whole
  • Data, once added, changes much less frequently
  • Dr. Codd's 12 rules for OLAP (1993)
  • Multidimensional view
  • Intuitive data manipulation
  • Dimensions, facts, hierarchy levels, cardinality

7. On-Line Analytical Processing

8. Relational OLAP
  • Uses relational database schemas and SQL to store and access OLAP cubes
  • Reuses RDBMS technology
  • Many tools and vendors available
  • SQL can be used directly by many tools
  • Scalability

9. Star schema

10. Memory OLAP, Hybrid OLAP
  • Memory OLAP (MOLAP) uses optimized multidimensional arrays
  • Requires pre-computation and storage of the cube (processing)
  • Often performs better than ROLAP: better caching, multidimensional indexing
  • Compression techniques, statistical indexes
  • Less scalable than ROLAP on high volumes of data; fewer tools and vendors available
  • Hybrid OLAP (HOLAP) is the combination of ROLAP and MOLAP

11. Slowly Changing Dimensions
  • In some Business Intelligence implementations, data is always added and almost never modified
  • This makes it possible to go back along the timeline
  • For example, if an employee was hired in a given time period, you can analyze the data as of that period, counting exactly the number of employees
  • A common approach to implementing Slowly Changing Dimensions is to add special fields to the database records, giving each record a time-related validity

12. MDX
  • Multidimensional Expressions (MDX) is a query language for OLAP databases
  • MDX is to OLAP databases what SQL is to OLTP databases
  • Powerful for computing indexes and navigating through OLAP dimensions

    SELECT
      {[Measures].[Store Sales]} ON COLUMNS,
      {[Date].[2002], [Date].[2003]} ON ROWS
    FROM Sales
    WHERE ([Store].[USA].[CA])

13. Features for a BI platform
  • Data storage, data management
  • Data integration, process scheduling
  • Querying and reporting
  • On-Line Analytical Processing (OLAP)
  • Document management, versioning
  • Statistical computations
  • Microsoft Office or OpenOffice support
  • Easy to use, with end users creating documents themselves (independence from developers)

14. Dashboards, KPIs

15. Geoanalysis

16. Data Mining
  • Requires a strong background in computational statistics

17. What-if analysis

18. Open Source offers
  • Reporting
  • OLAP
  • Charts
  • Portal containers
  • Data integration tools
  • Libraries, CMS, scheduler
  • Databases

19. SpagoBI (BI Suite)
  • Engineering Informatica (Italy)
  • Integration of components using drivers
  • Comprehensive
  • Fully Open Source

20. Pentaho (BI Suite)
  • Pentaho (USA)
  • Acquisition instead of integration
  • Strong marketing
  • Commercial and Open Source

21. JasperServer (BI Suite)
  • JasperSoft (USA)
  • Famous for JasperReports
  • Easy to use
  • Commercial and Open Source

22. Palo (In-memory OLAP)
  • Jedox (Germany)
  • Interesting technology (M-OLAP, GPU)
  • Excel and OpenOffice plugins
  • Web spreadsheet and reporting
  • Open Source and Commercial support

23. Talend (Data Integration)
  • Talend (France)
  • Gartner Cool Vendor for Data Integration
  • Data Integration, Data Quality, Data Management, ESB
  • Open Source and Commercial support
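
Appendix sketch: Relational OLAP on a star schema (slides 8, 9 and 12). The slides say that ROLAP stores the cube in relational tables and queries it with SQL. The example below is a minimal illustration of that idea, not taken from the deck: it uses Python's built-in sqlite3 module, and the table names (dim_date, dim_store, fact_sales), the columns, and the sample figures are all hypothetical. The query answers roughly the same question as the MDX example on slide 12: Store Sales per year for stores in USA / CA.

```python
import sqlite3


def build_star_schema(conn):
    """Create a tiny star schema: one fact table joined to two dimension tables.

    All table names, columns, and rows are invented for illustration.
    """
    conn.executescript("""
        CREATE TABLE dim_date  (date_id  INTEGER PRIMARY KEY, year INTEGER);
        CREATE TABLE dim_store (store_id INTEGER PRIMARY KEY, country TEXT, state TEXT);
        CREATE TABLE fact_sales (
            date_id     INTEGER REFERENCES dim_date(date_id),
            store_id    INTEGER REFERENCES dim_store(store_id),
            store_sales REAL
        );
    """)
    conn.executemany("INSERT INTO dim_date VALUES (?, ?)",
                     [(1, 2002), (2, 2003)])
    conn.executemany("INSERT INTO dim_store VALUES (?, ?, ?)",
                     [(1, "USA", "CA"), (2, "USA", "WA")])
    conn.executemany("INSERT INTO fact_sales VALUES (?, ?, ?)",
                     [(1, 1, 100.0), (1, 1, 50.0), (2, 1, 70.0), (1, 2, 999.0)])


def store_sales_by_year(conn, country, state):
    """Sum the store_sales measure per year, sliced by one store region."""
    rows = conn.execute("""
        SELECT d.year, SUM(f.store_sales)
        FROM fact_sales f
        JOIN dim_date  d ON d.date_id  = f.date_id
        JOIN dim_store s ON s.store_id = f.store_id
        WHERE s.country = ? AND s.state = ?
        GROUP BY d.year
        ORDER BY d.year
    """, (country, state)).fetchall()
    return dict(rows)


conn = sqlite3.connect(":memory:")
build_star_schema(conn)
print(store_sales_by_year(conn, "USA", "CA"))  # {2002: 150.0, 2003: 70.0}
```

The fact table holds only keys and measures, while the descriptive attributes live in the dimension tables. That is why the slice (WHERE on the store dimension) and the grouping (year from the date dimension) are both expressed as joins.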
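
Appendix sketch: Slowly Changing Dimensions (slide 11). The slide describes adding special fields that give each record a time-related validity. One possible reading of that idea is sketched below, under stated assumptions: the field names valid_from and valid_to, the EmployeeRecord type, and the helper functions are hypothetical and do not come from the deck. A change never overwrites a record's attributes; it closes the validity of the current version and appends a new one, so a headcount can be computed as of any past date.

```python
from dataclasses import dataclass
from datetime import date
from typing import List, Optional


@dataclass
class EmployeeRecord:
    employee_id: int
    department: str
    valid_from: date           # first day this version is valid
    valid_to: Optional[date]   # first day it is no longer valid; None = still current


def change_department(history: List[EmployeeRecord], employee_id: int,
                      new_department: str, when: date) -> None:
    """Close the employee's current version and append a new one."""
    for rec in history:
        if rec.employee_id == employee_id and rec.valid_to is None:
            rec.valid_to = when
    history.append(EmployeeRecord(employee_id, new_department, when, None))


def headcount(history: List[EmployeeRecord], department: str, as_of: date) -> int:
    """Count the record versions for `department` that were valid on `as_of`."""
    return sum(
        1 for rec in history
        if rec.department == department
        and rec.valid_from <= as_of
        and (rec.valid_to is None or as_of < rec.valid_to)
    )


history = [
    EmployeeRecord(1, "Sales", date(2002, 1, 1), None),
    EmployeeRecord(2, "Sales", date(2002, 6, 1), None),
]
change_department(history, 2, "Marketing", date(2003, 3, 1))

print(headcount(history, "Sales", date(2002, 12, 31)))  # 2
print(headcount(history, "Sales", date(2003, 12, 31)))  # 1
```

Only the validity field of the old version is touched. Its descriptive attributes stay as they were, which is what makes the analysis "as of that period" possible.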