Business Intelligence Open Source

Business Intelligence Open Source course, theory and principal vendors.

Text of Business Intelligence Open Source

  • 1. Business Intelligence

  • 2. History
      • The term "Business Intelligence" first appeared in 1958, used by Hans Peter Luhn, an IBM researcher
      • An automatic method to provide current-awareness services to scientists and engineers
      • Current definition: Business Intelligence is a combination of processes and technologies for gathering, storing, analyzing and providing access to information, to help enterprise users make informed decisions

  • 3. Main concept
      • Collect data from different sources
      • Integrate and clean up the data in a common, easy-to-analyze repository
      • Provide business-related analysis for managers and decision makers
      • Focus on business, data integration, ...

  • 4. Data warehouse
      • Bill Inmon: a collection of data in support of the decision-making process
      • End-user oriented
      • Collected from different sources
      • Time dependent
      • Data is not editable
      • In theory the term denotes a group of processes
      • In the real world it is often used for the database itself

  • 5. OLTP: On-Line Transaction Processing
      • Commonly used in ERP and CRM systems and in database applications
      • Focus on the transaction level (one invoice, one sales order, a search query, etc.)
      • Updates and insertions are frequent
      • Relational model with many tables, using normalization rules

  • 6. OLAP: On-Line Analytical Processing
      • A system designed for analysis purposes
      • Focused on exploring the data as a whole
      • Data, once added, changes far less frequently
      • 13 (12+0) rules of Dr. Codd (1993)
      • Multidimensional view
      • Intuitive data manipulation
      • Dimensions, facts, hierarchy levels, cardinality

  • 7. On-Line Analytical Processing

  • 8. Relational OLAP
      • Uses relational database schemas and SQL to store and access OLAP cubes
      • Reuses RDBMS technology
      • Many tools and vendors available
      • SQL can be used directly by many tools

  • 9. Star

  • 10. Memory OLAP, Hybrid OLAP
      • Memory OLAP uses optimized multidimensional arrays
      • Requires pre-computation and storage of the cube (processing)
      • Often performs better than ROLAP: better caching, multidimensional indexing
      • Compression techniques, statistical indexes
      • Less scalable than ROLAP on high volumes of data; fewer tools and vendors available
      • Hybrid OLAP (HOLAP) is the combination of ROLAP and ...

  • 11. Slowly Changing Dimensions
      • In some Business Intelligence implementations data is always added and almost never modified
      • This makes it possible to go back along the timeline
      • For example, if an employee was hired in a given time period, you can analyze the data as it was in that period, counting exactly the number of employees at that time
      • A common approach to implementing Slowly Changing Dimensions is to add some special fields to the database records, giving each record a time-related validity

  • 12. MDX
      • Multidimensional Expressions (MDX) is a query language for OLAP databases
      • MDX is to OLAP databases what SQL is to OLTP databases
      • Powerful for computing indexes and navigating through OLAP dimensions
      • Example query:

          SELECT
            {[Measures].[Store Sales]} ON COLUMNS,
            {[Date].[2002], [Date].[2003]} ON ROWS
          FROM Sales
          WHERE ([Store].[USA].[CA])

  • 13. Features for a BI platform
      • Data storage, data management
      • Data integration, process scheduling
      • Querying and reporting
      • On-Line Analytical Processing (OLAP)
      • Document management, versioning
      • Statistical computations
      • Microsoft Office or OpenOffice support
      • Easy to use, with end users able to create documents themselves (independence from developers)

  • 14. Dashboards, KPIs

  • 16. Data Mining
      • Requires a strong background in computational statistics

  • 17. What-if analysis

  • 18. Open Source offers
      • Reporting
      • OLAP
      • Charts
      • Portal containers
      • Data integration tools
      • Libraries, CMS, scheduler
      • Databases

  • 19. SpagoBI (BI Suite)
      • Engineering Informatica (Italy)
      • Integration of components using drivers
      • Comprehensive
      • Full Open

  • 20. Pentaho (BI Suite)
      • Pentaho (USA)
      • Acquisition instead of integration
      • Strong marketing
      • Commercial and Open Source

  • 21. JasperServer (BI Suite)
      • JasperSoft (USA)
      • Famous for JasperReports
      • Easy to use
      • Commercial and Open ...

  • 22. Palo (In-memory OLAP)
      • Jedox (Germany)
      • Interesting technology (M-OLAP, GPU)
      • Excel and OpenOffice plugins
      • Web spreadsheet and reporting
      • Open Source and Commercial support

  • 23. Talend (Data Integration)
      • Talend (France)
      • Gartner Cool Vendor for Data Integration
      • Data Integration, Data Quality, Data Management, ESB
      • Open Source and Commercial support
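
As a supplement to slides 8, 9 and 11 (Relational OLAP, the star schema and Slowly Changing Dimensions), the following is a minimal sketch of how these ideas might look in practice. It is not taken from the slides: the table names (dim_employee, fact_sales), the column names (valid_from, valid_to), the sample rows and the helper functions are all hypothetical, and SQLite via Python's standard sqlite3 module is used only because it needs no setup. The dimension table keeps one row per version of an employee with a validity interval, and the fact table points at the version that was valid when the sale happened.

```python
import sqlite3


def build_demo_warehouse() -> sqlite3.Connection:
    """Create a tiny in-memory star schema with hypothetical sample data."""
    conn = sqlite3.connect(":memory:")
    conn.executescript("""
        -- Dimension table: one row per *version* of an employee.
        CREATE TABLE dim_employee (
            employee_key INTEGER PRIMARY KEY,  -- surrogate key, one per version
            employee_id  TEXT NOT NULL,        -- business key, stable over time
            department   TEXT NOT NULL,
            valid_from   TEXT NOT NULL,        -- ISO date, inclusive
            valid_to     TEXT NOT NULL         -- ISO date, exclusive
        );
        -- Fact table: one row per sale, pointing at the dimension version
        -- that was valid on the sale date.
        CREATE TABLE fact_sales (
            employee_key INTEGER NOT NULL REFERENCES dim_employee(employee_key),
            sale_date    TEXT NOT NULL,
            amount       REAL NOT NULL
        );
    """)
    conn.executemany(
        "INSERT INTO dim_employee VALUES (?, ?, ?, ?, ?)",
        [
            # Employee E1 moved from Sales to Support on 2003-01-01:
            # the old row is closed and a new row is added, nothing is updated.
            (1, "E1", "Sales", "2002-01-01", "2003-01-01"),
            (2, "E1", "Support", "2003-01-01", "9999-12-31"),
            (3, "E2", "Sales", "2002-06-01", "9999-12-31"),
        ],
    )
    conn.executemany(
        "INSERT INTO fact_sales VALUES (?, ?, ?)",
        [
            (1, "2002-03-15", 100.0),
            (3, "2002-07-01", 50.0),
            (2, "2003-02-10", 70.0),
            (3, "2003-05-20", 30.0),
        ],
    )
    return conn


def headcount_on(conn: sqlite3.Connection, day: str) -> int:
    """Count the employees whose dimension row was valid on the given ISO date."""
    row = conn.execute(
        "SELECT COUNT(DISTINCT employee_id) FROM dim_employee "
        "WHERE valid_from <= ? AND ? < valid_to",
        (day, day),
    ).fetchone()
    return row[0]


def sales_by_department_and_year(conn: sqlite3.Connection) -> list:
    """ROLAP-style aggregation: join the fact table to the dimension and group."""
    return conn.execute("""
        SELECT d.department,
               substr(f.sale_date, 1, 4) AS sale_year,
               SUM(f.amount) AS total
        FROM fact_sales AS f
        JOIN dim_employee AS d ON d.employee_key = f.employee_key
        GROUP BY d.department, sale_year
        ORDER BY d.department, sale_year
    """).fetchall()


if __name__ == "__main__":
    warehouse = build_demo_warehouse()
    print(headcount_on(warehouse, "2002-03-01"))
    print(sales_by_department_and_year(warehouse))
```

Because rows are only ever added and closed, never overwritten, the sale made by E1 in 2002 remains attributed to the Sales department even after the move to Support. This is the "go back along the timeline" behaviour described in slide 11, under the assumptions stated above.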