The Open Analytics Platform Bernd Wiswedel KNIME.com AG Copyright © 2014 KNIME.com AG

The Open Analytics Platform - GitHub Pages · Title KNIME_Intro_BWiswedel_22Oct2014 Author: wiswedel Created Date: 10/29/2014 12:00:00 AM

Agenda

• KNIME.com AG

• The KNIME Platform

• Recognition

• Small Sales Pitch

• KNIME and R – the best of two worlds

• KNIME (Node) Development

A Brief History of KNIME

• 2004: KNIME development commences

• 2006: KNIME v1 released

• 2006: Spin-off in Konstanz, Germany

• 2008: KNIME moves to Zurich

• 2010: Enterprise products released

• 2011: KNIME.com AG founded

• 2013: KNIME opens San Francisco office

• 2014: KNIME opens Berlin office

“KNIME saved my life in a world of scripts that I do not want to learn!”

The KNIME Platform

Data Loading

KNIME loads and integrates data from diverse data sources:

• Different databases

• Various file formats (CSV, XML, SDF, etc.)

Data Loading → ETL

KNIME provides a huge repository of modules for easy-to-use, modular:

• Data preprocessing

• Data fusion

• Data transformation

Data Loading → ETL → Data Mining

In addition to standard data mining techniques, KNIME adds cutting edge data analysis algorithms (…thanks to its academic roots).

Data Loading → ETL → Data Mining → Visualization

Interactive views provide data overviews and insights into the learned models.

Interactive linking & brushing techniques allow for powerful exploration of models and data.

Data Loading → ETL → Data Mining → Visualization → External Tools

Due to its open API and “node-in-a-sandbox” approach, additional (also external) tools are easily integrated, e.g.

• Access to the R Project (statistical analysis/visualizations)

• Complete integration of the machine learning library WEKA

• Application-area-specific integration, e.g. CDK (Chemistry Development Kit), RDKit, ImageJ, …

KNIME is Eclipse-based: integrating other Eclipse projects such as BIRT, DTP, etc. provides even more functionality.

Over 1000 native and embedded nodes included:

• Statistics; Data Mining; Machine Learning; Web Analytics; Text Mining; Network Analysis; Social Media Analysis; WEKA; R; Community / 3rd

• MySQL, Oracle, etc.; SAS, SPSS, etc.; Excel, Flat, etc.; Hive etc.; XML, PMML; Text, Doc, Image; Web Crawlers; Industry Specific; Community / 3rd

• ETL; Row, Column; Matrix; Text, Image; Time Series; Java; Python; Community / 3rd

• R; JFreeChart; Community / 3rd

• via BIRT; PMML; XML; Databases; Excel, Flat, etc.; Hive etc.; Text, Doc, Image; Industry Specific; Community / 3rd

Visualization → External Tools → 3rd Party Tools

Commercial partners integrate their proprietary tools

⇒ KNIME serves as an integration platform for tools of various vendors (or your in-house/legacy applications)

Small KNIME Demo

Who’s Using KNIME?

>25,000 Individuals using KNIME

>3,000 Organizations using KNIME

>300 Customers paying for KNIME

as of January 2014

[Chart: Annual Unique Downloads (Open Source Users), 2011 to 2013; axis marks at 20k, 40k, 60k]

Broad Range of KNIME Application Areas

• Advanced Analytics

• Pharma

• Health Care

• Manufacturing

• Finance

• Retail

• Customer Intelligence

Top in User Satisfaction

2012 & 2013 Rexer Analytics Survey

Sales Pitch: The KNIME Server at Work

KNIME in Action: Big Data

“As long as your machine can handle it, KNIME will play along.”

KNIME and Big Data

KNIME and R

The best of two worlds

Why use KNIME and R?

R

• Powerful statistics

• (B)leading edge algorithms

• Powerful/flexible graphics

• Widely accepted language

KNIME

• Powerful GUI

• Good Extract/Transform/Load

• Integrates diverse tools

• Enterprise grade solutions

R and KNIME

• Open source analytics

• Cross platform

• Vibrant communities

Two Integrations

• Community (RServe Integration)

• R Interactive (Today's topic)

Overview of new nodes

• Different input and output options

• Grey ports enable workspace branching

The Interactive Editor

[Screenshot: editor panels labeled Columns, Variables, Code Editor, Workspace Overview, Console]

[Screenshot: panels labeled Templates, List, Preview, Summary]

R Source nodes

• Get data from an R data frame

• Assign output to a data frame named knime.out

• Use with foreign, RCurl, or ...

R Snippet nodes

• Generic data manipulation

• Edit tables or workspaces

• Derive knime.out from knime.in

• Use for cumulative stats, plyr, or ...

R Data Mining nodes

• Use R models in KNIME

• Learner (knime.model) & Predictor motif

• R to PMML support for model portability

R View nodes

• Generic R plots

• plot(knime.in)

• Use with many packages including ggplot2

R in Action: Choropleth Generation

R in Action: Dose Response modeling

A Peek under the Hood:

KNIME (Node) Development

KNIME Analytics Platform: Technology Overview

[Diagram, top to bottom]

• KNIME Workflow Manager & User Interface

• Components, each with its own Node Interface and Data Mgmt & Execution Ctrl: KNIME I/O; KNIME Native Algorithms; Open Source Integrations (R, BIRT, …); Partner Extensions; Community Extensions

• KNIME Data Management and Execution Layer: Execution Control, Meta Data Handling, Data Management

• Side labels: Cluster Execution, Multi Core Execution, Distributed Data Storage, Distributed Execution, In Memory Data Handling, Automatic Data Caching, Data Type Extensions

Node Architecture

• KNIME interacts only with a Node

• Node takes care of embedding the node in the infrastructure

• New nodes implement Model/View/Dialog

[Class diagram]

• class Node (final)

• class NodeDialogPane (abstract)

• class NodeView (abstract)

• class NodeModel (abstract)

• class NodeFactory (abstract)
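
The Model/View/Dialog split listed above can be sketched in plain Java. This is only an illustration of the pattern under stated assumptions: every class and method name below (SketchNode, SketchNodeModel, SuffixNodeFactory, and so on) is hypothetical and does not reproduce the real KNIME API or its signatures. It shows a final wrapper class that the platform talks to, and three abstract parts that a new node supplies through a factory.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical stand-ins for illustration only; these are NOT the real KNIME API classes.

/** Does the node's actual work on the data. */
abstract class SketchNodeModel {
    abstract List<String> execute(List<String> inputRows);
}

/** Presents the model's latest output to the user. */
abstract class SketchNodeView {
    abstract String render(List<String> outputRows);
}

/** Collects user settings; reduced here to a single text value. */
abstract class SketchNodeDialogPane {
    abstract String currentSetting();
}

/** Bundles the three parts so the framework can create them on demand. */
abstract class SketchNodeFactory {
    abstract SketchNodeModel createModel(String setting);
    abstract SketchNodeView createView();
    abstract SketchNodeDialogPane createDialogPane();
}

/** Final wrapper: the only class the surrounding platform talks to. */
final class SketchNode {
    private final SketchNodeModel model;
    private final SketchNodeView view;
    private List<String> lastOutput = new ArrayList<>();

    SketchNode(SketchNodeFactory factory) {
        String setting = factory.createDialogPane().currentSetting();
        this.model = factory.createModel(setting);
        this.view = factory.createView();
    }

    List<String> execute(List<String> inputRows) {
        lastOutput = model.execute(inputRows);
        return lastOutput;
    }

    String openView() {
        return view.render(lastOutput);
    }
}

/** Example extension: a node that appends a configurable suffix to every row. */
final class SuffixNodeFactory extends SketchNodeFactory {
    @Override
    SketchNodeModel createModel(final String setting) {
        return new SketchNodeModel() {
            @Override
            List<String> execute(List<String> inputRows) {
                List<String> out = new ArrayList<>();
                for (String row : inputRows) {
                    out.add(row + setting);
                }
                return out;
            }
        };
    }

    @Override
    SketchNodeView createView() {
        return new SketchNodeView() {
            @Override
            String render(List<String> outputRows) {
                return outputRows.size() + " rows: " + String.join(", ", outputRows);
            }
        };
    }

    @Override
    SketchNodeDialogPane createDialogPane() {
        return new SketchNodeDialogPane() {
            @Override
            String currentSetting() {
                return "!";
            }
        };
    }
}

/** Small driver showing that callers only ever touch SketchNode. */
class NodeSketchDemo {
    public static void main(String[] args) {
        SketchNode node = new SketchNode(new SuffixNodeFactory());
        System.out.println(node.execute(Arrays.asList("a", "b")));
        System.out.println(node.openView());
    }
}
```

In this sketch the caller never touches the model, view, or dialog directly; it only hands a factory to the wrapper, which mirrors the slide's point that the platform interacts only with a Node.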

Node Extension Wizard

• Included in the KNIME Developer Version

• Allows creation of plugin projects including

functioning KNIME nodes (with sample code)

• Helpful to easily create all node classes

– Generates all Java classes

– Node is registered with the plugin project

– Launch KNIME and enjoy the new node working!

• Specify all settings to create a new KNIME node

– In a completely new plugin project, or

– Into an existing project

• Node type: Sink, Source, Learner, Predictor, Manipulator, Visualizer, Meta, or Other

• Include sample code or not

• Contains all Java classes (including sample code)

• Node is registered in the plugin.xml

• NodeDialog and NodeView classes are also created and registered with the NodeFactory

Resources

• KNIME pages (www.knime.org)

– APPLICATIONS for example workflows

– LEARNING HUB under RESOURCES www.knime.org/learning-hub

• KNIME Tech pages (tech.knime.org)

– FORUM for questions and answers

– DOCUMENTATION for documentation, FAQ, changelogs, ...

– LABS where to find new experimental nodes

– COMMUNITY CONTRIBUTIONS for development instructions and third party nodes

• KNIME TV channel on

Thank you
