IBM Cloud Forum, 20 November 2019, New Cap Event Center, Paris: Unlock the full value of your data with Cloud Pak for Data


Page 1

IBM Cloud Forum

20 November 2019, New Cap Event Center, Paris

Unlock the full value of your data with Cloud Pak for Data

Page 2

Cloud Pak for Data: the collaborative platform for Data & AI services

Corinne Baragoin

Data Architect

[email protected]

IBM

Page 3

82% are concerned with [data] connectivity across clouds

Data, AI, Cloud

66% of cloud workloads will be AI-driven

94% use multiple cloud platforms

Data is what fuels digital transformation

Digital transformation requires unified Data + AI + Cloud services spanning an open, hybrid cloud environment

Source: IBM MD&I: BCG and McKinsey, MIT Sloan, Forrester, LogicMonitor

Modern Architecture

Page 4

A faster, more secure way to move your core business applications to any cloud through enterprise-ready containerized software solutions

Cloud Paks – Enterprise-ready containerized software

IBM containerized software: packaged with open source components, pre-integrated with the common operational services, and secure by design

Container platform and operational services: logging, monitoring, security, identity access management

IBM Cloud, Private, Systems, Edge

Complete yet simple: application, data and AI services that are modular, term licensed, and easy to consume

IBM certified: full software stack support, and ongoing security, compliance and version compatibility

Run anywhere: on-premises, on private and public clouds, and in pre-integrated systems

Page 5

Cloud Paks – Pre-integrated for cloud use cases

Today, IBM offers clients the first six Cloud Paks…

Each Cloud Pak sits on the container platform and operational services:

• Cloud Pak for Applications: Developer & DevOps Tools; Modernization Toolkit; Frameworks and Runtimes
• Cloud Pak for Data: Collect; Organize; Analyze
• Cloud Pak for Integration: API Lifecycle; Messaging and Events; App and Data Integration
• Cloud Pak for Automation: Workflow and Decisions; Operational Intelligence; Content
• Cloud Pak for Multicloud Management: App and Infrastructure; Multicluster; Security and Compliance Management
• Cloud Pak for Security: Incident Response; Federated Search and Investigation; Security Orchestration and Automation

IBM Cloud, Systems, Edge, Private

Page 6

Data & AI Challenges

Data Ecosystem
• Data in silos
• Difficult to access
• No lineage

Analytics Tools
• Discrete tools
• Different preferences
• Difficult to manage

Workflow
• Not integrated
• Not governed
• Lack dev/prod parity

Culture
• Not collaborative
• Slow provisioning
• Lack trust in AI

“No amount of AI algorithmic sophistication will overcome a lack of data [architecture] … bad data is simply paralyzing”

Page 7

Modernize your applications

[Diagram: two delivery loops, Continuous Delivery of Applications and Continuous Delivery of Insights. Labels: New Requirements & Engagement, Development, Test, App Deployment, Monitoring, Retraining, Search for Data, Acquiring Data/Self Service, Refining Data, Model building, ML Model Deployment, Data (Hadoop, EDW, NoSQL, Data Lake), Multi-Cloud, Governance, Microservices & APIs]

Page 8

COLLECT - Make data simple and accessible

ORGANIZE - Create a business-ready analytics foundation

ANALYZE - Build and scale AI with trust & transparency

Data of every type, regardless of where it lives

INFUSE - Operationalize AI throughout the business

AI

MODERNIZE - Unlock the value of data for an AI and multicloud world

One Platform, Any Cloud

The IBM Data & AI Ladder

Page 9

The IBM Data & AI Offering

Watson Data Science Platform (DS/AI): Watson Studio, Watson Knowledge Catalog, Watson ML, Watson OpenScale, Decision Optimization, SPSS

Watson Apps & Solutions: Watson Assistant, Watson Discovery, Watson APIs, Cognos Analytics, Planning Analytics, BPM, GBS

Integration & Governance: InfoSphere Family, Watson Knowledge Catalog

Hybrid Data Management: Db2 Family, Hadoop/Cloudera, MongoDB

LOB Execs and Business Analysts

Data Stewards, Data Engineers, Compliance Officers

Data Engineers, Data Developers and Data Admins

Data Scientists, AI Developers, Programmatic Analysts

Cloud Pak for Data: CIO, CTO, CDO, Cloud & Data Architects

Speed time to value with prebuilt AI apps

Collect and manage hybrid Data of all types

Deploy a unified Data & AI Cloud Platform

One Platform, Any Cloud

Page 10

IBM Cloud Pak for Data : Collaborate

Make your data ready for an AI and hybrid multi-cloud world

Collect & Connect: Make data simple and accessible
Organize: Create a business-ready analytics foundation
Analyze: Build and scale AI with trust and explainability
Infuse: Operationalize AI throughout the business

App Developer, Data Scientist, Data Engineer, Data Steward, Business Analyst

✓ Eliminate data silos
✓ Connect all data
✓ Utilize fit for workload data repositories
✓ Integrate & Refine
✓ Automate and govern the data & AI lifecycle
✓ Index and enrich assets
✓ Analyze using open source and visual tooling
✓ Explore and visualize
✓ Build analytical models
✓ Deploy models as scalable web services
✓ Operationalize AI
✓ Measure and track AI outcomes

Virtualize, Enterprise Catalog, Watson Services, Web service, Applications, Business processes

Multicloud Services
• Logging
• Monitoring
• Metering
• Persistent Storage
• Identity Access Mgmt.
• Docker Registry / Helm
• Kubernetes
• Security

© 2019 IBM Corporation

Page 11

IBM Cloud Pak for Data

Unified platform of foundational Data & AI cloud services

IBM, IBM BP, IBM OS, ISV, DSP, Custom

On-Premise

KUBERNETES BASED: Containerized, easy to manage

PICK YOUR CLOUD: Private or Public

PICK YOUR ADD-ON: Containerized Services

DATA PLATFORM: #1 Ranked by Forrester

Customize & Extend with add-on microservices

Built for Multi-cloud: Avoid vendor lock-in & get started on your cloud journey today

Page 12

IBM Cloud Pak for Data Key Principles

Governance
• Omnipresent, yet invisible – infused throughout

Pre-Integrated Experiences
• Common look-and-feel with customizable persona-based workflows
• Automated common services – user management, authentication models, security configurations, provisioning, collaboration, etc.

Deploy Anywhere
• Cloud-native architecture
• Cloud agnostic & multi-cloud – any vendor cloud or data center
• Hyper-converged system available
• Coherent, efficient, and scalable data & analytics services

Page 13

Open by Design

A modern information architecture meets open source, on any cloud, working as one

The innovation & skills of Open Source Communities

IBM Cloud

One Open Platform, Any Cloud: built upon cloud, data and AI open source frameworks

Page 14

Cloud Pak for Data Packaging

1. Cloud Pak for Data Services (Collect, Organize, Analyze, Deploy)

• Data virtualization
• Db2 Warehouse
• Governance
• Data Discovery
• Data Visualization & Dashboards
• Data Science: Model Deployment

Powered by: new Db2 Technology & Db2 Warehouse
Powered by: Information Analyzer, IGC & DataStage
Powered by: Cognos CDE
Powered by: Watson Studio

2. Premium Cartridge Services (Purchase license or BYOL) (Collect, Organize, Analyze, Infuse)

• Db2 AESE
• Db2 BigSQL
• InfoSphere DataStage for CPD
• InfoSphere Regulatory Accelerator
• InfoSphere multi-cloud Data Mvmnt
• InfoSphere Entity Resolution
• Cognos Analytics
• Watson Studio Premium (SPSS Modeler, Model builder, Decision Optimization, Watson Explorer, Streams Designer)
• Auto AI
• Watson Bundles (Discovery, Assistant, Speech to Text, Natural Language Understanding, API Kit, Watson Knowledge Studio)

3. Third Party Add-Ons

IBM Streams, Db2 Event Store

PostgreSQL

Data Science: Model Monitor with Open Scale

Watson Knowledge Catalog

Data Science: Model Build
• Spark
• Python 2.7 with Anaconda, R, Scala
• Python 3.5 with TensorFlow GPU
• Apache Zeppelin
• RStudio etc.

Foundation
• Logging
• Monitoring
• Metering
• Persistent volume / Storage
• Identity Access Mgmt.
• Docker registry / Helm chart mgmt.
• Kubernetes
• Security

Page 15

Use Cases

Manage your Data Anywhere
1. Manage all your enterprise data regardless of where it lives (Data Virtualization)
2. Gain control & leverage your data from connected devices (Fast data & Streaming analytics)

Smarter Governance
Governance to enable self service analytics
Auto-discover metadata, manage governance rules & policies, enforce privacy etc. to mitigate risk & ensure compliance
e.g. accelerate GDPR Compliance

Operationalize Data Science & AI
Build, deploy, manage & govern models at scale to improve business outcomes
e.g.
a. Customer Churn
b. Cross Sell / Up Sell
c. Predictive Maintenance

Shift to Cloud Native
a. Provision & scale in minutes
b. Build once, deploy anywhere – multi cloud support
c. Built in automation & collaboration to increase productivity

Shift to Next-Gen Workloads

Page 16

Get Business Ready Data

IBM Cloud Pak for Data

Data Engineering, Data Governance Teams, Data Consumers (LOB)

DG Technical Users, DG Business Users, Data Scientists, Business Users

Data Architects, ETL Developers, Quality Developers, Repository Managers, Quality Developers, Data Stewards, LOB Risk, CDO, LOB Product, Data Scientists, Data Analysts, Business Analysts

Data Architects and Engineers

Data Ingestion
• Extract data
• Collect metadata
• Move data
• Ingest data

Data Transformation
• Build integration jobs
• Run integration jobs
• Monitor

Data Quality
• Profile data
• Understand data quality
• Classify data
• Build validation rules
• Apply validation rules
• Monitor data quality
• Remediate data quality

Data Governance Technical Users
• Discover metadata assets
• Classify data assets
• Build data glossary
• Create data lineage
• Manage metadata repository

Data Governance Business Users
• Business lineage*
• Reference data management*
• Data ownership
• Data stewardship
• Data governance workflow

Activate and Exploit The Data
• Search and find relevant data
• Data Preparation
• Consume and analyze the data
• Comment, rate and share

Page 17

Understand Your Data Anywhere

Watson Knowledge Catalog

One Application. Delivering trusted data to the Enterprise.

Watson Knowledge Catalog: Quality | Governance | Catalog

• Data Lineage: Operational Lineage; Asset Lineage; AI Lineage
• Self Service: Shopping for data; Data Preparation
• Advanced Data Discovery: Data Source Detection; Business Term Detection; Advanced Classification
• Data Quality: Data Rules; Data Quality Analysis; Dashboards
• Business Glossary: Term Management; Hierarchy management; Reference Data
• Governance: Policy Management; Governance Rules; Policy Enforcement

Enterprise Data Integration, Enterprise Data Governance, Enterprise Data Quality, Enterprise Data Consumption

Business Ready Data Foundation

Page 18

The Ladder to AI

Multicloud Data & AI Platform

IBM Cloud Pak for Data

Unlock the value of your data and accelerate your journey to AI

Everything you need for enterprise data science and AI

• Prepare and Organize Data (Watson Knowledge Catalog): Data Profiling & Prep; Quality & Lineage; Policy-based Governance
• Build and Train AI Models (Watson Studio): Visual Design; Develop & Train; Lifecycle Mgmt
• Deploy and Run AI Models (Watson Machine Learning): Run & Optimize; Model-ops; Dynamic Retraining
• Manage and Operate Trusted AI (Watson OpenScale): KPIs & Accuracy; Explainability & Lineage; Automated Optimization
• Automate and Industrialize AI: AutoAI Lifecycle Automation – “AI developing AI”

ICP4Data System

Use Cases

Watson Services & Applications

Page 19

Pre-Integrated Tools, Algorithms, Libraries for Data Science, ML/DL

• Best-of-breed tooling from open ecosystem
• Authoring tools
• Machine learning, deep learning, optimization
• Customize environments, packages & images
• Coding and visual modeling options
• Cloud Pak infrastructure
• Container-based resource management
• Scale with distributed and GPU support
• Model Lifecycle Management
• Dev -> Test -> Staging -> Prod
• Versioning, release, SLAs, rolling upgrades
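
The Dev -> Test -> Staging -> Prod progression with versioning can be pictured as a small promotion state machine. The following is only a minimal sketch in plain Python, not Cloud Pak for Data's actual API: the `ModelVersion` class, the lowercase stage names, and the `promote` method are hypothetical names chosen for illustration.

```python
from dataclasses import dataclass, field

# Hypothetical stage names mirroring "Dev -> Test -> Staging -> Prod" on the slide.
STAGES = ("dev", "test", "staging", "prod")


@dataclass
class ModelVersion:
    """Illustrative record of one versioned model moving through the stages."""
    name: str
    version: int
    stage: str = "dev"
    history: list = field(default_factory=list)  # stages already passed through

    def promote(self) -> str:
        """Move the model to the next stage; refuse to go past production."""
        position = STAGES.index(self.stage)
        if position == len(STAGES) - 1:
            raise ValueError(f"{self.name} v{self.version} is already in prod")
        self.history.append(self.stage)
        self.stage = STAGES[position + 1]
        return self.stage


# Usage: promote version 1 of a churn model one stage at a time.
churn = ModelVersion(name="customer-churn", version=1)
while churn.stage != "prod":
    churn.promote()
print(churn.stage, churn.history)
```

Keeping the stage order in a single tuple means a model cannot skip a stage, which is one simple way to picture governed promotion; a real deployment would add approvals and rollback.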

Page 20

Governed AI Lifecycle Management on Cloud Pak for Data

Build (Watson Studio): Data Exploration, Data Preparation, Model Development

Run (Watson Machine Learning): Deployment, Model Management, Retraining

Watson OpenScale: Business KPIs and production metrics, Fairness & Explainability, Inputs for Continuous Evolution

Infuse

Easily deploy models to WML for online, batch, streaming deployments

Monitor and orchestrate models served with WML

Rebuild models, improve performance and mitigate bias

Page 21

What does AutoAI do?

• Integrated with Watson Studio and Watson Machine Learning

• Automatically ingest, clean, transform, and model with hyperparameter optimization

• Training feedback visualizations provide real-time results to see model performance

• One-click deployment to Watson Machine Learning

https://www.ibm.com/demos/collection/IBM-Watson-Studio-AutoAI/

AutoAI : automation of machine learning tasks
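
To make "clean, transform, and model with hyperparameter optimization" concrete, here is a toy sketch in plain Python. It is not AutoAI's algorithm or API. It drops incomplete rows, fits a one-feature ridge regression for several candidate penalty values, and keeps the value with the lowest hold-out error. The function names (`clean`, `fit_ridge`, `auto_fit`) and the candidate `alphas` are assumptions made for this example.

```python
import random
from statistics import mean


def clean(rows):
    """Drop rows with a missing feature or target (the 'clean' step)."""
    return [(x, y) for x, y in rows if x is not None and y is not None]


def fit_ridge(train, alpha):
    """Closed-form one-feature ridge regression; returns (slope, intercept)."""
    mx = mean(x for x, _ in train)
    my = mean(y for _, y in train)
    sxy = sum((x - mx) * (y - my) for x, y in train)
    sxx = sum((x - mx) ** 2 for x, _ in train)
    slope = sxy / (sxx + alpha)  # alpha shrinks the slope toward zero
    return slope, my - slope * mx


def mse(model, rows):
    """Mean squared error of a (slope, intercept) model on rows."""
    slope, intercept = model
    return mean((slope * x + intercept - y) ** 2 for x, y in rows)


def auto_fit(rows, alphas=(0.0, 0.1, 1.0, 10.0, 100.0), seed=0):
    """Search over alphas with an 80/20 hold-out split, then refit on all data."""
    data = clean(rows)
    random.Random(seed).shuffle(data)
    split = int(0.8 * len(data))
    train, valid = data[:split], data[split:]
    scored = [(mse(fit_ridge(train, a), valid), a) for a in alphas]
    best_mse, best_alpha = min(scored)
    return best_alpha, fit_ridge(data, best_alpha), best_mse


# Usage: points on the line y = 2x + 1, plus two incomplete rows to be cleaned.
rows = [(float(i), 2.0 * i + 1.0) for i in range(20)] + [(None, 3.0), (4.0, None)]
best_alpha, (slope, intercept), valid_mse = auto_fit(rows)
print(best_alpha, round(slope, 3), round(intercept, 3))
```

The search here is an exhaustive grid over five values; automated tools typically explore far larger spaces of pipelines and parameters, which this sketch does not attempt.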

Page 22

Watson APIs Suite on Cloud Pak for Data

Cloud Pak for Data

• Watson Assistant
• Watson Discovery
• Watson API Kit
• Watson Knowledge Studio
• Speech To Text
• Natural Language Understanding
• Text To Speech

Page 23

Cloud Analyst Summit / Data & AI / July 2019 / © 2019 IBM Corporation

USE CASE

Provide Nedbank’s customers access to fully functional ATMs at all times

• Better predict cash-outs & machine failures

• Optimize service and cash replenishment schedules

Optimize Nedbank’s ATM Experience

CASE STUDY

“Even with a team of experienced data scientists on the ground, IBM was able to augment my team, provide strong technical leadership, and put in place a strong practice to set us up for quick delivery, but also enabling us for success in the future.”

Guy Taylor, Head of Data & Data-Driven Intelligence, Nedbank

EXPECTED BENEFIT

Improve customer experience

Reduce planning cycle

Reduce replenishment & service costs

UNIQUE CHALLENGE

Difficult to predict machine fault category

Lengthy planning cycle due to uncertain travel times, custodian skills & availability

Page 24

Turbo Charged Digital Transformation

Sprint chose Cloud Pak for Data because it enables AI projects in weeks rather than months through unifying and simplifying three critical stages in the journey to AI: the collection, organization and analysis of data.

Facing its own path toward digital transformation, Sprint started preparing its data for Artificial Intelligence (AI) – with the goal of using machine learning algorithms to gain quicker insights and increase responsiveness to customers.


Solution

Cloud Pak for Data

Read the story in any of these magazines: Business Chief US, Business Chief Canada, Gigabit Magazine

Modernize

Unlock the value of your data for an AI and multicloud world

Michelle Gehl, VP Networks OSS Applications and Operations, Sprint

IBM Cloud Pak for Data enabled Sprint to digest high volumes of data for near, real-time ML/AI analysis, and the trial results have shown potential to take Sprint to the next phase of digital transformation.

Industry: Telecommunications
Geography: North America

Sprint

Page 25

Is AI your priority? Start with a data strategy

The union between IBM and Intel is supercharging the ability of data scientists to drive better insight and better business outcomes in a way that has never been seen before.

Intel and IBM are great partners and closely aligned in becoming more data-centric.

Intel's participation and contribution is meaningful because customers can run Cloud Pak for Data at speed on their Intel-based infrastructure


Solution

Cloud Pak for Data

Read the blog and watch the video

Modernize

Unlock the value of your data for an AI and multicloud world

Melvin Greer, Senior Principal Engineer and Chief Data Scientist - Americas, Intel Corporation

IBM Cloud Pak for Data is really important because it helps to do a couple of things that are mind blowing for data scientists — auto discovery of data and rapid integration of hyper relevant data.

Industry: Technology
Geography: North America

Intel

Page 26

Client Value Proposition

Why?

Data & security regulation requirements around data governance, data privacy, and security
– New data protection regulations around the world: GDPR, etc.
– Increasing focus on IT security requirements

Faster data access for analytics and Line of Business
– Average customer has 168 different data sources they want to use within analytics
– 81% of businesses are having issues preparing data required for AI

AI trust and explainability
– 85% of companies see AI as a strategic opportunity
– 60% of companies see regulatory constraints as barriers to implementing AI

Common experience across hybrid clouds; provision containerized data and AI services in minutes behind your firewall, instead of taking weeks
– Only 20% of workloads have moved to the cloud
– 75% of enterprises will modernize existing applications over the next 3 years

IBM Differentiators

Data Governance
– Data privacy and governance by design: data discovery and curation, with policy and rules management
– Metadata management and shopping for data
– Smarter compliance: Regulatory ML, Accelerators, FISMA HIGH certification, etc.

Data Virtualization
– Query all of your data sources as one
– Governance, security, and scalability by design
– 5X faster data access; 40X faster than federation

Governing & Operationalizing AI
– Governed AI lifecycle management
– Quality-of-Service optimization
– AI model trust and explainability

Delivery Models
– OpenShift enables consistent hybrid deployment patterns with any cloud (public, private or on-premises)
– Hyper-converged System offering combines storage, compute, networking, and software into a single system to reduce complexity and increase scalability
– Targeted professional services offerings to accelerate use case execution
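
The idea behind "Query all of your data sources as one" can be illustrated with a small stand-in. The sketch below uses Python's built-in `sqlite3` module and SQLite's `ATTACH DATABASE` to join tables that live in two separate databases through one connection. It is an analogy only and says nothing about how Cloud Pak for Data implements Data Virtualization; the `crm` and `billing` schema names, the tables, and the sample rows are all hypothetical.

```python
import sqlite3


def build_sources() -> sqlite3.Connection:
    """Attach two independent in-memory databases to a single connection."""
    conn = sqlite3.connect(":memory:")
    conn.execute("ATTACH DATABASE ':memory:' AS crm")
    conn.execute("ATTACH DATABASE ':memory:' AS billing")
    conn.execute("CREATE TABLE crm.customers (id INTEGER PRIMARY KEY, name TEXT)")
    conn.execute("CREATE TABLE billing.invoices (customer_id INTEGER, amount REAL)")
    conn.executemany(
        "INSERT INTO crm.customers VALUES (?, ?)",
        [(1, "Alice"), (2, "Bob")],
    )
    conn.executemany(
        "INSERT INTO billing.invoices VALUES (?, ?)",
        [(1, 100.0), (1, 50.0), (2, 40.0)],
    )
    return conn


def revenue_by_customer(conn: sqlite3.Connection):
    """One SQL statement spanning both sources, as if they were one database."""
    query = """
        SELECT c.name, SUM(i.amount)
        FROM crm.customers AS c
        JOIN billing.invoices AS i ON i.customer_id = c.id
        GROUP BY c.name
        ORDER BY c.name
    """
    return conn.execute(query).fetchall()


# Usage: the caller never copies data between the two sources.
conn = build_sources()
result = revenue_by_customer(conn)
print(result)
```

The point of the sketch is that the join is written once against logical names, and the data stays where it lives.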

Page 27

Summary

IBM Cloud Pak for Data allows:

Multi-Cloud
• Optimized for fast & flexible data service lifecycle management
• Portability across on-premises, private clouds & public clouds
• Appliance available

Unified Console
• One experience across all data services
• Coherent, efficient, and scalable data services experience
• Ease of provisioning, monitoring, and more

Ecosystem & Virtualization
• IBM add-ons
• Open Ecosystem
• Data Virtualization

Data Governance & Self-Service Analytics
• Data automatically integrated with governance capabilities for catalog and search subject to policies & rules – enabling self-service data discovery

Page 28

Experience it! https://www.ibm.com/cloud/garage/cloud-pak-experiences-for-data/