DataOps In Action - Accelerating Business Value

Technical Roadmap – session 6922

Jean-Claude Mamou, STSM and Program Director, DataOps

Think 2020 / 6922 Technical Roadmap / May 2020 / © 2020 IBM Corporation



Page 2

Please note

IBM’s statements regarding its plans, directions, and intent are subject to change or withdrawal without notice and at IBM’s sole discretion.

Information regarding potential future products is intended to outline our general product direction and it should not be relied on in making a purchasing decision.

The information mentioned regarding potential future products is not a commitment, promise, or legal obligation to deliver any material, code or functionality. Information about potential future products may not be incorporated into any contract.

The development, release, and timing of any future features or functionality described for our products remains at our sole discretion.

Performance is based on measurements and projections using standard IBM benchmarks in a controlled environment. The actual throughput or performance that any user will experience will vary depending upon many factors, including considerations such as the amount of multiprogramming in the user’s job stream, the I/O configuration, the storage configuration, and the workload processed. Therefore, no assurance can be given that an individual user will achieve results similar to those stated here.

Page 3

The AI Ladder

A prescriptive approach to accelerating the journey to AI

COLLECT - Make data simple and accessible

ORGANIZE - Create a business-ready analytics foundation

ANALYZE - Build and scale AI with trust and transparency

INFUSE - Operationalize AI throughout the business

AI

MODERNIZE - Make your data ready for an AI and hybrid cloud world

One Platform, Any Cloud

Talent & Skills

Page 4

IBM Cloud Pak for Data

Simplifies, unifies and automates the AI Ladder

Analyze & Infuse: Plug and play 45+ data, analytics and AI apps. Manage your favorite open source capabilities alongside IBM’s market-leading differentiators.

Organize: Ingest, transform, catalog and govern all enterprise data, models, and rules, providing insights through a common experience.

OpenShift: Leverage the leading open source hybrid cloud platform to scale data & AI workloads.

Collect: Virtually connect, manage and query data & AI assets no matter where they live.

Run on any cloud: Decoupling enterprise data, analytics and AI will prevent lock-in and accelerate polyglot ecosystems.

Page 5

Getting Data to your AI Initiatives is Hard

Build, Run, Manage

Discover, understand, ingest, integrate, cleanse

*Source: Data scientist report, Figure Eight Inc

Where teams focus

Where 80% of time is spent

Where business impact is created


Page 7

Overall Themes Across DataOps


– Cloud Native and Cloud First

• Bring all the new WKC capabilities to our SaaS platform

• Bring DataStage to our SaaS platform

• Bring new capabilities using a cloud-first approach

• Support for multi-cloud SaaS runtime

– Feature Consolidation

• Consolidating existing capabilities into a set of modern cloud native micro-services

– Platform integration

• Deep integration with the Cloud Pak for Data platform

• Streamlined user experience

Cloud Pak for Data DataStage

Page 8

Watson Knowledge Catalog

Page 9

What’s New in Watson Knowledge Catalog in CPD v3.0

New look and feel
• New home page, color theme and layout to improve consistency and usability.

Globalization support
• Available in Group 1 languages and Russian.

Advanced data curation and data quality
• More accurate automatic term assignments through learning from rejected terms.
• Faster relationship analysis and overlap analysis by filtering out columns.
• View trends in Data Quality score over time by data asset and time interval.

Automatic data class creation
• Quickly create and assign a data class to clusters of similar columns using patent-protected Fingerprint algorithm.

Data protection rules that are more powerful and flexible
• Include Classifications in criteria when defining data protection rules.

Workflow enhancements
• Improved activity log for full history of governance artifacts (terms, policies etc).

Smarter global search
• Search suggestions based on results most relevant to the user.

New data sources and connections
• New connectors: Impala and Planning Analytics (TM1).
• Teradata and Files are synchronized from Information Assets to the default catalog.

Migrate assets from IBM InfoSphere Information Server v11.7.1.x
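To make the data protection rule item above concrete, here is a minimal Python sketch of a rule whose criteria include an asset classification. This is not Watson Knowledge Catalog's API: the `Rule` class, the `protect` function, the `"XXXX"` mask value and the example classification and data class names are all hypothetical and exist only for illustration.

```python
from dataclasses import dataclass


@dataclass
class Rule:
    """Hypothetical protection rule; not a WKC object model."""
    classifications: set   # asset classifications that trigger the rule
    data_classes: set      # column data classes the rule acts on
    action: str            # "deny" or "mask" in this sketch


def protect(row, column_classes, asset_classification, rules):
    """Return the row as a user would see it, or None if access is denied."""
    out = dict(row)
    for rule in rules:
        if asset_classification not in rule.classifications:
            continue  # rule criteria not met for this asset
        if rule.action == "deny":
            return None
        for col, data_class in column_classes.items():
            if data_class in rule.data_classes:
                out[col] = "XXXX"  # assumed mask value
    return out


rules = [Rule({"PII"}, {"Email Address"}, "mask")]
row = {"name": "Ada", "email": "ada@example.com"}
columns = {"email": "Email Address"}

masked = protect(row, columns, "PII", rules)
untouched = protect(row, columns, "Public", rules)
```

In this sketch the same row is masked when the asset is classified `PII` and returned unchanged when it is classified `Public`, which is the effect of adding classifications to the rule criteria.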

Page 10

Watson Knowledge Catalog on Cloud Pak for Data

2020/2021 Roadmap and Strategic Vision

Governance
• Simplified Experience for Policies and Rules
• Expand connections Ecosystem
• Regulatory Accelerator Enhancements
• Customization of views by persona
• Reference Data versioning
• Business Lineage
• Delete categories and their contents

Quality
• Profiling of unstructured data
• ML assisted processing time estimates
• DQ Remediation workflow
• Address parse/enhance/verify

Consumption
• Watson Assistant Integration
• GUI for creating custom assets
• Support external reporting and querying tools

Governance
• Migration of IS governance artifacts
• Data Protection rules in Data Virtualization
• Workflow customization for governance artifacts

Quality
• Enhanced learning for term suggestions
• View of data quality trends over time
• Data Rule Exception Management
• ‘Fingerprint’ data classes
• Simplified Discovery Experience
• WKC Instascan

Consumption
• New Connectors: SharePoint, Hive MetaStore, OracleBI, Impala, Planning Analytics

Overall
• New look and feel!
• Globalization for Brazilian Portuguese, English, French, German, Italian, Japanese, Russian, Simplified Chinese, Spanish, and Traditional Chinese

1H 2020 | 2H 2020 | 1H 2021

Governance
• Discovery and profiling of Unstructured Data
• Reference Data Set mapping, hierarchies & custom columns
• Workflow request management
• Permissions and workflow by categories
• AI model policies and rules
• Support for Knowledge Accelerators
• Custom Relationships

Quality
• Additional ML for DQ
• ML based data sampling

Consumption
• 3rd Party Data Accelerators/Providers
• Enhanced catalog for more asset types
• Restricted Asset Metadata Display
• Open Metadata Services
• Integration with ADP and Cognos
• Expanded asset and column metadata
• Model factsheets to document the AI lifecycle

Overall
• Support on Power
• Globalization for Swedish

Page 11

Reaching feature parity on Public Cloud

Phase 1 – Alignment of Public Cloud and Cloud Pak for Data + initial consolidation (2H 2020)
• Full alignment of governance artifacts components – Data Protection Rules, Policies, Reference Data, Terms etc
• Global search
• Workflow support
• One metadata import service (initial list of asset types)
• One Metadata enrichment service
• Import/Export

Phase 2 – Full consolidation (1H 2021)
• Business Lineage
• Data Quality
• SQL views
• Parity with IGC (except for consciously deprecated components)
• AI Governance
• Admin experience

Page 12

Data Integration

Page 13

What’s New in DataStage in CPD v3.0

Additional Content for Flow Designer
• All Stages now support General, Stage Advanced and Output Advanced property tabs
• Hierarchical stage to process JSON and XML documents with 10 built-in operator steps and tree-based view
• Slowly Changing Dimension (SCD) stage for warehousing
• Transformer stage enhanced to support SCD
• CFF stage and z/OS File stage support to process Db2 z and legacy sources

Globalization support
• Available in Group 1 languages and Russian.

Support for Data Virtualization within CPD DataStage

Enhanced connectivity for Cloud
• SAP OData
• Generic OData
• Cloudera Impala

PX Runtime Micro Service and Workload Management feature
• Dynamically create configurations and scale computes to reduce job wait times

OpenShift 4.3 for Cloud Pak for Data DataStage
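For readers unfamiliar with the warehousing pattern behind the SCD stage listed above, the following Python sketch shows a Type 2 slowly changing dimension update, in which a changed record closes the current row and appends a new version. The slide does not say which SCD types the stage covers, so Type 2 is an assumption here. This is not DataStage code, and `apply_scd2`, `valid_from` and `valid_to` are hypothetical names.

```python
from datetime import date


def apply_scd2(dimension, incoming, key, tracked, today):
    """Expire changed current rows and append new versions (Type 2 history)."""
    # Index the rows that are currently valid (open-ended valid_to).
    current = {row[key]: row for row in dimension if row["valid_to"] is None}
    for rec in incoming:
        old = current.get(rec[key])
        if old is not None and all(old[c] == rec[c] for c in tracked):
            continue                      # nothing changed: keep the current row
        if old is not None:
            old["valid_to"] = today       # close out the previous version
        dimension.append({**rec, "valid_from": today, "valid_to": None})
    return dimension


dim = [{"customer_id": 1, "city": "Paris",
        "valid_from": date(2019, 1, 1), "valid_to": None}]
updates = [{"customer_id": 1, "city": "Lyon"},
           {"customer_id": 2, "city": "Rome"}]
apply_scd2(dim, updates, key="customer_id", tracked=["city"],
           today=date(2020, 5, 1))
```

After the call, the Paris row is closed with an end date, a new current row for Lyon is added for the same customer, and the previously unseen customer gets a first row.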

Page 14

IBM Cloud Pak for Data DataStage comes with built-in automatic workload balancing and a best-of-breed parallel engine

– Virtually unlimited scaling (horizontal, vertical) using the PX engine
– Automatic load balancing to maximize throughput and minimize resource congestion
– Supports running resource-intensive jobs with parallel pipelining
– Built on a container-based architecture to handle any data volume and execute in any environment

[Figure: Workload 1: Conductor, 6 jobs; Compute 1 (CPU: 60%, Mem: 80), 6 jobs. Workload 2: Conductor, 10 jobs; Compute 1 (CPU: 60%, Mem: 80); +4 jobs; Compute 2 (CPU: 40%, Mem: 53).]
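The balancing behaviour on this slide can be illustrated with a small Python sketch: each job goes to the least-loaded compute, and a new compute is added once every existing one is full. This is only an illustration under assumed rules, not the PX engine's scheduler. The `Compute` class, the `dispatch` function and the capacity of 6 jobs per compute are hypothetical, chosen so the job counts mirror the figure (6 jobs, then +4).

```python
from dataclasses import dataclass, field


@dataclass
class Compute:
    name: str
    capacity: int                      # assumed maximum concurrent jobs
    jobs: list = field(default_factory=list)

    @property
    def load(self) -> float:
        return len(self.jobs) / self.capacity


def dispatch(job_ids, computes, capacity=6):
    """Send each job to the least-loaded compute; scale out when all are full."""
    for job in job_ids:
        target = min(computes, key=lambda c: c.load) if computes else None
        if target is None or target.load >= 1.0:
            # Every compute is saturated (or none exists yet): add one.
            target = Compute(name=f"compute-{len(computes) + 1}", capacity=capacity)
            computes.append(target)
        target.jobs.append(job)
    return computes


# Workload 1: six jobs fit on a single compute.
pods = dispatch(range(6), [])
workload_1 = [len(p.jobs) for p in pods]

# Workload 2: four more jobs arrive and a second compute is created.
pods = dispatch(range(6, 10), pods)
workload_2 = [len(p.jobs) for p in pods]
```

A real scheduler would weigh CPU and memory rather than job counts; the count-based capacity keeps the sketch short.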

Page 15

Cloud-First, Cloud-Native

– Integrated with the IBM Data and AI platform
  • Cloud Pak for Data and IBM Cloud
  • Common canvas on Cloud Pak for Data
  • Data integration, machine learning, data science
– Design Automation
  • Accelerate well known pattern
  • Automated workflows
– Governance infused
  • Catalog integration
  • Policy integration
– Polyglot Execution Engines
  • Spark, IBM PX, Virtualization, Replication
– Smart and optimized data flows
  • Data Gravity
  • Distribute processing to multiple clouds or on-prem

[Figure: DataStage Hub diagram. Labels: IBM Cloud; DataStage Hub; On Premises. Data source groups: SnowFlake, Spanner, GCS Blob, BigQuery; SnowFlake, RedShift, S3, Aurora; HDInsights, CosmosDB, SQL DW, ADLS; MongoDB, Postgres, Blob, Db2; SQL Server, Hive, Oracle, Postgres, DB2. Engine labels (repeated five times): Spark, PX, Replication. Criteria: Cost, Data Locality, Performance.]
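The "Data Gravity" idea on this slide, running the processing where most of the data already lives, can be sketched in a few lines of Python. This is a deliberately simplified illustration, not how DataStage chooses a runtime: the `pick_execution_site` function and the dataset sizes are hypothetical, and a real decision would also weigh the cost and performance criteria shown in the figure.

```python
def pick_execution_site(dataset_sizes_gb):
    """Return (site, gb_moved): run where the most input data already lives."""
    total = sum(dataset_sizes_gb.values())
    site = max(dataset_sizes_gb, key=dataset_sizes_gb.get)
    # Everything not already at the chosen site has to be moved to it.
    return site, total - dataset_sizes_gb[site]


# Hypothetical input sizes per location, in GB.
inputs = {"on-prem Db2": 500, "S3": 40, "BigQuery": 10}
site, moved_gb = pick_execution_site(inputs)
```

With these made-up sizes the flow would run next to the on-premises database and only the two smaller inputs would cross the network.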

Page 16

Deeply integrated with Cloud Pak for Data

1. Design/Generate flows on Cloud Pak for Data’s Common Canvas
   • Fully wired into OpenShift and Cloud Pak for Data → easy sharing or utilization of common assets
   • Built on a runtime-neutral canonical design model → allows translation into any possible runtime logic
   • Utilize and enhance the pre-existing flow design experience → one design canvas experience for the entire platform
2. Dynamically execute flows on supported built-in or SaaS-based runtime services
3. Built-in dynamic scaling and workload management
4. Utilizing common platform management and operations
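The runtime-neutral canonical design model described on this slide can be illustrated with a toy Python example: one flow definition, two translations. The model below (a `Step` with `filter` and `select` operations, the `FLOW` dictionary, and the two translator functions) is entirely hypothetical and far simpler than a real DataStage flow. It only shows how a single design can target different runtimes.

```python
import operator
from dataclasses import dataclass


@dataclass(frozen=True)
class Step:
    """One runtime-neutral operation in the hypothetical canonical model."""
    op: str        # "filter" or "select" in this sketch
    arg: tuple     # filter: (column, comparator, value); select: column names


FLOW = {
    "source": "orders",
    "steps": [Step("filter", ("amount", ">", 100)),
              Step("select", ("id", "amount"))],
}

COMPARATORS = {">": operator.gt, "<": operator.lt}


def to_sql(flow):
    """Translate the canonical flow into a SQL string for a database runtime."""
    columns, predicates = "*", []
    for step in flow["steps"]:
        if step.op == "filter":
            col, cmp, val = step.arg
            predicates.append(f"{col} {cmp} {val!r}")
        elif step.op == "select":
            columns = ", ".join(step.arg)
    where = f" WHERE {' AND '.join(predicates)}" if predicates else ""
    return f"SELECT {columns} FROM {flow['source']}{where}"


def run_in_memory(flow, rows):
    """Execute the same canonical flow directly on a list of dict rows."""
    for step in flow["steps"]:
        if step.op == "filter":
            col, cmp, val = step.arg
            rows = [r for r in rows if COMPARATORS[cmp](r[col], val)]
        elif step.op == "select":
            rows = [{c: r[c] for c in step.arg} for r in rows]
    return rows


sql = to_sql(FLOW)
result = run_in_memory(FLOW, [
    {"id": 1, "amount": 50, "region": "EU"},
    {"id": 2, "amount": 150, "region": "US"},
])
```

Because the flow is stored as data rather than as engine-specific code, adding another runtime means adding another translator and leaves the design itself untouched.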

Page 17

Notices and disclaimers

© 2020 International Business Machines Corporation. No part of this document may be reproduced or transmitted in any form without written permission from IBM.

U.S. Government Users Restricted Rights — use, duplication or disclosure restricted by GSA ADP Schedule Contract with IBM.

This document is current as of the initial date of publication and may be changed by IBM at any time. Not all offerings are available in every country in which IBM operates.

Information in these presentations (including information relating to products that have not yet been announced by IBM) has been reviewed for accuracy as of the date of initial publication and could include unintentional technical or typographical errors. IBM shall have no responsibility to update this information. This document is distributed “as is” without any warranty, either express or implied. In no event, shall IBM be liable for any damage arising from the use of this information, including but not limited to, loss of data, business interruption, loss of profit or loss of opportunity. IBM products and services are warranted per the terms and conditions of the agreements under which they are provided. The performance data and client examples cited are presented for illustrative purposes only. Actual performance results may vary depending on specific configurations and operating conditions.

IBM products are manufactured from new parts or new and used parts. In some cases, a product may not be new and may have been previously installed. Regardless, our warranty terms apply.

Any statements regarding IBM's future direction, intent or product plans are subject to change or withdrawal without notice.

Performance data contained herein was generally obtained in controlled, isolated environments. Customer examples are presented as illustrations of how those customers have used IBM products and the results they may have achieved. Actual performance, cost, savings or other results in other operating environments may vary.

References in this document to IBM products, programs, or services do not imply that IBM intends to make such products, programs or services available in all countries in which IBM operates or does business.

Workshops, sessions and associated materials may have been prepared by independent session speakers, and do not necessarily reflect the views of IBM. All materials and discussions are provided for informational purposes only, and are neither intended to, nor shall constitute legal or other guidance or advice to any individual participant or their specific situation.

Page 18

Notices and disclaimers (continued)

It is the customer’s responsibility to ensure its own compliance with legal requirements and to obtain advice of competent legal counsel as to the identification and interpretation of any relevant laws and regulatory requirements that may affect the customer’s business and any actions the customer may need to take to comply with such laws. IBM does not provide legal advice or represent or warrant that its services or products will ensure that the customer follows any law.

Information concerning non-IBM products was obtained from the suppliers of those products, their published announcements or other publicly available sources. IBM has not tested those products about this publication and cannot confirm the accuracy of performance, compatibility or any other claims related to non-IBM products. Questions on the capabilities of non-IBM products should be addressed to the suppliers of those products. IBM does not warrant the quality of any third-party products, or the ability of any such third-party products to interoperate with IBM’s products. IBM expressly disclaims all warranties, expressed or implied, including but not limited to, the implied warranties of merchantability and fitness for a purpose.

The provision of the information contained herein is not intended to, and does not, grant any right or license under any IBM patents, copyrights, trademarks or other intellectual property right.

IBM, the IBM logo, and ibm.com are trademarks of International Business Machines Corporation, registered in many jurisdictions worldwide. Other product and service names might be trademarks of IBM or other companies. A current list of IBM trademarks is available on the Web at “Copyright and trademark information” at: www.ibm.com/legal/copytrade.shtml.

Page 19

Thank you

Jean-Claude Mamou

STSM and Program Director, DataOps

[email protected]
