Spark the future.

May 4 – 8, 2015 Chicago, IL

What’s New in Master Data Services and Integration Services in SQL Server 2016

Matt Masson – [email protected]
Program Manager
SQL Server - Information Management and Machine Learning (IMML)

BRK2578

Agenda

New challenges in the data integration space

SQL Server 2016 Public Preview – SSIS and MDS

A few demos…

What to expect beyond the Public Preview release

Traditional analytics platforms: inflection point

1. Increasing data volumes
2. New data sources and types
3. Real-time data
4. Cloud-born data
5. Hybrid infrastructures

[Diagram labels: DATA SOURCES (OLTP, ERP, CRM, LOB; NON-RELATIONAL BATCH DATA; REAL-TIME DATA), ETL, DATA WAREHOUSE, BI AND ANALYTICS]

“…data warehousing has reached the most significant tipping point since its inception. The biggest, possibly most elaborate data management system in IT is changing.”

-- GARTNER, “THE STATE OF DATA WAREHOUSING IN 2012”

You have an opportunity.

Opportunities

Infinite storage

Big data processing

Advanced analytics

Dynamic infrastructure

Real-time dashboards

Self-service ETL

Data integration is hard.

You’ll have multiple data processing environments

Heterogeneous ETL?
• Different tools for different jobs
• Mix of legacy environments
• Central vs. departmental processing
• ELT vs. ETL, custom code, stored procedures…

Different needs for processing environments
• Hot path / real-time vs. batch / archival
• Data lakes
• Big data processing

Master data continues to require curation and stewardship

Big data systems don’t solve master data management challenges
• The need for accurate, curated data remains
• There’s usually a human factor
• Lookups against master data or reference data are a common requirement
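
As an illustration of the lookup requirement above, here is a minimal Python sketch of matching incoming rows against a curated reference set. It is not SSIS or MDS code: the `MASTER_COUNTRIES` table, the `lookup_rows` function, and the field names are all hypothetical. Rows that fail the lookup are set aside rather than dropped, which is where the human stewardship step would typically come in.

```python
# Hypothetical curated reference data: country code -> canonical name.
MASTER_COUNTRIES = {"US": "United States", "CA": "Canada", "DE": "Germany"}


def lookup_rows(rows, master, key, output_field):
    """Split rows into those that match the master data and those that do not.

    Matched rows get the normalized key plus the canonical value from the
    master data; unmatched rows are returned untouched for manual review.
    """
    matched, unmatched = [], []
    for row in rows:
        # Normalize the incoming key before comparing (trim and upper-case).
        code = str(row.get(key, "")).strip().upper()
        if code in master:
            matched.append({**row, key: code, output_field: master[code]})
        else:
            unmatched.append(row)
    return matched, unmatched
```

Calling `lookup_rows(rows, MASTER_COUNTRIES, "country", "country_name")` returns two lists: the enriched matches and the rows that need a data steward's attention.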

SQL Server 2016

Master Data Services
Integration Services

Focus areas

Manageability

Connectivity

Customer Feedback

Public Preview Notes

Additional features coming after the initial release

Some of the SSIS features will also work with existing SQL Server releases
• Azure Storage and HDInsight connectors
• Designer improvements

MDS focused on platform improvements
• UX improvements coming later

SQL Server Integration Services

Incremental project deployment
You can now incrementally deploy projects to the SSIS Catalog.

Multi-version support in SSDT-BI*
Develop, debug, and deploy to multiple versions of SSIS from a single version of SSDT-BI. You can now set a Target Version value on the VS Project.

Improved project and catalog upgrade
Resolved multiple issues with SSIS project upgrade in the designer (e.g. loss of layout information), and improved the way catalog patches are applied.

Lineage and Impact Analysis*
Automatically collect dependency information for all of your packages to support impact analysis and trace data lineage between systems.

* Coming post Public Preview

Always On support for SSISDB
High Availability for the SSIS Catalog

Capability
Improved support for Always On availability groups for the SSIS Catalog

Benefits
• Consistent with the SQL Server High Availability setup flow
• Works for any user database which contains encrypted data based on a database master key
• Supports SSISDB upgrade even when it has been added to the HA group
• Easy to configure

Expanded Connectivity
Do more. Achieve more.

Azure storage connectors
Move data to and from Azure storage locations to enable hybrid data movement scenarios. Extensions at the Control Flow and Data Flow levels.

HDInsight tasks
Orchestrate HDInsight jobs and manage the lifecycle of your cluster from the SSIS Control Flow.

Power Query integration*
Allows SSIS to leverage the breadth of Power Query’s connectivity experience, and enables “grow up” scenarios from self service to managed ETL.

Azure Data Factory integration*
Monitor and orchestrate SSIS environments from Azure Data Factory. See dependencies between packages for lineage and impact analysis.

* Coming post Public Preview

Azure Storage Connectors
Enables hybrid ETL

Capability
Azure Blob upload and download task. Azure Blob Source and Destination. Blob File enumerator.

Benefits
• Extend existing ETL pipeline with cloud storage, or cloud based SSIS execution through Azure VMs
• Data preparation for cloud compute services such as HDInsight or Azure Machine Learning
• Data archival to cloud storage
• Easy enumeration of files in blob storage

HDInsight Tasks
Orchestrate and manage HDI clusters

Capability
Trigger HDInsight jobs and manage the HDInsight cluster life cycle directly from SSIS.

Benefits
• Integrate big data processing into your existing ETL flows
• Filter and process raw cloud data using HDI before moving useful data to on-premises
• Dynamically create your HDI cluster on demand, and remove it once processing is complete
• Combine with Azure Storage connectors to extend your ETL environments to the cloud
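
The "create on demand, remove once processing is complete" benefit above is essentially a scoped-resource pattern. The Python sketch below shows that shape only. It does not call Azure, HDInsight, or SSIS: `InMemoryClusterService`, `on_demand_cluster`, and `process_raw_data` are hypothetical names, and the service merely records what a real provisioning layer would be asked to do.

```python
from contextlib import contextmanager


class InMemoryClusterService:
    """Hypothetical stand-in for a cluster provisioning service (not a real API)."""

    def __init__(self):
        self.active = set()
        self.log = []

    def create(self, name):
        self.active.add(name)
        self.log.append(f"create {name}")

    def delete(self, name):
        self.active.discard(name)
        self.log.append(f"delete {name}")

    def run_job(self, name, job):
        if name not in self.active:
            raise RuntimeError(f"cluster {name} is not running")
        self.log.append(f"run {job} on {name}")


@contextmanager
def on_demand_cluster(service, name):
    """Create a cluster for the duration of the block, then always remove it."""
    service.create(name)
    try:
        yield name
    finally:
        # Runs even if a job fails, so no cluster is left behind.
        service.delete(name)


def process_raw_data(service, jobs, cluster_name="etl-temp"):
    """Run each job on a short-lived cluster and return the service log."""
    with on_demand_cluster(service, cluster_name) as cluster:
        for job in jobs:
            service.run_job(cluster, job)
    return service.log
```

The `finally` clause is the point of the design: the cluster is removed whether the jobs succeed or raise, so compute is only held while processing runs.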

Master Data Services

Modeling improvements in MDS
Increased attribute name length limit, added Display Name property for attributes, support for special characters in model names. The Excel add-in now lets you hide name or code values for domain based attributes.

Simplified hierarchies
Simplified the various hierarchy types (now just Derived), and made it easier to find and manage unused members.

Improved MDS model deployment
Faster deployment, and removed size limitations when deploying models with data.

Do more. Faster.
Massive improvements to performance and scale in MDS

Capability
Heavy performance optimizations for the MDS backend system. Scale entities to 100 million members (and beyond).

Benefits
• 15x performance increase for Excel
• Faster entity based staging
• Reduced impact of row level security
• Smarter caching of security rules

Manageability and Administration
Do more. Achieve more.

MDS transaction log retention settings
Configurable settings for retaining the MDS transaction history table to easily enable automatic truncation and cleanup.

Multiple administrator roles
MDS now allows multiple system administrators as well as distinct roles for model admins and super users.

Granular security permissions in MDS
Allows read, modify, create, and delete permissions to be set at the attribute (column) level and hierarchy member (row) level.
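
To make the transaction log retention idea on this slide concrete, here is a small Python sketch of age-based truncation. It is a generic illustration, not the MDS implementation: the `truncate_history` function, the `changed_at` field, and the convention that `None` means "keep everything" are assumptions of this sketch.

```python
from datetime import datetime, timedelta


def truncate_history(entries, retention_days, now):
    """Return only the history entries that fall inside the retention window.

    retention_days=None keeps every entry (an assumption of this sketch);
    otherwise entries older than `now - retention_days` are dropped.
    """
    if retention_days is None:
        return list(entries)
    cutoff = now - timedelta(days=retention_days)
    return [e for e in entries if e["changed_at"] >= cutoff]
```

With a 30-day setting evaluated on May 8, 2015, an entry changed on May 1 is kept and an entry changed on January 1 is removed.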

Demo

SQL Server Integration Services and Master Data Services in SQL Server 2016 – Public Preview

SSIS SQL 2016 – Public preview at a glance

Manageability
• Incremental deployment: Packages within a project can be selectively deployed.
• Always-On support: Enables high availability for the SSIS Catalog.
• Better project upgrade**: Improved project upgrade experience.

Connectivity
• Azure storage connectors*: Move data to and from Azure blob storage sources and destinations to enable hybrid data movement scenarios.
• HDInsight Tasks*: Orchestrate HDInsight jobs and manage your HDInsight cluster life cycle directly from SSIS.

Usability & Productivity
• Designer Improvement**: A number of enhancements and fixes to designer capabilities such as drag and drop, resizing, etc.
• Multi-Version Support in SSDT-BI**: Ability to author, execute, deploy and debug multiple versions of SSIS packages from a single version of SSDT-BI.

* These features will first be available as an out-of-box download for SSIS 2012/2014, and will later be released for SSIS 2016

** These features will first be available as an out-of-box download for SSDT-BI for Visual Studio 2013, and will later be released for SSIS 2016 (SSDT-BI for Visual Studio 2015)

MDS SQL 2016 – Public preview at a glance

Performance and Scale
• Excel add-in: 4-8x performance increase in the January release of the MDS Excel add-in for SQL Server 2012 and SQL Server 2014. Up to 15x increase in SQL Server 2016.
• Entity based staging: Loading data into MDS is now 15-200% faster than SQL Server 2014.
• Model deployment: Faster deployment, and removed size limitations when deploying models with data.
• API optimization: Overall performance increase of 60% or more on MDS web service API calls.

Modeling and Mgmt
• Transaction log retention: Configurable settings for retaining the MDS transaction history table to enable automatic truncation.
• Simplified hierarchies: Simplified the various hierarchy types, and made it easier to find and manage unused members.
• Display Name for attributes: Gives more control over the names displayed for a given attribute – including the Code and Name attributes.

Security and Admin
• Granular security permissions: Allows permissions to be set around read, write, create, and delete.
• Multiple administrator roles: Support for Super User and Model Admin roles allows for multiple system administrators, and model level admins.
• Smarter caching of permissions: Reduces the overhead of adding member and attribute level security settings.

Stay tuned…

SSIS features post-public preview

Power Query Source for SSIS
• Enables grow up scenario from self-service ETL in Excel
• Leverage the Power Query formula language for data mashup and pushdown optimizations
• Extends SSIS connectivity to Power Query’s 35+ data sources out of the box

Support multiple product versions from SSDT-BI
• Projects now have a target version to output packages/projects to SSIS 2012 and beyond
• Deploy to different versions of the SSIS Catalog
• Execute and debug using your target runtime
• Use the latest version of SSDT-BI without having to upgrade your SSIS instances

Azure Data Factory integration
• Surface lineage and impact analysis information from your SSIS packages
• Monitor and orchestrate SSIS packages from ADF

MDS features post-public preview

Entity sharing/sync
Allows entities to be reused across models.

Merge conflict resolution
Features to help resolve merge conflicts when multiple people are trying to modify the same entity.

Support for custom indexes
Specify which attributes should be indexed, based on your application workloads.

Subscription view for historical data
Facilitates processing type-2 attributes for slowly changing dimensions.
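
For readers unfamiliar with type-2 slowly changing dimensions, the Python sketch below shows the general technique: when an attribute changes, the current row is closed with an end date and a new row is appended, so history stays queryable. This is a generic illustration and not the MDS subscription view; `apply_type2_change`, `as_of`, and the row layout (`valid_from`, `valid_to`) are hypothetical.

```python
from datetime import date


def apply_type2_change(history, key, new_attrs, effective):
    """Record a type-2 change: close the current version and append a new one.

    If the attributes are unchanged, the history is left as it is.
    """
    current = next(
        (r for r in history if r["key"] == key and r["valid_to"] is None), None
    )
    if current is not None:
        if current["attrs"] == new_attrs:
            return history  # nothing changed, no new version needed
        current["valid_to"] = effective  # close the previous version
    history.append(
        {"key": key, "attrs": dict(new_attrs), "valid_from": effective, "valid_to": None}
    )
    return history


def as_of(history, key, when):
    """Return the attributes that were valid for `key` on the date `when`."""
    for r in history:
        if r["key"] == key and r["valid_from"] <= when and (
            r["valid_to"] is None or when < r["valid_to"]
        ):
            return r["attrs"]
    return None
```

Each version is valid from `valid_from` up to, but not including, `valid_to`, so a lookup on the change date itself returns the new values.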

Check out all the data sessions

Pre-register for the SQL Server 2016 CTP & Azure SQL DW previews

http://www.microsoft.com/mcpnews

Visit Myignite at http://myignite.microsoft.com or download and use the Ignite Mobile App with the QR code above.

Please evaluate this session
Your feedback is important to us!

© 2015 Microsoft Corporation. All rights reserved.