Module 4 Designing Databases for Optimal Performance

Module Overview

• Guidelines for Designing Indexes

• Designing a Partitioning Strategy

• Designing a Plan Guide

• Designing Scalable Databases

Lesson 1: Guidelines for Designing Indexes

• Guidelines for Selecting a Clustered Index

• Guidelines for Selecting a Nonclustered Index

• Guidelines for Selecting a Filtered Index

• Guidelines for Selecting a Computed Column Index

• Guidelines for Selecting a Strategy for Index Compression

• Discussion: Using Indexing

Guidelines for Selecting a Clustered Index

• Create a clustered index on the frequently used columns

• Consider clustered index data types and column widths

• Consider the frequency of data changes

Guidelines for Selecting a Nonclustered Index

[Figure: nonclustered index B-tree located through sys.sysindexes (id, indid = 2, root). Page 12 is the root, pages 37 and 28 form the non-leaf level, and pages 41, 51, 61, and 71 form the leaf level (key value plus row locator).]

• Consider performance gain versus maintenance cost

• Index on frequently used search arguments

• Consider nonclustered indexes for columns with high selectivity

• Consider placing nonclustered indexes on foreign key columns

• Choose a nonclustered index to cover the query

• Consider using included columns

• Consider using sys.sysindexes to gather information about an index
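The high-selectivity guideline can be made concrete with a small Python sketch. It assumes the common definition of selectivity as distinct values divided by total rows; the function name and the sample columns are hypothetical and are not part of the course material.

```python
def selectivity(values):
    """Return distinct values / total rows for a column sample.

    A result close to 1.0 means almost every row has its own value
    (a good nonclustered index candidate); a result close to 0.0
    means many rows share the same few values.
    """
    values = list(values)
    if not values:
        raise ValueError("column sample is empty")
    return len(set(values)) / len(values)


# Hypothetical column samples, 1,000 rows each.
email_sample = [f"user{i}@example.com" for i in range(1000)]
gender_sample = ["F", "M"] * 500

email_selectivity = selectivity(email_sample)    # every value is unique
gender_selectivity = selectivity(gender_sample)  # only two distinct values
```

Under this measure the email column is a far better candidate for a nonclustered index than the gender column, because a search argument on it narrows the result to very few rows.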

Guidelines for Selecting a Filtered Index

• Use filtered indexes when columns contain well-defined subsets of data

• Create filtered indexes for subsets of data

• Create filtered indexes for heterogeneous data

• Compare indexed views with filtered indexes

• Include a small number of key or included columns in a filtered index definition

• Place data conversion operators (CAST or CONVERT) on the right side of the comparison in the filter predicate

• Review the referencing dependencies of the filter expression
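The idea of indexing only a well-defined subset of rows can be tried out locally. The sketch below is an assumption-laden stand-in: it uses SQLite's partial indexes (through Python's built-in sqlite3 module) rather than SQL Server filtered indexes, and the table, column, and index names are hypothetical. The CREATE INDEX ... WHERE shape is similar in both products, but the engines differ in their restrictions and optimizer behavior.

```python
import sqlite3


def build_demo():
    """Create an in-memory table in which only 5% of the rows are 'open'."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE orders ("
        "id INTEGER PRIMARY KEY, customer_id INTEGER, status TEXT)"
    )
    rows = [
        (i, i % 50, "open" if i % 20 == 0 else "closed")
        for i in range(1, 1001)
    ]
    conn.executemany("INSERT INTO orders VALUES (?, ?, ?)", rows)
    # Partial (filtered) index: only rows with status = 'open' are indexed,
    # so the index stays small and is cheaper to maintain.
    conn.execute(
        "CREATE INDEX ix_orders_open ON orders (customer_id) "
        "WHERE status = 'open'"
    )
    return conn


def open_orders_for(conn, customer_id):
    """Query whose predicate matches the index filter (status = 'open')."""
    cur = conn.execute(
        "SELECT id FROM orders "
        "WHERE status = 'open' AND customer_id = ? ORDER BY id",
        (customer_id,),
    )
    return [row[0] for row in cur]


def index_definition(conn, name):
    """Return the stored CREATE INDEX text for the named index."""
    cur = conn.execute(
        "SELECT sql FROM sqlite_master WHERE type = 'index' AND name = ?",
        (name,),
    )
    return cur.fetchone()[0]
```

The query repeats the filter predicate from the index definition, which is what allows an optimizer to consider the smaller index for it.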

Guidelines for Selecting a Computed Column Index

• Assess benefits for common or important queries

• Assign only values of other columns in the same row

• Assess performance cost against performance gain

• Choose a deterministic and precise computed column expression

• Use CLR functions in computed columns to restrict access

Guidelines for Selecting a Strategy for Index Compression

• Compress nonclustered indexes individually

• Rebuild all the nonclustered indexes on the table when you compress a heap

• Enable or disable ROW or PAGE compression online or offline

• Non-leaf-level pages do not receive page compression when compressing indexes

• Data compression is not available for data that is stored separately

• Avoid specifying out-of-range partitions

• Rebuild a heap to compress new pages allocated to the heap

• For individual partitions, set the compression type to NONE, and for a list of partitions, set the type to ROW

• Compress tables with row size less than 8,060 bytes

Discussion: Using Indexing

• Is it necessary for every table to have a clustered index? Justify your answer.

• An Orders table has a clustered index on the InvoiceNumber (int) column. The most frequently executed queries use search arguments (SARGs) on the OrderDate (datetime) column. A nonclustered index has been created on the OrderDate column. What are the advantages and disadvantages of this clustered index?

Lesson 2: Designing a Partitioning Strategy

• Overview of Partitioning

• Guidelines for Planning Partitioned Tables and Indexes

• Designing Partitions to Manage Subsets of Data

• Designing Partitions to Improve Query Performance

• Special Guidelines for Partitioned Indexes

• Discussion: Using Partitioning

Overview of Partitioning

Partitioning helps to break a large table into multiple physical files without compromising the integrity or structure of the database

Advantages of Partitioning

• Partitioning makes large tables or indexes more manageable

• Partitioned tables and indexes support the same design and query features as standard tables and indexes

• Maintenance operations performed on subsets of data can be performed more efficiently

• Partitioning a table or index might improve query performance

When to Implement Partitioning?

Implement partitioning when:

• The table contains, or is expected to contain, data that is used in different ways

• Queries or updates against the table are not performing as intended

• Maintenance costs exceed predefined maintenance periods

Guidelines for Planning Partitioned Tables and Indexes

Partition function

Defines how the rows of a table or index are mapped to partitions, based on the values of the partitioning column

Partition scheme

Maps each partition specified by the partition function to a filegroup
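The two definitions can be modeled in a few lines of Python. This is a conceptual sketch, not SQL Server code: the boundary dates and filegroup names are hypothetical, and the only behavior it relies on is that RANGE LEFT puts a boundary value in the partition to its left, RANGE RIGHT puts it in the partition to its right, and partitions are numbered from 1.

```python
from bisect import bisect_left, bisect_right
from datetime import date

# Hypothetical boundary values of a partition function on an order date.
BOUNDARIES = [date(2007, 1, 1), date(2008, 1, 1)]

# Hypothetical partition scheme: one filegroup per partition, in order.
FILEGROUPS = ["FG_Archive", "FG_2007", "FG_2008"]


def partition_number(value, boundaries=BOUNDARIES, range_side="RIGHT"):
    """Model of a partition function: map a column value to a partition."""
    if range_side == "LEFT":
        # A boundary value belongs to the partition on its left.
        return bisect_left(boundaries, value) + 1
    if range_side == "RIGHT":
        # A boundary value belongs to the partition on its right.
        return bisect_right(boundaries, value) + 1
    raise ValueError("range_side must be 'LEFT' or 'RIGHT'")


def filegroup_for(value, range_side="RIGHT"):
    """Model of a partition scheme: map the partition to its filegroup."""
    return FILEGROUPS[partition_number(value, range_side=range_side) - 1]
```

With yearly boundaries on a date column, RANGE RIGHT keeps midnight on 1 January together with the rest of that year, whereas RANGE LEFT would place that single instant in the previous year's partition.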

Designing Partitions to Manage Subsets of Data

• Adding a table as a partition to an already existing partitioned table

• Switching a partition from one partitioned table to another

• Removing a partition to form a single table

Designing Partitions to Improve Query Performance

• Partitioning for Join Queries

• Taking Advantage of Multiple Disk Drives

• Controlling Lock Escalation Behavior

Special Guidelines for Partitioned Indexes

• Partitioning Unique Indexes

• Partitioning Clustered Indexes

• Partitioning Nonclustered Indexes

• Memory Limitations and Partitioned Indexes

Discussion: Using Partitioning

• What problems does table partitioning solve? How?

• Explain how to create a partitioned table, identifying the T-SQL objects and statements involved

Lesson 3: Designing a Plan Guide

• Overview of Plan Guide

• Guidelines for Designing Plan Guides

• Designing Plan Guides for Parameterized Queries

• Discussion: Using Plan Guides

Overview of Plan Guide

Plan guides influence optimization of queries by attaching query hints or a fixed query plan to them

Plan guides in SQL Server are useful when a small subset of queries in a database application deployed from a third-party vendor is not performing as expected.

Types of plan guides include:

• Object plan guide

• SQL plan guide

• Template plan guide

Guidelines for Designing Plan Guides

• Attach query hints to a plan guide

• Attach a query plan to a plan guide

• Follow the plan guide matching requirements

• Evaluate the plan guide effect on the plan cache

Designing Plan Guides for Parameterized Queries

To obtain the parameterized form of a query and create a plan guide on it, perform the following steps:

1. Obtain the parameterized form of the query by executing sp_get_query_template

2. Create a plan guide of type TEMPLATE to force parameterization, if the query is not already being parameterized by SQL Server by using sp_executesql or the PARAMETERIZATION FORCED database SET option

3. Create a plan guide of type SQL on the parameterized query
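To illustrate what "the parameterized form of a query" means, here is a toy Python sketch that replaces literals with numbered placeholders. It is only an approximation for teaching purposes and is not how SQL Server or sp_get_query_template works internally; the function name, the regular expression, and the simplified type labels are all assumptions.

```python
import re

# Matches a single-quoted string literal ('' is an escaped quote)
# or a standalone integer literal.
_LITERAL = re.compile(r"'(?:[^']|'')*'|\b\d+\b")


def parameterize(sql):
    """Return (template, params) with literals replaced by @0, @1, ...

    params is a list of (name, simplified_type, original_literal) tuples.
    The type labels are illustrative only.
    """
    params = []

    def replace(match):
        literal = match.group(0)
        name = f"@{len(params)}"
        kind = "varchar" if literal.startswith("'") else "int"
        params.append((name, kind, literal))
        return name

    template = _LITERAL.sub(replace, sql)
    return template, params


template_a, params_a = parameterize(
    "SELECT * FROM Sales.Orders WHERE CustomerID = 42 AND Status = 'open'"
)
template_b, params_b = parameterize(
    "SELECT * FROM Sales.Orders WHERE CustomerID = 7 AND Status = 'closed'"
)
```

Both statements reduce to the same template, which is why a single plan guide created on the parameterized form can apply to every variation of the query that differs only in its literal values.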

Discussion: Using Plan Guides

• What problems does a plan guide solve? How?

Lesson 4: Designing Scalable Databases

• Guidelines for Scaling-Out Databases

• Overview of Federated Databases

• Selecting Federated Databases

• Overview of Scalable Shared Databases

• Guidelines for Selecting Scalable Shared Databases

• Overview of Replication

• Guidelines for Selecting Replication

• Overview of Database Mirroring

• Guidelines for Selecting Database Mirroring

• Discussion: Using Scalable Databases

Guidelines for Scaling-Out Databases

• Scale out to multiple database servers and instances

• Scale out with redundancy

• Scale up for improved performance

Overview of Federated Databases

SQL Server shares the database processing load across a group of servers that process database requests cooperatively. This cooperative group of servers is called a federation.

Single Server Tier

• There is one instance of SQL Server on the production server.

• The production data is stored in one database.

• Each table is typically a single entity.

• All connections are made to the single server, and all SQL statements are processed by the same instance of SQL Server.

Federated Server Tier

• There is one instance of SQL Server on each member server.

• Each member server has a member database, containing a copy of each table, with only the data relevant to that site.

• Distributed partitioned views are used to make it appear as if there was a full copy of the original table on each member server.

• The application layer must be able to direct the SQL statements to the member server that contains most of the data referenced by the statement.
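The routing requirement on the application layer can be sketched as follows. This is a minimal Python illustration under stated assumptions: the member server names, the customer ID ranges, and both function names are hypothetical, and a real data-dependent routing layer would normally read its partitioning rules from configuration rather than from a hard-coded dictionary.

```python
from collections import Counter

# Hypothetical horizontal partitioning of a Customers table by CustomerID.
MEMBER_RANGES = {
    "Server1": range(1, 33000),
    "Server2": range(33000, 66000),
    "Server3": range(66000, 100000),
}


def member_for(customer_id):
    """Return the member server whose member table holds this key."""
    for server, ids in MEMBER_RANGES.items():
        if customer_id in ids:
            return server
    raise KeyError(f"no member server holds CustomerID {customer_id}")


def route_statement(customer_ids):
    """Pick the member server that holds most of the referenced rows."""
    if not customer_ids:
        raise ValueError("statement references no partitioning keys")
    counts = Counter(member_for(cid) for cid in customer_ids)
    return counts.most_common(1)[0][0]
```

A statement that touches customers 10, 20, and 70000 would be sent to Server1, which can resolve two of the three rows locally and fetch the remaining one through the distributed partitioned view.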

Selecting Federated Databases

Symmetric Partitions

Symmetric partitions are effective when:

• Related data is put on the same member server

• Data is partitioned uniformly across the member servers

Asymmetric Partitions

Asymmetric partitions can:

• Improve the performance of databases that cannot be symmetrically partitioned

• Partition a large, existing system by using a series of iterative, asymmetric improvements

Distributed Partitioned Views

To use distributed partitioned views, consider the:

• Pattern of SQL statements executed by an application

• Relationships of the tables

• Frequency of SQL statements against the partitions

• SQL statement routing rules

Overview of Scalable Shared Databases

Scalable shared databases let you attach a read-only reporting database to multiple server instances over a storage area network (SAN)

Benefits

• Allows workload scale-out on reporting databases by using commodity servers and hardware

• Provides workload isolation

• Ensures identical views of reporting data from all servers

Limitations

• The database must be on a read-only volume

• The data files can be accessed only over a SAN

• The databases do not support database snapshots

Guidelines for Selecting Scalable Shared Databases

• Verify that the reporting servers and associated reporting database are running on identical platforms

• Update all reporting servers for a scalable shared database uniformly

• Limit your scalable shared database configurations to eight server instances per shared database

• Ensure that the reporting database has the same layout as the production database

• Use a single path for the reporting database and the production database

• Ensure that the scalable shared database is on a read-only volume that is accessible over your SAN from all the reporting servers

• Ensure that all the server instances use the same sort order

• Ensure that all the server instances use the same memory footprint

Overview of Replication

Snapshot Replication

Distributes data exactly as it appears at a specific moment in time and does not monitor for updates to the data

Transactional Replication

Takes an initial snapshot. Subsequent data changes and schema modifications are delivered to the Subscriber as they occur

Merge Replication

Takes an initial snapshot. Subsequent data changes and schema modifications are tracked with triggers

Peer-to-Peer Replication

Provides a scale-out and high-availability solution by maintaining copies of data across multiple server instances

Guidelines for Selecting Replication

Snapshot Replication

• Create and secure the snapshot folder

• Estimate the disk space required to transfer and store snapshot files

• Schedule snapshots at off-peak hours

• Set up a mail-enabled user account in Active Directory Domain Services (AD DS)

Merge Replication

• Ensure that any SELECT and INSERT statements that reference published tables use column lists

• Filter out Timestamp columns during article validation

• Specify a value of TRUE for the @stream_blob_columns parameter of sp_addmergearticle

• Add a dummy UPDATE statement within a transaction

• Track changes when performing bulk updates

Transactional Replication

• Ensure adequate space for the transaction log

• Ensure adequate space for the distribution database

• Declare primary keys for each published table

• Consider the issues with using triggers

• Consider using large object (LOB) data types

Peer-to-Peer Replication

• Use each node for its own distribution database

• Avoid including tables in multiple peer-to-peer publications in a single publication database

• Enable publications for peer-to-peer replication before creating subscriptions

• Initialize subscriptions by using a backup

• Avoid using identity columns

Overview of Database Mirroring

Benefits

• Improved data protection

• Improved database availability

• Improved availability of the production database during upgrades

• Allows reporting on the mirror server

Working of Database Mirroring

[Figure: data flow between the principal server and the mirror server, with an optional witness server.]

Guidelines for Selecting Database Mirroring

• Consider using the high-performance mode for disaster-recovery scenarios in which the principal and mirror servers are separated by a significant distance and where you do not want small errors to impact the principal server

• Consider using log shipping as an alternative to asynchronous database mirroring

• Consider setting the WITNESS property to OFF if the SAFETY property is set to OFF when you use Transact-SQL to configure high-performance mode

When the principal server fails, you can:

• Leave the database unavailable until the principal server becomes available

• Manually update the database and then begin a new database mirroring session

• Sparingly use forced service on the mirror server

Discussion: Using Scalable Databases

• Federated databases can increase the total storage and performance in extremely high capacity or high performance systems. What is the single key element necessary to ensure that a query is executed on the server that contains the appropriate data?

• What is the primary problem that scalable shared databases solve?

• A single table from the production database is required to be copied to a different database, on a different server instance. Select the best solution from the following options. Why? (A) Clustering, (B) Mirroring, (C) Replication

Lab 4: Designing Databases for Optimal Performance

• Exercise 1: Applying Optimization Techniques

• Exercise 2: Creating Plan Guides

• Exercise 3: Designing a Partitioning Strategy

Estimated time: 60 minutes

Logon Information

Virtual machine: NYC-SQL1

User name: Administrator

Password: Pa$$w0rd

Lab Scenario

You are a lead database administrator at QuantamCorp. You are working on the Human Resources Vacation and Sick Leave Enhancement (HR VASE) project that is designed to enhance the current HR system of your organization. This system is based on the QuantamCorp sample database in SQL Server 2008.

The main goals of the HR VASE project are as follows:

• Provide managers with current and historical information about employee vacation and sick-leave data.

• Provide permission to individual employees to view their vacation and sick-leave balances.

• Provide permission to selected employees in the HR department to view and update employee vacation and sick-leave data.

• Provide permission to the HR manager to view and update all data.

• Ensure that the application uses the database in an optimal way and optimize the performance of reports for managers and HR personnel.

You need to formulate a list of tasks that you would need to perform to ensure optimal query performance. Before finalizing the list, you need to verify the result of each task.

In this lab, you will examine the business requirements and identify different ways to improve performance. You will enhance the database performance by creating appropriate indexes, plan guides, and partitions.

Lab Review

• What is the purpose of examining the database model, schema, data, metadata, and dynamic management views before you decide the course of action to improve query performance?

• What is a plan guide?

• You are developing a partitioning scheme for your application database. The table that you need to partition is sorted according to the date. Users usually access yearly data from that table. How would you design the partitioning scheme?

• You are working on partitioning a data warehouse table by using a column that has the datetime data type. Why would you use RIGHT as the RANGE parameter for the partition function?

Module Review and Takeaways

• Review Questions

• Real-world Issues and Scenarios

• List of Tools