Module 7 Designing Queries for Optimal Performance


Module 7

Designing Queries for Optimal Performance


Module Overview

• Considerations for Optimizing Queries for Performance

• Refactoring Cursors into Queries

• Extending Set-Based Operations


Lesson 1: Considerations for Optimizing Queries for Performance

• Overview of Query Logical Flow

• Using the Query Optimizer to Process Queries

• Guidelines for Building Efficient Queries

• Considerations for Creating User-Defined Functions

• Considerations for Using User-Defined Functions

• Considerations for Determining Temporary Storage

• Discussion: Optimizing a Query

Overview of Query Logical Flow

[Figure: logical flow of a query. From and Join produces rows that pass through Where. In a non-aggregate query, the rows then pass through Select and Order By to produce the result set. In an aggregate query, they pass through Grouping and Aggregation, Having, and Order By to produce the result set.]
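The logical flow can be sketched with a small aggregate query. This is only an illustration: Python's built-in sqlite3 module stands in for SQL Server, the orders table and its rows are hypothetical, and the numbered comments follow the usual logical order (From, Where, Group By, Having, Select, Order By) rather than anything stated on the slide.

```python
import sqlite3

# Hypothetical sample data; SQLite is used here only as a stand-in engine.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE orders (customer TEXT, status TEXT, amount INTEGER);
    INSERT INTO orders VALUES
        ('alice', 'shipped',   100),
        ('alice', 'cancelled', 900),
        ('bob',   'shipped',    40),
        ('bob',   'shipped',    30),
        ('carol', 'shipped',   500);
""")

aggregate_rows = conn.execute("""
    SELECT customer, SUM(amount) AS total   -- 5. Select
    FROM orders                             -- 1. From
    WHERE status = 'shipped'                -- 2. Where filters individual rows
    GROUP BY customer                       -- 3. Grouping and aggregation
    HAVING SUM(amount) >= 100               -- 4. Having filters whole groups
    ORDER BY total DESC                     -- 6. Order By
""").fetchall()

print(aggregate_rows)
```

Because Where runs before grouping, the cancelled 900 order never reaches the SUM. Because Having runs after grouping, bob's group (total 70) is dropped as a whole.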

Using the Query Optimizer to Process Queries

[Figure: the Query Optimizer takes a query and the database schema as input and produces a query plan.]
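A minimal sketch of the same idea, that the plan depends on both the query text and the schema. It assumes SQLite (through Python's sqlite3 module) as a stand-in. EXPLAIN QUERY PLAN is SQLite's plan viewer, not the SQL Server one, and the table and index names are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE orders (id INTEGER PRIMARY KEY, customer_id INTEGER, amount INTEGER)"
)
conn.executemany(
    "INSERT INTO orders (customer_id, amount) VALUES (?, ?)",
    [(i % 50, i) for i in range(1000)],
)

def plan_for(sql):
    """Return the plan text SQLite reports for a statement."""
    rows = conn.execute("EXPLAIN QUERY PLAN " + sql).fetchall()
    # The last column of each row holds the human-readable plan step.
    return " / ".join(row[-1] for row in rows)

query = "SELECT amount FROM orders WHERE customer_id = 7"

# Same query, two schemas: the optimizer picks a different plan for each.
plan_without_index = plan_for(query)
conn.execute("CREATE INDEX ix_orders_customer ON orders (customer_id)")
plan_with_index = plan_for(query)

print(plan_without_index)
print(plan_with_index)
```

The query text never changes. Only the schema does, and the second plan names the new index while the first one scans the table.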

Guidelines for Building Efficient Queries

• Test query variations for performance
• Avoid query hints
• Use correlated subqueries to improve performance
• Use table-valued, user-defined functions as derived tables
• Avoid unnecessary GROUP BY columns; use a subquery instead
• Use CASE expressions to include variable logic in a query
• Divide joins into temporary tables when you query large tables
• Favor set-based logic over procedural or cursor logic
• Avoid using a scalar user-defined function in the WHERE clause
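One of the guidelines above, avoiding unnecessary GROUP BY columns by using a subquery, might look like the following sketch. The customers and orders tables are hypothetical, and SQLite (through Python's sqlite3 module) is assumed as a stand-in for SQL Server.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE customers (id INTEGER PRIMARY KEY, name TEXT, city TEXT);
    CREATE TABLE orders (customer_id INTEGER, amount INTEGER);
    INSERT INTO customers VALUES (1, 'Alice', 'Oslo'), (2, 'Bob', 'Lima');
    INSERT INTO orders VALUES (1, 10), (1, 15), (2, 7);
""")

# Descriptive columns are dragged into GROUP BY only so they can be selected.
wide_group_by = conn.execute("""
    SELECT c.id, c.name, c.city, SUM(o.amount) AS total
    FROM customers AS c
    JOIN orders AS o ON o.customer_id = c.id
    GROUP BY c.id, c.name, c.city
    ORDER BY c.id
""").fetchall()

# Aggregate on the key alone in a subquery, then join for the descriptive columns.
subquery_version = conn.execute("""
    SELECT c.id, c.name, c.city, t.total
    FROM customers AS c
    JOIN (SELECT customer_id, SUM(amount) AS total
          FROM orders
          GROUP BY customer_id) AS t ON t.customer_id = c.id
    ORDER BY c.id
""").fetchall()

print(subquery_version)
```

Both forms return the same rows. Whether the second is actually faster depends on the engine and the data, which is why the first guideline says to test query variations.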

Considerations for Creating User-Defined Functions

• Consider relevant factors when indexing the results of the function
• Troubleshoot and test the function
• Create each function to accomplish a single task
• Qualify object names referenced by a function with the appropriate schema name
• Identify the type of function to be used

[Figure: a user-defined function alongside the SELECT, FROM, and WHERE clauses of a query.]

Considerations for Using User-Defined Functions

• Integrate the user-defined function into the query plan as a join
• Consider the balance between performance and maintainability
• Avoid using a user-defined function if performance suffers significantly
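The cost behind these considerations can be made visible with a rough sketch. It assumes SQLite through Python's sqlite3 module, where a Python callable registered with create_function plays the role of a scalar UDF. The function is_recent and the orders table are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE orders (id INTEGER PRIMARY KEY, order_year INTEGER)")
conn.executemany(
    "INSERT INTO orders (order_year) VALUES (?)",
    [(2000 + i % 10,) for i in range(200)],
)

calls = {"count": 0}

def is_recent(year):
    """Hypothetical scalar UDF: returns 1 when the year is 2008 or later."""
    calls["count"] += 1
    return 1 if year >= 2008 else 0

conn.create_function("is_recent", 1, is_recent)

# The UDF in the WHERE clause is invoked for each row the scan visits.
udf_ids = conn.execute(
    "SELECT id FROM orders WHERE is_recent(order_year) = 1 ORDER BY id"
).fetchall()

# The same predicate written inline needs no function calls at all.
inline_ids = conn.execute(
    "SELECT id FROM orders WHERE order_year >= 2008 ORDER BY id"
).fetchall()

print(calls["count"], len(udf_ids), len(inline_ids))
```

The two queries return the same rows, so the trade-off is the one the slide names: the function keeps the rule in one place (maintainability), while the inline predicate avoids a call per row (performance).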

Considerations for Determining Temporary Storage

To achieve optimal tempdb performance:

• Set the recovery model of tempdb to SIMPLE
• Allow for tempdb files to automatically grow
• Set the file growth increment to a reasonable size
• Preallocate space for all tempdb files
• Create multiple files to maximize disk bandwidth
• Make each data file of the same size
• Load the tempdb database on a fast I/O subsystem
• Consider transferring the tempdb database to a different subsystem or disk


Discussion: Optimizing a Query

• What is the primary consideration when handling repetitive tasks against a set of data?

• What will be the effect of having the tempdb database on the same disk or Logical Unit Number (LUN) as the transaction log file?

• Can disciplined code formatting and using naming standards improve query execution performance? Explain the benefits of disciplined code formatting and using naming standards.


Lesson 2: Refactoring Cursors into Queries

• Building a T-SQL Cursor

• Common Scenarios for Cursor-Based Operations

• Demonstration: How To Refactor a Cursor

• Discussion: Using Cursors

• Guidelines for Using Result Set-Based Operations

• Selecting Appropriate Server-Side Cursors

• Selecting Appropriate Client-Side Cursors

Building a T-SQL Cursor

1. Declare the variables for the data to be returned by the cursor
2. Use the DECLARE CURSOR statement to define the SELECT statement
3. Use the OPEN statement to execute the SELECT statement
4. Use the FETCH NEXT INTO statement to retrieve values from the next row
5. Issue the CLOSE and DEALLOCATE statements to close the cursor

Why Cursors Are Slow

• Each FETCH in a cursor has the same performance as a SELECT statement
• Cursors use large amounts of memory
• Cursors can cause locking problems in the database
• Cursors consume network bandwidth
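The five steps have a rough analogue in any client cursor API. The sketch below uses Python's sqlite3 cursor as a stand-in for a T-SQL cursor. The mapping in the comments is approximate, and the employees table is hypothetical. It then computes the same answer with a single set-based statement.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE employees (id INTEGER PRIMARY KEY, salary INTEGER)")
conn.executemany(
    "INSERT INTO employees (salary) VALUES (?)",
    [(1000 + 10 * i,) for i in range(100)],
)

# Row-by-row processing, following the shape of the cursor steps.
salary = None                                   # 1. variable for the fetched value
total_by_cursor = 0
fetches = 0
reader = conn.cursor()                          # 2. roughly DECLARE CURSOR
reader.execute("SELECT salary FROM employees")  # 3. roughly OPEN (runs the SELECT)
row = reader.fetchone()                         # 4. roughly FETCH NEXT ... INTO
while row is not None:
    (salary,) = row
    total_by_cursor += salary
    fetches += 1
    row = reader.fetchone()
reader.close()                                  # 5. roughly CLOSE and DEALLOCATE

# Set-based equivalent: one statement, one result row.
(total_set_based,) = conn.execute("SELECT SUM(salary) FROM employees").fetchone()

print(fetches, total_by_cursor, total_set_based)
```

The loop needs one fetch per row to reach the total that the set-based query returns in a single statement, which is the cost the "Why Cursors Are Slow" list describes.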

Common Scenarios for Cursor-Based Operations

Problem | Description | Solution | Cursor Usage
Complex Logic | Difficult to translate into a set-based solution | Refactor the logic as a data-driven query | Rare
Dynamic Code Iteration | Requires DDL code | Use Transact-SQL cursors | Always
List Denormalization | Converts a vertical list of values to a single comma-delimited horizontal list or string | Use set-based operations, recursion, or XML queries | Sometimes
Crosstab Query Building | Difficult to build by using SQL Server | Use a series of CASE expressions or PIVOT syntax | Never*
Cumulative Totals | Needs to be calculated within SQL Server and written to a table | Use Transact-SQL cursors | Sometimes
Hierarchical Tree Navigation | Needs recursive examination of each node | Use set-based methods that use stored procedures or UDFs | Never

*Constructing a dynamic cross-tab query requires using a cursor to build the columns for the dynamic SQL
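As an illustration of the list denormalization scenario, the following sketch flattens a vertical list into one comma-delimited string per group without a cursor. It assumes SQLite through Python's sqlite3 module, and group_concat is SQLite's aggregate, used here only as a stand-in for the set-based or XML techniques the table mentions. The order_items table is hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE order_items (order_id INTEGER, product TEXT);
    INSERT INTO order_items VALUES
        (1, 'bolt'), (1, 'nut'), (1, 'washer'), (2, 'gear');
""")

# One output row per order, with its products collapsed into a single string.
rows = conn.execute("""
    SELECT order_id, group_concat(product, ',') AS products
    FROM order_items
    GROUP BY order_id
    ORDER BY order_id
""").fetchall()

# The order of items inside each string is not guaranteed, so sort before comparing.
flattened = {order_id: sorted(products.split(",")) for order_id, products in rows}

print(rows)
```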


Demonstration: How To Refactor a Cursor

In this demonstration, you will see how to:

Refactor a cursor


Discussion: Using Cursors

• List some of the disadvantages of using a cursor.

• What is the major issue with using a cursor in modern relational databases?

• What kind of a problem is best solved by using a cursor?

• Discuss your own experiences with cursors.

Guidelines for Using Result Set-Based Operations

• Use queries that affect groups of rows rather than one row at a time
• Avoid making inline calls to scalar UDFs in large result sets
• Limit query cardinality as early as possible
• Use result sets instead of cursor-based processes to minimize I/O
• Minimize the use of conditional branches inside queries

Selecting Appropriate Server-Side Cursors

Server-Side Cursors

• Static Cursor
• Forward-Only Cursor
• Keyset-Driven Cursor
• Dynamic Cursor

Selecting Appropriate Client-Side Cursors

Considerations for Using Client-Side Cursors

• Network latency. Client cursors use more network resources
• Additional cursor types. Client cursors support only limited functionality
• Positioned updates. Client-side cursors will not reflect database changes until the changes are synchronized with the database
• Memory usage. The client computer should have enough memory to handle the size of the entire result set

Client Data Access Libraries That Support Client-Side Cursors

• ODBC
• ADO
• ADO.NET (SqlClient)
• OLE DB


Lesson 3: Extending Set-Based Operations

• What Are Common Table Expressions?

• Comparing CTE with Other SQL Tuning Techniques

• Demonstration: How To Use a CTE

• Discussion: Using Common Table Expressions

• Demonstration: How To Perform Recursive Queries with CTE

• Discussion: Recursion with CTEs

• Introduction to Ranking Functions

• Demonstration: How To Use Ranking Functions To Rank Rows

• What Are PIVOT and UNPIVOT Operators?

• Demonstration: How To Use PIVOT and UNPIVOT Options To Convert Data

What Are Common Table Expressions?

A CTE is a named temporary result set based on a regular SELECT query. The following table describes the syntax parameters for a CTE.

Parameter | Description
expression_name | Is the name that the query using the CTE references; can be any valid identifier
column_name | Specifies the name of a column for the CTE; is taken from the result set if no column_name parameters are specified
CTE_query_definition | Specifies the SELECT statement that forms the result set; is followed by a SELECT, INSERT, UPDATE, or DELETE query
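A minimal sketch of these parameters in use. It assumes SQLite through Python's sqlite3 module as a stand-in for SQL Server. The sales table and the name rep_totals are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE sales (rep TEXT, amount INTEGER);
    INSERT INTO sales VALUES
        ('ann', 100), ('ann', 250), ('raj', 80), ('raj', 90), ('lee', 400);
""")

top_reps = conn.execute("""
    WITH rep_totals (rep, total) AS (   -- expression_name (column_name, ...)
        SELECT rep, SUM(amount)         -- CTE_query_definition
        FROM sales
        GROUP BY rep
    )
    SELECT rep, total                   -- the statement that follows the CTE
    FROM rep_totals
    WHERE total > (SELECT AVG(total) FROM rep_totals)
    ORDER BY total DESC
""").fetchall()

print(top_reps)
```

The statement that follows the definition refers to rep_totals twice, once as the main source and once to compute the average, without repeating the aggregation query.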

Comparing CTE with Other SQL Tuning Techniques

CTE vs Temporary Table

• A CTE does not store data anywhere until you actually execute it, whereas the data in a temporary table is stored in the tempdb database
• You must reference a CTE in the statement that immediately follows its definition, whereas you can reference a temporary table repeatedly across statements
• COMPUTE, ORDER BY (without a TOP), INTO, OPTION, FOR XML, and FOR BROWSE are not allowed in a CTE, whereas these options are supported with a temporary table

CTE vs Subquery

• A CTE is defined once and can be referenced several times within the same statement, whereas a subquery must be repeated at every place where its result set is needed
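The scoping difference between a CTE and a temporary table can be sketched as follows. This assumes SQLite through Python's sqlite3 module, so CREATE TEMP TABLE ... AS stands in for a SQL Server #temp table stored in tempdb. The table names are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE sales (rep TEXT, amount INTEGER);
    INSERT INTO sales VALUES ('ann', 350), ('raj', 170), ('lee', 400);
""")

# Temporary table: materialized once, then reusable by later statements.
conn.execute("""
    CREATE TEMP TABLE rep_totals_tmp AS
    SELECT rep, SUM(amount) AS total FROM sales GROUP BY rep
""")
first_use = conn.execute("SELECT COUNT(*) FROM rep_totals_tmp").fetchone()[0]
second_use = conn.execute("SELECT MAX(total) FROM rep_totals_tmp").fetchone()[0]

# CTE: exists only for the single statement that defines it.
cte_use = conn.execute("""
    WITH rep_totals AS (
        SELECT rep, SUM(amount) AS total FROM sales GROUP BY rep
    )
    SELECT MAX(total) FROM rep_totals
""").fetchone()[0]

try:
    conn.execute("SELECT MAX(total) FROM rep_totals")
    cte_visible_later = True
except sqlite3.OperationalError:
    # The name rep_totals is unknown once its defining statement has finished.
    cte_visible_later = False

print(first_use, second_use, cte_use, cte_visible_later)
```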


Demonstration: How To Use a CTE

In this demonstration, you will see how to:

Create and use a CTE


Discussion: Using Common Table Expressions

• How does a CTE differ from a #Temp table?

• Can you execute two or more queries against a CTE?

• How does a CTE differ from a derived table?

• Can you build indexes or constraints on a CTE?


Demonstration: How To Perform Recursive Queries with CTEs

In this demonstration, you will see how to:

Perform recursive queries with CTEs
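The demonstration itself is not part of these slides. As a stand-in, here is a small sketch of a recursive CTE that walks a hypothetical employee hierarchy. It assumes SQLite through Python's sqlite3 module. SQLite accepts the RECURSIVE keyword, whereas Transact-SQL writes a recursive CTE with plain WITH.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE employees (id INTEGER PRIMARY KEY, name TEXT, manager_id INTEGER);
    INSERT INTO employees VALUES
        (1, 'Dana', NULL),
        (2, 'Eli',  1),
        (3, 'Fay',  1),
        (4, 'Gus',  2),
        (5, 'Hal',  4);
""")

org_chart = conn.execute("""
    WITH RECURSIVE reports (id, name, level) AS (
        -- Anchor member: the employee with no manager.
        SELECT id, name, 0
        FROM employees
        WHERE manager_id IS NULL
        UNION ALL
        -- Recursive member: direct reports of the rows found so far.
        SELECT e.id, e.name, r.level + 1
        FROM employees AS e
        JOIN reports AS r ON e.manager_id = r.id
    )
    SELECT name, level
    FROM reports
    ORDER BY level, name
""").fetchall()

print(org_chart)
```

Each pass of the recursive member adds one more level of the tree, and recursion stops when a pass finds no new rows.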


Discussion: Recursion with CTEs

• What is the maximum number of recursive levels in a common table expression (CTE)?

• What is the default number of recursions in a recursive common table expression?

• Assuming that each recursion adds only one row to the results, how many rows will be returned with OPTION (MAXRECURSION 100)? Select an option from the following:

  • 99
  • 100
  • 101

Introduction to Ranking Functions

Ranking functions return a ranking value for each row in a partition

Function | Description
RANK | Returns the rank of each row within the partition of a result set
NTILE | Distributes the rows in an ordered partition into a specified number of groups
DENSE_RANK | Returns the rank of rows within the partition of a result set, without any gaps in the ranking
ROW_NUMBER | Returns the sequential number of a row within a partition of a result set, starting at 1 for the first row in each partition
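The four functions can be compared side by side on a tie. The sketch below assumes SQLite 3.25 or later through Python's sqlite3 module as a stand-in for SQL Server. The scores table is hypothetical, and with no PARTITION BY clause the whole result set is treated as one partition.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE scores (player TEXT, points INTEGER);
    INSERT INTO scores VALUES ('a', 50), ('b', 40), ('c', 40), ('d', 30);
""")

ranked = conn.execute("""
    SELECT player,
           points,
           ROW_NUMBER() OVER (ORDER BY points DESC, player) AS row_num,
           RANK()       OVER (ORDER BY points DESC)         AS rnk,
           DENSE_RANK() OVER (ORDER BY points DESC)         AS dense_rnk,
           NTILE(2)     OVER (ORDER BY points DESC, player) AS half
    FROM scores
    ORDER BY points DESC, player
""").fetchall()

for row in ranked:
    print(row)
```

Players b and c tie on 40 points. RANK gives both 2 and then skips to 4, DENSE_RANK gives both 2 and continues with 3, ROW_NUMBER still numbers every row uniquely, and NTILE(2) splits the four rows into two groups of two.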


Demonstration: How To Use Ranking Functions To Rank Rows

In this demonstration, you will see how to:

Use Ranking Functions to rank rows

What Are PIVOT and UNPIVOT Operators?

PIVOT is used to generate crosstab queries in which values are converted to column headers. UNPIVOT is used to convert column headers to values. The following table describes the parameters in the PIVOT and UNPIVOT syntax.

Parameter | Description
table_source | Is the name of the table that you need to pivot
aggregate_function | Is a system or user-defined aggregate function that applies to the specified value_column
pivot_column | Is the source column that provides the values for the new crosstab column
column_list | Is a list of values of pivot_column to display as the crosstab column headers
table_alias | Is the name of the resulting result set
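SQLite has no PIVOT or UNPIVOT operator, so the sketch below shows the equivalent transformations by hand: a series of CASE expressions for the crosstab (the alternative named in the cursor scenarios table) and UNION ALL for the reverse direction. It assumes Python's sqlite3 module as the engine, the sales table is hypothetical, and it is not the Transact-SQL PIVOT syntax itself.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE sales (region TEXT, quarter TEXT, amount INTEGER);
    INSERT INTO sales VALUES
        ('east', 'Q1', 10), ('east', 'Q2', 20),
        ('west', 'Q1', 5),  ('west', 'Q1', 7), ('west', 'Q2', 1);
""")

# Pivot by hand: quarter plays the role of pivot_column, SUM of aggregate_function,
# and the hard-coded 'Q1', 'Q2' values the role of column_list.
crosstab_sql = """
    SELECT region,
           SUM(CASE WHEN quarter = 'Q1' THEN amount ELSE 0 END) AS q1,
           SUM(CASE WHEN quarter = 'Q2' THEN amount ELSE 0 END) AS q2
    FROM sales
    GROUP BY region
"""
crosstab = conn.execute(crosstab_sql + " ORDER BY region").fetchall()

# Unpivot by hand: turn the q1 and q2 column headers back into row values.
unpivoted = conn.execute(
    "WITH crosstab AS (" + crosstab_sql + ") "
    "SELECT region, 'Q1' AS quarter, q1 AS amount FROM crosstab "
    "UNION ALL "
    "SELECT region, 'Q2', q2 FROM crosstab "
    "ORDER BY region, quarter"
).fetchall()

print(crosstab)
print(unpivoted)
```

The unpivoted rows hold the aggregated amounts, not the original detail rows: the two west Q1 sales come back as a single row of 12, because pivoting with an aggregate discards the detail.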


Demonstration: How To Use PIVOT and UNPIVOT Options To Convert Data

In this demonstration, you will see how to:

Use PIVOT and UNPIVOT options to convert data


Lab 7: Designing Queries for Optimal Performance

• Exercise 1: Optimizing Query Performance

• Exercise 2: Refactoring Cursors into Queries

Estimated time: 60 minutes

Logon Information

Virtual machine | NYC-SQL1
User name | Administrator
Password | Pa$$w0rd


Lab Scenario

You are a lead database designer at QuantamCorp. You are working on the Human Resources Vacation and Sick Leave Enhancement (HR VASE) project that is designed to enhance the current HR system of your organization. This system is based on the QuantamCorp sample database in SQL Server 2008.

The main goals of the HR VASE project are as follows:

• Provide managers with current and historical information about employee vacation and sick leave.

• Grant view rights to individual employees to view their vacation and sick leave balances.

• Provide permission to selected employees in the HR department to view and update the vacation and sick leave details of employees.

• Grant the HR manager view and update rights to all the data.

You are working on a project to integrate HR VASE with an intranet site that is used to send email broadcasts to external people. The details of email recipients are loaded from QuantamCorp HR VASE into the system named Baldwin2.

Recently, a number of functions in Baldwin2 have received many complaints about their performance. You are assigned to help fine-tune the performance of the SQL used by those functions.


Module Review and Takeaways

• Review Questions

• Real-World Issues and Scenarios