How to Use CTE

A Simple Common Table Expression Example

Before we dive into the syntax or gritty details of CTEs, let's start by looking at a simple example. I think you'll agree that even without knowing the syntax of CTEs, they are pretty readable and straightforward (the hallmark of a well-designed programming language construct).

    WITH ProductAndCategoryNamesOverTenDollars (ProductName, CategoryName, UnitPrice) AS
    (
        SELECT
            p.ProductName,
            c.CategoryName,
            p.UnitPrice
        FROM Products p
            INNER JOIN Categories c ON
                c.CategoryID = p.CategoryID
        WHERE p.UnitPrice > 10.0
    )
    SELECT *
    FROM ProductAndCategoryNamesOverTenDollars
    ORDER BY CategoryName ASC, UnitPrice ASC, ProductName ASC

This query creates a CTE named ProductAndCategoryNamesOverTenDollars that returns the name, category name, and price of those products whose unit price exceeds $10.00. Once the CTE has been defined, it must then immediately be used in a query. The query treats the CTE as if it were a view or table in the system, returning the three fields defined by the CTE (ProductName, CategoryName, and UnitPrice), ordered alphabetically by category, then by price, and then alphabetically by product name.

In short, a Common Table Expression allows us to define a temporary, view-like construct. We start by (optionally) specifying the columns it returns, then define the query. Following that, the CTE can be used in a SELECT, INSERT, UPDATE, or DELETE statement.

Common Table Expression Syntax

A Common Table Expression contains three core parts:

* The CTE name (this is what follows the WITH keyword)
* The column list (optional)
* The query (appears within parentheses after the AS keyword)

The query using the CTE must be the first query appearing after the CTE.
That is, you cannot do the following:

    WITH ProductAndCategoryNamesOverTenDollars (ProductName, CategoryName, UnitPrice) AS
    (
        SELECT
            p.ProductName,
            c.CategoryName,
            p.UnitPrice
        FROM Products p
            INNER JOIN Categories c ON
                c.CategoryID = p.CategoryID
        WHERE p.UnitPrice > 10.0
    )
    SELECT *
    FROM Products

    SELECT *
    FROM ProductAndCategoryNamesOverTenDollars
    ORDER BY CategoryName ASC, UnitPrice ASC, ProductName ASC

The ProductAndCategoryNamesOverTenDollars CTE only applies to the first query following it. So when the second query is reached, ProductAndCategoryNamesOverTenDollars is undefined, resulting in an "Invalid object name 'ProductAndCategoryNamesOverTenDollars'" error message.

You can, however, define multiple CTEs after the WITH keyword by separating each CTE with a comma. For example, the following query uses two CTEs. The subsequent SELECT query then uses an INNER JOIN to match together the records from the two CTEs:

    WITH CategoryAndNumberOfProducts (CategoryID, CategoryName, NumberOfProducts) AS
    (
        SELECT
            CategoryID,
            CategoryName,
            (SELECT COUNT(1) FROM Products p WHERE p.CategoryID = c.CategoryID) as NumberOfProducts
        FROM Categories c
    ),
    ProductsOverTenDollars (ProductID, CategoryID, ProductName, UnitPrice) AS
    (
        SELECT
            ProductID,
            CategoryID,
            ProductName,
            UnitPrice
        FROM Products p
        WHERE UnitPrice > 10.0
    )
    SELECT c.CategoryName, c.NumberOfProducts,
        p.ProductName, p.UnitPrice
    FROM ProductsOverTenDollars p
        INNER JOIN CategoryAndNumberOfProducts c ON
            p.CategoryID = c.CategoryID
    ORDER BY ProductName

Unlike a derived table, CTEs can be defined just once, yet appear multiple times in the subsequent query. To demonstrate this, consider the following example: the Northwind database's Employees table contains an optional ReportsTo column that, if specified, indicates the employee's manager. ReportsTo is a self-referencing foreign key, meaning that, if provided, it refers back to another EmployeeID in the Employees table. Imagine that we wanted to display a list of employees including how many other employees they directly managed.
This could be done using a simple, CTE-free SELECT statement, but let's use a CTE for now (for reasons which will become clear soon):

    WITH EmployeeSubordinatesReport (EmployeeID, LastName, FirstName, NumberOfSubordinates, ReportsTo) AS
    (
        SELECT
            EmployeeID,
            LastName,
            FirstName,
            (SELECT COUNT(1) FROM Employees e2 WHERE e2.ReportsTo = e.EmployeeID) as NumberOfSubordinates,
            ReportsTo
        FROM Employees e
    )
    SELECT LastName, FirstName, NumberOfSubordinates
    FROM EmployeeSubordinatesReport

This query will return the employees' records, showing each employee's last and first name along with how many other employees they manage. As the figure below shows, only Andrew Fuller and Steven Buchanan are manager material.

Now, imagine that our boss (Andrew Fuller, perhaps) comes charging into our office and demands that the report also list each employee's manager's name and number of subordinates (if the employee has a manager, that is - Mr. Fuller is all too quick to point out that he reports to no one). Adding such functionality is a snap with the CTE - just add it in a LEFT JOIN!

    WITH EmployeeSubordinatesReport (EmployeeID, LastName, FirstName, NumberOfSubordinates, ReportsTo) AS
    (
        SELECT
            EmployeeID,
            LastName,
            FirstName,
            (SELECT COUNT(1) FROM Employees e2 WHERE e2.ReportsTo = e.EmployeeID) as NumberOfSubordinates,
            ReportsTo
        FROM Employees e
    )
    SELECT Employee.LastName, Employee.FirstName, Employee.NumberOfSubordinates,
        Manager.LastName as ManagerLastName, Manager.FirstName as ManagerFirstName, Manager.NumberOfSubordinates as ManagerNumberOfSubordinates
    FROM EmployeeSubordinatesReport Employee
        LEFT JOIN EmployeeSubordinatesReport Manager ON
            Employee.ReportsTo = Manager.EmployeeID

With this additional LEFT JOIN, the employee's manager's results are returned; if there's no manager for the employee, NULLs are returned instead.

When to Use Common Table Expressions

Common Table Expressions offer the same functionality as a view, but are ideal for one-off usages where you don't necessarily need a view defined for the system.
Even when a CTE is not necessarily needed (as when listing just the employees and their subordinate count in the example above), it can improve readability. In Using Common Table Expressions, Microsoft offers the following four advantages of CTEs:

* Create a recursive query.
* Substitute for a view when the general use of a view is not required; that is, you do not have to store the definition in metadata.
* Enable grouping by a column that is derived from a scalar subselect, or a function that is either not deterministic or has external access.
* Reference the resulting table multiple times in the same statement.

Using a CTE offers the advantages of improved readability and ease in maintenance of complex queries. The query can be divided into separate, simple, logical building blocks. These simple blocks can then be used to build more complex, interim CTEs until the final result set is generated.

Scalar subqueries (such as the (SELECT COUNT(1) FROM ...) examples we've looked at in this article) cannot be grouped or filtered directly in the containing query. Similarly, when using SQL Server 2005's ranking functions - ROW_NUMBER(), RANK(), DENSE_RANK(), and so on - the containing query cannot include a filter or grouping expression to return only a subset of the ranked results. For both of these instances, CTEs are quite handy. (For more on SQL Server 2005's ranking capabilities, be sure to read: Returning Ranked Results with Microsoft SQL Server 2005.)

CTEs can also be used to recursively enumerate hierarchical data. We'll examine this next!

Recursive Common Table Expressions

Recursion is the process of defining a solution to a problem in terms of itself. For example, a teacher needs to sort a stack of tests alphabetically by the students' names. She could process the tests one at a time and, for each test, insert it into the appropriate spot to the left (called insertion sort), probably the way most people sort a hand of cards (at least that's the way I do it).
However, depending on the distribution of the tests, the size of the work space, the number of tests to sort, and so on, it may be far more efficient to break down the problem into pieces. Rather than doing an insertion sort right off the bat, it might first make sense to divide the stack of papers in half, and then do an insertion sort on one half, an insertion sort on the second half, and then a merge of the two piles. Or perhaps it would make sense to divide the tests into four piles, or eight piles. (This approach is referred to as merge sort.)

With a recursive solution you will always have the following two pieces:

* The base case - what to do when you're done recursing. After dividing the tests into separate piles of, say, eight elements per pile, the base case is to sort these piles via insertion sort.
* The recursive step - the action to perform that involves plugging the input "back into" the system. For merge sort, the recursive step is the division of one pile into two. Then into four. Then into eight, and so on, until the base case is reached.

For more on recursion, see Recursion, Why It's Cool.

Returning to CTEs... the Employees database table holds the corporate hierarchy within its rows. Imagine that good ol' Andrew Fuller has come back and insisted on a report that would list all persons in the company along with their position in the hierarchy. Since the Employees table can capture an arbitrary number of hierarchy levels, we need a recursive solution. Enter CTEs...

Like any recursive definition, a recursive Common Table Expression requires both a base case and the recursive step. In SQL parlance, this translates into two SQL queries - one that gets the "initial" data UNIONed with one that performs the recursion. For the Employees example, the base case is returning those employees that have no manager:

    SELECT ...
    FROM Employees
    WHERE ReportsTo IS NULL

The recursion includes a query on the CTE itself.
The following shows the CTE - with both the base case and recursive step - along with a SELECT query that returns the rows from the CTE:

    WITH EmployeeHierarchy (EmployeeID, LastName, FirstName, ReportsTo, HierarchyLevel) AS
    (
        -- Base case
        SELECT
            EmployeeID,
            LastName,
            FirstName,
            ReportsTo,
            1 as HierarchyLevel
        FROM Employees
        WHERE ReportsTo IS NULL

        UNION ALL

        -- Recursive step
        SELECT
            e.EmployeeID,
            e.LastName,
            e.FirstName,
            e.ReportsTo,
            eh.HierarchyLevel + 1 AS HierarchyLevel
        FROM Employees e
            INNER JOIN EmployeeHierarchy eh ON
                e.ReportsTo = eh.EmployeeID
    )
    SELECT *
    FROM EmployeeHierarchy
    ORDER BY HierarchyLevel, LastName, FirstName

The recursion occurs in the second query in the CTE by joining the results of Employees against the CTE itself (EmployeeHierarchy) where the employee's ReportsTo field matches up to the CTE's EmployeeID. Included in this query is the HierarchyLevel field, which returns 1 for the base case and one greater than the previous level for each recursive step down the hierarchy. As requested, this result set clearly shows that Mr. Fuller is the alpha male in this organization. Furthermore, we can see that Steven, Laura, Nancy, Janet, and Margaret make up the second tier in the organizational hierarchy, while poor Anne, Robert, and Michael are down at the bottom.

Alternatives to Recursive Common Table Expressions

As we saw in this article, enumerating hierarchical data recursively can be accomplished via CTEs (for more on using recursive CTEs, don't forget to check out the official documentation - Recursive Queries Using Common Table Expressions). However, there are other options as well. One choice is to perform the recursion at the ASP/ASP.NET layer. That is, read in all employee information to a Recordset or DataSet in code, and then recurse there. My article Efficiently Displaying Parent-Child Data discusses this approach.

If you need to perform the recursion in SQL, you can use recursive stored procedures, as discussed in The Zen of Recursion.
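
The first alternative above, reading all employee rows into application code and recursing there, can be sketched roughly as follows. This is a minimal illustration in Python rather than ASP/ASP.NET. The sample rows are hypothetical stand-ins for the Employees table, chosen to match the hierarchy described in this article, and the function name assign_levels is an assumption of this sketch, not something taken from the referenced articles.

```python
from collections import defaultdict

# Hypothetical sample rows standing in for the Employees table:
# (EmployeeID, LastName, FirstName, ReportsTo); None plays the role of NULL.
EMPLOYEES = [
    (1, "Davolio", "Nancy", 2),
    (2, "Fuller", "Andrew", None),
    (3, "Leverling", "Janet", 2),
    (4, "Peacock", "Margaret", 2),
    (5, "Buchanan", "Steven", 2),
    (6, "Suyama", "Michael", 5),
    (7, "King", "Robert", 5),
    (8, "Callahan", "Laura", 2),
    (9, "Dodsworth", "Anne", 5),
]


def assign_levels(rows):
    """Return (HierarchyLevel, LastName, FirstName) tuples, sorted by level then name."""
    # Index the rows by manager so each recursive call is a dictionary lookup.
    reports = defaultdict(list)
    for emp_id, last, first, manager_id in rows:
        reports[manager_id].append((emp_id, last, first))

    result = []

    def walk(manager_id, level):
        # Recursive step: visit everyone reporting to manager_id, one level deeper.
        # The recursion ends naturally when an employee has no direct reports.
        for emp_id, last, first in reports.get(manager_id, []):
            result.append((level, last, first))
            walk(emp_id, level + 1)

    # Base case: employees with no manager (ReportsTo is NULL) sit at level 1.
    walk(None, 1)
    return sorted(result)


if __name__ == "__main__":
    for level, last, first in assign_levels(EMPLOYEES):
        print(level, last, first)
```

The trade-off is that every row has to travel to the application before any recursion happens, and the sketch assumes the ReportsTo data contains no cycles; a cycle would make walk recurse without end.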
If you are designing a data model that needs to support hierarchical data, your best bet is to bake some lineage information directly into the table from the get-go. See SQL for Threaded Discussions and More Trees & Hierarchies in SQL for more information.

Conclusion

Common Table Expressions (CTEs) are one of the many new features in SQL Server 2005, and they provide a more readable and usable approach to derived tables. Additionally, CTEs may be recursively defined, allowing a recursive entity to be enumerated without the need for recursive stored procedures. For more on the new features found in SQL Server 2005, be sure to also check out Returning Ranked Results with Microsoft SQL Server 2005 and TRY...CATCH in SQL Server 2005.

Happy Programming!

Severity Level 15

Message Text

Incorrect syntax near the keyword '%.*ls'.

Explanation

This error indicates that the syntax of a Transact-SQL statement is incorrect and that the syntax error was detected near the keyword specified in the error message. The most frequent causes for syntax errors are misspellings of Transact-SQL keywords or operators, and specifying the syntax of a Transact-SQL statement in the wrong order.

One of the more complicated causes for this error may be a compatibility level mismatch for the current database. If the current database has a compatibility level other than 70, Microsoft SQL Server will not recognize any of the keywords that a database with a compatibility level of 70 would recognize.

Action

First, check the Transact-SQL statement syntax near the keyword specified in the error message. Because Transact-SQL language syntax can be very complex, SQL Server may incorrectly report the position of the syntax error as later in the Transact-SQL statement than it actually occurred. Second, reexamine the entire Transact-SQL statement that generated the error. Verify the syntax order of the statement. Ensure that the database does not have a compatibility level of 65 and has a compatibility level of 70.