Page 1: Module_4 Grouping and Aggregating Dataod1

04 | Grouping and Aggregating Data

Brian Alderman | MCT, CEO / Founder of MicroTechPoint
Tobias Ternstrom | Microsoft SQL Server Program Manager

Page 2: Module_4 Grouping and Aggregating Dataod1

Course Topics: Querying Microsoft SQL Server 2012 Jump Start

01 | Introducing SQL Server 2012
SQL Server types of statements; other SQL statement elements; basic SELECT statements

02 | Advanced SELECT Statements
DISTINCT, aliases, scalar functions and CASE; using JOIN and MERGE; filtering and sorting data; NULL values

03 | SQL Server Data Types
Introduction to data types, data type usage, converting data types, understanding SQL Server function types

04 | Grouping and Aggregating Data
Aggregate functions; GROUP BY and HAVING clauses; subqueries (self-contained, correlated, and EXISTS); views, inline table-valued functions, and derived tables

| Lunch Break
Eat, drink, and recharge for the afternoon session

Page 3: Module_4 Grouping and Aggregating Dataod1

Module Overview

• Aggregate functions
• GROUP BY and HAVING clauses
• Subqueries (self-contained, correlated, and EXISTS)
• Working with table functions

Page 4: Module_4 Grouping and Aggregating Dataod1

Aggregate Functions

Page 5: Module_4 Grouping and Aggregating Dataod1

Common built-in aggregate functions

Common: SUM, MIN, MAX, AVG, COUNT, COUNT_BIG
Statistical: STDEV, STDEVP, VAR, VARP
Other: CHECKSUM_AGG, GROUPING, GROUPING_ID

Page 6: Module_4 Grouping and Aggregating Dataod1

Working with aggregate functions

Aggregate functions:
• Return a scalar value (with no column name)
• Ignore NULLs, except in COUNT(*)
• Can be used in SELECT, HAVING, and ORDER BY clauses
• Are frequently used with the GROUP BY clause

SELECT COUNT(DISTINCT SalesOrderID) AS UniqueOrders,
       AVG(UnitPrice) AS Avg_UnitPrice,
       MIN(OrderQty) AS Min_OrderQty,
       MAX(LineTotal) AS Max_LineTotal
FROM Sales.SalesOrderDetail;

UniqueOrders Avg_UnitPrice Min_OrderQty Max_LineTotal
------------ ------------- ------------ -------------
31465        465.0934      1            27893.619000
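The NULL-handling rule above can be checked on a tiny data set. The following is a minimal sketch that uses Python's built-in sqlite3 module as a stand-in for SQL Server; the OrderDetail table and its rows are made up for illustration and are not part of the AdventureWorks sample used on the slide.

```python
import sqlite3

# Hypothetical in-memory table standing in for Sales.SalesOrderDetail.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE OrderDetail (SalesOrderID INTEGER, UnitPrice REAL)")
conn.executemany(
    "INSERT INTO OrderDetail VALUES (?, ?)",
    [(1, 10.0), (1, 20.0), (2, None), (3, 30.0)],  # one NULL price
)

row = conn.execute(
    """
    SELECT COUNT(*)                     AS AllRows,      -- counts every row
           COUNT(UnitPrice)             AS PricedRows,   -- skips the NULL
           COUNT(DISTINCT SalesOrderID) AS UniqueOrders,
           AVG(UnitPrice)               AS Avg_UnitPrice -- NULL is not averaged in
    FROM OrderDetail;
    """
).fetchone()
print(row)
```

COUNT(*) sees four rows, while COUNT(UnitPrice) and AVG(UnitPrice) work only on the three non-NULL prices.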

Page 7: Module_4 Grouping and Aggregating Dataod1

Using DISTINCT with aggregate functions

• Use DISTINCT with aggregate functions to summarize only unique values
• DISTINCT aggregates eliminate duplicate values, not rows (unlike SELECT DISTINCT)
• Compare (with partial results):

SELECT SalesPersonID,
       YEAR(OrderDate) AS OrderYear,
       COUNT(CustomerID) AS All_Custs,
       COUNT(DISTINCT CustomerID) AS Unique_Custs
FROM Sales.SalesOrderHeader
GROUP BY SalesPersonID, YEAR(OrderDate);

SalesPersonID OrderYear All_Custs Unique_Custs
------------- --------- --------- ------------
289           2006      84        48
281           2008      52        27
285           2007      9         8
277           2006      140       57
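To see the difference between COUNT(column) and COUNT(DISTINCT column) on data small enough to verify by hand, here is a minimal sketch using Python's sqlite3 module as a stand-in for SQL Server. The OrderHeader table and its rows are hypothetical.

```python
import sqlite3

# Hypothetical stand-in for Sales.SalesOrderHeader with made-up rows.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE OrderHeader (SalesPersonID INTEGER, CustomerID INTEGER)")
conn.executemany(
    "INSERT INTO OrderHeader VALUES (?, ?)",
    [(289, 1), (289, 1), (289, 2), (281, 3)],  # customer 1 ordered twice
)

rows = conn.execute(
    """
    SELECT SalesPersonID,
           COUNT(CustomerID)          AS All_Custs,
           COUNT(DISTINCT CustomerID) AS Unique_Custs
    FROM OrderHeader
    GROUP BY SalesPersonID
    ORDER BY SalesPersonID;
    """
).fetchall()
print(rows)
```

Salesperson 289 has three order rows but only two distinct customers, so the two counts differ for that group.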

Page 8: Module_4 Grouping and Aggregating Dataod1

Using the GROUP BY clause

• GROUP BY creates groups of output rows, one for each unique combination of values specified in the GROUP BY clause
• Aggregate functions in subsequent phases then calculate one summary value per group
• Detail rows are "lost" after the GROUP BY clause is processed

SELECT <select_list>
FROM <table_source>
WHERE <search_condition>
GROUP BY <group_by_list>;

SELECT SalesPersonID, COUNT(*) AS Cnt
FROM Sales.SalesOrderHeader
GROUP BY SalesPersonID;
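One detail worth checking on a small example is how GROUP BY treats NULL grouping values: rows with NULL in the grouping column are collected into a single group of their own. The sketch below uses Python's sqlite3 module as a stand-in for SQL Server, with a hypothetical OrderHeader table and made-up rows.

```python
import sqlite3

# Hypothetical stand-in for Sales.SalesOrderHeader; NULL means "no salesperson".
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE OrderHeader (SalesOrderID INTEGER, SalesPersonID INTEGER)")
conn.executemany(
    "INSERT INTO OrderHeader VALUES (?, ?)",
    [(1, 289), (2, 289), (3, None), (4, None), (5, 281)],
)

rows = conn.execute(
    """
    SELECT SalesPersonID, COUNT(*) AS Cnt
    FROM OrderHeader
    GROUP BY SalesPersonID;
    """
).fetchall()

# One output row per distinct grouping value, including one group for NULL.
counts = dict(rows)
print(counts)
```

Five detail rows collapse into three groups; the individual SalesOrderID values are no longer available after grouping.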

Page 9: Module_4 Grouping and Aggregating Dataod1

Demo: Using aggregate functions

Page 10: Module_4 Grouping and Aggregating Dataod1

GROUP BY and HAVING

Page 11: Module_4 Grouping and Aggregating Dataod1

GROUP BY and logical order of operations

• HAVING, SELECT, and ORDER BY must return a single value per group
• All columns in SELECT, HAVING, and ORDER BY must appear in the GROUP BY clause or be inputs to aggregate expressions
• If a query uses GROUP BY, all subsequent phases operate on the groups, not the source rows

Logical Order Phase    Comments
------------- -------- ------------------
5             SELECT
1             FROM
2             WHERE
3             GROUP BY Creates groups
4             HAVING   Operates on groups
6             ORDER BY

Page 12: Module_4 Grouping and Aggregating Dataod1

Using GROUP BY with aggregate functions

Aggregate functions are commonly used in the SELECT clause to summarize per group:

SELECT CustomerID, COUNT(*) AS cnt
FROM Sales.SalesOrderHeader
GROUP BY CustomerID;

Aggregate functions may refer to any columns, not just those in the GROUP BY clause:

SELECT ProductID, MAX(OrderQty) AS largest_order
FROM Sales.SalesOrderDetail
GROUP BY ProductID;

Page 13: Module_4 Grouping and Aggregating Dataod1

Filtering grouped data using the HAVING clause

• The HAVING clause provides a search condition that each group must satisfy
• The HAVING clause is processed after GROUP BY

SELECT CustomerID, COUNT(*) AS Count_Orders
FROM Sales.SalesOrderHeader
GROUP BY CustomerID
HAVING COUNT(*) > 10;

Page 14: Module_4 Grouping and Aggregating Dataod1

Compare HAVING to WHERE clauses

• WHERE filters rows before groups are created; it controls which rows are placed into groups
• HAVING filters groups; it controls which groups are passed to the next logical phase
• Using a COUNT(*) expression in the HAVING clause is useful for solving common business problems

Show only customers that have placed more than one order:

SELECT Cust.CustomerID, COUNT(*) AS cnt
FROM Sales.Customer AS Cust
JOIN Sales.SalesOrderHeader AS Ord
    ON Cust.CustomerID = Ord.CustomerID
GROUP BY Cust.CustomerID
HAVING COUNT(*) > 1;

Show only products that appear on 10 or more orders:

SELECT Prod.ProductID, COUNT(*) AS cnt
FROM Production.Product AS Prod
JOIN Sales.SalesOrderDetail AS Ord
    ON Prod.ProductID = Ord.ProductID
GROUP BY Prod.ProductID
HAVING COUNT(*) >= 10;
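The WHERE versus HAVING distinction is easiest to see when both appear in one query. The following sketch uses Python's sqlite3 module as a stand-in for SQL Server; the Orders table, its TotalDue values, and the threshold of 50 are assumptions chosen for illustration.

```python
import sqlite3

# Hypothetical order table with made-up amounts.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE Orders (CustomerID INTEGER, TotalDue REAL)")
conn.executemany(
    "INSERT INTO Orders VALUES (?, ?)",
    [(1, 100.0), (1, 200.0), (1, 50.0), (2, 500.0), (3, 20.0), (3, 30.0)],
)

# HAVING alone: keep groups (customers) with more than one order.
having_only = conn.execute(
    """
    SELECT CustomerID, COUNT(*) AS cnt
    FROM Orders
    GROUP BY CustomerID
    HAVING COUNT(*) > 1
    ORDER BY CustomerID;
    """
).fetchall()

# WHERE first removes small orders row by row; HAVING then filters
# the groups built from the rows that survived.
where_then_having = conn.execute(
    """
    SELECT CustomerID, COUNT(*) AS cnt
    FROM Orders
    WHERE TotalDue >= 50
    GROUP BY CustomerID
    HAVING COUNT(*) > 1
    ORDER BY CustomerID;
    """
).fetchall()

print(having_only)
print(where_then_having)
```

Customer 3 qualifies in the first query but disappears from the second, because both of that customer's rows are removed by WHERE before any group is formed.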

Page 15: Module_4 Grouping and Aggregating Dataod1

Demo: Using GROUP BY and HAVING

Page 16: Module_4 Grouping and Aggregating Dataod1

Subqueries

Page 17: Module_4 Grouping and Aggregating Dataod1

Working with subqueries

• Subqueries are nested queries, or queries within queries
• Results from the inner query are passed to the outer query
• The inner query acts like an expression from the perspective of the outer query
• Subqueries can be self-contained or correlated:
    Self-contained subqueries have no dependency on the outer query
    Correlated subqueries depend on values from the outer query
• Subqueries can be scalar, multi-valued, or table-valued

Page 18: Module_4 Grouping and Aggregating Dataod1

Writing scalar subqueries

• A scalar subquery returns a single value to the outer query
• It can be used anywhere a single-valued expression can be used: SELECT, WHERE, etc.
• If the inner query returns an empty set, the result is converted to NULL
• Construction of the outer query determines whether the inner query must return a single value

SELECT SalesOrderID, ProductID, UnitPrice, OrderQty
FROM Sales.SalesOrderDetail
WHERE SalesOrderID =
    (SELECT MAX(SalesOrderID) AS LastOrder
     FROM Sales.SalesOrderHeader);
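Both behaviours described above (a single value handed to the outer query, and an empty inner result becoming NULL) can be sketched with Python's sqlite3 module as a stand-in for SQL Server. The table names and rows are hypothetical.

```python
import sqlite3

# Hypothetical stand-ins for Sales.SalesOrderHeader / Sales.SalesOrderDetail.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE OrderHeader (SalesOrderID INTEGER)")
conn.execute("CREATE TABLE OrderDetail (SalesOrderID INTEGER, ProductID INTEGER)")
conn.executemany("INSERT INTO OrderHeader VALUES (?)", [(1,), (2,), (3,)])
conn.executemany(
    "INSERT INTO OrderDetail VALUES (?, ?)", [(1, 707), (3, 708), (3, 709)]
)

# Scalar subquery in WHERE: detail lines of the most recent order.
last_order_lines = conn.execute(
    """
    SELECT SalesOrderID, ProductID
    FROM OrderDetail
    WHERE SalesOrderID = (SELECT MAX(SalesOrderID) FROM OrderHeader)
    ORDER BY ProductID;
    """
).fetchall()

# Scalar subquery over an empty set: the value seen by the outer query is NULL.
empty_result = conn.execute(
    "SELECT (SELECT SalesOrderID FROM OrderHeader WHERE SalesOrderID > 999);"
).fetchone()

print(last_order_lines)
print(empty_result)
```

One difference to keep in mind when moving back to SQL Server: a subquery used with = that returns more than one row raises an error there, whereas SQLite silently uses the first row.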

Page 19: Module_4 Grouping and Aggregating Dataod1

Writing multi-valued subqueries

• A multi-valued subquery returns multiple values as a single-column set to the outer query
• It is used with the IN predicate
• If any value in the subquery result matches the IN predicate expression, the predicate returns TRUE
• It may also be expressed as a JOIN (test both for performance)

SELECT CustomerID, SalesOrderID, TerritoryID
FROM Sales.SalesOrderHeader
WHERE CustomerID IN
    (SELECT CustomerID
     FROM Sales.Customer
     WHERE TerritoryID = 10);
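The remark that an IN subquery may also be written as a JOIN can be illustrated as follows. This is a sketch using Python's sqlite3 module as a stand-in for SQL Server, with hypothetical tables and rows; the two forms return the same rows here because CustomerID is unique in the Customer table.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE Customer (CustomerID INTEGER PRIMARY KEY, TerritoryID INTEGER)")
conn.execute("CREATE TABLE OrderHeader (SalesOrderID INTEGER, CustomerID INTEGER)")
conn.executemany("INSERT INTO Customer VALUES (?, ?)", [(1, 10), (2, 10), (3, 4)])
conn.executemany(
    "INSERT INTO OrderHeader VALUES (?, ?)",
    [(100, 1), (101, 1), (102, 3), (103, 2)],
)

# Multi-valued subquery with IN.
in_rows = conn.execute(
    """
    SELECT SalesOrderID
    FROM OrderHeader
    WHERE CustomerID IN (SELECT CustomerID FROM Customer WHERE TerritoryID = 10)
    ORDER BY SalesOrderID;
    """
).fetchall()

# The same question expressed as a JOIN.
join_rows = conn.execute(
    """
    SELECT Ord.SalesOrderID
    FROM OrderHeader AS Ord
    JOIN Customer AS Cust ON Cust.CustomerID = Ord.CustomerID
    WHERE Cust.TerritoryID = 10
    ORDER BY Ord.SalesOrderID;
    """
).fetchall()

print(in_rows, join_rows)
```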

Page 20: Module_4 Grouping and Aggregating Dataod1

Writing queries using EXISTS with subqueries

• The keyword EXISTS does not follow a column name or other expression
• The SELECT list of a subquery introduced by EXISTS typically uses only an asterisk (*)

Customers that have placed at least one order:

SELECT CustomerID, PersonID
FROM Sales.Customer AS Cust
WHERE EXISTS
    (SELECT *
     FROM Sales.SalesOrderHeader AS Ord
     WHERE Cust.CustomerID = Ord.CustomerID);

Customers that have never placed an order:

SELECT CustomerID, PersonID
FROM Sales.Customer AS Cust
WHERE NOT EXISTS
    (SELECT *
     FROM Sales.SalesOrderHeader AS Ord
     WHERE Cust.CustomerID = Ord.CustomerID);
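The EXISTS and NOT EXISTS queries above split the customer table into two complementary sets. A minimal sketch, again using Python's sqlite3 module as a stand-in for SQL Server and hypothetical tables with made-up rows:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE Customer (CustomerID INTEGER PRIMARY KEY)")
conn.execute("CREATE TABLE OrderHeader (SalesOrderID INTEGER, CustomerID INTEGER)")
conn.executemany("INSERT INTO Customer VALUES (?)", [(1,), (2,), (3,)])
conn.executemany(
    "INSERT INTO OrderHeader VALUES (?, ?)", [(100, 1), (101, 1), (102, 3)]
)

# Correlated subquery: evaluated against the current outer row's CustomerID.
query = """
    SELECT CustomerID
    FROM Customer AS Cust
    WHERE {predicate} (SELECT *
                       FROM OrderHeader AS Ord
                       WHERE Cust.CustomerID = Ord.CustomerID)
    ORDER BY CustomerID;
"""
with_orders = conn.execute(query.format(predicate="EXISTS")).fetchall()
without_orders = conn.execute(query.format(predicate="NOT EXISTS")).fetchall()

print(with_orders, without_orders)
```

Customer 1 appears only once in the EXISTS result even though it has two orders, because EXISTS tests for the presence of a match rather than returning the matching rows.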

Page 21: Module_4 Grouping and Aggregating Dataod1

Demo: Using subqueries

Page 22: Module_4 Grouping and Aggregating Dataod1

Table Functions

Page 23: Module_4 Grouping and Aggregating Dataod1

Creating simple views

• Views are saved queries created in a database by administrators and developers
• Views are defined with a single SELECT statement
• ORDER BY is not permitted in a view definition without the use of TOP, OFFSET/FETCH, or FOR XML
• To sort the output, use ORDER BY in the outer query
• View creation supports additional options beyond the scope of this class

CREATE VIEW HumanResources.EmployeeList
AS
SELECT BusinessEntityID, JobTitle, HireDate, VacationHours
FROM HumanResources.Employee;

SELECT * FROM HumanResources.EmployeeList;
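As a small end-to-end illustration of defining a view and sorting in the outer query, here is a sketch using Python's sqlite3 module as a stand-in for SQL Server. The Employee table is a hypothetical, cut-down version of HumanResources.Employee; its rows and the extra NationalIDNumber column are made up.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE Employee (BusinessEntityID INTEGER, JobTitle TEXT, "
    "HireDate TEXT, VacationHours INTEGER, NationalIDNumber TEXT)"
)
conn.executemany(
    "INSERT INTO Employee VALUES (?, ?, ?, ?, ?)",
    [
        (1, "Chief Executive Officer", "2009-01-14", 99, "A-1"),
        (2, "Design Engineer", "2007-11-11", 2, "B-2"),
    ],
)

# The view stores only the query definition, not a copy of the data.
conn.execute(
    """
    CREATE VIEW EmployeeList AS
    SELECT BusinessEntityID, JobTitle, HireDate, VacationHours
    FROM Employee;
    """
)

# Sorting belongs in the outer query, not in the view definition.
cur = conn.execute("SELECT * FROM EmployeeList ORDER BY HireDate;")
columns = [d[0] for d in cur.description]
rows = cur.fetchall()
print(columns)
print(rows)
```

The view exposes only the four listed columns, so callers selecting from it never see NationalIDNumber.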

Page 24: Module_4 Grouping and Aggregating Dataod1

Creating simple inline table-valued functions

• Table-valued functions are created by administrators and developers
• Create and name the function and its optional parameters with CREATE FUNCTION
• Declare the return type as TABLE
• Define an inline SELECT statement following RETURN

CREATE FUNCTION Sales.fn_LineTotal (@SalesOrderID INT)
RETURNS TABLE
AS
RETURN
    SELECT SalesOrderID,
           CAST((OrderQty * UnitPrice * (1 - UnitPriceDiscount)) AS DECIMAL(8, 2)) AS LineTotal
    FROM Sales.SalesOrderDetail
    WHERE SalesOrderID = @SalesOrderID;
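SQLite has no inline table-valued functions, so the sketch below is only a rough analogue: a Python function that wraps a parameterized query and returns a set of rows for the argument it is given. The OrderDetail table, the Discount column (a fractional discount), and the line_totals helper are all hypothetical names introduced for this illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE OrderDetail (SalesOrderID INTEGER, OrderQty INTEGER, "
    "UnitPrice REAL, Discount REAL)"
)
conn.executemany(
    "INSERT INTO OrderDetail VALUES (?, ?, ?, ?)",
    [(1, 2, 10.0, 0.0), (1, 1, 100.0, 0.5), (2, 5, 3.0, 0.0)],
)


def line_totals(connection, sales_order_id):
    """Return (SalesOrderID, LineTotal) rows for a single order."""
    return connection.execute(
        """
        SELECT SalesOrderID,
               ROUND(OrderQty * UnitPrice * (1 - Discount), 2) AS LineTotal
        FROM OrderDetail
        WHERE SalesOrderID = ?
        ORDER BY LineTotal;
        """,
        (sales_order_id,),
    ).fetchall()


totals = line_totals(conn, 1)
print(totals)
```

In SQL Server the function itself is referenced in the FROM clause like a table, with the argument supplied in parentheses, so its result can be joined and filtered like any other table source.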

Page 25: Module_4 Grouping and Aggregating Dataod1

Writing queries with derived tables

• Derived tables are named query expressions created within an outer SELECT statement
• They are not stored in the database; each represents a virtual relational table
• When processed, they are unpacked into a query against the underlying referenced objects
• They allow you to write more modular queries
• The scope of a derived table is the query in which it is defined

SELECT <column_list>
FROM (
    <derived_table_definition>
) AS <derived_table_alias>;

Page 26: Module_4 Grouping and Aggregating Dataod1

Guidelines for derived tables

Derived tables must:
• Have an alias
• Have names for all columns
• Have unique names for all columns
• Not use an ORDER BY clause (without TOP or OFFSET/FETCH)
• Not be referred to multiple times in the same query

Derived tables may:
• Use internal or external aliases for columns
• Refer to parameters and/or variables
• Be nested within other derived tables

Page 27: Module_4 Grouping and Aggregating Dataod1

Passing arguments to derived tables

• Derived tables may refer to arguments
• Arguments may be:
    Variables declared in the same batch as the SELECT statement
    Parameters passed into a table-valued function or stored procedure

DECLARE @emp_id INT = 9;
SELECT orderyear, COUNT(DISTINCT custid) AS cust_count
FROM (
    SELECT YEAR(orderdate) AS orderyear, custid
    FROM Sales.Orders
    WHERE empid = @emp_id
) AS derived_year
GROUP BY orderyear;
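The same pattern can be tried on a tiny data set. This sketch uses Python's sqlite3 module as a stand-in for SQL Server: a named query parameter (:emp_id) plays the role of the @emp_id variable, and strftime replaces the YEAR() function, which SQLite does not provide. The Orders table and its rows are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE Orders (orderdate TEXT, custid INTEGER, empid INTEGER)")
conn.executemany(
    "INSERT INTO Orders VALUES (?, ?, ?)",
    [
        ("2007-01-05", 1, 9),
        ("2007-03-01", 1, 9),
        ("2007-06-01", 2, 9),
        ("2008-02-01", 3, 9),
        ("2008-02-02", 4, 5),  # different employee, filtered out
    ],
)

emp_id = 9
rows = conn.execute(
    """
    SELECT orderyear, COUNT(DISTINCT custid) AS cust_count
    FROM (
        -- Derived table: the alias orderyear is defined here, so the
        -- outer query can group by it.
        SELECT CAST(strftime('%Y', orderdate) AS INTEGER) AS orderyear, custid
        FROM Orders
        WHERE empid = :emp_id
    ) AS derived_year
    GROUP BY orderyear
    ORDER BY orderyear;
    """,
    {"emp_id": emp_id},
).fetchall()
print(rows)
```

The derived table is what makes it possible to group by the orderyear alias: in the logical processing order, a SELECT alias is not yet visible to GROUP BY within the same query level.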

Page 28: Module_4 Grouping and Aggregating Dataod1

Creating queries with common table expressions

Use the WITH clause to create a CTE:
• Define the table expression in the WITH clause
• Reference the CTE in the outer query
• Assign column aliases (inline or external)
• Pass arguments if desired

WITH CTE_year AS
(
    SELECT YEAR(OrderDate) AS OrderYear, CustomerID
    FROM Sales.SalesOrderHeader
)
SELECT OrderYear, COUNT(DISTINCT CustomerID) AS CustCount
FROM CTE_year
GROUP BY OrderYear;
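Unlike a derived table, a CTE can be referenced more than once in the query that follows it. The sketch below, using Python's sqlite3 module as a stand-in for SQL Server, joins a CTE to itself to compare each year's customer count with the previous year's. The OrderHeader table is hypothetical and stores the order year directly to keep the example free of date functions.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE OrderHeader (OrderYear INTEGER, CustomerID INTEGER)")
conn.executemany(
    "INSERT INTO OrderHeader VALUES (?, ?)",
    [(2007, 1), (2007, 2), (2007, 1), (2008, 1), (2008, 2), (2008, 3)],
)

rows = conn.execute(
    """
    WITH CTE_year AS
    (
        SELECT OrderYear, COUNT(DISTINCT CustomerID) AS CustCount
        FROM OrderHeader
        GROUP BY OrderYear
    )
    SELECT cur.OrderYear, cur.CustCount, prv.CustCount AS PrevCustCount
    FROM CTE_year AS cur                 -- first reference
    LEFT JOIN CTE_year AS prv            -- second reference to the same CTE
        ON prv.OrderYear = cur.OrderYear - 1
    ORDER BY cur.OrderYear;
    """
).fetchall()
print(rows)
```

The first year has no predecessor, so the LEFT JOIN leaves its PrevCustCount as NULL.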

Page 29: Module_4 Grouping and Aggregating Dataod1

Demo: Table functions

Page 30: Module_4 Grouping and Aggregating Dataod1

Summary

Aggregate functions are used in SELECT, HAVING, and ORDER BY clauses, are most frequently used with the GROUP BY clause, and return a scalar value

Common built-in aggregate functions include:

Common: SUM, MIN, MAX, AVG, COUNT, COUNT_BIG
Statistical: STDEV, STDEVP, VAR, VARP
Other: CHECKSUM_AGG, GROUPING, GROUPING_ID

Page 31: Module_4 Grouping and Aggregating Dataod1

Summary

Use DISTINCT with aggregate functions to summarize only the unique values; it eliminates duplicate values, not rows

GROUP BY creates groups of output rows, one for each unique combination of values specified in the GROUP BY clause. Aggregate functions in subsequent phases then calculate one summary value per group

The HAVING clause provides a search condition that each group must satisfy and is processed after the GROUP BY clause

Page 32: Module_4 Grouping and Aggregating Dataod1

Summary

Subqueries are nested queries, or queries within queries, where the results from the inner query are passed to the outer query

Types of subqueries include:
• Scalar subqueries
• Multi-valued subqueries
• Subqueries with the EXISTS clause

Page 33: Module_4 Grouping and Aggregating Dataod1

Summary

Views are named table expressions with definitions stored in a database that can be referenced in a SELECT statement just like a table

Views are defined with a single SELECT statement and then saved in the database as queries

Table-valued functions are created with the CREATE FUNCTION statement. They declare a return type of TABLE

Derived tables allow you to write more modular queries as named query expressions that are created within an outer SELECT statement. They represent a virtual relational table, so they are not stored in the database

CTEs are similar to derived tables in scope and naming requirements, but unlike derived tables, CTEs support multiple definitions, multiple references, and recursion

Page 34: Module_4 Grouping and Aggregating Dataod1

Course Topics: Querying Microsoft SQL Server 2012 Jump Start

01 | Introducing SQL Server 2012
SQL Server types of statements; other SQL statement elements; basic SELECT statements

02 | Advanced SELECT Statements
DISTINCT, aliases, scalar functions and CASE; using JOIN and MERGE; filtering and sorting data; NULL values

03 | SQL Server Data Types
Introduction to data types, data type usage, converting data types, understanding SQL Server function types

04 | Grouping and Aggregating Data
Aggregate functions; GROUP BY and HAVING clauses; subqueries (self-contained, correlated, and EXISTS); views, inline table-valued functions, and derived tables

| Lunch Break
Eat, drink, and recharge for the afternoon session

Page 35: Module_4 Grouping and Aggregating Dataod1

©2013 Microsoft Corporation. All rights reserved. Microsoft, Windows, Office, Azure, System Center, Dynamics and other product names are or may be registered trademarks and/or trademarks in the U.S. and/or other countries. The information herein is for informational purposes only and represents the current view of Microsoft Corporation as of the date of this presentation. Because Microsoft must respond to changing market conditions, it should not be interpreted to be a commitment on the part of Microsoft, and Microsoft cannot guarantee the accuracy of any information provided after the date of this presentation. MICROSOFT MAKES NO WARRANTIES, EXPRESS, IMPLIED OR STATUTORY, AS TO THE INFORMATION IN THIS PRESENTATION.