Home Uncategorized SQL Server 2005 T-SQL: Aggregates and the OVER clause

SQL Server 2005 T-SQL: Aggregates and the OVER clause

3

A new feature added to SQL Server 2005 for the sake of the windowing functions is the OVER clause. Using this clause, you can specify ordering or partitioning for the windowing functions. For instance, to enumerate the names of all of the products in the AdventureWorks database that have a list price, along with their list prices and the rank of those prices compared to all of the other prices, the following query can now be used:

SELECT
	P.Name,
	P.ListPrice,
	DENSE_RANK() OVER (ORDER BY P.ListPrice DESC) AS PriceRank
FROM Production.Product P
WHERE 
	ListPrice > 0
ORDER BY 
	P.Name ASC



Name			List Price	PriceRank
-------------------------------------------------
All-Purpose Bike Stand	159.0000	44
AWC Logo Cap		8.9900		98
Bike Wash - Dissolver	7.9500		99
Cable Lock		25.0000		88
Chain			20.2400		93
Classic Vest, L		63.5000		66
Classic Vest, M		63.5000		66
Classic Vest, S		63.5000		66
Fender Set - Mountain	21.9800		91
Front Brakes		106.5000	55
Front Derailleur	91.4900		59
...

So what does this tell us? All-Purpose Bike Stand is the 44th most expensive item sold by AdventureWorks. AWC Logo Cap is the 98th most expensive item. And the Vests are tied for 66th most expensive. Which is why DENSE_RANK was used for this example! But really, this example is only here to demonstrate one use of the OVER clause. And this post isn’t about windowing functions or rankings at all. That’s another post for another day.

What this post is about is normal, non-windowing aggregate functions. Like SUM(). It turns out that the OVER clause can be used for them, too!

Pretend that you’re an employee of AdventureWorks and your manager comes to you with a request: Write a query to return all of the products, their prices, their subcategories, and the average price for all products in the subcategory that any given product belongs to… Why? Perhaps the manager wants to re-categorize products based on whether they fall, percentage-wise, close to the same average price. Or maybe it just makes a good contrived example for showing this feature! Regardless…

Here’s how you can solve this in SQL Server 2000:

SELECT
	P.Name AS ProductName,
	P.ListPrice,
	PS.Name AS ProductSubCategoryName,
	x.AveragePrice
FROM Production.Product P
JOIN Production.ProductSubCategory PS ON P.ProductSubCategoryID = PS.ProductSubCategoryID
JOIN
(
	SELECT 
		P2.ProductSubCategoryID,
		AVG(P2.ListPrice) AS AveragePrice
	FROM Production.Product P2
	WHERE 
		P2.ProductSubCategoryID IS NOT NULL
	GROUP BY 
		P2.ProductSubCategoryID
) x ON x.ProductSubCategoryID = P.ProductSubCategoryId
ORDER BY 
	P.Name

I don’t know about you (since I have no clue who you are), but I personally have a difficult time reading this. If I came back to this query in six months, it would take me a few minutes to figue out what was going on. And doesn’t it feel like there should be a more efficient way of expressing it?

…Well, now there is…

SELECT
	P.Name AS ProductName,
	P.ListPrice,
	PS.Name AS ProductSubCategoryName,
	AVG(P.ListPrice) OVER (PARTITION BY P.ProductSubCategoryID)
FROM Production.Product P
JOIN Production.ProductSubCategory PS ON P.ProductSubCategoryID = PS.ProductSubCategoryID
ORDER BY 
	P.Name

So what’s going on here? Under the covers, SQL Server builds a subquery for the average, based on the partitioning column of the OVER clause — which in this case is ProductSubCategoryID. It’s a little bit less efficient in this case than the derived table approach, but a lot cleaner from a readability standpoint. Personally, I think it’s a really cool feature, although I don’t honestly see myself using it too often.

More ways to express yourself using SQL Server 2005. Madonna would be proud.

SHARE
Previous articleTokenize UDF
Next articleLooping over routines using sp_foreachroutine
Adam Machanic helps companies get the most out of their SQL Server databases. He creates solid architectural foundations for high performance databases and is author of the award-winning SQL Server monitoring stored procedure, sp_WhoIsActive. Adam has contributed to numerous books on SQL Server development. A long-time Microsoft MVP for SQL Server, he speaks and trains at IT conferences across North America and Europe.

3 COMMENTS

LEAVE A REPLY

Please enter your comment!
Please enter your name here