Home Uncategorized SQL Server 2005 T-SQL: Aggregates and the OVER clause

    SQL Server 2005 T-SQL: Aggregates and the OVER clause

    390
    3

    A new feature added to SQL Server 2005 for the sake of the windowing functions is the OVER clause. Using this clause, you can specify ordering or partitioning for the windowing functions. For instance, to enumerate the names of all of the products in the AdventureWorks database that have a list price, along with their list prices and the rank of those prices compared to all of the other prices, the following query can now be used:

    SELECT
    	P.Name,
    	P.ListPrice,
    	DENSE_RANK() OVER (ORDER BY P.ListPrice DESC) AS PriceRank
    FROM Production.Product P
    WHERE 
    	ListPrice > 0
    ORDER BY 
    	P.Name ASC
    
    
    
    Name			List Price	PriceRank
    -------------------------------------------------
    All-Purpose Bike Stand	159.0000	44
    AWC Logo Cap		8.9900		98
    Bike Wash - Dissolver	7.9500		99
    Cable Lock		25.0000		88
    Chain			20.2400		93
    Classic Vest, L		63.5000		66
    Classic Vest, M		63.5000		66
    Classic Vest, S		63.5000		66
    Fender Set - Mountain	21.9800		91
    Front Brakes		106.5000	55
    Front Derailleur	91.4900		59
    ...
    

    So what does this tell us? All-Purpose Bike Stand is the 44th most expensive item sold by AdventureWorks. AWC Logo Cap is the 98th most expensive item. And the Vests are tied for 66th most expensive. Which is why DENSE_RANK was used for this example! But really, this example is only here to demonstrate one use of the OVER clause. And this post isn’t about windowing functions or rankings at all. That’s another post for another day.

    What this post is about is normal, non-windowing aggregate functions. Like SUM(). It turns out that the OVER clause can be used for them, too!

    Pretend that you’re an employee of AdventureWorks and your manager comes to you with a request: Write a query to return all of the products, their prices, their subcategories, and the average price for all products in the subcategory that any given product belongs to… Why? Perhaps the manager wants to re-categorize products based on whether they fall, percentage-wise, close to the same average price. Or maybe it just makes a good contrived example for showing this feature! Regardless…

    Here’s how you can solve this in SQL Server 2000:

    SELECT
    	P.Name AS ProductName,
    	P.ListPrice,
    	PS.Name AS ProductSubCategoryName,
    	x.AveragePrice
    FROM Production.Product P
    JOIN Production.ProductSubCategory PS ON P.ProductSubCategoryID = PS.ProductSubCategoryID
    JOIN
    (
    	SELECT 
    		P2.ProductSubCategoryID,
    		AVG(P2.ListPrice) AS AveragePrice
    	FROM Production.Product P2
    	WHERE 
    		P2.ProductSubCategoryID IS NOT NULL
    	GROUP BY 
    		P2.ProductSubCategoryID
    ) x ON x.ProductSubCategoryID = P.ProductSubCategoryId
    ORDER BY 
    	P.Name
    

    I don’t know about you (since I have no clue who you are), but I personally have a difficult time reading this. If I came back to this query in six months, it would take me a few minutes to figue out what was going on. And doesn’t it feel like there should be a more efficient way of expressing it?

    …Well, now there is…

    SELECT
    	P.Name AS ProductName,
    	P.ListPrice,
    	PS.Name AS ProductSubCategoryName,
    	AVG(P.ListPrice) OVER (PARTITION BY P.ProductSubCategoryID)
    FROM Production.Product P
    JOIN Production.ProductSubCategory PS ON P.ProductSubCategoryID = PS.ProductSubCategoryID
    ORDER BY 
    	P.Name
    

    So what’s going on here? Under the covers, SQL Server builds a subquery for the average, based on the partitioning column of the OVER clause — which in this case is ProductSubCategoryID. It’s a little bit less efficient in this case than the derived table approach, but a lot cleaner from a readability standpoint. Personally, I think it’s a really cool feature, although I don’t honestly see myself using it too often.

    More ways to express yourself using SQL Server 2005. Madonna would be proud.

    Previous articleTokenize UDF
    Next articleLooping over routines using sp_foreachroutine
    Adam Machanic helps companies get the most out of their SQL Server databases. He creates solid architectural foundations for high performance databases and is author of the award-winning SQL Server monitoring stored procedure, sp_WhoIsActive. Adam has contributed to numerous books on SQL Server development. A long-time Microsoft MVP for SQL Server, he speaks and trains at IT conferences across North America and Europe.

    3 COMMENTS

    Comments are closed.