T-SQL Tutorial - Aggregate Window Functions Performance

preview_player
Показать описание
Learn how to measure and optimise aggregate window functions in this T-SQL Tutorial. In the video I show how the version of SQL Server impacts aggregate window functions and how we can make queries that use window functions faster.

If you wish to follow along with the video, you can use the SQL Scripts below:

-- check if table exists
IF OBJECT_ID(N'dbo.Orders', N'U') IS NOT NULL
DROP TABLE dbo.Orders;

-- create table
CREATE TABLE dbo.Orders
(
OrderId BIGINT IDENTITY(1, 1)
CONSTRAINT PK_Orders_OrderID PRIMARY KEY (OrderId),
CustomerId BIGINT,
Amount DECIMAL(6, 2)
);

-- set variables
DECLARE
@Customers BIGINT = 1000,
@NumRows BIGINT = 1000000

;
WITH
L0 AS (SELECT 1 AS N UNION ALL SELECT 1),
L1 AS (SELECT 1 AS N FROM L0 AS A CROSS JOIN L0 AS B),
L2 AS (SELECT 1 AS N FROM L1 AS A CROSS JOIN L1 AS B),
L3 AS (SELECT 1 AS N FROM L2 AS A CROSS JOIN L2 AS B),
L4 AS (SELECT 1 AS N FROM L3 AS A CROSS JOIN L3 AS B),
L5 AS (SELECT 1 AS N FROM L4 AS A CROSS JOIN L4 AS B)

INSERT INTO dbo.Orders (CustomerId, Amount)
SELECT
CAST(ABS(CHECKSUM(NEWID())) / 1000000.00 AS DECIMAL(6, 2)) AS Amount
FROM
(
N
FROM L5
) AS D;

-- aggregate window function
SET STATISTICS TIME ON;
SET STATISTICS IO ON;

SELECT
CustomerId,
Amount,
CAST((Amount * 100) / SUM(Amount) OVER() AS DECIMAL(8, 6)) AS PctofTotal,
CAST((Amount * 100) / SUM(Amount) OVER(PARTITION BY CustomerId) AS DECIMAL(8, 6)) AS PctofCustTotal
FROM dbo.Orders;

SET STATISTICS TIME OFF;
SET STATISTICS IO OFF;

SET STATISTICS TIME ON;
SET STATISTICS IO ON;

-- using derived tables
SELECT
A.CustomerId,
A.Amount,
CAST((A.Amount * 100) / Total AS DECIMAL(8, 6)) AS PctofTotal,
CAST((A.Amount * 100) / CustTotal AS DECIMAL(8, 6)) AS PctofCustTotal
FROM dbo.Orders AS A
INNER JOIN
(
SELECT
CustomerId,
SUM(Amount) AS CustTotal
FROM dbo.Orders
GROUP BY CustomerId
) AS B
ON A.CustomerId = B.CustomerId
CROSS JOIN
(
SELECT
SUM(Amount) AS Total
FROM dbo.Orders
) AS C

SET STATISTICS TIME OFF;
SET STATISTICS IO OFF;
Рекомендации по теме
Комментарии
Автор

As usual, I find your video quite helpful and useful. Thanks again.

OldPick-Unix-dude-pbjg
Автор

Been using SQL since 95, and didn't use the OVER() stuff specifically because of performance; I tend to favor derived tables.

But I never tested again when 2019 shipped.

Thanks for teaching this old dog :)

JohnBoen
Автор

rand() is executed once per batch rather than for each row.

Sttuey
Автор

I do the best that I can to avoid functions where possible and I'm not familiar with this particular one, my only guess is that you would take a performance hit. But please let me know how far off I am? 😄👍

ameobiamicrobiological
Автор

Ok, this is where I am at.

SELECT
[Customer Name],
ROUND(SUM([Sales]), 2) as customer_total
FROM Orders O
GROUP BY [Customer Name];
CROSS JOIN
(
SELECT [Customer Name],
SUM(Sales) OVER(PARTITION BY [Customer Name]) as customer_total,
SUM(Sales) OVER() as total_sales,
ROUND((SUM(Sales) OVER(PARTITION BY [Customer Name])/SUM(Sales) OVER() ) *100, 3) as percent_of_total
FROM Orders
ORDER BY percent_of_total desc
) AS W
ON O.[Customer Name] = W.[Customer Name];

CaribouDataScience
join shbcf.ru