返回首頁SQL

SQL Window Function 教學:ROW_NUMBER、RANK、LAG 實戰應用

17 min 分鐘閱讀

SQL Window Function 教學:ROW_NUMBER、RANK、LAG 實戰應用

SQL Window Function 教學:ROW_NUMBER、RANK、LAG 實戰應用【完整指南】

視窗函數(Window Functions)是 SQL 進階查詢中最強大的工具之一,能夠在不改變資料列數的情況下進行排名、累計計算與前後值比較。本文將系統性介紹視窗函數的核心概念與實戰應用,幫助您解決複雜的商業分析需求。

視窗函數概念

什麼是視窗函數?

視窗函數對一組相關的資料列(稱為「視窗」或「窗框」)執行計算,並為每一列回傳一個結果值。與聚合函數不同的是,視窗函數不會將多列合併成一列。

-- 聚合函數:5 筆資料變成 1 筆
SELECT Department, SUM(Salary) AS TotalSalary
FROM Employees
GROUP BY Department;

-- 視窗函數:5 筆資料仍是 5 筆,但多了計算結果
SELECT
    Name,
    Department,
    Salary,
    SUM(Salary) OVER (PARTITION BY Department) AS DeptTotalSalary
FROM Employees;

與聚合函數的差異

特性聚合函數視窗函數
結果列數合併為一列保留原始列數
GROUP BY必須使用不需要
原始資料無法存取可同時存取
使用場景彙總統計排名、累計、比較

語法結構 OVER()

視窗函數的核心是 OVER() 子句:

函數名稱() OVER (
    [PARTITION BY 分區欄位]
    [ORDER BY 排序欄位]
    [ROWS/RANGE 窗框範圍]
)

基本範例

SELECT
    Name,
    Department,
    Salary,
    -- 全公司薪資排名
    ROW_NUMBER() OVER (ORDER BY Salary DESC) AS CompanyRank,
    -- 部門內薪資排名
    ROW_NUMBER() OVER (PARTITION BY Department ORDER BY Salary DESC) AS DeptRank
FROM Employees;

想了解更多 SQL 進階功能?請參考 SQL 進階應用實戰技巧

PARTITION BY 分區

PARTITION BY 將資料分成多個群組,視窗函數會在每個群組內獨立計算。

基本用法

-- 計算每個部門的薪資排名
SELECT
    EmployeeId,
    Name,
    Department,
    Salary,
    ROW_NUMBER() OVER (
        PARTITION BY Department  -- 依部門分區
        ORDER BY Salary DESC     -- 依薪資排序
    ) AS DeptSalaryRank
FROM Employees;

執行結果

EmployeeIdNameDepartmentSalaryDeptSalaryRank
101AliceSales800001
102BobSales700002
103CarolSales600003
201DavidEngineering900001
202EveEngineering850002

多欄位分區

-- 依部門與年度分區
SELECT
    EmployeeId,
    Name,
    Department,
    YEAR(HireDate) AS HireYear,
    Salary,
    ROW_NUMBER() OVER (
        PARTITION BY Department, YEAR(HireDate)
        ORDER BY Salary DESC
    ) AS RankInDeptYear
FROM Employees;

不使用 PARTITION BY

若不指定 PARTITION BY,則整個結果集視為一個分區:

-- 全公司薪資排名(無分區)
SELECT
    Name,
    Salary,
    ROW_NUMBER() OVER (ORDER BY Salary DESC) AS CompanyRank
FROM Employees;

編號函數

編號函數為每一列指派一個數字編號,常用於排名與分頁。

ROW_NUMBER()

為每一列指派唯一的連續編號,即使值相同也會有不同編號。

SELECT
    Name,
    Department,
    Salary,
    ROW_NUMBER() OVER (ORDER BY Salary DESC) AS RowNum
FROM Employees;

結果

NameSalaryRowNum
David900001
Eve850002
Alice800003
Frank800004

常見應用:分頁查詢

-- 取得第 2 頁資料(每頁 10 筆)
WITH NumberedEmployees AS (
    SELECT
        *,
        ROW_NUMBER() OVER (ORDER BY EmployeeId) AS RowNum
    FROM Employees
)
SELECT * FROM NumberedEmployees
WHERE RowNum BETWEEN 11 AND 20;

RANK()

相同值會得到相同排名,但會跳過後續名次。

SELECT
    Name,
    Salary,
    RANK() OVER (ORDER BY Salary DESC) AS SalaryRank
FROM Employees;

結果

NameSalarySalaryRank
David900001
Eve850002
Alice800003
Frank800003
Bob700005

DENSE_RANK()

相同值會得到相同排名,但不會跳過後續名次。

SELECT
    Name,
    Salary,
    DENSE_RANK() OVER (ORDER BY Salary DESC) AS DenseRank
FROM Employees;

結果

NameSalaryDenseRank
David900001
Eve850002
Alice800003
Frank800003
Bob700004

三種編號函數比較

SELECT
    Name,
    Salary,
    ROW_NUMBER() OVER (ORDER BY Salary DESC) AS RowNum,
    RANK() OVER (ORDER BY Salary DESC) AS Rank,
    DENSE_RANK() OVER (ORDER BY Salary DESC) AS DenseRank
FROM Employees
ORDER BY Salary DESC;
NameSalaryRowNumRankDenseRank
David90000111
Eve85000222
Alice80000333
Frank80000433
Bob70000554

NTILE()

將資料平均分成 N 組:

-- 將員工依薪資分成 4 組(四分位數)
SELECT
    Name,
    Salary,
    NTILE(4) OVER (ORDER BY Salary DESC) AS Quartile
FROM Employees;

應用場景

  • 將客戶分成 VIP/一般/低價值客群
  • 計算百分位數
  • 平均分配工作負載

前後值函數

前後值函數用於存取當前列之前或之後的資料值,非常適合計算變化量與趨勢分析。

LAG() - 取前一列值

LAG(欄位, 偏移量, 預設值) OVER (PARTITION BY ... ORDER BY ...)
-- 計算每月銷售額與上月比較
SELECT
    Month,
    Sales,
    LAG(Sales, 1, 0) OVER (ORDER BY Month) AS PrevMonthSales,
    Sales - LAG(Sales, 1, 0) OVER (ORDER BY Month) AS MoM_Change
FROM MonthlySales;

結果

MonthSalesPrevMonthSalesMoM_Change
2024-011000000100000
2024-0212000010000020000
2024-03115000120000-5000

LEAD() - 取後一列值

-- 查看下一筆訂單資訊
SELECT
    OrderId,
    OrderDate,
    LEAD(OrderDate, 1) OVER (
        PARTITION BY CustomerId
        ORDER BY OrderDate
    ) AS NextOrderDate,
    DATEDIFF(DAY, OrderDate,
        LEAD(OrderDate, 1) OVER (
            PARTITION BY CustomerId
            ORDER BY OrderDate
        )
    ) AS DaysToNextOrder
FROM Orders;

FIRST_VALUE() / LAST_VALUE()

取得視窗內第一個或最後一個值:

SELECT
    EmployeeId,
    Name,
    Department,
    Salary,
    FIRST_VALUE(Name) OVER (
        PARTITION BY Department
        ORDER BY Salary DESC
    ) AS HighestPaidInDept,
    FIRST_VALUE(Salary) OVER (
        PARTITION BY Department
        ORDER BY Salary DESC
    ) AS MaxSalaryInDept
FROM Employees;

LAST_VALUE 注意事項

-- 需要指定窗框範圍,否則結果可能不如預期
SELECT
    Name,
    Salary,
    LAST_VALUE(Name) OVER (
        ORDER BY Salary DESC
        ROWS BETWEEN UNBOUNDED PRECEDING AND UNBOUNDED FOLLOWING
    ) AS LowestPaidEmployee
FROM Employees;

聚合視窗函數

傳統聚合函數加上 OVER() 即可成為視窗函數,保留原始資料列的同時進行彙總計算。

累計加總(Running Total)

-- 計算累計銷售額
SELECT
    OrderDate,
    Amount,
    SUM(Amount) OVER (ORDER BY OrderDate) AS RunningTotal
FROM Orders;

結果

OrderDateAmountRunningTotal
2024-01-0110001000
2024-01-0215002500
2024-01-038003300
2024-01-0420005300

依部門累計

SELECT
    Department,
    OrderDate,
    Amount,
    SUM(Amount) OVER (
        PARTITION BY Department
        ORDER BY OrderDate
    ) AS DeptRunningTotal
FROM Orders;

移動平均(Moving Average)

-- 計算 7 日移動平均
SELECT
    Date,
    Sales,
    AVG(Sales) OVER (
        ORDER BY Date
        ROWS BETWEEN 6 PRECEDING AND CURRENT ROW
    ) AS MovingAvg7Days
FROM DailySales;

窗框範圍說明

  • ROWS BETWEEN 6 PRECEDING AND CURRENT ROW:當前列與前 6 列
  • ROWS BETWEEN UNBOUNDED PRECEDING AND CURRENT ROW:從開頭到當前列
  • ROWS BETWEEN CURRENT ROW AND UNBOUNDED FOLLOWING:從當前列到結尾

其他聚合視窗函數

SELECT
    Department,
    Name,
    Salary,
    -- 部門平均薪資
    AVG(Salary) OVER (PARTITION BY Department) AS DeptAvgSalary,
    -- 部門最高薪資
    MAX(Salary) OVER (PARTITION BY Department) AS DeptMaxSalary,
    -- 部門最低薪資
    MIN(Salary) OVER (PARTITION BY Department) AS DeptMinSalary,
    -- 部門人數
    COUNT(*) OVER (PARTITION BY Department) AS DeptHeadCount,
    -- 薪資與部門平均的差異
    Salary - AVG(Salary) OVER (PARTITION BY Department) AS DiffFromAvg
FROM Employees;

實戰案例:銷售排名分析

以下是一個完整的銷售分析案例,綜合運用多種視窗函數。

情境說明

分析各業務員的銷售表現,需要:

  1. 全公司銷售排名
  2. 區域內銷售排名
  3. 與上月業績比較
  4. 累計業績達成率

完整查詢

WITH SalesData AS (
    SELECT
        s.SalesRepId,
        e.Name AS SalesRepName,
        e.Region,
        s.SalesMonth,
        s.SalesAmount,
        s.TargetAmount
    FROM MonthlySales s
    JOIN Employees e ON s.SalesRepId = e.EmployeeId
    WHERE YEAR(s.SalesMonth) = 2024
)
SELECT
    SalesRepName,
    Region,
    SalesMonth,
    SalesAmount,
    TargetAmount,

    -- 1. 全公司當月排名
    RANK() OVER (
        PARTITION BY SalesMonth
        ORDER BY SalesAmount DESC
    ) AS CompanyRank,

    -- 2. 區域內當月排名
    RANK() OVER (
        PARTITION BY Region, SalesMonth
        ORDER BY SalesAmount DESC
    ) AS RegionRank,

    -- 3. 上月業績
    LAG(SalesAmount, 1, 0) OVER (
        PARTITION BY SalesRepId
        ORDER BY SalesMonth
    ) AS PrevMonthSales,

    -- 4. 月增長率
    CASE
        WHEN LAG(SalesAmount, 1, 0) OVER (
            PARTITION BY SalesRepId ORDER BY SalesMonth
        ) = 0 THEN NULL
        ELSE ROUND(
            (SalesAmount - LAG(SalesAmount, 1, 0) OVER (
                PARTITION BY SalesRepId ORDER BY SalesMonth
            )) * 100.0 / LAG(SalesAmount, 1, 0) OVER (
                PARTITION BY SalesRepId ORDER BY SalesMonth
            ), 2)
    END AS MoM_GrowthRate,

    -- 5. 年度累計業績
    SUM(SalesAmount) OVER (
        PARTITION BY SalesRepId
        ORDER BY SalesMonth
        ROWS UNBOUNDED PRECEDING
    ) AS YTD_Sales,

    -- 6. 年度累計目標
    SUM(TargetAmount) OVER (
        PARTITION BY SalesRepId
        ORDER BY SalesMonth
        ROWS UNBOUNDED PRECEDING
    ) AS YTD_Target,

    -- 7. 累計達成率
    ROUND(
        SUM(SalesAmount) OVER (
            PARTITION BY SalesRepId
            ORDER BY SalesMonth
            ROWS UNBOUNDED PRECEDING
        ) * 100.0 /
        NULLIF(SUM(TargetAmount) OVER (
            PARTITION BY SalesRepId
            ORDER BY SalesMonth
            ROWS UNBOUNDED PRECEDING
        ), 0), 2
    ) AS YTD_AchievementRate

FROM SalesData
ORDER BY SalesMonth, CompanyRank;

進階分析:找出連續成長的業務員

WITH GrowthAnalysis AS (
    SELECT
        SalesRepId,
        SalesMonth,
        SalesAmount,
        LAG(SalesAmount) OVER (
            PARTITION BY SalesRepId ORDER BY SalesMonth
        ) AS PrevAmount,
        CASE
            WHEN SalesAmount > LAG(SalesAmount) OVER (
                PARTITION BY SalesRepId ORDER BY SalesMonth
            ) THEN 1
            ELSE 0
        END AS IsGrowth
    FROM MonthlySales
)
SELECT
    SalesRepId,
    COUNT(*) AS ConsecutiveGrowthMonths
FROM GrowthAnalysis
WHERE IsGrowth = 1
GROUP BY SalesRepId
HAVING COUNT(*) >= 3
ORDER BY ConsecutiveGrowthMonths DESC;

效能注意事項

視窗函數雖然強大,但需注意效能影響。

效能優化建議

  1. 建立適當索引

    -- 為 PARTITION BY 和 ORDER BY 欄位建立索引
    CREATE INDEX IX_Sales_Rep_Month
    ON MonthlySales (SalesRepId, SalesMonth)
    INCLUDE (SalesAmount);
    
  2. 減少視窗函數數量

    -- 差:重複計算相同視窗
    SELECT
        LAG(Amount) OVER (ORDER BY Date) AS Prev,
        LEAD(Amount) OVER (ORDER BY Date) AS Next
    FROM Sales;
    
    -- 好:相同視窗定義共用
    SELECT
        LAG(Amount) OVER w AS Prev,
        LEAD(Amount) OVER w AS Next
    FROM Sales
    WINDOW w AS (ORDER BY Date);  -- SQL Server 2022+ 支援
    
  3. 限制處理資料量

    -- 先篩選再計算
    WITH FilteredData AS (
        SELECT * FROM Sales WHERE Year = 2024
    )
    SELECT *, ROW_NUMBER() OVER (ORDER BY Amount DESC)
    FROM FilteredData;
    

更多效能優化技巧,請參考 SQL 效能優化完整指南

結論

視窗函數是 SQL 中處理複雜分析需求的利器,掌握本文介紹的核心功能:

  1. ✅ 理解 OVER() 與 PARTITION BY 語法
  2. ✅ 活用 ROW_NUMBER、RANK、DENSE_RANK 排名函數
  3. ✅ 運用 LAG、LEAD 進行前後值比較
  4. ✅ 使用聚合視窗函數計算累計與移動平均
  5. ✅ 注意效能優化與索引設計

視窗函數能夠大幅簡化原本需要子查詢或自連接才能完成的複雜查詢,是資料分析師與後端工程師必備的進階技能。

想學習更多預存程序開發?請參考 SQL 預存程序教學


CloudInsight 文末 CTA

想深入學習資料分析 SQL 技巧?

CloudInsight 提供專業的進階 SQL 培訓課程,課程內容包含:

  • 視窗函數完整實戰演練
  • 複雜報表查詢設計
  • 效能優化與執行計畫分析
  • 真實商業案例分析

了解資料分析 SQL 進階課程 →


常見問題 FAQ

ROW_NUMBER 和 RANK 有什麼差別?

最關鍵的差異在於處理相同值的方式

ROW_NUMBER()

  • 永遠回傳唯一的連續數字
  • 相同值會得到不同編號(順序由資料庫決定)
  • 結果:1, 2, 3, 4, 5...

RANK()

  • 相同值會得到相同排名
  • 相同排名後會跳過名次
  • 結果:1, 2, 3, 3, 5...(跳過 4)

選擇建議

需求建議函數
分頁查詢(需要唯一編號)ROW_NUMBER
排行榜(允許並列)RANK 或 DENSE_RANK
取得前 N 名(含並列)RANK
分組(如四分位數)NTILE

視窗函數會影響查詢效能嗎?

會的,視窗函數需要額外的計算資源,但影響程度取決於多個因素:

影響效能的因素

  1. 資料量:處理資料列越多,排序與計算開銷越大
  2. 視窗數量:多個不同視窗定義需要多次排序
  3. 索引設計:缺少適當索引會導致大量排序操作
  4. 窗框範圍:ROWS UNBOUNDED 需要掃描更多資料

優化建議

  1. 為 PARTITION BY 和 ORDER BY 欄位建立索引
  2. 先用 WHERE 或 CTE 減少資料量再使用視窗函數
  3. 避免在同一查詢中使用過多不同視窗定義
  4. 使用 ROWS 而非 RANGE(ROWS 效能較好)
-- 效能較好的寫法
CREATE INDEX IX_Covering ON Sales(RepId, Month) INCLUDE (Amount);

WITH FilteredSales AS (
    SELECT * FROM Sales WHERE Year = 2024
)
SELECT *, SUM(Amount) OVER (PARTITION BY RepId ORDER BY Month)
FROM FilteredSales;

延伸閱讀


參考資料

  1. Microsoft SQL Server 視窗函數官方文件
  2. SQL Window Functions 標準規範
  3. 進階 SQL 查詢設計最佳實踐

插圖:視窗函數 vs 聚合函數比較圖

場景描述: 左右對照圖,左側展示聚合函數將多列合併為一列的過程,右側展示視窗函數保留原始列數並新增計算欄位的過程,使用表格視覺化呈現

視覺重點:

  • 主要內容清晰呈現

必須出現的元素:

  • 依據描述中的關鍵元素

需要顯示的中文字:

顏色調性: 專業、清晰

避免元素: 抽象圖形、齒輪、發光特效

Slug: window-vs-aggregate-functions


需要專業的雲端建議?

無論您正在評估雲平台、優化現有架構,或尋找節費方案,我們都能提供協助

預約免費諮詢

相關文章