What is the difference between GROUP BY and window functions?

GROUP BY collapses multiple rows into one row per group, discarding the individual row data. Window functions compute a value for each row using data from a defined set of rows (the window), while keeping every original row in the result. You lose individual rows with GROUP BY; window functions preserve all rows alongside their computed values.

What is the difference between RANK and DENSE_RANK?

Both assign rank numbers based on ORDER BY within a window. RANK skips numbers after ties — if two rows tie for rank 2, the next rank is 4. DENSE_RANK never skips — if two rows tie for rank 2, the next rank is 3. Use DENSE_RANK when you need consecutive rank numbers (e.g., "top 3 categories"); use RANK when the gap matters for reporting actual position.

Can I use window functions in a WHERE clause?

No. Window functions execute after WHERE, GROUP BY, and HAVING. You cannot filter on a window function result in WHERE. To filter by a window function output, wrap the query in a CTE or subquery and apply the WHERE condition in the outer query. This is the standard pattern for "top N per group" queries.

Which databases support window functions?

All major relational databases support window functions: PostgreSQL (since 8.4, 2008), SQL Server (since 2005), Oracle (since 8i), MySQL (since 8.0, 2018), MariaDB (since 10.2, 2017), SQLite (since 3.25.0, 2018), and DB2. The core syntax (OVER, PARTITION BY, ORDER BY, ROW_NUMBER, RANK, LAG, LEAD) is identical across all of them. Frame clause syntax differs slightly between vendors.

SQL Window Functions: OVER, PARTITION BY, ROW_NUMBER, RANK, and More

Requires: SQL SQL:2003+
Difficulty: Intermediate
Published: March 15, 2025
Updated: March 22, 2025
Author: EduHT Editorial

window functions OVER clause PARTITION BY ROW_NUMBER RANK DENSE_RANK NTILE LAG LEAD FIRST_VALUE LAST_VALUE running totals moving averages frame clauses

What are window functions?

A window function performs a calculation across a set of rows that are related to the current row — without collapsing those rows into a single output row the way GROUP BY does. The set of rows the function sees is called the window, defined by the OVER clause.

The best way to understand why window functions exist is to start with a problem that is awkward without them. Suppose you have a sales table and you want to show each sale alongside the department total. With a plain GROUP BY, you lose the individual rows:

SQL — setup

CREATE TABLE sales (
    id         INT,
    employee   VARCHAR(50),
    department VARCHAR(50),
    amount     NUMERIC(10, 2),
    sale_date  DATE
);

INSERT INTO sales VALUES
    (1,  'Alice',  'Engineering', 1200.00, '2025-01-05'),
    (2,  'Bob',    'Engineering',  850.00, '2025-01-08'),
    (3,  'Carol',  'Marketing',    600.00, '2025-01-03'),
    (4,  'Dave',   'Marketing',    950.00, '2025-01-10'),
    (5,  'Eve',    'Marketing',    720.00, '2025-01-12'),
    (6,  'Frank',  'Engineering', 1500.00, '2025-01-15'),
    (7,  'Grace',  'Sales',        400.00, '2025-01-04'),
    (8,  'Henry',  'Sales',        880.00, '2025-01-09'),
    (9,  'Iris',   'Sales',        1100.00, '2025-01-11'),
    (10, 'Jack',   'Engineering',  970.00, '2025-01-14');

SQL — GROUP BY loses individual rows

-- This gives you one row per department
-- You can't see individual employees anymore
SELECT department, SUM(amount) AS dept_total
FROM   sales
GROUP BY department;

Outputdepartment | dept_total -------------|---------- Engineering | 4520.00 Marketing | 2270.00 Sales | 2380.00

SQL — window function keeps all rows

-- Window function: each row gets the department total alongside it
SELECT
    employee,
    department,
    amount,
    SUM(amount) OVER (PARTITION BY department) AS dept_total,
    ROUND(amount / SUM(amount) OVER (PARTITION BY department) * 100, 1) AS pct_of_dept
FROM sales
ORDER BY department, amount DESC;

Outputemployee | department | amount | dept_total | pct_of_dept ---------|-------------|---------|------------|------------ Frank | Engineering | 1500.00 | 4520.00 | 33.2 Alice | Engineering | 1200.00 | 4520.00 | 26.5 Jack | Engineering | 970.00 | 4520.00 | 21.5 Bob | Engineering | 850.00 | 4520.00 | 18.8 Dave | Marketing | 950.00 | 2270.00 | 41.9 Eve | Marketing | 720.00 | 2270.00 | 31.7 Carol | Marketing | 600.00 | 2270.00 | 26.4 Iris | Sales | 1100.00 | 2380.00 | 46.2 Henry | Sales | 880.00 | 2380.00 | 37.0 Grace | Sales | 400.00 | 2380.00 | 16.8

Every row is present, each showing its individual sale, the department total, and the individual employee's percentage contribution to that department — computed in a single query without any joins or subqueries.

The OVER clause anatomy

The OVER clause defines the window — the set of rows available to the function for each row being processed. It has three optional sub-clauses:

SQL — OVER clause template

function_name() OVER (
    [PARTITION BY column1, column2, ...]   -- divide rows into groups
    [ORDER BY     column3 ASC|DESC, ...]  -- define row order within group
    [frame_clause]                          -- define the physical frame
)

Each sub-clause is independent and optional:

No OVER sub-clauses — OVER () means the window is all rows in the result set. A SUM() OVER () gives you the grand total on every row.
PARTITION BY only — divides rows into groups, restarts calculation for each group. Like GROUP BY but keeps all rows.
ORDER BY only — defines an ordering across all rows. For aggregate functions, this implicitly creates a running/cumulative calculation (see frame clause section).
Both — partitions first, then orders within each partition.

SQL

SELECT
    employee,
    department,
    amount,

    -- 1. OVER () — grand total of all rows
    SUM(amount) OVER ()                            AS grand_total,

    -- 2. PARTITION BY — total per department (no ordering)
    SUM(amount) OVER (PARTITION BY department)   AS dept_total,

    -- 3. ORDER BY only — running total across all rows by date
    SUM(amount) OVER (ORDER BY sale_date)        AS running_total,

    -- 4. PARTITION BY + ORDER BY — running total per department
    SUM(amount) OVER (PARTITION BY department
                      ORDER BY sale_date)         AS dept_running_total

FROM sales
ORDER BY sale_date;

PARTITION BY

PARTITION BY splits the result set into independent groups before the window function runs. The function restarts its calculation for each partition. You can partition by any expression — one column, multiple columns, or even a computed value:

SQL

-- Average and max per department alongside each row
SELECT
    employee,
    department,
    amount,
    AVG(amount) OVER (PARTITION BY department) AS dept_avg,
    MAX(amount) OVER (PARTITION BY department) AS dept_max,
    amount - AVG(amount) OVER (PARTITION BY department) AS vs_dept_avg
FROM sales
ORDER BY department, amount DESC;

Output (Engineering rows only)employee | department | amount | dept_avg | dept_max | vs_dept_avg ---------|-------------|---------|----------|----------|------------ Frank | Engineering | 1500.00 | 1130.00 | 1500.00 | 370.00 Alice | Engineering | 1200.00 | 1130.00 | 1500.00 | 70.00 Jack | Engineering | 970.00 | 1130.00 | 1500.00 | -160.00 Bob | Engineering | 850.00 | 1130.00 | 1500.00 | -280.00

SQL — partitioning by a computed value

-- Partition by month extracted from date
SELECT
    sale_date,
    employee,
    amount,
    SUM(amount) OVER (
        PARTITION BY DATE_TRUNC('month', sale_date)   -- PostgreSQL
        -- DATE_FORMAT(sale_date, '%Y-%m')  for MySQL
    ) AS monthly_total
FROM sales
ORDER BY sale_date;

ROW_NUMBER, RANK, DENSE_RANK, NTILE

The four ranking functions assign a number to each row based on position within a window. Understanding when to use each is important for correct results:

SQL

SELECT
    employee,
    department,
    amount,

    -- ROW_NUMBER: unique sequential number, no gaps, no ties
    ROW_NUMBER() OVER (PARTITION BY department ORDER BY amount DESC) AS row_num,

    -- RANK: gaps after ties (1, 2, 2, 4 — position 3 is skipped)
    RANK()       OVER (PARTITION BY department ORDER BY amount DESC) AS rank_num,

    -- DENSE_RANK: no gaps after ties (1, 2, 2, 3 — consecutive)
    DENSE_RANK() OVER (PARTITION BY department ORDER BY amount DESC) AS dense_num

FROM sales
ORDER BY department, amount DESC;

To see the difference between RANK and DENSE_RANK clearly, here is a dedicated example with intentional ties:

SQL — ties example

SELECT
    name,
    score,
    RANK()       OVER (ORDER BY score DESC) AS rank_pos,
    DENSE_RANK() OVER (ORDER BY score DESC) AS dense_pos,
    ROW_NUMBER() OVER (ORDER BY score DESC) AS row_pos
FROM (
    VALUES ('Alice', 95), ('Bob', 88), ('Carol', 88),
           ('Dave', 72), ('Eve', 72),  ('Frank', 65)
) AS t(name, score);

Outputname | score | rank_pos | dense_pos | row_pos ------|-------|----------|-----------|-------- Alice | 95 | 1 | 1 | 1 Bob | 88 | 2 | 2 | 2 Carol | 88 | 2 | 2 | 3 <- ROW_NUMBER breaks tie arbitrarily Dave | 72 | 4 | 3 | 4 <- RANK skips 3; DENSE_RANK does not Eve | 72 | 4 | 3 | 5 Frank | 65 | 6 | 4 | 6 <- RANK skips 5

NTILE — divide rows into buckets

NTILE(n) divides the ordered rows within a partition into n approximately equal buckets, assigning each row a bucket number from 1 to n. This is useful for percentile analysis — quartiles, deciles, quintiles:

SQL

-- Divide employees into 4 quartiles by sales amount
SELECT
    employee,
    amount,
    NTILE(4) OVER (ORDER BY amount) AS quartile,
    CASE NTILE(4) OVER (ORDER BY amount)
        WHEN 1 THEN 'Bottom 25%'
        WHEN 2 THEN 'Lower mid'
        WHEN 3 THEN 'Upper mid'
        WHEN 4 THEN 'Top 25%'
    END AS tier
FROM sales
ORDER BY amount;

The classic "top N per group" pattern

One of the most frequent uses of ROW_NUMBER is selecting the top N rows per partition. Because window functions cannot appear in WHERE, wrap in a CTE:

SQL — top 2 earners per department

WITH ranked AS (
    SELECT
        employee,
        department,
        amount,
        ROW_NUMBER() OVER (
            PARTITION BY department
            ORDER BY     amount DESC
        ) AS rn
    FROM sales
)
SELECT employee, department, amount
FROM   ranked
WHERE  rn <= 2
ORDER BY department, amount DESC;

Outputemployee | department | amount ---------|-------------|-------- Frank | Engineering | 1500.00 Alice | Engineering | 1200.00 Dave | Marketing | 950.00 Eve | Marketing | 720.00 Iris | Sales | 1100.00 Henry | Sales | 880.00

Aggregate functions as window functions

All standard aggregate functions — SUM, AVG, COUNT, MIN, MAX — can be used as window functions by appending an OVER clause. This gives you the aggregate result while keeping individual row visibility:

SQL

SELECT
    employee,
    department,
    amount,

    -- Count of employees in same department
    COUNT(*) OVER (PARTITION BY department)    AS dept_headcount,

    -- Total sales company-wide
    SUM(amount) OVER ()                          AS company_total,

    -- Highest single sale across all departments
    MAX(amount) OVER ()                          AS top_sale,

    -- Each row's share of company total
    ROUND(
        100.0 * amount / SUM(amount) OVER (), 2
    )                                            AS pct_of_company

FROM sales
ORDER BY department, amount DESC;

Running totals and cumulative sums

A running total accumulates values as you move through rows in a defined order. Add ORDER BY to an aggregate window function, and the database automatically computes the cumulative value from the first row up to and including the current row:

SQL

-- Running total of sales ordered by date
SELECT
    sale_date,
    employee,
    amount,
    SUM(amount) OVER (
        ORDER BY sale_date
        ROWS BETWEEN UNBOUNDED PRECEDING AND CURRENT ROW
    ) AS running_total
FROM sales
ORDER BY sale_date;

Outputsale_date | employee | amount | running_total -----------|----------|---------|--------------- 2025-01-03 | Carol | 600.00 | 600.00 2025-01-04 | Grace | 400.00 | 1000.00 2025-01-05 | Alice | 1200.00 | 2200.00 2025-01-08 | Bob | 850.00 | 3050.00 2025-01-09 | Henry | 880.00 | 3930.00 2025-01-10 | Dave | 950.00 | 4880.00 2025-01-11 | Iris | 1100.00 | 5980.00 2025-01-12 | Eve | 720.00 | 6700.00 2025-01-14 | Jack | 970.00 | 7670.00 2025-01-15 | Frank | 1500.00 | 9170.00

SQL — running total per department

-- Cumulative sales within each department, ordered by date
SELECT
    sale_date,
    employee,
    department,
    amount,
    SUM(amount) OVER (
        PARTITION BY department
        ORDER BY     sale_date
        ROWS BETWEEN UNBOUNDED PRECEDING AND CURRENT ROW
    ) AS dept_running_total,

    -- Running count of transactions per department
    COUNT(*) OVER (
        PARTITION BY department
        ORDER BY     sale_date
        ROWS BETWEEN UNBOUNDED PRECEDING AND CURRENT ROW
    ) AS dept_tx_count

FROM sales
ORDER BY department, sale_date;

Moving averages and frame clauses

The frame clause gives you precise control over which rows are included in the window for each row's calculation. It defines the frame as a range of rows relative to the current row.

The full frame clause syntax:

SQL — frame clause syntax

{ROWS | RANGE} BETWEEN frame_start AND frame_end

-- Frame boundary options:
UNBOUNDED PRECEDING   -- first row of the partition
N PRECEDING           -- N rows before the current row
CURRENT ROW           -- the current row itself
N FOLLOWING           -- N rows after the current row
UNBOUNDED FOLLOWING   -- last row of the partition

ROWS counts physical rows. RANGE counts rows by value — when using RANGE, all rows with the same ORDER BY value as the current row are included in the same frame boundary.

SQL — 3-row moving average

-- 3-sale moving average (current row + 2 preceding)
SELECT
    sale_date,
    employee,
    amount,
    ROUND(
        AVG(amount) OVER (
            ORDER BY sale_date
            ROWS BETWEEN 2 PRECEDING AND CURRENT ROW
        ), 2
    ) AS moving_avg_3
FROM sales
ORDER BY sale_date;

Outputsale_date | employee | amount | moving_avg_3 -----------|----------|---------|------------- 2025-01-03 | Carol | 600.00 | 600.00 <- only 1 row available 2025-01-04 | Grace | 400.00 | 500.00 <- average of 2 rows 2025-01-05 | Alice | 1200.00 | 733.33 <- average of 3 rows 2025-01-08 | Bob | 850.00 | 816.67 2025-01-09 | Henry | 880.00 | 976.67 2025-01-10 | Dave | 950.00 | 893.33 2025-01-11 | Iris | 1100.00 | 976.67 2025-01-12 | Eve | 720.00 | 923.33 2025-01-14 | Jack | 970.00 | 930.00 2025-01-15 | Frank | 1500.00 | 1063.33

SQL — centered moving average

-- Centered 5-row window: 2 before + current + 2 after
SELECT
    sale_date,
    employee,
    amount,
    ROUND(
        AVG(amount) OVER (
            ORDER BY sale_date
            ROWS BETWEEN 2 PRECEDING AND 2 FOLLOWING
        ), 2
    ) AS centered_avg_5,

    -- Cumulative max (highest sale seen so far)
    MAX(amount) OVER (
        ORDER BY sale_date
        ROWS BETWEEN UNBOUNDED PRECEDING AND CURRENT ROW
    ) AS running_max

FROM sales
ORDER BY sale_date;

When ORDER BY is present in a window function but no frame clause is specified, the default frame is RANGE BETWEEN UNBOUNDED PRECEDING AND CURRENT ROW — a cumulative window. This is why SUM() OVER (ORDER BY date) gives a running total without an explicit frame clause. Always write the frame clause explicitly in production code to avoid ambiguity.

LAG and LEAD

LAG accesses data from a previous row without a self-join. LEAD accesses a following row. Both are invaluable for period-over-period comparisons, change calculations, and gap analysis:

SQL — LAG and LEAD syntax

LAG  (column [, offset [, default]]) OVER (...)
LEAD (column [, offset [, default]]) OVER (...)

-- offset: how many rows back/forward (default 1)
-- default: value to use when the offset goes out of bounds (default NULL)

SQL — day-over-day change

SELECT
    sale_date,
    employee,
    amount,

    -- Previous row's amount (within the same date ordering)
    LAG(amount)             OVER (ORDER BY sale_date) AS prev_amount,

    -- Amount two rows back
    LAG(amount, 2, 0)       OVER (ORDER BY sale_date) AS amount_2_back,

    -- Change vs previous sale
    amount - LAG(amount, 1, amount) OVER (ORDER BY sale_date) AS delta,

    -- Next sale's amount
    LEAD(amount)            OVER (ORDER BY sale_date) AS next_amount

FROM sales
ORDER BY sale_date;

SQL — month-over-month growth percentage

WITH monthly AS (
    SELECT
        DATE_TRUNC('month', sale_date) AS month,  -- PostgreSQL
        SUM(amount)                               AS total
    FROM   sales
    GROUP BY 1
)
SELECT
    month,
    total,
    LAG(total) OVER (ORDER BY month) AS prev_month_total,
    ROUND(
        100.0 * (total - LAG(total) OVER (ORDER BY month))
               / LAG(total) OVER (ORDER BY month),
        2
    ) AS mom_growth_pct
FROM monthly
ORDER BY month;

SQL — gap analysis: days between consecutive sales per employee

SELECT
    employee,
    sale_date,
    LAG(sale_date) OVER (
        PARTITION BY employee
        ORDER BY     sale_date
    ) AS prev_sale_date,
    sale_date - LAG(sale_date) OVER (
        PARTITION BY employee
        ORDER BY     sale_date
    ) AS days_since_last_sale  -- PostgreSQL date arithmetic
FROM sales
ORDER BY employee, sale_date;

FIRST_VALUE, LAST_VALUE, NTH_VALUE

These functions return the value of a specified expression from the first, last, or nth row of the window frame:

SQL

SELECT
    employee,
    department,
    amount,

    -- Top earner in the same department
    FIRST_VALUE(employee) OVER (
        PARTITION BY department
        ORDER BY     amount DESC
        ROWS BETWEEN UNBOUNDED PRECEDING AND UNBOUNDED FOLLOWING
    ) AS dept_top_earner,

    -- Lowest earner in the same department
    LAST_VALUE(employee) OVER (
        PARTITION BY department
        ORDER BY     amount DESC
        ROWS BETWEEN UNBOUNDED PRECEDING AND UNBOUNDED FOLLOWING
    ) AS dept_lowest_earner,

    -- Second highest earner in the department
    NTH_VALUE(employee, 2) OVER (
        PARTITION BY department
        ORDER BY     amount DESC
        ROWS BETWEEN UNBOUNDED PRECEDING AND UNBOUNDED FOLLOWING
    ) AS dept_second_earner

FROM sales
ORDER BY department, amount DESC;

LAST_VALUE is frequently misused. Without an explicit ROWS BETWEEN UNBOUNDED PRECEDING AND UNBOUNDED FOLLOWING, the default frame ends at the current row, so LAST_VALUE returns the current row's own value — not the partition's last row. Always specify the full frame when using LAST_VALUE or NTH_VALUE.

Named windows (WINDOW clause)

When the same window definition appears multiple times in a query, the WINDOW clause lets you name it once and reference it by name. This avoids repetition and keeps the SELECT readable:

SQL (PostgreSQL, MySQL 8+)

-- Without named window: repetitive
SELECT
    employee,
    department,
    amount,
    ROW_NUMBER() OVER (PARTITION BY department ORDER BY amount DESC) AS rn,
    RANK()       OVER (PARTITION BY department ORDER BY amount DESC) AS rnk,
    DENSE_RANK() OVER (PARTITION BY department ORDER BY amount DESC) AS dense_rnk,
    SUM(amount)  OVER (PARTITION BY department ORDER BY amount DESC) AS running_sum
FROM sales;

-- With named window: define once, reuse cleanly
SELECT
    employee,
    department,
    amount,
    ROW_NUMBER() OVER dept_window AS rn,
    RANK()       OVER dept_window AS rnk,
    DENSE_RANK() OVER dept_window AS dense_rnk,
    SUM(amount)  OVER dept_window AS running_sum
FROM   sales
WINDOW dept_window AS (
    PARTITION BY department
    ORDER BY     amount DESC
);

Named windows also support inheritance — one window can extend another by referencing its name in the OVER clause with additional clauses:

SQL (PostgreSQL)

SELECT
    employee,
    department,
    amount,
    SUM(amount) OVER (by_dept)                        AS dept_total,
    SUM(amount) OVER (by_dept ORDER BY sale_date)    AS dept_running_total
FROM   sales
WINDOW by_dept AS (PARTITION BY department);
-- The second OVER inherits PARTITION BY from by_dept and adds ORDER BY

Real-world query patterns

Deduplicate: keep the latest record per entity

SQL

-- Table with multiple rows per user — keep only the most recent
WITH latest AS (
    SELECT
        *,
        ROW_NUMBER() OVER (
            PARTITION BY user_id
            ORDER BY     created_at DESC
        ) AS rn
    FROM user_events
)
SELECT * FROM latest WHERE rn = 1;

Median without a dedicated function

SQL

-- Median salary using PERCENTILE_CONT (PostgreSQL, SQL Server)
SELECT
    department,
    PERCENTILE_CONT(0.5) WITHIN GROUP (ORDER BY amount) AS median_sale
FROM   sales
GROUP BY department;

-- Median using ROW_NUMBER for databases without PERCENTILE_CONT
WITH ordered AS (
    SELECT
        amount,
        ROW_NUMBER() OVER (ORDER BY amount) AS rn,
        COUNT(*) OVER () AS total
    FROM sales
)
SELECT AVG(amount) AS median
FROM   ordered
WHERE  rn IN ((total + 1) / 2, (total + 2) / 2);

Identify consecutive streaks

SQL — find consecutive days of sales

-- The "islands and gaps" technique using ROW_NUMBER
-- A consecutive streak is where date - row_number is constant
WITH ranked AS (
    SELECT
        employee,
        sale_date,
        sale_date - INTERVAL (ROW_NUMBER() OVER (
            PARTITION BY employee
            ORDER BY     sale_date
        ) || ' days')::interval AS streak_group  -- PostgreSQL
    FROM (
        SELECT DISTINCT employee, sale_date FROM sales
    ) d
)
SELECT
    employee,
    MIN(sale_date) AS streak_start,
    MAX(sale_date) AS streak_end,
    COUNT(*)       AS streak_length
FROM   ranked
GROUP BY employee, streak_group
ORDER BY streak_length DESC;

Year-over-year comparison in a single query

SQL

WITH yearly AS (
    SELECT
        EXTRACT(YEAR FROM sale_date) AS yr,
        department,
        SUM(amount) AS total
    FROM   sales
    GROUP BY 1, 2
)
SELECT
    yr,
    department,
    total,
    LAG(total) OVER (
        PARTITION BY department
        ORDER BY     yr
    ) AS prior_yr_total,
    ROUND(
        100.0 * (total - LAG(total) OVER (PARTITION BY department ORDER BY yr))
               / LAG(total) OVER (PARTITION BY department ORDER BY yr),
        2
    ) AS yoy_growth_pct
FROM   yearly
ORDER BY department, yr;

Performance and indexes

Window functions are evaluated late in the query pipeline — after WHERE, GROUP BY, and HAVING — so they never benefit directly from indexes to reduce the rows scanned. However, indexes can still help in two important ways:

1. Speeding up the sort for ORDER BY inside OVER: If a window function's ORDER BY matches an existing index, the database can avoid a sort step entirely by reading from the index in order. A partial or covering index that matches the PARTITION BY and ORDER BY columns is ideal:

SQL

-- This window function benefits from an index on (department, amount DESC)
ROW_NUMBER() OVER (PARTITION BY department ORDER BY amount DESC)

-- Create a matching index
CREATE INDEX idx_sales_dept_amount
    ON sales (department, amount DESC);

2. Reducing rows early with WHERE/CTE filtering: Place as many filters as possible before the window function evaluates. Wrapping in a CTE that filters first means the window function operates on a smaller dataset:

SQL — filter before windowing

-- Better: filter first, then rank
WITH recent_sales AS (
    SELECT * FROM sales
    WHERE sale_date >= '2025-01-01'   -- filter rows before windowing
)
SELECT
    employee,
    amount,
    RANK() OVER (ORDER BY amount DESC) AS overall_rank
FROM   recent_sales;

-- Always use EXPLAIN / EXPLAIN ANALYZE to verify the execution plan
EXPLAIN ANALYZE
SELECT
    employee,
    ROW_NUMBER() OVER (PARTITION BY department ORDER BY amount DESC) AS rn
FROM sales;

For very large datasets, be aware that window functions may require the database to materialize the sorted partition in memory. If the partition exceeds work_mem (PostgreSQL) or the equivalent sort buffer, the database will spill to disk, which dramatically slows the query. Increasing the sort buffer for a specific query session or rewriting with batched CTEs can help for extremely large datasets.

Finally, avoid computing the same window expression multiple times in a single query. The optimizer may not deduplicate them. Use a CTE or a subquery to compute the window once, then reference the result:

SQL — avoid recomputing the same window

-- Less efficient: same window computed twice (optimizer may not deduplicate)
SELECT
    amount / SUM(amount) OVER (PARTITION BY department) AS pct,
    amount - AVG(amount) OVER (PARTITION BY department) AS vs_avg
FROM sales;

-- More explicit: compute in a subquery, reuse the column aliases
SELECT
    amount / dept_total AS pct,
    amount - dept_avg   AS vs_avg
FROM (
    SELECT
        amount,
        SUM(amount) OVER (PARTITION BY department) AS dept_total,
        AVG(amount) OVER (PARTITION BY department) AS dept_avg
    FROM sales
) sub;

Frequently Asked Questions

What is the difference between GROUP BY and window functions?: GROUP BY collapses multiple rows into one row per group, discarding the individual row data. Window functions compute a value for each row using data from a defined set of rows (the window), while keeping every original row in the result. You lose individual rows with GROUP BY; window functions preserve all rows alongside their computed values.
What is the difference between RANK and DENSE_RANK?: Both assign rank numbers based on ORDER BY within a window. RANK skips numbers after ties — if two rows tie for rank 2, the next rank is 4. DENSE_RANK never skips — if two rows tie for rank 2, the next rank is 3. Use DENSE_RANK when you need consecutive rank numbers (e.g., "top 3 categories"); use RANK when the gap matters for reporting actual position.
Can I use window functions in a WHERE clause?: No. Window functions execute after WHERE, GROUP BY, and HAVING. You cannot filter on a window function result in WHERE. To filter by a window function output, wrap the query in a CTE or subquery and apply the WHERE condition in the outer query. This is the standard pattern for "top N per group" queries.
Which databases support window functions?: All major relational databases support window functions: PostgreSQL (since 8.4, 2008), SQL Server (since 2005), Oracle (since 8i), MySQL (since 8.0, 2018), MariaDB (since 10.2, 2017), SQLite (since 3.25.0, 2018), and DB2. The core syntax (OVER, PARTITION BY, ORDER BY, ROW_NUMBER, RANK, LAG, LEAD) is identical across all of them. Frame clause syntax differs slightly between vendors.