GROUP BY is great for summaries. But it collapses your rows. You get one row per group. What if you want the summary and the individual rows at the same time?

That is what window functions do. They calculate across rows without removing any of them.

In the previous article, you learned about aggregation and subqueries. Window functions are the next level.

What Are Window Functions?

A window function performs a calculation across a set of rows that are related to the current row. This set of rows is called a window.

Here is the key difference:

  • GROUP BY + SUM() gives you one row per group
  • SUM() OVER() gives you the sum on every row, without collapsing anything
-- GROUP BY: one row per author
SELECT author_id, SUM(price) AS total
FROM books
GROUP BY author_id;

-- Window function: every row stays, with a running total added
SELECT
    title,
    author_id,
    price,
    SUM(price) OVER() AS total_all_books
FROM books;
| title                    | author_id | price | total_all_books |
|--------------------------|-----------|-------|-----------------|
| The Great Gatsby         | 1         | 12.99 | 88.93           |
| To Kill a Mockingbird    | 2         | 14.99 | 88.93           |
| 1984                     | 3         | 11.99 | 88.93           |
| Pride and Prejudice      | 4         | 9.99  | 88.93           |
| The Catcher in the Rye   | 5         | 13.99 | 88.93           |
| Animal Farm              | 3         | 8.99  | 88.93           |
| Tender Is the Night      | 1         | 15.99 | 88.93           |

Every row has the total price of all books. No rows are lost.

The OVER() Clause

Every window function uses OVER(). This is what makes it a window function.

  • OVER() — the window is all rows
  • OVER(PARTITION BY column) — the window is rows with the same value in that column
  • OVER(ORDER BY column) — the window is all rows, ordered by that column

You can combine them: OVER(PARTITION BY column1 ORDER BY column2).

PARTITION BY — Windows per Group

PARTITION BY divides rows into groups (partitions). The window function runs separately for each group.

Sum of book prices per author, on every row:

SELECT
    title,
    author_id,
    price,
    SUM(price) OVER(PARTITION BY author_id) AS author_total
FROM books;
| title                    | author_id | price | author_total |
|--------------------------|-----------|-------|--------------|
| The Great Gatsby         | 1         | 12.99 | 28.98        |
| Tender Is the Night      | 1         | 15.99 | 28.98        |
| To Kill a Mockingbird    | 2         | 14.99 | 14.99        |
| 1984                     | 3         | 11.99 | 20.98        |
| Animal Farm              | 3         | 8.99  | 20.98        |
| Pride and Prejudice      | 4         | 9.99  | 9.99         |
| The Catcher in the Rye   | 5         | 13.99 | 13.99        |

Author 1 has two books totaling 28.98. Both of their rows show that same total. Author 3 also has two books totaling 20.98. This is like GROUP BY, but you keep every individual row.

ROW_NUMBER — Numbering Rows

ROW_NUMBER() assigns a sequential number to each row.

SELECT
    ROW_NUMBER() OVER(ORDER BY price DESC) AS row_num,
    title,
    price
FROM books;
| row_num | title                    | price |
|---------|--------------------------|-------|
| 1       | Tender Is the Night      | 15.99 |
| 2       | To Kill a Mockingbird    | 14.99 |
| 3       | The Catcher in the Rye   | 13.99 |
| 4       | The Great Gatsby         | 12.99 |
| 5       | 1984                     | 11.99 |
| 6       | Pride and Prejudice      | 9.99  |
| 7       | Animal Farm              | 8.99  |

Number rows within each group:

SELECT
    ROW_NUMBER() OVER(PARTITION BY author_id ORDER BY price DESC) AS author_rank,
    title,
    author_id,
    price
FROM books;
| author_rank | title                    | author_id | price |
|-------------|--------------------------|-----------|-------|
| 1           | Tender Is the Night      | 1         | 15.99 |
| 2           | The Great Gatsby         | 1         | 12.99 |
| 1           | To Kill a Mockingbird    | 2         | 14.99 |
| 1           | 1984                     | 3         | 11.99 |
| 2           | Animal Farm              | 3         | 8.99  |
| 1           | Pride and Prejudice      | 4         | 9.99  |
| 1           | The Catcher in the Rye   | 5         | 13.99 |

The numbering restarts for each author. This is useful for finding the “top N per group” — a very common interview question.

Get the most expensive book per author:

WITH ranked AS (
    SELECT
        ROW_NUMBER() OVER(PARTITION BY author_id ORDER BY price DESC) AS rn,
        title,
        author_id,
        price
    FROM books
)
SELECT title, author_id, price
FROM ranked
WHERE rn = 1;
| title                    | author_id | price |
|--------------------------|-----------|-------|
| Tender Is the Night      | 1         | 15.99 |
| To Kill a Mockingbird    | 2         | 14.99 |
| 1984                     | 3         | 11.99 |
| Pride and Prejudice      | 4         | 9.99  |
| The Catcher in the Rye   | 5         | 13.99 |

You cannot use WHERE ROW_NUMBER() ... = 1 directly. You need a CTE or subquery first, then filter.

RANK and DENSE_RANK — Ranking with Ties

When two rows have the same value, ROW_NUMBER gives them different numbers. RANK and DENSE_RANK handle ties differently.

Let us add some data for this example:

CREATE TABLE exam_scores (
    student TEXT,
    score INTEGER
);

INSERT INTO exam_scores VALUES
('Alex', 95), ('Sam', 90), ('Jordan', 90),
('Taylor', 85), ('Morgan', 80);

Compare all three ranking functions:

SELECT
    student,
    score,
    ROW_NUMBER() OVER(ORDER BY score DESC) AS row_num,
    RANK() OVER(ORDER BY score DESC) AS rank,
    DENSE_RANK() OVER(ORDER BY score DESC) AS dense_rank
FROM exam_scores;
| student | score | row_num | rank | dense_rank |
|---------|-------|---------|------|------------|
| Alex    | 95    | 1       | 1    | 1          |
| Sam     | 90    | 2       | 2    | 2          |
| Jordan  | 90    | 3       | 2    | 2          |
| Taylor  | 85    | 4       | 4    | 3          |
| Morgan  | 80    | 5       | 5    | 4          |

Sam and Jordan both scored 90. Here is how each function handles this:

  • ROW_NUMBER: Gives them different numbers (2, 3). No ties.
  • RANK: Both get rank 2. The next rank is 4 (skips 3).
  • DENSE_RANK: Both get rank 2. The next rank is 3 (no gap).

Use RANK when you want gaps after ties (like sports standings). Use DENSE_RANK when you want consecutive numbers.

LAG and LEAD — Previous and Next Row Values

LAG looks at the previous row. LEAD looks at the next row. These are incredibly useful for comparing a value with the one before or after it.

Let us create a monthly sales table:

CREATE TABLE monthly_sales (
    month TEXT,
    revenue REAL
);

INSERT INTO monthly_sales VALUES
('2026-01', 1200.00),
('2026-02', 1500.00),
('2026-03', 1350.00),
('2026-04', 1800.00),
('2026-05', 1650.00),
('2026-06', 2100.00);

Compare each month to the previous month:

SELECT
    month,
    revenue,
    LAG(revenue) OVER(ORDER BY month) AS prev_month,
    revenue - LAG(revenue) OVER(ORDER BY month) AS change
FROM monthly_sales;
| month   | revenue | prev_month | change |
|---------|---------|------------|--------|
| 2026-01 | 1200.00 | NULL       | NULL   |
| 2026-02 | 1500.00 | 1200.00    | 300.00 |
| 2026-03 | 1350.00 | 1500.00    | -150.00|
| 2026-04 | 1800.00 | 1350.00    | 450.00 |
| 2026-05 | 1650.00 | 1800.00    | -150.00|
| 2026-06 | 2100.00 | 1650.00    | 450.00 |

January has no previous month, so LAG returns NULL.

Month-over-month growth percentage:

SELECT
    month,
    revenue,
    ROUND(
        (revenue - LAG(revenue) OVER(ORDER BY month))
        / LAG(revenue) OVER(ORDER BY month) * 100,
        1
    ) AS growth_pct
FROM monthly_sales;
| month   | revenue | growth_pct |
|---------|---------|------------|
| 2026-01 | 1200.00 | NULL       |
| 2026-02 | 1500.00 | 25.0       |
| 2026-03 | 1350.00 | -10.0      |
| 2026-04 | 1800.00 | 33.3       |
| 2026-05 | 1650.00 | -8.3       |
| 2026-06 | 2100.00 | 27.3       |

LEAD looks ahead:

SELECT
    month,
    revenue,
    LEAD(revenue) OVER(ORDER BY month) AS next_month
FROM monthly_sales;
| month   | revenue | next_month |
|---------|---------|------------|
| 2026-01 | 1200.00 | 1500.00    |
| 2026-02 | 1500.00 | 1350.00    |
| 2026-03 | 1350.00 | 1800.00    |
| 2026-04 | 1800.00 | 1650.00    |
| 2026-05 | 1650.00 | 2100.00    |
| 2026-06 | 2100.00 | NULL       |

LAG and LEAD with an offset:

-- Look 2 months back instead of 1
LAG(revenue, 2) OVER(ORDER BY month)

-- Provide a default value instead of NULL
LAG(revenue, 1, 0) OVER(ORDER BY month)

Running Totals

A running total adds up values as you go through the rows. Use SUM() OVER(ORDER BY ...).

SELECT
    month,
    revenue,
    SUM(revenue) OVER(ORDER BY month) AS running_total
FROM monthly_sales;
| month   | revenue | running_total |
|---------|---------|---------------|
| 2026-01 | 1200.00 | 1200.00       |
| 2026-02 | 1500.00 | 2700.00       |
| 2026-03 | 1350.00 | 4050.00       |
| 2026-04 | 1800.00 | 5850.00       |
| 2026-05 | 1650.00 | 7500.00       |
| 2026-06 | 2100.00 | 9600.00       |

Each row shows the total revenue from January up to that month.

Moving Averages

A moving average averages a fixed number of recent rows. This smooths out fluctuations.

3-month moving average:

SELECT
    month,
    revenue,
    ROUND(AVG(revenue) OVER(
        ORDER BY month
        ROWS BETWEEN 2 PRECEDING AND CURRENT ROW
    ), 2) AS moving_avg_3m
FROM monthly_sales;
| month   | revenue | moving_avg_3m |
|---------|---------|---------------|
| 2026-01 | 1200.00 | 1200.00       |
| 2026-02 | 1500.00 | 1350.00       |
| 2026-03 | 1350.00 | 1350.00       |
| 2026-04 | 1800.00 | 1550.00       |
| 2026-05 | 1650.00 | 1600.00       |
| 2026-06 | 2100.00 | 1850.00       |

ROWS BETWEEN 2 PRECEDING AND CURRENT ROW means “look at the current row and the 2 rows before it.” For January, there are no previous rows, so it just averages itself.

NTILE — Dividing Rows into Buckets

NTILE(n) divides rows into n roughly equal groups.

Divide books into 3 price tiers:

SELECT
    title,
    price,
    NTILE(3) OVER(ORDER BY price) AS price_tier
FROM books;
| title                    | price | price_tier |
|--------------------------|-------|------------|
| Animal Farm              | 8.99  | 1          |
| Pride and Prejudice      | 9.99  | 1          |
| 1984                     | 11.99 | 1          |
| The Great Gatsby         | 12.99 | 2          |
| The Catcher in the Rye   | 13.99 | 2          |
| To Kill a Mockingbird    | 14.99 | 3          |
| Tender Is the Night      | 15.99 | 3          |

Tier 1 is the cheapest third. Tier 3 is the most expensive third. This is useful for creating percentile groups.

Divide into quartiles (4 groups):

NTILE(4) OVER(ORDER BY price) AS quartile

Frame Clauses

Frame clauses define exactly which rows the window function looks at. You saw one already with the moving average.

The syntax is:

ROWS BETWEEN start AND end

Common options:

FrameMeaning
UNBOUNDED PRECEDINGFrom the first row
N PRECEDINGN rows before the current row
CURRENT ROWThe current row
N FOLLOWINGN rows after the current row
UNBOUNDED FOLLOWINGTo the last row

Examples:

-- Running total (default with ORDER BY)
SUM(revenue) OVER(ORDER BY month ROWS BETWEEN UNBOUNDED PRECEDING AND CURRENT ROW)

-- 3-month moving average
AVG(revenue) OVER(ORDER BY month ROWS BETWEEN 2 PRECEDING AND CURRENT ROW)

-- Total of current row, 1 before, and 1 after
SUM(revenue) OVER(ORDER BY month ROWS BETWEEN 1 PRECEDING AND 1 FOLLOWING)

Most of the time, you do not need to specify the frame explicitly. The defaults work for running totals. Use explicit frames for moving averages and centered calculations.

Window Functions vs GROUP BY

When should you use which?

Use CaseUse
“How many books per author?” (one row per author)GROUP BY
“Rank books by price” (keep all rows)Window function
“Running total of revenue” (keep all rows)Window function
“Average order value” (one number)GROUP BY or aggregate
“Compare each month to previous month”Window function (LAG)
“Top 3 books per author” (keep individual rows)Window function + CTE

The rule: if you need individual rows plus a summary, use a window function. If you only need the summary, use GROUP BY.

Common Mistakes

1. Using window functions in WHERE

-- Wrong: you cannot filter by a window function in WHERE
SELECT * FROM books WHERE ROW_NUMBER() OVER(ORDER BY price) <= 3;

-- Correct: use a CTE first
WITH numbered AS (
    SELECT *, ROW_NUMBER() OVER(ORDER BY price) AS rn
    FROM books
)
SELECT * FROM numbered WHERE rn <= 3;

Window functions are calculated after WHERE. You need a subquery or CTE to filter by them.

2. Forgetting ORDER BY in OVER()

-- This gives unpredictable results
SELECT title, ROW_NUMBER() OVER() AS rn FROM books;

Without ORDER BY, the database picks any order. The row numbers will be different each time. Always specify ORDER BY when the order matters.

3. Confusing RANK with ROW_NUMBER for top-N queries

If you want exactly N rows, use ROW_NUMBER. If you want all rows tied at position N, use RANK or DENSE_RANK.

What You Learned

In this article, you learned:

  • Window functions calculate across rows without collapsing them
  • OVER() defines the window
  • PARTITION BY creates separate windows per group
  • ROW_NUMBER, RANK, DENSE_RANK number and rank rows
  • LAG and LEAD access previous and next row values
  • Running totals use SUM() OVER(ORDER BY ...)
  • Moving averages use frame clauses: ROWS BETWEEN
  • NTILE divides rows into equal buckets

What’s Next?

In the next article, you will learn about indexes and performance. You will see how to make slow queries fast using CREATE INDEX and EXPLAIN ANALYZE.