SQL Window Functions: SUM, AVG, LAG, and LEAD

Posted by Kyle Hankinson January 25, 2026


Window functions perform calculations across a set of rows related to the current row — without collapsing the result into groups like GROUP BY does. They are one of the most powerful features in modern SQL.

The OVER() Clause

Every window function uses OVER() to define which rows to include:

SELECT name, department, salary,
    SUM(salary) OVER () AS total_salary
FROM employees;

This adds a total_salary column with the sum of all salaries — without grouping. Every row still appears individually.

PARTITION BY

PARTITION BY divides rows into groups (like GROUP BY, but without collapsing):

SELECT name, department, salary,
    SUM(salary) OVER (PARTITION BY department) AS dept_total,
    AVG(salary) OVER (PARTITION BY department) AS dept_avg
FROM employees;

Each row shows the department total and average alongside the individual salary.

ORDER BY in OVER()

Adding ORDER BY creates a running calculation:

SELECT order_date, amount,
    SUM(amount) OVER (ORDER BY order_date) AS running_total
FROM orders;
order_date amount running_total
Jan 1 100 100
Jan 2 150 250
Jan 3 75 325
Jan 4 200 525

LAG and LEAD

Compare the current row to previous or next rows:

SELECT
    month,
    revenue,
    LAG(revenue) OVER (ORDER BY month) AS prev_month,
    revenue - LAG(revenue) OVER (ORDER BY month) AS month_over_month
FROM monthly_revenue;
month revenue prev_month month_over_month
Jan 10000 NULL NULL
Feb 12000 10000 2000
Mar 11500 12000 -500

LAG(col, n) looks back n rows (default 1). LEAD(col, n) looks forward n rows.

Default Values

Avoid NULLs for the first/last row:

LAG(revenue, 1, 0) OVER (ORDER BY month)  -- returns 0 instead of NULL

FIRST_VALUE and LAST_VALUE

SELECT name, department, salary,
    FIRST_VALUE(name) OVER (PARTITION BY department ORDER BY salary DESC) AS highest_paid
FROM employees;

Percent of Total

SELECT name, department, salary,
    ROUND(100.0 * salary / SUM(salary) OVER (PARTITION BY department), 1) AS pct_of_dept
FROM employees;

Moving Average

Use a frame specification to average over a sliding window:

SELECT order_date, amount,
    AVG(amount) OVER (
        ORDER BY order_date
        ROWS BETWEEN 6 PRECEDING AND CURRENT ROW
    ) AS seven_day_avg
FROM daily_sales;

Frame Types

Frame Meaning
ROWS BETWEEN 2 PRECEDING AND CURRENT ROW Current row + 2 rows before
ROWS BETWEEN UNBOUNDED PRECEDING AND CURRENT ROW All rows from start to current (default for running totals)
ROWS BETWEEN 1 PRECEDING AND 1 FOLLOWING 3-row window centered on current
ROWS BETWEEN UNBOUNDED PRECEDING AND UNBOUNDED FOLLOWING All rows in partition

Combining Multiple Window Functions

SELECT
    name, department, salary,
    RANK() OVER (PARTITION BY department ORDER BY salary DESC) AS dept_rank,
    SUM(salary) OVER (PARTITION BY department) AS dept_total,
    ROUND(100.0 * salary / SUM(salary) OVER (), 2) AS pct_of_company
FROM employees
ORDER BY department, salary DESC;

WINDOW Clause (Reusable Definitions)

Avoid repeating the same OVER clause:

SELECT
    order_date, amount,
    SUM(amount) OVER w AS running_total,
    AVG(amount) OVER w AS running_avg,
    COUNT(*) OVER w AS running_count
FROM orders
WINDOW w AS (ORDER BY order_date);

Supported in PostgreSQL, MySQL 8.0+, and SQLite 3.28+. Not supported in SQL Server or Oracle.

Database Compatibility

Feature MySQL PostgreSQL SQL Server Oracle SQLite
Basic window functions 8.0+ 8.4+ 2005+ 8i+ 3.25+
LAG / LEAD 8.0+ 8.4+ 2012+ 8i+ 3.25+
Frame specification 8.0+ 8.4+ 2012+ 8i+ 3.28+
WINDOW clause 8.0+ Yes No No 3.28+

Tags: MySQL PostgreSQL Microsoft SQL Server

UNION vs UNION ALL in SQL

Posted by Kyle Hankinson November 15, 2025


UNION and UNION ALL both combine results from multiple SELECT statements, but they handle duplicate rows differently — and the performance difference can be significant.

The Difference

  • UNION removes duplicate rows from the combined result (like an implicit DISTINCT)
  • UNION ALL keeps all rows, including duplicates
-- Table A: 1, 2, 3
-- Table B: 2, 3, 4

SELECT id FROM table_a
UNION
SELECT id FROM table_b;
-- Result: 1, 2, 3, 4 (duplicates removed)

SELECT id FROM table_a
UNION ALL
SELECT id FROM table_b;
-- Result: 1, 2, 3, 2, 3, 4 (all rows kept)

Performance

UNION must sort or hash the entire result set to find and remove duplicates. On large datasets, this is expensive.

UNION ALL simply appends the results — no additional processing.

Rule of thumb: Use UNION ALL unless you specifically need duplicate removal. In practice, this is the majority of cases.

When UNION ALL is Correct

If the source queries cannot produce overlapping rows, deduplication is wasted work:

-- Orders from different months — no overlap possible
SELECT * FROM orders WHERE order_date >= '2025-01-01' AND order_date < '2025-02-01'
UNION ALL
SELECT * FROM orders WHERE order_date >= '2025-02-01' AND order_date < '2025-03-01';
-- Different tables entirely
SELECT 'customer' AS source, name, email FROM customers
UNION ALL
SELECT 'vendor' AS source, name, email FROM vendors;

When UNION is Correct

Use UNION when you genuinely need to deduplicate:

-- Find all cities where we have customers OR offices
SELECT city FROM customers
UNION
SELECT city FROM offices;

Rules for UNION / UNION ALL

Both require:

  1. Same number of columns in each SELECT
  2. Compatible data types (the database will attempt implicit conversion)
-- This works
SELECT name, email FROM customers
UNION ALL
SELECT company_name, contact_email FROM vendors;

-- This fails (different column counts)
SELECT name, email, phone FROM customers
UNION ALL
SELECT company_name, contact_email FROM vendors;

Column names in the result come from the first SELECT statement.

ORDER BY with UNION

ORDER BY applies to the final combined result. Place it after the last SELECT:

SELECT name, 'customer' AS type FROM customers
UNION ALL
SELECT name, 'vendor' AS type FROM vendors
ORDER BY name;

You cannot put ORDER BY inside individual SELECT statements (except in subqueries).

UNION ALL with Aggregation

A common pattern — combine data then aggregate:

WITH all_transactions AS (
    SELECT amount, transaction_date FROM sales
    UNION ALL
    SELECT -amount, transaction_date FROM refunds
)
SELECT
    DATE_TRUNC('month', transaction_date) AS month,
    SUM(amount) AS net_revenue
FROM all_transactions
GROUP BY 1
ORDER BY 1;

Combining More Than Two Queries

You can chain multiple UNIONs:

SELECT name FROM customers
UNION ALL
SELECT name FROM vendors
UNION ALL
SELECT name FROM employees
ORDER BY name;

INTERSECT and EXCEPT

Related set operations worth knowing:

-- Rows in both queries
SELECT city FROM customers
INTERSECT
SELECT city FROM offices;

-- Rows in the first query but not the second
SELECT city FROM customers
EXCEPT            -- SQL Server, PostgreSQL, SQLite
SELECT city FROM offices;

-- Oracle uses MINUS instead of EXCEPT
SELECT city FROM customers
MINUS
SELECT city FROM offices;

MySQL 8.0.31+ supports INTERSECT and EXCEPT. Earlier versions do not.


Tags: MySQL PostgreSQL Microsoft SQL Server

Common Table Expressions (CTEs) in SQL

Posted by Kyle Hankinson July 10, 2025


A Common Table Expression (CTE) is a named temporary result set that exists for the duration of a single query. CTEs make complex queries easier to read, write, and debug.

Basic CTE Syntax

WITH active_users AS (
    SELECT id, name, email
    FROM users
    WHERE status = 'active'
)
SELECT * FROM active_users WHERE email LIKE '%@company.com';

The WITH clause defines the CTE, and the main query uses it like a table. The CTE only exists for that one statement.

Why Use CTEs?

Compare a nested subquery to a CTE:

-- Nested subquery (hard to read)
SELECT d.name, stats.avg_salary
FROM departments d
JOIN (
    SELECT department_id, AVG(salary) AS avg_salary
    FROM employees
    WHERE status = 'active'
    GROUP BY department_id
) stats ON d.id = stats.department_id
WHERE stats.avg_salary > 75000;

-- CTE (much clearer)
WITH dept_salaries AS (
    SELECT department_id, AVG(salary) AS avg_salary
    FROM employees
    WHERE status = 'active'
    GROUP BY department_id
)
SELECT d.name, ds.avg_salary
FROM departments d
JOIN dept_salaries ds ON d.id = ds.department_id
WHERE ds.avg_salary > 75000;

Multiple CTEs

Chain multiple CTEs with commas:

WITH
active_users AS (
    SELECT * FROM users WHERE status = 'active'
),
user_orders AS (
    SELECT user_id, COUNT(*) AS order_count, SUM(total) AS total_spent
    FROM orders
    GROUP BY user_id
)
SELECT au.name, uo.order_count, uo.total_spent
FROM active_users au
JOIN user_orders uo ON au.id = uo.user_id
ORDER BY uo.total_spent DESC;

Each CTE can reference the ones defined before it.

Recursive CTEs

Recursive CTEs solve hierarchical data problems like org charts, category trees, and bill-of-materials.

Employee Org Chart

WITH RECURSIVE org_chart AS (
    -- Anchor: start with the CEO (no manager)
    SELECT id, name, manager_id, 0 AS depth
    FROM employees
    WHERE manager_id IS NULL

    UNION ALL

    -- Recursive: find each employee's direct reports
    SELECT e.id, e.name, e.manager_id, oc.depth + 1
    FROM employees e
    JOIN org_chart oc ON e.manager_id = oc.id
)
SELECT REPEAT('  ', depth) || name AS org_tree, depth
FROM org_chart
ORDER BY depth, name;

Building a Path (Breadcrumbs)

WITH RECURSIVE category_path AS (
    SELECT id, name, parent_id, name AS path
    FROM categories
    WHERE parent_id IS NULL

    UNION ALL

    SELECT c.id, c.name, c.parent_id,
           cp.path || ' > ' || c.name
    FROM categories c
    JOIN category_path cp ON c.parent_id = cp.id
)
SELECT name, path FROM category_path ORDER BY path;

Result: Electronics > Computers > Laptops

Number Series

Generate a sequence of numbers without a table:

WITH RECURSIVE numbers AS (
    SELECT 1 AS n
    UNION ALL
    SELECT n + 1 FROM numbers WHERE n < 100
)
SELECT n FROM numbers;

Database Compatibility

Database Basic CTE Recursive CTE Keyword
PostgreSQL Yes Yes WITH RECURSIVE
MySQL 8.0+ Yes Yes WITH RECURSIVE
SQL Server 2005+ Yes Yes WITH (no RECURSIVE keyword)
Oracle 11g+ Yes Yes WITH (no RECURSIVE keyword)
SQLite 3.8.3+ Yes Yes WITH RECURSIVE

SQL Server and Oracle do not use the RECURSIVE keyword — the database detects recursion automatically.

Preventing Infinite Loops

A bug in the recursive condition can cause infinite recursion. Protect against it:

-- PostgreSQL / MySQL: limit depth
WITH RECURSIVE tree AS (
    SELECT id, parent_id, 1 AS depth FROM nodes WHERE parent_id IS NULL
    UNION ALL
    SELECT n.id, n.parent_id, t.depth + 1
    FROM nodes n JOIN tree t ON n.parent_id = t.id
    WHERE t.depth < 50  -- safety limit
)
SELECT * FROM tree;

-- SQL Server: OPTION (MAXRECURSION)
WITH tree AS (...)
SELECT * FROM tree OPTION (MAXRECURSION 100);

CTE vs Subquery vs Temp Table

CTE Subquery Temp Table
Readability Best Worst (nested) Good
Reusable in same query Yes No Yes
Persists across queries No No Yes
Indexable No No Yes
Recursive support Yes No No

Tags: MySQL PostgreSQL Microsoft SQL Server

How to Drop a Database with Active Connections

Posted by Kyle Hankinson March 20, 2025


Trying to drop a database while other sessions are connected to it will fail. Here is how to terminate those connections and drop the database safely in each database system.

PostgreSQL

The Error

ERROR: database "mydb" is being accessed by other users
DETAIL: There are 3 other sessions using the database.

Solution: Terminate Connections, Then Drop

-- Step 1: Prevent new connections
REVOKE CONNECT ON DATABASE mydb FROM PUBLIC;

-- Step 2: Terminate existing connections
SELECT pg_terminate_backend(pid)
FROM pg_stat_activity
WHERE datname = 'mydb' AND pid <> pg_backend_pid();

-- Step 3: Drop the database
DROP DATABASE mydb;

PostgreSQL 13+: DROP DATABASE FORCE

PostgreSQL 13 added a FORCE option that terminates connections automatically:

DROP DATABASE mydb WITH (FORCE);

This is the simplest approach if you are on PostgreSQL 13 or later.

Checking Active Connections

SELECT pid, usename, application_name, state, query_start
FROM pg_stat_activity
WHERE datname = 'mydb';

SQL Server

The Error

Cannot drop database "mydb" because it is currently in use.

Solution: Set Single User Mode

-- Terminate all connections and switch to single-user mode
ALTER DATABASE mydb SET SINGLE_USER WITH ROLLBACK IMMEDIATE;

-- Drop the database
DROP DATABASE mydb;

WITH ROLLBACK IMMEDIATE rolls back any in-progress transactions and disconnects all users immediately.

Alternative: Kill Connections Manually

-- Find active sessions
SELECT session_id, login_name, status, last_request_start_time
FROM sys.dm_exec_sessions
WHERE database_id = DB_ID('mydb');

-- Kill each one
KILL 52;
KILL 53;

-- Then drop
DROP DATABASE mydb;

MySQL

MySQL is more lenient — DROP DATABASE does not require exclusive access. However, active queries will continue running against the dropped tables until they finish or error out.

To cleanly terminate connections first:

-- Find connections to the database
SELECT ID, USER, HOST, DB, COMMAND, TIME, STATE
FROM INFORMATION_SCHEMA.PROCESSLIST
WHERE DB = 'mydb';

-- Kill each connection
KILL 42;
KILL 43;

-- Drop the database
DROP DATABASE mydb;

Or in MySQL 8.0+:

SELECT CONCAT('KILL ', ID, ';')
FROM INFORMATION_SCHEMA.PROCESSLIST
WHERE DB = 'mydb';

Run the output to kill all connections at once.

Quick Reference

Database Command
PostgreSQL 13+ DROP DATABASE mydb WITH (FORCE)
PostgreSQL < 13 pg_terminate_backend() then DROP DATABASE
SQL Server ALTER DATABASE SET SINGLE_USER WITH ROLLBACK IMMEDIATE then DROP
MySQL KILL connections then DROP DATABASE

Safety Considerations

  • Always double-check the database name before dropping. There is no undo.
  • In production: Take a backup first. Use pg_dump, mysqldump, or SQL Server backup.
  • In CI/CD: The PostgreSQL WITH (FORCE) option is ideal for tearing down test databases.
  • Never run these commands against production databases unless you are absolutely certain. A typo in the database name can be catastrophic.

Tags: MySQL PostgreSQL Microsoft SQL Server

How to Copy or Clone a Table in SQL

Posted by Kyle Hankinson October 24, 2024


Cloning a table is useful for backups, testing, and schema experiments. Every database can do it, but what gets copied varies.

MySQL

Structure + Data

CREATE TABLE users_copy AS SELECT * FROM users;

Warning: This copies data and column definitions but NOT indexes, primary keys, auto-increment, or foreign keys.

Structure Only (No Data)

CREATE TABLE users_copy LIKE users;

LIKE copies the full structure including indexes and auto-increment. To then copy data:

INSERT INTO users_copy SELECT * FROM users;

Partial Copy

CREATE TABLE active_users AS
SELECT * FROM users WHERE status = 'active';

PostgreSQL

Structure + Data

CREATE TABLE users_copy AS SELECT * FROM users;

Like MySQL, this does not copy indexes, constraints, or defaults.

Structure with Constraints

CREATE TABLE users_copy (LIKE users INCLUDING ALL);

INCLUDING ALL copies defaults, constraints, indexes, comments, and identity columns. You can be selective:

CREATE TABLE users_copy (LIKE users INCLUDING DEFAULTS INCLUDING CONSTRAINTS);

Then copy data separately:

INSERT INTO users_copy SELECT * FROM users;

SQL Server

Structure + Data

SELECT * INTO users_copy FROM users;

SELECT INTO creates the new table automatically. It copies column definitions and data but not indexes, constraints, or triggers.

Structure Only

SELECT * INTO users_copy FROM users WHERE 1 = 0;

The WHERE 1 = 0 ensures no rows are copied.

With Specific Columns or Filters

SELECT id, name, email INTO active_users
FROM users WHERE status = 'active';

Oracle

-- Structure + Data
CREATE TABLE users_copy AS SELECT * FROM users;

-- Structure Only
CREATE TABLE users_copy AS SELECT * FROM users WHERE 1 = 0;

To get the full DDL including indexes and constraints:

SELECT DBMS_METADATA.GET_DDL('TABLE', 'USERS') FROM DUAL;

Then edit the output to change the table name.

SQLite

-- Structure + Data
CREATE TABLE users_copy AS SELECT * FROM users;

-- Structure Only
CREATE TABLE users_copy AS SELECT * FROM users WHERE 0;

SQLite's CREATE TABLE AS does not copy primary keys or autoincrement.

What Gets Copied

Feature CREATE AS SELECT LIKE / Structure Copy
Column names & types Yes Yes
Data Yes No
Primary key No Yes (MySQL LIKE, PG INCLUDING ALL)
Indexes No Yes (MySQL LIKE, PG INCLUDING ALL)
Auto-increment No Yes (MySQL LIKE, PG INCLUDING ALL)
Foreign keys No No (must add manually)
Triggers No No
Constraints No Yes (PG INCLUDING ALL)

Using SQLPro Studio

In , you can duplicate a table directly from the sidebar. Right-click a table and select "Duplicate Table" to create a copy with the structure and optionally the data — no SQL required.


Tags: MySQL PostgreSQL Microsoft SQL Server SQLite

More articles: