SQL Window Functions: SUM, AVG, LAG, and LEAD
Posted by Kyle Hankinson January 25, 2026
Window functions perform calculations across a set of rows related to the current row — without collapsing the result into groups like GROUP BY does. They are one of the most powerful features in modern SQL.
The OVER() Clause
Every window function uses OVER() to define which rows to include:
SELECT name, department, salary,
SUM(salary) OVER () AS total_salary
FROM employees;
This adds a total_salary column with the sum of all salaries — without grouping. Every row still appears individually.
PARTITION BY
PARTITION BY divides rows into groups (like GROUP BY, but without collapsing):
SELECT name, department, salary,
SUM(salary) OVER (PARTITION BY department) AS dept_total,
AVG(salary) OVER (PARTITION BY department) AS dept_avg
FROM employees;
Each row shows the department total and average alongside the individual salary.
ORDER BY in OVER()
Adding ORDER BY creates a running calculation:
SELECT order_date, amount,
SUM(amount) OVER (ORDER BY order_date) AS running_total
FROM orders;
| order_date | amount | running_total |
|---|---|---|
| Jan 1 | 100 | 100 |
| Jan 2 | 150 | 250 |
| Jan 3 | 75 | 325 |
| Jan 4 | 200 | 525 |
LAG and LEAD
Compare the current row to previous or next rows:
SELECT
month,
revenue,
LAG(revenue) OVER (ORDER BY month) AS prev_month,
revenue - LAG(revenue) OVER (ORDER BY month) AS month_over_month
FROM monthly_revenue;
| month | revenue | prev_month | month_over_month |
|---|---|---|---|
| Jan | 10000 | NULL | NULL |
| Feb | 12000 | 10000 | 2000 |
| Mar | 11500 | 12000 | -500 |
LAG(col, n) looks back n rows (default 1). LEAD(col, n) looks forward n rows.
Default Values
Avoid NULLs for the first/last row:
LAG(revenue, 1, 0) OVER (ORDER BY month) -- returns 0 instead of NULL
FIRST_VALUE and LAST_VALUE
SELECT name, department, salary,
FIRST_VALUE(name) OVER (PARTITION BY department ORDER BY salary DESC) AS highest_paid
FROM employees;
Percent of Total
SELECT name, department, salary,
ROUND(100.0 * salary / SUM(salary) OVER (PARTITION BY department), 1) AS pct_of_dept
FROM employees;
Moving Average
Use a frame specification to average over a sliding window:
SELECT order_date, amount,
AVG(amount) OVER (
ORDER BY order_date
ROWS BETWEEN 6 PRECEDING AND CURRENT ROW
) AS seven_day_avg
FROM daily_sales;
Frame Types
| Frame | Meaning |
|---|---|
ROWS BETWEEN 2 PRECEDING AND CURRENT ROW |
Current row + 2 rows before |
ROWS BETWEEN UNBOUNDED PRECEDING AND CURRENT ROW |
All rows from start to current (default for running totals) |
ROWS BETWEEN 1 PRECEDING AND 1 FOLLOWING |
3-row window centered on current |
ROWS BETWEEN UNBOUNDED PRECEDING AND UNBOUNDED FOLLOWING |
All rows in partition |
Combining Multiple Window Functions
SELECT
name, department, salary,
RANK() OVER (PARTITION BY department ORDER BY salary DESC) AS dept_rank,
SUM(salary) OVER (PARTITION BY department) AS dept_total,
ROUND(100.0 * salary / SUM(salary) OVER (), 2) AS pct_of_company
FROM employees
ORDER BY department, salary DESC;
WINDOW Clause (Reusable Definitions)
Avoid repeating the same OVER clause:
SELECT
order_date, amount,
SUM(amount) OVER w AS running_total,
AVG(amount) OVER w AS running_avg,
COUNT(*) OVER w AS running_count
FROM orders
WINDOW w AS (ORDER BY order_date);
Supported in PostgreSQL, MySQL 8.0+, and SQLite 3.28+. Not supported in SQL Server or Oracle.
Database Compatibility
| Feature | MySQL | PostgreSQL | SQL Server | Oracle | SQLite |
|---|---|---|---|---|---|
| Basic window functions | 8.0+ | 8.4+ | 2005+ | 8i+ | 3.25+ |
| LAG / LEAD | 8.0+ | 8.4+ | 2012+ | 8i+ | 3.25+ |
| Frame specification | 8.0+ | 8.4+ | 2012+ | 8i+ | 3.28+ |
| WINDOW clause | 8.0+ | Yes | No | No | 3.28+ |
Tags: MySQL PostgreSQL Microsoft SQL Server
UNION vs UNION ALL in SQL
Posted by Kyle Hankinson November 15, 2025
UNION and UNION ALL both combine results from multiple SELECT statements, but they handle duplicate rows differently — and the performance difference can be significant.
The Difference
UNIONremoves duplicate rows from the combined result (like an implicitDISTINCT)UNION ALLkeeps all rows, including duplicates
-- Table A: 1, 2, 3
-- Table B: 2, 3, 4
SELECT id FROM table_a
UNION
SELECT id FROM table_b;
-- Result: 1, 2, 3, 4 (duplicates removed)
SELECT id FROM table_a
UNION ALL
SELECT id FROM table_b;
-- Result: 1, 2, 3, 2, 3, 4 (all rows kept)
Performance
UNION must sort or hash the entire result set to find and remove duplicates. On large datasets, this is expensive.
UNION ALL simply appends the results — no additional processing.
Rule of thumb: Use UNION ALL unless you specifically need duplicate removal. In practice, this is the majority of cases.
When UNION ALL is Correct
If the source queries cannot produce overlapping rows, deduplication is wasted work:
-- Orders from different months — no overlap possible
SELECT * FROM orders WHERE order_date >= '2025-01-01' AND order_date < '2025-02-01'
UNION ALL
SELECT * FROM orders WHERE order_date >= '2025-02-01' AND order_date < '2025-03-01';
-- Different tables entirely
SELECT 'customer' AS source, name, email FROM customers
UNION ALL
SELECT 'vendor' AS source, name, email FROM vendors;
When UNION is Correct
Use UNION when you genuinely need to deduplicate:
-- Find all cities where we have customers OR offices
SELECT city FROM customers
UNION
SELECT city FROM offices;
Rules for UNION / UNION ALL
Both require:
- Same number of columns in each SELECT
- Compatible data types (the database will attempt implicit conversion)
-- This works
SELECT name, email FROM customers
UNION ALL
SELECT company_name, contact_email FROM vendors;
-- This fails (different column counts)
SELECT name, email, phone FROM customers
UNION ALL
SELECT company_name, contact_email FROM vendors;
Column names in the result come from the first SELECT statement.
ORDER BY with UNION
ORDER BY applies to the final combined result. Place it after the last SELECT:
SELECT name, 'customer' AS type FROM customers
UNION ALL
SELECT name, 'vendor' AS type FROM vendors
ORDER BY name;
You cannot put ORDER BY inside individual SELECT statements (except in subqueries).
UNION ALL with Aggregation
A common pattern — combine data then aggregate:
WITH all_transactions AS (
SELECT amount, transaction_date FROM sales
UNION ALL
SELECT -amount, transaction_date FROM refunds
)
SELECT
DATE_TRUNC('month', transaction_date) AS month,
SUM(amount) AS net_revenue
FROM all_transactions
GROUP BY 1
ORDER BY 1;
Combining More Than Two Queries
You can chain multiple UNIONs:
SELECT name FROM customers
UNION ALL
SELECT name FROM vendors
UNION ALL
SELECT name FROM employees
ORDER BY name;
INTERSECT and EXCEPT
Related set operations worth knowing:
-- Rows in both queries
SELECT city FROM customers
INTERSECT
SELECT city FROM offices;
-- Rows in the first query but not the second
SELECT city FROM customers
EXCEPT -- SQL Server, PostgreSQL, SQLite
SELECT city FROM offices;
-- Oracle uses MINUS instead of EXCEPT
SELECT city FROM customers
MINUS
SELECT city FROM offices;
MySQL 8.0.31+ supports INTERSECT and EXCEPT. Earlier versions do not.
Tags: MySQL PostgreSQL Microsoft SQL Server
Common Table Expressions (CTEs) in SQL
Posted by Kyle Hankinson July 10, 2025
A Common Table Expression (CTE) is a named temporary result set that exists for the duration of a single query. CTEs make complex queries easier to read, write, and debug.
Basic CTE Syntax
WITH active_users AS (
SELECT id, name, email
FROM users
WHERE status = 'active'
)
SELECT * FROM active_users WHERE email LIKE '%@company.com';
The WITH clause defines the CTE, and the main query uses it like a table. The CTE only exists for that one statement.
Why Use CTEs?
Compare a nested subquery to a CTE:
-- Nested subquery (hard to read)
SELECT d.name, stats.avg_salary
FROM departments d
JOIN (
SELECT department_id, AVG(salary) AS avg_salary
FROM employees
WHERE status = 'active'
GROUP BY department_id
) stats ON d.id = stats.department_id
WHERE stats.avg_salary > 75000;
-- CTE (much clearer)
WITH dept_salaries AS (
SELECT department_id, AVG(salary) AS avg_salary
FROM employees
WHERE status = 'active'
GROUP BY department_id
)
SELECT d.name, ds.avg_salary
FROM departments d
JOIN dept_salaries ds ON d.id = ds.department_id
WHERE ds.avg_salary > 75000;
Multiple CTEs
Chain multiple CTEs with commas:
WITH
active_users AS (
SELECT * FROM users WHERE status = 'active'
),
user_orders AS (
SELECT user_id, COUNT(*) AS order_count, SUM(total) AS total_spent
FROM orders
GROUP BY user_id
)
SELECT au.name, uo.order_count, uo.total_spent
FROM active_users au
JOIN user_orders uo ON au.id = uo.user_id
ORDER BY uo.total_spent DESC;
Each CTE can reference the ones defined before it.
Recursive CTEs
Recursive CTEs solve hierarchical data problems like org charts, category trees, and bill-of-materials.
Employee Org Chart
WITH RECURSIVE org_chart AS (
-- Anchor: start with the CEO (no manager)
SELECT id, name, manager_id, 0 AS depth
FROM employees
WHERE manager_id IS NULL
UNION ALL
-- Recursive: find each employee's direct reports
SELECT e.id, e.name, e.manager_id, oc.depth + 1
FROM employees e
JOIN org_chart oc ON e.manager_id = oc.id
)
SELECT REPEAT(' ', depth) || name AS org_tree, depth
FROM org_chart
ORDER BY depth, name;
Building a Path (Breadcrumbs)
WITH RECURSIVE category_path AS (
SELECT id, name, parent_id, name AS path
FROM categories
WHERE parent_id IS NULL
UNION ALL
SELECT c.id, c.name, c.parent_id,
cp.path || ' > ' || c.name
FROM categories c
JOIN category_path cp ON c.parent_id = cp.id
)
SELECT name, path FROM category_path ORDER BY path;
Result: Electronics > Computers > Laptops
Number Series
Generate a sequence of numbers without a table:
WITH RECURSIVE numbers AS (
SELECT 1 AS n
UNION ALL
SELECT n + 1 FROM numbers WHERE n < 100
)
SELECT n FROM numbers;
Database Compatibility
| Database | Basic CTE | Recursive CTE | Keyword |
|---|---|---|---|
| PostgreSQL | Yes | Yes | WITH RECURSIVE |
| MySQL 8.0+ | Yes | Yes | WITH RECURSIVE |
| SQL Server 2005+ | Yes | Yes | WITH (no RECURSIVE keyword) |
| Oracle 11g+ | Yes | Yes | WITH (no RECURSIVE keyword) |
| SQLite 3.8.3+ | Yes | Yes | WITH RECURSIVE |
SQL Server and Oracle do not use the RECURSIVE keyword — the database detects recursion automatically.
Preventing Infinite Loops
A bug in the recursive condition can cause infinite recursion. Protect against it:
-- PostgreSQL / MySQL: limit depth
WITH RECURSIVE tree AS (
SELECT id, parent_id, 1 AS depth FROM nodes WHERE parent_id IS NULL
UNION ALL
SELECT n.id, n.parent_id, t.depth + 1
FROM nodes n JOIN tree t ON n.parent_id = t.id
WHERE t.depth < 50 -- safety limit
)
SELECT * FROM tree;
-- SQL Server: OPTION (MAXRECURSION)
WITH tree AS (...)
SELECT * FROM tree OPTION (MAXRECURSION 100);
CTE vs Subquery vs Temp Table
| CTE | Subquery | Temp Table | |
|---|---|---|---|
| Readability | Best | Worst (nested) | Good |
| Reusable in same query | Yes | No | Yes |
| Persists across queries | No | No | Yes |
| Indexable | No | No | Yes |
| Recursive support | Yes | No | No |
Tags: MySQL PostgreSQL Microsoft SQL Server
How to Drop a Database with Active Connections
Posted by Kyle Hankinson March 20, 2025
Trying to drop a database while other sessions are connected to it will fail. Here is how to terminate those connections and drop the database safely in each database system.
PostgreSQL
The Error
ERROR: database "mydb" is being accessed by other users
DETAIL: There are 3 other sessions using the database.
Solution: Terminate Connections, Then Drop
-- Step 1: Prevent new connections
REVOKE CONNECT ON DATABASE mydb FROM PUBLIC;
-- Step 2: Terminate existing connections
SELECT pg_terminate_backend(pid)
FROM pg_stat_activity
WHERE datname = 'mydb' AND pid <> pg_backend_pid();
-- Step 3: Drop the database
DROP DATABASE mydb;
PostgreSQL 13+: DROP DATABASE FORCE
PostgreSQL 13 added a FORCE option that terminates connections automatically:
DROP DATABASE mydb WITH (FORCE);
This is the simplest approach if you are on PostgreSQL 13 or later.
Checking Active Connections
SELECT pid, usename, application_name, state, query_start
FROM pg_stat_activity
WHERE datname = 'mydb';
SQL Server
The Error
Cannot drop database "mydb" because it is currently in use.
Solution: Set Single User Mode
-- Terminate all connections and switch to single-user mode
ALTER DATABASE mydb SET SINGLE_USER WITH ROLLBACK IMMEDIATE;
-- Drop the database
DROP DATABASE mydb;
WITH ROLLBACK IMMEDIATE rolls back any in-progress transactions and disconnects all users immediately.
Alternative: Kill Connections Manually
-- Find active sessions
SELECT session_id, login_name, status, last_request_start_time
FROM sys.dm_exec_sessions
WHERE database_id = DB_ID('mydb');
-- Kill each one
KILL 52;
KILL 53;
-- Then drop
DROP DATABASE mydb;
MySQL
MySQL is more lenient — DROP DATABASE does not require exclusive access. However, active queries will continue running against the dropped tables until they finish or error out.
To cleanly terminate connections first:
-- Find connections to the database
SELECT ID, USER, HOST, DB, COMMAND, TIME, STATE
FROM INFORMATION_SCHEMA.PROCESSLIST
WHERE DB = 'mydb';
-- Kill each connection
KILL 42;
KILL 43;
-- Drop the database
DROP DATABASE mydb;
Or in MySQL 8.0+:
SELECT CONCAT('KILL ', ID, ';')
FROM INFORMATION_SCHEMA.PROCESSLIST
WHERE DB = 'mydb';
Run the output to kill all connections at once.
Quick Reference
| Database | Command |
|---|---|
| PostgreSQL 13+ | DROP DATABASE mydb WITH (FORCE) |
| PostgreSQL < 13 | pg_terminate_backend() then DROP DATABASE |
| SQL Server | ALTER DATABASE SET SINGLE_USER WITH ROLLBACK IMMEDIATE then DROP |
| MySQL | KILL connections then DROP DATABASE |
Safety Considerations
- Always double-check the database name before dropping. There is no undo.
- In production: Take a backup first. Use
pg_dump,mysqldump, or SQL Server backup. - In CI/CD: The PostgreSQL
WITH (FORCE)option is ideal for tearing down test databases. - Never run these commands against production databases unless you are absolutely certain. A typo in the database name can be catastrophic.
Tags: MySQL PostgreSQL Microsoft SQL Server
How to Copy or Clone a Table in SQL
Posted by Kyle Hankinson October 24, 2024
Cloning a table is useful for backups, testing, and schema experiments. Every database can do it, but what gets copied varies.
MySQL
Structure + Data
CREATE TABLE users_copy AS SELECT * FROM users;
Warning: This copies data and column definitions but NOT indexes, primary keys, auto-increment, or foreign keys.
Structure Only (No Data)
CREATE TABLE users_copy LIKE users;
LIKE copies the full structure including indexes and auto-increment. To then copy data:
INSERT INTO users_copy SELECT * FROM users;
Partial Copy
CREATE TABLE active_users AS
SELECT * FROM users WHERE status = 'active';
PostgreSQL
Structure + Data
CREATE TABLE users_copy AS SELECT * FROM users;
Like MySQL, this does not copy indexes, constraints, or defaults.
Structure with Constraints
CREATE TABLE users_copy (LIKE users INCLUDING ALL);
INCLUDING ALL copies defaults, constraints, indexes, comments, and identity columns. You can be selective:
CREATE TABLE users_copy (LIKE users INCLUDING DEFAULTS INCLUDING CONSTRAINTS);
Then copy data separately:
INSERT INTO users_copy SELECT * FROM users;
SQL Server
Structure + Data
SELECT * INTO users_copy FROM users;
SELECT INTO creates the new table automatically. It copies column definitions and data but not indexes, constraints, or triggers.
Structure Only
SELECT * INTO users_copy FROM users WHERE 1 = 0;
The WHERE 1 = 0 ensures no rows are copied.
With Specific Columns or Filters
SELECT id, name, email INTO active_users
FROM users WHERE status = 'active';
Oracle
-- Structure + Data
CREATE TABLE users_copy AS SELECT * FROM users;
-- Structure Only
CREATE TABLE users_copy AS SELECT * FROM users WHERE 1 = 0;
To get the full DDL including indexes and constraints:
SELECT DBMS_METADATA.GET_DDL('TABLE', 'USERS') FROM DUAL;
Then edit the output to change the table name.
SQLite
-- Structure + Data
CREATE TABLE users_copy AS SELECT * FROM users;
-- Structure Only
CREATE TABLE users_copy AS SELECT * FROM users WHERE 0;
SQLite's CREATE TABLE AS does not copy primary keys or autoincrement.
What Gets Copied
| Feature | CREATE AS SELECT | LIKE / Structure Copy |
|---|---|---|
| Column names & types | Yes | Yes |
| Data | Yes | No |
| Primary key | No | Yes (MySQL LIKE, PG INCLUDING ALL) |
| Indexes | No | Yes (MySQL LIKE, PG INCLUDING ALL) |
| Auto-increment | No | Yes (MySQL LIKE, PG INCLUDING ALL) |
| Foreign keys | No | No (must add manually) |
| Triggers | No | No |
| Constraints | No | Yes (PG INCLUDING ALL) |
Using SQLPro Studio
In