SQL Subqueries Explained Simply (With Real Examples)
SQL subqueries confuse most beginners — not because they’re complicated, but because nobody explains what they’re actually doing. Here’s a clear, simple breakdown with real examples.
A SQL subquery is exactly what it sounds like — a query inside another query. The outer query uses the result of the inner query as if it were a table, a value, or a filter condition.
The concept is simple. What trips people up is the syntax — nested brackets, indentation, and figuring out where one query ends and another begins. This post breaks down SQL subqueries with clear examples so the structure clicks and stays clicked.
What Is a SQL Subquery?
A subquery is a SELECT statement nested inside another SQL statement. The inner query runs first, and its result is passed to the outer query. Think of it as a temporary result set that exists only for the duration of your outer query.
SELECT column_name FROM table_name WHERE column_name IN ( -- This is the subquery — it runs first SELECT column_name FROM another_table WHERE condition );
Read a subquery from the inside out. Ask: what does the inner query return? Then ask: what does the outer query do with that result? Answer both and the full query makes sense.
The 3 Places a Subquery Can Live
Subqueries can appear in three different parts of a SQL statement — each behaves slightly differently.
The most common type. The subquery returns a value or list of values that the outer query filters against. Used with IN, NOT IN, =, >, <, EXISTS.
SELECT customer_id, customer_name FROM customers WHERE customer_id IN ( SELECT DISTINCT customer_id FROM orders WHERE YEAR(order_date) = 2024 ); -- Inner query: list of customer_ids who ordered in 2024 -- Outer query: names of those customers from the customers table
The subquery acts as a temporary table — sometimes called a derived table. You give it an alias and query it like any other table. Useful when you need to aggregate data before filtering it.
SELECT customer_id, total_spent FROM ( SELECT customer_id, SUM(order_total) AS total_spent FROM orders GROUP BY customer_id ) AS customer_totals -- alias is required WHERE total_spent > 1000;
Subqueries in the FROM clause must always have an alias — the name after the closing bracket. Without it, MySQL throws: “Every derived table must have its own alias.”
The least common type — used when you want to return a single calculated value alongside each row. The subquery must return exactly one value (a scalar subquery). Use carefully — it runs once per row and can be slow on large datasets.
SELECT order_id, customer_id, order_total, ( SELECT COUNT(*) FROM orders o2 WHERE o2.customer_id = o1.customer_id ) AS total_orders_by_customer FROM orders o1;
Subqueries vs CTEs — Which Should You Use?
Both solve the same problem — breaking a complex query into steps. The difference is readability and debuggability.
Nested inside the main query. Works for simple, one-step logic. Harder to read when nested more than one level deep. Can’t be referenced more than once.
Defined before the main query, named, and reusable. Much easier to read, test, and debug. Preferred for anything more than one step.
-- Subquery approach (harder to read) SELECT customer_id, total_spent FROM ( SELECT customer_id, SUM(order_total) AS total_spent FROM orders GROUP BY customer_id ) AS totals WHERE total_spent > 1000; -- CTE approach (clean and testable) WITH totals AS ( SELECT customer_id, SUM(order_total) AS total_spent FROM orders GROUP BY customer_id ) SELECT customer_id, total_spent FROM totals WHERE total_spent > 1000;
Use a subquery when the logic is simple and used once. Use a CTE when the logic is multi-step, reused, or when someone else needs to read your query. In practice, CTEs win most of the time.
Correlated Subqueries — The Advanced Version
A regular subquery runs once and returns a fixed result. A correlated subquery references a column from the outer query — which means it runs once for every row in the outer query.
SELECT customer_id, total_spent, region FROM customer_totals ct WHERE total_spent > ( -- Runs once per row, referencing ct.region from the outer query SELECT AVG(total_spent) FROM customer_totals WHERE region = ct.region -- correlated reference );
Correlated subqueries are powerful but can be slow on large tables. For most use cases a JOIN or window function is faster — but they appear frequently in SQL interviews, so understanding them is worth the effort.
The Simple Way to Think About SQL Subqueries
Every time you see a subquery, ask two questions: what does the inner query return, and what does the outer query do with that result? Answer those two questions and the full query becomes readable — no matter how nested it gets.
And when writing your own: if the subquery is getting complex, switch to a CTE. Your future self — and anyone reading your code — will thank you.
Both techniques appear throughout my SQL portfolio. The Contoso Retail Analysis uses multi-step CTEs for RFM scoring, cohort analysis, and CLV modeling — all built from the concepts in this post.
