Preparing for an SQL interview can be intimidating, but having a solid grasp of common questions and answers will boost your confidence. Below is a comprehensive list of over 100 SQL interview questions, organized by difficulty and topic. This resource is perfect for developers, analysts, engineers, product managers, and anyone prepping for technical interviews.
SQL (Structured Query Language) is the standard language used to manage relational databases. It allows users to retrieve, insert, update, and delete data, as well as manage database structure and permissions.
A primary key uniquely identifies each record in a table. It must be unique and cannot contain NULL values.
A foreign key is a column (or set of columns) that creates a link between two tables by referencing the primary key of another table.
NULL represents a missing or undefined value. It is not equivalent to zero or an empty string.
A unique key ensures all values in a column or group of columns are distinct. Unlike the primary key, it allows a single NULL.
Constraints enforce rules on data, such as:
Normalization is the process of organizing data to reduce redundancy and improve data integrity. Common forms include:
Denormalization introduces redundancy for improved read performance. It’s often used in OLAP/data warehousing scenarios.
A composite key is made up of two or more columns used to uniquely identify rows in a table.
A view is a virtual table based on a SELECT statement. It does not store data but shows results of a query.
CREATE TABLE employees (
id INT PRIMARY KEY,
name VARCHAR(100),
department VARCHAR(50)
);
UPDATE employees SET name = 'Alice' WHERE id = 1;
GROUP BY aggregates rows with the same values into summary rows:
SELECT department, COUNT(*) FROM employees GROUP BY department;
DISTINCT returns unique values for specified columns.
SELECT * FROM employees WHERE name LIKE 'A%';
A query nested inside another query, often in WHERE, FROM, or SELECT.
A subquery that references columns from the outer query.
ALTER TABLE employees RENAME COLUMN old_name TO new_name;
ALTER TABLE old_table_name RENAME TO new_table_name;
Functions that perform calculations across a set of table rows related to the current row:
Returns a unique number for each row within a partition.
RANK() skips numbers for ties, DENSE_RANK() does not.
Temporary named result set used in a WITH clause.
Returns the first non-null value in a list of inputs.
Acts as an IF-THEN-ELSE for conditional logic.
Use DISTINCT, or GROUP BY, or ROW_NUMBER() and filter.
Use MOD(id, 2) = 0 for even, = 1 for odd.
Improves performance by reducing the number of rows scanned.
A join where a table is joined to itself.
Use parameterized queries, prepared statements, and avoid dynamic SQL.
A sequence of operations performed atomically. Uses BEGIN, COMMIT, ROLLBACK.
Functionally similar – both return the current date and time.
SELECT MAX(salary) FROM employees WHERE salary < (SELECT MAX(salary) FROM employees);
SELECT * FROM employees LIMIT 10;
In SQL Server:
SELECT TOP 10 * FROM employees;
SELECT column_name, COUNT(*)
FROM table_name
GROUP BY column_name
HAVING COUNT(*) > 1;
SELECT COUNT(*) FROM table_name;
In PostgreSQL/MySQL:
DESCRIBE table_name;
In SQL Server:
EXEC sp_help 'table_name';
A table that exists temporarily during a session or transaction. It is prefixed with # in SQL Server or created via CREATE TEMP TABLE.
A precomputed view stored on disk for performance. Unlike regular views, materialized views store actual data.
Stored procedures are reusable SQL code blocks that perform actions such as inserts, updates, or complex logic, and can accept parameters.
Triggers are automated procedures that run in response to database events like INSERT, UPDATE, or DELETE.
The database query planner evaluates possible strategies and picks the most efficient one based on indexes, statistics, and constraints.
They are extensions to GROUP BY that allow subtotals and grand totals in aggregation queries.
Turning rows into columns for aggregation. For example, monthly sales in columns instead of rows.
Turning columns into rows, often used for flattening data structures.
Use recursive Common Table Expressions:
WITH RECURSIVE cte_name AS (
SELECT ...
UNION ALL
SELECT ...
FROM cte_name
WHERE condition
)
SELECT * FROM cte_name;
MERGE allows you to INSERT, UPDATE, or DELETE in a single query based on conditions.
Use EXPLAIN, database-specific profiling tools (e.g. pg_stat_statements, MySQL EXPLAIN), and logs.
Improving query speed and efficiency by rewriting queries, creating indexes, and reducing operations like joins or subqueries.
A situation where two transactions wait indefinitely for each other to release locks. Resolved via detection or timeouts.
Transaction isolation levels control concurrency:
SELECT * FROM users WHERE last_login < CURRENT_DATE - INTERVAL '30 days';
Use window functions:
SELECT *
FROM (
SELECT product_id, month, sales,
RANK() OVER (PARTITION BY month ORDER BY sales DESC) AS rk
FROM sales_table
) t
WHERE rk <= 3;
SELECT user_id, amount,
SUM(amount) OVER (PARTITION BY user_id ORDER BY date) AS running_total
FROM transactions;
SELECT region, SUM(sales) AS total_sales,
SUM(sales) * 100.0 / SUM(SUM(sales)) OVER () AS percent_total
FROM sales
GROUP BY region;
SELECT *
FROM (
SELECT *, ROW_NUMBER() OVER (PARTITION BY user_id ORDER BY updated_at DESC) AS rn
FROM users
) sub
WHERE rn = 1;
SELECT id FROM table_a
EXCEPT
SELECT id FROM table_b;
Use LEAD() and check for differences > 1.
SELECT * FROM users WHERE LOWER(name) = 'john';
In PostgreSQL:
SELECT regexp_split_to_table('a,b,c', ',');
SELECT *
FROM orders o
JOIN customers c
ON o.customer_id = c.id AND o.region = c.region;
Look for outliers in login frequency, activity counts, or session durations using aggregation + standard deviation or window functions.
SELECT user_id, event_date,
AVG(metric) OVER (PARTITION BY user_id ORDER BY event_date ROWS BETWEEN 6 PRECEDING AND CURRENT ROW) AS moving_avg
FROM metrics;
Use AT TIME ZONE in PostgreSQL or CONVERT_TZ() in MySQL. Always store data in UTC and convert on display.
SELECT * FROM events WHERE EXTRACT(DOW FROM event_date) IN (0, 6); -- PostgreSQL: Sunday = 0
Cohort analysis: group users by signup date and track their return behavior in time buckets (e.g., week 0, week 1, week 2).
Use CASE WHEN:
SELECT user_id,
SUM(CASE WHEN month = '2024-01' THEN amount ELSE 0 END) AS jan_total,
SUM(CASE WHEN month = '2024-02' THEN amount ELSE 0 END) AS feb_total
FROM purchases
GROUP BY user_id;
SELECT DATE(timestamp_column) FROM events;
SELECT user_id, event_date,
LAG(event_date) OVER (PARTITION BY user_id ORDER BY event_date) AS prev_date
FROM logins;
Just chain joins:
SELECT *
FROM a
JOIN b ON a.id = b.a_id
JOIN c ON b.id = c.b_id;
They allow the DB engine to locate data faster, reducing full-table scans.
Avoid on frequently updated columns or small tables. Indexes slow down writes.
Use PARTITION BY clause based on ranges, lists, or hashes (varies by DB).
Embed related data, use redundant fields, pre-join tables into wide tables for faster read access.
Joins, filtering, grouping, subqueries, window functions, and basic data wrangling.
To answer product usage questions: conversion funnels, retention, feature adoption.
Debugging issues, writing test queries, analyzing logs/events data, profiling performance.
Compare daily active users to rolling 7-day average using window functions.
Cohort users by signup date, count returning users in each time bucket.
Use user ID or session ID as the join key, ensure consistent data types.
Run on a limited dataset, use EXPLAIN, and sanity-check outputs with known inputs.
This list of 100+ SQL interview questions and answers is designed to help you:
💡 Next steps: