Writing SQL That Wows the Interviewer

1 min read

Your Guide to Advanced SQL

Once an interviewer knows you can write a basic JOIN, they'll want to see how you handle real business problems. 'Find the top 3 highest-paid employees in each department' or 'Calculate the running total of sales for each day.'

Trying to solve these with basic GROUP BY queries is either impossible or leads to a horribly complex and slow query.

This is where advanced SQL comes in. Knowing these two concepts will instantly set you apart:

Window Functions (PARTITION BY): These are the magic bullet. They let you perform aggregate calculations (like SUM, AVG, RANK) without collapsing your rows.
Common Table Expressions (CTEs): These let you break down a complex, unreadable query with 10 nested subqueries into clean, logical, step-by-step 'temporary' tables.

This cluster will teach you how to use these tools to write clean, powerful, and impressive SQL.

Window Functions: `ROW_NUMBER()`, `RANK()`, `PARTITION BY`

Interview Question: 'How would you find the top 3 salaries per department?'

This is the classic window function question. A GROUP BY can't solve this because it would collapse all the employee rows.

Answer: 'I would use a Window Function like ROW_NUMBER(), using PARTITION BY to restart the ranking for each department.'

A window function performs a calculation across a 'window' of rows related to the current row. The OVER (PARTITION BY ...) clause defines that window.

Code Example: Solving the Problem

The Data (Employees Table):

| Name | Department | Salary | |---|---|---| | Alice | Sales | 90000 | | Bob | Sales | 85000 | | Carol | Sales | 80000 | | David | Sales | 75000 | | Eve | HR | 70000 | | Frank | HR | 65000 | | Grace | HR | 60000 |

First, we write a query to 'rank' the employees. We'll use a CTE to make it readable.

WITH RankedSalaries AS (
  SELECT 
    Name,
    Department,
    Salary,
    -- This is the window function:
    ROW_NUMBER() OVER(PARTITION BY Department ORDER BY Salary DESC) AS RankNum
  FROM Employees
)
-- 'RankedSalaries' is a temporary result that looks like:
-- Name | Department | Salary | RankNum
-- Alice | Sales | 90000 | 1
-- Bob | Sales | 85000 | 2
-- Carol | Sales | 80000 | 3
-- David | Sales | 75000 | 4
-- Eve | HR | 70000 | 1  (Notice the rank restarted for HR)
-- Frank | HR | 65000 | 2
-- Grace | HR | 60000 | 3

-- Now, we just select from our ranked results:
SELECT Name, Department, Salary, RankNum
FROM RankedSalaries
WHERE RankNum <= 3;

The Result: Alice, Bob, Carol, Eve, Frank, and Grace are returned. David (rank 4) is filtered out. This is a clean, powerful solution.

`ROW_NUMBER()` vs. `RANK()` vs. `DENSE_RANK()`

The interviewer might ask a follow-up:

ROW_NUMBER(): Gives a unique, sequential number (1, 2, 3, 4).
RANK(): Skips numbers if there's a tie. (1, 2, 2, 4).
DENSE_RANK(): Does not skip numbers for ties. (1, 2, 2, 3).

CTEs (Common Table Expressions): When to Use Them

Interview Question: 'What is a CTE and why would you use one?'

Answer: 'A CTE, or Common Table Expression, is a temporary, named result set that you define using a WITH clause. The main reason to use one is readability. It lets you break a single, massive, complex query into logical, readable 'steps', almost like variables in programming.'

'Before': The 'Nested Subquery' Mess

Imagine you need to get all orders from 'Sales' employees, and then get the total sum. This is hard to read.

-- Find the total sum of orders from 'Sales' employees
SELECT SUM(t1.OrderTotal)
FROM (
  -- Step 2: Get all orders for those employees
  SELECT o.OrderID, o.OrderTotal
  FROM Orders o
  WHERE o.EmployeeID IN (
    -- Step 1: Get all employees in 'Sales'
    SELECT e.EmployeeID 
    FROM Employees e 
    WHERE e.Department = 'Sales'
  )
) AS t1;

'After': The Clean CTE Solution

Using WITH, we can define our steps logically. The query is executed from top to bottom, and it's much easier to debug and understand.

-- Step 1: Define a CTE for 'Sales' employees
WITH SalesEmployees AS (
  SELECT EmployeeID
  FROM Employees
  WHERE Department = 'Sales'
),

-- Step 2: Define a CTE for their orders
-- This CTE can reference the one above it
SalesOrders AS (
  SELECT o.OrderID, o.OrderTotal
  FROM Orders o
  WHERE o.EmployeeID IN (SELECT EmployeeID FROM SalesEmployees)
)

-- Step 3: Now, just run the final, simple query
SELECT SUM(OrderTotal)
FROM SalesOrders;

Bonus Point: 'CTEs are also useful for writing recursive queries, such as for finding all employees in a management hierarchy (an org chart). This is something a simple JOIN can't do.'

`GROUP BY` vs. `PARTITION BY`: A Clear Comparison

Interview Question: 'What's the difference between `GROUP BY` and `PARTITION BY`?'

This is a top-tier advanced question. Answering it well proves you truly understand window functions.

Answer: 'They both perform aggregations, but the key difference is this: GROUP BY collapses rows into a single row, reducing the number of rows returned. PARTITION BY does not collapse rows; it's a window function that adds an aggregate value to each row in its 'window' or 'partition'.'

Example: The Problem

You have a list of employees. You want to show each employee's salary and the average salary for their department in the same row.

You cannot solve this with GROUP BY. If you GROUP BY Department, you lose the individual employee names and salaries.

1. The `GROUP BY` Way (Collapses Rows)

This query only returns the department-level average. We've lost the employee detail.

SELECT 
  Department,
  AVG(Salary) AS DeptAverage
FROM Employees
GROUP BY Department;

-- Result (Rows are collapsed):
-- Department | DeptAverage
-- Sales | 82500
-- HR | 65000

2. The `PARTITION BY` Way (Keeps Rows)

This is the correct solution. We use AVG() as a window function.

SELECT
  Name,
  Department,
  Salary,
  -- This window function 'partitions' by department
  -- It calculates the AVG for that partition (window)
  -- and 'pastes' that value onto the row.
  AVG(Salary) OVER (PARTITION BY Department) AS DeptAverage
FROM Employees;

-- Result (Rows are not collapsed):
-- Name | Department | Salary | DeptAverage
-- Alice | Sales | 90000 | 82500
-- Bob | Sales | 85000 | 82500
-- Carol | Sales | 80000 | 82500
-- David | Sales | 75000 | 82500
-- Eve | HR | 70000 | 65000
-- Frank | HR | 65000 | 65000
-- Grace | HR | 60000 | 65000

The simple answer: 'GROUP BY reduces the number of rows. PARTITION BY does not; it lets me add an aggregate (like an AVG or SUM) to each row, relative to its group.'

Pivoting Data with `PIVOT` and `CASE` Statements

Interview Question: 'How do you turn rows into columns?'

This is called Pivoting data. For example, 'You have a table of Sales with Month and Amount. How do you create a report with a single row for the year, and columns for Jan, Feb, Mar, etc.?'

There are two ways: the 'classic' CASE statement method, and the PIVOT keyword (if your SQL dialect supports it).

The Data (`MonthlySales` Table):

| Month | Amount | |---|---| | Jan | 100 | | Feb | 150 | | Mar | 200 | | Jan | 50 |

1. The 'Classic' Method: `GROUP BY` with `CASE`

This method works in all SQL dialects (MySQL, PostgreSQL, SQL Server) and is very flexible. You group by the item you want to keep as a row (e.g., the year, if you had one) and use a CASE statement inside an aggregate function for each new column you want.

SELECT 
  -- We want one row per 'year', but we don't have one, 
  -- so we just sum everything.

  SUM(CASE WHEN Month = 'Jan' THEN Amount ELSE 0 END) AS JanTotal,
  SUM(CASE WHEN Month = 'Feb' THEN Amount ELSE 0 END) AS FebTotal,
  SUM(CASE WHEN Month = 'Mar' THEN Amount ELSE 0 END) AS MarTotal
FROM MonthlySales;

-- Result (Rows have become columns):
-- JanTotal | FebTotal | MarTotal
-- 150 | 150 | 200

2. The `PIVOT` Keyword (SQL Server / T-SQL)

The PIVOT keyword is syntactic sugar that does the same thing, but it can be a bit cleaner (or more confusing, depending on your view).

SELECT Jan, Feb, Mar
FROM (
  -- 1. The source data
  SELECT Month, Amount 
  FROM MonthlySales
) AS SourceData
PIVOT (
  -- 2. The aggregation to perform
  SUM(Amount)
  -- 3. The column to 'pivot' on (whose values become the new column names)
  FOR Month IN (Jan, Feb, Mar)
) AS PivotTable;

-- Result is identical:
-- Jan | Feb | Mar
-- 150 | 150 | 200

The simple answer: 'It's called pivoting. I do it using a GROUP BY with a SUM(CASE...) expression for each column I want to create. If I'm using SQL Server, I can also use the PIVOT keyword.'

SQL Performance and Optimization
12 Feb 2025
Database Design & Theory
12 Feb 2025