Median In SQL

Galaxy Glossary

How do you calculate the median of a column in SQL?

The median is the middle value in a sorted dataset. SQL doesn't have a direct median function. We can calculate it using other aggregate functions and sorting.
Sign up for the latest in SQL knowledge from the Galaxy Team!
Thank you! Your submission has been received!
Oops! Something went wrong while submitting the form.

Description

Finding the median in SQL requires a multi-step approach. Unlike some other aggregate functions like AVG or SUM, there isn't a built-in MEDIAN function. This means we need to combine sorting and ranking techniques to determine the middle value. The core idea is to sort the data and then identify the value that sits in the middle. If the dataset has an even number of rows, the median is the average of the two middle values.Consider a table named 'sales' with columns 'product_name' and 'sales_amount'. To find the median sales amount, we first need to sort the sales amounts. Then, we can use a window function to rank the sales amounts. Finally, we can filter for the middle value(s). For example, if we have sales amounts of 10, 20, 30, 40, 50, the median is 30. If we have 10, 20, 30, 40, 50, 60, the median is the average of 30 and 40, which is 35.This approach is robust and works for various datasets. It's important to handle cases with an even number of rows correctly to ensure accuracy. The use of window functions makes the query efficient and scalable for larger datasets.

Why Median In SQL is important

Calculating the median is crucial for understanding the central tendency of a dataset. It's less susceptible to outliers than the mean, providing a more representative view of the typical value. This is important in various business applications, such as analyzing sales data, customer demographics, or financial performance.

Example Usage


SELECT
    CASE
        WHEN COUNT(*) % 2 = 0
        THEN (value_at_rank + value_at_rank_plus_one) / 2
        ELSE value_at_rank
    END AS median
FROM
    (
        SELECT
            sales_amount,
            NTILE(100) OVER (ORDER BY sales_amount) AS sales_rank
        FROM
            sales
    ) AS ranked_sales
WHERE
    sales_rank = 50; -- Adjust for the desired percentile

Common Mistakes

Want to learn about other SQL terms?