sql median

Galaxy Glossary

How do you calculate the median of a dataset in SQL?

The median is the middle value in a sorted dataset. SQL doesn't have a direct median function. We need to use other functions like `PERCENTILE_CONT` or a combination of `ORDER BY` and `ROW_NUMBER` to calculate it.
Sign up for the latest in SQL knowledge from the Galaxy Team!
Thank you! Your submission has been received!
Oops! Something went wrong while submitting the form.

Description

Calculating the median in SQL requires a bit more work than using a built-in function. Unlike some other aggregate functions like `AVG` or `SUM`, SQL doesn't directly provide a `MEDIAN` function. This means we need to find a way to determine the middle value in a sorted dataset. One common approach is to use the `PERCENTILE_CONT` function, which returns the value at a specific percentile. To find the median, we use the 50th percentile. Alternatively, we can use a combination of `ORDER BY` and `ROW_NUMBER` to rank the data and then identify the middle value. This method is more flexible, but requires more code.

Why sql median is important

Understanding how to calculate the median is crucial for data analysis. The median provides a robust measure of central tendency, less susceptible to outliers than the mean. This is vital for understanding the typical value in a dataset, especially when dealing with skewed distributions.

Example Usage

```sql -- Using PERCENTILE_CONT SELECT PERCENTILE_CONT(0.5) WITHIN GROUP (ORDER BY salary) AS median_salary FROM employees; -- Using ROW_NUMBER and ORDER BY WITH RankedSalaries AS ( SELECT salary, ROW_NUMBER() OVER (ORDER BY salary) as rn, ROW_NUMBER() OVER (ORDER BY salary DESC) as rn_desc FROM employees ) SELECT salary FROM RankedSalaries WHERE rn = (SELECT COUNT(*) / 2 + 1 FROM employees) or rn_desc = (SELECT COUNT(*) / 2 + 1 FROM employees) ORDER BY salary ASC; ```

Common Mistakes

Want to learn about other SQL terms?