Box Plots

Box plots are charts used to summarise the distribution of a numeric variable and to spot outliers quickly. They are especially helpful when you want to compare distributions across multiple groups, such as sales by region, scores by class, or delivery time by courier partner.

A box plot is built from five key values: the median, the first and third quartiles, and the two whisker ends. The middle line inside the box shows the median, which is the middle value of the data. The bottom and top of the box represent the first quartile (25th percentile) and third quartile (75th percentile). The distance between these two quartiles is called the interquartile range (IQR), and it shows where the middle 50 percent of values lie. The whiskers extend beyond the box to show the typical lower and upper range. A common convention draws each whisker to the most extreme data point that lies within 1.5 times the IQR of the box, and points outside the whiskers are usually treated as outliers.
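As a rough illustration, the values behind a box plot can be computed with Python's standard library. This is a minimal sketch, not the method of any particular charting tool: the function name `box_plot_summary`, the sample `delivery_minutes` data, and the 1.5 x IQR whisker rule are all assumptions made for the example, and different tools may compute quartiles slightly differently.

```python
import statistics

def box_plot_summary(values, k=1.5):
    """Return the values a box plot is drawn from (hypothetical helper)."""
    data = sorted(values)
    # method="inclusive" interpolates between data points, treating the
    # smallest and largest values as the 0th and 100th percentiles.
    q1, median, q3 = statistics.quantiles(data, n=4, method="inclusive")
    iqr = q3 - q1
    # Assumed whisker rule: whiskers stop at the most extreme points
    # that are still within k * IQR of the box edges.
    lower_fence = q1 - k * iqr
    upper_fence = q3 + k * iqr
    inside = [v for v in data if lower_fence <= v <= upper_fence]
    return {
        "q1": q1,
        "median": median,
        "q3": q3,
        "iqr": iqr,
        "whisker_low": min(inside),
        "whisker_high": max(inside),
    }

# Made-up delivery times in minutes, for illustration only.
delivery_minutes = [22, 25, 27, 28, 30, 31, 33, 35, 38, 90]
summary = box_plot_summary(delivery_minutes)
print(summary)
```

With this sample, the box spans 27.25 to 34.5 minutes and the upper whisker stops at 38, so the 90-minute delivery falls outside the whiskers.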

Box plots are useful because they show both spread and skewness in a compact way. If the median line is closer to the bottom of the box, it suggests the data is right-skewed, with a longer tail toward higher values. If it is closer to the top, it suggests the data is left-skewed, with a longer tail toward lower values. The length of the box and whiskers also helps you see variability. A longer box means the middle 50 percent of values vary more, while a shorter box suggests those values are more consistent.
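The median-position rule of thumb can be sketched in a few lines of Python. The function name `box_skew_hint` and the sample data are hypothetical, and the result is only a hint based on the box itself; it ignores the whiskers and is not a formal skewness statistic.

```python
import statistics

def box_skew_hint(values):
    """Guess skew direction from where the median sits inside the box."""
    q1, median, q3 = statistics.quantiles(values, n=4, method="inclusive")
    lower_half = median - q1   # distance from box bottom to median line
    upper_half = q3 - median   # distance from median line to box top
    if upper_half > lower_half:
        # Median sits nearer the bottom of the box.
        return "right-skewed (tail toward higher values)"
    if lower_half > upper_half:
        # Median sits nearer the top of the box.
        return "left-skewed (tail toward lower values)"
    return "roughly symmetric"

# Made-up order values, for illustration only.
order_values = [100, 120, 130, 150, 200, 400, 900]
hint = box_skew_hint(order_values)
print(hint)
```

On small samples the two halves of the box can differ by chance, so it is safer to treat the result as a prompt to look at the data than as a conclusion.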

Spotting outliers is a major reason to use box plots. They appear as individual points outside the whiskers. These can be genuine extreme values or data errors, so they are worth investigating. For example, very high order values may represent premium customers, or they may represent incorrect entries.
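To list the points that would be drawn outside the whiskers, one option is to apply the widely used 1.5 x IQR rule directly. This is a sketch under that assumption; the function name `find_outliers`, the multiplier `k`, and the sample order values are made up for illustration, and a given charting tool may use a different rule.

```python
import statistics

def find_outliers(values, k=1.5):
    """Return the values lying more than k * IQR beyond the quartiles."""
    q1, _, q3 = statistics.quantiles(values, n=4, method="inclusive")
    iqr = q3 - q1
    lower_fence = q1 - k * iqr
    upper_fence = q3 + k * iqr
    # Keep the original order so flagged values are easy to trace back.
    return [v for v in values if v < lower_fence or v > upper_fence]

# Made-up order values: one entry is far larger than the rest.
order_values = [120, 135, 150, 160, 175, 180, 210, 2500]
flagged = find_outliers(order_values)
print(flagged)
```

The code only flags candidates. Whether 2500 is a premium customer or a typing error still has to be checked against the source records.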

Box plots are best used when you want a quick distribution summary or when you want to compare multiple categories side-by-side. They help you understand typical values, variation, and unusual cases without needing long tables or multiple charts.
