Quartiles and Box Plots
Learning Objectives
- Find the five-number summary of a data set
- Find and interpret percentiles and quartiles of a data set
- Identify the interquartile range and potential outliers in a set of data
- Construct and understand box-and-whisker plots
Why This Matters
Boston Marathon results get reported as "average finish time: 3:55," but that number tells you almost nothing about what running the race actually looks like. The middle 50% of finishers cluster between 3:30 and 4:30, the top 25% blow through under 3:30, and the elite outliers finish near 2:05 -- basically a different sport. Coaches, race organizers, and runners use box plots instead of averages because the five-number summary tells you what the average hides: where the pack actually is, who's an outlier, and how stretched the tails get.
How to Use This Simulation
- Drag the outlier slider to move one data point and watch the whisker, box, and outlier flag respond.
- Switch preset datasets to see how data shape changes the box plot's shape and spread.
- Open the Percentile Explorer tab to convert between values and percentile ranks in both directions.
- Read the Explanation Panel below the chart -- it updates with your interactions and explains what just changed.
Drag to move the last data point. Watch the whisker snap when the value crosses a fence.
Percentile Explorer
The Pth percentile is the value at or below which approximately P% of the data falls. Q1 is the 25th percentile, Q2 (the median) is the 50th, and Q3 is the 75th. Quartiles are computed by the median-of-halves method (your textbook's quartile rule). For other percentile values, this tool uses the formula i = (P/100)(n+1), the standard intro-stats method.
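As a rough sketch of the locator formula above, the Python below finds the value at a given percentile. The function name `percentile_value` and the seven-value dataset are made up for illustration. The handling of a non-integer i (average the values at the two neighboring positions) is an assumption based on a common intro-stats convention; the tool itself may interpolate differently.

```python
import math

def percentile_value(data, p):
    """Value at the p-th percentile using the locator i = (p/100)(n+1).

    Assumption: when i is not a whole number, average the values at the
    positions just below and just above i. Positions are 1-based and are
    clamped to the ends of the data.
    """
    xs = sorted(data)
    n = len(xs)
    i = p * (n + 1) / 100               # 1-based position in the sorted list
    lo = min(max(math.floor(i), 1), n)  # position at or just below i
    hi = min(max(math.ceil(i), 1), n)   # position at or just above i
    return (xs[lo - 1] + xs[hi - 1]) / 2

# Hypothetical dataset (not one of the simulation's presets)
data = [4, 7, 8, 10, 12, 15, 20]

print(percentile_value(data, 25))  # i = 2   -> 7.0  (Q1)
print(percentile_value(data, 50))  # i = 4   -> 10.0 (median)
print(percentile_value(data, 75))  # i = 6   -> 15.0 (Q3)
print(percentile_value(data, 40))  # i = 3.2 -> 9.0  (average of 8 and 10)
```

For this dataset the 25th and 75th percentiles land on the same values as the median-of-halves quartiles (7 and 15), which is why the two rules often look interchangeable on small examples.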
Quick Check
A real estate site reports box plots for two neighborhoods. Neighborhood X has a wide box stretching from $280K (Q1) to $520K (Q3) with median at $390K. Neighborhood Y has a narrow box from $410K (Q1) to $445K (Q3) with median at $425K. What do these box plots reveal about the two neighborhoods?
Try This
Load the Sleep Hours per Night preset. Without dragging the slider yet, read off the median and the IQR from the Five-Number Summary and IQR cards.
Now predict: if you drag the slider to move the largest value from 9.0 hours up to 12.0 hours, will the median move? Drag and check. Then write down the IQR before and after. Did the IQR change? Why or why not?
A coffee shop records customer wait times in minutes for 9 customers:
2, 3, 3, 4, 5, 6, 7, 8, 11
By hand, find the five-number summary and the IQR. Then determine whether the 11-minute wait is an outlier using the 1.5 × IQR rule. Show your fence calculation.
Open the Edit Data panel, replace the dataset with these 9 values, and verify your work against the Five-Number Summary and Outliers cards.
A university is choosing between two course sections to assign a struggling-student tutoring program. The program can only support one section. Final exam scores:
Section A: 62, 68, 70, 71, 72, 73, 74, 75, 76, 77, 78, 95
Section B: 45, 58, 62, 68, 70, 72, 74, 76, 78, 80, 82, 88
Enter each dataset one at a time using the Edit Data panel. Compare the five-number summaries and IQRs. The means are close (about 74.3 vs 71.1).
Write a two-sentence recommendation to the program director: which section should get the tutoring resources, and what does the five-number summary tell you that the means don't?
Instructor Notes
Teaching Notes
This simulation is designed around a single visual moment: the whisker snapping back when a dragged point crosses the 1.5 × IQR fence. Have students load the Marathon Finish Times preset and predict where the upper fence sits (around 295 minutes for the default data) before they drag the slider. When the dot crosses the fence and turns red, the upper whisker visibly retreats to the next non-outlier value. That single frame anchors the rule.
The Concert Ticket Prices preset is intentionally bimodal. It teaches what box plots can't show. A box plot of bimodal data looks like a wide, undifferentiated rectangle -- the two clusters are invisible. This is the lesson, not a flaw. Pair it with a return to the histogram concept from the earlier simulation when it comes up.
Common Student Errors
- Drawing whiskers all the way to the min or max instead of stopping at the most extreme non-outlier. The simulation enforces the correct rule and the explanation panel narrates the snap when it happens.
- Confusing IQR (a measure of spread) with median position (a measure of center). The Quick Check question targets this directly.
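- Using the wrong quartile method. This simulation uses the exclusive median-of-halves method (Moore & McCabe / OpenStax): split the sorted data at the median, then take the median of each half. For odd n, exclude the median from both halves. Tukey's hinges differ on exactly this point: they include the median in both halves when n is odd, so the two methods can give slightly different quartiles.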
- Treating outlier flags as deletion instructions. The 1.5 × IQR rule identifies candidates for review, not data to discard.
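For instructors who want to check answers outside the simulation, here is a minimal Python sketch of the rules in the list above: exclusive median-of-halves quartiles, the 1.5 × IQR fences, and whiskers that stop at the most extreme non-outlier. The function names are hypothetical and this is not the simulation's own code. It is run on the Section A scores from the Try This activity and assumes a dataset with at least four values.

```python
from statistics import median

def five_number_summary(data):
    """Min, Q1, median, Q3, max using the exclusive median-of-halves rule."""
    xs = sorted(data)
    n = len(xs)
    half = n // 2
    lower = xs[:half]                              # values below the median position
    upper = xs[half + 1:] if n % 2 else xs[half:]  # odd n: leave the median out
    return xs[0], median(lower), median(xs), median(upper), xs[-1]

def fences_and_whiskers(data):
    """1.5 x IQR fences, whisker ends, and flagged values."""
    xs = sorted(data)
    _, q1, _, q3, _ = five_number_summary(xs)
    iqr = q3 - q1
    low_fence, high_fence = q1 - 1.5 * iqr, q3 + 1.5 * iqr
    inside = [x for x in xs if low_fence <= x <= high_fence]
    outliers = [x for x in xs if x < low_fence or x > high_fence]
    return {
        "iqr": iqr,
        "fences": (low_fence, high_fence),
        "whiskers": (inside[0], inside[-1]),  # most extreme non-outliers
        "outliers": outliers,                 # candidates for review, not deletion
    }

section_a = [62, 68, 70, 71, 72, 73, 74, 75, 76, 77, 78, 95]
summary = five_number_summary(section_a)
result = fences_and_whiskers(section_a)

print(summary)             # (62, 70.5, 73.5, 76.5, 95)
print(result["fences"])    # (61.5, 85.5)
print(result["whiskers"])  # (62, 78)
print(result["outliers"])  # [95]
```

The upper whisker ends at 78 rather than at the maximum of 95, which is the same retreat students see when a dragged point crosses the fence.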
Discussion Questions
- Why might the same data look like it has outliers under one quartile method but not another? Does that mean outliers are subjective?
- The IQR ignores the top 25% and the bottom 25% of values. When is that the right thing to do, and when is that hiding important information?
- If you were reporting on income inequality, would you use mean, median, or the full five-number summary? Why?
Exam Connection
Typical exam items provide a sorted dataset and ask for the five-number summary and IQR, often with a follow-up asking the student to determine whether a specific value is an outlier. Stress that the answer requires showing the fence calculation, not just stating "yes" or "no." For percentile items, the formula i = (P/100)(n+1) is the OpenStax convention. It usually agrees with the median-of-halves quartiles shown here, though the two rules can differ for some sample sizes (n = 10, for example).
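One way a complete fence calculation might be written out, using the coffee shop wait times from the Try This activity (2, 3, 3, 4, 5, 6, 7, 8, 11) and the exclusive median-of-halves rule. With n = 9 the median is the 5th value, so the lower half is 2, 3, 3, 4 and the upper half is 6, 7, 8, 11. The layout is only a suggestion, not a required format.

```latex
\begin{aligned}
Q_1 &= \tfrac{3 + 3}{2} = 3, \qquad Q_3 = \tfrac{7 + 8}{2} = 7.5 \\
\mathrm{IQR} &= Q_3 - Q_1 = 7.5 - 3 = 4.5 \\
\text{lower fence} &= Q_1 - 1.5 \times \mathrm{IQR} = 3 - 6.75 = -3.75 \\
\text{upper fence} &= Q_3 + 1.5 \times \mathrm{IQR} = 7.5 + 6.75 = 14.25 \\
11 &\le 14.25 \;\Rightarrow\; \text{the 11-minute wait is not an outlier}
\end{aligned}
```

A bare "no" would miss the point of the item: the comparison of 11 against the upper fence of 14.25 is what earns the credit.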