Sampling Errors, Bias & Variables

Learning Objectives

Why This Matters

In 1936, the Literary Digest magazine surveyed 2.4 million Americans and confidently predicted Alf Landon would defeat Franklin Roosevelt in the presidential election. Roosevelt won 46 of 48 states. The magazine's sample was drawn from phone directories, car registrations, and its own subscriber list, all sources that over-represented wealthier Americans during the Great Depression. The same logic applies today: every political poll, A/B test, and clinical trial lives or dies on whether its sample actually represents the population it claims to describe.

How to Use This Simulation

  1. In the Sampling Simulator tab, click "Draw Samples" to pull a biased and an unbiased sample from the same population. Watch how their means compare to the population mean.
  2. Draw multiple samples and watch the bias tracker - sampling error scatters randomly, but bias stays tilted in the same direction every time.
  3. Switch to the Variable Classification tab to practice identifying variable roles and data types across 10 research scenarios.
  4. Check the Explanation Panel below - it updates as you interact and names the distinction between sampling error and bias.
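
The steps above can also be reproduced offline. The following is a minimal sketch, not the simulator's actual code: the population (roughly normal around 50), the biased method (only the upper half of the population is reachable), and all function names are assumptions made for illustration.

```python
import random
import statistics

def make_population(size=10_000, seed=1):
    """Hypothetical population: values roughly normal around 50."""
    rng = random.Random(seed)
    return [rng.gauss(50, 10) for _ in range(size)]

def draw_random_sample(population, n, rng):
    """Unbiased method: every member is equally likely to be chosen."""
    return rng.sample(population, n)

def draw_biased_sample(population, n, rng):
    """Biased method (assumed for illustration): only members at or
    above the population median can ever be reached."""
    cutoff = statistics.median(population)
    reachable = [x for x in population if x >= cutoff]
    return rng.sample(reachable, n)

def average_signed_difference(population, sampler, n=40, draws=200, seed=2):
    """Mean of (sample mean - population mean) over repeated draws."""
    rng = random.Random(seed)
    mu = statistics.fmean(population)
    diffs = [statistics.fmean(sampler(population, n, rng)) - mu
             for _ in range(draws)]
    return statistics.fmean(diffs)

population = make_population()
random_avg = average_signed_difference(population, draw_random_sample)
biased_avg = average_signed_difference(population, draw_biased_sample)
print(f"random: {random_avg:+.2f}  biased: {biased_avg:+.2f}")
```

With these assumptions, the random method's average signed difference should land close to zero, while the biased method's stays clearly positive no matter how many samples are drawn.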
Each dot: One sample mean minus the population mean.
Zero line: The sample mean matched the population mean.
Bias pattern: If amber dots stay mostly on one side, the method is biased. Bigger samples will not fix that.
Draw samples to see whether the dots bounce around zero or stay mostly on one side.
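
The claim that bigger samples do not fix bias can be checked with a short experiment. This is a sketch under stated assumptions: the population, the biased method (sampling only from values at or above the median), and the sample sizes are hypothetical and not taken from the simulator.

```python
import random
import statistics

rng = random.Random(0)

# Hypothetical population, roughly normal around 50.
population = [rng.gauss(50, 10) for _ in range(10_000)]
mu = statistics.fmean(population)

# Assumed biased method: only values at or above the median can be sampled.
cutoff = statistics.median(population)
reachable = [x for x in population if x >= cutoff]

def signed_differences(pool, n, draws=200):
    """Signed difference (sample mean - population mean) for each draw."""
    return [statistics.fmean(rng.sample(pool, n)) - mu for _ in range(draws)]

results = {}
for n in (10, 40, 160):
    fair = signed_differences(population, n)
    biased = signed_differences(reachable, n)
    # Spread of the fair differences measures sampling error;
    # the average of the biased differences measures bias.
    results[n] = (statistics.stdev(fair), statistics.fmean(biased))
    print(f"n={n:>3}  random spread={results[n][0]:.2f}  "
          f"biased avg={results[n][1]:+.2f}")
```

As the sample size grows, the spread of the random-sample differences shrinks, but the biased average stays about the same distance from zero. Sampling error responds to sample size; bias does not.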

Every variable in a study can be classified along multiple dimensions: its role in the study (explanatory or response) and its data type (qualitative or quantitative; if quantitative, discrete or continuous). These classifications determine which statistical methods apply. Classify both variables in each scenario below.

Note: "Explanatory" and "response" are sometimes called "independent" and "dependent" in other textbooks.
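
One way to keep the two dimensions separate is to record them as separate fields. The sketch below is illustrative only: the class names and the study scenario (weekly study hours predicting a course letter grade) are hypothetical and are not one of the 10 scenarios in the tab.

```python
from dataclasses import dataclass
from enum import Enum

class Role(Enum):
    EXPLANATORY = "explanatory"
    RESPONSE = "response"

class DataType(Enum):
    QUALITATIVE = "qualitative"
    QUANTITATIVE_DISCRETE = "quantitative, discrete"
    QUANTITATIVE_CONTINUOUS = "quantitative, continuous"

@dataclass(frozen=True)
class Variable:
    name: str
    role: Role          # what the variable does in the study
    data_type: DataType  # what kind of values it takes

    def describe(self) -> str:
        return f"{self.name}: {self.role.value}, {self.data_type.value}"

# Hypothetical scenario: does weekly study time predict a course letter grade?
study_hours = Variable("weekly study hours", Role.EXPLANATORY,
                       DataType.QUANTITATIVE_CONTINUOUS)
letter_grade = Variable("course letter grade", Role.RESPONSE,
                        DataType.QUALITATIVE)

for variable in (study_hours, letter_grade):
    print(variable.describe())
```

Role depends on the research question, while data type depends only on the variable itself, so the same variable can change role from one study to another without changing type.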

Samples Drawn: 0. Click "Draw Samples" to begin.
Random Sample Avg. Signed Difference: The average signed difference between random sample means and the population mean.
Biased Sample Avg. Signed Difference: The average signed difference between biased sample means and the population mean.
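
In symbols, the tracked quantity can be written as follows. The notation is an assumption introduced here, not the simulator's: k is the number of samples drawn so far, \bar{x}_i is the mean of the i-th sample, and \mu is the population mean.

```latex
\bar{d} = \frac{1}{k} \sum_{i=1}^{k} \left( \bar{x}_i - \mu \right)
```

Under simple random sampling each \bar{x}_i has expected value \mu, so \bar{d} tends toward zero as k grows. Under a biased method the expected value of \bar{x}_i differs from \mu, and \bar{d} settles at a nonzero value instead.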

What's Happening

Quick Check

A survey asks college students to rate their campus dining hall on a scale of 1 to 5 stars. A student calculating the results reports the mean rating as 3.2 stars. Their statistics professor says the mean may not be the most appropriate measure for this variable. Why?

Try This

A phone survey calls 1,000 randomly selected households between 10 AM and 2 PM on weekdays and asks about employment status and daily screen time. (1) Identify whether sampling bias is present and what kind. (2) If researchers want to study whether employment status predicts screen time, identify the explanatory and response variables. (3) Classify each variable as qualitative or quantitative, and if quantitative, discrete or continuous. Verify your classifications by comparing them with a similar preset in the simulator.

Two studies investigate how much sleep college students get per night. Study A hands out surveys to students leaving a campus gym at 6 AM. Study B emails a survey link to 500 randomly selected student email addresses; 200 respond. (1) Name the sampling method in each study. (2) Describe the likely direction of bias in each - would each method overestimate or underestimate average sleep? (3) Classify the variables in each study. (4) Explain in one sentence why the same question ("how much do students sleep?") produces different answers depending on the sampling method.

A news headline reads: "Fitness app users walk 40% more than the average American, study finds." The study analyzed step-count data from 50,000 users of a popular fitness tracking app. (1) Classify the study design (observational study or experiment). (2) Identify the sampling method and describe at least two bias risks. (3) Identify the variables and classify each by role and data type. (4) Write one sentence stating what the study's design and sample actually support, and one sentence stating what the headline implies but the data cannot confirm. (5) Propose one specific change to the study that would make the headline's claim stronger.

Instructor Notes

Teaching Notes

This simulation is most effective when you let students draw 5-10 samples before explaining the distinction between sampling error and sampling bias. The bias tracker dot chart makes the distinction visible: signed differences from random samples scatter in both directions and average toward zero; signed differences from biased samples cluster on one side. Let students describe the pattern they see before naming it.

The variable classification tab addresses a separate but related objective. Students who can identify "qualitative" and "quantitative" in isolated examples often struggle when a variable uses numbers but represents categories (star ratings, zip codes, jersey numbers). Scenario 8 (driver rating, 1-5 stars) is designed to surface this confusion.
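
A quick demonstration can make the "numbers as labels" point concrete for students. The sketch below uses made-up jersey numbers; the data and variable names are hypothetical.

```python
from collections import Counter
import statistics

# Hypothetical jersey numbers for five players: numerals used as labels.
jersey_numbers = [7, 10, 23, 23, 99]

# The arithmetic runs, but the result describes nothing about the players.
meaningless_mean = statistics.fmean(jersey_numbers)

# Counting how often each label occurs is the summary that fits
# a qualitative variable.
counts = Counter(jersey_numbers)
most_common_number, frequency = counts.most_common(1)[0]

print(f"mean of labels: {meaningless_mean}")
print(f"most common label: {most_common_number} (appears {frequency} times)")
```

The software computes a mean without complaint, which is exactly why students need to classify the variable before choosing a summary.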

Common Student Errors

Discussion Questions

Exam Connection

Typical exam questions present a research scenario and ask students to (1) identify the type of bias present, (2) classify variables by role and data type, and (3) explain whether the study's conclusions are justified. The Starter challenge directly practices this format. The Challenge tier extends to evaluating headlines against study designs, which appears in more advanced exam questions.