Missing Data, Sample Bias, and Why Representation Matters
Why Data Isn’t Always “Fair”
Data is often thought of as objective — just numbers, facts, and charts. But in reality, data is shaped by decisions:
Who we choose to ask
What questions we ask
What we ignore
And how we interpret the answers
These decisions can unintentionally (or intentionally) leave people out — especially those from underrepresented or marginalized communities.
If we only ask certain people certain questions, we’ll only get part of the truth. That’s not just a math problem — it’s a justice issue.
Missing Data & Missing Voices
Sometimes, certain people are not represented in a dataset at all. This is known as missing data. It can happen for many reasons:
The data collectors didn’t think to include them
They were excluded by barriers to access (like language or a lack of internet)
They chose not to participate because they didn’t feel safe, respected, or seen
When voices are missing from data, entire stories and perspectives disappear from the conversation. This makes the final conclusions unfair, misleading, or incomplete.
For example, a national survey about teen mental health only collects responses from schools with reliable Wi-Fi and English-language instruction. This leaves out:
Rural students without internet access
Newcomer students who don’t speak English
Teens who skip school due to trauma or instability
Even if the data shows that “most teens are fine,” it may be ignoring the very teens who are most at risk. That means the findings fail to serve the people who need them the most.
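To see how much missing voices can shift a headline number, here is a minimal Python sketch. Every number in it is made up for illustration (it is not real survey data), and the function name share_fine is just a label chosen for this example. It compares the result from only the teens the survey reached with the result if everyone had been included.

```python
# All numbers below are made up for illustration; they are not survey results.
# Each row: (group, teens in group, teens reporting they are "fine", reached by survey)
groups = [
    ("Schools with reliable Wi-Fi and English instruction", 800, 640, True),
    ("Rural students without internet access", 100, 50, False),
    ("Newcomer students who don't speak English", 60, 24, False),
    ("Teens who skip school due to trauma or instability", 40, 8, False),
]

def share_fine(rows):
    """Share of teens in `rows` who report being fine."""
    total = sum(teens for _, teens, _, _ in rows)
    fine = sum(fine_count for _, _, fine_count, _ in rows)
    return fine / total

# Keep only the groups the survey actually reached.
reached = [row for row in groups if row[3]]

survey_estimate = share_fine(reached)  # what the survey would report
full_picture = share_fine(groups)      # what including everyone would show

print(f"Survey estimate: {survey_estimate:.1%}")
print(f"All teens:       {full_picture:.1%}")
```

With these invented numbers, the survey would report 80.0% while the figure for all teens would be 72.2%. The gap exists only because some groups were never asked.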
When the Data Doesn’t Reflect the Real World
Sample Bias
A sample is the group of people you study in order to understand a bigger population. But if your sample is too narrow, your results will be biased — and unequal experiences will be erased.
For example, if a study on teen friendships only includes students from elite private schools, the results might show low rates of bullying and strong social networks.
But those patterns could be very different for public school students, homeschooled students, or students in alternative schools.
Biased data leads to biased conclusions.
Biased conclusions lead to biased decisions — about policy, mental health, technology, and more.
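One simple way to check for sample bias is to compare who is in the sample with who is in the population. The sketch below does that in Python under stated assumptions: the population shares are invented for illustration, the helper names are hypothetical, and the "less than half the expected share" cutoff is an arbitrary choice, not a standard rule.

```python
# Population shares are made up for illustration; they are not real statistics.
population_share = {
    "public": 0.80,
    "private": 0.10,
    "homeschool": 0.06,
    "alternative": 0.04,
}

# A study that only sampled private school students, as in the example above.
sample_counts = {"public": 0, "private": 500, "homeschool": 0, "alternative": 0}

def representation_gaps(sample_counts, population_share):
    """Return sample share minus population share for each group."""
    total = sum(sample_counts.values())
    if total == 0:
        raise ValueError("The sample is empty.")
    return {
        group: sample_counts.get(group, 0) / total - pop
        for group, pop in population_share.items()
    }

def underrepresented(sample_counts, population_share, ratio=0.5):
    """List groups whose sample share is below `ratio` times their population share."""
    total = sum(sample_counts.values())
    if total == 0:
        raise ValueError("The sample is empty.")
    return [
        group
        for group, pop in population_share.items()
        if sample_counts.get(group, 0) / total < ratio * pop
    ]

gaps = representation_gaps(sample_counts, population_share)
flagged = underrepresented(sample_counts, population_share)

for group, gap in gaps.items():
    print(f"{group:12s} gap: {gap:+.0%}")
print("Underrepresented:", flagged)
```

A check like this only catches the groups you thought to list. Groups that were never named in the population table stay invisible, which is why deciding who counts is part of the analysis.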
More Than Just “Counting People”
Representational Equity
Sometimes people are technically “included” in the data — but their experiences are minimized, miscategorized, or erased.
Consider this:
A survey includes 1,000 teens, and 10 of them are Native American. That’s technically “representation” — but if:
The survey doesn’t allow them to select their tribal affiliation
None of their stories appear in the summary
Their answers are grouped under “Other”
… then their inclusion is performative, not meaningful.
Equity in data means making sure every voice is counted, heard, and visible in the analysis, not just squeezed into a chart.
Representation means people are counted.
Inclusion means their stories are respected.
Equity means we recognize that different groups may need different things in order to be seen fairly.
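The survey example above can be sketched in Python to show how grouping answers under “Other” makes people disappear from a summary. The 1,000 teens and 10 Native American respondents come from the example; the other group labels, their sizes, and the 5% cutoff are assumptions made up for illustration, and lump_small_groups is a hypothetical helper name.

```python
from collections import Counter

# 1,000 responses: 10 Native American respondents (from the example above).
# "Group A" and "Group B" and their sizes are placeholders invented for illustration.
responses = ["Group A"] * 600 + ["Group B"] * 390 + ["Native American"] * 10

def lump_small_groups(counts, min_share=0.05):
    """Fold every group below `min_share` of responses into "Other"."""
    total = sum(counts.values())
    lumped = Counter()
    for group, n in counts.items():
        key = group if n / total >= min_share else "Other"
        lumped[key] += n
    return lumped

counts = Counter(responses)         # every group reported by name
lumped = lump_small_groups(counts)  # small groups hidden under "Other"

print("Reported by name:", dict(counts))
print("After lumping:   ", dict(lumped))
```

In the lumped summary, the 10 Native American teens are still counted, but a reader can no longer tell they were there. Reporting small groups by name, with a note about the small sample size, keeps them visible.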