Selection Bias Is A Fact Of Life, Not An Excuse For Rejecting Internet Surveys
Sometimes people do amateur research through online surveys. Then they find interesting things. Then their commenters say it doesn’t count, because “selection bias!” This has been happening to Aella for years, but people try it sometimes on me too.
I think these people are operating off some model where amateur surveys necessarily have selection bias, because they only capture the survey-maker’s Twitter followers, or blog readers, or some other weird highly-selected snapshot of the Internet-using public. But real studies by professional scientists don’t have selection bias, because . . . sorry, I don’t know how their model would end this sentence.
The real studies by professional scientists usually use Psych 101 students at the professional scientists’ university. Or sometimes they will put up a flyer on a bulletin board in town, saying “Earn $10 By Participating In A Study!”, in which case their population will be selected for people who want $10 (poor people, bored people, etc.). Sometimes the scientists will get really into cross-cultural research, and retest their hypothesis on various primitive tribes - in which case their population will be selected for the primitive tribes that don’t murder scientists who try to study them. As far as I know, nobody in history has ever done a psychology study on a truly representative sample of the world population.
This is fine. Why?
Selection bias is disastrous if you’re trying to do something like a poll or census. That is, if you want to know “What percent of Americans own smartphones?” then any selection at all biases your result. The percent of Psych 101 undergrads who own smartphones is different from the percent of poor people who want $10 who own smartphones, and both are different from the percent of Americans who own smartphones. The same is potentially true about “how many people oppose abortion?” or “what percent of people are color blind?” or anything else trying to find out how common something is in the population. The only good ways to do this are a) use a giant government dataset that literally includes everyone, b) hire a polling company like Gallup which has tried really hard to get a panel that includes the exact right number of Hispanic people and elderly people and homeless people and every other demographic, c) do a lot of statistical adjustments and pray.
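If you want to see what option (c) looks like in miniature, here’s a toy sketch of post-stratification weighting - every number in it is invented - where you estimate the rate within each demographic group and then reweight those groups by their census shares instead of their sample shares:

```python
# Toy sketch of option (c): post-stratification weighting.
# All numbers here are made up for illustration.

# Share of each age group in the overall population (e.g. from a census)
population_shares = {"18-29": 0.20, "30-59": 0.50, "60+": 0.30}

# What an unrepresentative sample might look like: too many young people
sample = {
    "18-29": {"n": 600, "owns_smartphone": 570},
    "30-59": {"n": 300, "owns_smartphone": 255},
    "60+":   {"n": 100, "owns_smartphone": 55},
}

# Naive estimate: the raw sample proportion, dominated by the young oversample
total_n = sum(g["n"] for g in sample.values())
naive = sum(g["owns_smartphone"] for g in sample.values()) / total_n

# Weighted estimate: compute the rate within each group, then weight each
# group by its share of the *population*, not its share of the sample
weighted = sum(
    population_shares[group] * (g["owns_smartphone"] / g["n"])
    for group, g in sample.items()
)

print(f"naive estimate:    {naive:.1%}")    # 88.0%
print(f"weighted estimate: {weighted:.1%}") # 78.0%, closer to a census answer
```

The praying part is that this only corrects for selection along the dimensions you weight on; if your sample is weird in some way the census doesn’t measure, no amount of reweighting will save you.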
Selection bias is fine-ish if you’re trying to do something like test a correlation. Does eating bananas make people smarter because something something potassium? Get a bunch of Psych 101 undergrads, test their IQs, and ask them how many bananas they eat per day (obviously there are many other problems with this study, like establishing causation - let’s ignore those for now). If you find that people who eat more bananas have higher IQ, then fine, that’s a finding. If you’re right about the mechanism (something something potassium), then probably it should generalize to groups other than Psych 101 undergrads. It might not! But it’s okay to publish a paper saying “Study Finds Eating Bananas Raises IQ” with a little asterisk at the bottom saying “like every study ever done, we only tested this in a specific population rather than everyone in the world, and for all we know maybe it isn’t true in other populations, whatever.” If there’s some reason why Psych 101 undergrads are a particularly bad population to test this in, and any other population is better, then you should use a different population. Otherwise, choose your poison.
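For concreteness, “that’s a finding” amounts to something like the sketch below - the banana and IQ numbers are made up, and a real study would want far more than ten undergrads:

```python
# Minimal sketch of "test a correlation" in a convenience sample.
# The data below are fabricated; a real study would also worry about
# confounders, causation, measurement error, etc.
import numpy as np

# One row per Psych 101 undergrad: (bananas per day, measured IQ)
bananas_per_day = np.array([0, 0, 1, 1, 2, 2, 3, 3, 4, 5])
iq              = np.array([95, 102, 98, 105, 110, 101, 108, 115, 112, 118])

r = np.corrcoef(bananas_per_day, iq)[0, 1]
print(f"Pearson r = {r:.2f}")

# The finding is "bananas and IQ correlate at r = ... in this sample."
# Whether it generalizes beyond Psych 101 undergrads is a separate,
# mechanism-dependent question - the asterisk at the bottom of the paper.
```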
Sometimes a correlation will genuinely fail to generalize out of sample. Suppose you find that, in a population of Psych 101 undergrads at a good college, family income is unrelated to obesity. This makes sense; they’re all probably pretty well-off, and they all probably eat at the same college cafeteria. But generalize to the entire US population, and poor people will be more obese, because they can’t afford healthy food / don’t have time to exercise / possible genetic correlations. And then generalize further to the entire world population, and poor people will be thinner, because some of them can’t afford food and are literally starving. And then generalize further to the entire world population over all of human history, and it stops holding again, because most people are cavemen who eat grubs and use shells for money, and having more shells doesn’t make it any easier to find grubs.
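If you’d rather see that as a toy simulation than a just-so story (the effect sizes are invented; the point is only that the relationship can flip or vanish across populations), it looks something like this:

```python
# Toy illustration of a correlation failing to generalize out of sample.
import numpy as np

rng = np.random.default_rng(0)

def corr(x, y):
    return np.corrcoef(x, y)[0, 1]

# Undergrads at a good college: income varies, but everyone eats the
# same cafeteria food, so obesity is unrelated to income.
income = rng.normal(100, 10, 1000)
obesity = rng.normal(25, 3, 1000)
print("undergrads:       r =", round(corr(income, obesity), 2))  # ~0

# US population: cheap food is calorie-dense, so lower income
# predicts higher BMI.
income = rng.normal(60, 30, 1000)
obesity = 35 - 0.1 * income + rng.normal(0, 3, 1000)
print("US population:    r =", round(corr(income, obesity), 2))  # negative

# World population: some people cannot afford enough food at all, so
# lower income predicts lower BMI.
income = rng.normal(20, 15, 1000)
obesity = 18 + 0.2 * income + rng.normal(0, 3, 1000)
print("world population: r =", round(corr(income, obesity), 2))  # positive
```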
More often, we’re a little nervous about this but we cross our fingers and hope it works. Antidepressants have never been tested in the population of people named Melinda Hauptmann-Brown. If you’re a depressed person named Melinda Hauptmann-Brown, you will have to trust that the same antidepressants that work on people who aren’t named Melinda Hauptmann-Brown also work on you. Luckily the mechanism of antidepressants (something something serotonin, or maybe not) seems like the kind of thing that should work regardless of what your name is, so this is a good bet. But it’s still a bet.
Selection bias is fatal for polls, but only sometimes a problem for correlations. In real life, worrying about selection bias for correlations looks like thinking really hard about the mechanism, formulating hypotheses about how you expect something to generalize to particular out-of-sample populations, sometimes trying to test those hypotheses, but accepting that you can never test all of them and will have to take a lot of things on priors.
It doesn’t look like saying “This is an Internet survey, so it has selection bias, unlike real-life studies, which are fine.” Come on!