Pareidolia: When Correlations are Truly Meaningless

Timothy Taylor 03/06/2019

"Pareidolia" refers to the common human practice of looking at random outcomes but trying to impose patterns on them. For example, we all know in the logical part of our brain that there are roughly a kajillion different variables in the world, and so if we look through the possibilities, we will have a 100% chance of finding some variables that are highly correlated with each other. These correlations will be a matter of pure chance, and they carry no meaning. But when my own brain, and perhaps yours, sees one of these correlations, I can feel my thoughts start searching for a story to explain what looks to my eyes like a connected pattern.

Here are some examples from Tyler Vigen's website, drawn from his 2015 book Spurious Correlations.

Eyeballing these kinds of figures gives you a sense of why these correlations arise. For example, if you have both a right-hand and a left-hand axis, you can set the scales on those axes so that the starting points and the ending points of the two lines are close to each other, and then the points in between will look fairly close as well. If the comparison involves a certain statistic in a certain state (divorces in Maine, fishing accidents in Kentucky), your statistical antennae should be warning you that by the time you look through a large group of family or health statistics for each of 50 states, there's a reasonable chance of finding whatever pattern you are looking for just by random chance. If you limit the search to relatively short stretches of data like a decade or so, and set your computer to sort through the possibilities, finding meaningless correlations isn't going to be hard.
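To make that search concrete, here is a minimal Python sketch. It is not taken from Vigen's site. It assumes the "statistics" are 200 independent random walks of ten yearly values each, and then lets the computer pick out the most strongly correlated pair. The function names, the number of series, and the random-walk assumption are all illustrative choices.

```python
import random
from itertools import combinations


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / (var_x * var_y) ** 0.5


def random_walk(length, rng):
    """A made-up 'statistic': each year equals last year plus Gaussian noise."""
    level, series = 0.0, []
    for _ in range(length):
        level += rng.gauss(0, 1)
        series.append(level)
    return series


def best_spurious_pair(n_series=200, length=10, seed=0):
    """Search every pair of unrelated series for the strongest correlation."""
    rng = random.Random(seed)
    data = [random_walk(length, rng) for _ in range(n_series)]
    i, j = max(
        combinations(range(n_series), 2),
        key=lambda pair: abs(pearson(data[pair[0]], data[pair[1]])),
    )
    return (i, j), pearson(data[i], data[j])


if __name__ == "__main__":
    pair, r = best_spurious_pair()
    print(f"Most correlated unrelated pair: {pair}, r = {r:.3f}")
```

Every series here is generated independently, so any correlation the search turns up is meaningless by construction. Yet with 19,900 pairs to choose from, the best match typically looks strikingly tight.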

Of course, at the more serious level of academic research, these types of issues can still arise. Imagine that a researcher is trying to look at the effects of a particular large-scale program. The researcher has lots of data to divide people up into groups: by age, work status, family status, geographic location, education, health, race/ethnicity, gender, religion, and more. The researcher also has lots of possible outcomes for these people: income, marriage or divorce, childbearing, health, employment, retirement, and others. If a researcher looks at all the possible subcategories, it will inevitably be true that this program will seem to have major effects in a certain group: for example, the program may be correlated with a big change in the divorce behavior of white people in the 35-54 age bracket with low levels of religious observance in the state of New York. But if you (or your computer program) scanned through literally thousands of subgroups and possible effects to find this specific correlation, it's fair to assume that the correlation is just as meaningless as any of the examples presented by Vigen.
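A back-of-the-envelope calculation shows how quickly this happens. The sketch below assumes, purely for illustration, that each subgroup-and-outcome comparison is an independent test judged at the conventional 5% significance level, and that the program truly has no effect anywhere. The function names are hypothetical.

```python
import random


def chance_of_false_positive(n_tests, alpha=0.05):
    """Probability that at least one of n independent tests of a true
    null hypothesis comes out 'significant' at level alpha."""
    return 1 - (1 - alpha) ** n_tests


def simulate_null_scan(n_tests, alpha=0.05, seed=0):
    """Count 'significant' results when no real effect exists.

    Assumption: under a true null hypothesis the p-value of a continuous
    test is uniform on [0, 1], so each test is modeled as one uniform draw.
    """
    rng = random.Random(seed)
    return sum(rng.random() < alpha for _ in range(n_tests))


if __name__ == "__main__":
    for n in (1, 20, 100, 5000):
        print(n, round(chance_of_false_positive(n), 4), simulate_null_scan(n))
```

Under these assumptions, 100 comparisons already push the chance of at least one spurious "finding" above 99%, and a scan of 5,000 comparisons should yield roughly 250 of them on average. Real subgroup tests overlap and are not independent, so the exact figures will differ, but the direction of the problem is the same.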

Classes in statistics emphasize that "correlation doesn't mean causation." The lesson here is even stronger. Correlation doesn't necessarily mean anything at all.

A version of this article first appeared on Conversable Economist.



Timothy Taylor

Global Economy Expert

Timothy Taylor is an American economist. He is managing editor of the Journal of Economic Perspectives, a quarterly academic journal produced at Macalester College and published by the American Economic Association. Taylor received his Bachelor of Arts degree from Haverford College and a master's degree in economics from Stanford University. At Stanford, he was winner of the award for excellent teaching in a large class (more than 30 students) given by the Associated Students of Stanford University. At Minnesota, he was named a Distinguished Lecturer by the Department of Economics and voted Teacher of the Year by the master's degree students at the Hubert H. Humphrey Institute of Public Affairs. Taylor has been a guest speaker for groups of teachers of high school economics, visiting diplomats from eastern Europe, talk-radio shows, and community groups. From 1989 to 1997, Professor Taylor wrote an economics opinion column for the San Jose Mercury-News. He has published multiple lectures on economics through The Teaching Company. With Rudolph Penner and Isabel Sawhill, he is co-author of Updating America's Social Contract (2000), whose first chapter provided an early radical centrist perspective, "An Agenda for the Radical Middle". Taylor is also the author of The Instant Economist: Everything You Need to Know About How the Economy Works, published by the Penguin Group in 2012. The fourth edition of Taylor's Principles of Economics textbook was published by Textbook Media in 2017.
