How to Be a Statistical Detective

Mark Twain popularized the saying, “There are three kinds of lies: lies, damned lies and statistics,” which he attributed to Benjamin Disraeli. People tend to view statistics with a bit of skepticism, recognizing that figures can be compiled and presented in any number of ways to support a particular argument.

According to Stanford University Associate Professor Kristin Sainani, that reputation is somewhat merited, given the abundance of misleading statistical information out in the world today. Professor Sainani recently led a webinar to discuss some of the tools of the trade that she uses to dig into statistics and find the truth, regardless of what the numbers appear to say.

Watch the webinar

Look out for basic errors

Simple mistakes can creep into seemingly ironclad statistical analyses. Those basic errors can have a significant impact on the findings and conclusions of that research. Professor Sainani cited the example of a medical study analyzing publicly available data that mixed up the number of cancerous tumors resistant to a particular drug with the number of tumors that were receptive to it.

You might not think it would be necessary to check for such simple errors in a published statistical study, but as this example shows, you can’t take everything at face value.

Check for missing data and inconsistencies

People tend to focus solely on the conclusions of studies and statistical analyses, and not so much on how those findings came about. As Professor Sainani demonstrated, however, the results of statistical reports can be compromised due to missing or inconsistent data.

Reviewing a study on post-exercise eating habits, she found missing data points, which raised the possibility that the researchers had discarded results that did not support their hypothesis. She noted that such inconsistencies may well not have been deliberate, but their existence nonetheless compromised the integrity of the study.
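A check of this kind can be sketched in a few lines of Python. This is only an illustration, not the method Professor Sainani used: the records, the field names (`id`, `calories_eaten`) and the reported sample size are all made up, and the helper functions are hypothetical.

```python
def find_missing(records, required_fields):
    """Map each required field to the ids of records that lack a value for it."""
    missing = {field: [] for field in required_fields}
    for record in records:
        for field in required_fields:
            if record.get(field) is None:
                missing[field].append(record["id"])
    return missing


def sample_size_matches(reported_n, records, field):
    """Compare the sample size a paper reports with the values actually present."""
    observed_n = sum(1 for record in records if record.get(field) is not None)
    return observed_n == reported_n, observed_n


# Made-up records for illustration only; not data from any real study.
records = [
    {"id": 1, "calories_eaten": 520},
    {"id": 2, "calories_eaten": None},  # value missing
    {"id": 3, "calories_eaten": 610},
    {"id": 4},                          # field absent altogether
]

missing = find_missing(records, ["calories_eaten"])
matches, observed_n = sample_size_matches(4, records, "calories_eaten")

print("Records missing calories_eaten:", missing["calories_eaten"])
print("Reported n matches observed n:", matches, "(observed:", observed_n, ")")
```

A mismatch between the reported and observed sample size does not prove anything was discarded on purpose, but it is a prompt to ask where the missing observations went.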

Spot statistical outliers that could skew results

In some studies, all it takes is one or two outliers to change the researchers’ findings. Professor Sainani highlighted a report suggesting that there was a correlation between infant stress levels and breast milk intake. After reviewing the associated charts and plotted data points, she suspected that one outlier had enough weight to potentially skew the results and support a conclusion that didn’t hold up to closer examination.

Using a few different software tools and statistical models, Sainani was able to conduct a more accurate analysis of the available data, both with and without the influential data point. She found that once that one data point was removed, there was very little correlation between the two factors.
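The idea of running an analysis with and without a suspect point can be shown with a short Python sketch. It assumes a plain Pearson correlation and a simple leave-one-out comparison, which may differ from the tools and models used in the webinar, and the numbers are invented for illustration rather than taken from the study.

```python
from math import sqrt


def pearson_r(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need two sequences of equal length with at least 2 points")
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    if sxx == 0 or syy == 0:
        raise ValueError("correlation is undefined when one variable is constant")
    return sxy / sqrt(sxx * syy)


def most_influential_point(xs, ys):
    """Return (index, r_without) for the point whose removal changes r the most."""
    r_full = pearson_r(xs, ys)
    candidates = []
    for i in range(len(xs)):
        r_without = pearson_r(xs[:i] + xs[i + 1:], ys[:i] + ys[i + 1:])
        candidates.append((i, r_without))
    return max(candidates, key=lambda item: abs(item[1] - r_full))


# Invented values: six unremarkable points plus one extreme point at the end.
x = [1, 2, 3, 4, 5, 6, 20]
y = [5, 3, 6, 4, 5, 3, 15]

r_all = pearson_r(x, y)
index, r_without = most_influential_point(x, y)

print(f"r with all points: {r_all:.2f}")
print(f"r without point {index}: {r_without:.2f}")
```

In this made-up data set, the strong correlation rests almost entirely on the single extreme point. When a result behaves this way, it is worth reporting both versions of the analysis.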

Follow statistical best practices

What’s alarming about the examples that Professor Sainani shared is that in each case, the error was entirely preventable. She explained that if the researchers had simply followed best practices when compiling and presenting data, their findings would not have been compromised. No matter which statistical analysis program is used, it’s important to always check that the data is accurate and complete. Statisticians need to keep the fundamentals in mind so they can ensure the integrity of their data.

Watch the full webinar here to hear more about how to detect statistical errors from Professor Sainani. If you would like to delve deeper into the topic, Stanford’s Medical Statistics Certificate Program covers the most widely used statistical concepts and techniques in medical research.
