P-hacking

Katrina Borthwick - 20th February 2023

P-hacking is a data analysis technique that can be used to present patterns as statistically significant when there is really no underlying effect. It is a misuse of statistics and a misrepresentation, plain and simple, and disappointingly it’s usually perpetrated by scientists.

Pair this with a bias towards reporting on positive results, no matter how tenuous, add a bit of journalistic flair in the headline, and things can get quite out of control. This is the case even if your study is a deliberate and publicly declared hoax. More on that later…

What is p-hacking?

In a nutshell, p-hackers perform a large number of statistical analyses on their data in different ways, with the intention of finding a desirable p-value (usually lower than 0.05). A p-value below 0.05 means that, if there were really no effect, a result at least this extreme would turn up by chance less than 5% of the time. But obviously if you have tried a very large number of permutations, or divided the data up in numerous ways, then there is a high chance you are going to find some weirdness in some part of the data somewhere. When you find that thing, you can go back and change your method and hypothesis to fit the data, and publish your “significant” result, completely omitting any reference to all the other permutations that didn’t turn up anything.
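To see how easy it is to “find” something in pure noise, here is a minimal Python sketch (the subgroup count, sample sizes and random seed are arbitrary choices for illustration). It runs a t-test on each of twenty subgroups where there is genuinely no effect; with twenty tests at the 5% level there is roughly a 1 − 0.95^20 ≈ 64% chance that at least one comes out “significant”.

```python
# Simulate the "slice the data into subgroups until something sticks" version
# of p-hacking. Both groups in every subgroup are drawn from the SAME
# distribution, so any "significant" result is pure noise.
import numpy as np
from scipy import stats

rng = np.random.default_rng(0)
n_subgroups = 20   # e.g. twenty jelly bean colours, as in the XKCD comic below

significant = []
for subgroup in range(n_subgroups):
    treatment = rng.normal(loc=0.0, scale=1.0, size=30)  # no real effect:
    control = rng.normal(loc=0.0, scale=1.0, size=30)    # identical distributions
    _, p_value = stats.ttest_ind(treatment, control)
    if p_value < 0.05:
        significant.append((subgroup, p_value))

print(f"{len(significant)} of {n_subgroups} noise-only subgroups reached p < 0.05:")
for subgroup, p_value in significant:
    print(f"  subgroup {subgroup}: p = {p_value:.3f}")
```

Report only the subgroups that make it into that final list, stay quiet about the other nineteen tests, and you have a publishable “finding” built entirely from noise.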

The XKCD cartoon “Significant” (the one with the green jelly beans) absolutely nails it: https://xkcd.com/882/

If you want to know more about the technicalities, there is a great video here:

Some real-life examples of p-hacking are:

How bad is it?

I’m not going to cherry-pick here. There was a 2015 study on “The Extent and Consequences of P-Hacking in Science” (link below). The authors put together a methodology for detecting whether p-hacking had likely occurred, and concluded, based on their data, that p-hacking is probably common, but that its effect seems weak relative to the real effect sizes being measured, and it probably does not drastically alter the scientific consensus drawn from meta-analyses.
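For a flavour of how that kind of detection can work, here is a rough sketch of the underlying idea (not necessarily the 2015 paper’s exact method, and the p-values and bin boundaries are made up): when studies measure genuine effects, the significant p-values tend to pile up near zero, whereas widespread p-hacking tends to produce a cluster just under the 0.05 threshold.

```python
# Compare how many reported p-values sit well below 0.05 versus just under it.
# A genuine effect should push most of them towards zero; a suspicious bump
# just under the threshold is consistent with p-hacking. The p-values and bin
# boundaries here are illustrative assumptions only.
from scipy.stats import binomtest

reported_p = [0.001, 0.004, 0.012, 0.021, 0.031, 0.041, 0.044, 0.046, 0.048, 0.049]

well_below = sum(p <= 0.025 for p in reported_p)        # comfortably significant
just_under = sum(0.025 < p < 0.05 for p in reported_p)  # scraping under the line

# With genuine effects we would not expect the "just under" bin to dominate;
# test the 50/50 split with a one-sided binomial test (tiny sample, so this is
# purely illustrative).
result = binomtest(just_under, well_below + just_under, p=0.5, alternative="greater")
print(f"just under 0.05: {just_under}, well below: {well_below}, p = {result.pvalue:.3f}")
```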

However, as we have seen, the individual studies themselves can generate a bit of interest, and lead to the public basing their decisions on nonsense. Also, why would we use dodgy data if we can avoid it?

What can researchers do?

The general acceptance of these practices needs to stop. It is a form of scientific misconduct.

Researchers and journals could report all their dependent measures, including the stuff that fails to show a significant result.
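One benefit of full reporting is that anyone can then adjust for how many tests were actually run. Below is a minimal sketch using the multipletests function from statsmodels, with made-up p-values standing in for a complete set of reported measures.

```python
# Adjust a full set of reported p-values for multiple comparisons using the
# Benjamini-Hochberg false discovery rate procedure. The numbers are invented
# for illustration: three raw p-values are below 0.05, but only one survives.
from statsmodels.stats.multitest import multipletests

all_p_values = [0.003, 0.040, 0.046, 0.21, 0.38, 0.55, 0.72, 0.90]

reject, p_adjusted, _, _ = multipletests(all_p_values, alpha=0.05, method="fdr_bh")
for p_raw, p_adj, is_significant in zip(all_p_values, p_adjusted, reject):
    flag = "significant" if is_significant else "not significant"
    print(f"raw p = {p_raw:.3f} -> adjusted p = {p_adj:.3f} ({flag})")
```

With these particular numbers, the two results that only just scraped under 0.05 no longer look significant once the correction is applied.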

The practice of deciding to ‘look for more data’ after getting a non-significant result, or of tweaking the methodology or hypothesis to fit the results, needs to be disclosed or stopped unless the research is clearly stated to be ‘exploratory’ in nature rather than testing a particular hypothesis. To this end, it would be best to require entire methods, including hypotheses, to be pre-specified in advance of the study, and for any subsequent adjustments to be disclosed. These should be available to view alongside the research online. Some journals are already moving in this direction, and won’t accept papers where the methods have not been registered prior to the study’s commencement.

It is also important that we’re not incentivising the wrong behaviours. When reviewing or assessing research, that means placing greater value on the quality of the methods and data collection than on whether there is a positive finding. This is related to publication bias: negative results often go unpublished, and researchers who want to get published are therefore incentivised to p-hack.

Some other practical things researchers can do are measuring only response variables that are known (or predicted) to be important (not green jelly beans), using sufficient sample sizes, and performing data analysis blind wherever possible.
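On the sample-size point, a power analysis before collecting any data tells you roughly how many participants you need. Here is a minimal sketch using statsmodels; the effect size, significance level and desired power are assumptions you would justify from prior work, not universal defaults.

```python
# How many participants per group are needed to detect a medium effect
# (Cohen's d = 0.5) in a two-sample t-test with 80% power at alpha = 0.05?
from statsmodels.stats.power import TTestIndPower

n_per_group = TTestIndPower().solve_power(effect_size=0.5, alpha=0.05, power=0.8)
print(f"Required sample size per group: {n_per_group:.1f}")  # approximately 64
```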

Finally, it would be good to see open access to the raw data. This makes researchers more accountable for marginal results, and allows reanalysis to check robustness. It also increases the likelihood of getting caught if you are being creative with the data, which is a disincentive in itself. Nobody wants to be the next Brian Wansink.