Recent studies reveal the prevalence of poor-quality data, a problem exacerbated by the growing use of machine learning, which lets users dredge far bigger datasets and turn up spurious correlations.
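
The dredging effect is easy to reproduce on pure noise. The sketch below is an illustration only, not drawn from any of the studies mentioned here: the function names `pearson` and `dredge`, the sample size of 50 and the cutoff `CRITICAL_R` are all assumptions made for the example. It tests a growing number of random "features" against a random target and counts how many look significant by chance.

```python
import math
import random

# Approximate two-sided 5% critical value of Pearson's r for 50 samples
# (an assumption chosen for this illustration).
CRITICAL_R = 0.279


def pearson(xs, ys):
    """Plain Pearson correlation coefficient of two equal-length lists."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / math.sqrt(var_x * var_y)


def dredge(n_samples, n_features, seed=0):
    """Correlate n_features columns of noise with a noise target.

    Returns (number of features clearing CRITICAL_R, strongest |r| seen).
    Every feature is random, so any "hit" is spurious by construction.
    """
    rng = random.Random(seed)
    target = [rng.gauss(0, 1) for _ in range(n_samples)]
    hits = 0
    best = 0.0
    for _ in range(n_features):
        feature = [rng.gauss(0, 1) for _ in range(n_samples)]
        r = abs(pearson(feature, target))
        if r > CRITICAL_R:
            hits += 1
        best = max(best, r)
    return hits, best


if __name__ == "__main__":
    for k in (10, 100, 1000):
        hits, best = dredge(50, k)
        print(f"{k:>5} noise features: {hits} pass the threshold, "
              f"strongest |r| = {best:.2f}")
```

With a 5 percent cutoff, roughly one noise feature in 20 should be expected to pass by chance alone, so the count of false discoveries tends to grow with the number of features searched, and the strongest correlation found tends to look more impressive the wider the search.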

A review of 100 major psychology studies, for instance, found that only 36 percent of the attempts to replicate them produced statistically significant results. Over half the candidate giant planets identified by Nasa’s Kepler telescope turned out to be false positives, most of them binary stars. And in preclinical cancer research, a mere six out of 53 breakthrough studies were found to be reproducible. Quantitative finance does not fare much better.