eScholarship
Open Access Publications from the University of California

UC San Francisco Previously Published Works

Is N-Hacking Ever OK? The consequences of collecting more data in pursuit of statistical significance

Abstract

Upon completion of an experiment, if a trend is observed that is "not quite significant," it can be tempting to collect more data in an effort to achieve statistical significance. Such sample augmentation or "N-hacking" is condemned because it can lead to an excess of false positives, which can reduce the reproducibility of results. However, the scenarios used to prove this rule tend to be unrealistic: they assume either that unlimited extra samples are added until statistical significance is reached, or that samples are added when results are not even close to significant. Both are unlikely situations for most experiments involving patient samples, cultured cells, or live animals. If we were to examine some more realistic scenarios, could there be any situations where N-hacking might be an acceptable practice? This Essay aims to address this question, using simulations to demonstrate how N-hacking causes false positives and to investigate whether this increase is still relevant when using parameters based on real-life experimental settings.
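
The mechanism described in the abstract can be illustrated with a small simulation. The following is a minimal sketch, not the Essay's own simulation code: the function names, the sample sizes, the "promising" p-value window, and the cap on total samples are all illustrative assumptions, and a two-sample z-test with known variance stands in for whatever test a real study would use, so that the example needs only the Python standard library. Both groups are drawn from the same distribution, so every "significant" result is a false positive.

```python
import random
from statistics import NormalDist, fmean


def two_sample_p(a, b, sigma=1.0):
    """Two-sided p-value of a z-test for a difference in means (known sigma)."""
    se = sigma * (1 / len(a) + 1 / len(b)) ** 0.5
    z = (fmean(a) - fmean(b)) / se
    return 2 * (1 - NormalDist().cdf(abs(z)))


def simulate(n_init=12, n_add=6, max_n=24, promising=0.10,
             alpha=0.05, n_experiments=20000, seed=0):
    """Return (fixed-N false positive rate, N-hacked false positive rate).

    All parameter values are hypothetical defaults chosen for illustration.
    The null hypothesis is true in every simulated experiment.
    """
    rng = random.Random(seed)
    fixed_hits = 0
    hacked_hits = 0
    for _ in range(n_experiments):
        a = [rng.gauss(0, 1) for _ in range(n_init)]
        b = [rng.gauss(0, 1) for _ in range(n_init)]
        p = two_sample_p(a, b)
        fixed_hits += p < alpha  # verdict if the experimenter stops here

        # N-hacking rule (assumed): add samples only when the result is
        # "not quite significant", and never exceed max_n per group.
        while alpha <= p < promising and len(a) + n_add <= max_n:
            a += [rng.gauss(0, 1) for _ in range(n_add)]
            b += [rng.gauss(0, 1) for _ in range(n_add)]
            p = two_sample_p(a, b)
        hacked_hits += p < alpha  # verdict after optional augmentation

    return fixed_hits / n_experiments, hacked_hits / n_experiments


if __name__ == "__main__":
    fixed_rate, hacked_rate = simulate()
    print(f"fixed-N false positive rate:  {fixed_rate:.4f}")
    print(f"N-hacked false positive rate: {hacked_rate:.4f}")
```

Because samples are added only after a non-significant first look, the N-hacked rate in this sketch can never fall below the fixed-N rate. How far above it lands depends on the assumed window and cap, which is the question the Essay examines with realistic parameters.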

