Reproducible builds

Revision as of 08:15, 6 October 2014 by Schindelin (Initial stab)

What are reproducible builds?

A software version (or build) is called reproducible if it is easy to re-generate the exact same software application from the source code.

For example, "ImageJ 1.49g" and "Sholl Analysis 3.4.3" each refer to a reproducible build, while "ImageJ" alone does not, because it leaves open which version was used.

It gets more subtle when making heavy use of software libraries (sometimes called dependencies). It is known, for example, that many plugins in the now-defunct MacBiophotonics distribution of ImageJ worked fine with ImageJ 1.42l, but stopped working somewhere between that version and ImageJ 1.44e. That is, a reference to, say, "the Colocalisation Analysis plugin" does not identify a reproducible build, because it is very hard to re-generate a working combination of Colocalisation Analysis and ImageJ 1.x that could be used to verify previously published results.
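
The point about dependencies can be illustrated with a minimal sketch. It models a software reference as a name, an optional exact version, and the components it depends on, and treats the reference as reproducible only if every component in that tree has an exact version. The Component class and the two helper functions are hypothetical names made up for this illustration; they are not part of ImageJ or Fiji.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Component:
    """A piece of software as cited in a publication (hypothetical model)."""
    name: str
    version: Optional[str] = None  # None means no exact version was stated
    dependencies: tuple = ()       # other Component objects this one needs


def unpinned(component):
    """Return the names of all components in the tree that lack an exact version."""
    missing = [] if component.version else [component.name]
    for dep in component.dependencies:
        missing.extend(unpinned(dep))
    return missing


def is_reproducible_reference(component):
    """A reference is reproducible only if every component in the tree is pinned."""
    return not unpinned(component)


# "ImageJ 1.49g" names one exact version.
imagej_pinned = Component("ImageJ", "1.49g")

# "the Colocalisation Analysis plugin" states neither its own version
# nor the ImageJ 1.x version it was run with.
coloc_vague = Component("Colocalisation Analysis", None, (Component("ImageJ"),))

# Stating only the ImageJ version still leaves the plugin itself unpinned.
coloc_partial = Component(
    "Colocalisation Analysis", None, (Component("ImageJ", "1.42l"),)
)

for ref in (imagej_pinned, coloc_vague, coloc_partial):
    print(ref.name, is_reproducible_reference(ref), unpinned(ref))
```

In this sketch, only the first reference counts as reproducible; the other two list the components whose versions would still need to be stated.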

Why are reproducible builds so essential for science?

Arguably the most important thing in science is to gain insights about nature that can be verified by other researchers. It is this mission for which Fiji/ImageJ stand, and it is the sole reason why Fiji and ImageJ are Open Source.

To verify results, it must be possible to reproduce the results claimed in scientific articles. In the interest of efficiency, reproducing them should be easy, and it should be just as easy to scrutinize the methods used: incorrect results can be artifacts of flawed algorithms, after all.

To that end, it should be obvious that researchers need to have the ability to inspect the exact source code corresponding to the software used to generate the results to be verified. In other words, reproducible builds are required for sound scientific research.