Difference between revisions of "Software Reproducibility"
From SCECpedia
(2 intermediate revisions by the same user not shown)
Line 1:
  Recommendations from a recent Nature article:
− *[https://www.nature.com/articles/d41586-020-02462-7 Nature]
+ *[https://www.nature.com/articles/d41586-020-02462-7 Nature Article on Software Reproducibility]
  == Reproducibility checklist ==
Line 30:
  == Related Entries ==
  *[[SCEC Software]]
+ *[https://www.software.ac.uk/ Software Sustainability Institute]
+ *[http://urssi.us/ US Research Software Institute Proposal]
+ *[https://joss.theoj.org/ Journal of Open Source Software]
Latest revision as of 21:24, 2 September 2020
Recommendations from a recent Nature article:
Reproducibility checklist
Although it’s impossible to guarantee computational reproducibility over time, these strategies can maximize your chances.
- Code: Workflows based on point-and-click interfaces, such as Excel, are not reproducible. Enshrine your computations and data manipulation in code.
- Document: Use comments, computational notebooks and README files to explain how your code works, and to define the expected parameters and the computational environment required.
- Record: Make a note of key parameters, such as the ‘seed’ values used to start a random-number generator. Such records allow you to reproduce runs, track down bugs and follow up on unexpected results.
- Test: Create a suite of test functions. Use positive and negative control data sets to ensure you get the expected results, and run those tests throughout development to squash bugs as they arise.
- Guide: Create a master script (for example, a ‘run.sh’ file) that downloads required data sets and variables, executes your workflow and provides an obvious entry point to the code.
- Archive: GitHub is a popular but impermanent online repository. Archiving services such as Zenodo, Figshare and Software Heritage promise long-term stability.
- Track: Use version-control tools such as Git to record your project’s history. Note which version you used to create each result.
- Package: Create ready-to-use computational environments using containerization tools (for example, Docker, Singularity), web services (Code Ocean, Gigantum, Binder) or virtual-environment managers (Conda).
- Automate: Use continuous-integration services (for example, Travis CI) to automatically test your code over time, and in various computational environments.
- Simplify: Avoid niche or hard-to-install third-party code libraries that can complicate reuse.
- Verify: Check your code’s portability by running it in a range of computing environments.
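As an illustration of the Record and Track items above, here is a minimal Python sketch that saves the seed, the Git commit and basic environment details alongside a result. It is not taken from the Nature article. The function names, the `run_record.json` file name and the toy `run_simulation` computation are all hypothetical placeholders, and the sketch assumes Python 3.7 or later with the `git` command available when run inside a repository.

```python
import json
import platform
import random
import subprocess


def current_git_commit():
    """Return the current Git commit hash, or None if it cannot be determined."""
    try:
        result = subprocess.run(
            ["git", "rev-parse", "HEAD"],
            capture_output=True, text=True, check=True,
        )
    except (OSError, subprocess.CalledProcessError):
        # git is not installed, or we are not inside a Git checkout.
        return None
    return result.stdout.strip()


def run_simulation(seed, n_samples):
    """Hypothetical stand-in for a real computation: mean of seeded random draws."""
    rng = random.Random(seed)  # a private generator, so the seed fully controls the run
    return sum(rng.random() for _ in range(n_samples)) / n_samples


def build_run_record(seed, n_samples, result):
    """Collect the parameters and environment details needed to repeat a run."""
    return {
        "seed": seed,
        "n_samples": n_samples,
        "result": result,
        "python_version": platform.python_version(),
        "platform": platform.platform(),
        "git_commit": current_git_commit(),
    }


if __name__ == "__main__":
    seed, n_samples = 12345, 1000
    result = run_simulation(seed, n_samples)
    record = build_run_record(seed, n_samples, result)
    # Store the record next to the output so each result can be traced to its inputs.
    with open("run_record.json", "w") as fh:
        json.dump(record, fh, indent=2)
    print(json.dumps(record, indent=2))
```

Re-running `run_simulation` with the recorded seed and sample count should return the same value on the same Python version, which is the kind of check the Test item suggests automating.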