UCVM Release Planning

From SCECpedia

Fork and Pull Development

Using the fork-and-pull method, start by forking sceccode/ucvm.git into a personal repo (pjmaechling/ucvm.git). For development on CARC, clone pjmaechling/ucvm.git. Set the git upstream to the original repo, sceccode/ucvm.git, to keep the fork in sync with it. Follow the instructions for setting upstream here:
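
The remote layout described above can be sketched as a small Python helper that only builds the git command lines (it does not run them). The GitHub URL form and the remote name `upstream` are assumptions, not taken from this page; the repository names are the ones mentioned above.

```python
def fork_setup_commands(fork_owner, upstream_owner, repo="ucvm"):
    """Return the git commands for cloning a fork and tracking the original repo."""
    # Assumed URL form: https://github.com/<owner>/<repo>.git
    fork_url = f"https://github.com/{fork_owner}/{repo}.git"
    upstream_url = f"https://github.com/{upstream_owner}/{repo}.git"
    return [
        # Clone the personal fork; git names the working directory after the repo.
        ["git", "clone", fork_url],
        # Register the original repo as a second remote called "upstream".
        ["git", "-C", repo, "remote", "add", "upstream", upstream_url],
        # List the remotes to confirm both origin and upstream are present.
        ["git", "-C", repo, "remote", "-v"],
    ]


for cmd in fork_setup_commands("pjmaechling", "sceccode"):
    print(" ".join(cmd))
```

Returning the commands as argument lists keeps the sketch testable without network access; each list could be passed to `subprocess.run` unchanged.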

Development on CARC


Compiling

  • the source directory /project/scec_608/maechlin/dev/ucvm
  • the installation directory /project/scec_608/maechlin/ucvm_bin

Stage Large files on public S3

  • The get_large_files.py script in the release points to hypocenter.usc.edu/research/ucvmc/V19_4
  • Copy this directory from hypocenter
  • Create a v21.10 directory
  • Move the large files into it
  • Move v21.10 to the ResearchComputing AWS account under maechlin@usc.edu
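
As a rough illustration of the staging change, the sketch below builds download URLs from a base location, a version directory, and a file name. The `https` scheme, the S3 base URL, and the file name `model.tar.gz` are hypothetical placeholders; this is not the actual logic of get_large_files.py.

```python
def large_file_url(base_url, version_dir, filename):
    """Join a base URL, a version directory, and a file name into one URL."""
    return "/".join([base_url.rstrip("/"), version_dir.strip("/"), filename])


# Current location named on this page (scheme assumed).
OLD_BASE = "https://hypocenter.usc.edu/research/ucvmc"
# Hypothetical S3 location; the real bucket name is not given on this page.
NEW_BASE = "https://example-bucket.s3.amazonaws.com/ucvm"

print(large_file_url(OLD_BASE, "V19_4", "model.tar.gz"))
print(large_file_url(NEW_BASE, "v21.10", "model.tar.gz"))
```

Keeping the base URL and version directory as separate parameters means a release only needs to change two constants when the files move.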

ToDo

  • Change the CVM names presented to users during install: Use science names.
  • Move the run_ucvm.sh and run_ucvm_query.sh scripts to utilities
  • Document the scripts in ucvm/utilities
  • Update the version numbers from 21.7


  • SAM Poster
  • SAM Presentation Video
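
For the version-number item in the ToDo list, a minimal sketch of a guarded search-and-replace is shown here. It assumes the new version string is 21.10 (taken from the v21.10 directory name above) and that version strings appear as plain text; neither is confirmed by this page.

```python
import re


def bump_version(text, old="21.7", new="21.10"):
    """Replace standalone occurrences of the old version string with the new one."""
    # The lookbehind rejects a preceding digit or dot (e.g. "121.7").
    # The lookahead rejects a following digit or ".digit" (e.g. "21.75", "21.7.1").
    pattern = re.compile(r"(?<![\d.])" + re.escape(old) + r"(?!\d|\.\d)")
    return pattern.sub(new, text)


print(bump_version("UCVM 21.7 release notes"))
```

The guards keep unrelated numbers such as `121.75` untouched, which a plain string replace would corrupt.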

Key UCVM Improvements:

  1. Convergence of versions
  2. Large files stored on S3 starting with the next release
  3. CI setup
  4. Documentation updated to a new structure
  5. Tests output
  6. Code Metadata included in repo
  7. Tags from USGS Thesaurus
  8. Post DOI badge on UCVM
  9. Test with singularity on an XSEDE system
  10. UCVM Community
  11. Code of Conduct
  12. Open Source Metrics setup
  13. Code coverage statements
  14. Identification of sub-licenses in distribution
  15. Authorship contributions noted

Standard Contents of Git repo:

  1. A README with pictures/gifs of the product in action and a nice logo.
  2. Documentation.
  3. Code QA (Static Code Analysis).
  4. Contributing instructions.
  5. A well-defined setup section.
  6. Support (Respond to Issues/PR)
  7. Publish software news in every possible way.

Contents of README.md

When someone is looking at your project, they want to know:

  1. what is it?
  2. how good is the code?
  3. how much support is available?
  4. what’s included?
  5. what does it look like?
  6. how do I set it up?

Science Code Manifesto Elements:

  1. Code
  2. Copyright
  3. Citation
  4. Credit
  5. Curation

Steps To Software Product:

  1. Create a citable, definitive version of the software with a DOI, license, and repository.
  2. Define the reference publication used to cite the software.
  3. Define the software as the reference implementation of a method, and define a set of approved software acceptance/regression tests that can be used to establish that a piece of software implements that “method”.
  4. Create software maintenance organization with commit authority for pull requests and approval process for change requests, and process of approving new releases.
  5. Establish software community through registrations, newsletters, activity, regular calls, regular meetings, define community and roles.

Adoption of Fork and Pull Git Repo Model

  • Use the model used by the majority of open-source projects (including pyCSEP).
  • The “maintainer” of the shared repo assigns rights to “Collaborators”
  • Collaborators do not have push access to main (upstream) repo
  • The core development team accepts pull requests (PRs) from collaborators, reviews them, then merges them into the main repo

Contributor Process:

Working with shared projects on GitHub

  1. Fork the repository
  2. Clone your forked copy
  3. Sync your personal repo with shared repo
  4. Git merge/git rebase
  5. Make a contribution
  6. Pull request
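
Steps 3 and 4 above can be sketched as a command sequence. The helper below only builds the git commands; the branch name `main` and the remote names `origin` and `upstream` are assumptions, not confirmed by this page.

```python
def sync_commands(branch="main"):
    """Return git commands that bring a fork's branch up to date with upstream."""
    return [
        # Download new commits from the shared repo without changing local files.
        ["git", "fetch", "upstream"],
        # Switch to the local branch that should be updated.
        ["git", "checkout", branch],
        # Merge the shared repo's branch into the local one.
        ["git", "merge", f"upstream/{branch}"],
        # Publish the updated branch to the personal fork.
        ["git", "push", "origin", branch],
    ]


for cmd in sync_commands():
    print(" ".join(cmd))
```

To keep a linear history, the merge step could be swapped for `git rebase upstream/main`; contributions then go on a separate branch that is pushed to the fork and opened as a pull request.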

How we want it Cited:

  • Example Citation:
  • Example Acknowledgements:
  • Example Reference:

Basic Recommendations:

  1. Make source code publicly accessible
  2. Make software easy to discover by providing software metadata via a popular community registry (Examples of community registries of software metadata are bio.tools (Ison et al., 2016), biojs.io (Corpas et al., 2014; Gómez et al., 2013), and Omic Tools (Henry et al., 2014) in the life sciences, and DataCite (Brase, n.d.) as a generic metadata registry for software as well as data.)
  3. Adopt a license and comply with the license of third party dependencies
  4. Define clear and transparent contribution, governance and communications processes (For instance the Galaxy project’s website describes the team’s structure, how to be part of the community, and their communication channels.)

Types of Documentation, with axes:

  • help learning – help working
  • theoretical knowledge – practical knowledge
  1. tutorials – learning-oriented
  2. how-to guides – task-oriented
  3. Background/Concept explanations – understanding-oriented
  4. technical reference – information-oriented

DOCUMENTATION TYPES

  • CODE DOCUMENTATION - Semantic identifiers, comments, API, engineering, dependencies, requirements
  • USER DOCUMENTATION - How to get, run, use the software; parameters, data model, etc.; license
  • MAINTENANCE DOCUMENTATION - How to build, release, review code, publish
  • DEVELOPER DOCUMENTATION - How to contribute, contribution templates (issues, pull/merge requests)
  • METADATA - Software metadata (CodeMeta), Citation File (CFF), "references" (dependencies)
  • PROJECT DOCUMENTATION - Rationale, teams, governance, community (contact, code of conduct)
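
As an illustration of the METADATA item, the sketch below writes a minimal CodeMeta record as JSON. The version string and repository URL are assumptions drawn from names used elsewhere on this page, and a real codemeta.json would carry more fields (authors, license, and so on).

```python
import json


def codemeta_record(name, version, repository):
    """Build a minimal CodeMeta 2.0 description of a software project."""
    return {
        "@context": "https://doi.org/10.5063/schema/codemeta-2.0",
        "@type": "SoftwareSourceCode",
        "name": name,
        "version": version,
        "codeRepository": repository,
    }


# Assumed values: version 21.10 and a GitHub URL for sceccode/ucvm.
record = codemeta_record("UCVM", "21.10", "https://github.com/sceccode/ucvm")
print(json.dumps(record, indent=2))
```

The printed JSON could be saved as `codemeta.json` at the top of the repository so the metadata lives alongside the source code.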

Where Documentation Lives

Documentation lives where the source code lives! (This is never in an email, chat, or similar!)

Conceptual documentation

  • Requirements
  • Projects

Hands-on documentation

  • How-tos, getting started
  • Templates for issues, pull/merge
  • Contribution guidelines

Reference documentation

  • API
  • Tests
  • Metadata

Toolbox Documentation:

Toolbox documentation should describe the steps of analysis in a pedagogical, narrative fashion, with example data that users can load to follow along and understand the documentation.

Implement Multiple Test Types:

  1. Functional tests – Unit tests of essential core UCVM functions
  2. Integration tests – Test utilities including meshing, layer searches, GTLs, and performance
  3. Model tests – Each velocity model has tests showing expected results for some points
  4. Acceptance tests – Confirm results on the user's system. May be the union of functional, integration, and model tests.
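
A model test of the kind described in item 3 can be sketched as a table of points with expected material properties, compared within a tolerance. The query function, the coordinates, and all numeric values below are made-up stand-ins, not real UCVM output or the real UCVM API.

```python
import math


def check_model_points(query, expected, rel_tol=1e-3):
    """Return (point, property, got, want) tuples for every mismatch."""
    failures = []
    for point, want in expected.items():
        got = query(*point)
        for name, want_value in want.items():
            if not math.isclose(got[name], want_value, rel_tol=rel_tol):
                failures.append((point, name, got[name], want_value))
    return failures


def fake_query(lon, lat, depth):
    """Hypothetical stand-in for a velocity model query (values are invented)."""
    return {"vp": 5000.0 + depth, "vs": 2900.0 + 0.5 * depth, "rho": 2600.0}


# Expected results keyed by (lon, lat, depth); invented for this sketch.
EXPECTED = {
    (-118.0, 34.0, 0.0): {"vp": 5000.0, "vs": 2900.0, "rho": 2600.0},
    (-118.0, 34.0, 1000.0): {"vp": 6000.0, "vs": 3400.0, "rho": 2600.0},
}

failures = check_model_points(fake_query, EXPECTED)
print("model test passed" if not failures else failures)
```

Because the checker takes the query function as an argument, the same expected-points table could serve both as a model test during development and as part of an acceptance test on a user's system.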

Recommended Basic Practices:

  1. Training on Software Practices
  2. Code in a Code Repo
  3. Automated Testing
  4. Persistent ID for software versions

UCVM Versus CIG Standards:

Minimum:

  • Version control – ok
  • Code – ok
  • Portable – ok
  • Testing – (a) tests that verify it runs properly (b) accuracy or benchmark tests
  • Documentation – (a) install (b) parameters (c) physics (d) example inputs cookbooks (e) citable pub
  • User workflow – ok

Standard:

  • Version control – ok
  • Coding – (a) params at runtime (b) development plan (c) code comments (d) add features without modifying main branch (e) useful error reports
  • Portability – (a) dependency checking (b) automake (c) output configuration and build options
  • Testing – pass fail tests
  • Documentation: (a) workflow for research (b) how to extend code
  • User workflow: (a) easy to change sim params (b) user specific directories/filenames for i/o (c) standard binary formats (d) citation for code version.
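
The dependency checking item under Portability could be sketched as a pre-build check that reports which required command-line tools are missing. The tool list below is an assumption for illustration only; this page does not state UCVM's actual build requirements.

```python
import shutil


def missing_tools(tools, lookup=shutil.which):
    """Return the tools from the list that cannot be found on the PATH."""
    return [tool for tool in tools if lookup(tool) is None]


# Assumed tool list for illustration only.
REQUIRED_TOOLS = ["gcc", "make", "automake"]

missing = missing_tools(REQUIRED_TOOLS)
if missing:
    print("Missing build tools:", ", ".join(missing))
else:
    print("All listed build tools were found.")
```

The `lookup` parameter defaults to `shutil.which` and exists so the check can be tested with a fake lookup on systems where the tools are absent.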