Shock-carc configuration


This page documents the configuration requirements for shock-carc.usc.edu, which will be used as a workflow submission host for CyberShake.

Requirements

Functionality

Once shock-carc is fully configured, we would like to be able to:

  • Give users on SCEC projects and members of the Pegasus team login access to shock-carc.
  • Submit remote Condor jobs to Summit.
  • Submit remote Condor jobs to Frontera.
  • Create, plan, and run a Pegasus workflow that interacts with the CyberShake databases.

Hardware

  • In addition to internal disk, 2 TB of local storage (preferred, but not required)

Software (installed by CARC)

  • Linux OS (whichever distribution CARC uses by default is fine, such as CentOS 7)
  • HTCondor (9.0.1, out in mid-May)
  • MariaDB (repo has 10.5.5). MySQL is also fine.
  • globus-url-copy (should be available from Extra Packages for Enterprise Linux (EPEL) in the globus-gass-copy-progs package)
  • sqlite3, if not part of the OS
  • screen, if not part of the OS
  • ImageMagick, if not part of the OS
  • Python 3, if not part of the OS
  • Java 11 (11.0.11 is the latest)
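Once CARC has installed the packages above, a quick check that the expected commands are on the PATH can catch missing pieces early. The following is a minimal sketch, not an official CARC or SCEC tool. The command names in REQUIRED_COMMANDS are assumptions about what each package installs (for example, that ImageMagick provides convert or magick), and the script only tests for presence, not for the versions listed above.

```python
import shutil

# Assumed command names for each requested package; adjust if the
# installed packages expose different executables.
REQUIRED_COMMANDS = {
    "HTCondor": ["condor_version"],
    "MariaDB or MySQL": ["mariadb", "mysql"],
    "globus-url-copy": ["globus-url-copy"],
    "sqlite3": ["sqlite3"],
    "screen": ["screen"],
    "ImageMagick": ["convert", "magick"],
    "Python 3": ["python3"],
    "Java": ["java"],
}


def find_missing(required=REQUIRED_COMMANDS, which=shutil.which):
    """Return the labels for which none of the candidate commands is on PATH."""
    missing = []
    for label, candidates in required.items():
        # A package counts as present if any one of its candidates resolves.
        if not any(which(cmd) for cmd in candidates):
            missing.append(label)
    return missing


if __name__ == "__main__":
    missing = find_missing()
    if missing:
        print("Not found on PATH:", ", ".join(missing))
    else:
        print("All listed tools were found on PATH.")
```

The which parameter is injectable only so the lookup can be exercised without the real tools installed. Version checks (HTCondor 9.0.1, Java 11) would still need to be done by hand or added separately.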

Software (installed by SCEC)

  • Pegasus will be installed by SCEC developers in user space, since we often build from source.
  • OpenSHA will be installed by SCEC developers in user space.

Configuration

  • HTCondor configuration is specified through two files, condor_config and condor_config.local. CARC, the Pegasus team, and SCEC developers will work together to set up these files correctly. We anticipate that condor_config will be owned by user condor and will therefore have to be edited by CARC, while condor_config.local will be owned by user scottcal.
  • For MariaDB, permissions should be set up by CARC so that users can create and write to databases. Access to these databases should be restricted to localhost.
  • To enable remote job submission, we request that CARC add whitelisting exceptions for a short list of OLCF and TACC nodes, which will be provided separately.
  • Trusted certificates (from Pegasus group)
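The anticipated ownership split for the two HTCondor files can be verified with a few lines of Python once the files exist. This is a sketch under stated assumptions: the /etc/condor paths below are hypothetical placeholders (the actual locations depend on how CARC installs HTCondor), and only the owner names condor and scottcal come from this page.

```python
import os
import pwd


def file_owner(path):
    """Return the login name of the user owning `path` (numeric uid as a string if unnamed)."""
    uid = os.stat(path).st_uid
    try:
        return pwd.getpwuid(uid).pw_name
    except KeyError:
        # The uid has no passwd entry; fall back to the number itself.
        return str(uid)


def check_owners(expected):
    """Map each path to (actual_owner, matches); actual_owner is None if the path is missing."""
    results = {}
    for path, wanted in expected.items():
        try:
            actual = file_owner(path)
        except FileNotFoundError:
            actual = None
        results[path] = (actual, actual == wanted)
    return results


if __name__ == "__main__":
    # Hypothetical install paths; replace with the real locations.
    expected = {
        "/etc/condor/condor_config": "condor",
        "/etc/condor/condor_config.local": "scottcal",
    }
    for path, (actual, ok) in check_owners(expected).items():
        status = "ok" if ok else "MISMATCH"
        print(f"{path}: owner={actual} expected={expected[path]} [{status}]")
```

A mismatch on condor_config.local would mean SCEC developers cannot edit the local overrides themselves and would need CARC to make each change.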

Establishing Workflow Capability

We plan to follow the steps below in establishing the capability of running CyberShake workflows on shock-carc.

Step | Status
Create and populate RC database with rupture geometries. | Completed
Compile DAX generator. | Completed
Use DAX generator to create DAX for TEST site (using Rup_Var_Scenario_ID 6, CVM-S4.26, 0.5 Hz). | Completed
Plan TEST site DAX. |
Run jobs from TEST site DAX through pegasus-run. |
Successfully perform stage-in and stage-out between Summit and shock, and Summit and hpc-transfer*. |
Verify TEST site results. |

Once basic workflow capability is attained, we will test the additional advanced components.