Difference between revisions of "Shock-carc configuration"

From SCECpedia
 
(10 intermediate revisions by the same user not shown)

Latest revision as of 16:41, 27 October 2021

This page documents the configuration requirements for shock-carc.usc.edu, which will be used as a workflow submission host for CyberShake.

Requirements

Functionality

Once shock-carc is fully configured, we would like to be able to:

  • Allow users on SCEC projects and the Pegasus team to log in to shock-carc.
  • Submit remote Condor jobs to Summit.
  • Submit remote Condor jobs to Frontera.
  • Create, plan, and run a Pegasus workflow that interacts with CyberShake databases.

Hardware

  • In addition to internal disk, 2 TB of local storage (preferred, but not required)

Software (installed by CARC)

  • Linux OS (CARC's default flavor is fine: CentOS 7)
  • HTCondor (9.0.1, out in mid-May)
  • MariaDB (repo has 10.5.5). MySQL is also fine.
  • globus-url-copy (should be available in Extra Packages for Enterprise Linux (EPEL) as the globus-gass-copy-progs package)
  • sqlite3, if not part of the OS
  • screen, if not part of the OS
  • ImageMagick, if not part of the OS
  • Python 3, if not part of the OS
  • Java 11 (11.0.11 is the latest)
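
Once CARC has installed the packages above, a quick check that each one is reachable on PATH can save a round of back-and-forth. The following is a minimal sketch, not an official SCEC or CARC script. The executable names (for example `condor_version` for HTCondor, `mysql` for the MariaDB/MySQL client, `convert` for ImageMagick) are our assumptions about what each package provides and may differ on the actual host.

```python
import shutil
from typing import Callable, Dict, List, Optional

# Package label -> executable we expect on PATH.
# The executable names are assumptions; adjust them to the real install.
REQUIRED_TOOLS: Dict[str, str] = {
    "HTCondor": "condor_version",
    "MariaDB/MySQL client": "mysql",
    "globus-url-copy": "globus-url-copy",
    "sqlite3": "sqlite3",
    "screen": "screen",
    "ImageMagick": "convert",
    "Python 3": "python3",
    "Java": "java",
}


def find_missing(
    tools: Dict[str, str],
    which: Callable[[str], Optional[str]] = shutil.which,
) -> List[str]:
    """Return the package labels whose executable is not found on PATH."""
    return [label for label, exe in tools.items() if which(exe) is None]


if __name__ == "__main__":
    missing = find_missing(REQUIRED_TOOLS)
    if missing:
        print("Missing:", ", ".join(missing))
    else:
        print("All listed tools were found on PATH.")
```

The `which` parameter exists only so the lookup can be swapped out when testing; in normal use the default `shutil.which` is sufficient. The check confirms presence only, not the versions listed above.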

Software (installed by SCEC)

  • Pegasus will be installed by SCEC developers in user space, since we often build from source.
  • OpenSHA will be installed by SCEC developers in user space.

Configuration

  • HTCondor configuration is specified through two configuration files, condor_config and condor_config.local. CARC, the Pegasus team, and SCEC developers will work together to set up these files correctly. We anticipate that condor_config will be owned by user condor, and will therefore need to be edited by CARC, but that condor_config.local will be owned by user scottcal.
  • For MariaDB, permissions should be set up by CARC so that users can create and write to databases. Access to these databases should be restricted to localhost.
  • To enable remote job submission, we request that CARC add whitelist exceptions for a short list of OLCF and TACC nodes, which will be provided separately.
  • Trusted certificates (from Pegasus group)
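
The anticipated ownership of the two HTCondor configuration files can be verified with a short script. This is a sketch under stated assumptions: the `/etc/condor/` paths are hypothetical placeholders (the actual location depends on how CARC installs HTCondor), and the expected owners are taken from the text above.

```python
import os
import pwd
from typing import Dict, List

# Hypothetical paths paired with the owners anticipated in the text.
# Replace the paths with the real locations on shock-carc.
EXPECTED_OWNERS: Dict[str, str] = {
    "/etc/condor/condor_config": "condor",
    "/etc/condor/condor_config.local": "scottcal",
}


def owner_name(path: str) -> str:
    """Return the login name of the file's owner, or the numeric uid as text."""
    uid = os.stat(path).st_uid
    try:
        return pwd.getpwuid(uid).pw_name
    except KeyError:
        # The uid has no passwd entry on this host.
        return str(uid)


def ownership_problems(expected: Dict[str, str]) -> List[str]:
    """List mismatches between the expected and actual file owners."""
    problems = []
    for path, want in expected.items():
        if not os.path.exists(path):
            problems.append(f"{path}: file not found")
            continue
        actual = owner_name(path)
        if actual != want:
            problems.append(f"{path}: owned by {actual}, expected {want}")
    return problems


if __name__ == "__main__":
    for line in ownership_problems(EXPECTED_OWNERS) or ["Ownership matches."]:
        print(line)
```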

Establishing Workflow Capability

We plan to follow the steps below in establishing the capability of running CyberShake workflows on shock-carc.

Step | Status
Create and populate RC database with rupture geometries. | Completed.
Compile DAX generator. | Completed.
Use DAX generator to create DAX for TEST site (using Rup_Var_Scenario_ID 6, CVM-S4.26, 0.5 Hz). | Completed.
Plan TEST site DAX. |
Run jobs from TEST site DAX via pegasus-run. |
Successfully perform stage-in and stage-out between Summit and shock, and Summit and hpc-transfer*. |
Verify TEST site results. |

Once basic workflow capability is attained, we will test the additional advanced components.
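
The plan and run steps in the table above can be scripted as two command lines. The sketch below only builds the commands and does not execute them. The flags (`--conf`, `--dax`, `--dir`, `--sites`, `--output-site`) follow the Pegasus 4.x command line as we understand it; later Pegasus releases changed how the workflow file is passed, so confirm against `pegasus-plan --help` on the host. The file names, the site handles `summit` and `shock`, and the run directory are hypothetical placeholders, not values from the CyberShake setup.

```python
import shlex
from typing import List


def plan_command(
    dax: str,
    submit_dir: str,
    exec_site: str,
    output_site: str,
    properties: str = "pegasus.properties",
) -> List[str]:
    """Build a pegasus-plan command line (flags assumed from Pegasus 4.x)."""
    return [
        "pegasus-plan",
        "--conf", properties,
        "--dax", dax,
        "--dir", submit_dir,
        "--sites", exec_site,
        "--output-site", output_site,
    ]


def run_command(run_dir: str) -> List[str]:
    """Build a pegasus-run command for a directory created by pegasus-plan."""
    return ["pegasus-run", run_dir]


def render(cmd: List[str]) -> str:
    """Quote a command list so it can be pasted into a shell."""
    return " ".join(shlex.quote(arg) for arg in cmd)


if __name__ == "__main__":
    # All argument values below are hypothetical placeholders.
    print(render(plan_command("TEST.dax", "runs", "summit", "shock")))
    print(render(run_command("runs/TEST_run")))
```

Keeping the commands as argument lists means they can later be passed to `subprocess.run` without shell quoting issues.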