Difference between revisions of "Broadband Platform on HPCC"

From SCECpedia

Revision as of 21:25, 26 September 2012

Current broadband studies may exceed 200K seismograms. To produce this number of seismograms, we need to use high performance computing.

This entry contains information that may be useful when running a recent version of the Broadband Platform on the USC HPCC system. These instructions work for Broadband Platform v12.10 and later.

Overview

In order to set up Broadband validation runs on HPCC, users need to follow these steps:

  1. Install and Build Broadband on HPCC
  2. Install Desired Green's Functions and Validation Packages
  3. Configure Required Environment Variables
  4. Create Validation Runs and Start Simulations

Information about the USC HPCC system is available on the USC HPCC web site.

Broadband File System Issues

Broadband simulations are very I/O intensive: an average simulation reads and writes thousands of small and medium-sized files. We therefore use a local filesystem on each compute node to minimize remote reads and writes and thus improve execution time. This approach also avoids creating a bottleneck at the file server and eliminates unnecessary network traffic, which makes it possible for multiple users to run their simulations on HPCC without significant interference.

When users run Broadband Platform (bbp) validation processing on HPCC, the bbp validation scripts set Broadband's BBP_DATA_DIR to a directory in the /tmp filesystem.

Because the /tmp filesystem on each node is automatically cleaned at the end of each simulation, it is necessary to copy all wanted files to a permanent location. The HPCC validation scripts do that automatically after the simulations are finished (but before the PBS job ends).
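The staging pattern described above can be sketched in shell as follows. This is only an illustration and not the code used by the HPCC validation scripts: the function name run_with_local_scratch, the scratch directory naming, and the use of mktemp and cp are assumptions. Only BBP_DATA_DIR and the use of /tmp come from the text.

```shell
# Hypothetical helper (not part of Broadband): run a command with
# BBP_DATA_DIR pointing at a node-local scratch directory, then copy the
# results to a permanent location before the scratch space is cleaned.
run_with_local_scratch() {
    local dest="$1" scratch status
    shift
    scratch="$(mktemp -d "${TMPDIR:-/tmp}/bbp_data.XXXXXX")" || return 1

    # Set BBP_DATA_DIR for this command only.
    BBP_DATA_DIR="$scratch" "$@"
    status=$?

    # Copy everything back even if the run failed, so logs are preserved.
    mkdir -p "$dest" && cp -r "$scratch"/. "$dest"/
    rm -rf "$scratch"
    return "$status"
}
```

A job script could then wrap the simulation command, for example `run_with_local_scratch /path/to/permanent/results <simulation command>`, where both arguments are placeholders chosen by the user.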

Install and Build BBP on HPCC

The SCEC Broadband Platform should be run on 64-bit compute nodes, which are accessible through the hpc-login2.usc.edu head node.

Users should log into HPCC's hpc-login2.usc.edu head node and use the rcf-104 filesystem for their simulations. The rcf-104 filesystem is visible from the head node and from all worker nodes in the cluster.

The first step in setting up the Broadband Platform on HPCC is to download and build the platform. Users should make sure they have Broadband version 12.x.x or later, as Broadband releases 11.2.3 and earlier cannot be used on the USC HPCC cluster according to these instructions. It is also possible to use the svn version of Broadband (as described in the User Guide), but users should be aware that unreleased Broadband code from svn can change daily and is not recommended for official/paper simulations.

After downloading the Broadband package from the website, users need to untar it using the following command:

$ tar -xzvf bbp_dist_<version>.tgz

Before compiling Broadband, users will need to set up their environments. Depending on the shell employed, users will need to add the following lines:

csh -- add the following lines to the .cshrc file

# Setup for running broadband
source /usr/usc/gnu/gcc/default/setup.csh
source /usr/usc/intel/10.0/setup.csh
source /home/scec-00/opt/Python-2.6.2/setup.csh
source /usr/usc/matlab/default/setup.csh

bash -- add the following lines to the .bashrc file

# Setup for running broadband
source /usr/usc/gnu/gcc/default/setup.sh
source /usr/usc/intel/10.0/setup.sh
source /home/scec-00/opt/Python-2.6.2/setup.sh
source /usr/usc/matlab/default/setup.sh

It may be necessary to log out and log back in for these changes to take effect in the user environment (alternatively, users can source the changed file to apply the changes immediately). To make sure the correct compilers are set up, users can type the following commands:

$ ifort --version
ifort (IFORT) 10.0 20070426
Copyright (C) 1985-2007 Intel Corporation.  All rights reserved.

$ gcc --version
gcc (GCC) 4.3.3
Copyright (C) 2008 Free Software Foundation, Inc.
This is free software; see the source for copying conditions.  There is NO
warranty; not even for MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.
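The checks above can also be scripted so that a missing compiler is caught before running make. The following is a minimal sketch; the function name missing_tools is hypothetical, and the list of tools to check is supplied by the caller rather than taken from Broadband itself.

```shell
# Hypothetical helper: print the names of required tools that are not on
# the PATH and return non-zero if any are missing.
missing_tools() {
    local tool missing=""
    for tool in "$@"; do
        command -v "$tool" >/dev/null 2>&1 || missing="$missing $tool"
    done
    if [ -n "$missing" ]; then
        echo "${missing# }"     # strip the leading space
        return 1
    fi
}

# Example: check the compilers and interpreter set up in the files above.
missing_tools ifort gcc python || echo "fix the shell setup before running make"
```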

Once the environment is configured with the proper compilers, users can build the Broadband Platform. Assuming the platform is installed in the /home/rcf-104/earthquake/bbp directory, users should do the following to build the Broadband Platform:

$ cd /home/rcf-104/earthquake/bbp/src
$ make

If all compilers were properly added to the user's path, the code will start compiling. This process can take a while, and users may encounter some "build warnings", which are fine. If compilation errors are found, the problem needs to be investigated further.

Install Desired Green's Functions and Validation Packages

In order to run simulations, users will need to download and install one or more velocity models/Green's Functions. Validation packages are only required for historical simulations/validation runs. The first step is to create a top-level directory where all Green's Functions packages will reside.

$ cd /home/rcf-104/earthquake
$ mkdir bbp_gf
$ cd bbp_gf

Then, users need to untar each Green's Functions package downloaded from the Broadband website inside this top-level directory. For example:

$ tar -xzvf bbp_northridge_gf_<version>.tgz
$ tar -xzvf bbp_lomaprieta_gf_<version>.tgz
...

The same procedure should be followed for needed validation packages. Users need to create a top-level directory for all validation packages:

$ cd /home/rcf-104/earthquake
$ mkdir bbp_val
$ cd bbp_val

And then download each validation package from the Broadband website and untar it inside the validation top-level directory:

$ tar -xzvf bbp_northridge_val_<version>.tgz
$ tar -xzvf bbp_lomaprieta_val_<version>.tgz
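When several packages have been downloaded, the individual tar commands can be replaced by a loop. The sketch below assumes all archives sit in one download directory and end in .tgz; the function name unpack_all and both directory arguments are hypothetical.

```shell
# Hypothetical helper: unpack every .tgz archive found in a download
# directory into a target directory (for example bbp_gf or bbp_val).
unpack_all() {
    local src="$1" dest="$2" archive
    mkdir -p "$dest" || return 1
    for archive in "$src"/*.tgz; do
        [ -e "$archive" ] || continue     # the glob matched nothing
        tar -xzf "$archive" -C "$dest" || return 1
    done
}
```

For instance, `unpack_all ~/downloads /home/rcf-104/earthquake/bbp_val` would unpack every archive in a (hypothetical) ~/downloads directory into the validation directory created above.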

Configuring Required Environment Variables

Before users can run the Broadband Platform, they need to set up a few environment variables that tell the Platform how to find its components. This step is also shell dependent, and users may want to add these lines to their .cshrc (csh) or .bashrc (bash) in order to avoid having to type them every time they log into the head node to run simulations:

For csh:

setenv BBP_DIR /home/rcf-104/earthquake/bbp
setenv BBP_GF_DIR /home/rcf-104/earthquake/bbp_gf
setenv BBP_VAL_DIR /home/rcf-104/earthquake/bbp_val
setenv PYTHONPATH /home/rcf-104/earthquake/bbp/comps

For bash:

export BBP_DIR=/home/rcf-104/earthquake/bbp
export BBP_GF_DIR=/home/rcf-104/earthquake/bbp_gf
export BBP_VAL_DIR=/home/rcf-104/earthquake/bbp_val
export PYTHONPATH=/home/rcf-104/earthquake/bbp/comps

Please note that this example features the path names used in the steps above. Users need to customize these with their actual installation locations.

The step above is needed so users can run Broadband scripts on the head node (the steps in the next section will fail if these variables are not properly set!). Additionally, users need to edit the setup_bbp_env.template file (located inside the utils/batch directory), and change the values for BBP_DIR, BBP_GF_DIR, and BBP_VAL_DIR as described in that file. Once edited, the file should be renamed to setup_bbp_env.sh. This file will be used by worker nodes when running the actual simulations.
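Because the scripts in the next section fail when these variables are wrong, it can help to verify them first. The following is a minimal sketch of such a check; the function name check_bbp_env is hypothetical and is not part of the Broadband distribution.

```shell
# Hypothetical sanity check: each variable must be set and must name an
# existing directory. Problems are reported on stderr.
check_bbp_env() {
    local name value bad=0
    for name in BBP_DIR BBP_GF_DIR BBP_VAL_DIR; do
        eval "value=\${$name:-}"    # look up the variable by name
        if [ -z "$value" ]; then
            echo "$name is not set" >&2
            bad=1
        elif [ ! -d "$value" ]; then
            echo "$name points to a missing directory: $value" >&2
            bad=1
        fi
    done
    return "$bad"
}
```

Running `check_bbp_env` on the head node after editing the shell startup file gives a quick confirmation that the three directories are where the Platform expects them.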

Create Validation Runs and Start the Simulations

After completing all the steps above, creating validation runs and starting the simulations on HPCC is straightforward. Users should first create a top-level directory for their simulations:

$ cd /home/rcf-104/earthquake
$ mkdir sims
$ cd sims

The next step is to create the validation runs using the provided bbp_hpcc_validation.py script. This script needs a few parameters, such as codebase to use, event to use for validation, number of realizations to run, a simulation directory where the results will go, and an e-mail address for job/status notifications. For example, to run 8 realizations of the lomap validation using the Graves & Pitarka method, users should type:

$ /home/rcf-104/earthquake/utils/batch/bbp_hpcc_validation.py --codebase gp --event lomap --dir lomap-gp-8 -n 8 --email fsilva@usc.edu

The bbp_hpcc_validation.py script will prepare each realization and at the end will tell the user how to submit the job to the cluster. For example, when the bbp_hpcc_validation.py above finishes, it will print:

Validation run is set up on: /auto/rcf-104/earthquake/sims/lomap-gp-8

To start the validation run, just type: 
$ qsub /auto/rcf-104/earthquake/sims/lomap-gp-8/lomap-gp.pbs

Users should copy and paste the qsub line into their shell to start the validation run on HPCC.

Important Notes

  • The simulation directory provided to the bbp_hpcc_validation.py script should not already exist. If it does, the script will ask the user whether it should be deleted.
  • The script will allocate 8-core nodes on HPCC. Running 1-8 simulations will use 1 node, 9-16 simulations will use 2 nodes, and so on.
  • Each realization will use a different random seed for the rupture generator. All other simulation parameters remain the same among all realizations.
  • When users re-run simulations, the same random seeds are used in order to allow for reproducible results.
  • Some validation packages include an SRF file instead of a source description (SRC file). In these cases the script cannot generate multiple realizations because the rupture is already defined. Users should invoke the bbp_hpcc_validation.py script with the --skip-rupgen option, which implies that only a single realization will run.
  • Users will receive an e-mail at the e-mail address provided when their job begins, and another one when the job finishes.
  • Users should only run 1 simulation at a time in order to be considerate of other users of the cluster.
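
The node allocation rule in the notes above (1-8 simulations on 1 node, 9-16 on 2 nodes, and so on) is a ceiling division by the number of cores per node. The sketch below illustrates the arithmetic only; the function name nodes_needed is hypothetical, and it assumes one realization per core, which is implied by the text but not stated by the script.

```shell
# Hypothetical helper: number of nodes needed for a given number of
# realizations, assuming 8-core nodes and one realization per core.
nodes_needed() {
    local realizations="$1" cores_per_node="${2:-8}"
    echo $(( (realizations + cores_per_node - 1) / cores_per_node ))
}

nodes_needed 8     # prints 1
nodes_needed 9     # prints 2
nodes_needed 17    # prints 3
```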