Difference between revisions of "Broadband Platform on HPCC"
Revision as of 22:56, 26 September 2012
Current broadband studies may exceed 200K seismograms. To produce this number of seismograms, we need to use high performance computing.
This entry contains information that may be useful when running a recent version of the Broadband Platform on the USC HPCC system. These instructions work for Broadband Platform v12.10 and later.
Contents
- 1 Overview
- 2 Broadband File System Issues
- 3 Install and Build BBP on HPCC
- 4 Setup HPCC environment
- 5 Compiler Settings
- 6 Building the Broadband Platform
- 7 Install Desired Green's Functions and Validation Packages
- 8 Configuring Required Environment Variables
- 9 Create Validation Runs and Start the Simulations
- 10 Important Notes
Overview
In order to set up Broadband validation runs on HPCC, users need to follow these steps:
- Install and Build Broadband on HPCC
- Install Desired Green's Functions and Validation Packages
- Configure Required Environment Variables
- Create Validation Runs and Start Simulations
Information about the USC HPCC system is available on the USC HPCC web site.
Broadband File System Issues
Broadband simulations are very I/O intensive: an average simulation reads and writes thousands of small and medium-sized files. To manage this I/O load, we use a local filesystem on each compute node to minimize remote reads and writes and thus improve execution time. This approach also avoids creating a bottleneck at the file server and eliminates unnecessary network traffic. It is therefore possible for multiple users to run their simulations on HPCC without significant interference.
If users run broadband platform (bbp) validation processing on HPCC, the bbp validation scripts set up Broadband's BBP_DATA_DIR to use a directory in the /tmp filesystem.
Because the /tmp filesystem on each node is automatically cleaned at the end of each simulation, it is necessary to copy all wanted files to a permanent location. The HPCC validation scripts do that automatically after the simulations are finished (but before the PBS job ends).
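The copy-out step described above can be sketched as a small shell function. This is only an illustration of the idea, not the code used by the HPCC validation scripts; the function name stage_out and both directory arguments are hypothetical.

```shell
# Hypothetical helper: copy results from node-local scratch space to a
# permanent location. Neither argument is a path used by the BBP scripts.
stage_out() {
    src_dir=$1
    dest_dir=$2
    if [ ! -d "$src_dir" ]; then
        echo "stage_out: source directory not found: $src_dir" >&2
        return 1
    fi
    mkdir -p "$dest_dir" || return 1
    # "$src_dir"/. copies the directory contents, including hidden files
    cp -R "$src_dir"/. "$dest_dir"/
}
```

A job script would call something like this after the simulations finish and before the PBS job ends, since the /tmp contents are not preserved afterwards.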
Install and Build BBP on HPCC
The SCEC Broadband Platform should make use of 64-bit compute nodes. These are accessible through the hpc-login2.usc.edu head node.
Users should log into HPCC's hpc-login2.usc.edu head node and then confirm that they can access the rcf-104 filesystem for their simulations. Their home directories may not be on this filesystem.
The rcf-104 filesystem is visible from the head node and from all worker nodes in the cluster. However, it may not be the default filesystem. To confirm your account is configured to use rcf-104, log in to hpc-login2.usc.edu and cd over to rcf-104 as shown below.
Please replace the username shown as "maechlin" with your HPCC username in the commands given below.
-bash-3.2$ pwd
/home/rcf-01/maechlin
-bash-3.2$ cd /home/rcf-104/maechlin
-bash-3.2$ pwd
/home/rcf-104/maechlin
If these commands work, and you have access to a directory on /home/rcf-104, please proceed to the next steps. Otherwise, please contact the broadband developers at SCEC, who will help you set up your HPCC account as needed.
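The access check can also be scripted. The following is a minimal sketch, assuming a writable directory is what is needed; the function name check_dir_access is hypothetical and not part of the Broadband distribution.

```shell
# Hypothetical helper: report whether a directory exists and is writable.
check_dir_access() {
    dir=$1
    if [ -d "$dir" ] && [ -w "$dir" ]; then
        echo "ok: $dir"
    else
        echo "no access: $dir" >&2
        return 1
    fi
}

# Example (replace "maechlin" with your HPCC username):
#   check_dir_access /home/rcf-104/maechlin
```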
The next step in setting up the Broadband Platform on HPCC is to download and build the platform. Users should make sure they have Broadband version 12.x.x or later, as Broadband releases 11.2.3 and earlier cannot be used on the USC HPCC cluster with these instructions.
It is also possible to use the svn version of Broadband (as described in the User Guide), but users should be aware that unreleased Broadband code from svn can change daily and is not recommended for official/paper simulations.
-bash-3.2$ svn co https://source.usc.edu/svn/broadband/trunk
In this case, let us assume a user has downloaded the tgz file. After downloading the Broadband package from the website, users need to untar it using the following command:
$ tar -xzvf bbp_dist_<version>.tgz
Setup HPCC environment
Before compiling Broadband, users will need to set up their HPCC computing environments. This is done by setting environment variables in their .bashrc or .cshrc files. These files are typically in the user's home directory (and probably not in the rcf-104 directory).
Depending on the shell employed, users will need to add the following lines:
bash -- add the following lines to the .bashrc file
# Setup for running broadband
source /usr/usc/gnu/gcc/default/setup.sh
source /usr/usc/intel/10.0/setup.sh
source /home/scec-00/opt/Python-2.6.2/setup.sh
source /usr/usc/matlab/default/setup.sh
csh -- add the following lines to the .cshrc file
# Setup for running broadband
source /usr/usc/gnu/gcc/default/setup.csh
source /usr/usc/intel/10.0/setup.csh
source /home/scec-00/opt/Python-2.6.2/setup.csh
source /usr/usc/matlab/default/setup.csh
It may be necessary to log out and log back in for these changes to take effect in the user environment (alternatively, users can source the changed file to apply the changes immediately).
Compiler Settings
Some programs in the broadband platform must be compiled before they can be used. To make sure the correct compilers are set up, users can type the following commands:
$ ifort --version
ifort (IFORT) 10.0 20070426
Copyright (C) 1985-2007 Intel Corporation. All rights reserved.
$ gcc --version
gcc (GCC) 4.3.3
Copyright (C) 2008 Free Software Foundation, Inc.
This is free software; see the source for copying conditions. There is NO
warranty; not even for MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.
The current Intel compiler on HPCC is v10.0. This older version of the compiler can generate somewhat different results than more recent versions. We must work with the versions available on HPCC, so add the following commands to your .bash_profile to point to the appropriate versions of the compilers.
source /usr/usc/gnu/gcc/default/setup.sh
source /usr/usc/intel/10.0/setup.sh
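Since results can depend on the compiler version, it may help to check the version in a script rather than by eye. The sketch below extracts the first dotted version number from a "--version" banner; the function name banner_version is hypothetical, and the parsing assumes banners shaped like the ones shown above.

```shell
# Hypothetical helper: print the first dotted version number found on the
# first matching line of a "--version" banner passed as the first argument.
banner_version() {
    printf '%s\n' "$1" |
        sed -n 's/^[^0-9]*\([0-9][0-9]*\(\.[0-9][0-9]*\)\{1,\}\).*/\1/p' |
        head -n 1
}

# Example, on a machine where the compilers are installed:
#   [ "$(banner_version "$(ifort --version)")" = "10.0" ] || echo "unexpected ifort version"
```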
Building the Broadband Platform
Once the environment is configured with the proper compilers, users can build the Broadband Platform. Assuming the platform is installed in the /home/rcf-104/earthquake/bbp directory, users should do the following to build the Broadband Platform:
$ cd /home/rcf-104/earthquake/bbp/src
$ make
If all compilers were properly added to the user's path, the code will start compiling. This process can take a while, and users may encounter some "build warnings", which are fine. If compilation errors are found, the problem needs to be investigated further.
Install Desired Green's Functions and Validation Packages
In order to run simulations, users will need to download and install one or more velocity models/Green's Functions. Validation packages are only required for historical simulations/validation runs. The first step is to create a top-level directory where all Green's Functions packages will reside.
$ cd /home/rcf-104/earthquake
$ mkdir bbp_gf
$ cd bbp_gf
Then, users need to untar inside the Green's Functions top-level directory each Green's Functions package downloaded from the Broadband website. For example:
$ tar -xzvf bbp_northridge_gf_<version>.tgz
$ tar -xzvf bbp_lomaprieta_gf_<version>.tgz
...
The same procedure should be followed for needed validation packages. Users need to create a top-level directory for all validation packages:
$ cd /home/rcf-104/earthquake
$ mkdir bbp_val
$ cd bbp_val
And then download each validation package from the Broadband website and untar it inside the validation top-level directory:
$ tar -xzvf bbp_northridge_val_<version>.tgz
$ tar -xzvf bbp_lomaprieta_val_<version>.tgz
Configuring Required Environment Variables
Before users can run the Broadband Platform, they need to set up a few environment variables that tell the Platform how to find its components. This step is also shell dependent, and users may want to add these lines to their .cshrc (csh) or .bashrc (bash) in order to avoid having to type them every time they log into the head node to run simulations:
For csh:
setenv BBP_DIR /home/rcf-104/earthquake/bbp
setenv BBP_GF_DIR /home/rcf-104/earthquake/bbp_gf
setenv BBP_VAL_DIR /home/rcf-104/earthquake/bbp_val
setenv PYTHONPATH /home/rcf-104/earthquake/bbp/comps
For bash:
export BBP_DIR=/home/rcf-104/earthquake/bbp
export BBP_GF_DIR=/home/rcf-104/earthquake/bbp_gf
export BBP_VAL_DIR=/home/rcf-104/earthquake/bbp_val
export PYTHONPATH=/home/rcf-104/earthquake/bbp/comps
Please note that this example features the path names used in the steps above. Users need to customize these with their actual installation locations.
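Before moving on, users may want to confirm that the variables are set and point to real directories. The following is a minimal sketch of such a check; the function name check_bbp_env is hypothetical and is not provided by the Broadband Platform.

```shell
# Hypothetical helper: check that each named variable is set and points to
# an existing directory. Prints one line per problem and returns non-zero
# if any check fails.
check_bbp_env() {
    status=0
    for name in "$@"; do
        # Look up the value of the variable whose name is in $name
        eval "value=\${$name:-}"
        if [ -z "$value" ]; then
            echo "$name is not set" >&2
            status=1
        elif [ ! -d "$value" ]; then
            echo "$name points to a missing directory: $value" >&2
            status=1
        fi
    done
    return $status
}

# Example:
#   check_bbp_env BBP_DIR BBP_GF_DIR BBP_VAL_DIR
```

PYTHONPATH is left out of the example call because it can hold a colon-separated list of directories, which this simple check does not split.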
The step above is needed so users can run Broadband scripts on the head node (the steps in the next section will fail if these variables are not properly set!). Additionally, users need to edit the setup_bbp_env.template file (located inside the utils/batch directory), and change the values for BBP_DIR, BBP_GF_DIR, and BBP_VAL_DIR as described in that file. Once edited, the file should be renamed to setup_bbp_env.sh. This file will be used by worker nodes when running the actual simulations.
Create Validation Runs and Start the Simulations
After completing all the steps above, creating validation runs and starting the simulations on HPCC is easy! Users should first create a top-level directory for their simulations:
$ cd /home/rcf-104/earthquake
$ mkdir sims
$ cd sims
The next step is to create the validation runs using the provided bbp_hpcc_validation.py script. This script needs a few parameters, such as codebase to use, event to use for validation, number of realizations to run, a simulation directory where the results will go, and an e-mail address for job/status notifications. For example, to run 8 realizations of the lomap validation using the Graves & Pitarka method, users should type:
$ /home/rcf-104/earthquake/utils/batch/bbp_hpcc_validation.py --codebase gp --event lomap --dir lomap-gp-8 -n 8 --email fsilva@usc.edu
The bbp_hpcc_validation.py script will prepare each realization and at the end will tell the user how to submit the job to the cluster. For example, when the bbp_hpcc_validation.py above finishes, it will print:
Validation run is set up on: /auto/rcf-104/earthquake/sims/lomap-gp-8
To start the validation run, just type:
$ qsub /auto/rcf-104/earthquake/sims/lomap-gp-8/lomap-gp.pbs
Users should copy and paste the qsub line into their shell to start the validation run on HPCC.
Important Notes
- The simulation directory provided to the bbp_hpcc_validation.py script should not exist. If it does, the script will ask the user if it should be deleted.
- The script will allocate 8-core nodes on HPCC. Running 1-8 simulations will use 1 node, 9-16 simulations will use 2 nodes, and so on.
- Each realization will use a different random seed for the rupture generator. All other simulation parameters remain the same among all realizations.
- When users re-run simulations, the same random seeds are used in order to allow for reproducible results.
- Some validation packages include an SRF file instead of a source description (SRC file). In these cases the script cannot generate multiple realizations, as the rupture is already defined. Users should invoke the bbp_hpcc_validation.py script with the --skip-rupgen option, which implies that only a single realization will run.
- Users will receive an e-mail at the e-mail address provided when their job begins, and another one when the job finishes.
- Users should only run 1 simulation at a time in order to be nice to other users of the cluster.
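The node allocation rule in the notes above (1-8 simulations on 1 node, 9-16 on 2 nodes, and so on) amounts to a ceiling division by 8. A minimal sketch of that arithmetic follows; the function name nodes_needed is hypothetical, and it assumes one realization per core as the note implies.

```shell
# Hypothetical helper: nodes required for a given number of realizations,
# assuming 8 cores per node and one realization per core.
nodes_needed() {
    realizations=$1
    cores_per_node=${2:-8}
    # Integer ceiling division
    echo $(( (realizations + cores_per_node - 1) / cores_per_node ))
}

# Example:
#   nodes_needed 9
```

This only mirrors the rule stated in the notes; the actual allocation is decided by the bbp_hpcc_validation.py script and the PBS file it generates.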