Broadband User Guide v11.2.2

From SCECpedia
Revision as of 18:27, 14 October 2011 by Sandarsh (talk | contribs)

Version 11.2.2.

If you find errors in this document, or find any sections unclear, please either fix them yourself or contact Sandarsh Kumar (sandarsh at usc.edu), Scott Callaghan (scottcal at usc.edu) or Philip Maechling (maechlin at usc.edu).

Installing the Second-Generation Broadband Platform

Installing the Broadband Platform involves obtaining a copy of the code and building the required executables. You can either download the platform from the Broadband web site (http://www.scec.org/research/broadband) or check the code out of SCEC's Subversion repository. Most users should download the platform.

System Dependencies

The current version of the Broadband Platform is designed to run on standard 64-bit Linux machines. Testing has been performed on SCEC's development servers running Fedora Core 10 (kernel version 2.6.27.41-170.2.117.fc10.x86_64). In this guide we outline how to install the platform into your own account on a Linux computer using the simplest approach.

Software Dependencies

The Broadband Platform has certain software dependencies.

Required:

  • Python v2.7 with
    • PyGTK
    • Matplotlib
    • Numpy
    • Pyproj
  • Intel compilers (64-bit) v12.x
  • GNU compilers (gcc, gfortran) v4.5
  • GNU Fortran 77 v3.4

Optional:

  • GMT tools (for plots)
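
The Python packages listed above can be checked from a short script before you continue. The following is a minimal sketch, not part of the platform: the function name missing_modules is hypothetical, and the import names are assumptions (PyGTK is normally imported as gtk).

```python
import importlib

# Import names assumed for the required packages listed above;
# PyGTK is normally imported as "gtk".
REQUIRED_MODULES = ["gtk", "matplotlib", "numpy", "pyproj"]


def missing_modules(names):
    """Return the subset of names that cannot be imported."""
    missing = []
    for name in names:
        try:
            importlib.import_module(name)
        except ImportError:
            missing.append(name)
    return missing


if __name__ == "__main__":
    absent = missing_modules(REQUIRED_MODULES)
    if absent:
        print("Missing Python modules: " + ", ".join(absent))
    else:
        print("All required Python modules were found.")
```

Run the script with the same Python interpreter you intend to use for the platform, since each interpreter has its own set of installed packages.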

Setting Up Your Account

For simplicity of installation, we recommend that users use a bash shell for the Broadband Platform account. It is possible to get the platform running using other shells, but we will focus on a bash shell installation. The user environment is a common source of problems, since certain environment variables must be defined for the platform to work correctly.

To check your account, make sure you can run basic commands like ls and cd before proceeding.

Downloading the Platform

Download four files from the Broadband website: the code (bbp_dist_<version>.tgz), the data (bbp_data_<version>.tgz), and their checksum files (bbp_dist_<version>.tgz.md5 and bbp_data_<version>.tgz.md5). The code file is about 100 MB and the data file is about 3 GB. After you've downloaded the files to your local Linux system, the next step is to calculate the checksums yourself and compare them to the checksums you downloaded.

First, verify that the md5sum command is in your path:

$> which md5sum

You should get something like /usr/bin/md5sum. If you see the message 'no md5sum in...', contact your Linux system administrator and ask to have md5sum added to your path.

Once you can run the md5sum command, run:

$> md5sum -c bbp_dist_<version>.tgz.md5
$> md5sum -c bbp_data_<version>.tgz.md5

You should get the messages

bbp_dist_<version>.tgz: OK
bbp_data_<version>.tgz: OK

If you get FAILED instead, re-download the tgz files and try again. An OK result means the file was downloaded without error.
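
If md5sum is not available, the same comparison can be sketched in Python with the standard hashlib module. This is an illustrative sketch only: the function names are hypothetical, and it assumes each .md5 file uses the md5sum layout in which the digest is the first field on the first line.

```python
import hashlib


def md5_of_file(path, chunk_size=1024 * 1024):
    """Return the hex MD5 digest of a file, reading it in chunks."""
    digest = hashlib.md5()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def expected_md5(md5_path):
    """Read the digest from a checksum file whose first field is the hash."""
    with open(md5_path) as handle:
        return handle.readline().split()[0].lower()


def checksum_matches(archive_path, md5_path):
    """Return True when the archive's digest equals the published one."""
    return md5_of_file(archive_path) == expected_md5(md5_path)
```

Reading in chunks keeps memory use small even for the multi-gigabyte data archive. Call checksum_matches with the path to a .tgz file and the path to its .md5 file; a False result corresponds to the FAILED message from md5sum.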

Once both files have passed the checksum test, untar the files.

$> tar xzvf bbp_dist_<version>.tgz
$> tar xzvf bbp_data_<version>.tgz

If multiple users are planning to use the platform on the same system, you only need one copy of the data files per machine. Each user will still need his or her own copy of the code files.

Alternatively, if you would like access to the latest version of the platform, with frequent but less thoroughly tested improvements, you can check out the platform from SCEC's Subversion repository. Only advanced users should take this approach, which is outlined in detail in the Advanced Users section.

User Account Setup

The Broadband Platform installation is divided into two parts: (1) the input Green's Functions (GF) data directory (9.3 GB), and (2) the BBP home directory (366 MB).

First, the Green's Function libraries are the larger part, but they are static. The Broadband Platform reads these data files but does not change them.

Second, the remainder of the Broadband Platform is organized in the BBP home directory. The BBP home directory has a specific directory structure that includes the source code for the scientific programs, the Python scripts that link the scientific programs, the simulation input directory, the temporary and log file directories, and the output data directory where all the platform results are written.

The BBP home directory will increase in size as you run the platform, because both output data and output log files are written to the home directory. Running the acceptance tests will produce nearly 10 GB of files: input data (5.2 GB), output data (500 MB), temporary files (2.7 GB), and output log files (250 MB). Once these tests have passed, much of this data can be removed. However, the BBP home directory should have at least 10 GB of free disk space to ensure that the acceptance tests can be run when a Broadband Platform software distribution is first installed on a system.
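
The disk space guideline above can be checked before installing. The sketch below is illustrative and not part of the platform: the names free_gb and has_room are hypothetical, GB is taken to mean 10**9 bytes, and os.statvfs is assumed to be available (it is on Linux).

```python
import os

# Approximate acceptance-test footprint quoted above, in GB.
ACCEPTANCE_TEST_GB = {
    "input data": 5.2,
    "output data": 0.5,
    "temporary files": 2.7,
    "log files": 0.25,
}


def free_gb(path):
    """Return the space available to the current user at path, in GB."""
    stats = os.statvfs(path)
    return stats.f_bavail * stats.f_frsize / 1e9


def has_room(path, needed_gb=10.0):
    """Return True when at least needed_gb of free space is available."""
    return free_gb(path) >= needed_gb


# The listed items sum to 8.65 GB, which is why 10 GB is a safe guideline.
total_gb = sum(ACCEPTANCE_TEST_GB.values())
```

Call has_room with the directory where you plan to place the BBP home directory; a False result means the acceptance tests may run out of space.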

Data Directory

Input data files called Green's Functions (GFs) are distributed with the platform. These files are generated using specific velocity structures, so they are generally considered region-specific. We provide GFs for three regions: near Loma Prieta, near Landers, and near Northridge.

The data directory is static and read-only. A single copy can be installed on a shared disk and used by multiple users.

Setting Environment Variables

The BBP source codes and scripts are organized under the broadband platform home directory. The broadband platform home directory is specified in a couple of places during Broadband Platform installation.

Internal to the Broadband Platform software, all platform files (except the Green's Function data files) are in sub-directories of the BBP home directory.

BBP Environment Variable

We recommend setting both an alias and an environment variable to the BBP home directory. This will save you from typing the path many times.

If you're running a Bash shell, add the following line to your .bash_profile with your favorite text editor:

export BBP=/home/scec-00/kumar/bbp_2g

If you're running a C-shell, add the following line to your .cshrc with your favorite text editor:

setenv BBP /home/scec-00/kumar/bbp_2g

  • PYTHONPATH

After you've obtained a copy of the project, you'll need to make sure the comps directory is on Python's path so Python can find all the project modules. If you're running a Bash shell, add the following line to your .bash_profile with your favorite text editor:

export PYTHONPATH=$BBP/comps:$PYTHONPATH

If you're running a C-shell, add the following line to your .cshrc with your favorite text editor:

setenv PYTHONPATH $BBP/comps:$PYTHONPATH

  • PATH

In order to compile the project successfully, you'll need to make sure the required compiler directories are in your PATH variable. Broadband requires the Intel 64-bit compilers (icc and ifort) and the GNU compilers (gcc, g77, and f77) to compile the scientific code.

If you are planning to run Broadband on SCEC Development servers, make sure you have the following directories in your PATH:

For Bash Shell (in .bash_profile)

export PATH=/usr/scec/intel/cce/9.0/bin:/usr/scec/intel/fce/9.0/bin:$PATH

For C-shell (in .cshrc)

setenv PATH /usr/scec/intel/cce/9.0/bin:/usr/scec/intel/fce/9.0/bin:$PATH

When running elements of the platform over ssh, be sure to enable X11 forwarding (with the ssh -X or -Y options).

After modifying your login script above, log out and log back into the machine so the changes are reflected in your environment.
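
After logging back in, you can confirm that the variables described above took effect. The following is a minimal sketch, not part of the platform: the function names are hypothetical, and the default compiler list (icc, ifort, gcc) is an assumption you may need to adjust for your system.

```python
import os


def find_on_path(program, path_value):
    """Return the first executable named program in a PATH-style string, or None."""
    for directory in path_value.split(os.pathsep):
        candidate = os.path.join(directory, program)
        if directory and os.path.isfile(candidate) and os.access(candidate, os.X_OK):
            return candidate
    return None


def check_environment(env, compilers=("icc", "ifort", "gcc")):
    """Return a list of human-readable problems found in env (a dict)."""
    problems = []
    bbp = env.get("BBP", "")
    if not bbp:
        problems.append("BBP is not set")
    else:
        comps = os.path.join(bbp, "comps")
        if comps not in env.get("PYTHONPATH", "").split(os.pathsep):
            problems.append("PYTHONPATH does not contain " + comps)
    for compiler in compilers:
        if find_on_path(compiler, env.get("PATH", "")) is None:
            problems.append(compiler + " was not found on PATH")
    return problems


if __name__ == "__main__":
    for problem in check_environment(dict(os.environ)):
        print(problem)
```

An empty result means BBP is set, $BBP/comps is on PYTHONPATH, and each listed compiler was found. The comparison is an exact string match, so a trailing slash on BBP would be reported as a mismatch.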

Edit install_cfg.py with Installation Directory Paths

You need to tell the platform where it is installed by editing a single Python file called install_cfg.py. Open the file bbp_2g/comps/install_cfg.py with your favorite text editor and find the lines:

self.A_INSTALL_ROOT = <bbp_2g directory>
self.A_GF_DIR = <bbp_2g_gf directory>

Replace the right-hand sides with the paths on your system to the bbp_2g directory (the source) and the bbp_2g_gf directory (the data) that you unpacked, written as quoted Python strings. For example:

self.A_INSTALL_ROOT = "/home/scec-00/kumar/bbp_2g"
self.A_GF_DIR = "/home/scec-00/kumar/bbp_2g_gf"

On another account, the paths could be /home/scottcal/broadband_platform/bbp_2g and /home/scottcal/broadband_platform/bbp_2g_gf.

Here is some information about these two directories that may help you decide where to install them on your disk system. The data files, and therefore the bbp_2g_gf directory, are nearly 3 GB, but they are static: they will not be modified and will not grow in size during use of the platform. The source directory is small initially. However, it will grow as the platform is used, since the results produced by the platform are stored there.

Directory Structure

The platform consists of two top-level directories, bbp_2g and bbp_2g_gf. bbp_2g contains the source code, executables, scripts, tests, input, working, and output directories. bbp_2g_gf contains the Green's Functions, input files for the validation events, and other required input files for the various code bases. Note that indata, logs, outdata, tmpdata, and xml are created when the platform is first run, so they will be missing when you first install the platform.

bbp_2g has the following directories:

  • checksums: Contains checksums for bbp_2g_gf files
  • comps: The Python scripts to run the platform
  • docs: Documentation for the platform
  • etc: Miscellaneous utility scripts
  • examples: Contains example input files
  • indata: An internal directory, used to stage input files
  • logs: Contains logs from BBP runs
  • outdata: Contains output files from a run
  • ref_data: Contains reference files for BBP tests
  • start: Put input files for an interactive run here
  • src: Source code for BBP modules
  • tests: Contains unit and acceptance tests
  • tmpdata: An internal directory, used during a run
  • xml: Contains XML files which describe simulations and can be used as input

bbp_2g_gf has the following directories:

  • compare: Contains observed seismograms for validation events
  • plot: Data files for GMT plots
  • sdsu, ucsb, urs: Contains Green's functions, velocity files, and other required inputs for the codebases.

In general, you will be interacting with the start directory for input files, comps to run the platform, tests to test the platform, and outdata to examine data products.
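
A quick way to confirm that both archives unpacked as described is to compare the directory names listed above against what is on disk. This is a sketch under stated assumptions: the function name missing_subdirs is hypothetical, and the expected names are taken directly from the lists in this section.

```python
import os

# Directories listed above that should exist right after installation.
# indata, logs, outdata, tmpdata, and xml are created on the first run,
# so they are deliberately left out.
BBP_2G_DIRS = ["checksums", "comps", "docs", "etc", "examples",
               "ref_data", "start", "src", "tests"]
BBP_2G_GF_DIRS = ["compare", "plot", "sdsu", "ucsb", "urs"]


def missing_subdirs(root, expected):
    """Return the expected subdirectory names that are absent under root."""
    return [name for name in expected
            if not os.path.isdir(os.path.join(root, name))]
```

Call missing_subdirs once with the bbp_2g path and BBP_2G_DIRS, and once with the bbp_2g_gf path and BBP_2G_GF_DIRS. An empty list from each call suggests the archives were extracted completely.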

Adding aliases

You may find it helpful to add aliases, so you can quickly and easily move to different broadband directories with a single command. We recommend creating aliases for the home, start, and outdata directories.

If you are using the Bash shell, you can create aliases by adding the following lines to ~/.bash_profile:

alias bbp='cd <path to bbp_2g directory>'
alias start='cd <path to bbp_2g directory>/start'
alias outdata='cd <path to bbp_2g directory>/outdata'

If you're using a C shell, edit your ~/.cshrc and add:

alias bbp cd <path to bbp_2g directory>
alias start cd <path to bbp_2g directory>/start
alias outdata cd <path to bbp_2g directory>/outdata

Log out and log back in. You'll notice that now you can type the alias command as a shortcut to change directories:

$> pwd
/home/scec-00/scottcal
$> start
$> pwd
/home/scec-00/scottcal/bband/bbp_2g/start

This can be a useful way to navigate around the broadband platform directories.

Building the Platform

Once you have obtained the code, you need to build it. By default, every executable is compiled using the compiler recommended by the code developer. However, if you have limited compiler options or are building the codes on an untested system, you may need to specify alternative compilers, as described below.

Before you can build the platform, you need to make sure that the Intel compilers are in your path. This is done automatically on intensity.usc.edu, but on other systems you can check by typing:

$> which icc

If you get the message "no icc in ...", then you'll need to add the Intel compilers to your path. Once the Intel compilers are in your path, you can make the code by cd-ing to the bbp_2g/src directory and typing make:

$> cd src
$> make

It takes a minute or two to build the code. You may encounter build warnings; these are fine. However, any build errors indicate a problem and should be investigated.
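
If you script the build, the exit status of make is a simple way to tell warnings from errors. The sketch below is illustrative only: the function names are hypothetical, and it assumes make is on your PATH and that warnings are not being promoted to errors by compiler flags.

```python
import subprocess


def run_build(src_dir, command=("make",)):
    """Run the build command in src_dir and return its exit status."""
    return subprocess.call(list(command), cwd=src_dir)


def build_succeeded(src_dir, command=("make",)):
    """Return True when the build command exits with status 0."""
    # make normally exits with 0 when every target was built; compiler
    # warnings do not change the status, but build errors do.
    return run_build(src_dir, command) == 0
```

Call build_succeeded with the path to the bbp_2g/src directory. The command argument is exposed so that a different build invocation can be substituted without changing the function.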

Depending on the system, some of the recommended compilers may not be available to you. You can override the C and Fortran compilers used by editing

src/makefile

Uncomment USER_C and set FC and CC to the compilers you wish to use. For example:

Before:

#USER_C=1
FC=f77
CC=gcc

After (an example; you may choose different compilers):

USER_C=1
FC=gfortran
CC=gcc

Note that not all compiler combinations have been thoroughly tested. You may encounter build errors with untested compiler combinations. If you encounter any errors while building the platform, consult the Troubleshooting section at the end of this user guide for known issues and their solutions.

Once the platform has been successfully built, you can move on to running the tests to verify that all components are working correctly.