Difference between revisions of "CyberShake Code Base"
Line 177: | Line 177: | ||
<b>Typical run configuration:</b> Parallel on ~1500 cores; for 5 billion points and the C version of UCVM, takes about 16 minutes. | <b>Typical run configuration:</b> Parallel on ~1500 cores; for 5 billion points and the C version of UCVM, takes about 16 minutes. | ||
− | <b>Input files:</b> [[CyberShake_Code_Base#AWP_format|AWP format velocity file]], [[CyberShake_Code_Base# | + | <b>Input files:</b> [[CyberShake_Code_Base#AWP_format|AWP format velocity file]], [[CyberShake_Code_Base#Gridout|gridout]], [[CyberShake_Code_Base#Coord|coord]] |
<b>Output files:</b> [[CyberShake_Code_Base#AWP_format|AWP format]] smoothed velocity file. | <b>Output files:</b> [[CyberShake_Code_Base#AWP_format|AWP format]] smoothed velocity file. |
Revision as of 19:55, 19 October 2017
This page details all the pieces of code which make up the CyberShake code base, as of November 2017. Note that this does not include the workflow middleware, or the workflow generators; that code is detailed at CyberShake Workflow Framework.
Conceptually, we can divide up the CyberShake codes into three categories:
- Strain Green Tensor-related codes: These codes produce the input files needed to generate SGTs, actually calculate the SGTs, and do some reformatting and sanity checks on the results.
- Synthesis-related codes: These codes take the SGTs and perform seismogram synthesis and intensity measure calculations.
- Data product codes: These codes insert the results into the database, and use the database to generate a variety of output data products.
Below is a description of each piece of software we use, organized by these categories. For each piece of software, we include a description of where it is located, how to compile and use it, and what its inputs and outputs are. At the end, we provide a description of input and output files and formats.
Contents
Code Installation
Historically, we have selected a root directory for CyberShake, then created the subdirectories 'software' for all the code, 'ruptures' for the rupture files, and 'utils' for workflow tools. Each code listed below, along with the configuration file, should be checked out into the 'software' subdirectory.
Configuration file
Many CyberShake codes use a configuration file, which specifies the root directory for the CyberShake installation, the command use to start an MPI executable, paths to a tmp and scratch space (which can be the same), and the path to the CyberShake rupture directory. We have done this instead of environment variables because it's more transparent and easier for multiple users. Both of these files should be stored in the 'software' subdirectory.
The configuration file is available at:
http://source.usc.edu/svn/cybershake/import/trunk/cybershake.cfg
Obviously, this file must be edited to be correct for the install.
Additionally, you must check out a Python script which is used to read in the configuration file and deliver it as key-value pairs, located here:
http://source.usc.edu/svn/cybershake/import/trunk/config.py
Several CyberShake codes import config, then use it to read out the cybershake.cfg file.
PreCVM
This code stands for "Pre-Community-Velocity-Model". It has to be run before the UCVM codes, since it generates input files required by UCVM.
Purpose: To determine the simulation volume for a particular CyberShake site.
Detailed description: PreCVM queries the CyberShake database to determine all of the ruptures which fall within a given cutoff for a certain site. From that information, padding is added around the edges to construct the CyberShake simulation volume for this site. Additional padding so the X and Y dimensions are multiples of 10, 20, or 40 might also be applied, depending on the input parameters. Using this volume, both the X/Y offset of each grid point, and then the latitude and longitude using a great circle projection, are determined and written to output files.
Needs to be changed if:
- The CyberShake volume depth needs to be changed, so as to have the right number of grid points. That is set in the genGrid() function in GenGrid_py/gen_grid.py.
- X and Y padding needs to be altered. That is set using 'bound_pad' in Modelbox/get_modelbox.py, around line 70.
- The rotation of the simulation volume needs to be changed. That is set using 'model_rot' in Modelbox/get_modelbox.py, around line 70.
- The database access parameters have changed. That's in Modelbox/get_modelbox.py, around line 80.
- The divisibility needs for GPU simulations change (currently, we need the dimensions to be evenly divisible by the number of GPUs used in that dimension. That is in Modelbox/get_modelbox.py, around line 250.
Source code location: http://source.usc.edu/svn/cybershake/import/trunk/PreCVM/
Author: Rob Graves, wrapped by Scott Callaghan
Dependencies: Getpar
Executable chain:
pre_cvm.py Modelbox/get_modelbox.py Modelbox/bin/gcproj GenGrid_py/gen_grid.py GenGrid_py/bin/gen_model_cords
Compile instructions:Run 'make' in the Modelbox/src and the Getpar_py/src directories.
Usage:
Usage: pre_cvm.py [options] Options: -h, --help show this help message and exit --site=SITE Site name --erf_id=ERF_ID ERF ID --modelbox=MODELBOX Path to modelbox file (output) --gridfile=GRIDFILE Path to gridfile (output) --gridout=GRIDOUT Path to gridout (output) --coordfile=COORDSFILE Path to coorfile (output) --paramsfile=PARAMSFILE Path to paramsfile (output) --boundsfile=BOUNDSFILE Path to boundsfile (output) --frequency=FREQUENCY Frequency --gpu Use GPU box settings. --spacing=SPACING Override default spacing with this value. --server=SERVER Address of server to query in creating modelbox, default is focal.usc.edu.
Typical run configuration: Serial; requires 6 minutes for 100m spacing, 10 billion point volume
Input files: None; inputs are retrieved from the database
Output files: modelbox, gridfile, gridout, params, coord, bounds
UCVM
Purpose: To generate a populated velocity mesh for a CyberShake simulation volume.
Detailed description: UCVM takes the volume defined by PreCVM and queries the UCVM software to populate the volume. The resulting mesh is then checked for Vp/Vs ratio, minimum Vp/Vs/rho, and for no Infs or NaNs. The data is outputted in either Graves (RWG) format or AWP format.
Needs to be changed if:
- New velocity models are added. Velocity models are specified in the DAX and passed through the wrapper scripts into the C code and then ultimately to UCVM, so an if statement must be added to around line 250 (and around line 450 if it's applicable for no GTL).
- The backend UCVM substantially changes. If we move to the Python implementation, for example.
Source code location: http://source.usc.edu/svn/cybershake/import/trunk/UCVM
Author: Scott Callaghan
Dependencies: Getpar, UCVM
Executable chain:
single_exe.py single_csh.py bin/ucvm-single-mpi
Compile instructions:Run 'make' in the UCVM/src directory.
Usage:
All of site, gridout, modelcords, models, and format must be specified. Usage: single_exe.py [options] Options: -h, --help show this help message and exit --site=SITE Site name --gridout=GRIDOUT Path to gridout (output) --coordfile=COORDSFILE Path to coordfile (output) --models=MODELS Comma-separated string on velocity models to use. --format=FORMAT Specify awp or rwg format for output. --frequency=FREQUENCY Frequency --spacing=SPACING Override default spacing with this value (km) --min_vs=MIN_VS Override minimum Vs value. Minimum Vp and minimum density will be 3.4 times this value.
Typical run configuration: Parallel on ~4000 cores; for 10 billion points and the C version of UCVM, takes about 20 minutes. Typically only half the cores per node are used to get more memory per process.
Output files: either RWG format or AWP format, depending on the option selected.
Smoothing
Purpose: To smooth a velocity file.
Detailed description: The smoothing code takes in a velocity mesh, determines the surface coordinates of the interfaces between velocity models, gets a list of all the points which need to be smoothed, and then performs the smoothing by averaging in both the X and Y direction for a user-specified number of points (default of 10km in each direction).
Needs to be changed if:
- We change our version of UCVM. The LD_LIBRARY_PATH needs to be modified, in run_smoothing.py around line 98.
- The smoothing algorithm is modified. Currently that is specified in the average_point() function in smooth_mpi.c.
- We start using velocity models with boundaries aren't perpendicular to the earth's surface.
Source code location: http://source.usc.edu/svn/cybershake/import/trunk/UCVM/smoothing
Author: Scott Callaghan
Dependencies: UCVM
Executable chain:
smoothing/run_smoothing.py bin/determine_surface_model smoothing/determine_smoothing_points.py smoothing/smooth_mpi
Compile instructions:Run 'make' in the smoothing directory, and make sure that direct_surface_model has been compiled in the UCVM/src directory.
Usage:
Usage: run_smoothing.py [options] Options: -h, --help show this help message and exit --gridout=GRIDOUT gridout file --coords=COORDS coords file --models=MODELSTRING comma-separated list of velocity models --smoothing-dist=SMOOTHING_DIST Number of grid points to smooth over. About 10km of grid points is a good starting place. --mesh=MESH AWP-format velocity mesh to smooth --mesh-out=MESH_OUT Output smoothed mesh
Typical run configuration: Parallel on ~1500 cores; for 5 billion points and the C version of UCVM, takes about 16 minutes.
Input files: AWP format velocity file, gridout, coord
Output files: AWP format smoothed velocity file.
File types
Modelbox
Purpose: Contains a description of the simulation box, at the surface.
Filename convention: <site>.modelbox
Format:
<site name> APPROXIMATE CENTROID: clon= <centroid lon> clat =<centroid lat> MODEL PARAMETERS: mlon= <model lon> mlat =<model lat> mrot=<model rot, default -55> xlen= <x-length in km> ylen= <y-length in km> MODEL CORNERS: <lon 1> <lat 1> (x= 0.000 y= 0.000) <lon 2> <lat 2> (x= <max x> y= 0.000) <lon 3> <lat 3> (x= <max x> y= <max y>) <lon 4> <lat 4> (x= 0.000 y= <max y>)
Generated by: PreCVM
Used by:
Gridfile
Purpose: Specify the three dimensions, and gridspacing in each dimension, of the volume.
Filename convention: gridfile_<site>
Format:
xlen=<x-length in km> 0.0 <x-length> <grid spacing in km> ylen=<y-length in km> 0.0 <y-length> <grid spacing in km> zlen=<z-length in km> 0.0 <z-length> <grid spacing in km>
Gridout
Purpose: Specify the km offsets for each grid index, in X, Y, and Z, from the upper southwest corner.
Filename convention: gridout_<site>
Format:
xlen=<x-length in km> nx=<number of gridpoints in X direction> 0 0 <grid spacing> 1 <grid spacing> <grid spacing> 2 <2*grid spacing> <grid spacing> 3 <3*grid spacing> <grid spacing> ... nx-1 <(nx-1)*grid spacing> <grid spacing> ylen=<y-length in km> ny=<number of gridpoints in Y direction> 0 0 <grid spacing> 1 <grid spacing> <grid spacing> ... ny-1 <(ny-1)*grid spacing> <grid spacing> zlen=<z-length in km> nz=<number of gridpoints in Z direction> 0 0 <grid spacing> 1 <grid spacing> <grid spacing> ... nz-1 <(nz-1)*grid spacing> <grid spacing>
Generated by: PreCVM
Used by: UCVM
Params
Purpose: Succinctly specify the parameters for the CyberShake volume. Similar information to the modelbox file, but in a different format.
Filename convention: model_params_GC_<site> (GC stands for 'great circle', the projection we use).
Format:
Model origin coordinates: lon= <model lon> lat= <model lat> rotate= <model rotation, default -55> Model origin shift (cartesian vs. geographic): xshift(km)= <x shift, usually half the x-length minus 1 grid spacing> yshift(km)= <y-shift, usually half the y-length minus 1 grid spacing> Model corners: c1= <nw lon> <nw lat> c2= <ne lon> <ne lat> c3= <se lon> <se lat> c4= <sw lon> <sw lat> Model Dimensions: xlen= <x-length> km ylen= <y-length> km zlen= <z-length> km
Generated by: PreCVM
Used by:
Coord
Purpose: Specify the mapping of latitude and longitude to X and Y offsets, for each point on the surface.
Filename convention: model_coords_GC_<site> (GC stands for 'great circle', the projection we use).
Format:
<lon> <lat> 0 0 <lon> <lat> 1 0 <lon> <lat> 2 0 ... <lon> <lat> <nx-1> 0 <lon> <lat> 0 1 ... <lon> <lat> <nx-1> 1 ... <lon> <lat> <nx-1> <ny-1>
Generated by: PreCVM
Used by: UCVM
Bounds
Purpose: Specify the mapping of latitude and longitude to X and Y offsets, but only for the points along the boundary. A subset of the coord file.
Filename convention: model_bounds_GC_<site> (GC stands for 'great circle', the projection we use).
Format:
<lon> <lat> 0 0 <lon> <lat> 1 0 <lon> <lat> 2 0 ... <lon> <lat> <nx-1> 0 <lon> <lat> 0 1 <lon> <lat> <nx-1> 1 <lon> <lat> 0 2 <lon> <lat> <nx-1> 2 ... <lon> <lat> 0 <ny-1> <lon> <lat> 1 <ny-1> ... <lon> <lat> <nx-1> <ny-1>
Generated by: PreCVM
Used by:
Velocity files
RWG format
Purpose: Input velocity files for the RWG wave propagation code, emod3d.
Filename convention: v_sgt-<site>.<p, s, or d>
Format: 3 files, one each for Vp (*.p), Vs (*.s), and rho (*.d). Each is binary, with 4-byte floats, in fast X, Z (surface down), slow Y order.
Generated by: UCVM
Used by:
AWP format
Purpose: Input velocity file for the AWP-ODC wave propagation code.
Filename convention: awp.<site>.media
Format: Binary, with 4-byte floats, in fast Y, X, slow Z (surface down) order.
Generated by: UCVM
Used by: