CyberShake Code Base
Revision as of 21:51, 27 October 2017
This page details all the pieces of code which make up the CyberShake code base, as of November 2017. Note that this does not include the workflow middleware, or the workflow generators; that code is detailed at CyberShake Workflow Framework.
Conceptually, we can divide up the CyberShake codes into three categories:
- Strain Green Tensor-related codes: These codes produce the input files needed to generate SGTs, actually calculate the SGTs, and do some reformatting and sanity checks on the results.
- Synthesis-related codes: These codes take the SGTs and perform seismogram synthesis and intensity measure calculations.
- Data product codes: These codes insert the results into the database, and use the database to generate a variety of output data products.
Below is a description of each piece of software we use, organized by these categories. For each piece of software, we include a description of where it is located, how to compile and use it, and what its inputs and outputs are. At the end, we provide a description of input and output files and formats.
Contents
- 1 Code Installation
- 2 SGT-related codes
- 3 PP-related codes
- 4 File types
- 5 Dependencies
Code Installation
Historically, we have selected a root directory for CyberShake, then created the subdirectories 'software' for all the code, 'ruptures' for the rupture files, and 'utils' for workflow tools. Each code listed below, along with the configuration file, should be checked out into the 'software' subdirectory.
Configuration file
Many CyberShake codes use a configuration file, which specifies the root directory for the CyberShake installation, the command used to start an MPI executable, paths to tmp and scratch space (which can be the same), and the path to the CyberShake rupture directory. We use a file instead of environment variables because it is more transparent and easier to manage with multiple users. Both the configuration file and the Python script that reads it (described below) should be stored in the 'software' subdirectory.
The configuration file is available at:
http://source.usc.edu/svn/cybershake/import/trunk/cybershake.cfg
This file must be edited to match your installation.
Additionally, you must check out a Python script which is used to read in the configuration file and deliver it as key-value pairs, located here:
http://source.usc.edu/svn/cybershake/import/trunk/config.py
Several CyberShake codes import config, then use it to read out the cybershake.cfg file.
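As a rough illustration of the key-value approach, the sketch below parses lines of the form 'KEY = value'. This is not the actual config.py; the line format, the function name parse_config_lines, and the example keys CS_PATH and MPI_CMD are all assumptions made for illustration.

```python
def parse_config_lines(lines):
    """Turn 'KEY = value' lines into a dict.

    Assumed format (not confirmed by the real cybershake.cfg):
    one pair per line, blank lines and '#' comments ignored.
    """
    config = {}
    for raw in lines:
        line = raw.strip()
        if not line or line.startswith("#"):
            continue
        key, sep, value = line.partition("=")
        if not sep:
            # Not a key-value line; skip it rather than guess.
            continue
        config[key.strip()] = value.strip()
    return config


# Hypothetical keys, for illustration only.
example = ["# root of the install", "CS_PATH = /path/to/cybershake", "MPI_CMD=aprun"]
settings = parse_config_lines(example)
print(settings["MPI_CMD"])
```

Reading a real file would just mean passing an open file object instead of a list, since both yield lines.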
PreCVM
PreCVM stands for "Pre-Community-Velocity-Model". It has to be run before the UCVM codes, since it generates input files required by UCVM.
Purpose: To determine the simulation volume for a particular CyberShake site.
Detailed description: PreCVM queries the CyberShake database to determine all of the ruptures which fall within a given cutoff for a certain site. From that information, padding is added around the edges to construct the CyberShake simulation volume for this site. Additional padding so the X and Y dimensions are multiples of 10, 20, or 40 might also be applied, depending on the input parameters. Using this volume, both the X/Y offset of each grid point, and then the latitude and longitude using a great circle projection, are determined and written to output files.
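The padding to a multiple of 10, 20, or 40 can be sketched as a simple round-up. This is only an illustration of the arithmetic, not the code in Modelbox/get_modelbox.py; the function name pad_to_multiple is hypothetical.

```python
def pad_to_multiple(n_points, multiple):
    """Round n_points up to the next value evenly divisible by multiple."""
    if multiple <= 0:
        raise ValueError("multiple must be positive")
    remainder = n_points % multiple
    if remainder == 0:
        return n_points
    return n_points + (multiple - remainder)


# A dimension of 1234 grid points padded so it divides evenly by 20.
print(pad_to_multiple(1234, 20))
```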
Needs to be changed if:
- The CyberShake volume depth needs to be changed, so as to have the right number of grid points. That is set in the genGrid() function in GenGrid_py/gen_grid.py.
- X and Y padding needs to be altered. That is set using 'bound_pad' in Modelbox/get_modelbox.py, around line 70.
- The rotation of the simulation volume needs to be changed. That is set using 'model_rot' in Modelbox/get_modelbox.py, around line 70.
- The database access parameters have changed. That's in Modelbox/get_modelbox.py, around line 80.
- The divisibility needs for GPU simulations change (currently, we need the dimensions to be evenly divisible by the number of GPUs used in that dimension). That is in Modelbox/get_modelbox.py, around line 250.
Source code location: http://source.usc.edu/svn/cybershake/import/trunk/PreCVM/
Author: Rob Graves, wrapped by Scott Callaghan
Dependencies: Getpar, MySQLdb for Python
Executable chain:
pre_cvm.py Modelbox/get_modelbox.py Modelbox/bin/gcproj GenGrid_py/gen_grid.py GenGrid_py/bin/gen_model_cords
Compile instructions: Run 'make' in the Modelbox/src and the GenGrid_py/src directories.
Usage:
Usage: pre_cvm.py [options]
Options:
  -h, --help               show this help message and exit
  --site=SITE              Site name
  --erf_id=ERF_ID          ERF ID
  --modelbox=MODELBOX      Path to modelbox file (output)
  --gridfile=GRIDFILE      Path to gridfile (output)
  --gridout=GRIDOUT        Path to gridout (output)
  --coordfile=COORDSFILE   Path to coorfile (output)
  --paramsfile=PARAMSFILE  Path to paramsfile (output)
  --boundsfile=BOUNDSFILE  Path to boundsfile (output)
  --frequency=FREQUENCY    Frequency
  --gpu                    Use GPU box settings.
  --spacing=SPACING        Override default spacing with this value.
  --server=SERVER          Address of server to query in creating modelbox, default is focal.usc.edu.
Typical run configuration: Serial; requires 6 minutes for 100m spacing, 10 billion point volume
Input files: None; inputs are retrieved from the database
Output files: modelbox, gridfile, gridout, params, coord, bounds
UCVM
Purpose: To generate a populated velocity mesh for a CyberShake simulation volume.
Detailed description: UCVM takes the volume defined by PreCVM and queries the UCVM software to populate the volume. The resulting mesh is then checked for Vp/Vs ratio, for minimum Vp, Vs, and rho, and for the absence of Infs and NaNs. The data is output in either Graves (RWG) format or AWP format.
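The mesh checks can be pictured with the per-point sketch below. It is not the check implemented in the UCVM C code: the function name check_point, the message strings, and every threshold passed in are assumptions chosen for illustration.

```python
import math


def check_point(vp, vs, rho, min_vp, min_vs, min_rho, min_ratio):
    """Return a list of problems found at one mesh point (empty if none).

    All thresholds are supplied by the caller; none of the values used
    in the example below come from the CyberShake code.
    """
    problems = []
    for name, value in (("vp", vp), ("vs", vs), ("rho", rho)):
        if not math.isfinite(value):
            problems.append(name + " is not finite")
    if problems:
        # Comparisons against Inf/NaN are meaningless, so stop here.
        return problems
    if vp < min_vp:
        problems.append("vp below minimum")
    if vs < min_vs:
        problems.append("vs below minimum")
    if rho < min_rho:
        problems.append("rho below minimum")
    if vs <= 0 or vp / vs < min_ratio:
        problems.append("vp/vs ratio below minimum")
    return problems


# Hypothetical thresholds, for illustration only.
print(check_point(5000.0, 2500.0, 2500.0, 1700.0, 500.0, 1700.0, 1.45))
```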
Needs to be changed if:
- New velocity models are added. Velocity models are specified in the DAX and passed through the wrapper scripts into the C code and then ultimately to UCVM, so an if statement must be added around line 250 (and around line 450 if it's applicable for no GTL).
- The backend UCVM substantially changes, for example if we move to the Python implementation.
Source code location: http://source.usc.edu/svn/cybershake/import/trunk/UCVM
Author: Scott Callaghan
Executable chain:
single_exe.py single_csh.py bin/ucvm-single-mpi
Compile instructions: Run 'make' in the UCVM/src directory.
Usage:
All of site, gridout, modelcords, models, and format must be specified.
Usage: single_exe.py [options]
Options:
  -h, --help              show this help message and exit
  --site=SITE             Site name
  --gridout=GRIDOUT       Path to gridout (output)
  --coordfile=COORDSFILE  Path to coordfile (output)
  --models=MODELS         Comma-separated string on velocity models to use.
  --format=FORMAT         Specify awp or rwg format for output.
  --frequency=FREQUENCY   Frequency
  --spacing=SPACING       Override default spacing with this value (km)
  --min_vs=MIN_VS         Override minimum Vs value. Minimum Vp and minimum density will be 3.4 times this value.
Typical run configuration: Parallel on ~4000 cores; for 10 billion points and the C version of UCVM, takes about 20 minutes. Typically only half the cores per node are used to get more memory per process.
Output files: either RWG format or AWP format, depending on the option selected.
Smoothing
Purpose: To smooth a velocity file along model interfaces.
Detailed description: The smoothing code takes in a velocity mesh, determines the surface coordinates of the interfaces between velocity models, gets a list of all the points which need to be smoothed, and then performs the smoothing by averaging in both the X and Y direction for a user-specified number of points (default of 10km in each direction).
Needs to be changed if:
- We change our version of UCVM. The LD_LIBRARY_PATH needs to be modified, in run_smoothing.py around line 98.
- The smoothing algorithm is modified. Currently that is specified in the average_point() function in smooth_mpi.c.
- We start using velocity models whose boundaries aren't perpendicular to the earth's surface.
Source code location: http://source.usc.edu/svn/cybershake/import/trunk/UCVM/smoothing
Author: Scott Callaghan
Dependencies: UCVM
Executable chain:
smoothing/run_smoothing.py bin/determine_surface_model smoothing/determine_smoothing_points.py smoothing/smooth_mpi
Compile instructions: Run 'make' in the smoothing directory, and make sure that determine_surface_model has been compiled in the UCVM/src directory.
Usage:
Usage: run_smoothing.py [options]
Options:
  -h, --help                       show this help message and exit
  --gridout=GRIDOUT                gridout file
  --coords=COORDS                  coords file
  --models=MODELSTRING             comma-separated list of velocity models
  --smoothing-dist=SMOOTHING_DIST  Number of grid points to smooth over. About 10km of grid points is a good starting place.
  --mesh=MESH                      AWP-format velocity mesh to smooth
  --mesh-out=MESH_OUT              Output smoothed mesh
Typical run configuration: Parallel on ~1500 cores; for 5 billion points and the C version of UCVM, takes about 16 minutes.
Input files: AWP format velocity file, gridout, coord
Output files: AWP format smoothed velocity file.
PreSGT
Purpose: To generate a series of input files which are used by the wave propagation codes.
Detailed description: PreSGT determines the X and Y coordinates of the site location (where the impulse will go for the wave propagation simulation) and determines which mesh point (X and Y) maps most closely to every point on a fault surface which is within the cutoff. That information is combined with an adaptive mesh approach to create a list of all the points for which SGTs should be saved.
Needs to be changed if:
- We change our approach for saving adaptive mesh points.
- We switch to RSQSim ruptures, or other ruptures in which the geometry isn't planar. Modifications would be required to gen_sgtgrid.c.
Source code location: http://source.usc.edu/svn/cybershake/import/trunk/PreSGT
Author: Rob Graves, heavily modified by Scott Callaghan
Dependencies: Getpar, libcfu, MySQLdb for Python
Executable chain:
presgt.py faultlist_py/CreateFaultList.py bin/gen_sgtgrid
Compile instructions: Run 'make' in the src directory.
Usage:
Usage: ./presgt.py <site> <erf_id> <modelbox> <gridout> <model_coords> <fdloc> <faultlist> <radiusfile> <sgtcords> <spacing> [frequency] Example: ./presgt.py USC 33 USC.modelbox gridout_USC model_coords_GC_USC USC.fdloc USC.faultlist USC.radiusfile USC.cordfile 200.0 0.1
Typical run configuration: Parallel on 8 nodes, 32 cores (gen_sgtgrid is a parallel code); for 200m spacing UCERF2, takes about 8 minutes.
Input files: modelbox, gridout, coord
Output files: fdloc, faultlist, radiusfile, sgtcoords.
PreAWP
Purpose: To generate input files in a format that AWP-ODC expects.
Detailed description: PreAWP uses the input files to produce an IN3D parameter file, a file with the SGT coordinates to save, and a velocity file in the right format (if it isn't already). Striping for the output file is also set up here, and files are symlinked into the directory structure that AWP expects. Note that slightly different versions of this exist for the CPU and GPU implementations of AWP-ODC-SGT.
Needs to be changed if:
- The AWP code changes its input format.
Source code location: http://source.usc.edu/svn/cybershake/import/trunk/AWP-GPU-SGT/utils/ (GPU) or http://source.usc.edu/svn/cybershake/import/trunk/AWP-ODC-SGT/utils/ (CPU)
Author: Scott Callaghan
Dependencies: SgtHead
Executable chain:
build_awp_inputs.py build_IN3D.py build_src.py build_cordfile.py SgtHead/gen_awp_cordfile.py build_media.py SgtHead/bin/reformat_velocity
Compile instructions: Run 'make' in the SgtHead/src directory.
Usage:
Usage: build_awp_inputs.py [options]
Options:
  -h, --help                      show this help message and exit
  --site=SITE                     Site name
  --gridout=GRIDOUT               Path to gridout input file
  --fdloc=FDLOC                   Path to fdloc input file
  --cordfile=CORDFILE             Path to cordfile input file
  --velocity-prefix=VEL_PREFIX    RWG velocity prefix. If omitted, will not reformat velocity file, just symlink.
  --frequency=FREQUENCY           Frequency of SGT run, 0.5 Hz by default.
  --px=PX                         Number of processors in X-direction.
  --py=PY                         Number of processors in Y-direction.
  --pz=PZ                         Number of processors in Z-direction.
  --source-frequency=SOURCE_FREQ  Low-pass filter frequency to use on the source, default is same frequency as the frequency of the run.
  --spacing=SPACING               Override default spacing, derived from frequency.
  --velocity-mesh=VEL_MESH        Provide path to velocity mesh. If omitted, will assume mesh is named awp.<site>.media.
Typical run configuration: Serial; for 1 Hz run, takes about 11 minutes.
Input files: gridout, fdloc, cordfile, velocity mesh (if in RWG format, will be converted to AWP), RWG source
Output files: IN3D, AWP source, AWP velocity mesh, AWP cordfile.
AWP-ODC-SGT, CPU version
Purpose: To perform SGT synthesis
Detailed description: This is the CPU version of AWP-ODC-SGT. It uses the IN3D file for its parameters.
Needs to be changed if:
- New science or features are added to the AWP code.
Source code location: http://source.usc.edu/svn/cybershake/import/trunk/AWP-ODC-SGT
Author: Kim Olsen, Steve Day, Yifeng Cui, various students and post-docs, wrapped by Scott Callaghan
Dependencies: iobuf module
Executable chain:
awp_odc_wrapper.sh bin/pmcl3d
Compile instructions: Using the GNU compilers, run 'make' in the src directory.
Usage:
pmcl3d <IN3D parameter file>
Typical run configuration: Parallel; for 0.5 Hz run (2 billion points, 20k timesteps), takes about 45 minutes on 10,000 cores.
Input files: IN3D, AWP cordfile, AWP velocity mesh, AWP source
Output files: AWP SGT file.
AWP-ODC-SGT, GPU version
Purpose: To perform SGT synthesis
Detailed description: This is the GPU version of AWP-ODC-SGT. It takes parameters on the command line, so the wrapper converts the IN3D file into command-line arguments and invokes it.
Needs to be changed if:
- New science or features are added to the AWP code.
Source code location: http://source.usc.edu/svn/cybershake/import/trunk/AWP-GPU-SGT
Author: Kim Olsen, Steve Day, Yifeng Cui, various students and post-docs, wrapped by Scott Callaghan
Dependencies: CUDA toolkit module
Executable chain:
gpu_wrapper.py bin/pmcl3d
Compile instructions: The PrgEnv-gnu and cudatoolkit modules must be loaded first. Then, run 'make' in the src directory.
Usage:
Usage: ./pmcl3d
Options:
  [(-T | --TMAX) <TMAX>]
  [(-H | --DH) <DH>]
  [(-t | --DT) <DT>]
  [(-A | --ARBC) <ARBC>]
  [(-P | --PHT) <PHT>]
  [(-M | --NPC) <NPC>]
  [(-D | --ND) <ND>]
  [(-S | --NSRC) <NSRC>]
  [(-N | --NST) <NST>]
  [(-V | --NVE) <NVE>]
  [(-B | --MEDIASTART) <MEDIASTART>]
  [(-n | --NVAR) <NVAR>]
  [(-I | --IFAULT) <IFAULT>]
  [(-R | --READ_STEP) <x READ_STEP]
  [(-X | --NX) <x length]
  [(-Y | --NY) <y length>]
  [(-Z | --NZ) <z length]
  [(-x | --NPX) <x processors]
  [(-y | --NPY) <y processors>]
  [(-z | --NPZ) <z processors>]
  [(-1 | --NBGX) <starting point to record in X>]
  [(-2 | --NEDX) <ending point to record in X>]
  [(-3 | --NSKPX) <skipping points to record in X>]
  [(-11 | --NBGY) <starting point to record in Y>]
  [(-12 | --NEDY) <ending point to record in Y>]
  [(-13 | --NSKPY) <skipping points to record in Y>]
  [(-21 | --NBGZ) <starting point to record in Z>]
  [(-22 | --NEDZ) <ending point to record in Z>]
  [(-23 | --NSKPZ) <skipping points to record in Z>]
  [(-i | --IDYNA) <i IDYNA>]
  [(-s | --SoCalQ) <s SoCalQ>]
  [(-l | --FL) <l FL>]
  [(-h | --FH) <i FH>]
  [(-p | --FP) <p FP>]
  [(-r | --NTISKP) <time skipping in writing>]
  [(-W | --WRITE_STEP) <time aggregation in writing>]
  [(-100 | --INSRC) <source file>]
  [(-101 | --INVEL) <mesh file>]
  [(-o | --OUT) <output file>]
  [(-c | --CHKFILE) <checkpoint file to write statistics>]
  [(-G | --IGREEN) <IGREEN for SGT>]
  [(-200 | --NTISKP_SGT) <NTISKP for SGT>]
  [(-201 | --INSGT) <SGT input file>]
Typical run configuration: Parallel; for 1 Hz run (10 billion points, 40k timesteps), takes about 55 minutes on 800 GPUs.
Input files: IN3D, AWP cordfile, AWP velocity mesh, AWP source
Output files: AWP SGT file.
PostAWP
Purpose: To prepare the AWP results for use in post-processing.
Detailed description: PostAWP reformats the AWP output files into the SGT component order expected by RWG (since we swap X and Y between RWG and AWP), creates separate SGT header files, and calculates MD5 sums on the SGT files. Calculating the header information requires a number of input files, since lambda, mu, and the location of the impulse must all be included. The MD5 sums can be calculated separately, using the MD5 wrapper RunMD5sum.
Needs to be changed if:
- The AWP code is modified to produce outputs in exactly RWG order
- The header format for the post-processing code changes
- We decide not to calculate MD5 sums
Source code location: http://source.usc.edu/svn/cybershake/import/trunk/AWP-GPU-SGT/utils/prepare_for_pp.py (this will work for the CPU version of AWP also, despite the path); http://source.usc.edu/svn/cybershake/import/trunk/software/SgtHead
Author: Scott Callaghan
Dependencies: Getpar
Executable chain:
AWP-GPU-SGT/utils/prepare_for_pp.py SgtHead/bin/reformat_awp_mpi SgtHead/bin/write_head
Compile instructions: Run 'make write_head' and 'make reformat_awp_mpi' in the SgtHead/src directory.
Usage:
Usage: ./prepare_for_pp.py <site> <AWP SGT> <reformatted SGT filename> <modelbox file> <rwg cordfile> <fdloc file> <gridout file> <IN3D file> <AWP media file> <component> <run_id> <header> [frequency]
Typical run configuration: Parallel, 4 processors on 2 nodes; for a 750 GB SGT, takes about 100 minutes without the MD5 sums.
Input files: AWP SGT file, modelbox, RWG cordfile, fdloc, IN3D, AWP velocity mesh
Output files: RWG SGT file, SGT header file
RunMD5sum
Purpose: Wrapper for performing MD5sums.
Detailed description: On Titan, we ran into wallclock issues when bundling the MD5sums along with PostAWP. This wrapper supports performing the MD5 sums separately.
Needs to be changed if:
- We change hash algorithms
Source code location: http://source.usc.edu/svn/cybershake/import/trunk/SgtHead/run_md5sum.sh
Author: Scott Callaghan
Dependencies: none
Executable chain:
run_md5sum.sh
Compile instructions: none
Usage:
Usage: ./run_md5sum.sh <file>
Typical run configuration: Serial; for a 750 GB SGT, takes about 70 minutes.
Input files: RWG SGT file
Output files: MD5sum, with filename <RWG SGT filename>.md5
NanCheck
Purpose: Check the SGTs for anomalies before the post-processing.
Detailed description: This code checks to be sure the SGTs are the expected size, then checks for NaNs or too many consecutive zeros in the SGT files.
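The NaN and consecutive-zero checks can be sketched as a single pass over the values. This is not the logic of bin/check_for_nans; the function name scan_values and the max_zero_run threshold are assumptions for illustration.

```python
import math


def scan_values(values, max_zero_run):
    """Count NaNs and find the longest run of consecutive zeros.

    Returns (ok, nan_count, longest_zero_run), where ok is True only if
    there are no NaNs and no zero run longer than max_zero_run.
    """
    nan_count = 0
    longest = 0
    current = 0
    for value in values:
        if math.isnan(value):
            nan_count += 1
            current = 0
        elif value == 0.0:
            current += 1
            longest = max(longest, current)
        else:
            current = 0
    ok = nan_count == 0 and longest <= max_zero_run
    return ok, nan_count, longest


# Hypothetical threshold of 2 consecutive zeros, for illustration only.
print(scan_values([1.0, 0.0, 0.0, 2.0], 2))
```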
Needs to be changed if:
- We change the number of timesteps in the SGT file. Currently this is hardcoded, but it should be a command-line parameter.
- We want to add additional checks.
Source code location: http://source.usc.edu/svn/cybershake/import/trunk/SgtTest/
Author: Rob Graves, Scott Callaghan
Dependencies: Getpar
Executable chain:
perform_checks.py bin/check_for_nans
Compile instructions: Run 'make' in SgtTest/src.
Usage:
Usage: ./perform_checks.py <SGT file> <SGT header file>
Typical run configuration: Serial; for a 750 GB SGT, takes about 45 minutes.
Input files: RWG SGT file, SGT header file
Output files: none
The following codes are related to the post-processing part of the workflow.
CheckSgt
Purpose: To check the MD5 sums of the SGT files to be sure they match.
Detailed description: CheckSgt takes the SGT files and their corresponding MD5 sums and checks for agreement.
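An MD5 agreement check of this kind can be sketched with Python's hashlib, reading the file in chunks so that a very large SGT never has to fit in memory. This is not the code in CheckSgt.py. It assumes the .md5 file holds the hex digest as its first whitespace-separated token (as md5sum output does), and the function names are hypothetical.

```python
import hashlib


def md5_of_file(path, chunk_bytes=1 << 20):
    """Compute the MD5 hex digest of a file, one chunk at a time."""
    digest = hashlib.md5()
    with open(path, "rb") as fp:
        for chunk in iter(lambda: fp.read(chunk_bytes), b""):
            digest.update(chunk)
    return digest.hexdigest()


def matches_md5_file(data_path, md5_path):
    """Compare a file's digest with the one stored in md5_path.

    Assumption: the stored digest is the first token in the .md5 file.
    """
    with open(md5_path) as fp:
        expected = fp.read().split()[0]
    return md5_of_file(data_path) == expected.lower()
```

For example, matches_md5_file("site_fx.sgt", "site_fx.sgt.md5") would return True when the two agree; both filenames here are made up.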
Needs to be changed if:
- We change hashing algorithms.
- We decide to add additional sanity checks to the beginning of the post-processing.
Source code location: http://source.usc.edu/svn/cybershake/import/trunk/CheckSgt
Author: Scott Callaghan
Dependencies: none
Executable chain:
CheckSgt.py
Compile instructions: none
Usage:
Usage: ./CheckSgt.py <sgt file> <md5 file>
Typical run configuration: Serial; for a 750 GB SGT, takes about 90 minutes.
Input files: RWG SGT, SGT MD5 sums
Output files: None
DirectSynth
DirectSynth is the code we currently use to perform the post-processing. For historical reasons, all of the codes used for CyberShake post-processing are documented here: CyberShake post-processing options (login required).
Purpose: To perform reciprocity calculations and produce seismograms, intensity measures, and duration measures.
Detailed description: DirectSynth reads in the SGTs across a group of processes, and hands out tasks (synthesis jobs) to worker processes. These worker processes read in rupture geometry information from disk and call the RupGen-api to generate full slip histories in memory. The workers request SGTs from the reader processes over MPI. X and Y component PSA calculations are performed from the resultant seismograms, and RotD and duration calculations are also performed, if requested. More details about the approach used are available at DirectSynth.
Needs to be changed if:
- We have new intensity measures or other calculations per seismogram to perform.
- We decide to change the post-processing algorithm.
Source code location: http://source.usc.edu/svn/cybershake/import/trunk/DirectSynth, http://source.usc.edu/svn/cybershake/import/trunk/RuptureCodes/RupGen-api-3.3.1
Author: Scott Callaghan, original seismogram synthesis code by Rob Graves, X and Y component PSA code by David Okaya, RotD code by Christine Goulet
Dependencies: Getpar, libcfu, RupGen-api-v3.3.1, FFTW, libmemcached and memcached (optional)
Executable chain:
direct_synth_v3.3.1.py (current version, uses the Graves & Pitarka (2014) rupture generator) utils/pegasus_wrappers/invoke_memcached.sh memcached bin/direct_synth
Compile instructions: Run 'make' in RuptureCodes/RupGen-api-3.3.1/src to make the librupgen.a library. Then run 'make direct_synth_v3.3.1' in DirectSynth/src.
Usage:
direct_synth_v3.3.1.py
  stat=<site short name>
  slat=<site lat>
  slon=<site lon>
  run_id=<run id>
  sgt_handlers=<number of SGT handler processes; must be enough for the SGTs to be read into memory>
  debug=<print logs for each process; 1 is yes, 0 no>
  max_buf_mb=<buffer size in MB for each worker to use for storing SGT information>
  rupture_spacing=<'uniform' or 'random' hypocenter spacing>
  ntout=<nt for seismograms>
  dtout=<dt for seismograms>
  rup_list_file=<input file containing ruptures to process>
  sgt_xfile=<input SGT X file>
  sgt_yfile=<input SGT Y file>
  x_header=<input SGT X header>
  y_header=<input SGT Y header>
  det_max_freq=<maximum frequency of deterministic part>
  stoch_max_freq=<maximum frequency of stochastic part>
  run_psa=<'1' to run X and Y component PSA, '0' to not>
  run_rotd=<'1' to run RotD calculations, '0' to not>
  run_durations=<'1' to run duration calculation, '0' to not>
  simulation_out_pointsX=<'2', the number of components>
  simulation_out_pointsY=1
  simulation_out_timesamples=<same as ntout>
  simulation_out_timeskip=<same as dtout>
  surfseis_rspectra_seismogram_units=cmpersec
  surfseis_rspectra_output_units=cmpersec2
  surfseis_rspectra_output_type=aa
  surfseis_rspectra_period=all
  surfseis_rspectra_apply_filter_highHZ=<high filter, 5.0 for 1 Hz runs, 20.0 or higher for 10 Hz runs>
  surfseis_rspectra_apply_byteswap=no
Typical run configuration: Parallel, typically on 3840 processors; for 750 GB SGTs with ~7000 ruptures, takes about 12 hours.
Input files: RWG SGT, SGT headers, rupture list file, rupture geometry files
Output files: Seismograms, PSA files, RotD files, Duration files
File types
Modelbox
Purpose: Contains a description of the simulation box, at the surface.
Filename convention: <site>.modelbox
Format:
<site name>
APPROXIMATE CENTROID:
  clon= <centroid lon> clat =<centroid lat>
MODEL PARAMETERS:
  mlon= <model lon> mlat =<model lat> mrot=<model rot, default -55> xlen= <x-length in km> ylen= <y-length in km>
MODEL CORNERS:
  <lon 1> <lat 1> (x= 0.000 y= 0.000)
  <lon 2> <lat 2> (x= <max x> y= 0.000)
  <lon 3> <lat 3> (x= <max x> y= <max y>)
  <lon 4> <lat 4> (x= 0.000 y= <max y>)
Generated by: PreCVM
Used by: PreSGT, PostAWP
Gridfile
Purpose: Specify the three dimensions, and grid spacing in each dimension, of the volume.
Filename convention: gridfile_<site>
Format:
xlen=<x-length in km>
  0.0 <x-length> <grid spacing in km>
ylen=<y-length in km>
  0.0 <y-length> <grid spacing in km>
zlen=<z-length in km>
  0.0 <z-length> <grid spacing in km>
Gridout
Purpose: Specify the km offsets for each grid index, in X, Y, and Z, from the upper southwest corner.
Filename convention: gridout_<site>
Format:
xlen=<x-length in km>
nx=<number of gridpoints in X direction>
  0 0 <grid spacing>
  1 <grid spacing> <grid spacing>
  2 <2*grid spacing> <grid spacing>
  3 <3*grid spacing> <grid spacing>
  ...
  nx-1 <(nx-1)*grid spacing> <grid spacing>
ylen=<y-length in km>
ny=<number of gridpoints in Y direction>
  0 0 <grid spacing>
  1 <grid spacing> <grid spacing>
  ...
  ny-1 <(ny-1)*grid spacing> <grid spacing>
zlen=<z-length in km>
nz=<number of gridpoints in Z direction>
  0 0 <grid spacing>
  1 <grid spacing> <grid spacing>
  ...
  nz-1 <(nz-1)*grid spacing> <grid spacing>
Generated by: PreCVM
Used by: UCVM, smoothing, PreSGT, PreAWP
Params
Purpose: Succinctly specify the parameters for the CyberShake volume. Similar information to the modelbox file, but in a different format.
Filename convention: model_params_GC_<site> (GC stands for 'great circle', the projection we use).
Format:
Model origin coordinates:
  lon= <model lon> lat= <model lat> rotate= <model rotation, default -55>
Model origin shift (cartesian vs. geographic):
  xshift(km)= <x shift, usually half the x-length minus 1 grid spacing> yshift(km)= <y-shift, usually half the y-length minus 1 grid spacing>
Model corners:
  c1= <nw lon> <nw lat>
  c2= <ne lon> <ne lat>
  c3= <se lon> <se lat>
  c4= <sw lon> <sw lat>
Model Dimensions:
  xlen= <x-length> km
  ylen= <y-length> km
  zlen= <z-length> km
Generated by: PreCVM
Used by:
Coord
Purpose: Specify the mapping of latitude and longitude to X and Y offsets, for each point on the surface.
Filename convention: model_coords_GC_<site> (GC stands for 'great circle', the projection we use).
Format:
<lon> <lat> 0 0
<lon> <lat> 1 0
<lon> <lat> 2 0
...
<lon> <lat> <nx-1> 0
<lon> <lat> 0 1
...
<lon> <lat> <nx-1> 1
...
<lon> <lat> <nx-1> <ny-1>
Generated by: PreCVM
Used by: UCVM, smoothing, PreSGT
Bounds
Purpose: Specify the mapping of latitude and longitude to X and Y offsets, but only for the points along the boundary. A subset of the coord file.
Filename convention: model_bounds_GC_<site> (GC stands for 'great circle', the projection we use).
Format:
<lon> <lat> 0 0
<lon> <lat> 1 0
<lon> <lat> 2 0
...
<lon> <lat> <nx-1> 0
<lon> <lat> 0 1
<lon> <lat> <nx-1> 1
<lon> <lat> 0 2
<lon> <lat> <nx-1> 2
...
<lon> <lat> 0 <ny-1>
<lon> <lat> 1 <ny-1>
...
<lon> <lat> <nx-1> <ny-1>
Generated by: PreCVM
Used by:
Velocity files
RWG format
Purpose: Input velocity files for the RWG wave propagation code, emod3d.
Filename convention: v_sgt-<site>.<p, s, or d>
Format: 3 files, one each for Vp (*.p), Vs (*.s), and rho (*.d). Each is binary, with 4-byte floats, in fast X, Z (surface down), slow Y order.
Generated by: UCVM
Used by: PreAWP
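The RWG ordering (X fastest, then Z, then Y slowest) translates into a byte offset as in the sketch below. The function names are hypothetical, and the little-endian default is an assumption; the byte order of a given file depends on the machine that wrote it.

```python
import struct

FLOAT_BYTES = 4  # each value is a 4-byte float


def rwg_byte_offset(x, y, z, nx, nz):
    """Byte offset of grid point (x, y, z) in one RWG velocity file.

    X varies fastest, then Z (surface down), and Y varies slowest.
    """
    return FLOAT_BYTES * ((y * nz + z) * nx + x)


def read_rwg_value(fp, x, y, z, nx, nz, byte_order="<"):
    """Read a single value from an open binary file object.

    byte_order follows the struct module ('<' little, '>' big);
    the default is an assumption, not something the format dictates.
    """
    fp.seek(rwg_byte_offset(x, y, z, nx, nz))
    (value,) = struct.unpack(byte_order + "f", fp.read(FLOAT_BYTES))
    return value
```

The same pattern applies to the *.p, *.s, and *.d files, since each holds one value per grid point.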
AWP format
Purpose: Input velocity file for the AWP-ODC wave propagation code.
Filename convention: awp.<site>.media
Format: Binary, with 4-byte floats, in fast Y, X, slow Z (surface down) order.
Generated by: UCVM
Used by: Smoothing, PreAWP, PostAWP
Fdloc
Purpose: Coordinates of the site, in X Y grid indices, and therefore the coordinates where the SGT impulse should be placed.
Filename convention: <site>.fdloc
Format:
<X grid index of site> <Y grid index of site>
Generated by: PreSGT
Used by: PreAWP, PostAWP
Faultlist
Purpose: List of paths to all the rupture geometry files for all ruptures which are within the cutoff for this site. Used to produce a list of points to save SGTs for.
Filename convention: <site>.faultlist
Format:
<path to rupture file> nheader=<number of header lines, usually 6> latfirst=<1, to signify that latitude comes first in the rupture files> ...
Generated by: PreSGT
Used by: PreSGT
Radiusfile
Purpose: Describe the adaptive mesh SGTs will be saved for.
Filename convention: <site>.radiusfile
Format:
<number of gradations in X and Y>
<radius 1> <radius 2> <radius 3> <radius 4>
<decimation less than radius 1> <decimation between radius 1 and 2> <between 2 and 3> <between 3 and 4>
<number of gradations in Z>
<depth 1> <depth 2> <depth 3> <depth 4>
<decimation less than depth 1> <decimation between depth 1 and 2> <between 2 and 3> <between 3 and 4>
Generated by: PreSGT
Used by: PreSGT
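One way to read the radius and decimation rows is as a lookup from distance to decimation factor, sketched below. This is an interpretation of the file layout, not the logic in gen_sgtgrid.c; the function name and the example radii and decimations are hypothetical, and what happens beyond the last radius is not specified here, so the sketch returns None.

```python
def decimation_for(distance, radii, decimations):
    """Return the decimation for the first radius that distance falls inside.

    radii must be in increasing order and pair up with decimations.
    Returns None when distance is beyond the last radius.
    """
    for radius, decimation in zip(radii, decimations):
        if distance < radius:
            return decimation
    return None


# Hypothetical values, for illustration only.
radii = [10.0, 50.0, 100.0, 1000.0]
decimations = [1, 2, 4, 8]
print(decimation_for(5.0, radii, decimations))
```

The same lookup would apply to the depth rows, with depth in place of horizontal distance.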
SGT Coordinate files
There are two formats for the list of points to save SGTs for, one for the RWG (Rob Graves) codes and one for AWP-ODC. As with other coordinate transformations between the two systems, to convert X and Y offsets from RWG to AWP you must swap X and Y and add 1 to each, since RWG is 0-indexed and AWP is 1-indexed.
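The X/Y conversion between the two systems can be written as a pair of small helpers. The function names are hypothetical; only the swap and the offset of 1 come from the description above.

```python
def rwg_to_awp_xy(rwg_x, rwg_y):
    """Convert 0-indexed RWG (x, y) to 1-indexed AWP (x, y): swap, then add 1."""
    return rwg_y + 1, rwg_x + 1


def awp_to_rwg_xy(awp_x, awp_y):
    """Inverse conversion: swap, then subtract 1."""
    return awp_y - 1, awp_x - 1


# RWG point (3, 7) becomes AWP point (8, 4).
print(rwg_to_awp_xy(3, 7))
```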
SgtCoords
Purpose: List of all the points to save SGTs for.
Filename convention: <site>.cordfile
Format: Z changes fastest, then Y, then X slowest.
# geoproj= <projection; we usually use 1 for great circle>
# modellon= <model lon> modellat= <model lat> modelrot= <model rot, usually -55>
# xlen= <x-length> ylen= <y-length>
# <total number of points>
<X index> <Y index> <Z index> <Single long to capture the index, in the form XXXXYYYYZZZZ> <lon> <lat> <depth in km>
...
Generated by: PreSGT
Used by: PreSGT, PreAWP, PostAWP
AWP cordfile
Purpose: List of SGT points to save in a format usable by AWP-ODC-SGT.
Filename convention: awp.<site>.cordfile
Format: X and Y are swapped relative to the RWG cordfile, and 1 is added to each. The points are sorted by Y, then X, then Z, so Y changes slowest and Z changes fastest. This X/Y ordering is the reverse of the RWG cordfile's because the X and Y components are swapped.
<number of points> <X coordinate> <Y coordinate> <Z coordinate> ...
Generated by: PreAWP
Used by: AWP-ODC-SGT CPU, AWP-ODC-SGT GPU
Impulse source descriptions
We generate the initial source description for CyberShake, with the required dt, nt, and filtering, using gen_source, in http://source.usc.edu/svn/cybershake/import/trunk/SimSgt_V3.0.3/src/ (run 'make get_source'). gen_source hard-codes its parameters, but you should only change 'nt', 'dt', and 'flo'. We have been setting flo to twice the CyberShake maximum frequency, to reduce filtering effects at the frequency of interest. gen_source wraps Rob Graves's source generator, which we use for consistency.
Once this RWG source is generated, we then use AWP-GPU-SGT/utils/data/format_source.py to reprocess the RWG source into an AWP-source friendly format. This involves reformatting the file and multiplying all values by 1e15 for unit conversion. Different files must be produced for X and Y coordinates, since in the AWP format different columns are used for different components.
Finally, AWP-GPU-SGT/utils/build_src.py takes the correct AWP-friendly source (nt and dt) for a run and adds the impulse location coordinates, producing a complete AWP format source description.
RWG source
Purpose: Source description for the SGT impulse.
Filename convention: source_cos0.10_<frequency>hz
Format:
source cos <nt> <dt> 0 0 0.0 0.0 0.0 0.0 <value at ts0> <value at ts1> <value at ts2> <value at ts3> <value at ts4> <value at ts5> <value at ts6> <value at ts7> <value at ts8> <value at ts9> <value at ts10> <value at ts11> ...
Generated by: gen_source (see above)
Used by: PreAWP
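As a rough illustration of the RWG source layout and the 1e15 unit conversion mentioned above, the sketch below tokenizes the file on whitespace so that it does not depend on where the line breaks fall. It assumes the first two tokens are the title, followed by nt, dt, six further header fields, and then nt values. It is not format_source.py; parse_rwg_source and scale_for_awp are hypothetical names, and the column placement in the AWP file is not covered.

```python
def parse_rwg_source(text):
    """Return (nt, dt, values) from the text of an RWG source file."""
    tokens = text.split()
    # tokens[0:2] is the title ("source cos"), tokens[2] is nt, tokens[3] is dt.
    nt = int(tokens[2])
    dt = float(tokens[3])
    # tokens[4:10] are the six remaining header fields (0 0 0.0 0.0 0.0 0.0).
    values = [float(t) for t in tokens[10:10 + nt]]
    if len(values) != nt:
        raise ValueError("expected %d samples, found %d" % (nt, len(values)))
    return nt, dt, values


def scale_for_awp(values, factor=1e15):
    """Apply the RWG-to-AWP unit conversion factor to every sample."""
    return [v * factor for v in values]
```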
AWP source
Purpose: Source description which can be used by AWP-ODC.
Filename convention: <site>_f<x or y>_src
Format: Note that X and Y coordinates are swapped between RWG and AWP format, because of how the box is defined. Additionally, RWG is 0-indexed, and AWP is 1-indexed, and the RWG values must be multiplied by 1e15 for unit conversion.
<X index of source, same as site X index> <Y index of source, same as site Y index> <XX impulse at ts0> <YY at ts0> <ZZ at ts0> <XY at ts0> <XZ at ts0> <YZ at ts0> ...
Generated by: PreAWP
Used by: AWP-ODC-SGT CPU, AWP-ODC-SGT GPU
IN3D
Purpose: Input file for AWP-ODC.
Filename convention: IN3D.<site>.<x or y>
Format: Specified here (login required).
Generated by: PreAWP
Used by: AWP-ODC-SGT CPU, AWP-ODC-SGT GPU, PostAWP
AWP SGT
Purpose: SGT file, created by AWP-ODC-SGT.
Filename convention: awp-strain-<site>-f<x or y>
Format: binary, 4-byte floats. Points are in the same order as in the AWP SGT coordinate file, which is fast Z, X, Y. For each point, the SGT components are stored in XX, YY, ZZ, XY, XZ, YZ order, with time fastest.
<timeseries for nt steps, for (1st x-coordinate, 1st y-coordinate, 1st z-coordinate), XX component>
<timeseries for nt steps, for (1st x-coordinate, 1st y-coordinate, 1st z-coordinate), YY component>
...
<timeseries for nt steps, for (1st x-coordinate, 1st y-coordinate, 1st z-coordinate), YZ component>
<timeseries for nt steps, for (1st x-coordinate, 1st y-coordinate, 2nd z-coordinate), XX component>
...
<timeseries for nt steps, for (1st x-coordinate, 1st y-coordinate, last z-coordinate), YZ component>
<timeseries for nt steps, for (2nd x-coordinate, 1st y-coordinate, 1st z-coordinate), XX component>
...
<timeseries for nt steps, for (last x-coordinate, 1st y-coordinate, last z-coordinate), YZ component>
<timeseries for nt steps, for (1st x-coordinate, 2nd y-coordinate, 1st z-coordinate), XX component>
...
<timeseries for nt steps, for (last x-coordinate, last y-coordinate, last z-coordinate), YZ component>
Generated by: AWP-ODC-SGT CPU and GPU
Used by: PostAWP
RWG SGT
Purpose: SGT file, created by PostAWP for use in post-processing.
Filename convention: <site>_f<x or y>_<run id>.sgt
Format: binary, 4-byte floats. Points are in the same order as in the RWG coordinate file, which is fast Z, Y, X. For each point, the SGT components are stored in XX, YY, ZZ, XY, XZ, YZ order, with time fastest.
<timeseries for nt steps, for (1st x-coordinate, 1st y-coordinate, 1st z-coordinate), XX component>
<timeseries for nt steps, for (1st x-coordinate, 1st y-coordinate, 1st z-coordinate), YY component>
...
<timeseries for nt steps, for (1st x-coordinate, 1st y-coordinate, 1st z-coordinate), YZ component>
<timeseries for nt steps, for (1st x-coordinate, 1st y-coordinate, 2nd z-coordinate), XX component>
...
<timeseries for nt steps, for (1st x-coordinate, 1st y-coordinate, last z-coordinate), YZ component>
<timeseries for nt steps, for (1st x-coordinate, 2nd y-coordinate, 1st z-coordinate), XX component>
...
<timeseries for nt steps, for (1st x-coordinate, last y-coordinate, last z-coordinate), YZ component>
<timeseries for nt steps, for (2nd x-coordinate, 1st y-coordinate, 1st z-coordinate), XX component>
...
<timeseries for nt steps, for (last x-coordinate, last y-coordinate, last z-coordinate), YZ component>
Generated by: PostAWP
Used by: NanCheck, CheckSgt
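Given the layout above (4-byte floats, six components per point, time fastest), the byte offset of any timeseries follows directly from the point's position in the coordinate file. The sketch below is an illustration under assumptions: the file has no leading header, values are little-endian, and point_index is the 0-based position of the point in the coordinate file. The function names are hypothetical.

```python
import struct

COMPONENTS = ("XX", "YY", "ZZ", "XY", "XZ", "YZ")
FLOAT_SIZE = 4  # bytes per value


def timeseries_offset(point_index, component, nt):
    """Byte offset of the first sample of one component at one point."""
    comp_index = COMPONENTS.index(component)
    return (point_index * len(COMPONENTS) + comp_index) * nt * FLOAT_SIZE


def read_timeseries(fileobj, point_index, component, nt, byte_order="<"):
    """Read nt samples for one component at one point from an open binary file."""
    fileobj.seek(timeseries_offset(point_index, component, nt))
    raw = fileobj.read(nt * FLOAT_SIZE)
    if len(raw) != nt * FLOAT_SIZE:
        raise ValueError("file too short for requested timeseries")
    return list(struct.unpack("%s%df" % (byte_order, nt), raw))
```

The same arithmetic applies to both the AWP and RWG SGT files; only the order of the points differs.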
SGT header file
Purpose: SGT header information, used to parse and interpret SGT files.
Filename convention: <site>_f<x or y>_<run id>.sgthead
Format: binary. It consists of three sections:
- The sgtmaster structure, described below in C. Its information can be used to set up data structures to read the rest of the SGTs.
- The sgtindex structures, described below in C. There is one of these for each point in the SGTs, and they're used to determine the X/Y/Z indices of all the SGT points.
- The sgtheader structures, described below in C. There is one of these for each point in the SGTs. They're used when we perform reciprocity.
struct sgtmaster
{
    int geoproj;     /* =0: RWG local flat earth; =1: RWG great circle arcs; =2: UTM */
    float modellon;  /* longitude of geographic origin */
    float modellat;  /* latitude of geographic origin */
    float modelrot;  /* rotation of y-axis from south (clockwise positive) */
    float xshift;    /* xshift of cartesian origin from geographic origin */
    float yshift;    /* yshift of cartesian origin from geographic origin */
    int globnp;      /* total number of SGT locations (entire model) */
    int localnp;     /* local number of SGT locations (this file only) */
    int nt;          /* number of time points */
};

struct sgtindex      /* indices for all 'globnp' SGT locations */
{
    long long indx;  /* indx= xsgt*1000000000000 + ysgt*1000000 + zsgt */
    int xsgt;        /* x grid location */
    int ysgt;        /* y grid location */
    int zsgt;        /* z grid location */
    float h;         /* grid spacing */
};

struct sgtheader
{
    long long indx;  /* index of this SGT */
    int geoproj;     /* =0: RWG local flat earth; =1: RWG great circle arcs; =2: UTM */
    float modellon;  /* longitude of geographic origin */
    float modellat;  /* latitude of geographic origin */
    float modelrot;  /* rotation of y-axis from south (clockwise positive) */
    float xshift;    /* xshift of cartesian origin from geographic origin */
    float yshift;    /* yshift of cartesian origin from geographic origin */
    int nt;          /* number of time points */
    float xazim;     /* azimuth of X-axis in FD model (clockwise from north) */
    float dt;        /* time sampling */
    float tst;       /* start time of 1st point in GF */
    float h;         /* grid spacing */
    float src_lat;   /* site latitude */
    float src_lon;   /* site longitude */
    float src_dep;   /* site depth */
    int xsrc;        /* x grid location for source (station in recip. exp.) */
    int ysrc;        /* y grid location for source (station in recip. exp.) */
    int zsrc;        /* z grid location for source (station in recip. exp.) */
    float sgt_lat;   /* SGT location latitude */
    float sgt_lon;   /* SGT location longitude */
    float sgt_dep;   /* SGT location depth */
    int xsgt;        /* x grid location for output (source in recip. exp.) */
    int ysgt;        /* y grid location for output (source in recip. exp.) */
    int zsgt;        /* z grid location for output (source in recip. exp.) */
    float cdist;     /* straight-line distance btw site and SGT location */
    float lam;       /* lambda [in dyne/(cm*cm)] at output point */
    float mu;        /* rigidity [in dyne/(cm*cm)] at output point */
    float rho;       /* density [in gm/(cm*cm*cm)] at output point */
    float xmom;      /* moment strength of x-oriented force in this run */
    float ymom;      /* moment strength of y-oriented force in this run */
    float zmom;      /* moment strength of z-oriented force in this run */
};
Overall, then, the format for the file is:
<sgtmaster> <sgtindex for point 1> <sgtindex for point 2> ... <sgtindex for point globnp> <sgtheader for point 1> <sgtheader for point 2> ... <sgtheader for point globnp>
Generated by: PostAWP
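The header file layout can be read with Python's struct module, as in the sketch below. It assumes little-endian data, 4-byte int and float, 8-byte long long, and no compiler padding beyond what the struct definitions imply (under those assumptions the three structures occupy 36, 24, and 128 bytes). parse_sgthead is a hypothetical helper, not part of the CyberShake code base, and the sgtheader records are returned as plain tuples in field order.

```python
import struct

BYTE_ORDER = "<"  # assumption: little-endian, standard sizes, no padding
SGTMASTER = struct.Struct(BYTE_ORDER + "i5f3i")
SGTINDEX = struct.Struct(BYTE_ORDER + "q3if")
SGTHEADER = struct.Struct(BYTE_ORDER + "qi5fi7f3i3f3i7f")

MASTER_FIELDS = ("geoproj", "modellon", "modellat", "modelrot",
                 "xshift", "yshift", "globnp", "localnp", "nt")
INDEX_FIELDS = ("indx", "xsgt", "ysgt", "zsgt", "h")


def parse_sgthead(data):
    """Split the bytes of a .sgthead file into master, index, and header records."""
    master = dict(zip(MASTER_FIELDS, SGTMASTER.unpack_from(data, 0)))
    npts = master["globnp"]  # one sgtindex and one sgtheader per point
    offset = SGTMASTER.size
    indices = []
    for _ in range(npts):
        indices.append(dict(zip(INDEX_FIELDS, SGTINDEX.unpack_from(data, offset))))
        offset += SGTINDEX.size
    headers = []
    for _ in range(npts):
        headers.append(SGTHEADER.unpack_from(data, offset))
        offset += SGTHEADER.size
    return master, indices, headers
```

Typical use would be to read the whole file with open(path, "rb").read() and pass the bytes to parse_sgthead; master["nt"] and the index list then give what is needed to locate timeseries in the matching .sgt file.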
Used by:
Dependencies
The following are external software dependencies used by CyberShake software modules.
Getpar
Purpose: A library written in C which enables parsing of key-value command-line parameters, and enforcement of required parameters. Rob Graves uses it in his codes.
How to obtain: Rob supplied a copy; it is in the CyberShake repository at http://source.usc.edu/svn/cybershake/import/trunk/Getpar/ .
Special installation instructions: Run 'make' in Getpar/getpar/src; this will make the library, libget.a, and install it in the lib directory, where CyberShake codes will expect it.
MySQLdb
Purpose: MySQL bindings for Python 2.
How to obtain: https://sourceforge.net/projects/mysql-python/ . Documentation is at http://mysql-python.sourceforge.net/MySQLdb.html .
Special installation instructions: Standard Python install (python setup.py build; python setup.py install). Many clusters have MySQLdb installed in one of their Python installations, but it might take some experimenting to find it. For example, the bwpy module on Blue Waters does NOT have it installed, but if you run 'module unload bwpy' and try again, you will find it.
UCVM
Purpose: Supplies the query tools needed to populate a mesh with velocity information.
How to obtain: The most recent version of UCVM can be found at Current UCVM Software Releases. As of October 2017, we have only integrated the C version of UCVM into CyberShake.
Special installation instructions: Following the standard installation instructions for a cluster should work (running ./ucvm_setup.py). You will want to install CVM-S4, CVM-S426, CVM-S4.M01, CVM-H, CenCal, CCA-06, and CCA 1D velocity models for CyberShake.
libcfu
Purpose: Provides a hash table library for a variety of CyberShake codes.
How to obtain: https://sourceforge.net/projects/libcfu/ . Documentation is at http://libcfu.sourceforge.net/libcfu.html .
Special installation instructions: Follow the instructions.