Difference between revisions of "Fling Study"
Revision as of 00:22, 16 May 2012
Contents
- 1 Overview
- 2 Terminology
- 3 Number of stations and seismograms per scenario
- 4 Fling Study Scenario List
- 5 Completed Scenarios
- 6 Running on USC HPCC
- 7 Comparison of Seismogram from Server and Cluster
- 8 Certification of USC HPCC Cluster for Broadband Calculations
- 9 Building Metrics Table
- 10 Related Entries
Overview
The Fling study is a set of Broadband Platform simulations defined by PG&E and PEER researchers and run by the SCEC Broadband Platform group.
Terminology
To help the engineers, seismologists, and computer scientists talk about this computational study, we have defined the following terms for use in broadband calculations:
- Study - A well-defined set of calculations to be performed. To qualify as a "study", the problem must be defined in enough detail that a knowledgeable user could calculate the total number and type of output files that will be produced. We are currently conducting the Fling study.
- Scenario - A scenario refers to an earthquake, defined by dip, rake, and depth and a set of stations that record that earthquake. A scenario does not define rupture time history, and does not define hypocenter location. The Fling study has 42 scenarios, numbered 101 to 142.
- Source Realization - Earthquake source definitions, given as part of a scenario, are elaborated into source realizations. The source realizations for a scenario may differ by initial stress distribution, by hypocenter location, and possibly by rupture velocity. In the Fling study, each scenario has 30 source realizations. A broadband source realization is equivalent to a "rupture variation", a term used in SCEC CyberShake work.
- Simulations - In broadband, a broadband simulation is defined by an invocation of the main program, called run_bbp_2G.py. A simulation is defined by a source realization, a station list, and a set of Green's functions to use. In the Fling study, we run a simulation for each source realization.
- Seismograms - The smallest unit of work in broadband is the seismogram. In broadband, a seismogram refers to a 3-component seismogram associated with a site. At a minimum, the relation between the source location and the site is known. In many cases, the geographical location of the site is also known. In the Fling study, the number of seismograms is determined by the number of stations in the station list for each simulation. Station lists have 96 stations for the smallest scenarios and up to 279 stations for the largest. We will calculate the total number of seismograms and use that as a computation metric for the Fling study. Below is a table showing the number of stations for each scenario.
Number of stations and seismograms per scenario
Scenario | Number of Realizations | Number of Stations | Number of Seismograms |
---|---|---|---|
Scenario101 | 30 | 96 | 2880 |
Scenario102 | 30 | 96 | 2880 |
Scenario103 | 30 | 138 | 4140 |
Scenario104 | 30 | 158 | 4740 |
Scenario105 | 30 | 178 | 5340 |
Scenario106 | 30 | 198 | 5940 |
Scenario107 | 30 | 107 | 3210 |
Scenario108 | 30 | 107 | 3210 |
Scenario109 | 30 | 165 | 4950 |
Scenario110 | 30 | 203 | 6090 |
Scenario111 | 30 | 241 | 7230 |
Scenario112 | 30 | 279 | 8370 |
Scenario113 | 30 | 107 | 3210 |
Scenario114 | 30 | 107 | 3210 |
Scenario115 | 30 | 127 | 3810 |
Scenario116 | 30 | 165 | 4950 |
Scenario117 | 30 | 203 | 6090 |
Scenario118 | 30 | 241 | 7230 |
Scenario119 | 30 | 96 | 2880 |
Scenario120 | 30 | 96 | 2880 |
Scenario121 | 30 | 118 | 3540 |
Scenario122 | 30 | 138 | 4140 |
Scenario123 | 30 | 158 | 4740 |
Scenario124 | 30 | 178 | 5340 |
Scenario125 | 30 | 113 | 3390 |
Scenario126 | 30 | 113 | 3390 |
Scenario127 | 30 | 113 | 3390 |
Scenario128 | 30 | 113 | 3390 |
Scenario129 | 30 | 215 | 6450 |
Scenario130 | 30 | 215 | 6450 |
Scenario131 | 30 | 257 | 7710 |
Scenario132 | 30 | 110 | 3300 |
Scenario133 | 30 | 110 | 3300 |
Scenario134 | 30 | 110 | 3300 |
Scenario135 | 30 | 110 | 3300 |
Scenario136 | 30 | 210 | 6300 |
Scenario137 | 30 | 210 | 6300 |
Scenario138 | 30 | 250 | 7500 |
Scenario139 | 30 | 275 | 8250 |
Scenario140 | 30 | 149 | 4470 |
Scenario141 | 30 | 149 | 4470 |
Scenario142 | 30 | 275 | 8250 |
Totals | 1260 | 6797 | 203910 |
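The arithmetic behind the table can be checked with a short sketch. The station counts are copied from the rows above; the function names are illustrative only and are not part of the Broadband Platform.

```python
# Station counts copied from the table above, keyed by scenario number.
STATIONS = {
    101: 96, 102: 96, 103: 138, 104: 158, 105: 178, 106: 198,
    107: 107, 108: 107, 109: 165, 110: 203, 111: 241, 112: 279,
    113: 107, 114: 107, 115: 127, 116: 165, 117: 203, 118: 241,
    119: 96, 120: 96, 121: 118, 122: 138, 123: 158, 124: 178,
    125: 113, 126: 113, 127: 113, 128: 113, 129: 215, 130: 215,
    131: 257, 132: 110, 133: 110, 134: 110, 135: 110, 136: 210,
    137: 210, 138: 250, 139: 275, 140: 149, 141: 149, 142: 275,
}

# Each scenario in the Fling study has 30 source realizations.
REALIZATIONS_PER_SCENARIO = 30


def seismograms_per_scenario(stations, realizations=REALIZATIONS_PER_SCENARIO):
    """One 3-component seismogram per station per source realization."""
    return stations * realizations


def study_totals(station_counts, realizations=REALIZATIONS_PER_SCENARIO):
    """Return (total realizations, total stations, total seismograms)."""
    total_realizations = len(station_counts) * realizations
    total_stations = sum(station_counts.values())
    total_seismograms = sum(
        seismograms_per_scenario(n, realizations)
        for n in station_counts.values()
    )
    return total_realizations, total_stations, total_seismograms


if __name__ == "__main__":
    print(study_totals(STATIONS))
```

Running the sketch reproduces the 1260 realizations and 203910 seismograms in the Totals row, and gives 6797 as the sum of the station column.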
Fling Study Scenario List
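scenario_id | magnitude | dip | rake | Ztor | fault_length | fault_width | Priority |
---|---|---|---|---|---|---|---|
101 | 6 | 90 | 180 | 0 | 14 | 8 | 1 |
102 | 6.5 | 90 | 180 | 0 | 24 | 13 | |
103 | 7 | 90 | 180 | 0 | 68 | 15 | |
104 | 7.5 | 90 | 180 | 0 | 210 | 15 | |
105 | 7.8 | 90 | 180 | 0 | 420 | 15 | |
106 | 8.2 | 90 | 180 | 0 | 470 | 15 | |
107 | 6 | 70 | 180 | 0 | 14 | 8 | 2 |
108 | 6.5 | 70 | 180 | 0 | 24 | 13 | |
109 | 7 | 70 | 180 | 0 | 68 | 15 | |
110 | 7.5 | 70 | 180 | 0 | 210 | 15 | |
111 | 7.8 | 70 | 180 | 0 | 420 | 15 | |
112 | 8.2 | 70 | 180 | 0 | 470 | 15 | |
113 | 6 | 70 | 180 | 0 | 14 | 8 | 3 |
114 | 6.5 | 70 | 180 | 0 | 24 | 13 | |
115 | 7 | 70 | 180 | 0 | 40 | 25 | |
116 | 7.5 | 70 | 180 | 0 | 100 | 32 | |
117 | 7.8 | 70 | 180 | 0 | 160 | 40 | |
118 | 8.2 | 70 | 180 | 0 | 400 | 40 | |
119 | 6 | 90 | 180 | 0 | 14 | 8 | 4 |
120 | 6.5 | 90 | 180 | 0 | 24 | 13 | |
121 | 7 | 90 | 180 | 0 | 40 | 25 | |
122 | 7.5 | 90 | 180 | 0 | 100 | 32 | |
123 | 7.8 | 90 | 180 | 0 | 160 | 40 | |
124 | 8.2 | 90 | 180 | 0 | 400 | 40 | |
125 | 6 | 45 | 90 | 0 | 10 | 10 | 5 |
126 | 6 | 45 | 90 | 5 | 10 | 10 | |
127 | 6.5 | 45 | 90 | 0 | 18 | 18 | |
128 | 6.5 | 45 | 90 | 5 | 18 | 18 | |
129 | 7 | 45 | 90 | 0 | 44 | 23 | |
130 | 7.5 | 45 | 90 | 0 | 126 | 25 | |
131 | 7.8 | 45 | 90 | 0 | 180 | 25 | |
132 | 6 | 60 | 90 | 0 | 10 | 10 | |
133 | 6 | 60 | 90 | 5 | 10 | 10 | |
134 | 6.5 | 60 | 90 | 0 | 18 | 18 | |
135 | 6.5 | 60 | 90 | 5 | 18 | 18 | |
136 | 7 | 60 | 90 | 0 | 50 | 20 | |
137 | 7.5 | 60 | 90 | 0 | 150 | 20 | |
138 | 7.8 | 60 | 90 | 0 | 200 | 20 | |
139 | 7.0 | 45 | 90 | 0 | 18 | 18 | |
140 | 6.5 | 45 | 90 | 0 | 18 | 18 | |
141 | 6.5 | 45 | 90 | 5 | 18 | 18 | |
142 | 7.0 | 45 | 90 | 0 | 44 | 23 | |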
Scenarios 140, 141, and 142 were identified as high-priority simulations at an NGA-West2 meeting last Friday.
Completed Scenarios
- Scenario141:
- Codebase: URS/URS/URS/URS
- Completed: 15 May 2012
- Operator: PJM
- Results Staged: /home/broadband-01/maechlin/flingstudy/Scenario141
- Total Files: 28 MB
- outdata: 2.5 GB
- Starttime: Mon May 14 22:05:02 PDT 2012
- Endtime: May 15 11:20:47 PDT 2012
- Cores: 16
Running on USC HPCC
A sample set of simulations from the Fling study was run on USC HPCC. The original Fling generation scripts, source descriptions, station lists, and batch scripts were copied from broadband.usc.edu to /home/rcf-104. Small modifications were then made to update paths and to block direct execution of the platform, since the platform is run inside a PBS job instead.
Sample scripts can be found at the following locations. They are not necessarily used in the order listed.
Script | Location | Description | Modified |
---|---|---|---|
build_xml.py | /auto/rcf-104/patrices/bbp/batch_tools | Builds XML workflows for a simulation | No |
batch_run_bbp.py | /auto/rcf-104/patrices/bbp/batch_tools | Executes BBP workflow | Modified to only write BBP command-lines for simulations to a log for later execution by run_parallel.py. BBP invocations are saved in batch_run_bbp_sims.log and bbp output directory moves are saved in batch_run_bbp_moves.log |
run_parallel.py | /auto/rcf-104/patrices/bbp/batch_tools | Helper script to run N programs on a set of M cores | New script |
gen_source_input.csh | /auto/rcf-104/patrices/bbp/fling | Generate full study inputs | No |
run_bbp-parallel.csh | /auto/rcf-104/patrices/bbp/fling | Originally intended to execute the study with the platform. After modifications, only generates XML and execution lists for run_parallel.py. | Some paths changed, also added ${ROOT_PATH} to some relative path locations to make them absolute paths |
General steps for running the Fling study:
- Generate inputs
$ ./gen_source_input.csh
- Generate XML workflows
$ ./run_bbp-parallel.csh
- Create PBS job submission script (example below)
- Submit PBS job to USC HPCC
Example PBS script running the sample Fling simulations on 16 cores:
    #!/bin/bash
    #PBS -q nbns
    #PBS -l arch=x86_64,pmem=2000mb,pvmem=3000mb,walltime=6:00:00,nodes=4:ppn=4
    #PBS -V
    #PBS -e /home/rcf-104/patrices/bbp/fling/Xml1/Set1/run_set1.err
    #PBS -o /home/rcf-104/patrices/bbp/fling/Xml1/Set1/run_set1.out

    PYTHONPATH=/home/rcf-104/patrices/bbp/11.2.2/bbp_2g/comps
    HOME=/home/rcf-104/patrices/bbp/fling

    echo "Jobs start"
    date

    cd $HOME
    python $HOME/Xml1/Set1/run_parallel.py /home/rcf-104/patrices/bbp/11.2.2/setup_bbp_env.sh $HOME /Xml1/Set1/batch_run_bbp_sims.log $PBS_NODEFILE 1
    python $HOME/Xml1/Set1/run_parallel.py /home/rcf-104/patrices/bbp/11.2.2/setup_bbp_env.sh $HOME /Xml1/Set1/batch_run_bbp_moves.log $PBS_NODEFILE 1

    echo "Jobs end"
    date
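The idea behind a helper that runs N programs on M cores can be illustrated with a minimal sketch. This is not the actual run_parallel.py, whose source is not shown here. It assumes the log holds one shell command per line, runs everything on the local machine rather than across the nodes in $PBS_NODEFILE, and uses hypothetical function names.

```python
import subprocess
from concurrent.futures import ThreadPoolExecutor


def read_commands(lines):
    """Return the non-empty, non-comment command lines (assumed log format)."""
    stripped = (line.strip() for line in lines)
    return [line for line in stripped if line and not line.startswith("#")]


def run_one(command):
    """Run one shell command and return its exit status."""
    return subprocess.run(command, shell=True).returncode


def run_commands(commands, max_workers):
    """Run the commands with at most max_workers running at a time.

    Exit statuses are returned in the same order as the input commands.
    """
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        return list(pool.map(run_one, commands))


if __name__ == "__main__":
    # Hypothetical usage: execute a command log on 16 local cores.
    with open("batch_run_bbp_sims.log") as handle:
        statuses = run_commands(read_commands(handle), max_workers=16)
    print("failed commands:", sum(1 for s in statuses if s != 0))
```

Threads are sufficient here because each worker only waits on a child process. A cluster version would also need to start each command on a remote node, which this sketch does not attempt.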
Comparison of Seismogram from Server and Cluster
Simulation | broadband.usc.edu | USC HPCC cluster |
---|---|---|
10010116 | | |
10010129 | | |
Certification of USC HPCC Cluster for Broadband Calculations
The verification and validation of the currently released Broadband platform are based on results generated on a SCEC server called broadband.usc.edu. When we move the Broadband platform software to a different computing environment, rebuild it, and rerun it, the results can differ slightly from those produced on the SCEC server. Differences can come from the computing hardware, the operating system, the compiler version, and other sources.
Before accepting results generated in a new computing environment, we must first certify that the new computing environment produces results that are equivalent to the results from the original server where the platform was originally developed and tested.
To speed up execution of the Fling study, we plan to run it on the USC HPCC cluster, so we must certify that USC HPCC cluster results are valid and comparable to those generated on broadband.usc.edu.
Our initial certification tests used a small subset of the Fling study. A researcher ran the subset on the SCEC broadband server, and we then ran the same subset on the USC HPCC cluster. Comparing the output seismograms from both runs shows that the two sets of results are very similar.
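A numerical version of such a server-versus-cluster comparison could look like the sketch below. It assumes each seismogram is a text file with one time column followed by three component columns, with comment lines starting with `#` or `%`. That layout and the function names are assumptions for illustration, and no acceptance threshold is implied.

```python
def parse_seismogram(text):
    """Parse rows of 'time c1 c2 c3', skipping blank and comment lines."""
    rows = []
    for line in text.splitlines():
        line = line.strip()
        if not line or line[0] in "#%":
            continue
        fields = line.split()
        rows.append(tuple(float(value) for value in fields[:4]))
    return rows


def max_relative_difference(reference, candidate):
    """Per component, the largest |reference - candidate| sample difference
    divided by the peak absolute amplitude of the reference."""
    if len(reference) != len(candidate):
        raise ValueError("seismograms have different numbers of samples")
    result = []
    for comp in (1, 2, 3):  # column 0 is time
        peak = max(abs(row[comp]) for row in reference)
        diff = max(abs(r[comp] - c[comp]) for r, c in zip(reference, candidate))
        result.append(diff / peak if peak > 0 else diff)
    return result


if __name__ == "__main__":
    server = "# time c1 c2 c3\n0.0 1.0 2.0 4.0\n0.1 -2.0 1.0 0.0\n"
    cluster = "0.0 1.0 2.0 4.0\n0.1 -2.0 1.5 0.0\n"
    print(max_relative_difference(parse_seismogram(server),
                                  parse_seismogram(cluster)))
```

Normalizing by the reference peak makes the measure comparable between small and large magnitude ruptures, which is one reason to prefer it over a raw difference.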
In our discussions, we decided that the certification criteria for this study will cover a number of small-magnitude ruptures and a number of large-magnitude ruptures. We will post these when they are available.
Building Metrics Table
The following command will generate the metrics table above.
    $ tot=0; for i in `ls | grep "Scenario"`; do \
        echo -n "$i "; \
        cnt=`cat $i/StatInfo/*.stl | grep -v "#" | wc -l`; \
        echo -n "$cnt "; \
        tot=$(($tot+$cnt)); num_smgr=$(($cnt*30)); echo $num_smgr; \
      done; echo "$tot $(($tot*30))"
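For readers who prefer Python, the same counting can be sketched as follows. It mirrors the shell command above: every line of a `StatInfo/*.stl` file that does not contain `#` counts as one station, and each station yields 30 seismograms. The function names are hypothetical.

```python
import glob
import os


def count_stations(stl_path):
    """Count the lines that do not contain '#', like `grep -v "#" | wc -l`."""
    with open(stl_path) as handle:
        return sum(1 for line in handle if "#" not in line)


def build_metrics(root, realizations=30):
    """Return (scenario name, stations, seismograms) for each scenario directory."""
    rows = []
    for scenario in sorted(glob.glob(os.path.join(root, "*Scenario*"))):
        if not os.path.isdir(scenario):
            continue
        stl_files = sorted(glob.glob(os.path.join(scenario, "StatInfo", "*.stl")))
        stations = sum(count_stations(path) for path in stl_files)
        rows.append((os.path.basename(scenario), stations, stations * realizations))
    return rows


if __name__ == "__main__":
    metrics = build_metrics(".")
    for name, stations, seismograms in metrics:
        print(name, stations, seismograms)
    total = sum(stations for _, stations, _ in metrics)
    print(total, total * 30)
```

As with the shell version, a blank line in a station list would be counted as a station, so the lists are assumed to contain only comment lines and station lines.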