CyberShake Study 18.8

CyberShake Study 18.8 is a computational study to perform CyberShake in a new region, the extended Bay Area. We plan to use a combination of 3D models (USGS Bay Area detailed and regional, CVM-S4.26.M01, CCA-06) with a minimum Vs of 500 m/s and a frequency of 1 Hz. We will use the GPU implementation of AWP-ODC-SGT, the Graves & Pitarka (2014) rupture variations with 200m spacing and uniform hypocenters, and the UCERF2 ERF. The SGT and post-processing calculations will be run on both NCSA Blue Waters and OLCF Titan.

Status

This study is under development. We hope to begin in May 2018.

Science Goals

The science goals for this study are:

Expand CyberShake to the Bay Area.
Calculate CyberShake results with the USGS Bay Area velocity model as the primary model.
Calculate CyberShake results at selected locations with Vs min = 250 m/s.

Technical Goals

Perform the largest CyberShake study to date.

Sites

Map showing Study 18.8 sites (cities=yellow, CISN stations=orange, missions=blue, 10 km grid=purple, 5 km grid=green, PG&E locations=pink). The Bay Area box is in orange and the Study 17.3 box is in magenta.

The Study 18.8 box is 180 x 390 km, with the long edge rotated 27 degrees counter-clockwise from vertical. The corners are defined to be:

South: (-121.51,35.52)
West: (-123.48,38.66)
North: (-121.62,39.39)
East: (-119.71,36.22)
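
As a rough sanity check on the corner coordinates, the sketch below computes great-circle edge lengths with the haversine formula. The function name and the 6371 km mean Earth radius are assumptions made for illustration; the simulations use their own map projection, so only approximate agreement with the stated 180 x 390 km is expected.

```python
import math

EARTH_RADIUS_KM = 6371.0  # assumed mean Earth radius, not taken from the study


def haversine_km(p, q):
    """Great-circle distance in km between two (lon, lat) points in degrees."""
    lon1, lat1, lon2, lat2 = map(math.radians, (*p, *q))
    a = (math.sin((lat2 - lat1) / 2) ** 2
         + math.cos(lat1) * math.cos(lat2) * math.sin((lon2 - lon1) / 2) ** 2)
    return 2 * EARTH_RADIUS_KM * math.asin(math.sqrt(a))


# Corner coordinates (lon, lat) copied from the list above.
corners = {
    "South": (-121.51, 35.52),
    "West": (-123.48, 38.66),
    "North": (-121.62, 39.39),
    "East": (-119.71, 36.22),
}

short_edge = haversine_km(corners["South"], corners["East"])
long_edge = haversine_km(corners["South"], corners["West"])
print(f"short edge ~ {short_edge:.0f} km, long edge ~ {long_edge:.0f} km")
```

Both edges should come out close to the stated box dimensions; a large mismatch would suggest a typo in one of the corners.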

We are planning to run 869 sites, 837 of which are new, as part of this study.

These sites include:

77 cities (74 new)
10 new missions
139 CISN stations (136 new)
46 new sites of interest to PG&E
597 sites along a 10 km grid (571 new)

Of these sites, 32 overlap with the Study 17.3 region for verification.
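
The site counts above can be cross-checked with a few lines of arithmetic. This is only a bookkeeping sketch; the dictionary keys are informal labels, and it assumes the sites that are not new are the ones that overlap with Study 17.3.

```python
# (total, new) per site category, copied from the list above.
site_counts = {
    "cities": (77, 74),
    "missions": (10, 10),
    "CISN stations": (139, 136),
    "PG&E sites": (46, 46),
    "10 km grid": (597, 571),
}

total_sites = sum(total for total, _ in site_counts.values())
new_sites = sum(new for _, new in site_counts.values())
overlap_sites = total_sites - new_sites  # assumed to be the Study 17.3 overlap

print(total_sites, new_sites, overlap_sites)  # 869 837 32
```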

A KML file with all these sites is available with names or without names.

Projection Analysis

As our simulation region grows larger, we need to review the impact of the projection we are using for the simulations. An analysis of the impact of various projections by R. Graves is summarized in this posting:

Projection Comparison

Velocity Models

For Study 18.8, we initially decided to construct the velocity mesh by querying models in the following order (the order ultimately adopted is given under "Configuration for this study"):

USGS Bay Area model
CCA-06 with the Ely GTL applied
CVM-S4.26.M01 (which includes a 1D background model).

A KML file showing the model regions is available here.

We will use a minimum Vs of 500 m/s. Smoothing will be applied 20 km on either side of any velocity model interface.

A thorough investigation was done to determine these parameters; a more detailed discussion is available at Study 18.5 Velocity Model Comparisons.

Vs=250m/s experiment

We would like to select a subset of sites and calculate hazard at 1 Hz and minimum Vs=250 m/s. Below are velocity plots with this lower Vs cutoff.

Surface plot with Vs min=250 m/s.

Plot at 100m depth with Vs min=250 m/s.

Verification

We have selected the following 4 sites for verification, one from each corner of the box:

s4518
s975
s2189
s2221

We will run s975 at both 50 km depth (ND=50) and 80 km depth (ND=80).

We will run s4518 and s2221 on both Blue Waters and Titan.

Hazard Curves

Site	2 sec RotD50	3 sec RotD50	5 sec RotD50	10 sec RotD50
s4518 (south corner)
s2221 (north corner)
s975 (east corner)	Blue is Bay Area test, black Study 17.3	Blue is Bay Area test, black Study 17.3	Blue is Bay Area test, black Study 17.3	Blue is Bay Area test, black Study 17.3

Comparisons with Study 17.3

Site	2 sec RotD50	3 sec RotD50	5 sec RotD50	10 sec RotD50
s975 (east corner)	black is Study 17.3; red is Bay Area test; green is Bay Area code with Study 17.3 params; blue is Vsmin=900m/s, USGS/CCA06 no GTL/CVM-S4.26.M01	black is Study 17.3; red is Bay Area test; green is Bay Area code with Study 17.3 params; blue is Vsmin=900m/s, USGS/CCA06 no GTL/CVM-S4.26.M01	black is Study 17.3; red is Bay Area test; green is Bay Area code with Study 17.3 params; blue is Vsmin=900m/s, USGS/CCA06 no GTL/CVM-S4.26.M01	black is Study 17.3; red is Bay Area test; green is Bay Area code with Study 17.3 params; blue is Vsmin=900m/s, USGS/CCA06 no GTL/CVM-S4.26.M01

Below are seismograms comparing a northern SAF event (source 39, rupture 5, rupture variation 311, M8.15 with hypocenter just offshore about 60 km south of Eureka) and a southern SAF event (source 59, rupture 0, rupture variation 0, M7.75 with hypocenter near Bombay Beach) across three runs: Study 17.3, the Bay Area test with Study 17.3 parameters, and Study 17.3 parameters with Bay Area tiling. The only difference between the first two is that the smoothing zone is larger in the Bay Area test + 17.3 parameters (I forgot to change it). Overall, the matches between Study 17.3 and the BA test + 17.3 parameters are excellent. Once the tiling is changed, we start to see differences, especially in the northern CA event.

N SAF event

S SAF event

To help illuminate the differences in velocity model, below are surface and vertical plots. The star indicates the location of the s975 site. Both models were created with Vs min = 900 m/s, and no GTL applied to the CCA-06 model.

Model	Horizontal (depth=0)	Vertical, parallel to box through site
s975, Study 17.3 tiling: 1)CCA-06, no GTL 2)USGS Bay Area 3)CVM-S4.26.M01		NW is at left, SE at right
s975, Bay Area: 1)USGS Bay Area 2)CCA-06, no GTL 3)CVM-S4.26.M01		NW is at left, SE at right

We also calculated curves for 2 overlapping rock sites, s816 and s903. s816 is 22 km from the USGS/CCA model boundary, and outside of the smoothing zone; s903 is 7 km from the boundary.

Site	2 sec RotD50	3 sec RotD50	5 sec RotD50	10 sec RotD50
s816	Blue is Bay Area test, black Study 17.3	Blue is Bay Area test, black Study 17.3	Blue is Bay Area test, black Study 17.3	Blue is Bay Area test, black Study 17.3
s903	Blue is Bay Area test, black Study 17.3	Blue is Bay Area test, black Study 17.3	Blue is Bay Area test, black Study 17.3	Blue is Bay Area test, black Study 17.3

Velocity model vertical slices for s816:

Model	Horizontal (depth=0)	Vertical, parallel to box through site
s816, Study 17.3 tiling: 1)CCA-06, no GTL 2)USGS Bay Area 3)CVM-S4.26.M01		NW is at left, SE at right
s816, Bay Area: 1)USGS Bay Area 2)CCA-06, no GTL 3)CVM-S4.26.M01		NW is at left, SE at right

These are seismograms for the same two events (NSAF, src 39 rup 5 rv 311 and SSAF, src 59 rup 0 rv 0). The southern events fit a good deal better than the northern.

Site	Northern SAF event	Southern SAF event
s816
s903

Configuration for this study

Based on these results, we've decided to use the Study 17.3 tiling order for the velocity models:

CCA-06 with Ely GTL
USGS Bay Area model
CVM-S4.26.M01

All models will use minimum Vs=500 m/s.
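
The tiling order and Vs floor can be illustrated with a minimal sketch. It is not the UCVM API: the `Model` callables, their coverage rules, and the returned values are all made up for illustration, and the 20 km interface smoothing is omitted.

```python
from typing import Callable, Optional, Sequence

VS_MIN = 500.0  # m/s, minimum Vs stated for this study

# A "model" here is a hypothetical callable returning Vs in m/s,
# or None when the point lies outside that model's coverage.
Model = Callable[[float, float, float], Optional[float]]


def query_tiled_vs(models: Sequence[Model], lon: float, lat: float,
                   depth_m: float, vs_min: float = VS_MIN) -> float:
    """Return Vs from the first model covering the point, floored at vs_min."""
    for model in models:
        vs = model(lon, lat, depth_m)
        if vs is not None:
            return max(vs, vs_min)
    raise ValueError("no model covers this point")


# Toy stand-ins for the three models, in the priority order listed above.
def cca06_gtl(lon, lat, depth_m):
    return 350.0 if lat < 36.0 else None  # made-up coverage and value


def usgs_bay_area(lon, lat, depth_m):
    return 800.0 if lat < 39.0 else None  # made-up coverage and value


def cvm_s4_26_m01(lon, lat, depth_m):
    return 1000.0  # stands in for the 1D background, which covers every point


tiling = [cca06_gtl, usgs_bay_area, cvm_s4_26_m01]
print(query_tiled_vs(tiling, -121.0, 35.5, 0.0))  # 500.0 (350 m/s raised to the floor)
```

Swapping the first two entries of `tiling` reproduces the alternative priority order that was tested, which is why points near a model boundary can change value between the two configurations.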

Here are comparison curves for Study 17.3 results, USGS Bay Area/CCA-06+GTL/CVM-S4.26.M01, and CCA-06+GTL/USGS Bay Area/CVM-S4.26.M01.

Site	10 sec RotD50	5 sec RotD50	3 sec RotD50	2 sec RotD50
s975	Black=Study 17.3, Red=USGS/CCA/CVM-S4.26, Blue=CCA/USGS/CVM-S4.26	Black=Study 17.3, Red=USGS/CCA/CVM-S4.26, Blue=CCA/USGS/CVM-S4.26	Black=Study 17.3, Red=USGS/CCA/CVM-S4.26, Blue=CCA/USGS/CVM-S4.26	Black=Study 17.3, Red=USGS/CCA/CVM-S4.26, Blue=CCA/USGS/CVM-S4.26

Seismograms

We've extracted 2 seismograms for s975, a northern SAF event and a southern, and plotted comparisons of the Bay Area test with the Study 17.3 3D results. Note the following differences between the two runs:

Minimum Vs: 900 m/s for Study 17.3, 500 m/s for the test
GTL: no GTL in CCA-06 for Study 17.3, a GTL for CCA-06 for the test
DT: 0.0875 sec for Study 17.3, 0.05 sec for the test
Velocity model priority: CCA-06, then USGS Bay Area, then CVM-S4.26.M01 for Study 17.3; USGS Bay Area, then CCA-06+GTL, then CVM-S4.26.M01 for the test

Northern Event

This seismogram is from source 39, rupture 5, rupture variation 311, a M8.15 northern San Andreas event with hypocenter just offshore, about 60 km south of Eureka.

Southern Event

This seismogram is from source 59, rupture 0, rupture variation 0, a M7.75 southern San Andreas event with hypocenter near Bombay Beach.

Background Seismicity

Statewide maps showing the impact of background seismicity are available here: CyberShake Background Seismicity

Performance Enhancements (over Study 17.3)

Responses to Study 17.3 Lessons Learned

Include plots of velocity models as part of readiness review when moving to new regions.

We have constructed many plots, and will include some in the science readiness review.

Formalize process of creating impulse. Consider creating it as part of the workflow based on nt and dt.

Many jobs were not picked up by the reservation, and as a result reservation nodes were idle. Work harder to make sure reservation is kept busy.

Forgot to turn on monitord during workflow, so had to deal with statistics after the workflow was done. Since we're running far fewer jobs, it's fine to run monitord population during the workflow.

We set pegasus.monitord.events = true in all properties files.

In Study 17.3b, 2 of the runs (5765 and 5743) had a problem with their output, which left 'holes' of lower hazard on the 1D map. Looking closely, we discovered that the SGT X component of run 5765 was about 30 GB smaller than it should have been, likely causing issues when the seismograms were synthesized. We no longer had the SGTs from 5743, so we couldn't verify that the same problem happened there. Moving forward, include checks on SGT file size as part of the NaN check.

We added a file size check as part of the NaN check.
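
A combined size and NaN check might look like the sketch below. It is not the actual CyberShake check: the function name, the `expected_bytes` parameter, and the assumption that the file is a flat array of native-endian 4-byte floats are all hypothetical.

```python
import array
import math
import os
import tempfile


def check_sgt_file(path: str, expected_bytes: int, chunk_floats: int = 1 << 20) -> None:
    """Raise ValueError if the file has the wrong size or contains a NaN."""
    actual = os.path.getsize(path)
    if actual != expected_bytes:
        raise ValueError(f"{path}: size {actual} bytes, expected {expected_bytes}")
    with open(path, "rb") as f:
        while True:
            buf = f.read(4 * chunk_floats)  # read a whole number of 4-byte floats
            if not buf:
                break
            values = array.array("f")  # assumed layout: native-endian float32
            values.frombytes(buf)
            if any(math.isnan(v) for v in values):
                raise ValueError(f"{path}: NaN found")


# Tiny demonstration on a throwaway file with three floats (12 bytes).
demo_path = os.path.join(tempfile.mkdtemp(), "demo.sgt")
with open(demo_path, "wb") as out:
    out.write(array.array("f", [1.0, 2.0, 3.0]).tobytes())
check_sgt_file(demo_path, expected_bytes=12)  # passes silently
```

Checking the size first is cheap and would have caught a truncated file like the one from run 5765 before any data is read.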

Output Data Products

Below is a list of what data products we intend to compute and what we plan to put in the database.

Computational and Data Estimates

Computational Estimates

In producing the computational estimates, we selected the four N/S/E/W extreme sites in the box which 1) were within the 200 km cutoff for southern SAF events (381 sites) and 2) were outside the cutoff (488 sites). We produced inside and outside averages and scaled these by the number of inside and outside sites.

We also modified the box to be at an angle of 30 degrees counterclockwise of vertical, which makes the boxes about 15% smaller than with the previously used angle of 55 degrees.

We scaled our results based on the Study 17.3 performance of site s975, a site also in Study 18.8, and the Study 15.4 performance of DBCN, which used a very large volume and 100m spacing.

SGT calculation
	# Grid points	# Mesh gen nodes	Mesh gen runtime	# GPUs	SGT job runtime	Titan SUs	BW node-hrs
Inside cutoff, per site	23.1 billion	192	0.85 hrs	800	1.35 hrs	69.7k	2240
Outside cutoff, per site	10.2 billion	192	0.37 hrs	800	0.60 hrs	30.8k	990
Total						41.6M	1.34M

For the post-processing, we quantified the amount of work by determining the number of individual rupture points to process (summing, over all ruptures, the number of rupture variations for that rupture times the number of rupture surface points) and multiplying that by the number of timesteps. We then scaled based on performance of s975 from Study 17.3, and DBCN in Study 15.4.
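
The work measure described above can be written as a short sketch. The `Rupture` type, its field names, and the example numbers are invented for illustration and do not come from the CyberShake code or database.

```python
from typing import Iterable, NamedTuple


class Rupture(NamedTuple):
    num_variations: int      # rupture variations for this rupture
    num_surface_points: int  # points on the rupture surface


def pp_work(ruptures: Iterable[Rupture], num_timesteps: int) -> int:
    """Sum over ruptures of (variations x surface points), times timesteps."""
    rupture_points = sum(r.num_variations * r.num_surface_points for r in ruptures)
    return rupture_points * num_timesteps


# Made-up values purely to illustrate the arithmetic.
example = [Rupture(10, 400), Rupture(3, 1500)]
print(pp_work(example, 2000))  # (10*400 + 3*1500) * 2000 = 17000000
```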

Below we list the estimates for running the post-processing on either Blue Waters or Titan.

PP calculation
	#Points to process	#Nodes (BW)	BW runtime	BW node-hrs	#Nodes (Titan)	Titan runtime	Titan SUs
Inside cutoff, per site	5.96 billion	120	9.32 hrs	1120	240	10.3 hrs	74.2k
Outside cutoff, per site	2.29 billion	120	3.57 hrs	430	240	3.95 hrs	28.5k
Total				635K			42.2M

Our computational plan is to split the SGT calculations 50% BW/50% Titan, and split the PP 75% BW, 25% Titan. With a 20% margin, this would require 37.6M SUs on Titan, and 1.37M node-hrs on Blue Waters.
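
The totals and the resource request follow from the per-site figures in the two tables. The sketch below reproduces that arithmetic; variable names are illustrative, and because the per-site inputs are rounded, intermediate totals can differ slightly from the tabulated ones.

```python
INSIDE_SITES, OUTSIDE_SITES = 381, 488
MARGIN = 1.20  # 20% margin


def study_total(per_site_inside: float, per_site_outside: float) -> float:
    """Scale per-site costs by the number of inside- and outside-cutoff sites."""
    return per_site_inside * INSIDE_SITES + per_site_outside * OUTSIDE_SITES


sgt_titan_sus = study_total(69.7e3, 30.8e3)  # if all SGTs ran on Titan
sgt_bw_hrs = study_total(2240, 990)          # if all SGTs ran on Blue Waters
pp_titan_sus = study_total(74.2e3, 28.5e3)   # if all PP ran on Titan
pp_bw_hrs = study_total(1120, 430)           # if all PP ran on Blue Waters

# Planned split: SGTs 50% BW / 50% Titan, PP 75% BW / 25% Titan.
titan_need = MARGIN * (0.50 * sgt_titan_sus + 0.25 * pp_titan_sus)
bw_need = MARGIN * (0.50 * sgt_bw_hrs + 0.75 * pp_bw_hrs)

print(f"Titan: {titan_need / 1e6:.1f}M SUs, Blue Waters: {bw_need / 1e6:.2f}M node-hrs")
```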

Currently we have 91.7M SUs available on Titan (expires 12/31/18), and 8.14M node-hrs on Blue Waters (expires 8/31/18). Based on the 2016 PRAC (spread out over 2 years), we budgeted approximately 6.2M node-hours for CyberShake on Blue Waters this year, of which we have used 0.01M.

Data Estimates

SGT size estimates are scaled based on the number of points to process.

Data estimates
	#Grid points	Velocity mesh	SGTs size	Temp data	Output data
Inside cutoff, per site	23.1 billion	271 GB	410 GB	1090 GB	19.1 GB
Outside cutoff, per site	10.2 billion	120 GB	133 GB	385 GB	9.3 GB
Total		158 TB	216 TB	589 TB	11.6 TB
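
The data totals can be reproduced from the per-site sizes in the same way. This sketch assumes 1 TB = 1024 GB, which is inferred from the tabulated totals rather than stated in the text.

```python
INSIDE_SITES, OUTSIDE_SITES = 381, 488
GB_PER_TB = 1024  # assumed conversion; it matches the tabulated totals

# (inside-cutoff GB per site, outside-cutoff GB per site) from the table above.
per_site_gb = {
    "velocity mesh": (271, 120),
    "SGTs": (410, 133),
    "temp data": (1090, 385),
}

totals_tb = {
    name: (inside * INSIDE_SITES + outside * OUTSIDE_SITES) / GB_PER_TB
    for name, (inside, outside) in per_site_gb.items()
}

for name, tb in totals_tb.items():
    print(f"{name}: {tb:.0f} TB")
```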

If we run all the SGTs on Titan and split the PP 25% Titan, 75% Blue Waters, we will need:

Titan: 589 TB temp files + 3 TB output files = 592 TB

Blue Waters: 162 TB SGTs + 9 TB output files = 171 TB

SCEC storage: 1 TB workflow logs + 11.6 TB output data files = 12.6 TB (45 TB free)

Database usage: (4 rows PSA [@ 2, 3, 5, 10 sec] + 12 rows RotD [RotD100 and RotD50 @ 2, 3, 4, 5, 7.5, 10 sec])/rupture variation x 225K rupture variations/site x 869 sites = 3.1 billion rows x 125 bytes/row = 364 GB (2.0 TB free on moment.usc.edu disk)
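
The database estimate can be checked step by step. The sketch assumes the 364 GB figure uses binary gigabytes (1024^3 bytes), which is what makes the numbers agree; variable names are illustrative.

```python
psa_rows = 4                  # PSA at 2, 3, 5, 10 sec
rotd_rows = 2 * 6             # RotD100 and RotD50 at 2, 3, 4, 5, 7.5, 10 sec
rows_per_variation = psa_rows + rotd_rows

variations_per_site = 225_000
sites = 869
bytes_per_row = 125

total_rows = rows_per_variation * variations_per_site * sites
total_bytes = total_rows * bytes_per_row
size_gib = total_bytes / 1024**3  # assumed binary GB

print(f"{total_rows / 1e9:.1f} billion rows, {size_gib:.0f} GB")
```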

Production Checklist

Install UCVM 18.5.0 on Titan
Compute test hazard curves for 4 sites - on both systems for 2 of them.
Compute test curves for 2 overlapping sites.
Integrate Vs30 calculation and database population into workflow
~~Create Vs min=250 m/s plots.~~
Spec Vsmin=250 runs
Update workflows to use Globus Online for file transfers.
Modify resource estimates based on test curves.
Create XML file describing study for web monitoring tool.
Make decision regarding sites with 1D model.
Activate Blue Waters and Titan quota cronjobs.
Activate Blue Waters and Titan usage cronjobs.
Improve parallelism/scalability of smoothing code.
Tag code in repository
~~Calculate background seismicity impact for most-likely-to-be-impacted sites.~~ (Will defer until after study completion)
Verify that Titan and Blue Waters codebases are in sync with each other and the repo.
Prepare pending file
Get usage stats from Titan and Blue Waters before beginning study.
Hold science and technical readiness reviews.
Calls with Blue Waters and Titan staff
Switch to more recent version of Rob's projection code