Rupture Variation Generator v5.4.2
This page details the work to migrate the Graves & Pitarka (2019) rupture generator, v5.4.2, to CyberShake.
The specific code changes required to create an API are detailed here: Rupture Variation Generator v5.4.2 code changes
Status
- Replicate reference SRFs using stand-alone code: Complete
- Create RupGen-api-5.4.2: Complete
- Replicate SRFs from stand-alone code using RupGen library: Complete
- Compile DirectSynth against RupGen library: Complete
- Replicate SRFs from stand-alone code using DirectSynth: In progress
- In the database, create new Rupture Variation Scenario ID and populate the Rupture Variations table: Not yet started
- Perform CyberShake run for USC using RupGen-api-5.4.2: Not yet started
Verification
The verification sequence is:
1. Reference results from Rob.
2. (1) is reproduced using Rob's supplied stand-alone code, compiled and run on a Summit head node.
3. (2) is used to produce reference SRFs from ERF 36 geometry files for:
   - Source 76, rupture 0 (M6.35)
   - Source 128, rupture 858 (M7.35)
   - Source 68, rupture 7 (M8.45)
4. Results from (3) are reproduced using test code compiled against the RupGen-api-5.4.2 library.
5. Results from (3) are reproduced using DirectSynth and writing out the SRFs.
RupGen-api-5.4.2 against stand-alone code
Source 76, rupture 0
We generated all 77 rupture variations using the stand-alone code, then generated them using test code compiled against the library. These were all done on a login node.
Only a few non-slip fields differed by more than the permitted tolerance, fewer than 1 per variation.
The average difference (which is mostly the difference between slips) was 0.0012%, and the largest difference was ~1e-5 (on values which range up to ~100).
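The statistics quoted above (number of values outside tolerance, average percent difference, largest difference) can be illustrated with a minimal sketch. This is not the comparison tool used for these tests: the function name compare_values is hypothetical, and treating the tolerance as an absolute difference is an assumption.

```python
def compare_values(reference, candidate, tolerance):
    """Compare two equal-length sequences of floats.

    Returns a tuple of:
      - the number of values whose absolute difference exceeds `tolerance`
        (assumption: the tolerance is absolute, not relative),
      - the mean percent difference over values with a nonzero reference,
      - the largest absolute difference.
    """
    if len(reference) != len(candidate):
        raise ValueError("sequences differ in length")

    n_outside = 0
    max_abs = 0.0
    pct_sum = 0.0
    pct_count = 0
    for ref, cand in zip(reference, candidate):
        diff = abs(ref - cand)
        if diff > tolerance:
            n_outside += 1
        max_abs = max(max_abs, diff)
        if ref != 0.0:
            # Percent difference is undefined for a zero reference, so skip those.
            pct_sum += 100.0 * diff / abs(ref)
            pct_count += 1

    mean_pct = pct_sum / pct_count if pct_count else 0.0
    return n_outside, mean_pct, max_abs


# Made-up example values, not taken from any SRF.
reference = [100.0, 50.0, 0.0, 25.0]
candidate = [100.001, 50.0, 0.0, 25.0]
n_outside, mean_pct, max_abs = compare_values(reference, candidate, tolerance=1e-4)
```

In practice the values would be parsed from the two SRF files being compared; the parsing step is omitted here.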
Since in the past we have had issues with an order dependence in the rupture generator, we also spot-checked by generating every 10th variation using the test code. These yielded the same md5sums as when they were generated in order.
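A spot check of this kind could be scripted roughly as follows. This is a sketch only: the helper names and the idea of passing dictionaries that map rupture variation ID to file path are assumptions for illustration, not a description of how the check was actually run.

```python
import hashlib


def md5_of_file(path, chunk_size=1 << 20):
    """Return the hex md5 digest of a file, reading it in chunks."""
    digest = hashlib.md5()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def mismatched_variations(in_order_paths, spot_check_paths):
    """Return the sorted rupture variation IDs whose md5sums differ.

    Both arguments map a rupture variation ID to the path of its SRF:
    `in_order_paths` for the run that generated every variation in order,
    `spot_check_paths` for the run that generated only selected variations.
    """
    return sorted(
        rv
        for rv, path in spot_check_paths.items()
        if md5_of_file(path) != md5_of_file(in_order_paths[rv])
    )
```

An empty list from mismatched_variations means every spot-checked variation is byte-identical to its in-order counterpart, which is the outcome reported above.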
Source 128, rupture 858
We generated all 256 rupture variations using the stand-alone code, then generated them using test code compiled against the library. These were all done on a login node.
Each rupture variation has approximately 6 differences outside of the tolerance values (out of approximately 2.5 million values).
The average difference (which is mostly the difference between slips) was 0.0072%, and the largest difference was ~7e-5 (on values which range up to ~1000).
Since in the past we have had issues with an order dependence in the rupture generator, we also spot-checked by generating every 20th variation using the test code. These yielded the same md5sums as when they were generated in order, as did rupture variations 250 and 251 generated consecutively.
Source 68, rupture 7
All these SRFs were generated on the compute nodes. If you generate them on a login node, you will get something slightly different.
This source/rupture combo has 1190 rupture variations, but since each one takes about 90 seconds to generate, it would take about 30 hours to create them all. Instead, we used the stand-alone code to generate the first 423 rupture variations, then generated the first 41 using test code compiled against the library.
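The runtime estimate is simple arithmetic, shown here as a small sketch. The function name is hypothetical; the variation counts and the 90-second figure come from the paragraph above.

```python
def hours_to_generate(num_variations, seconds_per_variation):
    """Estimate serial generation time in hours."""
    return num_variations * seconds_per_variation / 3600.0


all_variations_hours = hours_to_generate(1190, 90)  # full set: 29.75 hours
subset_hours = hours_to_generate(423, 90)           # first 423 variations
```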
Using the same tolerances, many rupture variations had at least 1 difference at more than 10% of points, which causes the comparison to abort. We roughly doubled the tolerance (from 0.00011 to 0.00021) and ran the comparisons again. Each rupture variation has approximately 498 differences outside of the tolerance values (out of approximately 35 million values).
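The abort rule can be sketched as below. This is an illustration under stated assumptions, not the actual comparison code: the function names are hypothetical, each point is modeled as a tuple of field values, and the tolerance is assumed to be an absolute difference.

```python
def fraction_of_points_differing(ref_points, cand_points, tolerance):
    """Return the fraction of points with at least one field outside tolerance.

    Each point is a sequence of floats (its field values). Assumption: a field
    counts as different when the absolute difference exceeds `tolerance`.
    """
    if len(ref_points) != len(cand_points):
        raise ValueError("point counts differ")
    if not ref_points:
        return 0.0
    differing = sum(
        1
        for ref, cand in zip(ref_points, cand_points)
        if any(abs(r - c) > tolerance for r, c in zip(ref, cand))
    )
    return differing / len(ref_points)


def check_variation(ref_points, cand_points, tolerance, abort_fraction=0.10):
    """Abort (raise) if more than `abort_fraction` of points differ."""
    fraction = fraction_of_points_differing(ref_points, cand_points, tolerance)
    if fraction > abort_fraction:
        raise RuntimeError(
            f"{fraction:.1%} of points differ, above the {abort_fraction:.0%} limit"
        )
    return fraction
```

Raising the tolerance reduces the number of points flagged as different, which is why the comparison can complete after the tolerance is changed from 0.00011 to 0.00021.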
The average difference (which is mostly the difference between slips) was 0.052%, and the largest difference was ~4e-4 (on values which range up to ~10000).
The spot-check test was done on rupture variations 0, 10, 20, 30, and 40. We also spot-checked 100, 150, 200, 250, 300, 350, and 400 against the reference, with similar differences.
Optimization
v5.4.2 is approximately 3x slower than v3.3.1, so we investigated optimization.
We are running source 68, rupture 7, rupture variation 0 as our benchmark. We run it 5 times and take the average.
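The benchmark procedure (run 5 times, take the average) could be scripted along these lines. This is a sketch: mean_runtime is a hypothetical helper, and the task passed in stands in for whatever invokes the rupture generator on the benchmark variation.

```python
import time


def mean_runtime(task, repeats=5):
    """Call `task` `repeats` times and return the mean wall-clock time in seconds."""
    timings = []
    for _ in range(repeats):
        start = time.perf_counter()
        task()
        timings.append(time.perf_counter() - start)
    return sum(timings) / len(timings)


# Stand-in workload; replace with a call that generates the benchmark variation.
average_seconds = mean_runtime(lambda: sum(range(1000)))
```

Averaging several runs smooths out run-to-run noise, which matters when comparing optimizations whose gains are a few percent.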
Reference runtime: 69.400300 sec
Because source 68, rupture 7 produced 42 GB of trace results, we profiled source 128, rupture 858 instead. Score-P suggests that gaus_rand() and sfrand() are both called an extraordinary number of times and are responsible for about 75% of the runtime (total runtime was 18.9 sec):
cube::Region   NumberOfCalls   ExclusiveTime   InclusiveTime
gaus_rand            5531760        8.104156       15.400368
sfrand              66381120        7.296212        7.296212
fft2d_fftw                14        1.584906        1.584949
write_srf2                 1        1.272117        1.272618
kfilt_beta2                2        1.182854       15.778140
...
- Inlined sfrand(). Runtime: