CSEP - ETAS Simulation Plan


Latest revision as of 18:25, 22 July 2024

This page summarizes the performance study run on Stampede 3 to estimate the compute and storage requirements for producing 1-day ETAS forecasts from 2007 to 2018.

Stampede 3 Installation and Configuration

  • Java - jdk-21.0.1+12
  • FastMPJ

The Slurm file header used to run the simulations is shown below:

#SBATCH -t 6:00:00
#SBATCH -N 7
#SBATCH -n 336
#SBATCH -p spr
#SBATCH -A DS-Cybershake

where the "-n" parameter (total number of tasks) is calculated as 48 * number of nodes (N), e.g. 48 * 7 = 336 for the header above. Additionally, the following parameters were used in the Slurm script:

MEM_GIGS=110
THREADS=20
FMPJ_HOME=/work2/02404/fsilva/stampede3/FastMPJ
#CLEAN_OPTION="--clean"
export FMPJ_HOME=/work2/02404/fsilva/stampede3/FastMPJ
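
The relation between "-N" and "-n" described above can be sketched in a few lines of Python. This is only an illustration: the helper name slurm_header is hypothetical, and reusing the same walltime, partition and account for every node count is an assumption, since the page only shows the 7-node header.

```python
# Assumption: 48 tasks per node, the multiplier stated in the text above.
TASKS_PER_NODE = 48


def slurm_header(nodes, walltime="6:00:00", partition="spr", account="DS-Cybershake"):
    """Return the #SBATCH header lines for a run on `nodes` nodes."""
    total_tasks = TASKS_PER_NODE * nodes  # value passed to "-n"
    return "\n".join([
        f"#SBATCH -t {walltime}",
        f"#SBATCH -N {nodes}",
        f"#SBATCH -n {total_tasks}",
        f"#SBATCH -p {partition}",
        f"#SBATCH -A {account}",
    ])


if __name__ == "__main__":
    # Node counts used in the performance study
    for n in (7, 14, 28, 56):
        print(slurm_header(n))
        print()
```

For 7 nodes this reproduces the header shown above ("-n 336").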

Performance Results

To measure Stampede 3 performance for the 1-day ETAS forecasts, we ran the same simulation scenario on different numbers of nodes, using the same random seed and date for every run. After each simulation we deleted the cached results stored in the scratch filesystem to force the recalculation of the entire run. The command line used to generate the run was:

u3etas_comcat_config_builder.sh --end-time 1717484400000 --num-simulations 100000 --duration-years 0.002737851 --include-spontaneous --historical-catalog --start-after-historical --etas-k-cov 1.5 --random-seed 123456789 --hpc-site TACC_FRONTERA --nodes 35 --hours 24 --queue normal --output-dir $ETAS_SIM_DIR/2024_06_04-ComcatPlusHistorical-Start20240604_1day_100000Simulations_Statewide_PointSources_kCOV1p5_Spontaneous_HistCatalog --binary-output
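
Two of the numeric arguments in the command above are easier to read once converted. The sketch below assumes that --end-time is a Unix timestamp in milliseconds and that --duration-years is expressed in years of 365.25 days; both are assumptions inferred from the values and from the "Start20240604_1day" part of the output directory name, not something stated on this page.

```python
from datetime import datetime, timezone

END_TIME_MS = 1717484400000   # --end-time from the command above
DURATION_YEARS = 0.002737851  # --duration-years from the command above
DAYS_PER_YEAR = 365.25        # assumption: year length used for the conversion


def epoch_ms_to_utc(ms):
    """Convert a millisecond Unix timestamp to a timezone-aware UTC datetime."""
    return datetime.fromtimestamp(ms / 1000, tz=timezone.utc)


def years_to_days(years, days_per_year=DAYS_PER_YEAR):
    """Convert a duration in years to days."""
    return years * days_per_year


if __name__ == "__main__":
    print(epoch_ms_to_utc(END_TIME_MS).isoformat())
    print(round(years_to_days(DURATION_YEARS), 6))
```

Under these assumptions the timestamp corresponds to 2024-06-04 07:00 UTC (midnight Pacific Daylight Time) and the duration corresponds to one day.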

The results are as follows:

Number of Nodes    Runtime (min)    Service Units (SUs) Used
7                  169              19.7
14                 84               19.6
28                 43               20.1
56                 24               22.4

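In the table above, service units were computed by dividing the runtime in minutes by 60 and multiplying by the number of nodes used, i.e. as node-hours. As the number of nodes increased, runtime decreased almost in inverse proportion to the node count (close to linear speedup), while the allocation usage remained mostly flat, rising from about 19.7 SUs at 7 nodes to 22.4 SUs at 56 nodes.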
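
The SU column and the scaling behaviour can be recomputed from the measured runtimes. The following is a minimal sketch: the function names are hypothetical, and speedup and parallel efficiency are measured relative to the 7-node run, which is a choice made here for illustration.

```python
# (nodes, runtime in minutes) taken from the results table
RESULTS = [(7, 169), (14, 84), (28, 43), (56, 24)]


def service_units(nodes, runtime_min):
    """SUs as node-hours: runtime in hours times the number of nodes."""
    return runtime_min / 60 * nodes


def scaling_table(results):
    """Return (nodes, runtime, SUs, speedup, efficiency) relative to the first row."""
    base_nodes, base_runtime = results[0]
    rows = []
    for nodes, runtime in results:
        speedup = base_runtime / runtime   # measured speedup
        ideal = nodes / base_nodes         # speedup under perfect scaling
        rows.append((nodes, runtime, service_units(nodes, runtime),
                     speedup, speedup / ideal))
    return rows


if __name__ == "__main__":
    for nodes, runtime, sus, speedup, eff in scaling_table(RESULTS):
        print(f"{nodes:3d} nodes  {runtime:4d} min  {sus:5.1f} SUs  "
              f"speedup {speedup:4.2f}  efficiency {eff:4.2f}")
```

With these inputs the efficiency stays close to 1 up to 28 nodes and drops to roughly 0.88 at 56 nodes, which matches the slightly higher SU cost of the 56-node run.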

Requirements for the Complete ETAS Simulation Runs

Using the performance results obtained above, we can calculate the requirements for the full simulation run by multiplying the required SUs for a single run by the total number of runs:

  • Start Date = 1 August 2007
  • End Date = 30 August 2018
  • Total Number of Days = 4045
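
The day count can be cross-checked against the calendar. In the sketch below the function name is hypothetical, and whether the end date itself receives a forecast is an assumption, so both variants are printed. A plain calendar difference between the two dates listed above gives 4047 days (4048 when both endpoints are counted), slightly more than the 4045 used on this page; the difference is under 0.1% and does not materially change the totals.

```python
from datetime import date

START = date(2007, 8, 1)   # start date listed above
END = date(2018, 8, 30)    # end date listed above


def forecast_days(start, end, inclusive=True):
    """Number of days between two dates, optionally counting both endpoints."""
    return (end - start).days + (1 if inclusive else 0)


if __name__ == "__main__":
    print("exclusive of end date:", forecast_days(START, END, inclusive=False))
    print("inclusive of end date:", forecast_days(START, END, inclusive=True))
```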

To obtain the total required service units, we multiply the total number of 1-day forecasts (4045) by the number of service units used for each run (approximately 20):

  • SUs needed = 20 * 4045 = 80900 SUs

The storage requirements per 1-day forecast are as follows:

  • Binary results (results_*.bin files) ~ 100M
  • Complete output folder (with logs) ~ 272M

Multiplying the numbers above by the total number of 1-day forecasts gives:

  • Total storage for data = 100M * 4045 = 405G
  • Total storage (including logs) = 272M * 4045 = 1.1T
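
The totals above can be reproduced with a short script. This is a sketch using the per-run figures from this page; the function name is hypothetical, and treating "M", "G" and "T" as decimal units (1 G = 1000 M, 1 T = 1000 G) is an assumption.

```python
NUM_FORECASTS = 4045   # total number of 1-day forecasts (from this page)
SUS_PER_RUN = 20       # approximate SUs per forecast (from the performance table)
BINARY_MB = 100        # results_*.bin files per forecast
FULL_MB = 272          # complete output folder per forecast, including logs


def total_requirements(num_forecasts=NUM_FORECASTS, sus_per_run=SUS_PER_RUN,
                       binary_mb=BINARY_MB, full_mb=FULL_MB):
    """Return total SUs and storage (decimal GB / TB) for the full campaign."""
    return {
        "sus": sus_per_run * num_forecasts,
        "binary_gb": binary_mb * num_forecasts / 1000,    # assumption: 1 GB = 1000 MB
        "full_tb": full_mb * num_forecasts / 1_000_000,   # assumption: 1 TB = 10^6 MB
    }


if __name__ == "__main__":
    totals = total_requirements()
    print(f"SUs needed:            {totals['sus']}")
    print(f"Binary results:        {totals['binary_gb']:.1f} GB")
    print(f"Full output with logs: {totals['full_tb']:.2f} TB")
```

Changing NUM_FORECASTS or SUS_PER_RUN shows how sensitive the estimate is to the day count and to the node count chosen for the production runs.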