This page details a small-scale (single GPU) test of the AWP-ODC-SGT GPU code used in CyberShake, with inputs and reference solutions.

== Clone the CyberShake GitHub Repository ==

The code used for this test is available at https://github.com/SCECcode/cybershake-core

You can clone the repository as shown below. Clone it into a directory to which the compute nodes have access.

 git clone https://github.com/SCECcode/cybershake-core

This clones the full CyberShake codebase, which includes several applications; the one we want here is the GPU code. After the checkout, the parallel, CUDA-based GPU wave propagation code of interest is located in:

 <your source directory>/cybershake-core/AWP-GPU-SGT/src

== Set Up Modules on Summit ==

To compile the code, you'll want to use the GNU and NVIDIA CUDA compilers. On Summit, with a standard environment, you can load the required modules with these commands:

  module swap xl gcc
  module load cuda 

These add the required compilers to your environment. Since your module environment may differ, the listing below shows a Summit module environment that builds the AWP-GPU-SGT code correctly. Not all of these modules may be required, but this is a known-working environment for the code.

<pre>
[login2.summit src]$ module list
Currently Loaded Modules:
  1) lsf-tools/2.0                4) xalt/1.2.1   7) spectrum-mpi/10.4.0.3-20210112  10) cuda/11.0.3
  2) hsi/5.0.2.p5                 5) DefApps      8) nsight-compute/2021.2.1
  3) darshan-runtime/3.3.0-lite   6) gcc/9.1.0    9) nsight-systems/2021.3.1.54
</pre>

== Configure AWP-ODC-SGT ==

Move into the AWP-GPU-SGT source directory:

 cd <yourdirectory>/cybershake-core/AWP-GPU-SGT/src

You will need to edit one file in this directory to configure the code for the small test problem we have posted.

In AWP, the value BLOCK_SIZE_Z, set in a #define at the top of src/pmcl3d_cons.h, must be set to a factor of the number of grid points in the Z dimension. Since this test is 200 grid points deep, set BLOCK_SIZE_Z to 200:

<pre>#define BLOCK_SIZE_Z 200</pre>
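If you prefer to make this edit from the command line, a one-liner like the following should work (a sketch; it assumes the existing #define sits on a single line, so check the header afterward):

 # replace the existing BLOCK_SIZE_Z definition with 200 (run from the src directory)
 sed -i 's/#define BLOCK_SIZE_Z .*/#define BLOCK_SIZE_Z 200/' pmcl3d_cons.h
 grep BLOCK_SIZE_Z pmcl3d_cons.h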

== Compile the Executable, pmcl3d ==

Compile the code by running 'make -f Makefile-summit' in the src directory. This generates the executable, pmcl3d, and copies it to the bin directory.

The make may generate some warnings, but should build the executable called pmcl3d.
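To confirm the build succeeded, check that the binary was copied into the bin directory (this path follows the repository layout described above; adjust for your checkout location):

 ls -l <your source directory>/cybershake-core/AWP-GPU-SGT/bin/pmcl3d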

If you are not running on Summit, you may need to edit the Makefile to use the correct alias for the MPI C compiler and to point to the NVIDIA CUDA compiler on your system.

This Makefile also defines a "clean" target, so you can run "make clean" and rebuild the executable if needed.

== Create Run Directory ==

The input and output files for AWP-ODC-SGT can be quite large, so on Summit we recommend creating a run directory in a location that is visible from the compute nodes and has a large disk quota. For this test, we recommend creating a directory on the proj-shared file system and copying the required input files there; the run's outputs will also be written under this directory.

 mkdir /gpfs/alpine/geo112/proj-shared/<your username>/<your rundirectory>

== Copy Test Files to the Run Directory ==

We have posted the required input files, reference results, and supporting scripts in a directory on Summit. Copy them to your run directory. These files are available on Summit at:

<pre>/gpfs/alpine/geo112/world-shared/callag/SGT_SMALL_reference_test</pre>

 cd /gpfs/alpine/geo112/proj-shared/<your username>/<your rundirectory>
 cp /gpfs/alpine/geo112/world-shared/callag/SGT_SMALL_reference_test/* .

This will copy the following files into your run directory:

 752K SMALL_fx_src
 1.4G awp-strain-SMALL-fx-reference
 363K awp.SMALL.cordfile
 770M awp.SMALL.smoothed.media
 669 awp_x.lsf
 90 make_dirs.sh

== Input Files ==

AWP requires three input files. The links below, to the CyberShake Code Base wiki page, give detailed descriptions of each of these files:

*[[CyberShake_Code_Base#AWP_cordfile | AWP cordfile]], which contains a list of the grid points for which SGTs are saved.
*[[CyberShake_Code_Base#AWP_format | AWP velocity mesh]], which contains the material properties for the region. For this test, the material properties are homogeneous (vp=1500 m/s, vs=750 m/s, rho=2200 kg/m3).
*[[CyberShake_Code_Base#AWP_source | Impulse source]], which contains the impulse placed at the site of interest. This is a point source.
For this test, the specific files are as follows (see the quick sanity check after this list):
* cordfile is 'awp.SMALL.cordfile'
* source is 'SMALL_fx_src'
* velocity mesh is 'awp.SMALL.smoothed.media'
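If the cordfile is plain text (AWP cordfiles are typically one grid point per line, possibly with a header line), a quick look confirms you copied the right file and gives a sense of how many SGT points to expect; this count (31042 for this test) is passed to the comparison tool later:

 head awp.SMALL.cordfile
 wc -l awp.SMALL.cordfile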

== Create the Runtime Directory Structure ==

AWP expects a particular directory structure, with the input files staged in specific locations. To create the directory structure, run the 'make_dirs.sh' script that you copied into your run directory.

 ./make_dirs.sh
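After the script finishes, you should see at least a comp_x/input directory (and, depending on what make_dirs.sh creates, a comp_x/output_sgt directory where the job writes its output). A quick way to list what was created:

 find comp_x -type d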

Then, either copy the 3 input files into comp_x/input or create symlinks to them there. The commands below create symlinks in the correct places for the test problem.

 ln -s ../../awp.SMALL.cordfile comp_x/input/awp.SMALL.cordfile
 ln -s ../../SMALL_fx_src comp_x/input/SMALL_fx_src
 ln -s ../../awp.SMALL.smoothed.media comp_x/input/awp.SMALL.smoothed.media
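Before submitting the job, you can verify that the links resolve; a broken link will show up as an error here:

 ls -lL comp_x/input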

== Execution ==

A sample LSF batch script, awp_x.lsf, is included in the test files you copied over.

Edit the 'EXEC_PATH' line to point to the location of your pmcl3d install.

 EXEC_PATH=<your source directory>/cybershake-core/AWP-GPU-SGT/bin

If necessary, change the project allocation to your own allocation:

 #BSUB -P GEO112

Submit the job to the debug queue:

 bsub awp_x.lsf

Check the job status:

 bjobs

When I tested this, it took about 15 minutes to run on a single Summit GPU.

When run successfully, you should see no error messages in stderr, and the job should create the file comp_x/output_sgt/awp-strain-SMALL-fx . It should be 1490016000 bytes (~1.4 GB) in size.
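A quick way to check the size from your run directory (if your output SGT file is named differently, e.g. SGT0020000 as in the comparison example below, adjust the name accordingly):

 ls -l comp_x/output_sgt/awp-strain-SMALL-fx
 # the size column should read 1490016000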

== Reference Results ==

Reference results are included in the test files you copied over. The reference result file is called:

 awp-strain-SMALL-fx-reference

Details about the file format are available at [[CyberShake_Code_Base#AWP_SGT]].

== Build the Comparison Executable ==

To compare your results to the reference results, I recommend using AWP-GPU-SGT/utils/compare_sgts .

<pre>
cd <your source directory>/cybershake-core/AWP-GPU-SGT/utils
make compare_sgts
</pre>

== Run the Comparison Code ==

The usage for this code is:

<pre>./compare_sgts <reference SGT file> <test SGT file> <number of SGT points> <number of timesteps></pre>

So for this test, you'll run:

<pre>./compare_sgts awp-strain-SMALL-fx-reference comp_x/output_sgt/SGT0020000 31042 2000</pre>

or, using absolute paths, for example:

 ./compare_sgts /gpfs/alpine/geo112/proj-shared/scallag/testawp/awp-strain-SMALL-fx-reference /gpfs/alpine/geo112/proj-shared/scallag/testawp/comp_x/output_sgt/SGT0020000 31042 2000

Here is some sample output:

<pre>
<snip>
Average diff = -2.264432e-12, average percent diff = 0.001190%, average absolute percent diff = 0.021401%
Largest diff of 0.017437 at float index 39253460047.
Largest percent diff of 18299.484375% at float index 39253451935.
Absolute percentage difference histogram:
   0.0001   0.0010   0.0100   0.1000   0.3000   1.0000   3.0000   10.0000   100.0000   1000.0000
1308391611   11624296490   76362366946   71582352557   3328374731   647791272   92245001   17436808   4545298   283955   7118
 0.79%   7.05%   46.29%   43.39%   2.02%   0.39%   0.06%   0.01%   0.00%   0.00%   0.00%
0.79%   7.84%   54.13%   97.52%   99.54%   99.93%   99.99%   100.00%   100.00%   100.00%   100.00%
</pre>

The 'average absolute percent diff' is probably the most useful metric for this test. It's hard to say what constitutes 'good enough', but moving between systems I usually see average absolute percent differences of a few hundredths of a percent.
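If you just want the summary line, you can filter the comparison output; this is a convenience sketch that assumes compare_sgts writes its summary to stdout:

 ./compare_sgts awp-strain-SMALL-fx-reference comp_x/output_sgt/SGT0020000 31042 2000 | tee compare_sgts.log | grep "absolute percent diff"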