Difference between revisions of "AWP-ODC-FDQ"

From SCECpedia

Revision as of 00:32, 2 July 2015

AWP-ODC-FDQ is a version of the wave propagation code AWP-ODC that includes frequency-dependent Q physics modules. A GPU version of this code is currently available.

PBS Script

maechlin@h2ologin3:~/fdq_awpodc> more fdq_bw.pbs

 #!/bin/bash
 ###
 ### PBS script for submitting FDQ on Blue Waters
 ###
 ### Set the number of nodes
 ### Set the number of PEs per node
 #PBS -l nodes=4:ppn=1:xk
 ###
 ### Set the wallclock time
 ###
 #PBS -l walltime=01:30:00
 ###
 ### Set the job name
 ###
 #PBS -N chino_hills_gpu
 ###
 ### Set the job stdout and stderr
 ###
 #PBS -e $PBS_JOBID.err
 #PBS -o $PBS_JOBID.out
 ###
 ### Set the Queue
 ###
 #PBS -q normal 
 ###
 ### Set the Allocation
 ###
 #PBS -A jmz
 ###
 ### Set the Email (Beginning, End, Abort)
 ###
 #PBS -m bea
 #PBS -M maechlin@usc.edu
 
 ### 
 ### Load specific modules
 ###
 module swap PrgEnv-cray PrgEnv-gnu
 module load cudatoolkit
 module unload darshan
 
 cd $PBS_O_WORKDIR
 
 now=`date`
 fname="O.$now.tmp"
 echo "STARTING $now" >> "$fname"
 aprun -n 4 -S 1 ./pmcl3d --NX 224 --NY 224 -Z 1024 -x 2 -y 2 \
 --TMAX 20.0 --DH 200.0 --DT 0.01 \
 --NSRC 1 --NST 91 \
 --MEDIASTART 0 \
 --READ_STEP 91 \
 --NTISKP 10 --WRITE_STEP 10 \
 --FL 0.005 --FH 5.0 --FP 2.5 \
 --NEDZ 1 --INSRC FAULTPOW --INVEL mesh256 \
 --NSKPX 2 --NSKPY 2 \
 >> "$fname"
 echo "ENDING `date`" >> "$fname"
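
The arithmetic implied by the aprun line above can be sketched as follows. This is a minimal illustration, not part of the code base: the dictionary keys mirror the command-line flags, and the flag meanings (DH as grid spacing in meters, DT and TMAX in seconds, -x and -y as the GPU counts PX and PY) are assumptions based on common AWP-ODC usage. The function name job_geometry is hypothetical.

```python
# Values copied from the aprun line in the PBS script.
# Flag meanings and units are assumptions, see the note above.
run = {
    "NX": 224, "NY": 224, "NZ": 1024,  # grid points (--NX, --NY, -Z)
    "PX": 2, "PY": 2,                  # GPUs along x and y (-x, -y)
    "TMAX": 20.0, "DT": 0.01,          # duration and time step, assumed seconds
    "DH": 200.0,                       # grid spacing, assumed meters
}


def job_geometry(p):
    """Return sizes derived from the run parameters in dict p."""
    return {
        # One MPI rank per GPU in a PX-by-PY decomposition (assumed).
        "mpi_ranks": p["PX"] * p["PY"],
        # Grid points handled by each rank; NZ is not split because PZ = 1.
        "subgrid": (p["NX"] // p["PX"], p["NY"] // p["PY"], p["NZ"]),
        # Number of time steps needed to reach TMAX.
        "time_steps": round(p["TMAX"] / p["DT"]),
        # Horizontal extent of the model in kilometers.
        "extent_km": (p["NX"] * p["DH"] / 1000.0, p["NY"] * p["DH"] / 1000.0),
    }


geom = job_geometry(run)
for key, value in geom.items():
    print(key, value)
```

Under these assumptions the rank count comes out to 4, which agrees with the `-n 4` passed to aprun and the `nodes=4` request in the PBS header.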

Changing Default Blue Waters Environment

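The default programming environment on Blue Waters is Cray (PrgEnv-cray). Switch to the GNU environment, load the CUDA toolkit, and unload darshan before building or running: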

 module unload PrgEnv-cray   
 module load PrgEnv-gnu 
 module load cudatoolkit 
 module unload darshan 

Makefile

 CC 	= cc
 CFLAGS	= -O3 -Wall
 GFLAGS	= nvcc -O4 -Xptxas -dlcm=ca -maxrregcount=255 -use_fast_math --ptxas-options=-v -arch=sm_35
 INCDIR  = -I/opt/nvidia/cudatoolkit/5.5.20-1.0402.7700.8.1/include
 OBJECTS	= command.o pmcl3d.o grid.o source.o mesh.o cerjan.o swap.o kernel.o io.o
 LIB	= -lm -ldl -L/opt/nvidia/cudatoolkit/5.5.20-1.0402.7700.8.1/lib64 -lcudart -lmpich
 
 pmcl3d: $(OBJECTS)
 	$(CC) $(CFLAGS) $(INCDIR) -o pmcl3d $(OBJECTS) $(LIB)
 
 pmcl3d.o: pmcl3d.c
 	$(CC) $(CFLAGS) $(INCDIR) -c -o pmcl3d.o pmcl3d.c
 
 command.o: command.c
 	$(CC) $(CFLAGS) $(INCDIR) -c -o command.o command.c
 
 io.o: io.c
 	$(CC) $(CFLAGS) $(INCDIR) -c -o io.o io.c
 
 grid.o: grid.c
 	$(CC) $(CFLAGS) $(INCDIR) -c -o grid.o grid.c
 
 source.o: source.c
 	$(CC) $(CFLAGS) $(INCDIR) -c -o source.o source.c
 
 mesh.o: mesh.c
 	$(CC) $(CFLAGS) $(INCDIR) -c -o mesh.o mesh.c
 
 cerjan.o: cerjan.c
 	$(CC) $(CFLAGS) $(INCDIR) -c -o cerjan.o cerjan.c
 
 swap.o: swap.c
 	$(CC) $(CFLAGS) $(INCDIR) -c -o swap.o swap.c
 
 kernel.o: kernel.cu
 	$(GFLAGS) $(INCDIR) -c -o kernel.o kernel.cu
 
 clean:
 	rm -f *.o

Location of Code

Kyle has a GPU version of the code on Titan at: /lustre/atlas1/geo112/proj-shared/withers/chino_hills_gpu

The main added parameter for running this code is the exponent of the power-law Q(f) relation; Q is constant below 1 Hz.
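
The page states only that Q follows a power law in frequency and is constant below 1 Hz; it does not give the exact functional form used in the code. A minimal sketch, assuming the common form Q(f) = Q0 * (f / f0)^gamma above a transition frequency f0 = 1 Hz and Q(f) = Q0 at or below it, is shown here. The names q_of_f, q0, gamma, and f0_hz are hypothetical and are not the code's parameter names.

```python
def q_of_f(f_hz, q0, gamma, f0_hz=1.0):
    """Piecewise Q(f): constant q0 at or below f0_hz, power law above it.

    Assumed form: Q(f) = q0 * (f / f0_hz) ** gamma for f > f0_hz.
    """
    if f_hz <= 0:
        raise ValueError("frequency must be positive")
    if f_hz <= f0_hz:
        return q0
    return q0 * (f_hz / f0_hz) ** gamma


# Hypothetical values chosen only for illustration.
for f in (0.5, 1.0, 2.0, 4.0):
    print(f, q_of_f(f, q0=100.0, gamma=0.5))
```

With gamma = 0 the function reduces to a frequency-independent Q, which makes it easy to compare against a constant-Q run.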

Notes about FDQ Version

Note that there are some structural changes from the CPU code. The GPU code no longer uses the input parameter file; all parameters are specified in the run script (when they differ from the default values defined in the code). Also see the instructions below.

The following are required:

 - NX, NY, NZ are even integers greater than 2
 - PX, PY >= 1 and PZ = 1; PX, PY, and PZ divide NX, NY, and NZ respectively
 - BLOCK_SIZE_Y = 2 in pmcl3d_cons.h
 - BLOCK_SIZE_Z divides NZ
 - BLOCK_SIZE_Y * BLOCK_SIZE_Z is a power of 2 and <= 1024

The following are suggestions:

 - NX and NY are of the same order, i.e. NY/2 <= NX <= 2*NY
 - NZ >= 256 and a power of 2
 - BLOCK_SIZE_Z divides NZ
 - BLOCK_SIZE_Y * BLOCK_SIZE_Z = 512
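
The requirements and suggestions above can be checked before submitting a job. The following is a minimal sketch, not part of the code base: the function check_fdq_config and its argument names are hypothetical, and it assumes the rules apply to the values passed on the command line and defined in pmcl3d_cons.h.

```python
def is_power_of_two(n):
    """Return True if n is a positive power of 2."""
    return n > 0 and (n & (n - 1)) == 0


def check_fdq_config(nx, ny, nz, px, py, block_y, block_z, pz=1):
    """Return (errors, warnings) for the listed requirements and suggestions."""
    errors, warnings = [], []

    # Required: NX, NY, NZ are even integers greater than 2.
    for name, n in (("NX", nx), ("NY", ny), ("NZ", nz)):
        if n <= 2 or n % 2 != 0:
            errors.append(f"{name}={n} must be an even integer > 2")

    # Required: PX, PY >= 1, PZ = 1, and each divides its grid dimension.
    if pz != 1:
        errors.append(f"PZ={pz} must be 1")
    for pname, p, gname, g in (("PX", px, "NX", nx), ("PY", py, "NY", ny)):
        if p < 1:
            errors.append(f"{pname}={p} must be >= 1")
        elif g % p != 0:
            errors.append(f"{pname}={p} must divide {gname}={g}")

    # Required: block size rules.
    if block_y != 2:
        errors.append(f"BLOCK_SIZE_Y={block_y} must be 2")
    if block_z < 1 or nz % block_z != 0:
        errors.append(f"BLOCK_SIZE_Z={block_z} must divide NZ={nz}")
    threads = block_y * block_z
    if threads > 1024 or not is_power_of_two(threads):
        errors.append(f"BLOCK_SIZE_Y*BLOCK_SIZE_Z={threads} must be a power of 2 and <= 1024")

    # Suggested: NY/2 <= NX <= 2*NY, written without division.
    if not (ny <= 2 * nx and nx <= 2 * ny):
        warnings.append("NX and NY should be within a factor of 2 of each other")
    # Suggested: NZ >= 256 and a power of 2.
    if nz < 256 or not is_power_of_two(nz):
        warnings.append(f"NZ={nz} should be a power of 2 and >= 256")
    # Suggested: 512 threads per block.
    if threads != 512:
        warnings.append(f"BLOCK_SIZE_Y*BLOCK_SIZE_Z={threads} is suggested to be 512")

    return errors, warnings


# Grid and decomposition from the PBS script; BLOCK_SIZE_Z = 256 is an assumed value.
errors, warnings = check_fdq_config(224, 224, 1024, 2, 2, block_y=2, block_z=256)
print("errors:", errors)
print("warnings:", warnings)
```

Returning errors and warnings as separate lists keeps the hard requirements distinct from the performance suggestions, so a submit script can stop on the first and only report the second.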

See Also