UCVMC La Habra mesh generation
== ucvm2mesh-mpi ==
A general solution for creating a large mesh is to partition the target 3D mesh into a regular grid of smaller segments. The segments are then processed in a divide-and-conquer manner, and the results are gathered into a single file.
UCVMC17.1's ucvm2mesh-mpi maps the target 3D mesh onto a regular grid based on the parameters supplied in the mesh configuration file. The parameters px, py, and pz give the number of partitions in the x, y, and z directions. A unit of work, a process or rank, is defined as one cell of this partition grid. Therefore, there are exactly px * py * pz ranks within the 3D mesh.
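The exact rank numbering is defined inside ucvm2mesh-mpi itself; purely as an illustration of the arithmetic (assuming ranks are numbered fastest along x, then y, then z, which may not match the tool's internal ordering), a minimal Python sketch:

<pre>
# Illustrative sketch only: maps a rank index to its (i, j, k) partition cell,
# assuming ranks vary fastest along x, then y, then z. The real ordering is
# defined inside ucvm2mesh-mpi and may differ.
px, py, pz = 18, 18, 64            # partition counts from the La Habra config below
total_ranks = px * py * pz         # one rank per partition cell

def rank_to_partition(rank):
    k, rem = divmod(rank, px * py)
    j, i = divmod(rem, px)
    return i, j, k

print(total_ranks)                 # 20736
print(rank_to_partition(0))        # (0, 0, 0)
print(rank_to_partition(px * py))  # (0, 0, 1): first rank of the second z layer
</pre>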
ucvm2mesh-mpi starts up a number of MPI processes equal to the number of ranks required for the target mesh. All processes run concurrently and write their results to the same output file at the position determined by their rank.
This solution is simple but does not scale well when the target mesh is extremely large.
== ucvm2mesh_mpi_layer ==
For UCVMC18.5, a new mesh generator based on the original ucvm2mesh-mpi was developed. It allows an MPI process to take on multiple rank tasks (one after another) and allows the user to restrict processing to a region of interest.
A layer is defined as the group of rank tasks that fall within one z partition; i.e., for pz partitions there are pz layers, and each layer contains px * py rank tasks.
<pre>
rank_per_layer = px * py
starting_rank_layer = 1
number_of_layers = 1

newly added command parameters:

-l starting_rank_layer
-c number_of_layers

start_rank = (starting_rank_layer - 1) * rank_per_layer
end_rank = start_rank + (number_of_layers * rank_per_layer) - 1
</pre>
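As a worked example, with the La Habra partitioning used below (px=18, py=18, pz=64) and the layer range used in the batch script (-l 33 -c 16), the selected rank range works out as follows; this is simply the arithmetic above restated in Python:

<pre>
# Worked example of the -l / -c rank-range arithmetic, using the
# La Habra partitioning (px=18, py=18, pz=64) and -l 33 -c 16.
px, py = 18, 18
rank_per_layer = px * py                                          # 324 rank tasks per layer

starting_rank_layer = 33                                          # -l 33
number_of_layers = 16                                             # -c 16

start_rank = (starting_rank_layer - 1) * rank_per_layer           # 10368
end_rank = start_rank + (number_of_layers * rank_per_layer) - 1   # 15551
print(start_rank, end_rank)
</pre>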
== La Habra at Blue Waters (NCSA) ==
The resulting mesh is about 2.2 TB in size. The output directory should be striped to improve I/O performance.
<pre>
cd /path/to/working/directory
lfs setstripe -c 4 ./
</pre>
=== bw_la_habra_mesh.conf ===
<pre>
#List of CVMs to query
ucvmlist=cvmsi

# UCVM conf file
ucvmconf=/u/sciteam/meisu/scratch/UCVMC_TEST/ucvm.conf

# Gridding cell centered or vertex
gridtype=CENTER

# Spacing of cells
spacing=20.0

# Projection
proj=+proj=utm +datum=WGS84 +zone=11
rot=-39.9
x0=-119.288842
y0=34.120549
z0=0.0

# Number of cells along each dim
nx=9000
ny=6750
nz=3072

# Partitioning of grid among processors
px=18
py=18
pz=64

# Vs/Vp minimum
vp_min=0
vs_min=0

# Mesh and grid files, format
meshfile=/u/sciteam/meisu/scratch/UCVM_REVIEW/RESULT/bw_la_habra_mesh.media
gridfile=/u/sciteam/meisu/scratch/UCVM_REVIEW/RESULT/bw_la_habra_mesh.grid
meshtype=IJK-12

# Location of scratch dir
scratch=/u/sciteam/meisu/scratch
</pre>
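As a rough sanity check on the ~2.2 TB figure quoted above, and assuming the IJK-12 mesh format stores 12 bytes per grid point (three 4-byte floats), the expected size of the media file can be estimated from nx, ny, and nz:

<pre>
# Rough size estimate for the mesh produced by the config above.
# Assumes IJK-12 stores 12 bytes per grid point (three 4-byte floats).
nx, ny, nz = 9000, 6750, 3072
bytes_per_point = 12

total_bytes = nx * ny * nz * bytes_per_point
print(total_bytes / 1e12, "TB")   # ~2.24 TB, consistent with the ~2.2 TB quoted above
</pre>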
=== bw_la_habra_mesh.pbs ===
<pre>
#!/bin/bash
#PBS -l walltime=24:00:00,nodes=81:ppn=8:xe
#PBS -A baln
#PBS -e /u/sciteam/meisu/scratch/UCVM_REVIEW/RESULT/bw_la_habra_mesh.err
#PBS -o /u/sciteam/meisu/scratch/UCVM_REVIEW/RESULT/bw_la_habra_mesh.out

## prepend velocity model's lib
export TEST_UCVMC_TARGET=/projects/sciteam/baln/meisu/TARGET_UCVMC
export UCVM_SRC_PATH=$TEST_UCVMC_TARGET/UCVMC
export UCVM_INSTALL_PATH=$TEST_UCVMC_TARGET/install

if [ $LD_LIBRARY_PATH ] ; then
  export LD_LIBRARY_PATH=$UCVM_INSTALL_PATH/lib/euclid3/lib:$UCVM_INSTALL_PATH/lib/proj-4/lib:$UCVM_INSTALL_PATH/model/cvms426/lib:$UCVM_INSTALL_PATH/model/cencal/lib:$LD_LIBRARY_PATH
else
  export LD_LIBRARY_PATH=$UCVM_INSTALL_PATH/lib/euclid3/lib:$UCVM_INSTALL_PATH/lib/proj-4/lib:$UCVM_INSTALL_PATH/model/cvms426/lib:$UCVM_INSTALL_PATH/model/cencal/lib
fi
#echo $LD_LIBRARY_PATH

export TEST_TOP_PATH=/u/sciteam/meisu/scratch/UCVMC_TEST
export TEST_SRC_PATH=${TEST_TOP_PATH}/ncsa/ucvm2mesh

#################################################################
cd $PBS_O_WORKDIR

cp ${UCVM_INSTALL_PATH}/bin/ucvm2mesh_mpi_layer .
cp ${TEST_SRC_PATH}/bw_la_habra_mesh.conf .

#aprun -n 648 ./ucvm2mesh_mpi_layer -f bw_la_habra_mesh.conf -l 1 -c 16
#aprun -n 648 ./ucvm2mesh_mpi_layer -f bw_la_habra_mesh.conf -l 17 -c 16
aprun -n 648 ./ucvm2mesh_mpi_layer -f bw_la_habra_mesh.conf -l 33 -c 16
#aprun -n 648 ./ucvm2mesh_mpi_layer -f bw_la_habra_mesh.conf -l 49 -c 16

echo "Jobs done"
exit 0
</pre>
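Each submission above runs 648 MPI processes against a 16-layer slice of the 64-layer mesh, so each process works through several rank tasks in sequence, and four submissions (-l 1, 17, 33, 49) cover the whole mesh. A sketch of that bookkeeping, assuming the rank tasks are divided evenly across the processes:

<pre>
# How one PBS submission above divides the work, assuming the
# 16 * 324 rank tasks are spread evenly over the 648 MPI processes.
px, py, pz = 18, 18, 64
rank_per_layer = px * py                            # 324
layers_per_job = 16                                 # -c 16 on the aprun line
mpi_processes = 648                                 # aprun -n 648

tasks_in_job = layers_per_job * rank_per_layer      # 5184 rank tasks per submission
tasks_per_process = tasks_in_job // mpi_processes   # 8 tasks handled by each process
jobs_needed = pz // layers_per_job                  # 4 submissions: -l 1, 17, 33, 49
print(tasks_in_job, tasks_per_process, jobs_needed)
</pre>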
Timing results:
<pre>
Application 67323987 resources: utime ~14942482s, stime ~54928s, Rss ~2990920, inblocks ~2955759907, outblocks ~1099299238

Application 67504456 resources: utime ~11647705s, stime ~54663s, Rss ~2991316, inblocks ~2471215624, outblocks ~1093604006

Application 67524237 resources: utime ~11985846s, stime ~54351s, Rss ~2992688, inblocks ~2471207932, outblocks ~1093604006

Application 67543579 resources: utime ~13435327s, stime ~57987s, Rss ~2992956, inblocks ~2471215635, outblocks ~1093604006
</pre>
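Assuming utime in these aprun resource lines is the user CPU time summed over all 648 processes of a submission (the numbers themselves are taken directly from the listing above), the average CPU time per process can be estimated as:

<pre>
# Rough interpretation of the aprun resource lines above, assuming utime is
# the user CPU time summed over all 648 MPI processes of a submission.
utimes_s = [14942482, 11647705, 11985846, 13435327]   # utime from the four runs
mpi_processes = 648

for utime in utimes_s:
    hours_per_process = utime / mpi_processes / 3600
    print(round(hours_per_process, 1), "CPU-hours per process")
# roughly 5 to 6.5 CPU-hours per process per submission
</pre>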