UCVM v25.7 with external model data directory CVM_LARGEDATA_DIR

From SCECpedia

Goal

As the number of CVMs in UCVM with large data sizes increases, building and installing the UCVM core code takes longer and becomes less reliable. UCVM v25.7 introduces an optional setup that separates most model data into a centralized location, which is accessed at build time and at runtime via symbolic links.

As a benefit, when a UCVM instance is built as part of a Docker container, e.g. within CVM explorer, this data directory can be mounted from outside into the running container instead of being packed into the image itself. This makes it possible to create a UCVM Docker image that supports multiple CVMs.

Before building UCVM

Create an accessible directory, CVM_DATASET_DIRECTORY, with the expected directory structure shown below (431G in total):

/somepath/CVM_DATASET_DIRECTORY

./model:
canvas   cencal  cvmh      cvmhlabn  cvmhsbbn   cvmhsgbn  cvmhstbn  cvms5_s5    cvmsi_i26  uwlinca
cca_i06  cs248   cvmhibbn  cvmhrbn   cvmhsbcbn  cvmhsmbn  cvmhvbn   cvmsi_cvms  sfcvm      uwsfbcvm

./model/canvas:
vp.dat  vs.dat

./model/cca_i06:
density.dat  vp.dat  vs.dat

./model/cencal:
USGSBayAreaVM-08.3.0.etree  USGSBayAreaVMExt-08.3.0.etree

./model/cs248:
density.dat  vp.dat  vs.dat

./model/cvmh:
base@@        CVM_CM.vo     CVM_HR.vo    CVMSM_flags@@  cvm_vs30_wills.hdr  model_top@@  tsurf
BASE.gts      CVM_CM_VP@@   CVM_HR_VP@@  CVMSM_tag66@@  cvm_vs30_wills.mdl  moho@@
BATO.gts      CVM_CM_VS@@   CVM_HR_VS@@  CVMSM_vp66@@   fromOutside         MOHO.gts
CVM_CM_TAG@@  CVM_HR_TAG@@  CVM_LR.vo    CVMSM_vs66@@   interfaces.vo       topo_dem@@

./model/cvmh/tsurf:
CMxVM_Model3D_CalMex_BATO.ts      CMxVM_Model3D_CM_BASE_Folded.ts  CVMH_CalMex_BATO.ts  CVMH_Moho.ts
CMxVM_Model3D_CM_BASE_Folded.dxf  CVMH_Basement64.ts               CVMH_Moho64.ts

./model/cvmhibbn:
base@@        CVMHB-Inner-Borderland-Basin.dat            CVMSM_flags@@  model_top@@
CVM_CM_TAG@@  CVMHB-Inner-Borderland-Basin_tag61_basin@@  CVMSM_tag66@@  moho@@
CVM_CM.vo     CVMHB-Inner-Borderland-Basin.vo             CVMSM_vp66@@   topo_dem@@
CVM_CM_VP@@   CVMHB-Inner-Borderland-Basin_vp63_basin@@   CVMSM_vs66@@
CVM_CM_VS@@   CVMHB-Inner-Borderland-Basin_vs63_basin@@   interfaces.vo

./model/cvmhlabn:
base@@        CVMHB-Los-Angeles-Basin.dat            CVMSM_flags@@  model_top@@
CVM_CM_TAG@@  CVMHB-Los-Angeles-Basin_tag61_basin@@  CVMSM_tag66@@  moho@@
CVM_CM.vo     CVMHB-Los-Angeles-Basin.vo             CVMSM_vp66@@   topo_dem@@
CVM_CM_VP@@   CVMHB-Los-Angeles-Basin_vp63_basin@@   CVMSM_vs66@@
CVM_CM_VS@@   CVMHB-Los-Angeles-Basin_vs63_basin@@   interfaces.vo

./model/cvmhrbn:
base@@        CVM_CM_VS@@                      CVMHB-Ridge-Basin_vp63_basin@@  CVMSM_vp66@@   moho@@
CVM_CM_TAG@@  CVMHB-Ridge-Basin.dat            CVMHB-Ridge-Basin_vs63_basin@@  CVMSM_vs66@@   topo_dem@@
CVM_CM.vo     CVMHB-Ridge-Basin_tag61_basin@@  CVMSM_flags@@                   interfaces.vo
CVM_CM_VP@@   CVMHB-Ridge-Basin.vo             CVMSM_tag66@@                   model_top@@

./model/cvmhsbbn:
base@@        CVMHB-San-Bernardino-Basin.dat            CVMSM_flags@@  model_top@@
CVM_CM_TAG@@  CVMHB-San-Bernardino-Basin_tag61_basin@@  CVMSM_tag66@@  moho@@
CVM_CM.vo     CVMHB-San-Bernardino-Basin.vo             CVMSM_vp66@@   topo_dem@@
CVM_CM_VP@@   CVMHB-San-Bernardino-Basin_vp63_basin@@   CVMSM_vs66@@
CVM_CM_VS@@   CVMHB-San-Bernardino-Basin_vs63_basin@@   interfaces.vo

./model/cvmhsbcbn:
base@@        CVMHB-Santa-Barbara-Channel-Basin.dat            CVMSM_flags@@  model_top@@
CVM_CM_TAG@@  CVMHB-Santa-Barbara-Channel-Basin_tag61_basin@@  CVMSM_tag66@@  moho@@
CVM_CM.vo     CVMHB-Santa-Barbara-Channel-Basin.vo             CVMSM_vp66@@   topo_dem@@
CVM_CM_VP@@   CVMHB-Santa-Barbara-Channel-Basin_vp63_basin@@   CVMSM_vs66@@
CVM_CM_VS@@   CVMHB-Santa-Barbara-Channel-Basin_vs63_basin@@   interfaces.vo

./model/cvmhsgbn:
base@@        CVMHB-San-Gabriel-Basin.dat            CVMSM_flags@@  model_top@@
CVM_CM_TAG@@  CVMHB-San-Gabriel-Basin_tag61_basin@@  CVMSM_tag66@@  moho@@
CVM_CM.vo     CVMHB-San-Gabriel-Basin.vo             CVMSM_vp66@@   topo_dem@@
CVM_CM_VP@@   CVMHB-San-Gabriel-Basin_vp63_basin@@   CVMSM_vs66@@
CVM_CM_VS@@   CVMHB-San-Gabriel-Basin_vs63_basin@@   interfaces.vo

./model/cvmhsmbn:
base@@        CVMHB-Santa-Maria-Basin.dat            CVMSM_flags@@  model_top@@
CVM_CM_TAG@@  CVMHB-Santa-Maria-Basin_tag61_basin@@  CVMSM_tag66@@  moho@@
CVM_CM.vo     CVMHB-Santa-Maria-Basin.vo             CVMSM_vp66@@   topo_dem@@
CVM_CM_VP@@   CVMHB-Santa-Maria-Basin_vp63_basin@@   CVMSM_vs66@@
CVM_CM_VS@@   CVMHB-Santa-Maria-Basin_vs63_basin@@   interfaces.vo

./model/cvmhstbn:
base@@        CVMHB-Salton-Trough-Basin.dat            CVMSM_flags@@  model_top@@
CVM_CM_TAG@@  CVMHB-Salton-Trough-Basin_tag61_basin@@  CVMSM_tag66@@  moho@@
CVM_CM.vo     CVMHB-Salton-Trough-Basin.vo             CVMSM_vp66@@   topo_dem@@
CVM_CM_VP@@   CVMHB-Salton-Trough-Basin_vp63_basin@@   CVMSM_vs66@@
CVM_CM_VS@@   CVMHB-Salton-Trough-Basin_vs63_basin@@   interfaces.vo

./model/cvmhvbn:
base@@        CVM_CM_VS@@                        CVMHB-Ventura-Basin_vp63_basin@@  CVMSM_vp66@@   moho@@
CVM_CM_TAG@@  CVMHB-Ventura-Basin.dat            CVMHB-Ventura-Basin_vs63_basin@@  CVMSM_vs66@@   topo_dem@@
CVM_CM.vo     CVMHB-Ventura-Basin_tag61_basin@@  CVMSM_flags@@                     interfaces.vo
CVM_CM_VP@@   CVMHB-Ventura-Basin.vo             CVMSM_tag66@@                     model_top@@

./model/cvms5_s5:
vp.dat  vs.dat

./model/cvmsi_cvms:
3D.out      ivmod.edge        Makefile     q12y_edge        sgeod.h       soil.pgm    te6A_edge  tsq4_sur2
b1___edge   ivsurfaced.h      Makefile.am  q12y_sur2        sgeo.h        sp9b_edge   te6A_sur2  tsq5_edge
b1___sur2   ivsurface.h       Makefile.in  q12z_edge        sgmo_edge     sp9b_sur2   te6B_edge  tsq5_sur2
b2___edge   ku1__edge         mantled.h    q12z_sur2        sgmo_sur2     spu1_edge   te6B_sur2  tsq7_edge
b2___sur2   ku1__sur2         mantle.h     qps1_edge        sgre_edge     spu1_sur2   te7__edge  tsq7_sur2
b3___edge   ku2__edge         moho1.h      qps1_sur2        sgre_sur2     spu9_edge   te7__sur2  tsq9_edge
b3___sur2   ku2__sur2         moho_sur     qps2_edge        sku2_edge     spu9_sur2   te8__edge  tsq9_sur2
b4___edge   ku3__edge         names.h      qps2_sur2        sku2_sur2     st4b_edge   te8__sur2  tv1__edge
b4___sur2   ku3__sur2         nsbb_edge    qps5_edge        smb1_edge     st4b_sur2   tj1__edge  tv1__sur2
b5___edge   ku4__edge         nsbb_sur2    qps5_sur2        smb2_edge     st4s_edge   tj1__sur2  tv2__edge
b5___sur2   ku4__sur2         params.h     qps6_edge        smb2_sur2     st4s_sur2   tj2__edge  tv2__sur2
bmod_edge   ku5__edge         pu1__edge    qps6_sur2        smb3_edge     ste2_edge   tj2__sur2  tv3__edge
borehole.h  ku5__sur2         pu1__sur2    regionald.h      smb3_sur2     ste2_sur2   tj3__edge  tv3__sur2
boreholes   ku8__edge         pu2A_edge    regional.h       smb9_edge     surfaced.h  tj3__sur2  tv5__edge
cvms.h      ku8__sur2         pu2A_sur2    salton_base.sur  smb9_sur2     surface.h   tj4__edge  tv5__sur2
cvms_sub.f  laba_edge         pu2B_edge    sbb2_edge        smm1_edge     te1__edge   tj4__sur2  tv9__edge
cvms_sub.o  laba_sur2         pu2B_sur2    sbb2_sur2        smm1_sur2     te1__sur2   tj5__edge  tv9__sur2
dim2.h      lab_geo2_geology  pu3__edge    sbb__edge        smm2_edge     te2__edge   tj5__sur2  version.h
dim8.h      labup.h           pu3__sur2    sbb__sur2        smm2_sur2     te2__sur2   tsq1_edge  wtbh1d.h
eh.modPS    lamo_edge         pu9__edge    sbmi_edge        smr1_edge     te3__edge   tsq1_sur2  wtbh1.h
genprod.h   lamo_sur2         pu9__sur2    sbmi_sur2        smr1_sur2     te3__sur2   tsq2_edge  wtbh2.h
genpro.h    lare_edge         q12b_edge    sbmo_edge        smr2_edge     te4__edge   tsq2_sur2  wtbh3.h
impva.edge  lare_sur2         q12b_sur2    sbmo_sur2        smr2_sur2     te4__sur2   tsq3_edge
in.h        laup_edge         q12x_edge    sgba_edge        soil1.h       te5__edge   tsq3_sur2
innum.h     laup_sur2         q12x_sur2    sgba_sur2        soil_generic  te5__sur2   tsq4_edge

./model/cvmsi_i26:
box.dat  cvmsi.bin  cvmsi.ver  region_spec.in  XYZGRD

./model/sfcvm:
USGS_SFCVM_v21-0_regional.h5  USGS_SFCVM_v21-1_detailed.h5

./model/uwlinca:
vp.dat

./model/uwsfbcvm:
easting.dat  northing.dat  vp.dat  vs.dat
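
Before building, it can help to confirm that the data directory has the layout listed above. The following is a minimal sketch, not part of UCVM: the function name check_cvm_data_dir is hypothetical, and it only checks that a model/<name> subdirectory exists for each model you pass in, not that the files inside are complete.

```shell
# check_cvm_data_dir DIR MODEL...
# Hypothetical helper: prints "missing: MODEL" for every requested model
# that has no DIR/model/MODEL subdirectory, and returns non-zero if any
# model is missing.
check_cvm_data_dir() {
    data_dir=$1
    shift
    missing=0
    for model in "$@"; do
        if [ ! -d "$data_dir/model/$model" ]; then
            echo "missing: $model"
            missing=1
        fi
    done
    return "$missing"
}
```

Call it with the data directory as the first argument, followed by the models you plan to build, for example: check_cvm_data_dir /path/to/data cvmh cvms5_s5 sfcvm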

Building UCVM

Set the data location environment variable, in addition to UCVM_SRC_PATH and UCVM_INSTALL_PATH:

export CVM_LARGEDATA_DIR=/somepath/CVM_DATASET_DIRECTORY
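
Since the build relies on three environment variables, a small guard run before the build can catch a forgotten export. This is only a sketch under the assumption of a POSIX shell; the helper name require_env is hypothetical and is not provided by UCVM.

```shell
# require_env NAME...
# Hypothetical helper: prints "not set: NAME" for every listed variable
# that is unset or empty, and returns non-zero if any is.
require_env() {
    status=0
    for name in "$@"; do
        # Indirect lookup of the variable whose name is stored in $name.
        eval "value=\${$name:-}"
        if [ -z "$value" ]; then
            echo "not set: $name"
            status=1
        fi
    done
    return "$status"
}
```

For example: require_env UCVM_SRC_PATH UCVM_INSTALL_PATH CVM_LARGEDATA_DIR || exit 1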

This environment variable is handled in the configure.ac and Makefile.am files in each CVM's top-level and data directories.

The basic data source logic in mycvm/configure.ac is:

  If CVM_LARGEDATA_DIR is defined:
       If CVM_IN_DOCKER is defined:
            set WITH_mycvm_CVM_LARGEDATA_DIR
       If CVM_IN_DOCKER is not defined:
            check for a specific model data file before enabling WITH_mycvm_CVM_LARGEDATA_DIR
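
The decision described above can be sketched in plain shell. This is an illustration only, not the actual configure.ac code: the function name use_largedata, the model/<name>/<file> path layout, and the choice of which file to test for are assumptions.

```shell
# use_largedata MODEL SENTINEL_FILE
# Hypothetical helper: echoes "yes" when MODEL should use the shared data
# directory (the WITH_mycvm_CVM_LARGEDATA_DIR case), otherwise "no".
use_largedata() {
    model=$1
    sentinel=$2
    if [ -z "${CVM_LARGEDATA_DIR:-}" ]; then
        # No shared data directory configured.
        echo no
    elif [ -n "${CVM_IN_DOCKER:-}" ]; then
        # Image build: the data is mounted only at run time, so the
        # file check is skipped.
        echo yes
    elif [ -f "$CVM_LARGEDATA_DIR/model/$model/$sentinel" ]; then
        # Normal build: enable only if the expected data file is present.
        echo yes
    else
        echo no
    fi
}
```

For example, use_largedata cvms5_s5 vp.dat prints yes only if CVM_LARGEDATA_DIR is set and either CVM_IN_DOCKER is set or model/cvms5_s5/vp.dat exists under it.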

The basic data source logic in mycvm/data/Makefile.am is:

  If WITH_mycvm_CVM_LARGEDATA_DIR is defined:
          link to the matching dataset in the local data directory
  If WITH_mycvm_CVM_LARGEDATA_DIR is not defined:
          download the dataset from the remote source (CARC/Globus endpoint)
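
The two branches can be sketched in shell as follows. This is an assumption-laden illustration, not the actual Makefile.am rule: the function name provide_model_data and the model/<name> source path are hypothetical, and the download branch is left as a placeholder because the remote endpoint details are not given here.

```shell
# provide_model_data MODEL DEST USE_LOCAL
# Hypothetical helper: when USE_LOCAL is "yes", makes DEST a symbolic link
# to MODEL's directory under CVM_LARGEDATA_DIR; otherwise reports that a
# download would be needed.
provide_model_data() {
    model=$1
    dest=$2
    use_local=$3
    if [ "$use_local" = yes ]; then
        # ln -s does not require the target to exist, so the link can be
        # created even when the data is mounted only later.
        # -f and -n replace an existing link instead of following it.
        ln -sfn "$CVM_LARGEDATA_DIR/model/$model" "$dest"
    else
        # Placeholder for the remote download step.
        echo "download needed: $model"
    fi
}
```

For example, provide_model_data cvmh ./cvmh yes creates ./cvmh as a link into the shared data directory.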

Not every CVM needs this data source logic: its data may be sparse or small, or the model may be rule-based.

The CVM_IN_DOCKER environment variable is needed because the data directory is mounted when the container is run, not when the UCVM image is being built, so the model data file check cannot be performed at build time.


Example: CVM explorer

Inside the container there is a local path to this large data directory, so the variable needs to be set during the Docker Compose image build for the CVMs in UCVM to pick it up.

So, in cvm_web/setup_cvm_web.sh, add:

  export CVM_LARGEDATA_DIR=/usr/local/share/cvm-largedata-dir
  export CVM_IN_DOCKER='#'

and in Dockerfile-web, add:

  ENV CVM_LARGEDATA_DIR=/usr/local/share/cvm-largedata-dir
  ENV CVM_IN_DOCKER='#'

and in development.yml, add the volume mount so that a running container can find the data directory:

 web:
  volumes:
   - ./:/app
   - ./custom-php.ini:/etc/php.d/custom-php.ini
   - ./cvm-result:/usr/local/share/ucvm/cvm-result
   - /var/www/html/CVM_DATASET_DIRECTORY:/usr/local/share/cvm-largedata-dir:ro
  restart: unless-stopped
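
If the volume is not mounted, any symbolic links that point into the data directory will dangle inside the container. A quick way to spot this from a shell in the running container is sketched below. The function name list_dangling_links is hypothetical, and the directory to scan is left as an argument because the installed model locations depend on your UCVM install path.

```shell
# list_dangling_links DIR
# Hypothetical helper: prints every symbolic link under DIR whose target
# does not exist (test -e follows the link, so it fails for broken ones).
list_dangling_links() {
    find "$1" -type l ! -exec test -e {} \; -print
}
```

An empty result for the directory holding the installed models suggests the mount is in place and the links resolve.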


Example: Docker image