UCVM on Frontier

From SCECpedia

Testing the UCVM installation on Frontier

We are implementing tests of the development version of UCVM used for CyberShake NorCal.

Building Process

Building on the head node is very slow, so the build should run on a compute node. Compute nodes are not network accessible, however, so run git clone and the largefile downloads on the head node first, then request a compute node when you are ready to run make.

Define INSTALL PATH

UCVM_INSTALL_PATH /lustre/orion/proj-shared/geo156/pmaech/scratch/TARGET_UCVM_SFCVM/ucvm_install
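As a minimal sketch, the install path can be exported and sanity-checked before any build step runs. The helper name check_install_path is hypothetical (it is not part of UCVM), and the path is the one shown on this page; substitute your own project and user directories.

```shell
# Hypothetical helper (not part of UCVM): reject an empty or relative install path.
check_install_path() {
    candidate="$1"
    if [ -z "$candidate" ]; then
        echo "UCVM_INSTALL_PATH is not set" >&2
        return 1
    fi
    case "$candidate" in
        /*) ;;   # absolute path, accepted
        *)  echo "UCVM_INSTALL_PATH must be an absolute path: $candidate" >&2
            return 1 ;;
    esac
    echo "$candidate"
}

# Path taken from this page; adjust for your own account.
export UCVM_INSTALL_PATH=/lustre/orion/proj-shared/geo156/pmaech/scratch/TARGET_UCVM_SFCVM/ucvm_install
check_install_path "$UCVM_INSTALL_PATH"
```

Checking for an absolute path guards against the install landing in whatever directory the script happens to be run from.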

Setup Frontier Modules

[login03.frontier ~]$ module list

Currently Loaded Modules:
  1) craype-x86-trento                       7) cray-dsmml/0.2.2       13) darshan-runtime/3.4.0
  2) craype-network-ofi                      8) cray-libsci/22.12.1.1  14) hsi/default
  3) perftools-base/22.12.0                  9) PrgEnv-cray/8.3.3      15) lfs-wrapper/0.0.1
  4) xpmem/2.6.2-2.5_2.22__gd067c3f.shasta  10) cray-python/3.9.13.1   16) DefApps/default
  5) cray-pmi/6.1.8                         11) libfabric/1.15.2.0     17) libtool/2.4.6
  6) craype/2.7.19                          12) gcc/10.3.0             18) cray-mpich/8.1.23

UCVM is typically built by keeping the default modules and loading these in addition:

  • module load cray-python
  • module load libtool/2.4.6
  • module load libfabric
  • module load gcc/10.3.0

Testing with two accounts showed that this setup built the UCVM binaries and that all tests except CCA passed.
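The module loads listed above could be wrapped in a small function so that the same set is applied in every session. This is only a sketch: load_ucvm_modules and UCVM_MODULES are hypothetical names, and the function assumes the system's module command is available in the current shell.

```shell
# Modules named on this page, loaded on top of the Frontier defaults.
UCVM_MODULES="cray-python libtool/2.4.6 libfabric gcc/10.3.0"

# Hypothetical helper: load each module in turn and stop at the first failure.
load_ucvm_modules() {
    # `module` comes from the system's module environment; bail out if it is absent.
    if ! command -v module >/dev/null 2>&1; then
        echo "module command not found; is this a Frontier node?" >&2
        return 1
    fi
    for m in $UCVM_MODULES; do
        module load "$m" || { echo "failed to load $m" >&2; return 1; }
    done
}

# Typical use on Frontier (not executed here):
#   load_ucvm_modules && module list
```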

Example Installation

A UCVM installation has been built on Frontier at: /ccs/home/mei/scratch/TARGET_UCVM_SFCVM/ucvm_install

   source conf/ucvm_env.sh
   which ucvm_query
   ucvm_query -H

The next step is to run test/run_testing, which performs some basic unit tests.
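Before running the tests, it can help to confirm that sourcing ucvm_env.sh actually put ucvm_query on the PATH. The sketch below assumes nothing beyond the commands shown above; check_ucvm_env is a hypothetical helper name, not a UCVM tool.

```shell
# Hypothetical helper: report whether ucvm_query is reachable in this shell.
check_ucvm_env() {
    if command -v ucvm_query >/dev/null 2>&1; then
        echo "ucvm_query found at: $(command -v ucvm_query)"
    else
        echo "ucvm_query not on PATH; source conf/ucvm_env.sh from the install directory first" >&2
        return 1
    fi
}

# Typical use (not executed here):
#   cd "$UCVM_INSTALL_PATH" && . conf/ucvm_env.sh && check_ucvm_env
```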

Install Script on Frontier

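#!/bin/bash
#
# Clone, stage, and build UCVM (withSFCVM branch) under ./scratch.
#
hn=`hostname -d`   # DNS domain name of this host (not used below)
ppwd=`pwd`

export MY_TOP=$ppwd/scratch

export TOP_UCVM_TARGET=$MY_TOP/TARGET_UCVM_SFCVM
export UCVM_SRC_PATH=$TOP_UCVM_TARGET/UCVM
export UCVM_INSTALL_PATH=$TOP_UCVM_TARGET/ucvm_install
export UCVM_SALLOC_ENV="-A geo156 -q debug"
# Add the libfabric directory to the library search paths
export LD_LIBRARY_PATH=/opt/cray/libfabric/1.15.2.0/lib64:$LD_LIBRARY_PATH
export LIBRARY_PATH=/opt/cray/libfabric/1.15.2.0/lib64:$LIBRARY_PATH

# Start from a clean target directory; -p also creates $MY_TOP if it is missing
rm -rf "$TOP_UCVM_TARGET"
mkdir -p "$TOP_UCVM_TARGET"

cd "$TOP_UCVM_TARGET" || exit 1
git clone https://github.com/SCECcode/ucvm.git -b withSFCVM UCVM

# Download and stage the model largefiles (requires network access)
cd "$UCVM_SRC_PATH/largefiles" || exit 1
./get_largefiles.py -m sfcvm,cca,cvmsi,cvms
./stage_largefiles.py

# Run the UCVM setup script; output goes to ucvm_setup_install.log
cd "$UCVM_SRC_PATH" || exit 1
./ucvm_setup.py -d -a -p "$UCVM_INSTALL_PATH" &> ucvm_setup_install.log

# Run make check; output goes to make_check.log
make check &> make_check.log

echo "..EXITING.."
exit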

Interactive session to run Acceptance Tests

salloc -A geo156 -N 1 -t 1:30:00 -J UCVM_Tests -q debug
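As a sketch, the salloc request can be assembled from variables so the account and queue are set in one place and the command can be reviewed before it is run. UCVM_SALLOC_ENV matches the variable in the install script on this page; build_salloc_cmd is a hypothetical helper that only prints the command.

```shell
# Values copied from this page; adjust the account and queue for your allocation.
UCVM_SALLOC_ENV="-A geo156 -q debug"

# Hypothetical helper: print (not run) a salloc command for N nodes, a time limit, and a job name.
build_salloc_cmd() {
    nodes="$1"; walltime="$2"; jobname="$3"
    echo "salloc $UCVM_SALLOC_ENV -N $nodes -t $walltime -J $jobname"
}

# Prints the request for one node, 90 minutes, job name UCVM_Tests.
build_salloc_cmd 1 1:30:00 UCVM_Tests
```

The printed command carries the same options as the salloc line above, only in a different order.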

Frontier library not loading

Copyright (C) 2020 Free Software Foundation, Inc.
This is free software; see the source for copying conditions.  There is NO
warranty; not even for MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.

configure:3707: $? = 0
configure:3696: mpicc -v >&5
mpicc for MPICH version 8.1.23
Using built-in specs.
COLLECT_GCC=/opt/cray/pe/gcc/10.3.0/bin/../snos/bin/gcc
COLLECT_LTO_WRAPPER=/opt/cray/pe/gcc/10.3.0/snos/libexec/gcc/x86_64-suse-linux/10.3.0/lto-wrapper
Target: x86_64-suse-linux
Configured with: ../cpe-gcc-10.3.0-202104220029.0777bcc28ac1d/configure --prefix=/opt/cray/pe/gcc/10.3.0/snos --disable-nls --libdir=/opt/cray/pe/gcc/10.3.0/snos/lib --enable-languages=c,c++,fortran --with-gxx-include-dir=/opt/cray/pe/gcc/10.3.0/snos/include/g++ --with-slibdir=/opt/cray/pe/gcc/10.3.0/snos/lib --with-system-zlib --enable-shared --enable-__cxa_atexit --build=x86_64-suse-linux --with-ppl --with-cloog --disable-multilib
Thread model: posix
Supported LTO compression algorithms: zlib
gcc version 10.3.0 20210408 (Cray Inc.) (GCC) 
configure:3707: $? = 0
configure:3696: mpicc -V >&5
gcc: error: unrecognized command-line option '-V'
configure:3707: $? = 1
configure:3696: mpicc -qversion >&5
gcc: error: unrecognized command-line option '-qversion'; did you mean '--version'?
configure:3707: $? = 1
configure:3727: checking whether the C compiler works
configure:3749: mpicc    conftest.c  >&5
/usr/bin/ld: warning: libfabric.so.1, needed by /opt/cray/pe/mpich/8.1.23/ofi/gnu/9.1/lib/libmpi_gnu_91.so, not found (try using -rpath or -rpath-link)
/usr/bin/ld: /opt/cray/pe/mpich/8.1.23/ofi/gnu/9.1/lib/libmpi_gnu_91.so: undefined reference to `fi_version@FABRIC_1.0'
/usr/bin/ld: /opt/cray/pe/mpich/8.1.23/ofi/gnu/9.1/lib/libmpi_gnu_91.so: undefined reference to `fi_dupinfo@FABRIC_1.3'
/usr/bin/ld: /opt/cray/pe/mpich/8.1.23/ofi/gnu/9.1/lib/libmpi_gnu_91.so: undefined reference to `fi_strerror@FABRIC_1.0'
/usr/bin/ld: /opt/cray/pe/mpich/8.1.23/ofi/gnu/9.1/lib/libmpi_gnu_91.so: undefined reference to `fi_freeinfo@FABRIC_1.3'
/usr/bin/ld: /opt/cray/pe/mpich/8.1.23/ofi/gnu/9.1/lib/libmpi_gnu_91.so: undefined reference to `fi_fabric@FABRIC_1.1'
/usr/bin/ld: /opt/cray/pe/mpich/8.1.23/ofi/gnu/9.1/lib/libmpi_gnu_91.so: undefined reference to `fi_getinfo@FABRIC_1.3'
collect2: error: ld returned 1 exit status
configure:3753: $? = 1
configure:3791: result: no
configure: failed program was:
| /* confdefs.h */
| #define PACKAGE_NAME "UCVM"
| #define PACKAGE_TARNAME "ucvm"
| #define PACKAGE_VERSION "22.7.0"
| #define PACKAGE_STRING "UCVM 22.7.0"
| #define PACKAGE_BUGREPORT "software@scec.org"
| #define PACKAGE_URL ""
| #define PACKAGE "ucvm"
| #define VERSION "22.7.0"
| /* end confdefs.h.  */
| 
| int
| main ()
| {
| 
|   ;
|   return 0;
| }
configure:3796: error: in `/lustre/orion/geo156/scratch/dean316/build_UCVM/UCVM':
configure:3798: error: C compiler cannot create executables
See `config.log' for more details
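The linker output above reports that libfabric.so.1 was not found. One quick way to see whether a library search path actually contains it is sketched below; find_lib_in_path is a hypothetical helper, and the check covers only the colon-separated directory list passed to it, not every location the linker may consult.

```shell
# Hypothetical helper: search a colon-separated directory list for a library file.
find_lib_in_path() {
    lib="$1"; dirs="$2"
    old_ifs=$IFS; IFS=:
    for d in $dirs; do
        if [ -n "$d" ] && [ -e "$d/$lib" ]; then
            IFS=$old_ifs
            echo "$d/$lib"      # first directory that contains the library
            return 0
        fi
    done
    IFS=$old_ifs
    return 1
}

# Report whether libfabric.so.1 is visible through LD_LIBRARY_PATH in this shell.
find_lib_in_path libfabric.so.1 "${LD_LIBRARY_PATH:-}" \
    || echo "libfabric.so.1 not found on LD_LIBRARY_PATH"
```

The same call with "${LIBRARY_PATH:-}" as the second argument checks the other variable that the install script on this page sets.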
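This is the problem we are seeing on Frontier: as the config.log excerpt above shows, the linker cannot find libfabric.so.1, which the Cray MPICH library needs, so configure stops with "C compiler cannot create executables". The workaround is to rerun the configure command by hand:

head -10 config.log > r

Edit r so that it contains only the configure command line (without the leading "$"), then run it by hand:

sh ./r

make
make install

and then

./ucvm_setup.py -a -r -d -p YOUR_UCVM_INSTALL_PATH

Finally, check that the conf directory under YOUR_UCVM_INSTALL_PATH contains ucvm_env.sh.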
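The step of pulling the configure command out of config.log can also be done without hand editing. The sketch below assumes the usual Autoconf layout, in which the invocation is recorded near the top of config.log as a line starting with two spaces, a "$", and a space; extract_configure_cmd is a hypothetical helper name.

```shell
# Hypothetical helper: print the configure invocation recorded in an Autoconf config.log.
extract_configure_cmd() {
    log="${1:-config.log}"
    # Strip the leading "  $ " and print only the first matching line.
    sed -n 's/^  \$ //p' "$log" | head -n 1
}

# Typical use in the UCVM source directory (not executed here):
#   extract_configure_cmd config.log > r
#   sh ./r && make && make install
```

Review r before running it: the recorded line may not preserve the original quoting, so arguments that contained spaces may need to be re-quoted by hand.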