UCVM on Frontier
From SCECpedia
Revision as of 07:22, 23 April 2024
Testing the UCVM installation on Frontier
We are implementing tests of the development version of UCVM used for CyberShake NorCal.
Building Process
Building on the head node is very slow, so it looks like we need to build on a compute node. However, compute nodes are not network accessible. So do the git clone and the largefile downloads on the head node; then, when ready to run make, request a compute node.
Define INSTALL PATH
UCVM_INSTALL_PATH /lustre/orion/proj-shared/geo156/pmaech/scratch/TARGET_UCVM_SFCVM/ucvm_install
Setup Frontier Modules
[login03.frontier ~]$ module list

Currently Loaded Modules:
  1) craype-x86-trento                       7) cray-dsmml/0.2.2       13) darshan-runtime/3.4.0
  2) craype-network-ofi                      8) cray-libsci/22.12.1.1  14) hsi/default
  3) perftools-base/22.12.0                  9) PrgEnv-cray/8.3.3      15) lfs-wrapper/0.0.1
  4) xpmem/2.6.2-2.5_2.22__gd067c3f.shasta  10) cray-python/3.9.13.1   16) DefApps/default
  5) cray-pmi/6.1.8                         11) libfabric/1.15.2.0     17) libtool/2.4.6
  6) craype/2.7.19                          12) gcc/10.3.0             18) cray-mpich/8.1.23

UCVM is typically built by keeping the default modules and loading these in addition:
- module load cray-python
- module load libtool/2.4.6
- module load libfabric
- module load gcc/10.3.0
Testing with two accounts showed that this configuration builds the UCVM binaries and that all tests (except CCA) pass.
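The module requirements listed above can be checked before starting a build. The following is only a minimal sketch and is not part of the UCVM distribution: the function name missing_modules is hypothetical, and it assumes that the text it receives has the same "N) name/version" layout as the module list output shown earlier.

```shell
#!/bin/bash
# Hypothetical helper (not part of UCVM): print each required module that
# does not appear in the given `module list` text.
#   $1      = text of `module list`
#   $2...   = required module names, with or without a version suffix
missing_modules() {
    local loaded="$1"
    shift
    local m
    for m in "$@"; do
        # Match "N) name" followed by "/", a space, or end of line.
        if ! printf '%s\n' "$loaded" | grep -Eq "[0-9]+\) ${m}(/| |\$)"; then
            echo "$m"
        fi
    done
}

# Only query the module system if the `module` command exists here.
# The listing may be written to stderr, hence the 2>&1.
if command -v module >/dev/null 2>&1; then
    missing_modules "$(module list 2>&1)" \
        cray-python libtool/2.4.6 libfabric gcc/10.3.0
fi
```

An empty result means that every listed module was found; any name that is printed still needs a module load.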
Example Installation
A UCVM installation has been built on Frontier at: /ccs/home/mei/scratch/TARGET_UCVM_SFCVM/ucvm_install
source conf/ucvm_env.sh
which ucvm_query
ucvm_query -H
The next step is to run test/run_testing, which performs some basic unit tests.
Install Script on Frontier
#!/bin/bash
#
#
hn=`hostname -d`
ppwd=`pwd`

export MY_TOP=$ppwd/scratch
export TOP_UCVM_TARGET=$MY_TOP/TARGET_UCVM_SFCVM
export UCVM_SRC_PATH=$TOP_UCVM_TARGET/UCVM
export UCVM_INSTALL_PATH=$TOP_UCVM_TARGET/ucvm_install
export UCVM_SALLOC_ENV="-A geo156 -q debug"
export LD_LIBRARY_PATH=/opt/cray/libfabric/1.15.2.0/lib64:$LD_LIBRARY_PATH
export LIBRARY_PATH=/opt/cray/libfabric/1.15.2.0/lib64:$LIBRARY_PATH

rm -rf $TOP_UCVM_TARGET
mkdir $TOP_UCVM_TARGET
cd $TOP_UCVM_TARGET
git clone https://github.com/SCECcode/ucvm.git -b withSFCVM UCVM

cd $UCVM_SRC_PATH/largefiles
./get_largefiles.py -m sfcvm,cca,cvmsi,cvms
cd $UCVM_SRC_PATH/largefiles; ./stage_largefiles.py
cd $UCVM_SRC_PATH
./ucvm_setup.py -d -a -p $UCVM_INSTALL_PATH &> ucvm_setup_install.log
cd $UCVM_SRC_PATH; make check &> make_check.log

echo "..EXITING.."
exit
Interactive session to run Acceptance Tests
salloc -A geo156 -N 1 -t 1:30:00 -J UCVM_Tests -q debug
Frontier library not loading
Copyright (C) 2020 Free Software Foundation, Inc.
This is free software; see the source for copying conditions.  There is NO
warranty; not even for MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.

configure:3707: $? = 0
configure:3696: mpicc -v >&5
mpicc for MPICH version 8.1.23
Using built-in specs.
COLLECT_GCC=/opt/cray/pe/gcc/10.3.0/bin/../snos/bin/gcc
COLLECT_LTO_WRAPPER=/opt/cray/pe/gcc/10.3.0/snos/libexec/gcc/x86_64-suse-linux/10.3.0/lto-wrapper
Target: x86_64-suse-linux
Configured with: ../cpe-gcc-10.3.0-202104220029.0777bcc28ac1d/configure --prefix=/opt/cray/pe/gcc/10.3.0/snos --disable-nls --libdir=/opt/cray/pe/gcc/10.3.0/snos/lib --enable-languages=c,c++,fortran --with-gxx-include-dir=/opt/cray/pe/gcc/10.3.0/snos/include/g++ --with-slibdir=/opt/cray/pe/gcc/10.3.0/snos/lib --with-system-zlib --enable-shared --enable-__cxa_atexit --build=x86_64-suse-linux --with-ppl --with-cloog --disable-multilib
Thread model: posix
Supported LTO compression algorithms: zlib
gcc version 10.3.0 20210408 (Cray Inc.) (GCC)
configure:3707: $? = 0
configure:3696: mpicc -V >&5
gcc: error: unrecognized command-line option '-V'
configure:3707: $? = 1
configure:3696: mpicc -qversion >&5
gcc: error: unrecognized command-line option '-qversion'; did you mean '--version'?
configure:3707: $? = 1
configure:3727: checking whether the C compiler works
configure:3749: mpicc conftest.c >&5
/usr/bin/ld: warning: libfabric.so.1, needed by /opt/cray/pe/mpich/8.1.23/ofi/gnu/9.1/lib/libmpi_gnu_91.so, not found (try using -rpath or -rpath-link)
/usr/bin/ld: /opt/cray/pe/mpich/8.1.23/ofi/gnu/9.1/lib/libmpi_gnu_91.so: undefined reference to `fi_version@FABRIC_1.0'
/usr/bin/ld: /opt/cray/pe/mpich/8.1.23/ofi/gnu/9.1/lib/libmpi_gnu_91.so: undefined reference to `fi_dupinfo@FABRIC_1.3'
/usr/bin/ld: /opt/cray/pe/mpich/8.1.23/ofi/gnu/9.1/lib/libmpi_gnu_91.so: undefined reference to `fi_strerror@FABRIC_1.0'
/usr/bin/ld: /opt/cray/pe/mpich/8.1.23/ofi/gnu/9.1/lib/libmpi_gnu_91.so: undefined reference to `fi_freeinfo@FABRIC_1.3'
/usr/bin/ld: /opt/cray/pe/mpich/8.1.23/ofi/gnu/9.1/lib/libmpi_gnu_91.so: undefined reference to `fi_fabric@FABRIC_1.1'
/usr/bin/ld: /opt/cray/pe/mpich/8.1.23/ofi/gnu/9.1/lib/libmpi_gnu_91.so: undefined reference to `fi_getinfo@FABRIC_1.3'
collect2: error: ld returned 1 exit status
configure:3753: $? = 1
configure:3791: result: no
configure: failed program was:
| /* confdefs.h */
| #define PACKAGE_NAME "UCVM"
| #define PACKAGE_TARNAME "ucvm"
| #define PACKAGE_VERSION "22.7.0"
| #define PACKAGE_STRING "UCVM 22.7.0"
| #define PACKAGE_BUGREPORT "software@scec.org"
| #define PACKAGE_URL ""
| #define PACKAGE "ucvm"
| #define VERSION "22.7.0"
| /* end confdefs.h.  */
|
| int
| main ()
| {
|
|   ;
|   return 0;
| }
configure:3796: error: in `/lustre/orion/geo156/scratch/dean316/build_UCVM/UCVM':
configure:3798: error: C compiler cannot create executables
See `config.log' for more details
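The linker warning in this log says that libfabric.so.1 could not be found, which leads to the undefined fi_* references and the "C compiler cannot create executables" error. One way to check whether the library is visible in a given search path is sketched below. This is an illustration only: the function name find_lib_dir is hypothetical, and the sketch only inspects the directories it is given (for example LD_LIBRARY_PATH, to which the install script on this page prepends /opt/cray/libfabric/1.15.2.0/lib64).

```shell
#!/bin/bash
# Hypothetical helper: print the first directory in a colon-separated
# search path that contains the named library file.
# Returns non-zero if no directory contains it.
#   $1 = library file name, e.g. libfabric.so.1
#   $2 = colon-separated list of directories
find_lib_dir() {
    local lib="$1" path="$2" dir
    local IFS=':'            # split the path on colons
    for dir in $path; do
        [ -n "$dir" ] || continue
        if [ -e "$dir/$lib" ]; then
            echo "$dir"
            return 0
        fi
    done
    return 1
}

# Report where (or whether) libfabric.so.1 is found in LD_LIBRARY_PATH.
find_lib_dir libfabric.so.1 "${LD_LIBRARY_PATH:-}" \
    || echo "libfabric.so.1 not found in LD_LIBRARY_PATH" >&2
```

If nothing is found, loading the libfabric module or exporting the library directory as in the install script is the first thing to try before rerunning configure.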
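This is the problem that we are seeing on Frontier. The workaround is:
- head -n 10 config.log > r (the first lines of config.log record the configure command line; cat does not accept a -10 option)
- edit r so that it contains only the configure command
- run ./r to execute the configure command by hand (make r executable first, or use sh ./r)
- make
- make install
- ./ucvm_setup.py -a -r -d -p YOUR_UCVM_INSTALL_PATH
- check that the conf directory of the installation contains ucvm_env.sh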
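The step of recovering the configure command from config.log can also be scripted instead of edited by hand. The sketch below assumes the standard Autoconf layout, in which the header of config.log contains a line of the form "  $ ./configure ..."; the function name configure_invocation is hypothetical. Arguments that originally contained spaces may need to be re-quoted by hand, since config.log does not necessarily preserve the quoting.

```shell
#!/bin/bash
# Hypothetical helper: print the configure invocation recorded in the
# header of an Autoconf config.log, without the leading "  $ " marker.
#   $1 = path to config.log
configure_invocation() {
    local log="$1"
    # Look only at the header and keep the first matching line.
    head -n 20 "$log" | sed -n 's/^  \$ //p' | head -n 1
}

# Example (run from the directory that holds config.log):
#   configure_invocation config.log > r
#   sh ./r && make && make install
```

Writing the extracted line to r and running it with sh corresponds to the manual edit-and-run steps of the workaround.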