Skip to content
This repository was archived by the owner on Mar 25, 2025. It is now read-only.
This repository was archived by the owner on Mar 25, 2025. It is now read-only.

Eigen compatibility with OpenACC #311

Description

@iomaganaris

This issue was raised when running nrnivmodl-core with the following ModelDB model https://senselab.med.yale.edu/ModelDB/ShowModel?model=19176&file=%2fHCN2k%2fhcn2.mod#tabs-2 using PGI 19.4 and PGI 19.10 with the OpenACC backend generated from NMODL.
The generated c++ file for hcn2.mod file contains a call to the following Eigen solver generated by the translation of the DERIVATIVE block:

X = Eigen::PartialPivLU<Eigen::Ref<Eigen::Matrix<double, 4, 4>>>(Jm).solve(F);

Compiling this file with pgc++ there is the following issue (with -Minfo=acc added to the compilation flags):

PGCC-W-0155-External and Static variables are not supported in acc routine - _ZTVSt9exception (x86_64/core/mod2c/hcn2.cpp: 63)
std::exception::exception():
     22, include "multicore.hpp"
          32, include "membfunc.hpp"
               31, include "vector"
                    61, include "allocator.h"
                         46, include "c++allocator.h"
                              33, include "new_allocator.h"
                                   33, include "new"
                                        40, include "exception"
                                             63, Generating implicit acc routine seq
PGCC-W-0155-External and Static variables are not supported in acc routine - _ZTVSt9exception (x86_64/core/mod2c/hcn2.cpp: 63)
std::exception::exception() [subobject]:
      0, Generating implicit acc routine seq
PGCC-W-0155-External and Static variables are not supported in acc routine - _ZTVSt9exception (x86_64/core/mod2c/hcn2.cpp: 57)
std::bad_alloc::bad_alloc():
     22, include "multicore.hpp"
          32, include "membfunc.hpp"
               31, include "vector"
                    61, include "allocator.h"
                         46, include "c++allocator.h"
                              33, include "new_allocator.h"
                                   33, include "new"
                                        57, Generating implicit acc routine seq

This was due to throwing an exception in https://gitlab.com/libeigen/eigen/-/blob/master/Eigen/src/Core/util/Memory.h#L70.
After fixing this issue by commenting out the problematic line, there was another issue regarding atomic coming from https://gitlab.com/libeigen/eigen/-/blob/master/Eigen/src/Core/products/Parallelizer.h#L14 which was fixed by adding the -DEIGEN_HAS_CXX11_ATOMIC=0 compiler flag to pgc++.

PGCC-W-0155-Compiler failed to translate accelerator region (see -Minfo messages): Unexpected load/store type (x86_64/core/mod2c/hcn2.cpp: 1)
std::__atomic_base<int>::store(int, std::memory_order):
     32, include "LU"
          11, include "Core"
              297, include "Parallelizer.h"
                    14, include "atomic"
                         41, include "atomic_base.h"
                             468, Generating implicit acc routine seq
                                  Generating acc routine seq
                                  Generating Tesla code
PGCC-F-0704-Compilation aborted due to previous errors. (x86_64/core/mod2c/hcn2.cpp

Following those, there was an issue coming from the llvm based pgc++ compiler, so we tried with the nollvm backend.

Eigen::EigenBase<Eigen::CwiseBinaryOp<Eigen::internal::scalar_product_op<double, double>, const Eigen::Transpose<const Eigen::Block<const Eigen::Block<Eigen::Block<Eigen::Ref<Eigen::Matrix<double, (int)4, (int)4, (int)0, (int)4, (int)4>, (int)0, Eigen::OuterStride<(int)-1>>, (int)4, (int)1, (bool)1>, (int)-1, (int)1, (bool)0>, (int)1, (int)1, (bool)0>>, const Eigen::Block<const Eigen::Block<Eigen::Block<Eigen::Ref<Eigen::Matrix<double, (int)4, (int)4, (int)0, (int)4, (int)4>, (int)0, Eigen::OuterStride<(int)-1>>, (int)1, (int)4, (bool)0>, (int)1, (int)-1, (bool)0>, (int)1, (int)1, (bool)0>>>::cols() const:
     32, include "LU"
          11, include "Core"
              240, include "EigenBase.h"
                    62, Generating implicit acc routine seq
                        Generating acc routine seq
                        Generating Tesla code
              259, include "NoAlias.h"
              261, include "Matrix.h"
              265, include "CwiseUnaryOp.h"
              271, include "Stride.h"
              273, include "Map.h"
              275, include "Block.h"
              279, include "Transpose.h"
              283, include "Redux.h"
              289, include "Solve.h"
              291, include "SolverBase.h"
              293, include "Transpositions.h"
pgc++-Fatal-/gpfs/bbp.cscs.ch/ssd/apps/hpc/jenkins/deploy/compilers/2020-02-01/linux-rhel7-x86_64/gcc-4.8.5/pgi-19.4-hdirysdrvd/linux86-64-llvm/19.4/bin/pggpp2-llvm TERMINATED by signal 11
Arguments to /gpfs/bbp.cscs.ch/ssd/apps/hpc/jenkins/deploy/compilers/2020-02-01/linux-rhel7-x86_64/gcc-4.8.5/pgi-19.4-hdirysdrvd/linux86-64-llvm/19.4/bin/pggpp2-llvm
/gpfs/bbp.cscs.ch/ssd/apps/hpc/jenkins/deploy/compilers/2020-02-01/linux-rhel7-x86_64/gcc-4.8.5/pgi-19.4-hdirysdrvd/linux86-64-llvm/19.4/bin/pggpp2-llvm x86_64/core/mod2c/hcn2.cpp -opt 2 -terse 1 -inform warn -x 119 0xa10000 -x 122 0x40 -x 123 0x1000 -x 127 4 -x 127 17 -x 19 0x400000 -x 28 0x40000 -x 120 0x10000000 -x 70 0x8000 -x 122 1 -x 125 0x20000 -quad -vect 56 -y 34 16 -x 34 0x8 -x 32 25952256 -y 19 8 -y 35 0 -x 42 0x30 -x 39 0x40 -x 39 0x80 -x 59 4 -x 129 2 -tp skylake -x 120 0x1000 -astype 0 -x 121 1 -fn x86_64/core/mod2c/hcn2.cpp -il /gpfs/bbp.cscs.ch/ssd/slurmTmpFS/magkanar/773510/pgc++uoOV8vHgaqXY.il/hcn2.il -inlib /gpfs/bbp.cscs.ch/ssd/slurmTmpFS/magkanar/773510/pgc++0oOVCczrYGzB.ext -insize 200 -x 221 25000 -x 222 5 -x 115 10 -x 14 32 -x 117 0x200 -x 123 0x80000000 -x 123 4 -x 119 0x20 -def __pgnu_vsn=80300 -x 70 0x40000000 -x 183 4 -x 121 0x800 -x 6 0x20000 -autoinl 10 -x 168 400 -x 174 128000 -x 14 0x200000 -x 14 0x400000 -x 249 70 -x 120 0x200000 -x 70 0x40000000 -x 164 0x800000 -x 39 4 -x 68 0x1 -accel tesla -accel host -x 186 0x80000 -x 180 0x4000400 -x 163 0x1 -cudaver 10010 -x 176 0x100 -cudacap 70 -x 121 0xc00 -x 194 0x40000 -x 186 0x80 -x 189 0x8000 -y 163 0xc0000000 -x 189 0x10 -y 189 0x4000000 -cudaroot /gpfs/bbp.cscs.ch/ssd/apps/hpc/jenkins/deploy/compilers/2020-02-01/linux-rhel7-x86_64/gcc-4.8.5/pgi-19.4-hdirysdrvd/linux86-64-llvm/2019/cuda/10.1 -x 9 1 -x 72 0x1 -x 136 0x11 -x 37 0x481000 -mp -x 69 0x200 -x 69 0x400 -x 69 2 -x 0 0x1000000 -x 2 0x100000 -x 0 0x2000000 -x 161 16384 -x 162 16384 -x 62 8 -gnuvsn 80300 -x 69 0x200 -x 123 0x400 -cmdline '+pgc++ x86_64/core/mod2c/hcn2.cpp -O2 -Mvect=simd -acc --diag_suppress 177 -mp --c++14 -Minline=size:200,levels:10 -Minfo=acc -DEIGEN_HAS_CXX11_ATOMIC=0 -DPG_ACC_BUGS -DCUDA_PROFILING -DCORENEURON_BUILD -DHAVE_MALLOC_H -DSWAP_ENDIAN_DISABLE_ASM -DEIGEN_DONT_VECTORIZE=1 -DNRNMPI=1 -DLAYOUT=0 -DDISABLE_HOC_EXP -DENABLE_SPLAYTREE_QUEUING -I/gpfs/bbp.cscs.ch/project/proj16/magkanar/GPU_EIGEN/CoreNeuron/build_gpu_sympy/install/include -I/gpfs/bbp.cscs.ch/project/proj16/magkanar/GPU_EIGEN/CoreNeuron/build_gpu_sympy/install/include/coreneuron/utils/randoms -I/gpfs/bbp.cscs.ch/ssd/apps/hpc/jenkins/deploy/tools/2020-02-01/linux-rhel7-x86_64/gcc-8.3.0/hpe-mpi-2.21-7pbszh6v5u/include -fPIC -c -o x86_64/core/build/hcn2.o' -asm /gpfs/bbp.cscs.ch/ssd/slurmTmpFS/magkanar/773510/pgc++uoOV8WmKrzY7.ll
make: *** [x86_64/core/build/hcn2.o] Error 127

The final issue we came across was the following:

PGCC-W-0155-External and Static variables are not supported in acc routine - _ZTVSt9exception (x86_64/core/mod2c/hcn2.cpp: 336)
void Eigen::internal::gemv_dense_selector<(int)2, (int)1, (bool)1>::run<Eigen::Transpose<const Eigen::Ref<Eigen::Matrix<double, (int)-1, (int)-1, (int)0, (int)-1, (int)-1>, (int)0, Eigen::OuterStride<(int)-1>>>, Eigen::Transpose<const Eigen::Block<const Eigen::Ref<Eigen::Matrix<double, (int)-1, (int)-1, (int)0, (int)-1, (int)-1>, (int)0, Eigen::OuterStride<(int)-1>>, (int)1, (int)-1, (bool)0>>, Eigen::Transpose<Eigen::Block<Eigen::Ref<Eigen::Matrix<double, (int)-1, (int)-1, (int)0, (int)-1, (int)-1>, (int)0, Eigen::OuterStride<(int)-1>>, (int)1, (int)-1, (bool)0>>>(const T1 &, const T2 &, T3 &, const T3::Scalar &):
     32, include "LU"
          11, include "Core"
              288, include "GeneralProduct.h"
                   309, Generating implicit acc routine seq
              292, include "PermutationMatrix.h"
              294, include "TriangularMatrix.h"
              317, include "VectorwiseOp.h"
Unimplemented opcode: 0
PGCC-F-0000-Internal compiler error. Unimplemented opcode.       4  (x86_64/core/mod2c/hcn2.cpp: 46)
PGCC/x86 Linux 19.10-0: compilation aborted
make: *** [x86_64/core/build/hcn2.o] Error 2

For this there is no solution found.
To reproduce all the issues in a gpu node:

module load git
git clone https://github.com/BlueBrain/CoreNeuron.git
cd CoreNeuron
mkdir build_gpu_sympy
cd build_gpu_sympy
module load nvhpc cuda hpe-mpi boost python-dev flex bison cmake
cmake .. -DCMAKE_INSTALL_PREFIX=./install -DCORENRN_ENABLE_NMODL=ON -DCORENRN_ENABLE_GPU=ON -DCORENRN_NMODL_FLAGS="sympy --analytic" -DNRN_ENABLE_CORENEURON=ON -DCMAKE_BUILD_TYPE=Release
make -j36
cd HCN2k
mkdir mod_eigen
cp hcn2.mod mod_eigen
<install-dir>/bin/nrnivmodl -coreneuron mod_eigen

CoreNeuron and NMODL master branches were used

Metadata

Metadata

Assignees

Labels

bugSomething isn't workingcodegenCode generation backendgpusolverSolver and numerical methods

Type

No type
No fields configured for issues without a type.

Projects

No projects

Milestone

No milestone

Relationships

None yet

Development

No branches or pull requests

Issue actions