Skip to content
This repository was archived by the owner on Mar 25, 2025. It is now read-only.

Eigen openacc compatibility#690

Closed
kotsaloscv wants to merge 17 commits into
masterfrom
eigen_openacc_compatibility
Closed

Eigen openacc compatibility#690
kotsaloscv wants to merge 17 commits into
masterfrom
eigen_openacc_compatibility

Conversation

@kotsaloscv

Copy link
Copy Markdown
Contributor

This PR solves the compatibility issues of Eigen with GPUs (I fixed both the OpenACC & CUDA backends).

The Eigen::PartialPivLU (LU decomposition to solve linear systems) is not compatible with GPUs (no device tokens).

For matrices up to 4x4, the Eigen inverse() has template specializations decorated with host & device tokens. Therefore, we use the inverse method instead of the PartialPivLU (requires an invertible matrix) which supports both CPUs & GPUs.

For matrices 5x5 and above, Eigen does not provide GPU-enabled methods to solve small linear systems. For this reason, we use the Crout LU decomposition (Legacy code : coreneuron/sim/scopmath/crout_thread.cpp). For the CPU exectutions, we use the standard PartialPivLU from Eigen.

For the Crout LU-decomposition, I have included a unit test to compare its result with Eigen::PartialPivLU.

@kotsaloscv kotsaloscv requested review from iomaganaris and pramodk June 7, 2021 06:19
@bbpbuildbot

Copy link
Copy Markdown
Collaborator

Can one of the admins verify this patch?

@kotsaloscv kotsaloscv linked an issue Jun 7, 2021 that may be closed by this pull request
@bbpbuildbot

Copy link
Copy Markdown
Collaborator

Logfiles from GitLab pipeline #8037 (:no_entry:) have been uploaded here!

Status and direct links:

@kotsaloscv kotsaloscv requested review from cattabiani and ohm314 June 7, 2021 08:20

@bbp-hpcteam bbp-hpcteam left a comment

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Just a quick note for failing clang-format: note that we use clang-format v11 for formatting check under CI. I have added an item for this Thursday meeting to discuss project specific clang-format version.

@bbpbuildbot

Copy link
Copy Markdown
Collaborator

Logfiles from GitLab pipeline #8141 (:white_check_mark:) have been uploaded here!

Status and direct links:

@bbpbuildbot

Copy link
Copy Markdown
Collaborator

Logfiles from GitLab pipeline #8142 (:white_check_mark:) have been uploaded here!

Status and direct links:

Comment thread src/codegen/codegen_c_visitor.hpp Outdated
Comment on lines +1659 to +1665
static int count_length(const std::vector<SymbolType>& variables) {
int length = 0;
for (const auto& variable: variables) {
length += variable->get_length();
}
return length;
};

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

this function is not really needed since it can be replaced with a one-liner:

std::accumulate(v.begin(), v.end(), 0, [](int l, const SymbolType& variable) {return a+= variable->get_length(); })

Comment thread src/codegen/codegen_cuda_visitor.cpp Outdated
void CodegenCudaVisitor::print_backend_includes() {
printer->add_line("#include <cuda.h>");

if (info.eigen_linear_solver_exist && count_length(info.state_vars) > 4) {

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

not blocking but a one-liner with accumulate here is better than a new static function since the code remains cleaner. As you prefer

Comment thread test/unit/crout/crout.cpp Outdated


template <typename T>
bool test_Crout_correctness(T rtol = 1e-6, T atol = 1e-6) {

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

these tolerances are quite big. why?

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@cattabiani you mean that we can relax them even more (like 1e-3)?

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think he meant the inverse - why not something much closer to eps ?

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, I meant what Ohm said

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@ohm314 & @cattabiani : I have modified Crout solver and I have resolved an accuracy issue that I had. I computed the relative error (A*x-b)/b for both eigen and crout, and crout's relative error was consistently two orders of magnitude higher than eigen one (before the modification). Therefore, for small tolerances my tests where failing. Now (after the modification), the relative error for both solvers has the same order of magnitude.

Comment thread test/unit/crout/crout.cpp Outdated


template <typename T>
bool test_Crout_correctness(T rtol = 1e-6, T atol = 1e-6) {

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think he meant the inverse - why not something much closer to eps ?

Comment thread src/solver/crout/crout.hpp Outdated
#pragma acc routine seq
#endif
template <typename T>
EIGEN_DEVICE_FUNC inline void Crout(int d, T* S, T* D) {

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

can this not be done with anything better than C-style arrays?

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I prefered not to change the initial solver (C-style arrays) to avoid bugs. The C-style arrays are readily combined with the .data() of Eigen and therefore, I think that after an extensive testing, we could consider its modernization (if everything passes).

@bbpbuildbot

Copy link
Copy Markdown
Collaborator

Logfiles from GitLab pipeline #8300 (:white_check_mark:) have been uploaded here!

Status and direct links:

@bbpbuildbot

Copy link
Copy Markdown
Collaborator

Logfiles from GitLab pipeline #8373 (:white_check_mark:) have been uploaded here!

Status and direct links:

…wMajor storage order (improvement with in-place transposition)
@bbpbuildbot

Copy link
Copy Markdown
Collaborator

Logfiles from GitLab pipeline #8413 (:white_check_mark:) have been uploaded here!

Status and direct links:

…r and its unit test (full compatibility with OpenACC/CUDA backends)
@kotsaloscv kotsaloscv linked an issue Jun 9, 2021 that may be closed by this pull request
@bbpbuildbot

Copy link
Copy Markdown
Collaborator

Logfiles from GitLab pipeline #8532 (:no_entry:) have been uploaded here!

Status and direct links:

@bbpbuildbot

Copy link
Copy Markdown
Collaborator

Logfiles from GitLab pipeline #8531 (:no_entry:) have been uploaded here!

Status and direct links:

@bbpbuildbot

Copy link
Copy Markdown
Collaborator

Logfiles from GitLab pipeline #8565 (:white_check_mark:) have been uploaded here!

Status and direct links:

@bbpbuildbot

Copy link
Copy Markdown
Collaborator

Logfiles from GitLab pipeline #11210 (:no_entry:) have been uploaded here!

Status and direct links:

@kotsaloscv kotsaloscv closed this Aug 27, 2021
@kotsaloscv kotsaloscv reopened this Aug 27, 2021
@bbpbuildbot

Copy link
Copy Markdown
Collaborator

Logfiles from GitLab pipeline #15092 (:white_check_mark:) have been uploaded here!

Status and direct links:

@olupton olupton left a comment

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looking good, let's test it again after a merge/rebase with the latest master.

#include <algorithm>
#include <cmath>
#include <ctime>
#include <numeric>

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why is this needed?

const std::string& X,
const std::string& Jm,
const std::string& F) {
// The Eigen::PartialPivLU is not compatible with GPUs (no __device__ tokens).

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is this still true with the latest Eigen changes we plan to use? (https://github.com/BlueBrain/eigen/)

Comment thread cmake/CompilerHelper.cmake Outdated
set(CMAKE_CXX_FLAGS "${CMAKE_CXX_FLAGS} --diag_suppress 1,82,111,115,177,186,611,997,1097,1625")

# Needed for Eigen
set(CMAKE_CXX_FLAGS "${CMAKE_CXX_FLAGS} -Wc,--pending_instantiations=0")

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The surrounding code has changed a bit in master, it would be good to merge/rebase before final review/testing.

Comment thread test/unit/crout/crout.cpp
using VecType = Matrix<T, Dynamic, 1>;

std::random_device rd; // seeding
std::mt19937 mt(rd());

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If we're going to use a pseudorandom seed then we should at least print what it is, in case we need to reproduce/debug an issue.

@bbpbuildbot

Copy link
Copy Markdown
Collaborator

Logfiles from GitLab pipeline #15548 (:no_entry:) have been uploaded here!

Status and direct links:

@kotsaloscv

Copy link
Copy Markdown
Contributor Author

For the build PGI pipeline we will need to bring back in CMake the following:
set(CMAKE_CXX_FLAGS "${CMAKE_CXX_FLAGS} -Wc,--pending_instantiations=0")
That's why it is failing for now

@pramodk

pramodk commented Sep 6, 2021

Copy link
Copy Markdown
Contributor

Superseded by #728

Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

Eigen compatibility with OpenACC Issue with newton solver when used with OpenACC backend

7 participants