Fixes issue with code generation of net_send_buffering() by iomaganaris · Pull Request #1005 · BlueBrain/nmodl

iomaganaris · 2023-02-20T16:17:25Z

If OpenMP GPU backend is enabled then #pragma omp atomic capture is not executed properly on CPU with NVHPC 23.1 (more details and tests in Fixes for NVHPC 23.1 compiler neuronsimulator/nrn#2239)
Wrap the update of nsb->_cnt inside an if-statement based on nt->compute_gpu

pramodk · 2023-02-20T16:20:57Z

#pragma omp atomic capture is not executed properly on CPU with NVHPC 23.1

does this produce bogus results? 🤔

iomaganaris · 2023-02-20T16:40:43Z

#pragma omp atomic capture is not executed properly on CPU with NVHPC 23.1

does this produce bogus results? thinking

Yes, without these changes on CPU and the following debug print we get:

diff --git a/src/codegen/codegen_acc_visitor.cpp b/src/codegen/codegen_acc_visitor.cpp
index 7ed55bb9c..fff491567 100644
--- a/src/codegen/codegen_acc_visitor.cpp
+++ b/src/codegen/codegen_acc_visitor.cpp
@@ -128,6 +128,7 @@ void CodegenAccVisitor::print_net_send_buffering_cnt_update() const {
     printer->start_block("} else");
     printer->add_line("i = nsb->_cnt++;");
     printer->end_block(1);
+    printer->add_line("printf(\"==== %d ====\\n\", i);");
 }
 
 void CodegenAccVisitor::print_net_send_buffering_grow() {
diff --git a/src/codegen/codegen_cpp_visitor.cpp b/src/codegen/codegen_cpp_visitor.cpp
index 3b6c854ba..62f2c4ad6 100644
--- a/src/codegen/codegen_cpp_visitor.cpp
+++ b/src/codegen/codegen_cpp_visitor.cpp
@@ -4020,6 +4020,7 @@ void CodegenCVisitor::print_net_receive_buffering(bool need_mech_inst) {
 
 void CodegenCVisitor::print_net_send_buffering_cnt_update() const {
     printer->add_line("i = nsb->_cnt++;");
+    printer->add_line("printf(\"==== %d ====\\n\", i);");
 }

➜  coreneuron git:(magkanar/fix_nvhpc_231) ✗ ./x86_64/special -mpi -python test_watchrange.py         
numprocs=1
NEURON -- VERSION 9.0.dev-1300-g8d5f19d39+ magkanar/fix_nvhpc_231 (8d5f19d39+) 2023-02-20
Duke, Yale, and the BlueBrain Project -- Copyright 1984-2022
See http://neuron.yale.edu/neuron/credits

Additional mechanisms from files
 "mod files/axial.mod" "mod files/axial_pp.mod" "mod files/bacur.mod" "mod files/banocur.mod" "mod files/fornetcon.mod" "mod files/invlfire.mod" "mod files/netmove.mod" "mod files/sample.mod" "mod files/watchrange.mod"
CoreNEURON run
==== 0 ====
==== 1 ====
==== 2 ====
net_sem_from_gpu sendtype 0 i_vdata 2 weight_index_ -1 ith 0 ipnt 0 td 0.000000 flag 3.000000
net_sem_from_gpu sendtype 0 i_vdata 17 weight_index_ -1 ith 0 ipnt 15 td 0.000000 flag 3.000000
net_sem_from_gpu sendtype 0 i_vdata 26 weight_index_ -1 ith 0 ipnt 24 td 0.000000 flag 1.000000
==== 34 ====
==== 35 ====
==== 36 ====
net_sem_from_gpu sendtype 0 i_vdata 2 weight_index_ -1 ith 0 ipnt 0 td 0.000000 flag 3.000000
net_send td-t = -0.125 SelfEvent target=Bounce 0 flag=3
[magkanar-MS-7C73:174349] *** Process received signal ***
[magkanar-MS-7C73:174349] Signal: Aborted (6)
[magkanar-MS-7C73:174349] Signal code:  (-6)
[magkanar-MS-7C73:174349] [ 0] /usr/lib/x86_64-linux-gnu/libc.so.6(+0x42520)[0x7f62ba442520]
[magkanar-MS-7C73:174349] [ 1] /usr/lib/x86_64-linux-gnu/libc.so.6(pthread_kill+0x12c)[0x7f62ba496a7c]
[magkanar-MS-7C73:174349] [ 2] /usr/lib/x86_64-linux-gnu/libc.so.6(raise+0x16)[0x7f62ba442476]
[magkanar-MS-7C73:174349] [ 3] /usr/lib/x86_64-linux-gnu/libc.so.6(abort+0xd3)[0x7f62ba4287f3]
[magkanar-MS-7C73:174349] [ 4] /home/magkanar/bbp_repos/nrn/test/coreneuron/x86_64/libcorenrnmech.so(+0x12774e)[0x7f62be12774e]
[magkanar-MS-7C73:174349] [ 5] /home/magkanar/bbp_repos/nrn/test/coreneuron/x86_64/libcorenrnmech.so(_ZN10coreneuron22net_buf_receive_BounceEPNS_9NrnThreadE+0x49b)[0x7f62be0998db]

With these changes on CPU (and same on GPU):

➜  coreneuron git:(magkanar/fix_nvhpc_231) ✗ ./x86_64/special -mpi -python test_watchrange.py         
numprocs=1
NEURON -- VERSION 9.0.dev-1300-g8d5f19d39+ magkanar/fix_nvhpc_231 (8d5f19d39+) 2023-02-20
Duke, Yale, and the BlueBrain Project -- Copyright 1984-2022
See http://neuron.yale.edu/neuron/credits

Additional mechanisms from files
 "mod files/axial.mod" "mod files/axial_pp.mod" "mod files/bacur.mod" "mod files/banocur.mod" "mod files/fornetcon.mod" "mod files/invlfire.mod" "mod files/netmove.mod" "mod files/sample.mod" "mod files/watchrange.mod"
CoreNEURON run
==== 0 ====
==== 1 ====
==== 2 ====
net_sem_from_gpu sendtype 0 i_vdata 26 weight_index_ -1 ith 0 ipnt 24 td 0.000000 flag 1.000000
net_sem_from_gpu sendtype 0 i_vdata 17 weight_index_ -1 ith 0 ipnt 15 td 0.000000 flag 3.000000
net_sem_from_gpu sendtype 0 i_vdata 2 weight_index_ -1 ith 0 ipnt 0 td 0.000000 flag 3.000000
==== 0 ====
==== 1 ====
==== 2 ====

codecov · 2023-02-20T17:08:05Z

Codecov Report

Merging #1005 (3a8501e) into master (0481f30) will increase coverage by 0.48%.
The diff coverage is 100.00%.

@@            Coverage Diff             @@
##           master    #1005      +/-   ##
==========================================
+ Coverage   68.74%   69.23%   +0.48%     
==========================================
  Files         189      189              
  Lines       25789    25816      +27     
==========================================
+ Hits        17729    17874     +145     
+ Misses       8060     7942     -118

Impacted Files	Coverage Δ
src/codegen/codegen_acc_visitor.hpp	`50.00% <ø> (+50.00%)`	⬆️
src/codegen/codegen_cpp_visitor.hpp	`100.00% <ø> (ø)`
src/codegen/codegen_acc_visitor.cpp	`64.35% <100.00%> (+64.35%)`	⬆️
src/codegen/codegen_cpp_visitor.cpp	`85.29% <100.00%> (-0.07%)`	⬇️
test/unit/codegen/codegen_cpp_visitor.cpp	`100.00% <100.00%> (ø)`

Help us with your feedback. Take ten seconds to tell us how you rate us. Have a feature suggestion? Share it here.

bbpbuildbot · 2023-02-20T17:08:36Z

Logfiles from GitLab pipeline #103450 (:white_check_mark:) have been uploaded here!

Status and direct links:

bbpbuildbot · 2023-02-21T14:48:21Z

Logfiles from GitLab pipeline #103761 (:white_check_mark:) have been uploaded here!

Status and direct links:

iomaganaris · 2023-02-21T14:54:12Z

I'm a bit skeptical about the fact that coverage of src/codegen/codegen_acc_visitor.cpp increased so much while testing only a very small part of it but I guess it's better than reducing it. We can follow up on that in #957

olupton

I'm curious...are we sure all these changes to the code generation needed?
Did you check if the change to print_net_send_buf_update_to_host is sufficient without the changes to the atomic update part?

A couple of code-printing nitpicks.

- Fix formatting and small improvement with restarting blocks Co-authored-by: Olli Lupton <oliver.lupton@epfl.ch>

iomaganaris · 2023-02-21T15:11:33Z

I'm curious...are we sure all these changes to the code generation needed? Did you check if the change to print_net_send_buf_update_to_host is sufficient without the changes to the atomic update part?

A couple of code-printing nitpicks.

Yes, the changes to print_net_send_buf_update_to_host are not enough. I just included them since I've already done them. I can try to create a small reproducer of the issue as well to report it to NVIDIA

bbpbuildbot · 2023-02-21T16:08:35Z

Logfiles from GitLab pipeline #103817 (:white_check_mark:) have been uploaded here!

Status and direct links:

pramodk · 2023-02-22T10:21:37Z

Use C++ if-statement instead of in OpenMP pragma in some cases

Bit late here - @iomaganaris: similar to the two below bug reports, did you see if it's worth reporting to nvofurm? (if not already reported)

https://forums.developer.nvidia.com/t/incorrect-gpu-results-with-pragma-acc-atomic-capture/187962
https://forums.developer.nvidia.com/t/incorrect-cpu-results-with-pragma-acc-atomic-capture/184453/

iomaganaris · 2023-02-22T10:26:22Z

Use C++ if-statement instead of in OpenMP pragma in some cases

Bit late here - @iomaganaris: similar to the two below bug reports, did you see if it's worth reporting to nvofurm? (if not already reported)

https://forums.developer.nvidia.com/t/incorrect-gpu-results-with-pragma-acc-atomic-capture/187962 https://forums.developer.nvidia.com/t/incorrect-cpu-results-with-pragma-acc-atomic-capture/184453/

I am trying to create a small reproducer but I haven't managed until now. I was just able to reproduce the first issue of these with my own example. I will try with Olli's as well.

Use C++ if-statement instead of in OpenMP pragma in some cases

Note that this was not needed for the fix but I find it a bit cleaner and maybe less error prone.

…modl#1005) * If OpenMP GPU backend is enabled then `#pragma omp atomic capture` is not executed properly on CPU with NVHPC 23.1 (more details and tests in neuronsimulator/nrnBlueBrain/nmodl#2239) * Wrap the update of `nsb->_cnt` inside an `if-statement` based on `nt->compute_gpu` in OpenACC visitor * Use C++ `if-statement` instead of in OpenMP pragma in some cases * Added unit test for code generation of `net_send_buffering()` on code generated for GPU --------- Co-authored-by: Olli Lupton <oliver.lupton@epfl.ch> NMODL Repo SHA: BlueBrain/nmodl@c3b0736

iomaganaris added 3 commits February 17, 2023 17:42

Small fix for gpu code generation and cpu execution

c364d2e

Fix issue with #pragma omp atomic capture on net_send_buffering()

e9548bb

Fix code generation for CPU and GPU backends for net_send_buffering()

0b3b4d4

iomaganaris added bug Something isn't working codegen Code generation backend labels Feb 20, 2023

iomaganaris requested review from olupton and pramodk February 20, 2023 16:17

iomaganaris self-assigned this Feb 20, 2023

Make format happy

ee23b25

Fix coverage

4256f26

olupton reviewed Feb 21, 2023

View reviewed changes

Comment thread src/codegen/codegen_acc_visitor.cpp Outdated

Comment thread src/codegen/codegen_cpp_visitor.cpp Outdated

Apply suggestions from code review

c84bf36

- Fix formatting and small improvement with restarting blocks Co-authored-by: Olli Lupton <oliver.lupton@epfl.ch>

Make format happy

3a8501e

olupton approved these changes Feb 21, 2023

View reviewed changes

iomaganaris merged commit c3b0736 into master Feb 22, 2023

iomaganaris deleted the magkanar/fix_net_buf branch February 22, 2023 09:56

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Uh oh!

Fixes issue with code generation of net_send_buffering()#1005

Fixes issue with code generation of net_send_buffering()#1005
iomaganaris merged 7 commits into
masterfrom
magkanar/fix_net_buf

iomaganaris commented Feb 20, 2023 •

edited

Loading

Uh oh!

pramodk commented Feb 20, 2023

Uh oh!

iomaganaris commented Feb 20, 2023 •

edited

Loading

Uh oh!

codecov Bot commented Feb 20, 2023 •

edited

Loading

Uh oh!

bbpbuildbot commented Feb 20, 2023

Uh oh!

bbpbuildbot commented Feb 21, 2023

Uh oh!

iomaganaris commented Feb 21, 2023

Uh oh!

olupton left a comment

Uh oh!

Uh oh!

Uh oh!

iomaganaris commented Feb 21, 2023

Uh oh!

bbpbuildbot commented Feb 21, 2023

Uh oh!

pramodk commented Feb 22, 2023

Uh oh!

iomaganaris commented Feb 22, 2023 •

edited

Loading

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

4 participants

Uh oh!

Conversation

iomaganaris commented Feb 20, 2023 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Uh oh!

pramodk commented Feb 20, 2023

Uh oh!

iomaganaris commented Feb 20, 2023 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Uh oh!

codecov Bot commented Feb 20, 2023 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Codecov Report

Uh oh!

bbpbuildbot commented Feb 20, 2023

Uh oh!

bbpbuildbot commented Feb 21, 2023

Uh oh!

iomaganaris commented Feb 21, 2023

Uh oh!

olupton left a comment

Choose a reason for hiding this comment

Uh oh!

Uh oh!

Uh oh!

iomaganaris commented Feb 21, 2023

Uh oh!

bbpbuildbot commented Feb 21, 2023

Uh oh!

pramodk commented Feb 22, 2023

Uh oh!

iomaganaris commented Feb 22, 2023 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

4 participants

iomaganaris commented Feb 20, 2023 •

edited

Loading

iomaganaris commented Feb 20, 2023 •

edited

Loading

codecov Bot commented Feb 20, 2023 •

edited

Loading

iomaganaris commented Feb 22, 2023 •

edited

Loading