Skip to content

Test to expose bug in array variables.#2779

Merged
1uc merged 24 commits into
masterfrom
1uc/test-array-variables
Apr 19, 2024
Merged

Test to expose bug in array variables.#2779
1uc merged 24 commits into
masterfrom
1uc/test-array-variables

Conversation

@1uc

@1uc 1uc commented Mar 15, 2024

Copy link
Copy Markdown
Collaborator

This test exposes a bug in how data is copied between NEURON and CoreNEURON.

It must include (at least) one MOD file with array range variables. It must check that the copy of these array range variables received by CoreNEURON is correct.

The suspected bug is that an array range variable with M elements is sometimes 1 and sometimes M "floating point variables"; and somehow this is done inconsistently.

@1uc

1uc commented Mar 15, 2024

Copy link
Copy Markdown
Collaborator Author

I tried to debug the issue, when compiling a MOD file called ca_ch. My current understanding is the following.

We can see nrn_init_ca_ch being called from:

void allocate_data_in_mechanism_nrn_init() {
    _nrn_skip_initmodel = true;
    for (int i = 0; i < nrn_nthread; ++i) {  // could be parallel
        NrnThread& nt = nrn_threads[i];
        for (NrnThreadMembList* tml = nt.tml; tml; tml = tml->next) {
            Memb_list* ml = tml->ml;
            mod_f_t s = corenrn.get_memb_func(tml->index).initialize;
            if (s) {
                (*s)(&nt, ml, tml->index);
            }
        }
    }
    _nrn_skip_initmodel = false;
}

https://github.com/neuronsimulator/nrn/blob/17be061e031701736cb03a5088b7ddf253a069a4/src/coreneuron/sim/finitialize.cpp#L21-L38?plain=1
From which we've learnt that it's sending down tml pointers. Which might be allocated/populated like this:

void Phase2::populate(NrnThread& nt, const UserParams& userParams) {
    // ...
    auto& nrn_prop_param_size_ = corenrn.get_prop_param_size();
    auto& nrn_prop_dparam_size_ = corenrn.get_prop_dparam_size();
    // ...

    for (auto tml = nt.tml; tml; tml = tml->next) {
        Memb_list* ml = tml->ml;
        int type = tml->index;
        int layout = corenrn.get_mech_data_layout()[type];
        int n = ml->nodecount;
        int sz = nrn_prop_param_size_[type];
        offset = nrn_soa_byte_align(offset);

        ml->data = nt._data + offset;
        offset += nrn_soa_padded_size(n, layout) * sz;
    }
    // ...
}

https://github.com/neuronsimulator/nrn/blob/17be061e031701736cb03a5088b7ddf253a069a4/src/coreneuron/io/phase2.cpp#L937-L1344?plain=1
Summary:

  • For each mech we set the pointer to be offset into some global allocation.
  • offset is the product of nrn_soa_padded_size(n, layout) which is essentially the number of rows in the SoA (including some padding); and sz.
  • sz is whatever is stored in nrn_prop_param_size_.

So we now need to hunt down how nrn_prop_param_size_ is set. Most likely from here:

void hoc_register_prop_size(int type, int psize, int dpsize) {
    // ...
    corenrn.get_prop_param_size()[type] = psize;
    corenrn.get_prop_dparam_size()[type] = dpsize;
    // ...
}

https://github.com/neuronsimulator/nrn/blob/17be061e031701736cb03a5088b7ddf253a069a4/src/coreneuron/mechanism/register_mech.cpp#L192-L206?plain=1
Which is a function called from compiled mod files, e.g.,

        hoc_register_prop_size(mech_type, float_variables_size(), int_variables_size());

with

    static inline int float_variables_size() {
        return 23;
    }

@1uc

1uc commented Mar 15, 2024

Copy link
Copy Markdown
Collaborator Author

We can see it crash here:
https://github.com/neuronsimulator/nrn/actions/runs/8298830006/job/22713118409?pr=2779#step:16:3877

in a manner that would be consistent with the bug, e.g. a SEGFAULT.

@1uc

1uc commented Mar 15, 2024

Copy link
Copy Markdown
Collaborator Author

@pramodk I think this is the type of test we were thinking of. Please let me know if you see anything wrong with the test.

@azure-pipelines

Copy link
Copy Markdown

✔️ d3920b5 -> Azure artifacts URL

@bbpbuildbot

This comment has been minimized.

@1uc

1uc commented Mar 18, 2024

Copy link
Copy Markdown
Collaborator Author

This looks suspicious, because data pads the number of rows:

nbyte += corenrn.get_prop_param_size()[tml->index] * tml->ml->nodecount * sizeof(double);

@1uc

1uc commented Mar 19, 2024

Copy link
Copy Markdown
Collaborator Author

This is currently my understanding of what's affected. Updated: 2024-03-22 10:06
coreneuron-storage

@pramodk

pramodk commented Mar 20, 2024

Copy link
Copy Markdown
Member

@pramodk I think this is the type of test we were thinking of. Please let me know if you see anything wrong with the test.

the test LGTM.

(Sorry, somehow this mention was not filtered in my mailbox, and hence went into the pile of GitHub emails)

@1uc 1uc force-pushed the 1uc/test-array-variables branch from d3920b5 to e5cf000 Compare March 22, 2024 08:42
@bbpbuildbot

This comment has been minimized.

@1uc 1uc force-pushed the 1uc/test-array-variables branch from e0b5fe1 to 9c4a888 Compare March 22, 2024 15:51
@1uc

1uc commented Mar 22, 2024

Copy link
Copy Markdown
Collaborator Author

I've pushed the changes I've made so far. The test_red_green.py now passes (locally):
upsilon_traces

Clearly, a lot of refactoring and more testing left to be done.

@bbpbuildbot

This comment has been minimized.

@1uc

1uc commented Mar 25, 2024

Copy link
Copy Markdown
Collaborator Author

This sketches out how it works for recording variables.
playrecord-storage

@1uc

1uc commented Mar 25, 2024

Copy link
Copy Markdown
Collaborator Author

This is what what I understand about transferring data/pointers from NRN to CoreNEURON. If there's pointers to array RANGE variables, they'll likely be incorrect.

We can also see that we need a way of getting the substructure, i.e. array_dims and array_prefix_sums.

pointer-conversion

@1uc 1uc force-pushed the 1uc/test-array-variables branch from b80b332 to 8d538f0 Compare March 25, 2024 16:18
@sonarqubecloud

Copy link
Copy Markdown

Please retry analysis of this Pull-Request directly on SonarCloud

@bbpbuildbot

This comment has been minimized.

@bbpbuildbot

This comment has been minimized.

@1uc 1uc force-pushed the 1uc/test-array-variables branch from be7ae9b to 231b8ba Compare March 27, 2024 12:21
@bbpbuildbot

This comment has been minimized.

@1uc 1uc force-pushed the 1uc/test-array-variables branch from 231b8ba to 7cc612d Compare March 27, 2024 17:09
@bbpbuildbot

This comment has been minimized.

@1uc 1uc force-pushed the 1uc/test-array-variables branch from 7cc612d to a48a578 Compare March 28, 2024 07:26
@azure-pipelines

Copy link
Copy Markdown

✔️ a48a578 -> Azure artifacts URL

@bbpbuildbot

This comment has been minimized.

@1uc 1uc force-pushed the 1uc/test-array-variables branch 4 times, most recently from dc98a6f to e9989ef Compare March 28, 2024 11:11

@pramodk pramodk left a comment

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

(good to merge after submodule update)

Comment thread test/coreneuron/test_array_variables_transfer.py Outdated
Comment thread src/coreneuron/io/core2nrn_data_return.cpp

@nrnhines nrnhines left a comment

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think that matching old comments to the current api between NEURON and CoreNEURON can be left to a subsequent PR.

Comment thread src/coreneuron/io/core2nrn_data_return.cpp Outdated
@azure-pipelines

Copy link
Copy Markdown

✔️ b59524a -> Azure artifacts URL

@bbpbuildbot

This comment has been minimized.

1uc added a commit to BlueBrain/nmodl that referenced this pull request Apr 18, 2024
These are the code generation changes required for:

    neuronsimulator/nrn#2779

The solution is to fill `nrn_prop_param_size` with the number of
doubles, not number of variables.
1uc added a commit to BlueBrain/nmodl that referenced this pull request Apr 18, 2024
These are the code generation changes required for:

    neuronsimulator/nrn#2779

The solution is to fill `nrn_prop_param_size` with the number of
doubles, not number of variables.
@bbpbuildbot

This comment has been minimized.

1uc added a commit to BlueBrain/nmodl that referenced this pull request Apr 18, 2024
* Fix array variables.

These are the code generation changes required for:

    neuronsimulator/nrn#2779

The solution is to fill `nrn_prop_param_size` with the number of
doubles, not number of variables.
@bbpbuildbot

This comment has been minimized.

@azure-pipelines

Copy link
Copy Markdown

✔️ 4b74073 -> Azure artifacts URL

This should affect copying data from CoreNEURON to NRN. If a GPU is
used, the data can be permuted. When copying back it should be permuted
back.
@1uc 1uc force-pushed the 1uc/test-array-variables branch from c2fa237 to 3f3221c Compare April 19, 2024 07:42
@1uc

1uc commented Apr 19, 2024

Copy link
Copy Markdown
Collaborator Author

While reviewing my own code before suggesting to merge, I noticed that there was a "TODO fix". The comment was there because at the time I was hoping that the code was unused. I think that's wrong, because permute happens when using the GPU. Therefore, the copy back was likely wrong under those circumstances.

I've pushed a commit to address the issue: 3f3221c

@azure-pipelines

Copy link
Copy Markdown

✔️ 3f3221c -> Azure artifacts URL

@bbpbuildbot

This comment has been minimized.

@1uc 1uc force-pushed the 1uc/test-array-variables branch from fa18d6d to 0faf5f5 Compare April 19, 2024 09:28
@sonarqubecloud

Copy link
Copy Markdown

Please retry analysis of this Pull-Request directly on SonarCloud

@azure-pipelines

Copy link
Copy Markdown

✔️ 0faf5f5 -> Azure artifacts URL

@1uc

1uc commented Apr 19, 2024

Copy link
Copy Markdown
Collaborator Author

@nrnhines @pramodk please read: #2779 (comment). It explains a recent change and there's one more commit which is clean up of that change.

I think the PR is now ready to merge.

@pramodk

pramodk commented Apr 19, 2024

Copy link
Copy Markdown
Member

please read: #2779 (comment). It explains a recent change and there's one more commit which is clean up of that change.

that certainly make sense!

Good to merge! 🚀

@1uc 1uc merged commit 7803cce into master Apr 19, 2024
@1uc 1uc deleted the 1uc/test-array-variables branch April 19, 2024 13:05
JCGoran pushed a commit that referenced this pull request Mar 12, 2025
* Fix array variables.

These are the code generation changes required for:

    #2779

The solution is to fill `nrn_prop_param_size` with the number of
doubles, not number of variables.

NMODL Repo SHA: BlueBrain/nmodl@f6821ce
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

4 participants