SICMSpace

Open calccrypto opened this issue 6 years ago • 9 comments

Added an alternative to HostSpace that uses lanl/SICM underneath to provide arena allocations on devices that are visible as NUMA nodes, for the host backends (Serial, OpenMP, and Threads). When a variable is moved, the entire arena that contains it is moved, along with any other variables in that arena, which allows related variables to be moved together more efficiently.

Devices can be provided to SICMSpace like so:

sicm_device_list use_these_devices; // a subset of the full device list
Kokkos::Experimental::SICMSpace sicmspace(&use_these_devices);
auto memspace = Kokkos::view_alloc("SICM", sicmspace);
Kokkos::View<double *, Kokkos::Experimental::SICMSpace> view( memspace, N );

A SICMSpace that uses the default arena may have its allocations placed on any NUMA node.
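
To make the arena behaviour described above concrete, here is a minimal toy model in plain C++. It is only a sketch and does not use SICM or Kokkos: `ToyArena`, `ToyArenaPool`, and every method name are hypothetical, and a "move" is modelled as nothing more than changing the NUMA node recorded for the arena.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <unordered_map>
#include <vector>

// Toy model only: none of these names come from SICM or Kokkos.
struct ToyArena {
  int numa_node;                       // node currently backing the whole arena
  std::vector<std::string> variables;  // labels of the allocations in this arena
};

class ToyArenaPool {
 public:
  // Create an arena bound to a NUMA node and return its index.
  std::size_t create_arena(int numa_node) {
    arenas_.push_back(ToyArena{numa_node, {}});
    return arenas_.size() - 1;
  }

  // Record an allocation (identified by its label) inside the given arena.
  void allocate(const std::string& label, std::size_t arena) {
    arenas_.at(arena).variables.push_back(label);
    owner_[label] = arena;
  }

  // Moving one variable relocates its entire arena, and with it
  // every other variable that shares that arena.
  void move_variable(const std::string& label, int new_node) {
    arenas_.at(owner_.at(label)).numa_node = new_node;
  }

  // NUMA node on which a variable currently resides.
  int node_of(const std::string& label) const {
    return arenas_.at(owner_.at(label)).numa_node;
  }

 private:
  std::vector<ToyArena> arenas_;
  std::unordered_map<std::string, std::size_t> owner_;
};

// Usage sketch: "x" and "y" share an arena, "z" lives in another one.
inline bool demo_shared_move() {
  ToyArenaPool pool;
  const std::size_t shared = pool.create_arena(0);
  const std::size_t other = pool.create_arena(0);
  pool.allocate("x", shared);
  pool.allocate("y", shared);
  pool.allocate("z", other);
  pool.move_variable("x", 1);  // only "x" is named explicitly
  return pool.node_of("y") == 1 && pool.node_of("z") == 0;
}
```

In this model, moving `x` also moves `y` because both belong to the same arena, while `z` stays where it was. Grouping related variables into one arena is what allows them to be relocated in a single operation.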

calccrypto avatar Jan 24 '20 20:01 calccrypto

Can one of the admins verify this patch?

dalg24-jenkins avatar Jan 24 '20 20:01 dalg24-jenkins

Can you call into our weekly meeting on Wednesday so we can discuss the above mentioned questions? We may need to get Galen and Geoff involved too.

crtrott avatar Jan 27 '20 18:01 crtrott

You are targeting the wrong branch here. It should be develop and not master.

masterleinad avatar Jan 27 '20 19:01 masterleinad

@masterleinad Fixed

calccrypto avatar Jan 27 '20 21:01 calccrypto

> Can you call into our weekly meeting on Wednesday so we can discuss the above mentioned questions? We may need to get Galen and Geoff involved too.

I am afraid I will be busy tomorrow and cannot make it.

calccrypto avatar Jan 28 '20 21:01 calccrypto

I believe this should be external, and integrated in a similar fashion to the Umpire Space. See #2690 for details. We can have a conversation about whether we want an "Optional" memory spaces repository, or whether we want repositories specific to each variety.

jeffmiles63 avatar Feb 03 '20 22:02 jeffmiles63

@jeffmiles63 The issue is that HPC people have this allergy to extra dependencies, stemming from the lack of good package management in HPC. If we're going to ask people to put their contributions in another repository, we really need to have a good story for the interaction between Kokkos and package management, since in many cases asking someone to put a contribution in an external repository is asking their users to literally double the number of dependencies their project has (i.e., Kokkos + 1), which isn't very fair. I'm not saying I disagree, but I think that with the number of these sorts of contributions we're seeing these days, we had better bump up the priority of getting our package management story straight (probably starting with Spack).

dhollman avatar Feb 04 '20 01:02 dhollman

I haven't looked too closely at the SharedAllocationRecord specializations, but all my comments have been addressed.

We still need to discuss https://github.com/kokkos/kokkos/pull/2678#pullrequestreview-348336728.

dalg24 avatar Feb 10 '20 23:02 dalg24

@calccrypto please look at PR #2845 and the referenced repository https://github.com/jeffmiles63/kokkos-extensions. I took a stab at making this work with the proposed plugin interface.
I ran into a couple of issues:

  1. From the plugin, the found SICM package has issues finding the jemalloc library. It added the link line but not the path. The only way I could get this to work was to have the SICM package installed in the same location as the jemalloc package.
  2. After I finally got it to build, it still fails to run, giving a symbol lookup error: symbol lookup error: <path to lib>/libsicm.so: undefined symbol: je_mallocx

jeffmiles63 avatar Mar 09 '20 22:03 jeffmiles63

Closing for lack of activity and no clear way forward.

dalg24 avatar Jul 13 '23 17:07 dalg24