create_mirror_view_and_copy fails on Windows
After a while I tried compiling on Windows again, and the result is that Kokkos::create_mirror_view_and_copy from CUDA space into Kokkos::HostSpace() fails, along with all the unit tests and algorithms (e.g. scan) that depend on it.
Compilers tried:
- pure clang++, versions 12 and 13, C++17, C++20
- NVCC 11.7 + MSVC, C++17
Compiled backends:
- Serial
- CUDA
Some failed unit tests under NVCC+MSVC:
- [ FAILED ] cuda.parallel_scan_with_reducers
- [ FAILED ] cuda.view_api_b
- [ FAILED ] cuda.view_allocation_large_rank
- [ FAILED ] cuda.view_shmem_size_on_device
- [ FAILED ] cuda.workgraph_fib
- [ FAILED ] cuda.complex_issue_3865
- [ FAILED ] cuda.crs_count_fill
- [ FAILED ] cuda.crs_copy_constructor
- [ FAILED ] cuda.deep_copy_scratch
- [ FAILED ] cuda.max_within_parfor
- [ FAILED ] cuda.min_within_parfor
- [ FAILED ] cuda.minmax_within_parfor
- [ FAILED ] cuda.clamp_within_parfor
- [ FAILED ] cuda.range_scan
- [ FAILED ] cuda.view_mapping
Any ideas what might be the issue?
P.S. There are many annoyances that need to be fixed for NVCC+MSVC, but that is a separate subject.
What does the output for the FAILED tests look like? Is there some kind of exception/memory access violation or is the function just returning the wrong result?
@j8asic can you try my PR branch? With CUDA 11.4 and Visual Studio 2019, that is compiling and passing tests for me.
Fixed more issues, which popped up in VS 2022.
Just for reference: with VS 2022 I started with a standard x64-MSVC config and added these as CMake config args:
-DKokkos_ENABLE_CUDA=ON -DKokkos_ENABLE_CUDA_LAMBDA=ON -DKokkos_ARCH_VOLTA70=ON -DKokkos_ENABLE_TESTS=ON -DKokkos_ENABLE_COMPILE_AS_CMAKE_LANGUAGE=ON
No other editing of the config file.
This build compiled and passed all tests with Visual Studio 2022, version 17.2.5, and CUDA 11.7. I still need to check VS 2019 at home, where I have VS 2019 with CUDA 11.4.
For reference, the error before the fix:
C:\Users\ceear\Source\Repos\kokkos\core\src\Kokkos_HostSpace.hpp(268): error C2955: 'Kokkos::Impl::SharedAllocationRecord': use of class template requires template argument list
../../../core/src\impl/Kokkos_SharedAlloc.hpp(309): note: see declaration of 'Kokkos::Impl::SharedAllocationRecord'
../../../core/src\impl/Kokkos_SharedAlloc.hpp(320): note: see reference to function template instantiation 'Kokkos::Impl::SharedAllocationRecord<Kokkos::HostSpace,void>::SharedAllocationRecord<ExecutionSpace>(const ExecutionSpace &,const Kokkos::HostSpace &,const std::string &,const size_t,void (__cdecl *const )(Kokkos::Impl::SharedAllocationRecord<void,void> *))' being compiled
with
[
ExecutionSpace=execution_space
]
../../../core/src\impl/Kokkos_SharedAlloc.hpp(353): note: see reference to function template instantiation 'Kokkos::Impl::SharedAllocationRecord<Kokkos::HostSpace,Kokkos::Impl::ViewValueFunctor<Kokkos::Device<Kokkos::Serial,Kokkos::HostSpace>,double,true>>::SharedAllocationRecord<ExecutionSpace>(const ExecutionSpace &,const MemorySpace &,const std::string &,const size_t)' being compiled
with
[
ExecutionSpace=execution_space,
MemorySpace=Kokkos::HostSpace
]
../../../core/src\impl/Kokkos_SharedAlloc.hpp(353): note: see reference to function template instantiation 'Kokkos::Impl::SharedAllocationRecord<Kokkos::HostSpace,Kokkos::Impl::ViewValueFunctor<Kokkos::Device<Kokkos::Serial,Kokkos::HostSpace>,double,true>>::SharedAllocationRecord<ExecutionSpace>(const ExecutionSpace &,const MemorySpace &,const std::string &,const size_t)' being compiled
with
[
ExecutionSpace=execution_space,
MemorySpace=Kokkos::HostSpace
]
../../../core/src\impl/Kokkos_ViewMapping.hpp(3455): note: see reference to function template instantiation 'Kokkos::Impl::SharedAllocationRecord<Kokkos::HostSpace,Kokkos::Impl::ViewValueFunctor<Kokkos::Device<Kokkos::Serial,Kokkos::HostSpace>,double,true>> *Kokkos::Impl::SharedAllocationRecord<Kokkos::HostSpace,Kokkos::Impl::ViewValueFunctor<Kokkos::Device<Kokkos::Serial,Kokkos::HostSpace>,double,true>>::allocate<execution_space>(const ExecutionSpace &,const MemorySpace &,const std::string &,const size_t)' being compiled
with
[
ExecutionSpace=execution_space,
MemorySpace=Kokkos::HostSpace
]
../../../core/src\impl/Kokkos_ViewMapping.hpp(3455): note: see reference to function template instantiation 'Kokkos::Impl::SharedAllocationRecord<Kokkos::HostSpace,Kokkos::Impl::ViewValueFunctor<Kokkos::Device<Kokkos::Serial,Kokkos::HostSpace>,double,true>> *Kokkos::Impl::SharedAllocationRecord<Kokkos::HostSpace,Kokkos::Impl::ViewValueFunctor<Kokkos::Device<Kokkos::Serial,Kokkos::HostSpace>,double,true>>::allocate<execution_space>(const ExecutionSpace &,const MemorySpace &,const std::string &,const size_t)' being compiled
with
[
ExecutionSpace=execution_space,
MemorySpace=Kokkos::HostSpace
]
Again about the VOID issue: if you're not willing to rename the tag struct to e.g. VOID_TAG, then #undef VOID should be added to impl/Kokkos_FunctorAnalysis.hpp. There is precedent: it was also added to impl/Kokkos_Atomic_Windows.hpp, see https://github.com/kokkos/kokkos/blob/d19aab9981a2c447e832a7b4eb7b16992328fb14/core/src/impl/Kokkos_Atomic_Windows.hpp#L55.
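To illustrate why the #undef matters, here is a minimal standalone sketch; it is not Kokkos code. The Windows SDK headers define VOID as a macro expanding to void, so a struct named VOID cannot be declared while that macro is active. The #define line below only stands in for the SDK header so the sketch also builds on other platforms, and the namespace sketch and the helper is_void_tag are hypothetical names chosen for illustration.

```cpp
#include <type_traits>

// Stand-in for the Windows SDK (winnt.h), which defines this macro.
// On a real Windows build the definition arrives via <windows.h>.
#define VOID void

// Without this #undef, `struct VOID {};` below would be preprocessed into
// `struct void {};` and fail to compile.
#ifdef VOID
#undef VOID
#endif

namespace sketch {  // hypothetical namespace, not part of Kokkos

// Tag type that shares its name with the Windows macro.
struct VOID {};

// Hypothetical helper: true only when Tag is the VOID tag type.
template <class Tag>
constexpr bool is_void_tag() {
  return std::is_same<Tag, VOID>::value;
}

}  // namespace sketch
```

Renaming the tag (e.g. to VOID_TAG, as suggested above) avoids the clash without touching the macro, which is presumably the more robust option when user code includes Windows headers after the library headers.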
It actually gets renamed in #5268
Cherry-picked to release 3.7.00 in #5282