
Crash with MPI_Reduce( MPI_IN_PLACE, ...) when destination rank > 0 and two processes on same host #6540

@latexxi

Description


MPI_Reduce crashes when called with MPI_IN_PLACE, a destination (root) rank != 0, and two processes running on the same host.

MPICH 3.2.1 (and Open MPI) behave as expected; MPICH 4.0.2 and 4.1.1 crash.

Caught signal 11 (Segmentation fault: address not mapped to object at address 0x3e6f)
0 0x0000000000153d69 __memcpy_ssse3_back() :0
1 0x0000000000288d86 MPIR_Typerep_unpack() :0
2 0x00000000002abb8a do_localcopy() utils.c:0
3 0x00000000002abcab MPIR_Localcopy() :0
4 0x00000000001e06bc MPIR_Reduce_intra_reduce_scatter_gather() :0
5 0x000000000025ae8a MPIR_Reduce_allcomm_auto() :0
6 0x000000000025afe4 MPIR_Reduce_impl() :0
7 0x000000000025b936 MPIR_Reduce() :0
8 0x00000000000fadd7 PMPI_Reduce() ???:0
9 0x000000000040150d main() /home/lauri/cpp/mpi_test/cleaned.cpp:29
10 0x00000000000223d5 __libc_start_main() ???:0
11 0x0000000000401239 _start() ???:0

TO REPRODUCE:
mpic++ -O0 -g -o mpi_reduce_test mpi_reduce_test.cpp
mpirun -n 2 ./mpi_reduce_test

// mpi_reduce_test.cpp
#include <vector>
#include <iostream>
#include <sstream>
#include <cstring>
#include <unistd.h>
#include "mpi.h"

int main(int argc, char *argv[]) {
    int err = MPI_Init( &argc, &argv );
    int NoOfProcess, rank;
    MPI_Comm_size(MPI_COMM_WORLD, &NoOfProcess);
    MPI_Comm_rank(MPI_COMM_WORLD, &rank);
    if ( rank == 0 ) {
        char buf[MPI_MAX_LIBRARY_VERSION_STRING];
        int len;
        err = MPI_Get_library_version(buf, &len);
        std::cout << buf << std::endl;
    }
    int count = 2000;  // MPI_Reduce takes an int count
    std::vector<double> buffer(count, (1.0 + (double)rank));
    char strBuf[1024];
    // MPI_Reduce to process #dst
    for ( int dst = 0; dst < 2; dst++ ) {
        if ( rank == dst ) {
            std::cerr << "In-place MPI_Reduce" << std::endl;
            err = MPI_Reduce( MPI_IN_PLACE,
                              buffer.data(),
                              count,
                              MPI_DOUBLE,
                              MPI_SUM,
                              dst,
                              MPI_COMM_WORLD );
        } else {
            err = MPI_Reduce( buffer.data(),
                              nullptr,
                              count,
                              MPI_DOUBLE,
                              MPI_SUM,
                              dst,
                              MPI_COMM_WORLD );
        }

        if ( rank == 0 ) {
            std::cerr << "After MPI_REDUCE: rank=" << rank << " buffer[0:2]=" << buffer[0] << " " << buffer[1] << std::endl;
            MPI_Recv( strBuf, 1024, MPI_CHAR, 1, 0, MPI_COMM_WORLD, MPI_STATUS_IGNORE );
            std::cerr << strBuf;
            std::cerr << "***********" << std::endl;
        } else {
            std::stringstream ss;
            ss << "After MPI_REDUCE: rank=" << rank << " buffer[0:2]=" << buffer[0] << " " << buffer[1] << std::endl;
            std::string str = ss.str();
            // +1 so the '\0' terminator is sent and strBuf prints correctly
            MPI_Send( str.c_str(), (int)str.length() + 1, MPI_CHAR, 0, 0, MPI_COMM_WORLD );
        }
    }
    err = MPI_Finalize();
    return 0;
}
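Until the regression is fixed, one possible workaround (my own sketch, not verified against the affected MPICH versions) is to avoid the MPI_IN_PLACE path entirely: the root reduces into a separate receive vector and swaps it into place, so only the ordinary out-of-place code path is exercised:

```cpp
// Hypothetical workaround sketch: reduce into a separate buffer at the root
// instead of using MPI_IN_PLACE, then swap the result into place.
#include <vector>
#include <iostream>
#include "mpi.h"

int main(int argc, char *argv[]) {
    MPI_Init(&argc, &argv);
    int rank;
    MPI_Comm_rank(MPI_COMM_WORLD, &rank);

    int count = 2000;
    std::vector<double> buffer(count, 1.0 + (double)rank);

    for (int dst = 0; dst < 2; dst++) {
        if (rank == dst) {
            // Separate receive buffer sidesteps the MPI_IN_PLACE code path.
            std::vector<double> recv(count);
            MPI_Reduce(buffer.data(), recv.data(), count, MPI_DOUBLE,
                       MPI_SUM, dst, MPI_COMM_WORLD);
            buffer.swap(recv);  // root now holds the reduced values
        } else {
            MPI_Reduce(buffer.data(), nullptr, count, MPI_DOUBLE,
                       MPI_SUM, dst, MPI_COMM_WORLD);
        }
    }

    std::cout << "rank=" << rank << " buffer[0]=" << buffer[0] << std::endl;
    MPI_Finalize();
    return 0;
}
```

Semantics should match the in-place version, at the cost of one extra count-sized allocation on the root rank.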

MPI version details:


MPICH Version:      4.0.2
MPICH Release date: Thu Apr 7 12:34:45 CDT 2022
MPICH ABI:          14:2:2
MPICH Device:       ch4:ucx
MPICH configure:    --prefix=/vols/mmsimP4_t1b_006/ws/lauri/lauri2310_mpi/mmsimMPI/4.0.2/install/64 --with-pm=hydra --with-device=ch4:ucx --disable-checkpointing --disable-libudev --enable-strict --enable-fast=O3 --disable-fortran
MPICH CC:           gcc -Wall -Wextra -Wstrict-prototypes -Wmissing-prototypes -DGCC_WALL -Wno-unused-parameter -Wshadow -Wmissing-declarations -Wundef -Wpointer-arith -Wbad-function-cast -Wwrite-strings -Wno-sign-compare -Wold-style-definition -Wnested-externs -Winvalid-pch -Wvariadic-macros -Wtype-limits -Werror-implicit-function-declaration -Wstack-usage=262144 -fno-var-tracking -Wno-unused-label -O2 -std=c99 -D_STDC_C99= -D_POSIX_C_SOURCE=200112L -O3
MPICH CXX:          g++ -O3
MPICH F77:
MPICH FC:
