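I am trying to use MVAPICH2-GDR to run a simple hello world program. The code compiles successfully, but it crashes with a segmentation fault at run time. My platform is Red Hat 6.5 with CUDA 7.5, so I downloaded the RPM file mvapich2-gdr-cuda7.5-intel-2.2-0.3.rc1.el6.x86_64.rpm.
The MPI code is a simple hello world program: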
    #include <mpi.h>
    #include <stdio.h>

    int main(int argc, char** argv) {
        // Initialize the MPI environment
        MPI_Init(NULL, NULL);
        // Get the number of processes
        int world_size;
        MPI_Comm_size(MPI_COMM_WORLD, &world_size);

        // Get the rank of the process
        int world_rank;
        MPI_Comm_rank(MPI_COMM_WORLD, &world_rank);

        // Print off a hello world message
        printf("Hello world from %d out of %d\n", world_rank, world_size);

        // Finalize the MPI environment.
        MPI_Finalize();
    }

To compile this program, I used the following command:

    mpicc hello.c -o hello

To run the program:

    mpirun -np 2 ./hello

The error message is as follows:
    [localhost.localdomain:mpi_rank_1][error_sighandler] Caught error: Segmentation fault (signal 11)
    [localhost.localdomain:mpi_rank_0][error_sighandler] Caught error: Segmentation fault (signal 11)
    = BAD TERMINATION OF ONE OF YOUR APPLICATION PROCESSES
    = PID 188057 RUNNING AT localhost.localdomain
    = EXIT CODE: 139
    = CLEANING UP REMAINING PROCESSES
    = YOU CAN IGNORE THE BELOW CLEANUP MESSAGES
    YOUR APPLICATION TERMINATED WITH THE EXIT STRING: Segmentation fault (signal 11)

Because MVAPICH2-GDR is not open source, I really do not know where the error comes from. Has anyone used MVAPICH2-GDR successfully?
Posted on 2016-07-27 00:42:41
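To improve the performance of GPU-to-GPU communication, MVAPICH2-GDR uses a new GDRCOPY module. You need to either explicitly point MVAPICH2-GDR to that library, or explicitly disable the feature by setting MV2_USE_GPUDIRECT_GDRCOPY=0.
As you can see, with this feature disabled I was able to run your code. For more information, see the user guide: http://mvapich.cse.ohio-state.edu/userguide/gdr/2.2rc1/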
    MV2_USE_GPUDIRECT_GDRCOPY=0 /bin/mpirun -np 2 ./a.out
    Hello world from 0 out of 2
    Hello world from 1 out of 2
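If you would rather not prefix every command, the same setting can be exported once for the shell session. The following is only a minimal sketch: it assumes `mpirun` forwards the launching shell's environment to the MPI processes, and it reuses the `./hello` binary name from the question. The commented-out alternative (keeping GDRCOPY enabled and pointing to the library) relies on a parameter name, `MV2_GPUDIRECT_GDRCOPY_LIB`, and a library path that do not appear in this thread; treat both as assumptions and check them against the user guide.

```shell
#!/bin/sh
# Disable the GDRCOPY feature for every MPI job launched from this shell.
export MV2_USE_GPUDIRECT_GDRCOPY=0

# Alternative (assumption, not taken from this thread): keep GDRCOPY enabled
# and tell MVAPICH2-GDR where the library is. The path below is hypothetical.
# export MV2_USE_GPUDIRECT_GDRCOPY=1
# export MV2_GPUDIRECT_GDRCOPY_LIB=/path/to/libgdrapi.so

# Launch only when mpirun and the binary exist, so the script is safe to dry-run.
if command -v mpirun >/dev/null 2>&1 && [ -x ./hello ]; then
    mpirun -np 2 ./hello
else
    echo "MV2_USE_GPUDIRECT_GDRCOPY=$MV2_USE_GPUDIRECT_GDRCOPY (mpirun or ./hello not found, nothing launched)"
fi
```

Exporting the variable affects every later launch from that shell, whereas the inline form used above applies to a single command only.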
https://stackoverflow.com/questions/38576791