Block execution until children spawned via MPI_Comm_spawn have finished

Problem description

I'm in the process of modifying an existing application, where I would like to spawn a dynamically created bash script. I created a simple wrapper routine which takes the name of the bash script as an argument. In the wrapper, the script is spawned by MPI_Comm_spawn. Directly after, the wrapper calls MPI_Finalize, which is executed before the scripts have finished:

#include "mpi.h"
#include <stdlib.h>
#include <iostream>

using namespace std;

int main(int argc, char *argv[])
{
    char *script = argv[1];
    const int maxProcs = 2;  // constant so spawnError below is a fixed-size array
    int myRank;
    MPI_Comm childComm;
    int spawnError[maxProcs];

    // Initialize
    argv[1] = NULL;
    MPI_Init(&argc, &argv);

    // Rank of parent process
    MPI_Comm_rank(MPI_COMM_WORLD, &myRank);    

    // Spawn application; root is a rank within MPI_COMM_SELF, so it must be 0
    MPI_Comm_spawn(script, MPI_ARGV_NULL, maxProcs, MPI_INFO_NULL, 0, MPI_COMM_SELF, &childComm, spawnError);

    // Finalize
    MPI_Finalize();

    return EXIT_SUCCESS;
}

If I insert

    sleep(10);

before

    MPI_Finalize();

everything works fine. Now my question is whether it is possible to block execution in the wrapper until the bash script is finished. Also, it would be nice to obtain the return value of the script. Unfortunately, it is not an option to create another wrapper for the script, which communicates with the parent wrapper and executes the bash scripts via a system call, because I need to access MPI environment variables from within the script. I hope I have made things clear enough. Any help would be greatly appreciated!

Recommended answer

If you have control over the content of the bash script, i.e. if you can put something into it before the spawn, then a very crude option would be to write a special MPI program that contains a single MPI_Barrier line:

#include <mpi.h>

int main (int argc, char **argv)
{
   MPI_Comm parent;

   MPI_Init(&argc, &argv);

   // Obtain an intercommunicator to the parent MPI job
   MPI_Comm_get_parent(&parent);

   // Check if this process is a spawned one and if so enter the barrier
   if (parent != MPI_COMM_NULL)
      MPI_Barrier(parent);

   MPI_Finalize();

   return 0;
}

Compile the program as any other MPI program with the same MPI distribution as the one used by the main MPI program and call it something like waiter. Then set an EXIT trap at the very beginning of your bash script:

#!/bin/bash
trap "/path/to/waiter $*" EXIT
...
# End of the script file

Also modify the main program to:

// Spawn application; root is a rank within MPI_COMM_SELF, so it must be 0
MPI_Comm_spawn(script, MPI_ARGV_NULL, maxProcs, MPI_INFO_NULL, 0, MPI_COMM_SELF, &childComm, spawnError);

// Wait for the waiters to enter the barrier
MPI_Barrier(childComm);

// Finalize
MPI_Finalize();

It is important that waiter is called like waiter $* inside the trap so it can receive all command line arguments that the bash script would receive since some old MPI implementations append additional arguments to the spawned executable in order to provide it with parent connectivity information. MPI-2 compliant implementations usually provide this information via the environment in order to support MPI_Init(NULL, NULL).

The way this works is pretty simple: the trap command instructs the shell to execute waiter whenever the script exits. waiter itself simply establishes an intercommunicator with the parent MPI job and waits on the barrier. Once all spawned scripts have completed, all of them start the waiter process as part of the exit trap and the barrier will be lifted.
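The trap mechanism can be tried out without any MPI installation. The following is only a sketch: fake_waiter and run_script are hypothetical names introduced here, with fake_waiter standing in for the real /path/to/waiter binary, and the "script" is run in a subshell so that its EXIT trap fires when that subshell ends.

```shell
#!/bin/bash
# Hypothetical stand-in for /path/to/waiter: it only reports that it ran
# and which arguments it received.
fake_waiter() {
    echo "waiter called with: $*"
}

# Simulate a spawned script. The body runs in a subshell, so the EXIT trap
# fires when the subshell exits, just as it would at the end of a real script.
run_script() (
    # $* is expanded here, when the trap is set, so the waiter later
    # receives the arguments the script itself was started with.
    trap "fake_waiter $*" EXIT
    echo "script body running"
    exit 3
)

status=0
run_script a b || status=$?
echo "script exit status: $status"
```

The waiter line should appear after the script body's output, and the script's own exit status (3 here) is what the caller sees, since the trap does not call exit itself.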

If you cannot modify the script, then just create a wrapper script that calls the actual script and put the waiter in the wrapper.
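A wrapper along those lines might look like the sketch below. The function name run_with_waiter and its argument order are assumptions made for illustration, not part of the original answer; the first argument is the unmodified script, the second is the waiter binary, and the rest are forwarded to both.

```shell
#!/bin/bash
# Hypothetical helper: run the real script, then the waiter on exit.
#   $1 = path to the unmodified script
#   $2 = path to the waiter binary
#   remaining arguments = whatever the MPI runtime passed to the wrapper
run_with_waiter() (
    real_script=$1
    waiter=$2
    shift 2

    # Start the waiter when this subshell exits, forwarding the arguments
    # so that older MPI implementations can pass connectivity information.
    trap "$waiter $*" EXIT

    # Run the actual script and propagate its exit status.
    "$real_script" "$@"
    exit $?
)
```

In an actual wrapper file the function would be followed by a single call such as run_with_waiter /path/to/actual_script /path/to/waiter "$@", where both paths are placeholders, and the wrapper itself is what gets passed to MPI_Comm_spawn.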

Tested and works with Open MPI and Intel MPI.
