Message Passing Interface

How do we realize this parallelism in practice?

Let us focus on what we have discussed until now:

  • We have “machines” with multiple processors, whose main memory is partitioned into separate fragments,

  • We have algorithms that can divide a problem of size \(N\) among these processors so that they can run (almost) independently,

  • With a certain degree of approximation, we know how to estimate the best improvement we can expect from a parallel program with \(M\) processors on a problem of size \(N\).

What we need to discuss now is then: “How can we actually implement these algorithms on real machines?”

  • We need a way to define a parallel environment in which every processor is accounted for,

  • We need to have data formats that are aware of the fact that we have a distributed memory,

  • We need to exchange data between the various memory fragments.

“MPI (Message Passing Interface) is a specification for a standard library for message passing that was defined by the MPI Forum, a broadly based group of parallel computer vendors, library writers, and applications specialists.” – W. Gropp, E. Lusk, N. Doss, A. Skjellum, “A high-performance, portable implementation of the MPI message passing interface standard”, Parallel Computing, 22 (6), 1996.

  • MPI implementations consist of a specific set of routines directly callable from C, C++, Fortran, Python;

  • MPI uses Language Independent Specifications for calls and language bindings;

  • The MPI interface provides essential virtual topology, synchronization, and communication functionality within a set of processes;

  • There exist many implementations of the MPI specification, e.g., MPICH, Open MPI, etc.

Our First MPI Program

Throughout the course we are going to use MPI from within Python programs.

Let us start from the classical helloworld program:

%%file ccode/helloworld.c
#include"mpi.h"
#include<stdio.h>

int main(int argc,char **argv){
 MPI_Init( &argc, &argv);
 printf("Hello, world!\n");
 MPI_Finalize();
 return 0;
}
Overwriting ccode/helloworld.c

We can compile it by doing

mpicc helloworld.c -o helloworld
  • mpicc is a wrapper for a C compiler provided by the Open MPI implementation of MPI.

  • the option -o sets the name of the compiled (executable) file.

Let us see what is happening behind the curtain:

  • you can first discover which compiler you are using by executing

mpicc --version

that will give you something like

gcc (Ubuntu 7.4.0-1ubuntu1~18.04.1) 7.4.0
Copyright (C) 2017 Free Software Foundation, Inc.
  • or discover the library inclusion and linking options by asking for mpicc --showme:compile and mpicc --showme:link, respectively.

  • In general, looking at the output of the man mpicc command is always a good idea.

“If you find yourself saying, ‘But I don’t want to use wrapper compilers!’, please humor us and try them. See if they work for you. Be sure to let us know if they do not work for you.” – https://www.open-mpi.org/faq/?category=mpi-apps

Note

A piece of advice: if your program is anything more realistic than a classroom exercise, use make [1], and save yourself from writing painfully long compile commands and from dealing with complex dependencies more than once.

“Make gets its knowledge of how to build your program from a file called the makefile, which lists each of the non-source files and how to compute it from other files.”

A very simple Makefile for our first test would be

MPICC = mpicc #The wrapper for the compiler
CFLAGS += -g  #Useful for debug symbols
all: helloworld
helloworld: helloworld.c
  $(MPICC) $(CFLAGS) $(LDFLAGS) $? $(LDLIBS) -o $@
clean:
  rm -f helloworld

Let us run our first parallel program by doing:

mpirun [ -np X ] [ --hostfile <filename> ] ./helloworld

or by using its synonym

mpiexec [ -np X ] [ --hostfile <filename> ] ./helloworld
  • mpiexec will run X copies of helloworld in your current run-time environment, scheduling (by default) in a round-robin fashion by CPU slot.

  • if running under a supported resource manager, Open MPI’s mpirun will usually use the corresponding resource-manager process starter automatically, as opposed to, for example, rsh or ssh, which require the use of a hostfile (a minimal sketch of one is given after this list), or it will default to running all X copies on the localhost;

  • as always, look at the manual, by doing man mpirun.
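The hostfile mentioned above is just a text file listing the hosts on which processes may be started. The following is a minimal sketch, assuming Open MPI’s hostfile syntax; the host names node01 and node02 are placeholders for machines in your own cluster.

# hostfile: one host per line, with the number of available CPU slots
node01 slots=4
node02 slots=4
localhost slots=2

It would then be used, for example, as mpirun -np 8 --hostfile hostfile ./helloworld.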

!(cd ccode && make helloworld)
!mpiexec -np 4 ./ccode/helloworld
make[1]: Entering directory "/home/cirdan/Documenti/RTDa-PISA/CorsoCalcoloParallelo2021/introtoparallelcomputing/intrompi/ccode"
mpicc			 -march=nocona -mtune=haswell -ftree-vectorize -fPIC -fstack-protector-strong -fno-plt -O2 -ffunction-sections -pipe -isystem /home/cirdan/anaconda3/envs/parallel/include -g			 -Wl,-O2 -Wl,--sort-common -Wl,--as-needed -Wl,-z,relro -Wl,-z,now -Wl,--disable-new-dtags -Wl,--gc-sections -Wl,-rpath,/home/cirdan/anaconda3/envs/parallel/lib -Wl,-rpath-link,/home/cirdan/anaconda3/envs/parallel/lib -L/home/cirdan/anaconda3/envs/parallel/lib helloworld.c -lm -ldl -o helloworld
make[1]: Leaving directory "/home/cirdan/Documenti/RTDa-PISA/CorsoCalcoloParallelo2021/introtoparallelcomputing/intrompi/ccode"
Hello, world!
Hello, world!
Hello, world!
Hello, world!

Every process executes the printf line on its own: it is a local routine!

A procedure is local if completion of the procedure depends only on the local executing process.

A procedure is non-local if completion of the operation may require the execution of some MPI procedure on another process. Such an operation may require communication with another user process.

The MPI parallel environment

Let us modify our helloworld to investigate the MPI parallel environment. Specifically, we want to answer, from within the program, the following questions:

  1. How many processes are there?

  2. Who am I?

%%file ccode/hamlet.c
#include "mpi.h"
#include <stdio.h>
int main( int argc, char **argv ){
 int rank, size;
 MPI_Init( &argc, &argv );
 MPI_Comm_rank( MPI_COMM_WORLD, &rank );
 MPI_Comm_size( MPI_COMM_WORLD, &size );
 printf( "Hello world! I'm process %d of %d\n",rank, size );
 MPI_Finalize();
 return 0;
}
Overwriting ccode/hamlet.c
  • How many? is answered by a call to MPI_Comm_size, which returns an int value;

  • Who am I? is answered by a call to MPI_Comm_rank, which returns an int value conventionally called the rank: a number between 0 and size-1.

The last keyword we need to describe is MPI_COMM_WORLD: this is the standard Communicator object.

Communicator: A Communicator object connects a group of processes in one MPI session. There can be more than one communicator in an MPI session, each of them gives each contained process an independent identifier and arranges its contained processes in an ordered topology.

This provides

  • a safe communication space, which guarantees that our code can communicate as needed without conflicting with communication extraneous to it, e.g., when other parallel libraries are in use,

  • a unified object that conveniently denotes the communication context and the group of communicating processes, and that houses abstract process naming.

If we have saved our inquiring MPI program in the file hamlet.c, we can extend our Makefile by modifying/adding the lines

all: helloworld hamlet
hamlet: hamlet.c
 $(MPICC) $(CFLAGS) $(LDFLAGS) $? $(LDLIBS) -o $@
clean:
 rm -f helloworld hamlet

Then, we compile everything by doing make hamlet (or, simply, make).

!(cd ccode && make hamlet)
!mpiexec -np 6 ./ccode/hamlet
make[1]: Entering directory "/home/cirdan/Documenti/RTDa-PISA/CorsoCalcoloParallelo2021/introtoparallelcomputing/intrompi/ccode"
mpicc			 -march=nocona -mtune=haswell -ftree-vectorize -fPIC -fstack-protector-strong -fno-plt -O2 -ffunction-sections -pipe -isystem /home/cirdan/anaconda3/envs/parallel/include -g			 -Wl,-O2 -Wl,--sort-common -Wl,--as-needed -Wl,-z,relro -Wl,-z,now -Wl,--disable-new-dtags -Wl,--gc-sections -Wl,-rpath,/home/cirdan/anaconda3/envs/parallel/lib -Wl,-rpath-link,/home/cirdan/anaconda3/envs/parallel/lib -L/home/cirdan/anaconda3/envs/parallel/lib hamlet.c -lm -ldl -o hamlet
make[1]: Leaving directory "/home/cirdan/Documenti/RTDa-PISA/CorsoCalcoloParallelo2021/introtoparallelcomputing/intrompi/ccode"
Hello world! I'm process 1 of 6
Hello world! I'm process 4 of 6
Hello world! I'm process 0 of 6
Hello world! I'm process 2 of 6
Hello world! I'm process 3 of 6
Hello world! I'm process 5 of 6

We can rewrite the same code in Python as

%%file hamlet.py
"""
Hello (parallel) world!
"""
from mpi4py import MPI

comm = MPI.COMM_WORLD 
rank = comm.Get_rank() 
size = comm.Get_size() 

print("Hello world! I'm process ",rank," of ",size)
Overwriting hamlet.py

What have we done here?

  • The instruction

from mpi4py import MPI

provides basic MPI definitions and types; if this were C code, it would be the preprocessor directive #include "mpi.h",

  • we then get a handle to the (already initialized) world communicator

comm = MPI.COMM_WORLD

For the Python code

  • How many? is answered by a call to comm.Get_size(), which returns an int value;

  • Who am I? is answered by a call to comm.Get_rank(), which returns an int value conventionally called the rank: a number between 0 and size-1.

!mpiexec -n 4 python hamlet.py
Hello world! I'm process  0  of  4
Hello world! I'm process  1  of  4
Hello world! I'm process  2  of  4
Hello world! I'm process  3  of  4
  • Every process answers the call,

  • but each answers as soon as it is done with its own computation: there is no synchronization!

Point-to-point communication

Sending and Receiving Messages

We have seen that each process within a communicator is identified by its rank; how can we exchange data between two processes?

We need several pieces of information to have a meaningful message:

  • Who is sending the data?

  • To whom is the data being sent?

  • What type of data are we sending?

  • How can the receiver identify it?

The blocking send and receive

int MPI_Send(void *message, int count, 
    MPI_Datatype datatype, int dest, int tag, 
    MPI_Comm comm)
  • void *message: points to the message content itself, it can be a simple scalar or a group of data,

  • int count: specifies the number of data elements of which the message is composed,

  • MPI_Datatype datatype: indicates the data type of the elements that make up the message,

  • int dest: the rank of the destination process,

  • int tag: the user-defined tag field,

  • MPI_Comm comm: the communicator in which the source and destination processes reside and for which their respective ranks are defined.

int MPI_Recv (void *message, int count, 
    MPI_Datatype datatype, int source, int tag,
    MPI_Comm comm, MPI_Status *status)
  • void *message: points to the message content itself, it can be a simple scalar or a group of data,

  • int count: specifies the number of data elements of which the message is composed,

  • MPI_Datatype datatype: indicates the data type of the elements that make up the message,

  • int source: the rank of the source process,

  • int tag: the user-defined tag field,

  • MPI_Comm comm: the communicator in which the source and destination processes reside,

  • MPI_Status *status: is a structure that contains three fields named MPI_SOURCE , MPI_TAG, and MPI_ERROR.

Basic MPI Data Types

Of the arguments in the previous slides, the only one that is specific to MPI is the MPI_Datatype; each of these corresponds to a C data type:

MPI Data Type       | C Type
--------------------|--------------------
MPI_CHAR            | signed char
MPI_SHORT           | signed short int
MPI_INT             | signed int
MPI_LONG            | signed long int
MPI_FLOAT           | float
MPI_DOUBLE          | double
MPI_LONG_DOUBLE     | long double
MPI_UNSIGNED_CHAR   | unsigned char
MPI_UNSIGNED_SHORT  | unsigned short int
MPI_UNSIGNED        | unsigned int
MPI_UNSIGNED_LONG   | unsigned long int

Note: we will see in the following how to send/receive user-defined data structures.

Why “blocking” send and receive?

For MPI_Send, being blocking means that it does not return until the message data and envelope have been safely stored away, so that the sender is free to modify the send buffer: it is a non-local operation.

Note: the message might be copied directly into the matching receive buffer, or it might be copied into a temporary system buffer.

A simple send/receive example

If we want to test these two instructions we can write the following simple C program.

%%file ccode/easysendrecv.c
#include "mpi.h"
#include <string.h>
#include <stdio.h>
int main( int argc, char **argv){
 char message[20];
 int myrank;
 MPI_Status status;
 MPI_Init( &argc, &argv );
 MPI_Comm_rank( MPI_COMM_WORLD, &myrank );
 if (myrank == 0){  /* code for process zero */
  strcpy(message,"Hello, there");
  MPI_Send(message, strlen(message)+1, MPI_CHAR, 1, 99, MPI_COMM_WORLD);
 }
 else if (myrank == 1){ /* code for process one */
  MPI_Recv(message, 20, MPI_CHAR, 0, 99, MPI_COMM_WORLD, &status);
  printf("received :%s:\n", message);
 }
 MPI_Finalize();
 return 0;
}
Overwriting ccode/easysendrecv.c

This could be recast in Python as

%%file easysendrecv.py
"""
A simple send/receive example
"""
from mpi4py import MPI

comm = MPI.COMM_WORLD 
rank = comm.Get_rank() 
size = comm.Get_size() 

if rank == 0:
    data = "Hello, there"
    comm.send(data, dest=1, tag=99)
elif rank == 1:
    data = comm.recv(source=0, tag=99)
    print('received :',data)
Overwriting easysendrecv.py

We can run it, just like the simpler program, by doing:

!mpiexec -np 2 python easysendrecv.py
received : Hello, there

for the Python version, or the following for the C version

!(cd ccode && make easysendrecv)
!mpiexec -np 2 ./ccode/easysendrecv
make[1]: Entering directory "/home/cirdan/Documenti/RTDa-PISA/CorsoCalcoloParallelo2021/introtoparallelcomputing/intrompi/ccode"
mpicc			 -march=nocona -mtune=haswell -ftree-vectorize -fPIC -fstack-protector-strong -fno-plt -O2 -ffunction-sections -pipe -isystem /home/cirdan/anaconda3/envs/parallel/include -g			 -Wl,-O2 -Wl,--sort-common -Wl,--as-needed -Wl,-z,relro -Wl,-z,now -Wl,--disable-new-dtags -Wl,--gc-sections -Wl,-rpath,/home/cirdan/anaconda3/envs/parallel/lib -Wl,-rpath-link,/home/cirdan/anaconda3/envs/parallel/lib -L/home/cirdan/anaconda3/envs/parallel/lib easysendrecv.c -lm -ldl -o easysendrecv
make[1]: Leaving directory "/home/cirdan/Documenti/RTDa-PISA/CorsoCalcoloParallelo2021/introtoparallelcomputing/intrompi/ccode"
received :Hello, there:

So, what have we done? Process \(0\) sends the content of the char array message[20], whose length is strlen(message)+1 elements of type MPI_CHAR, to process 1 with tag 99 on the communicator MPI_COMM_WORLD. On the other side, process \(1\) receives into the buffer message[20] up to 20 elements of type MPI_CHAR, from process 0 with tag 99 on the same communicator MPI_COMM_WORLD.

Observe that in the Python case we did not declare the size or the type of the object we were passing. The all-lowercase methods of the Comm class, like send() and recv(), work by passing the object to be sent as a parameter to the communication call, and the received object is simply the return value. These variants can communicate general Python objects.

In MPI for Python, the MPI.Comm.Send() and MPI.Comm.Recv() methods of communicator objects provide support for blocking point-to-point communications and can be used to communicate memory buffers, as we do in the C variant. Consider the following example sending a numpy array between two processes.

%%file easysendrecv2.py
"""
A (slightly less) simple send/receive example
In which we :
- pass MPI datatypes explicitly
- use the MPI datatype discovery
"""
from mpi4py import MPI
import numpy

comm = MPI.COMM_WORLD
rank = comm.Get_rank()

if rank == 0:
    data = numpy.arange(1000, dtype='i')
    comm.Send([data, MPI.INT], dest=1, tag=77)
elif rank == 1:
    data = numpy.empty(1000, dtype='i')
    comm.Recv([data, MPI.INT], source=0, tag=77)

if rank == 0:
    data = numpy.arange(100, dtype=numpy.float64)
    comm.Send(data, dest=1, tag=13)
elif rank == 1:
    data = numpy.empty(100, dtype=numpy.float64)
    comm.Recv(data, source=0, tag=13)
Overwriting easysendrecv2.py

In general, buffer arguments to these calls must be explicitly specified using a 2- or 3-item list/tuple such as [data, MPI.DOUBLE] or [data, count, MPI.DOUBLE] (the former uses the byte size of data and the extent of the MPI datatype to define the count).

!mpiexec -np 2 python easysendrecv2.py

To communicate general Python objects, as the lowercase variants do, mpi4py relies on the pickle module. This module implements binary protocols for serializing and de-serializing a Python object structure: “pickling” is the process of converting a Python object hierarchy into a byte stream, and “unpickling” is the inverse operation, converting a byte stream (from a binary file or bytes-like object) back into an object hierarchy.
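As a quick, MPI-independent illustration of what pickling does, the following minimal sketch (using only the standard library) serializes and then restores a small Python object:

import pickle

# any picklable Python object: numbers, strings, lists, dicts, ...
obj = {"rank": 0, "payload": [3.14, 2.71]}

blob = pickle.dumps(obj)       # pickling: object hierarchy -> byte stream
restored = pickle.loads(blob)  # unpickling: byte stream -> object hierarchy

print(type(blob), restored == obj)  # <class 'bytes'> True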

A simple send/receive example : programmer smash!

It is a good exercise to try and mess things up, so let us see some damaging suggestions (test them with the previous C code):

  • What happens if we have a mismatch in the tags?

  • A: The process stays there hanging waiting for a message with a tag that will never come…

  • What happens if we have a mismatch in the ranks of the sending and receiving processes?

  • A: The process stays there hanging trying to match messages that will never come…

  • What happens if we use the wrong message size?

  • A: If the arriving message is longer than expected, we get an MPI_ERR_TRUNCATE: message truncated error; note that there are combinations of wrong sizes for which things still work.

  • What happens if we have a mismatch in the type?

  • A: There are combinations of instances in which things seem to work, but the code is erroneous and the behavior is not deterministic.

Deadlock

We now have two processes that need to exchange some data.

  • Solution 1:

MPI_Comm_rank(comm, &myrank);
if (myrank == 0){
 MPI_Send(sendbuf, count, MPI_DOUBLE, 1, tag, comm);
 MPI_Recv(recvbuf, count, MPI_DOUBLE, 1, tag, comm, status);
}else if(myrank == 1){
 MPI_Send(sendbuf, count, MPI_DOUBLE, 0, tag, comm); 
 MPI_Recv(recvbuf, count, MPI_DOUBLE, 0, tag, comm, status);
}
  • Solution 2:

MPI_Comm_rank(comm, &myrank);
if (myrank == 0){
 MPI_Recv(recvbuf, count, MPI_DOUBLE, 1, tag, comm, status);
 MPI_Send(sendbuf, count, MPI_DOUBLE, 1, tag, comm);
}else if(myrank == 1){ 
 MPI_Recv(recvbuf, count, MPI_DOUBLE, 0, tag, comm, status);
 MPI_Send(sendbuf, count, MPI_DOUBLE, 0, tag, comm);
}
  • Solution 3:

MPI_Comm_rank(comm, &myrank);
if (myrank == 0){
 MPI_Send(sendbuf, count, MPI_DOUBLE, 1, tag, comm);
 MPI_Recv(recvbuf, count, MPI_DOUBLE, 1, tag, comm, status);
}else if(myrank == 1){
 MPI_Recv(recvbuf, count, MPI_DOUBLE, 0, tag, comm, status);
 MPI_Send(sendbuf, count, MPI_DOUBLE, 0, tag, comm); 
}

In the case of Solution 1:

MPI_Comm_rank(comm, &myrank);
if (myrank == 0){
 MPI_Send(...);
 MPI_Recv(...);
}else if(myrank == 1){
 MPI_Send(...); 
 MPI_Recv(...);
}
  • The call MPI_Send is blocking, therefore the message sent by each process has to be copied out before the send operation returns and the receive operation starts.

  • For the call to complete successfully, it is then necessary that at least one of the two messages sent be buffered, otherwise …

  • a deadlock situation occurs: both processes are blocked since there is no buffer space available!

In the case of Solution 2:

MPI_Comm_rank(comm, &myrank);
if (myrank == 0){
 MPI_Recv(...);
 MPI_Send(...);
}else if(myrank == 1){
 MPI_Recv(...);
 MPI_Send(...); 
}
  • The receive operation of process \(0\) must complete before its send. It can complete only if the matching send of process \(1\) is executed.

  • The receive operation of process \(1\) must complete before its send. It can complete only if the matching send of process \(0\) is executed.

  • This program will always deadlock.

In the case of Solution 3:

MPI_Comm_rank(comm, &myrank);
if (myrank == 0){
 MPI_Send(...);
 MPI_Recv(...);
}else if(myrank == 1){
 MPI_Recv(...);
 MPI_Send(...); 
}
  • This program will succeed even if no buffer space for data is available.
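For instance, the ordering of Solution 3 can be reproduced in mpi4py. The following is a minimal sketch (the file name exchange.py and the buffer contents are just for illustration), to be run with mpiexec -np 2 python exchange.py:

"""
Deadlock-free exchange between two processes (the ordering of Solution 3).
"""
from mpi4py import MPI
import numpy

comm = MPI.COMM_WORLD
rank = comm.Get_rank()

count = 10
sendbuf = numpy.full(count, rank, dtype='d')  # each process sends its own rank
recvbuf = numpy.empty(count, dtype='d')

if rank == 0:
    comm.Send([sendbuf, MPI.DOUBLE], dest=1, tag=0)
    comm.Recv([recvbuf, MPI.DOUBLE], source=1, tag=0)
    print("Process 0 received", recvbuf[0])
elif rank == 1:
    comm.Recv([recvbuf, MPI.DOUBLE], source=0, tag=0)
    comm.Send([sendbuf, MPI.DOUBLE], dest=0, tag=0)
    print("Process 1 received", recvbuf[0])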

Deadlock Issues

We can try to salvage the situation in the case of Solution 1 by allocating buffer space for the send calls

if (myrank == 0){
 MPI_Send(...);
 MPI_Recv(...);
}else if(myrank == 1){
 MPI_Send(...); 
 MPI_Recv(...);
}

We can substitute the MPI_Send operation with a Send in buffered mode

int MPI_Bsend(const void* buf, int count, 
  MPI_Datatype datatype, int dest,
  int tag, MPI_Comm comm)
  • A buffered mode send operation can be started whether or not a matching receive has been posted;

  • It may complete before a matching receive is posted;

  • This operation is local!

Allocating buffer space

To actually use MPI_Bsend we also need to allocate the space for the buffer; to this end we use the following two functions:

#define BUFFSIZE 10000
int size; char *buff;
// Buffer of 10000 bytes for MPI_Bsend
MPI_Buffer_attach( malloc(BUFFSIZE), BUFFSIZE);
// Buffer size reduced to zero 
MPI_Buffer_detach( &buff, &size);
// Buffer of 10000 bytes available again 
MPI_Buffer_attach( buff, size); 

Warning

A pointer to the buffer is passed to MPI_Buffer_attach, while the address of the pointer is passed to MPI_Buffer_detach; both arguments are declared void *.
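Buffered sends are also available from Python. The following is a minimal sketch, assuming your mpi4py version exposes comm.Bsend, the module-level MPI.Attach_buffer/MPI.Detach_buffer helpers and the MPI.BSEND_OVERHEAD constant; run it with 2 processes:

"""
A buffered-mode send with mpi4py (sketch).
"""
from mpi4py import MPI
import numpy

comm = MPI.COMM_WORLD
rank = comm.Get_rank()

data = numpy.arange(100, dtype='i')
# attach a user buffer large enough for the message plus the bsend overhead
MPI.Attach_buffer(bytearray(data.nbytes + MPI.BSEND_OVERHEAD))

if rank == 0:
    comm.Bsend([data, MPI.INT], dest=1, tag=11)  # local: completes even before the receive is posted
elif rank == 1:
    recv = numpy.empty(100, dtype='i')
    comm.Recv([recv, MPI.INT], source=0, tag=11)

MPI.Detach_buffer()  # blocks until all buffered messages have been delivered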

Nonblocking communications

As we have seen, the use of blocking communications ensures that

  • the send and receive buffers used in the MPI_Send and MPI_Recv arguments are safe to use or reuse after the function call,

  • but it also means that unless there is a simultaneously matching send for each receive, the code will deadlock.

There exists a version of the point-to-point communications that returns immediately from the function call, before confirming that the send or the receive has completed: these are the nonblocking send and receive functions.

  • To verify that the data has been copied out of the send buffer a separate call is needed,

  • To verify that the data has been received into the receive buffer a separate call is needed,

  • The sender should not modify any part of the send buffer after a nonblocking send operation is called, until the send completes.

  • The receiver should not access any part of the receive buffer after a nonblocking receive operation is called, until the receive completes.

Nonblocking communications: MPI_Isend and MPI_Irecv

The two nonblocking point-to-point communication calls are then

int MPI_Isend(void *message, int count, 
   MPI_Datatype datatype, int dest, int tag,
   MPI_Comm comm, MPI_Request *send_request);

int MPI_Irecv(void *message, int count, 
   MPI_Datatype datatype, int source, int tag,
   MPI_Comm comm, MPI_Request *recv_request);
  • The MPI_Request variables take the place of the MPI_Status argument and store information about the pending communication operation.

  • The way of saying when these communications must be completed is to call MPI_Wait(MPI_Request *request, MPI_Status *status): the nonblocking request originating from MPI_Isend or MPI_Irecv is provided as an argument, and the call returns only when the corresponding operation has completed.

Nonblocking communications: an example

%%file ccode/nonblockingsendrecv.c
#include <stdio.h>
#include <mpi.h>

int main(int argc, char **argv) {
 int a, b, size, rank, tag = 0; 
 MPI_Status status;
 MPI_Request send_request, recv_request;
 MPI_Init(&argc, &argv);
 MPI_Comm_size(MPI_COMM_WORLD, &size);
 MPI_Comm_rank(MPI_COMM_WORLD, &rank);
if (rank == 0) {
 a = 314159; 
 MPI_Isend(&a, 1, MPI_INT, 1, tag, MPI_COMM_WORLD, &send_request);
 MPI_Irecv (&b, 1, MPI_INT, 1, tag, MPI_COMM_WORLD, &recv_request);
 MPI_Wait(&send_request, &status);
 MPI_Wait(&recv_request, &status);
 printf ("Process %d received value %d\n", rank, b);
} else {
 a = 667;
 MPI_Isend (&a, 1, MPI_INT, 0, tag, MPI_COMM_WORLD, &send_request);
 MPI_Irecv (&b, 1, MPI_INT, 0, tag, MPI_COMM_WORLD, &recv_request);
 MPI_Wait(&send_request, &status);
 MPI_Wait(&recv_request, &status);
 printf ("Process %d received value %d\n", rank, b);
}
 MPI_Finalize();
 return 0;
}
Overwriting ccode/nonblockingsendrecv.c

We can compile our code by simply adding to our Makefile

nonblockingsendrecv: nonblockingsendrecv.c
  $(MPICC) $(CFLAGS) $(LDFLAGS) $? $(LDLIBS) -o $@

then we type make nonblockingsendrecv, and we run our program, getting as answer:

!(cd ccode && make nonblockingsendrecv)
!mpiexec -np 2 ./ccode/nonblockingsendrecv
make[1]: Entering directory "/home/cirdan/Documenti/RTDa-PISA/CorsoCalcoloParallelo2021/introtoparallelcomputing/intrompi/ccode"
mpicc			 -march=nocona -mtune=haswell -ftree-vectorize -fPIC -fstack-protector-strong -fno-plt -O2 -ffunction-sections -pipe -isystem /home/cirdan/anaconda3/envs/parallel/include -g			 -Wl,-O2 -Wl,--sort-common -Wl,--as-needed -Wl,-z,relro -Wl,-z,now -Wl,--disable-new-dtags -Wl,--gc-sections -Wl,-rpath,/home/cirdan/anaconda3/envs/parallel/lib -Wl,-rpath-link,/home/cirdan/anaconda3/envs/parallel/lib -L/home/cirdan/anaconda3/envs/parallel/lib nonblockingsendrecv.c -lm -ldl -o nonblockingsendrecv
make[1]: Leaving directory "/home/cirdan/Documenti/RTDa-PISA/CorsoCalcoloParallelo2021/introtoparallelcomputing/intrompi/ccode"
Process 0 received value 667
Process 1 received value 314159

Another useful instruction for the case of nonblocking communication is represented by

int MPI_Test(MPI_Request *request, int *flag, MPI_Status *status);

A call to MPI_TEST returns flag = true if the operation identified by request is complete. In such a case, the status object is set to contain information on the completed operation.
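In mpi4py the same pattern is available through the Isend and Irecv methods of communicators, which return Request objects with Wait() and Test() methods. The following is a minimal sketch of the previous exchange, to be run with exactly 2 processes (e.g., mpiexec -np 2 python nonblocking.py, where the file name is just for illustration):

"""
Nonblocking exchange with mpi4py (sketch): each of the two processes
posts a send and a receive, then completes both requests.
"""
from mpi4py import MPI
import numpy

comm = MPI.COMM_WORLD
rank = comm.Get_rank()

a = numpy.array([314159 if rank == 0 else 667], dtype='i')
b = numpy.empty(1, dtype='i')
other = 1 - rank  # the other process (valid only with 2 processes)

send_request = comm.Isend([a, MPI.INT], dest=other, tag=0)
recv_request = comm.Irecv([b, MPI.INT], source=other, tag=0)

# Test() is the nonblocking counterpart of Wait(): it returns True if the
# operation has completed, so we could do useful work while it has not.
if not recv_request.Test():
    pass  # ... some useful computation could go here ...

send_request.Wait()
recv_request.Wait()
print("Process", rank, "received value", b[0])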

Send-Receive

The send-receive operations combine in one call the sending of a message to one destination and the receiving of another message from another process.

  • Source and destination are possibly the same,

  • Send-receive operation is very useful for executing a shift operation across a chain of processes,

  • A message sent by a send-receive operation can be received by a regular receive operation.

int MPI_Sendrecv(const void *sendbuf, int sendcount, 
  MPI_Datatype sendtype, int dest, int sendtag, 
  void *recvbuf, int recvcount, MPI_Datatype recvtype, 
  int source, int recvtag, MPI_Comm comm,
  MPI_Status *status);

Send-Receive-Replace

A slight variant of the MPI_Sendrecv operation is the MPI_Sendrecv_replace operation

int MPI_Sendrecv_replace(void* buf, int count, 
    MPI_Datatype datatype, int dest, int sendtag, 
    int source, int recvtag, 
    MPI_Comm comm, MPI_Status *status)

as the name suggests, the same buffer is used both for the send and for the receive, so that the message sent is replaced by the message received. Clearly, if you compare its arguments with those of MPI_Sendrecv, the arguments void *recvbuf, int recvcount are absent.
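In mpi4py the corresponding methods are Comm.Sendrecv and Comm.Sendrecv_replace. The following minimal sketch performs a shift across a ring of processes with Sendrecv; it should work for any number of processes, and the file name ringshift.py is just for illustration:

"""
A ring shift with Sendrecv (sketch): each process sends its rank to the
right neighbour and receives the rank of the left neighbour in one call.
"""
from mpi4py import MPI
import numpy

comm = MPI.COMM_WORLD
rank = comm.Get_rank()
size = comm.Get_size()

right = (rank + 1) % size  # destination of my message
left = (rank - 1) % size   # source of the message I receive

sendbuf = numpy.array([rank], dtype='i')
recvbuf = numpy.empty(1, dtype='i')

comm.Sendrecv([sendbuf, MPI.INT], dest=right, sendtag=0,
              recvbuf=[recvbuf, MPI.INT], source=left, recvtag=0)

print("Process", rank, "received", recvbuf[0], "from process", left)

It can be run, for example, with mpiexec -np 4 python ringshift.py.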


[1] https://www.gnu.org/software/make/