Tuesday, December 2, 2014

MPI4Py: From Latin America to the World!

The Python programming language has become one of the big three languages in HPC (along with Fortran and C/C++). Thanks to an extended ecosystem of scientific libraries, development tools, and applications, Python has been widely adopted by many scientific communities as a high-level, easy-to-use scripting language. Plugging Python code into pre-existing programs written in other languages is straightforward, which makes Python a powerful tool for prototyping new computational methods. It comes as no surprise that your favorite HPC library, whatever that might be, has a Python interface.

The same is true for the Message-Passing Interface (MPI), the standard mechanism for implementing large-scale parallel applications in HPC. The last decade brought several competing alternatives for using MPI from Python: PyMPI, PyPar, and MPI4Py, among others. All of those libraries shared the same goal of bringing the MPI standard to Python programmers. However, after years of competition, MPI4Py emerged as the clear winner. Interestingly, MPI4Py was developed in Latin America by Lisandro Dalcin. Lisandro is a researcher at the Research Center in Computational Methods in Argentina and holds a PhD in Engineering from Universidad Nacional del Litoral, also in Argentina. He is the author of MPI4Py, the most popular Python interface to MPI, as well as of PETSc4Py and SLEPc4Py. He is a contributor to the Cython and PETSc libraries, and he was part of the PETSc team that won an R&D 100 Award in 2009.

MPI4Py offers a convenient yet powerful way to use the MPI-2 standard in Python. The design of MPI4Py makes it possible to seamlessly incorporate message-passing parallelism into a Python program. One important feature of MPI4Py is its ability to communicate any built-in or user-defined Python object, taking advantage of the pickle module to marshal and unmarshal objects. Thus, turning an object-oriented Python code into a parallel program does not require much effort spent writing serialization methods. In addition, MPI4Py efficiently implements the two-sided communication operations of the MPI-2 standard (including non-blocking and persistent calls). To top it all off, MPI4Py provides dynamic process management, one-sided communication, and parallel I/O operations.
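As a rough illustration (not taken from the original post), the sketch below shows the two flavors of point-to-point communication described above: the lowercase send/recv methods, which pickle arbitrary Python objects behind the scenes, and the uppercase Send/Recv methods, which work on buffers such as NumPy arrays; a non-blocking exchange is included as well. It assumes NumPy is available for the buffer example and that the script is launched with something like "mpiexec -n 2 python demo.py".

    # Minimal mpi4py sketch: object-based, buffer-based, and non-blocking
    # point-to-point communication between two ranks.
    from mpi4py import MPI
    import numpy as np

    comm = MPI.COMM_WORLD
    rank = comm.Get_rank()

    # 1) Generic Python objects: lowercase send/recv pickle and unpickle
    #    the object automatically.
    if rank == 0:
        obj = {"step": 1, "values": [3.14, 2.71], "label": "demo"}
        comm.send(obj, dest=1, tag=11)
    elif rank == 1:
        obj = comm.recv(source=0, tag=11)
        print("rank 1 received:", obj)

    # 2) Array data: uppercase Send/Recv avoid pickling and map directly
    #    onto MPI datatypes.
    if rank == 0:
        data = np.arange(10, dtype="d")
        comm.Send([data, MPI.DOUBLE], dest=1, tag=22)
    elif rank == 1:
        data = np.empty(10, dtype="d")
        comm.Recv([data, MPI.DOUBLE], source=0, tag=22)

    # 3) Non-blocking variants return Request objects to be waited on.
    if rank == 0:
        req = comm.isend({"msg": "non-blocking hello"}, dest=1, tag=33)
        req.wait()
    elif rank == 1:
        req = comm.irecv(source=0, tag=33)
        msg = req.wait()
        print("rank 1 received:", msg)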

The MPI4Py project demonstrates the potential of hard work and collaborative effort in the region. For more information about MPI4Py, visit the webpage http://mpi4py.scipy.org.

1 comment:

  1. I've recently used it and was shocked at how simple it is to write an MPI program in Python with mpi4py. The developer really deserves praise for this.
    However, it seems the current version still does not support asynchronous send/recv calls. Looking forward to seeing it improve!
