Introduction
MUMPS is a MUltifrontal Massively Parallel sparse direct Solver developed by CERFACS, ENSEEIHT-IRIT, and INRIA:
http://graal.ens-lyon.fr/MUMPS/
MUMPS is a free solver but it is necessary to register to download it.
It is relatively easy to compile MUMPS under Unix – it is just necessary to slightly modify Makefile.inc, several versions of which come with the MUMPS distribution. For Windows users there is WinMumps
http://sourceforge.net/projects/winmumps/
a set of Visual Studio project files to compile MUMPS.
The goal of this chapter is to show how to compile MUMPS with Microsoft Visual Studio (cl) and Intel Fortran (ifort) using GNU Make instead of project files. This could be more convenient for those who are used to GNU Make. The number of changes required is relatively small, but it is necessary to modify not only Makefile.inc but also the makefiles in src, libseq and examples (four files in total), as the options for cl and ifort on Windows are not fully compatible with their Unix counterparts. See Compiling and Linking: Simple Example with Visual C++ for a short overview of typical options for cl.
The modified files can be browsed at /files/config/MUMPS/4.9.2/; below I go through the changes and explain them. The description is for the serial version of MUMPS 4.9.2. There is also a version of this document for MUMPS 4.8.4. With these files in place, running make in the root of the MUMPS distribution should be enough to build the MUMPS library and the examples.
Makefile.inc
This is a modified Makefile.INTEL.SEQ. This file should be put in the root of the MUMPS distribution. The lines that have been changed are listed below, each followed by a comment.
#LPORDDIR = $(topdir)/PORD/lib/
#IPORD = -I$(topdir)/PORD/include/
#LPORD = -L$(LPORDDIR) -lpord
I do not use PORD; METIS does the job better.
LMETISDIR = ../lib
This is the directory where libmetis.lib is located. I assume that you have copied it to the lib directory in the MUMPS distribution. If this is not the case, please modify accordingly.
LMETIS = libmetis.lib
This is the name of the METIS library.
ORDERINGSF = -Dmetis
I use only METIS.
CC = cl
The C compiler is cl.
AR = lib
The librarian is lib.
LIBSEQ = $(topdir)/libseq/libmpiseq.lib
The library has the extension lib.
LIBBLAS = mkl_intel_c.lib mkl_intel_thread.lib mkl_core.lib libiomp5md.lib
These are libraries from Intel MKL. cl and ifort usually find them through the environment variable LIB.
LIBOTHERS = -link -LIBPATH:$(LMETISDIR)
We do not need pthread on Windows. Instead, this macro tells the linker where to find the libraries. This works because -link hands everything after it to the linker, and LIBOTHERS is the last macro on the line that builds the examples. If you need to specify the path to other libraries, for example BLAS, please do it here.
OPTF = -O3 -MD -Dintel_ -DALLOW_NON_INIT -fpp
OPTL =
OPTC = -O2 -MD
These are optimization flags.
LIBM = $(LIBSEQ)
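It is necessary to rename the variable LIB, here to LIBM. cl and ifort read the environment variable LIB to locate libraries, and GNU Make passes a makefile assignment to LIB on to them in place of the original value. Without the renaming they will not find the system libraries or the MKL libraries.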
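That a makefile variable named LIB replaces the environment variable of the same name can be checked with a tiny makefile, separate from the MUMPS files. This is only a sketch: the file name demo.mk, the target show and the value overridden are made up for illustration, and a POSIX shell (for example under Cygwin) is assumed for the recipe.

```make
# demo.mk (hypothetical file, not part of the MUMPS distribution).
# LIB normally comes from the environment, where cl and ifort read it.
# Assigning it here replaces that value for every command make starts.
LIB = overridden

show:
	@echo "LIB seen by the tools: $$LIB"
```

Running make -f demo.mk show in a shell where LIB is already set should print the value from the makefile rather than the original library path. With the macro called LIBM, the environment variable is left alone.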
src/Makefile and libseq/Makefile
Copy these two files to src/ and libseq/ respectively.
In these two files it is necessary to make the following changes:
1) To rename *.a to *.lib. The libraries on Windows have the extension lib.
2) To change the option -o, which names the object file, to -Fo. By default cl and ifort give the extension *.obj to object files (not *.o). With the -Fo option it is possible to force cl and ifort to compile to *.o.
3) To change the flag for the librarian to -out: in the lines with $(AR).
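Put together, the three changes give rules of roughly the following shape. This is a sketch, not a copy of the modified makefiles: FC (assumed to be ifort), OBJS and the library name libexample.lib are placeholders I use here for illustration.

```make
# Sketch of Windows-style rules; OBJS and libexample.lib are placeholders.
.SUFFIXES: .c .F .o

# -Fo<name> sets the object file name, so the object keeps the .o extension.
.F.o:
	$(FC) $(OPTF) -c $*.F -Fo$*.o

.c.o:
	$(CC) $(OPTC) -c $*.c -Fo$*.o

# The librarian lib takes the output name through -out:
libexample.lib: $(OBJS)
	$(AR) -out:$@ $(OBJS)
```

Keeping the .o extension has the advantage that the object lists in the makefiles can presumably stay as they are.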
examples/Makefile
Copy this file to examples/.
In this file, in addition to the first two changes above, it was also necessary to change $(LIB) to $(LIBM).
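The order of the macros on the link line matters because of the -link in LIBOTHERS. A hedged sketch of one link rule follows. The target name example, FL (assumed to be the compiler driver used for linking) and MUMPSLIBS (standing for the MUMPS libraries built in src/) are placeholders, and the option that names the executable is left out on purpose.

```make
# Sketch only: example, FL and MUMPSLIBS are placeholders,
# not names taken from the modified examples/Makefile.
example: example.o
	$(FL) $(OPTL) example.o $(MUMPSLIBS) $(LMETIS) $(LIBM) $(LIBBLAS) $(LIBOTHERS)
# LIBOTHERS stays last: everything after -link goes to the linker,
# so -LIBPATH:$(LMETISDIR) is read as a linker option.
```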
Discussion
This is the official MUMPS Discussion List.
Dear Dr. Evgenii Rudnyi,
Thank you very much for your contribution. I have two questions:
1) Do you have any plans to provide makefiles to build the parallel version of MUMPS, or have you already done so? Is there any speedup when OMP_NUM_THREADS > 1 is set?
2) Do you have any suggestion for calling the parallel MUMPS without the explicit command “mpiexec -n 4 mumps_example.exe”? Since MUMPS is called with the sparse matrix data passed directly to the solver, it is not possible to send the sparse matrix data to a separate solver executable.
Thanks,
Zhanghong Tang
Dear Zhanghong,
I use MUMPS with shared memory, where the speedup comes through multithreaded BLAS. There is indeed a speedup here, but it depends on the matrix. I could run such a version with your matrix.
As for MPI, I guess that it could be possible to run MPI from the code without mpiexec but then it is necessary to learn MPI. You may want to look at Trilinos, but I should say that this is a huge library.
It would be simpler to specify a matrix through file, say
mpiexec -n 4 mumps_example matrix_name
You will find my code that does this at
http://portal.uni-freiburg.de/imteksimulation/downloads/benchmark/Bone%20model
see file7 assemble.tar.gz.
Evgenii
Dear Dr. Evgenii Rudnyi,
Thank you very much for your kind reply. Which BLAS do you use? I tried multithreaded BLAS (Intel MKL, GotoBLAS) and the speedup is not apparent; often the solution time even increases when the number of threads is > 1.
My matrix is double complex and unsymmetric. If you can give me an email address, I can send it to you (gmail is good since the file is very large).
Thanks,
Zhanghong Tang
Dear Zhanghong,
Right now I am using Intel MKL. Let me describe what I see on my notebook HP EliteBook with two processors under XP-64.
First it is good to check MKL alone. To this end, you can try dgesv.cpp to solve a system of linear equations from
http://matrixprogramming.com/files/code/LAPACK/
The makefile there compiles it with 64-bit MKL under 64-bit VC (cl).
For a random dense matrix of dimension 10000 I get the following results (I am working under Cygwin with tcsh, so you may need to modify the commands accordingly)
$ setenv OMP_NUM_THREADS 1
$ ./dgesv 10000
dgesv is over for 62.484
$ setenv OMP_NUM_THREADS 2
$ ./dgesv 10000
dgesv is over for 37.829
As one sees, the speed up is pretty good. Now I use serial MUMPS with the code run_mumps.cpp from
http://matrixprogramming.com/files/code/benchmark/
Here the makefile compiles with 32-bit MKL under 32-bit VC (cl). The results with my matrix water_tank
http://www.cise.ufl.edu/research/sparse/matrices/Rudnyi/water_tank.html
are as follows
$ setenv OMP_NUM_THREADS 1
$ ./run_mumps.exe water_tank.mtx
TIME: Factorization is done for 6.734 s
$ setenv OMP_NUM_THREADS 2
$ ./run_mumps.exe water_tank.mtx
TIME: Factorization is done for 4.782 s
Speed up is not that spectacular indeed but it is there. Would it be possible to upload your matrix to
http://groups.google.com/group/matrixprogramming/files