The goal of this section is basically to repeat the rules from the section Using Fortran Subroutines from C++ with the example of the Fortran subroutines decomp and solve from the book Computer Methods for Mathematical Computations.
I will use a simple matrix class described in matrix.h (see also Class Matrix). The class is based on
vector<double> and it stores a matrix column-wise as it is accepted in Fortran. The header also contains a simple timing class (define USECLOCK with Microsoft Visual C++). Let us look at the code in main.cc where the code determines the matrix dimension from the command line argument, then it initializes the matrix with random number, preserves the copy of matrices, as they will be destroyed by decomp and solve, then calls decomp and solve and print information about timing and the residual.
- Fortran usually uses extra underscore, so the names of Fortran subroutines are
solve_. Well, Intel Fortran on Windows does not follow this rule, so use
nm(Using nm to troubleshoot linking problems) to make sure what names to use in your C++ code.
- The Fortran function must be declared and we need to use
extern Cto prevent name mangling.
- Fortran uses pointers in function calls, so the declaration must reflect this.
- When one uses the
vector<double>class from the standard library, function
begin()not necessarily returns a pointer to double. The trick is to employ then
&*to get the pointer to double.
To compile the code with gcc
$ g++ -c -s -O3 main.cc
$ gfortran -c -s -O3 decomp.f
$ gfortran -c -s -O3 solve.f
$ g++ main.o decomp.o solve.o -o main.exe
or use the supplied make file
$ make -f fmm_c++.make
$ ./main 500
time to factor A(500,500) is 0.096 s
cond is 22349.3
time for back substitution is 1.16299e-07 s
residual is 5.7989e-15
Please note that timing is much slower as compared with functions
lu_factor/lu_solve in SciPy. The main reason is that SciPy calls LAPACK functions which in turn use the optimized BLAS and this is the main point. If you need to achieve a good performance, you cannot afford not using the optimized BLAS. The table below compares results of main.cc and 02lu.py from Linear Solve in Python (NumPy and SciPy) on my notebook (time is in second).