One route towards the numerical solution of very large linear systems in scientific computing is the development of hybrid techniques that combine the direct and iterative approaches. Such techniques inherit from the advantages of each of the two approaches: limited memory space and easiness of parallel implementation of iterative solvers, and numerical robustness of direct solvers. Two main classes of approaches can be distinguished. The first consists in computing an incomplete factorization/approximate inverse of the complete matrix in order to build a preconditioner. The second track is to perform an exact partial factorization and to solve the remaining condensed equations using a preconditioned iterative scheme where the preconditioner is built from the partial factorization. This latter approach can be related to some non-overlapping domain decomposition techniques. We will study both approaches in the framework of this project and the problem of efficient permutation techniques to enhance the robustness of the preconditioned solvers will be considered too.

Incomplete factorization
The algorithmic of a direct factorization method relies on an ordering and a partitioning of the unknowns giving rise to dense blocks in the factors. This block structure in the factors enables the use of the basic linear algebra kernels BLAS3 that are mandatory to get high performance on modern superscalar architectures. It is worthwhile to note that for systems of PDEs discretized by high-order finite element methods, the matrices of the associated algebraic systems possess a block structure where a blocks is a dense matrix whose dimension is given by the product of the number of primitive variables with the number of local interpolation nodes. Thus, for high-order interpolation, the size of an elementary block can become an important issue for the factorization methods. Moreover, when no continuity is imposed in the approximation as with discontinuous Galerkin formulations [CDS07], the size of the resulting matrix is increased as compared to the one characterizing an equivalent continuous formulation, and the individual blocks are denser due to the locality of the approximation.\\ A first approach would consist in adapting classic ordering and partitioning techniques to exhibit the dense structure in an incomplete factorization process [HRR07]. A preliminary analysis has shown that strong modifications should be implemented; furthermore new criteria will have to be designed to ensure the stability of the factors as well as a fast convergence of the iterative solver and studies carried out in the context of the ILUPACK solver will be a good starting point [BS06,ABMQ07]. The foreseen new developments will combine techniques coming from discrete mathematics (as used in sparse direct solvers) and techniques based on numerical criteria; this will be new in that framework.

Partial exact factorization and domain decomposition
These techniques bear some resemblance to solution strategies developed in the framework of domain decomposition for the numerical solution of PDE models. They mainly rely on the condensation of the linear operator on the boundary of independent subsets of equations (subdomains). The hybrid approach in this context consists in using sparse direct solvers on the submatrices associated with the subdomains to form a reduced/condensed linear system on the interface associated with the boundary [DLP07]. This linear system is often referred to as Schur complement system; it can be solved iteratively. In this domain decomposition framework, we will consider algebraic local preconditioners for the Schur complement [CGM01,GHW07]. Another approach will consist in using an ordering to build a preconditioner for the global Schur complement; this latter approach will still exploit the parallelism related to the initial partitioning of the data associated with the independent subsets/subdomains [HS06].

Parallel pre-permutation techniques for robust iterative solvers
One of the most significant advances made in iterative solvers in the last few years has been the development of permutation techniques to enhance robustness of preconditioned Krylov methods. These methods pre-process the matrix by permuting it so as to obtain large diagonal entries (nonsymmetric case) or good 1x1 or 2x2 pivots (symmetric case) during elimination. Codes, such as ILUPACK [BS06,ABMQ07], that utilizes these permutations coupled with inverse-based dropping, are significantly more reliable than standard ones. What is currently missing are truly parallel version of these techniques. Researchers are actively working to develop such parallel implementations but the state of the art is not satisfactory. We will explore this issue both for symmetric and nonsymmetric matrices.

[ABMQ07] J.I. Aliaga, M. Bollhöefer, A.F. Martin and E.S. Quintana
Parallelization of Multilevel Preconditioners Constructed from Inverse-Based ILUs on Shared Memory Multiprocessors.
Proceedings of the ParCo 2007 conference.

[BS06] M. Bollhöefer and Y. Saad
Multilevel preconditioners constructed from inverse--based ILUs.
SIAM J. on Scientific Computing 27 (5), 1627-1650, 2006.

[CDS07] A. Catella, V. Dolean and S. Lanteri
An implicit DGTD method for solving the two-dimensional Maxwell equations on unstructured triangular meshes.
INRIA Research Report RR-6110.
Accepted for publication (shorter version) in IEEE. Trans. Magn., 2007.

[DLP07] V. Dolean, S. Lanteri and R. Perrussel
A domain decomposition method for solving the three-dimensional time-harmonic Maxwell equations discretized by discontinuous Galerkin methods.
J. Comput. Phys., Vol. 227, No. 3 pp. 2044-2072, 2007.

[CGM01] L. M. Carvalho, L. Giraud, and G. Meurant
Local preconditioners for two-level non-overlapping domain decomposition methods. Numerical Linear Algebra with Applications, 8(4):207-227, 2001.

[GHW07] L. Giraud, A. Haidar and L. T. Watson
Parallel scalability study of three dimensional additive Schwarz preconditioners in non-overlapping domain decomposition.
Parallel Computing, special issue for PMAA'06, to appear, 2008.
Technical Report TR/PA/07/05, CERFACS, Toulouse, France, 2007.
Also appeared as ENSEEIHT-IRIT Technical report RT/APO/07/01.

[HS06] P. Hénon and Y. Saad
A parallel multistage ILU factorization based on a hierachical graph decomposition.
SIAM J. Sci. Comput. , 28:2266-2293 , 2006.

[HRR07] P. Hénon, P. Ramet and J. Roman
On finding approximate supernodes for an efficient block ILU(k) factorization.
Parallel Computing, special issue for PMAA'06, to appear, 2008.

Design by N.Design Studio, adapted by solidGone.org (version 1.0.0)
Powered by pmwiki-2.1.27