4.1 Gaussian Elimination

Approaches to LU decomposition which systematically capture the intermediate results of Gaussian elimination often differ in the order in which A is forced into upper triangular form. The most common alternatives are to eliminate the subdiagonal parts of A either one row at a time or one column at a time. The calculations required to zero a complete row or a complete column are referred to as one stage of the elimination process.

The effects of the kth stage of Gaussian elimination on the A matrix are summarized by the following equation.

aij(k+1)=aij(k)-(aik(k)akk(k))aij(k),wherei,j>ka_{ij}^{(k+1)}=a_{ij}^{\left(k\right)}-\left(\frac{a_{ik}^{\left(k\right)}}{a_% {kk}^{\left(k\right)}}\right)a_{ij}^{\left(k\right)},\operatorname{where}i,j>k (27)

The notation a(k)ij{}_{ij}^{(k)} means the value of aij produced during the kth stage of the elimination procedure. In Equation 27, the term aik(k)akk(k)\frac{a_{ik}^{\left(k\right)}}{a_{kk}^{\left(k\right)}} (sometimes referred to as a multiplier) captures the crux of the elimination process. It describes the effect of eliminating element aik on the other entries in row i during the kth stage of the elimination. In fact, these multipliers are the elements of the lower triangular matrix L, i.e.

lik=aik(k)akk(k)l_{ik}=\frac{a_{ik}^{\left(k\right)}}{a_{kk}^{\left(k\right)}} (28)

Algorithm 1 implements Equation 27 and Equation 28 and computes the LU decomposition of an m × n matrix A. It is based on Algorithm 4.2–1 of Golub and van Van Loan [8].

Algorithm 1: LU Decomposition
for k=1,,min(m-1,n)k=1,\cdots,\operatorname{min}(m-1,n)
   for j=k+1,,nj=k+1,\cdots,n
      wj=akjw_{j}=a_{kj}
   for i=k+1,,mi=k+1,\cdots,m
      α=aikakk\alpha=\displaystyle\frac{a_{ik}}{a_{kk}}
      aik=αa_{ik}=\alpha
      for j=k+1,,nj=k+1,\cdots,n
         aij=aij-αwja_{ij}=a_{ij}-\alpha w_{j}

The algorithm overwrites aij with lij when i < j. Otherwise, aij is overwritten by uij. The algorithm creates a matrix U that is upper triangular and a matrix L that is unit lower triangular. Note that a working vector w of length n is required by the algorithm.