9.1 Sparse Matrix Representation

Data structures used to maintain sparse matrices must provide access to the nonzero elements of a matrix in a manner which facilitates efficient implementation of the algorithms that are examined in Section 8. The current sparse matrix implementation also seeks to support a high degree of generality both in problem size and the definition of a matrix element. Among other things, this implies that the algorithms must be able to solve problems that are too large to fit into core. A Blink tree supported by a data base cache (described elsewhere, see also   Lehman and Yao[18]) is one possible vehiclefor this task. Simply put, the fundamental sparse matrix data structure is:

  • Each matrix is a relation in a data base, and

  • Each nonzero element of a matrix is a tuple in a matrix relation.

Matrix tuples have the structure indicated in Figure 4.

Figure 4: Matrix Tuple Structure
Typical tuple layout of a sparse matrix stored as a relation in a database.

The row and column domains of each tuple constitute a compound key to the matrix relation. Their meaning corresponds to the standard dense matrix terminology.

The description of a matrix element is left intentionally vague. Its definition varies with the application. Matrix elements must include a real number, double precision real number, complex number, or any other entity for which the arithmetical operations of addition, subtraction, multiplication, and division are reasonably defined.

In this context, matrix elements are accessed through high level data base operations:

  • Get retrieves a random tuple.

  • Next retrieves tuples sequentially. You will recall that the scan operator (defined in Section 8.2.1 and Section 8.2.2) is used extensively by sparse matrix algorithms in Section 8. Scan is implemented by embellishing the next primitive.

  • Put updates the non-key portions of an existing tuple.

  • Insert adds a new tuple to a relation.

  • Delete removes an existing tuple from a relation.

This data structure places few constraints on the representation of a matrix. However, several conventions are adopted to facilitate consistent algorithms and efficient cache access:

  • Matrices have one-based indexing, i.e. the row and column indices of an n × n matrix range from 1 to n.

  • Column zero exists for each row of an asymmetric matrix. Column zero serves as a row header and facilitates row operations. It does not enter into the calculations.

  • A symmetric matrix matrix is stored as an upper triangular matrix. In this representation, the diagonal element anchors row operations as well as entering into the computations. Column zero is not used for symmetric matrices.

Figure 5 depicts the data structure associated with sparse matrices.

Figure 5: Sparse Matrix Representation
A linked list representation of an asymmetric sparse matrix and a linked list representation of a symmetric sparse matrix.