8 Sparse Matrices 8.1 Sparse Matrix Methodology 8.3 Pivoting To Preserve Sparsity

8.2 Abstract Data Types for Sparse Matrices

The traditional implementation of sparse matrix techniques in engineering and scientific analysis is heavily reminiscent of its roots in the static data structures of FORTRAN. Descriptions of these data structures are provided by Duff et al.[3] , Tinney and Hart[13], and others. The current implementation takes a different tack. Sparse matrix algorithms are described using an abstract data type paradigm. That is, data sets and operators are specified, but the actual data structures used to implement them are left undefined. Any data structure that efficiently satisfies the constraints imposed in this section is suited for the job.

All signals emitted by the operators defined in this section are used to navigate through data, not to indicate errors. Error processing is intentionally omitted from the algorithms appearing in this document. The intent is to avoid clutter that obscures the nature of the algorithms.

8.2.1 Sparse Matrix

A sparse matrix A is stored in a dynamic data structure that locates an element a_ij based on its row index i and column index j. The following operations are supported on A:

Insert adds an arbitrary element a_ij to A. If a_ij does not already exist, insert signals a successful insertion. In subsequent algorithms, insertion of element a_i,j into a sparse matrix is represented as

insert( A, i, j, a )
Get retrieves an arbitrary element a_ij from A. When a_ij is an element of A, get signals a successful lookup. In subsequent algorithms, retrieval of element a_ij from a sparse matrix is represented as

get( A, i, j, a )
Scan permits sequential access to the nonzero entries of row i of A. Row scans are bounded. More specifically, a row scan finds all nonzero entries a_ij in row i of A such that $j_{min} \leq j \leq j_{max}$. When scan finds a_ij its column index j is returned. When the scan has exhausted all entries in its range, a finished signal is emitted. In subsequent algorithms, iterating row i of a sparse matrix is represented as

scan( A, i, j_min , j_max , a )

A scan has two support operations, push_scan and pop_scan, which permit the nesting of scans.

Push_scan suspends the scan at its current position.

Pop_scan resumes a suspended scan.
Put updates the value an arbitrary element a_ij of A. In subsequent algorithms, updating element a_ij of a sparse matrix is represented as

put( A, i, j, a )

In the context of the preceding function prototypes, parameters A and a can be thought of as C/C++ pointers or references. Other arguments can be viewed as integers.

The sparse matrix algorithms assume that operations that read the data structure (get and scan) make the designated element a_ij of A available in a buffer. Operations that update a_ij (insert and put) do so based on the current contents of the communication buffer.The buffer can be conceptualized as a C/C++ struct. For sparse matrices, the structure always contains at least one field: an element of the matrix. To avoid clutter in the algorithms, this ”element” field is implicit. The assignment

$$\delta=a$$

acutally represents

$$\delta=a.element$$

Should other fields of the structure become relevant, they are represented explicitly.

Section 9.1 examines one possible realization of the sparse matrix data type.

8.2.2 Adjacency List

An adjacency list, A, is a data type for representing adjacency relationships of the sparse graph G = (V, E). An adjacency list is typically stored in a dynamic data structure that identifies the edge from vertex i to vertex j as an ordered pair of vertex labels (i,j). Descriptive information is usually associated with each edge. Basic concepts of adjacency lists are reviewed in Section 2.3 (Representing Sparse Graphs as Adjaceny Lists) of the companion monograph Graph Algorithms.

Since both adjacency lists and sparse matrices represent sparse networks, it should come as no surprise that they require a similar set of operations. More specifically, the following operations are supported on an adjacency list A:

Insert adds an arbitrary edge e_i,j to A. If edge e_i,j is not already in the list, insert signals a successful insertion. In subsequent algorithms, insertion of edge e_i,j into an adjacency list is represented as

insert( A, i, j, e )
Get retrieves an arbitrary edge e_i,j from A. When edge e_i,j is in A, get signals a successful lookup. In subsequent algorithms, retrieval of edge e_i,j from an adjacency list is represented as

get( A, i, j, e )
Iterate permits sequential access to all edges incident upon vertex i. Vertex scans are bounded. More specifically, a vertex scan finds all edges e_i,j such that $j_{min} \leq j \leq j_{max}$. When iterate finds edge e_i,j, it returns j. When a vertex scan has exhausted all entries in its range, a finished signal is emitted. In subsequent algorithms, iterating the edges associated with vertex i is represented as

iterate( A, i, j_min , j_max , e )

A iterate has two support operations, push_iteratation and pop_iteration, which permit nesting of the operation.

Push_iteratation saves the current state of an iteration process.

Pop_iteratation restores the most recently suspended iteration state.
Put updates the information associated with an arbitrary edge e_i,j in A. In subsequent algorithms, updating edge e_i,j of an adjacency list is represented as

put( A, i, j, e )

In the context of the preceding function prototypes, parameters A and e can be thought of as C/C++ pointers or references. Other arguments can be viewed as integers.

The algorithms assume that read operations (get and scan) make edge information available in a buffer (this buffer is usually denoted by the symbol e). Update operations (insert and put) modify the information associated with an edge based on the current contents of the communication buffer. The buffer can be conceptualized as a C/C++ struct. Any of the adjacency list buffer fields required by an algorithm are explicitly referenced, ie. e.fillup represents the ”fillup” edge buffer field.

Algorithms for implementing adjacency lists are examined in Section 4 (Creating Adjacency Lists) of the companion monograph Graph Algorithms.

8.2.3 Reduced Graph

A reduced graph, G $^{\prime}$ = ( V $^{\prime}$, E $^{\prime}$ ), is a data structure that supports the systematic elimination of all vertices from the graph G =( V, E ). The vertices of the reduced graph are denoted as V $^{\prime}$( G $^{\prime}$ ) and its edges as E $^{\prime}$( G $^{\prime}$ ). A crucial attribute of the reduced graph is efficient identification of the vertex in V $^{\prime}$( G $^{\prime}$ ) with the minimum degree.

A reduced graph supports the following operations:

Increase_degree increases the degree of vertex v in V $^{\prime}$( G $^{\prime}$ ) by one.
Decrease_degree decreases the degree of vertex v in V $^{\prime}$( G $^{\prime}$ ) by one.
In_graph tests to see whether vertex v is in V $^{\prime}$( G $^{\prime}$ ),
Minimum_degree finds the vertex v in V $^{\prime}$( G$^{\prime}$ ) with the smallest degree.
Remove excises vertex v from V $^{\prime}$( G $^{\prime}$ ).

Implementation of reduced graph modeling is examined in detail by Section 8.2 (Eliminating Many Vertices), Section 8.3 (Initializing Minimum Degree Vertex Tracking), and Section 8.4 (Maintaining the Reduced Graph) of Graph Algorithms.

8.2.4 List

A simple list L is an ordered set of elements. If the set { l₁, …, l_i, l_i+1, …, l_n } represents L, then the list contains n elements. Element l₁ is the first item on the list and l_n is the last item on the list. Element l_i precedes l_i+1 and element l_i+1 follows l_i. Element l_i is at position i in L. Descriptive information may accompany each item on a list. Lists associated with matrix algorithms support the following operations:

Link adds an element x to a list at position i. Inserting element x into the list at position i results an an updated list: { l₁, …, l_i, l_i+1, …, l_n+1 } An insertion at position eol appends x to the end of the list.
Unlink removes the element at position i from the list. Deleting element i results in the list { l₁, …, l_i, l_i+1, …, l_n-1 }.
Find looks for an element on the list and returns its position. If the element is not a member of the list, eol is returned.
First returns the position of the first item on the list. When the list is empty, eol is returned.
Next returns position i+1 on the list if position i is provided. If l_i is the last item on the list, eol is returned.
Prev returns position i-1 on the list if position i is provided. If i is one, eol is returned.

A linked list refers to a list implementation that does not require its members to reside in contiguous storage locations. In this environment, an efficient implementation of the prev operator dictates the use of a doubly linked list.

Communicating with a simple list is analogous to adjacency list communication. Read operations (find, first, next, and prev) make list information available in a buffer. Update operations (link, unlink) modify the list based on the current contents of the buffer.

8.2.5 Mapping

A mapping $\mu$ relates elements of its domain d to elements of its range\ r as follows.

$$\mu(d)=r$$

A mapping resides in a data structure that supports two operations:

Map links an element r in the range of $\mu$ to an arbitrary element d in the domain of $\mu$, i.e. sets $\mu$(d) to r.
Evaluate evaluates the mapping $\mu$ for an arbitrary element d in its domain, i.e. returns $\mu$(d).

8.2.6 Vector

For simplicity of exposition, a full vector is represented as a linear array. However, any data structure that lets you retrieve and update an arbitrary element b_i of a vector b based upon its index i will suffice.