8.2 Abstract Data Types for Sparse Matrices

The traditional implementation of sparse matrix techniques in engineering and scientific analysis is heavily reminiscent of its roots in the static data structures of FORTRAN. Descriptions of these data structures are provided by  Duff et al.[3] , Tinney and Hart[13], and others. The current implementation takes a different tack. Sparse matrix algorithms are described using an abstract data type paradigm. That is, data sets and operators are specified, but the actual data structures used to implement them are left undefined. Any data structure that efficiently satisfies the constraints imposed in this section is suited for the job.

All signals emitted by the operators defined in this section are used to navigate through data, not to indicate errors. Error processing is intentionally omitted from the algorithms appearing in this document. The intent is to avoid clutter that obscures the nature of the algorithms.

8.2.1 Sparse Matrix

A sparse matrix A is stored in a dynamic data structure that locates an element aij based on its row index i and column index j. The following operations are supported on A:

  • Insert adds an arbitrary element aij to A. If aij does not already exist, insert signals a successful insertion. In subsequent algorithms, insertion of element ai,j into a sparse matrix is represented as

    insert( A, i, j, a )

  • Get retrieves an arbitrary element aij from A. When aij is an element of A, get signals a successful lookup. In subsequent algorithms, retrieval of element aij from a sparse matrix is represented as

    get( A, i, j, a )

  • Scan permits sequential access to the nonzero entries of row i of A. Row scans are bounded. More specifically, a row scan finds all nonzero entries aij in row i of A such that $j_{min} \leq j \leq j_{max}$. When scan finds aij its column index j is returned. When the scan has exhausted all entries in its range, a finished signal is emitted. In subsequent algorithms, iterating row i of a sparse matrix is represented as

    scan( A, i, jmin , jmax , a )

    A scan has two support operations, push_scan and pop_scan, which permit the nesting of scans.

    Push_scan suspends the scan at its current position.

    Pop_scan resumes a suspended scan.

  • Put updates the value an arbitrary element aij of A. In subsequent algorithms, updating element aij of a sparse matrix is represented as

    put( A, i, j, a )

In the context of the preceding function prototypes, parameters A and a can be thought of as C/C++ pointers or references. Other arguments can be viewed as integers.

The sparse matrix algorithms assume that operations that read the data structure (get and scan) make the designated element aij of A available in a buffer. Operations that update aij (insert and put) do so based on the current contents of the communication buffer.The buffer can be conceptualized as a C/C++ struct. For sparse matrices, the structure always contains at least one field: an element of the matrix. To avoid clutter in the algorithms, this ”element” field is implicit. The assignment

$$\delta=a$$

acutally represents

$$\delta=a.element$$

Should other fields of the structure become relevant, they are represented explicitly.

Section 9.1 examines one possible realization of the sparse matrix data type.

8.2.2 Adjacency List

An adjacency list, A, is a data type for representing adjacency relationships of the sparse graph G = (V, E). An adjacency list is typically stored in a dynamic data structure that identifies the edge from vertex i to vertex j as an ordered pair of vertex labels (i,j). Descriptive information is usually associated with each edge. Basic concepts of adjacency lists are reviewed in Sec­tion 2.3 (Representing Sparse Graphs as Adjaceny Lists) of the companion monograph Graph Algorithms.

Since both adjacency lists and sparse matrices represent sparse networks, it should come as no surprise that they require a similar set of operations. More specifically, the following operations are supported on an adjacency list A:

  • Insert adds an arbitrary edge ei,j to A. If edge ei,j is not already in the list, insert signals a successful insertion. In subsequent algorithms, insertion of edge ei,j into an adjacency list is represented as

    insert( A, i, j, e )

  • Get retrieves an arbitrary edge ei,j from A. When edge ei,j is in A, get signals a successful lookup. In subsequent algorithms, retrieval of edge ei,j from an adjacency list is represented as

    get( A, i, j, e )

  • Iterate permits sequential access to all edges incident upon vertex i. Vertex scans are bounded. More specifically, a vertex scan finds all edges ei,j such that $j_{min} \leq j \leq j_{max}$. When iterate finds edge ei,j, it returns j. When a vertex scan has exhausted all entries in its range, a finished signal is emitted. In subsequent algorithms, iterating the edges associated with vertex i is represented as

    iterate( A, i, jmin , jmax , e )

    A iterate has two support operations, push_iteratation and pop_iteration, which permit nesting of the operation.

    Push_iteratation saves the current state of an iteration process.

    Pop_iteratation restores the most recently suspended iteration state.

  • Put updates the information associated with an arbitrary edge ei,j in A. In subsequent algorithms, updating edge ei,j of an adjacency list is represented as

    put( A, i, j, e )

In the context of the preceding function prototypes, parameters A and e can be thought of as C/C++ pointers or references. Other arguments can be viewed as integers.

The algorithms assume that read operations (get and scan) make edge information available in a buffer (this buffer is usually denoted by the symbol e). Update operations (insert and put) modify the information associated with an edge based on the current contents of the communication buffer. The buffer can be conceptualized as a C/C++ struct. Any of the adjacency list buffer fields required by an algorithm are explicitly referenced, ie. e.fillup represents the ”fillup” edge buffer field.

Algorithms for implementing adjacency lists are examined in Sec­tion 4 (Cre­at­ing Ad­ja­cency Lists) of the companion monograph Graph Algorithms.

8.2.3 Reduced Graph

A reduced graph, G $^{\prime}$ = ( V $^{\prime}$, E $^{\prime}$ ), is a data structure that supports the systematic elimination of all vertices from the graph G =( V, E ). The vertices of the reduced graph are denoted as V $^{\prime}$( G $^{\prime}$ ) and its edges as E $^{\prime}$( G $^{\prime}$ ). A crucial attribute of the reduced graph is efficient identification of the vertex in V $^{\prime}$( G $^{\prime}$ ) with the minimum degree.

A reduced graph supports the following operations:

  • Increase_degree increases the degree of vertex v in V $^{\prime}$( G $^{\prime}$ ) by one.

  • Decrease_degree decreases the degree of vertex v in V $^{\prime}$( G $^{\prime}$ ) by one.

  • In_graph tests to see whether vertex v is in V $^{\prime}$( G $^{\prime}$ ),

  • Minimum_degree finds the vertex v in V $^{\prime}$( G$^{\prime}$ ) with the smallest degree.

  • Remove excises vertex v from V $^{\prime}$( G $^{\prime}$ ).

Implementation of reduced graph modeling is examined in detail by Section 8.2 (Eliminating Many Vertices), Section 8.3 (Initializing Minimum Degree Vertex Tracking), and Section 8.4 (Maintaining the Reduced Graph) of Graph Algorithms.

8.2.4 List

A simple list L is an ordered set of elements. If the set { l1, …, li, li+1, …, ln } represents L, then the list contains n elements. Element l1 is the first item on the list and ln is the last item on the list. Element li precedes li+1 and element li+1 follows li. Element li is at position i in L. Descriptive information may accompany each item on a list. Lists associated with matrix algorithms support the following operations:

  • Link adds an element x to a list at position i. Inserting element x into the list at position i results an an updated list: { l1, …, li, li+1, …, ln+1 } An insertion at position eol appends x to the end of the list.

  • Unlink removes the element at position i from the list. Deleting element i results in the list { l1, …, li, li+1, …, ln-1 }.

  • Find looks for an element on the list and returns its position. If the element is not a member of the list, eol is returned.

  • First returns the position of the first item on the list. When the list is empty, eol is returned.

  • Next returns position i+1 on the list if position i is provided. If li is the last item on the list, eol is returned.

  • Prev returns position i-1 on the list if position i is provided. If i is one, eol is returned.

A linked list refers to a list implementation that does not require its members to reside in contiguous storage locations. In this environment, an efficient implementation of the prev operator dictates the use of a doubly linked list.

Communicating with a simple list is analogous to adjacency list communication. Read operations (find, first, next, and prev) make list information available in a buffer. Update operations (link, unlink) modify the list based on the current contents of the buffer.

8.2.5 Mapping

A mapping $\mu$ relates elements of its domain d to elements of its range\ r as follows.

$$\mu(d)=r$$

A mapping resides in a data structure that supports two operations:

  • Map links an element r in the range of $\mu$ to an arbitrary element d in the domain of $\mu$, i.e. sets $\mu$(d) to r.

  • Evaluate evaluates the mapping $\mu$ for an arbitrary element d in its domain, i.e. returns $\mu$(d).

8.2.6 Vector

For simplicity of exposition, a full vector is represented as a linear array. However, any data structure that lets you retrieve and update an arbitrary element bi of a vector b based upon its index i will suffice.