# 9.3 Floating Point Performance

Note. This section contains data so antiquated that we are not sure it has much relevance to modern hardware configurations. It is included for the sake of completeness and as a historical curiosity.

Table 2: Floating Point Benchmarks

| Operation | With 80387: time for 2,000k repetitions (seconds) | With 80387: average time ($\mu$sec) | No 80387: time for 200k repetitions (seconds) | No 80387: average time ($\mu$sec) |
|---|---|---|---|---|
| Add | 25.9 | 13.0 | 34.3 | 172 |
| Subtract | 25.9 | 13.0 | 35.3 | 177 |
| Multiply | 28.4 | 14.2 | 44.3 | 222 |
| Divide | 33.6 | 16.8 | 50.9 | 255 |
| Inner product¹ | 40.3 | 20.2 | 61.9 | 310 |
| Scalar multiply² | 30.2 | 15.1 | 45.0 | 225 |
| Loop overhead | | 1.3 | | 1.3 |

¹ `sum += a[cursor[i]] * y[cursor[j]]`

² `a[cursor[i]] *= scalar`

A variety of floating point operations were monitored under MS-DOS version 3.30 on a 16 MHz IBM PS/2 Model 70 with a 16 MHz 80387 coprocessor and a 27 msec, 60 Mbyte fixed disk drive. The 32-bit operations available on the 80386 were not used. Table 2 catalogs the time requirements of the simple arithmetic operations, inner product accumulation, and multiplication of a vector by a scalar. All benchmarks were performed using double precision real numbers. Each test consists of a loop that was restarted after every 500 operations; for example, the 200k repetitions also include the overhead of starting and stopping a loop 400 times. With this testing scheme, all loop counters and array indices were maintained in registers.
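The testing scheme described above can be sketched in C roughly as follows. This is a minimal illustration, not the original benchmark source: the names `BLOCK`, `inner_product_reps`, `average_usec`, and `time_inner_product` are hypothetical, the kernel uses the same index for both operands (the footnote to Table 2 uses `i` and `j`), and timing relies on the standard `clock()` function rather than whatever timer the original tests used. The arrays are assumed to hold at least `BLOCK` elements.

```c
#include <time.h>

#define BLOCK 500L  /* operations per loop restart, as stated in the text */

/* Inner product kernel from footnote 1 of Table 2, run for `reps`
   operations in blocks of at most BLOCK. Assumption: j == i. */
double inner_product_reps(const double *a, const double *y,
                          const int *cursor, long reps)
{
    double sum = 0.0;
    long done = 0;

    while (done < reps) {
        long left = reps - done;
        int n = (int)(left < BLOCK ? left : BLOCK);
        int i;  /* short-lived int counter, a natural candidate for a register */

        for (i = 0; i < n; i++)
            sum += a[cursor[i]] * y[cursor[i]];
        done += n;
    }
    return sum;
}

/* Average time per operation in microseconds: total time / repetitions. */
double average_usec(double seconds, long reps)
{
    return seconds * 1.0e6 / (double)reps;
}

/* Time one run with clock(); the computed sum is returned through
   `result` so that the work cannot be discarded as unused. */
double time_inner_product(const double *a, const double *y,
                          const int *cursor, long reps, double *result)
{
    clock_t start = clock();

    *result = inner_product_reps(a, y, cursor, reps);
    return (double)(clock() - start) / CLOCKS_PER_SEC;
}
```

Called with `reps` set to 200000, the inner loop in this sketch starts 400 times, which matches the example given in the text.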

The measurements in Table 3 provide a similar analysis of math functions in the Microsoft C Version 5.1 math library. These benchmarks were conducted with a single loop whose counter was a long integer.
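A single timing loop with a long integer counter might look like the sketch below. It is an assumption-laden illustration rather than the original test code: `time_calls` and `no_op` are hypothetical names, and `clock()` stands in for the unspecified timer. The `no_op` function is only a placeholder with the same signature as the math functions; timing it measures loop plus call overhead, which is not necessarily the same quantity as the loop overhead reported in the table.

```c
#include <time.h>

/* Hypothetical placeholder with the signature of a one-argument
   math function such as sqrt or sin. */
double no_op(double x)
{
    return x;
}

/* Call f(x) `reps` times in a single loop whose counter is a long,
   and return the elapsed processor time in seconds. */
double time_calls(double (*f)(double), double x, long reps, double *sink)
{
    long i;
    double acc = 0.0;
    clock_t start = clock();

    for (i = 0; i < reps; i++)
        acc += f(x);

    *sink = acc;  /* expose the accumulated value so the calls stay live */
    return (double)(clock() - start) / CLOCKS_PER_SEC;
}
```

With `<math.h>` included, a call such as `time_calls(sqrt, 2.0, 300000L, &sink)` would time 300k calls to `sqrt`, the repetition count used for the measurements with the 80387.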

Table 3: Math Library Benchmarks

| Function | With 80387: time for 300k repetitions (seconds) | With 80387: average time ($\mu$sec) | No 80387: time for 10k repetitions (seconds) | No 80387: average time ($\mu$sec) |
|---|---|---|---|---|
| acos | 36.2 | 121 | 30.5 | 3,050 |
| asin | 35.1 | 117 | 29.9 | 2,990 |
| atan | 26.0 | 87 | 23.0 | 2,300 |
| cos | 37.7 | 126 | 25.3 | 2,530 |
| sin | 37.0 | 123 | 24.7 | 2,470 |
| tan | 31.7 | 106 | 19.2 | 1,920 |
| log | 25.4 | 85 | 18.5 | 1,850 |
| sqrt | 16.5 | 55 | 5.7 | 570 |
| pow | 51.4 | 171 | 38.6 | 3,860 |
| j0¹ | 235.1 | 784 | 60.7 | 6,070 |
| j6 | 662.0² | 2,207 | 176.3 | 17,603 |
| y0³ | 510.0² | 1,700 | 146.4 | 14,640 |
| Loop overhead | | 3 | | 3 |

¹ Bessel function of the first kind, order 0.

² Extrapolated from 30,000 repetitions.

³ Bessel function of the second kind, order 0.

The different loop overheads in Table 2 and Table 3 are accounted for by the different loop counter implementations described above. The 3 $\mu$sec overhead reflects the time required to increment a long integer and test the termination condition (which also involved a long integer comparison). The 1.3 $\mu$sec overhead reflects the time required to increment a register and test the termination condition (which involved a register comparison).
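As a worked check, the average times in both tables are consistent with dividing the total time by the number of repetitions. Two entries, the Add row of Table 2 (with 80387) and the acos row of Table 3 (no 80387), illustrate the arithmetic:

$$
\frac{25.9\ \text{s}}{2{,}000{,}000} = 12.95\ \mu\text{s} \approx 13.0\ \mu\text{s},
\qquad
\frac{30.5\ \text{s}}{10{,}000} = 3{,}050\ \mu\text{s}.
$$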