9.3 Floating Point Performance

Note. This section contains data so antiquated that we are not sure it has much relevance to modern hardware configurations. It is included for the sake of completeness and as a historical curiosity.

Table 2: Floating Point Benchmarks

| Operation | With 80387: 2,000k repetitions (seconds) | With 80387: average time ($\mu$sec) | No 80387: 200k repetitions (seconds) | No 80387: average time ($\mu$sec) |
|---|---|---|---|---|
| Add | 25.9 | 13.0 | 34.3 | 172 |
| Subtract | 25.9 | 13.0 | 35.3 | 177 |
| Multiply | 28.4 | 14.2 | 44.3 | 222 |
| Divide | 33.6 | 16.8 | 50.9 | 255 |
| Inner product¹ | 40.3 | 20.2 | 61.9 | 310 |
| Scalar multiply² | 30.2 | 15.1 | 45.0 | 225 |
| Loop overhead | | 1.3 | | 1.3 |

¹ sum += a[cursor[i]]*y[cursor[j]]

² a[cursor[i]] *= scalar

  • A variety of floating point operations were timed under MS-DOS version 3.30 on a 16 MHz IBM PS/2 Model 70 with a 16 MHz 80387 coprocessor and a 27 msec, 60 Mbyte fixed disk drive. The 32-bit operations available on the 80386 were not used. Table 2 catalogs the time requirements of the simple arithmetic operations, inner product accumulation, and multiplication of a vector by a scalar. All benchmarks were performed using double precision real numbers. Each test consists of a loop that was restarted after every 500 operations; for example, the time for 200k repetitions also includes the overhead of starting and stopping a loop 400 times. With this testing scheme, all loop counters and array indices were maintained in registers.

    The measurements in Table 3 provide a similar analysis of math functions in the Microsoft C Version 5.1 math library. These benchmarks were conducted with a single loop whose counter was a long integer.

    Table 3: Math Library Benchmarks

    | Operation | With 80387: 300k repetitions (seconds) | With 80387: average time ($\mu$sec) | No 80387: 10k repetitions (seconds) | No 80387: average time ($\mu$sec) |
    |---|---|---|---|---|
    | acos | 36.2 | 121 | 30.5 | 3,050 |
    | asin | 35.1 | 117 | 29.9 | 2,990 |
    | atan | 26.0 | 87 | 23.0 | 2,300 |
    | cos | 37.7 | 126 | 25.3 | 2,530 |
    | sin | 37.0 | 123 | 24.7 | 2,470 |
    | tan | 31.7 | 106 | 19.2 | 1,920 |
    | log | 25.4 | 85 | 18.5 | 1,850 |
    | sqrt | 16.5 | 55 | 5.7 | 570 |
    | pow | 51.4 | 171 | 38.6 | 3,860 |
    | j0¹ | 235.1 | 784 | 60.7 | 6,070 |
    | j6 | 662.0² | 2,207 | 176.3 | 17,603 |
    | y0³ | 510.0² | 1,700 | 146.4 | 14,640 |
    | Loop overhead | | 3 | | 3 |

    ¹ Bessel function of the first kind, order 0.

    ² Extrapolated from 30,000 repetitions.

    ³ Bessel function of the second kind, order 0.

  • Differences in loop overheads found in Table 2 and Table 3 are accounted for by the differences in the loop counter implementation described above. The 3 $\mu$sec overhead reflects the time required to increment a long integer and monitor the termination condition (which also involved a long integer comparison). The 1.3 $\mu$sec overhead reflects the time required to increment a register and monitor the termination condition (which involved a register comparison).
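The two loop structures described in this section can be sketched in C roughly as follows. This is only an illustration under stated assumptions, not the original benchmark source, which is not reproduced here. The names `average_usec`, `restarts`, `time_inner_product`, `time_long_loop`, `twice`, `CHUNK`, and `N` are hypothetical, the vector length is arbitrary, and the standard `clock()` timer stands in for whatever timing method was actually used. Restarting a short loop every 500 operations presumably let the counters fit in 16-bit registers, whereas a single loop counting to 300k needs a long integer.

```c
#include <assert.h>
#include <time.h>

#define CHUNK 500  /* operations between loop restarts, as stated in the text */
#define N     64   /* hypothetical vector length; the source does not give one */

/* Average time per operation in microseconds (the "average time" columns). */
double average_usec(double total_seconds, long repetitions)
{
    assert(repetitions > 0);
    return total_seconds * 1.0e6 / (double)repetitions;
}

/* How many times the short loop is started for a given repetition count. */
long restarts(long repetitions)
{
    return (repetitions + CHUNK - 1) / CHUNK;
}

/* Table 2 style: a short loop restarted every CHUNK operations.
   cursor must hold at least CHUNK indices, each in the range 0..N-1.
   Returns elapsed CPU seconds and stores the accumulated sum in *result. */
double time_inner_product(const double *a, const double *y, const int *cursor,
                          long repetitions, double *result)
{
    double sum = 0.0;
    long done = 0;
    clock_t start = clock();

    while (done < repetitions) {
        long left = repetitions - done;
        int count = (left < CHUNK) ? (int)left : CHUNK;
        int i, j;  /* short counters that a compiler can keep in registers */

        for (i = 0, j = 0; i < count; i++, j++)
            sum += a[cursor[i]] * y[cursor[j]];
        done += count;
    }
    *result = sum;  /* publish the sum so the loop is not optimized away */
    return (double)(clock() - start) / CLOCKS_PER_SEC;
}

/* Hypothetical stand-in for a math library routine such as sqrt. */
double twice(double x)
{
    return 2.0 * x;
}

/* Table 3 style: a single loop whose counter is a long integer. */
double time_long_loop(double (*f)(double), double x, long repetitions,
                      double *result)
{
    double acc = 0.0;
    long k;
    clock_t start = clock();

    for (k = 0; k < repetitions; k++)
        acc += f(x);
    *result = acc;
    return (double)(clock() - start) / CLOCKS_PER_SEC;
}
```

Applied to the figures in Table 2, `average_usec(25.9, 2000000L)` evaluates to about 12.95, which rounds to the 13.0 $\mu$sec reported for Add with the 80387, and `restarts(200000L)` evaluates to 400, matching the number of loop restarts quoted for 200k repetitions.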