How To Calculate HPC Efficiency

Summary

HPC efficiency is the actual performance of an HPC system expressed as a percentage of its theoretical peak performance.

Theoretical Peak Performance

The theoretical peak performance (GFLOPS) is calculated by the following equation…

GFLOPS = nodes * ( sockets / node ) * ( cores / socket ) * GHz * FLOPS

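In this equation, FLOPS is the number of double-precision floating point operations a single core can complete per clock cycle (not per second), and it is specific to the kind of CPU. The following table shows the FLOPS values of some Intel and AMD CPUs.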

CPU                                         FLOPS (per core, per cycle)
Intel Xeon E5-2600 (Sandy Bridge) series    8
Intel Xeon E3-1200 (Ivy Bridge) series      8
AMD Opteron 6200 (Bulldozer) series         4
AMD Opteron 6300 (Piledriver) series        4

For example, the theoretical peak performance of an Altus 1804i, dual Opteron 6234 (2.4 GHz, 12 core) is…

1 node * ( 2 sockets / node ) * ( 12 cores / socket ) * 2.4 GHz * 4 FLOPS
= 230.4 GFLOPS
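
The equation above can be sketched in a few lines of Python. This is a minimal illustration, not part of the original procedure; the function name theoretical_peak_gflops and its parameter names are assumptions chosen for this example, and the input values are those of the Altus 1804i example.

```python
def theoretical_peak_gflops(nodes, sockets_per_node, cores_per_socket,
                            ghz, flops_per_cycle):
    """Theoretical peak in GFLOPS (hypothetical helper for illustration).

    flops_per_cycle is the per-core, per-clock-cycle value from the CPU table.
    """
    return nodes * sockets_per_node * cores_per_socket * ghz * flops_per_cycle


# Altus 1804i: 1 node, 2 sockets, 12 cores per socket, 2.4 GHz, 4 FLOPS
peak = theoretical_peak_gflops(1, 2, 12, 2.4, 4)
print(f"{peak:.1f} GFLOPS")  # 230.4 GFLOPS
```

For a cluster of identical nodes, only the first argument changes.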

Actual Performance

The actual performance is measured by running XHPL, the High-Performance Linpack benchmark. For more information, see…

How To Install / Configure / Execute XHPL (ACML) for AMD FMA4

How To Install / Configure / Execute XHPL (MKL) for Intel AVX

For example, the actual performance of an Altus 1804i with dual Opteron 6234 (2.4 GHz, 12 core), 128 GB RAM (90%) is…

================================================================================
T/V                N    NB     P     Q               Time                 Gflops
--------------------------------------------------------------------------------
WR01C2L2      123378   160     4     6            7469.13              1.776e+02
--------------------------------------------------------------------------------
||Ax-b||_oo/(eps*(||A||_oo*||x||_oo+||b||_oo)*N)=        0.0027476 ...... PASSED
================================================================================
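
When many runs are compared, it can help to pull the Gflops value out of the XHPL output automatically. The following is a rough sketch that assumes the result line has exactly the seven columns shown above (T/V, N, NB, P, Q, Time, Gflops); other HPL builds may format their output differently. The names RESULT_LINE and parse_hpl_gflops are hypothetical.

```python
import re

# Assumed layout of a result line: T/V  N  NB  P  Q  Time  Gflops
RESULT_LINE = re.compile(
    r"^\s*(\S+)\s+(\d+)\s+(\d+)\s+(\d+)\s+(\d+)\s+"
    r"([0-9.]+(?:[eE][+-]?\d+)?)\s+([0-9.]+(?:[eE][+-]?\d+)?)\s*$"
)


def parse_hpl_gflops(output):
    """Return the Gflops value of every result line found in the text."""
    results = []
    for line in output.splitlines():
        match = RESULT_LINE.match(line)
        if match:
            results.append(float(match.group(7)))  # 7th column is Gflops
    return results


sample = """\
T/V                N    NB     P     Q               Time                 Gflops
--------------------------------------------------------------------------------
WR01C2L2      123378   160     4     6            7469.13              1.776e+02
"""
print(parse_hpl_gflops(sample))  # [177.6]
```

The header and separator lines are skipped because they do not contain four integer columns followed by two numeric ones.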

HPC Efficiency

The HPC efficiency is simply…

Efficiency (%) = ( Actual Performance GFLOPS / Theoretical Peak Performance GFLOPS ) * 100

Using the previous Altus 1804i example, the HPC efficiency works out to…

( 177.6 / 230.4 ) * 100 = 77.1 %
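
The same calculation as a short Python sketch, using the measured and theoretical values from the Altus 1804i example. The function name hpc_efficiency_percent is an assumption made for this illustration.

```python
def hpc_efficiency_percent(actual_gflops, peak_gflops):
    """Actual performance as a percentage of theoretical peak (hypothetical helper)."""
    if peak_gflops <= 0:
        raise ValueError("peak_gflops must be positive")
    return actual_gflops / peak_gflops * 100.0


# 177.6 GFLOPS measured by XHPL against a 230.4 GFLOPS theoretical peak
efficiency = hpc_efficiency_percent(177.6, 230.4)
print(f"{efficiency:.1f} %")  # 77.1 %
```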

Because the theoretical peak is fixed by the hardware, the only way to increase the HPC efficiency is to increase the actual performance. This can be done by tuning compilers, math libraries, shared/distributed memory settings, numactl, kernel parameters, etc.
