Playing with NASA's OpenMP benchmarks

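NASA's OpenMP benchmark: NPB (NAS Parallel Benchmarks)
It can be downloaded from "http://www.hpcs.cs.tsukuba.ac.jp/omni-openmp/download/download-benchmarks.html".
It seems to be a well-regarded benchmark, so it might be usable for evaluating my method.
It includes FFT and various other kernels.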

CLASS indicates the problem size. The available values are CLASS=S, A, B, and so on.
Results: FFT (the FT benchmark) with 1 thread


NAS Parallel Benchmarks 2.3 OpenMP C version - FT Benchmark

Size : 256x256x128
Iterations : 6
T = 1 Checksum = 5.046735008193e+02 5.114047905510e+02
T = 2 Checksum = 5.059412319734e+02 5.098809666433e+02
T = 3 Checksum = 5.069376896287e+02 5.098144042213e+02
T = 4 Checksum = 5.077892868474e+02 5.101336130759e+02
T = 5 Checksum = 5.085233095391e+02 5.104914655194e+02
T = 6 Checksum = 5.091487099959e+02 5.107917842803e+02
Result verification successful
class = A


FT Benchmark Completed
Class = A
Size = 256x256x128
Iterations = 6
Threads = 1
Time in seconds = 25.25
Mop/s total = 282.63
Operation type = floating point
Verification = SUCCESSFUL
Version = 2.3
Compile date = 25 Oct 2010

Compile options:
CC = gcc-4
CLINK = gcc-4
C_LIB = (none)
C_INC = -I../common
CFLAGS = -fopenmp
CLINKFLAGS = -fopenmp
RAND = randdp

Results: FFT (the FT benchmark) with 4 threads


NAS Parallel Benchmarks 2.3 OpenMP C version - FT Benchmark

Size : 256x256x128
Iterations : 6
T = 1 Checksum = 5.046735008193e+02 5.114047905510e+02
T = 2 Checksum = 5.059412319734e+02 5.098809666433e+02
T = 3 Checksum = 5.069376896287e+02 5.098144042213e+02
T = 4 Checksum = 5.077892868474e+02 5.101336130759e+02
T = 5 Checksum = 5.085233095391e+02 5.104914655194e+02
T = 6 Checksum = 5.091487099959e+02 5.107917842803e+02
Result verification successful
class = A


FT Benchmark Completed
Class = A
Size = 256x256x128
Iterations = 6
Threads = 4
Time in seconds = 10.57
Mop/s total = 675.16
Operation type = floating point
Verification = SUCCESSFUL
Version = 2.3
Compile date = 25 Oct 2010

Compile options:
CC = gcc-4
CLINK = gcc-4
C_LIB = (none)
C_INC = -I../common
CFLAGS = -fopenmp
CLINKFLAGS = -fopenmp
RAND = randdp

Overall, going from 1 thread to 4 threads gives a speedup of roughly 2.4x (25.25 s down to 10.57 s).
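
As a rough check of that figure, the small Python sketch below recomputes the speedup and the parallel efficiency from the two "Time in seconds" values reported above (25.25 and 10.57). The function names are made up for this note and are not part of NPB. The last function is only an illustration: it assumes Amdahl's law is the sole limiting factor, which is probably not true for FT (memory bandwidth likely matters too), so treat the "serial fraction" as a loose estimate at best.

```python
# Sketch: speedup and efficiency from the two FT runs above.
# The helper names are hypothetical; only the timings come from the logs.

def speedup(t_serial, t_parallel):
    """Ratio of the 1-thread time to the N-thread time."""
    return t_serial / t_parallel

def efficiency(t_serial, t_parallel, threads):
    """Speedup divided by the number of threads (1.0 would be ideal)."""
    return speedup(t_serial, t_parallel) / threads

def amdahl_serial_fraction(s, n):
    """Solve S = 1 / (f + (1 - f) / n) for the serial fraction f.

    Assumption: Amdahl's law is the only thing limiting the speedup.
    """
    return (n / s - 1) / (n - 1)

t1, t4 = 25.25, 10.57          # "Time in seconds" for 1 and 4 threads
s = speedup(t1, t4)
e = efficiency(t1, t4, 4)
f = amdahl_serial_fraction(s, 4)

print(f"speedup         = {s:.2f}x")   # 2.39x
print(f"efficiency      = {e:.1%}")    # 59.7%
print(f"serial fraction = {f:.2f}")    # 0.22 (under the Amdahl assumption)
```

The ratio of the two "Mop/s total" values (675.16 / 282.63) gives the same speedup of about 2.39, so the two reported metrics agree with each other.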