Benchmarking Jacket

Post by jaideep » Tue Nov 01, 2011 11:53 am

We recently received the following request via email:

"Do you have a MATLAB benchmark script that you can send that would allow me to compare the 4 GPUs? I'd prefer to run it on my own machines because the machines themselves are different, so I don't want to compare just the GPUs, but rather the systems as a whole."


We package many Jacket examples that showcase its capabilities. The jacket/examples folder contains blas_example, which is the one best suited to benchmarking Jacket's performance. Sample output from blas_example looks like this:
Code:
Jacket v1.8.2 (build e1f8141) by AccelerEyes (64-bit Linux)
License Type: Designated Computer (/home/singh/jacket/engine/jlicense.dat)
Licenses: JMC, SDK, DLA, SLA, 4 GPUs
CUDA toolkit 4.0, driver 285.05.05
GPU0 Tesla C2070, 5376 MB, Compute 2.0 (single,double) (in use)
GPU1 Tesla C2075, 5376 MB, Compute 2.0 (single,double)
GPU2 Tesla C2075, 5376 MB, Compute 2.0 (single,double)
Display Device: GPU2 Tesla C2075
Memory Usage: 4856 MB free (5376 MB total)

>> blas_example
Comparison will be done in single-precision.
Benchmark N-by-N matrix multiply

nn =

         128         256         384         512         640         768         896        1024        1152        1280        1408        1536        1664        1792        1920        2048

Computing the CPU benchmarks...

cpu_gflops =

   22.9401   30.7839   33.5942   34.0179   33.9873   34.4580   34.6695   34.7602   34.9841   35.1146   34.9959   35.2862   35.2766   35.3150   35.4245   35.3601

Computing the GPU benchmarks...

gpu_gflops =

   41.4147  327.8021  336.1073  365.5046  393.6954  559.0121  496.9088  449.2644  593.3699  521.5540  525.5692  618.3885  569.0071  526.6894  635.8531  587.8517


speedup =

    1.8053   10.6485   10.0049   10.7445   11.5836   16.2230   14.3327   12.9247   16.9611   14.8529   15.0180   17.5249   16.1299   14.9140   17.9495   16.6247

Computing the CPU for-loop benchmarks...

cpu_for_gflops =

   1.0e+05 *

    0.0002    0.0020    0.0067    0.0158    0.0309    0.0533    0.0847    0.1264    0.1800    0.2469    0.3287    0.4267    0.5425    0.6776    0.8334    1.0115

Computing the GPU for-loop benchmarks...

gpu_for_gflops =

   1.0e+05 *

    0.0004    0.0033    0.0112    0.0266    0.0519    0.0897    0.1424    0.2125    0.3026    0.4151    0.5524    0.7172    0.9119    1.1389    1.4008    1.7001

Computing the GPU gfor-loop benchmarks...

gpu_gfor_gflops =

   1.0e+06 *

    0.0002    0.0020    0.0067    0.0160    0.0312    0.0540    0.0857    0.1279    0.1821    0.2498    0.3324    0.4316    0.5488    0.6854    0.8430    1.0231


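As you can see, Jacket on the GPU clearly outperforms the CPU in this benchmark. For the N-by-N matrix multiply, the speedup is about 1.8x at N = 128 and ranges from roughly 10x to 18x for N = 256 and above, which gives Jacket users a large advantage in GFLOPS achieved.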
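The speedup row in the output above is the element-wise ratio of gpu_gflops to cpu_gflops. Below is a minimal Python sketch that recomputes it from the posted figures, which may help when comparing results from several machines. It is not part of Jacket or blas_example, and the function names are hypothetical. The 2*N^3 flop count for an N-by-N multiply is the conventional figure and is an assumption here; the post does not say which count blas_example uses.

```python
# Figures copied from the blas_example output above (single precision).
sizes = [128 * k for k in range(1, 17)]  # 128, 256, ..., 2048

cpu_gflops = [22.9401, 30.7839, 33.5942, 34.0179, 33.9873, 34.4580,
              34.6695, 34.7602, 34.9841, 35.1146, 34.9959, 35.2862,
              35.2766, 35.3150, 35.4245, 35.3601]

gpu_gflops = [41.4147, 327.8021, 336.1073, 365.5046, 393.6954, 559.0121,
              496.9088, 449.2644, 593.3699, 521.5540, 525.5692, 618.3885,
              569.0071, 526.6894, 635.8531, 587.8517]


def matmul_gflops(n, seconds):
    """GFLOPS for one n-by-n matrix multiply.

    Assumes the conventional 2*n**3 floating-point operation count;
    blas_example may count operations differently.
    """
    return 2.0 * n ** 3 / seconds / 1e9


def speedups(gpu, cpu):
    """Element-wise GPU/CPU throughput ratio."""
    return [g / c for g, c in zip(gpu, cpu)]


ratios = speedups(gpu_gflops, cpu_gflops)
best = max(range(len(ratios)), key=ratios.__getitem__)

for n, r in zip(sizes, ratios):
    print(f"N={n:5d}  speedup={r:7.4f}")
print(f"largest speedup at N={sizes[best]}")
```

To compare whole systems, as the email asks, the same ratio can be computed per machine from each machine's own cpu_gflops and gpu_gflops rows.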
Jaideep Singh
Software Engineer
AccelerEyes