Another profiling question

[Old posts from the commercial version of ArrayFire] Discussion of ArrayFire using CUDA or OpenCL.

Moderator: pavanky

Postby mgstauffer » Tue Oct 13, 2009 7:51 pm

Mac OSX 10.5.8
Jacket 2.2
GTX 285

I'm not sure how reliable profiling is with Jacket. Below is an example of what confuses me. I'm comparing CPU and GPU operations (matrix-vector multiplication and outer product). First I run only the CPU operations in a loop, then only the GPU operations in a loop, then a "mixed" loop that runs the same operations on both CPU and GPU. The profiling results suggest that the GPU operations are ~3x slower in the mixed loop (0.92 s vs. 0.29 s for the matrix-vector multiplication and 0.83 s vs. 0.31 s for the outer product, each over 1000 calls). The CPU operations run at the same speed in both loops. Does this look accurate? Are GPU operations really slower when interleaved with CPU operations like this, or is it a profiling anomaly? Thanks.

Code:
  time   calls  line
                  1 %10/09 Test Jacket - basic speed profiling
                  2
             1    3 M = 1000;
             1    4 typeStr = 'double';
             1    5 typeFunc = @double;
             1    6 gtypeFunc = @gdouble;
                  7 %%%%%
                  8
  0.01       1    9 a = rand(M,M,typeStr);
             1   10 u = rand(M,1,typeStr);
             1   11 v = rand(M,1,typeStr)';
                 12
                 13 %prealloc results
< 0.01       1   14 b = ones(M,M,typeStr);
             1   15 x = ones(M,1,typeStr);
             1   16 y = ones(M,1,typeStr);
                 17
                 18 %make complex?
                 19 if 0
                 20     a = a + 1i*0.5*a;
                 21     b = b + 1i*0.2*b;
                 22     u = u + 1i*flipdim(u,1);
                 23     v = v + 1i*flipdim(v,2);
                 24     x = x + 1i*x;
                 25     y = y + 1i*y;
                 26 end
                 27
             1   28 ga = gtypeFunc(a); 
             1   29 gb = gtypeFunc(b); 
             1   30 gu = gtypeFunc(u); 
             1   31 gv = gtypeFunc(v); 
             1   32 gx = gtypeFunc(x); 
             1   33 gy = gtypeFunc(y);
                 34
             1   35 rep = 1000;
                 36
                 37 %loop CPU
             1   38 for ind = 1:rep
  1.98    1000   39     x = a*u; %mat-vec mult
  7.55    1000   40     b = u*v; %outer prod
  0.02    1000   41 end
                 42
                 43 %loop GPU
             1   44 for ind = 1:rep %gpu-vars not necessary for iterator since not used in calcs
  0.29    1000   45     gx = ga*gu; gforce(gx); %mat-vec mult
  0.31    1000   46     gb = gu*gv; gforce(gb); %outer prod
          1000   47 end
                 48
                 49 %loop mixed
             1   50 for ind = 1:rep
  1.97    1000   51     x = a*u; %mat-vec mult
  0.92    1000   52     gx = ga*gu; gforce(gx); %mat-vec mult
  7.70    1000   53     b = u*v; %outer prod
  0.83    1000   54     gb = gu*gv; gforce(gb); %outer prod
  0.02    1000   55 end   
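
To make the "~3x" figure concrete, here is a minimal sketch that only redoes the arithmetic on the totals reported in the profile above. It is plain Python rather than MATLAB/Jacket, and the dictionary keys and helper names are made up for illustration; nothing here re-measures anything.

```python
# Total seconds over 1000 calls, copied from the profiler listing above.
# The key names are illustrative labels, not Jacket or MATLAB identifiers.
CALLS = 1000

separate = {  # CPU-only loop (lines 39-40) and GPU-only loop (lines 45-46)
    "cpu_matvec": 1.98,
    "cpu_outer": 7.55,
    "gpu_matvec": 0.29,
    "gpu_outer": 0.31,
}
mixed = {  # mixed loop (lines 51-54)
    "cpu_matvec": 1.97,
    "cpu_outer": 7.70,
    "gpu_matvec": 0.92,
    "gpu_outer": 0.83,
}


def per_call_ms(total_seconds, calls=CALLS):
    """Average time per call in milliseconds."""
    return total_seconds / calls * 1000.0


def slowdown(name):
    """Ratio of the mixed-loop time to the separate-loop time for one operation."""
    return mixed[name] / separate[name]


for name in separate:
    print(
        f"{name}: {per_call_ms(separate[name]):.2f} ms -> "
        f"{per_call_ms(mixed[name]):.2f} ms per call, "
        f"ratio {slowdown(name):.2f}"
    )
```

The ratios for the two GPU lines come out close to 3, while the two CPU lines stay close to 1, which is the discrepancy I'm asking about.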
mgstauffer
 
Posts: 90
Joined: Wed Jul 29, 2009 10:11 pm
