src/gena/reduce.cpp:320: error: GFOR not supported

[Old posts from the commercial version of ArrayFire] Discussion of ArrayFire using CUDA or OpenCL.

Moderator: pavanky

Postby rm9 » Thu Feb 06, 2014 6:51 pm

Hi,

I got the error in the topic title on this piece of code. What is not supported? Thanks.
Code: Select all
   gfor(array j, W.dims(2)) {
      int input_ind = 1;
      (*dW)(span, span, j) =
         ConvolveValid3d2d(a, flipped_delta(span, span, j));      
      (*dW)(span, span, j) = flip(flip((*dW)(span, span, j), 0), 1) / delta.dims(2);
      (*db)(j) = sum<float>(flipped_delta(span, span, j)) / delta.dims(2);
   }
rm9
 
Posts: 54
Joined: Thu Jan 30, 2014 5:44 am

Postby shehzan » Thu Feb 06, 2014 9:31 pm

Hi

sum<float>() returns a single value to the host. It does not return an array and hence is not supported in gfor.
You should use the array sum() function instead. Look at this page for more: http://www.accelereyes.com/arrayfire/c/ ... p__sum.htm
----
Shehzan
Developer
AccelerEyes
shehzan
 
Posts: 121
Joined: Tue Feb 12, 2013 7:20 pm

Postby rm9 » Fri Feb 07, 2014 5:11 am

Thanks, that worked.
Though I have another problem with this code now.
I get very different numeric results when I'm using gfor rather than a standard for.
I'm training a neural network with this code and the performance of the networks changes vastly when I use gfor.
Is this something known or should gfor produce the same results as a for?
Thanks.

Postby shehzan » Fri Feb 07, 2014 10:57 am

That's great.
Are you trying to say that replacing this gfor statement:
Code: Select all
gfor(array j, W.dims(2)) {

with
Code: Select all
for(int i = 0; i < W.dims(2); i++) {


gives you different results?
Can you post your code again after fixing the issue with sum?

Edit:
Also read this gfor page if you haven't already: https://www.accelereyes.com/arrayfire/c/page_gfor.htm
----
Shehzan
Developer
AccelerEyes

Postby shehzan » Fri Feb 07, 2014 11:19 am

In addition, we have a neural network example on our GitHub examples page. You can check it out here: https://github.com/arrayfire/ArrayFire- ... etwork.cpp
----
Shehzan
Developer
AccelerEyes

Postby rm9 » Fri Feb 07, 2014 1:45 pm

Yes, I'm replacing the gfor with a for on j from 0 to W.dims(2).
I tried to replace a different loop with gfor and I got the same difference.
Here is the code with the sum, I simply separated it into another loop.
I also wrapped the local variable with the local function to make sure it is separate for each gfor iteration.
Code: Select all
   for (int j = 0; j < W.dims(2); ++j) {
      (*db)(j) = sum<float>(flipped_delta(span, span, j)) / delta.dims(2);
   }
   gfor(array j, W.dims(2)) {
      int input_ind = 1;
      array c_result = local(
         ConvolveValid3d2d(a, flipped_delta(span, span, j)));
      (*dW)(span, span, j) = flip(flip(c_result, 0), 1) / delta.dims(2);
   }

Postby shehzan » Mon Feb 10, 2014 5:05 pm

From the code you posted, the use of sum() varies depending on how it is used (host vs device usage).
You seem to be storing the result of a certain sum as shown in db(j). What is happening here is that all values of db(j) change to the value the sum returns.
When used as a gpu function or in gfor, sum returns an array of the same values.
I think what you can do as a debugging technique is to only run the sum function through gfor and see how the result varies compared to the result of the for-loop.
My best guess with the code provided is that you need to use db(j, span, span, span). Just db(j) is essentially db(j, 0, 0, 0).
----
Shehzan
Developer
AccelerEyes

Postby rm9 » Mon Feb 10, 2014 5:47 pm

I'm not sure I understand; the sum part doesn't seem to be the problem since it's not in the gfor.
If I replace the gfor in this code with a regular loop (actually put it inside the db(j) loop) then everything works fine.
Once I run the code as written, I get the problems, so my understanding is that the problem is with the convolution, not the sum.
Thanks.

Postby pavanky » Mon Feb 10, 2014 5:53 pm

Hi,

I do not think you need local in this particular case. Please try the code without local to see if the results match. If they do not, can you attach some code with which we can reproduce the problem? Just seeing the code in its current form will not help us debug the issue.
Pavan Yalamanchili,
ArrayFire
--
~ If it is not broken, you have not tried hard enough ~
pavanky
Site Admin
 
Posts: 1123
Joined: Mon Mar 15, 2010 7:39 pm
Location: Atlanta, GA

Postby rm9 » Tue Feb 11, 2014 12:34 pm

Hi Pavan,

Here is a test code that shows the problem.
It runs the same code with for and gfor and prints out the error.
I tried with and without local and got the same error.
You can see that when both loops are for, or both are gfor, the error is zero, but when one is for and the other is gfor, I get a very large error.
Thanks.

Code: Select all
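#include <arrayfire.h>
#include <cmath>
#include <iostream>

using namespace af;
using std::cout;
using std::endl;

array ConvolveValid3d2d(const array& a, const array& b) {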
   array c = convolve(a, b, true);
   int output_size0 = a.dims(0) - b.dims(0) + 1;
   int output_size1 = a.dims(1) - b.dims(1) + 1;
   int full_size0 = a.dims(0) + b.dims(0) - 1;
   int full_size1 = a.dims(1) + b.dims(1) - 1;
   int start0 = floor(float(full_size0 - output_size0) / 2);
   int start1 = floor(float(full_size1 - output_size1) / 2);
   c = c(seq(start0, start0 + output_size0 - 1), seq(start1, start1 + output_size1 - 1), span);
   return c;
}

int main() {
   array output_for(constant(0, 121, 10, 128));
   array output_gfor(constant(0, 121, 10, 128));
   array W(randn(8, 64, 128));
   array data(randn(128, 64, 10));
   array b(randn(1, 128));
   for (int j = 0; j < W.dims(2); ++j) {
      array c = ConvolveValid3d2d(data, W(span, span, j));
      output_for(span, span, j) = moddims(c, c.dims(0) * c.dims(1), c.dims(2));
      output_for(span, span, j) += b(j);
   }

   gfor(array j, W.dims(2)) {
      array c = local(ConvolveValid3d2d(data, W(span, span, j)));
      output_gfor(span, span, j) = moddims(c, c.dims(0) * c.dims(1), c.dims(2));
      output_gfor(span, span, j) += b(j);
   }

   array err = af::abs(output_for - output_gfor);
   float e = af::sum<float>(err);
   cout << e << endl;
   return 0;
}
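
As a side check on the index arithmetic in ConvolveValid3d2d above, here is a minimal plain C++ sketch (no ArrayFire) that computes the crop window for one dimension. It assumes the full convolution has length a + b - 1, as the full_size variables in that function do. ValidCrop and validCrop are hypothetical names used only for illustration.

```cpp
#include <cassert>

// Hypothetical helper, not part of ArrayFire: the window of the "valid"
// region inside a full convolution along one dimension.
struct ValidCrop {
    int start;   // first index of the valid region in the full output
    int length;  // number of valid samples
};

// Assumes a_len >= b_len >= 1 and a full output of length a_len + b_len - 1.
ValidCrop validCrop(int a_len, int b_len) {
    assert(a_len >= b_len && b_len >= 1);
    int full_len  = a_len + b_len - 1;
    int valid_len = a_len - b_len + 1;
    // full_len - valid_len == 2 * (b_len - 1), so the division is exact
    // and the offset simplifies to b_len - 1.
    int start = (full_len - valid_len) / 2;
    return ValidCrop{start, valid_len};
}
```

With the sizes from the test program (data tiles of 128 x 64, W tiles of 8 x 64), validCrop(128, 8) gives start 7 and length 121, and validCrop(64, 64) gives start 63 and length 1, which is consistent with the 121 rows of output_for.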

Postby pavanky » Tue Feb 11, 2014 11:53 pm

Hi,

There seem to be a couple of problems that we need to fix internally. For now I am going to give you a workaround that performs the convolution in the frequency domain and is vectorized without using GFOR.

Code: Select all
array ConvolveValid4d(const array& a, const array& b) {
    int full0 = a.dims(0) + b.dims(0) - 1;
    int full1 = a.dims(1) + b.dims(1) - 1;
    int tileA = a.dims(2);
    int tileB = b.dims(2);
    array A = constant(0, full0, full1, tileA);
    array B = constant(0, full0, full1, tileB);

    A(seq(a.dims(0)), seq(a.dims(1)), span) = a;
    B(seq(b.dims(0)), seq(b.dims(1)), span) = b;

    A = moddims(A, full0 * full1, tileA);   
    B = moddims(B, full0 * full1, 1, tileB);

    array AA = tile(A, 1, 1, tileB);
    array BB = tile(B, 1, tileA, 1);

    AA = moddims(AA, full0, full1, tileA * tileB);
    BB = moddims(BB, full0, full1, tileA * tileB);
    array C = real(ifft2(fft2(AA) * fft2(BB)));

    int start0 = b.dims(0) - 1;
    int start1 = b.dims(1) - 1;
    seq output_dim0 = start0 + seq(a.dims(0) - b.dims(0) + 1);
    seq output_dim1 = start1 + seq(a.dims(1) - b.dims(1) + 1);
    array c = C(output_dim0, output_dim1, span);
    return c;
}


You can call the function in the following manner.

Code: Select all
   array b = randn(1, 1, 128);
   array output_batch = ConvolveValid4d(data, W);
   output_batch = moddims(output_batch, 121, 10, 128) + tile(b, 121, 10);
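
The cropping step at the end of ConvolveValid4d relies on the valid region of a full linear convolution starting at offset b - 1. The following is a minimal plain C++ sketch (1D, direct summation instead of FFTs, no ArrayFire) that checks this relationship. The helper names convolveFull, convolveValid and cropToValid are hypothetical and exist only for this illustration.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Full linear convolution of a and b by direct summation.
// Output length is a.size() + b.size() - 1. Assumes both inputs are non-empty.
std::vector<double> convolveFull(const std::vector<double>& a,
                                 const std::vector<double>& b) {
    assert(!a.empty() && !b.empty());
    std::vector<double> c(a.size() + b.size() - 1, 0.0);
    for (std::size_t i = 0; i < a.size(); ++i)
        for (std::size_t k = 0; k < b.size(); ++k)
            c[i + k] += a[i] * b[k];
    return c;
}

// "Valid" convolution computed directly: only the outputs where b fully
// overlaps a. Assumes a.size() >= b.size() >= 1.
std::vector<double> convolveValid(const std::vector<double>& a,
                                  const std::vector<double>& b) {
    assert(a.size() >= b.size() && !b.empty());
    std::vector<double> c(a.size() - b.size() + 1, 0.0);
    for (std::size_t n = 0; n < c.size(); ++n)
        for (std::size_t k = 0; k < b.size(); ++k)
            c[n] += a[n + b.size() - 1 - k] * b[k];
    return c;
}

// Crops a full result to the valid window, which starts at b_len - 1
// and has a_len - b_len + 1 samples.
std::vector<double> cropToValid(const std::vector<double>& full,
                                std::size_t a_len, std::size_t b_len) {
    auto first = full.begin() + static_cast<std::ptrdiff_t>(b_len - 1);
    auto last  = first + static_cast<std::ptrdiff_t>(a_len - b_len + 1);
    return std::vector<double>(first, last);
}
```

For a = {1, 2, 3, 4} and b = {1, 2}, both convolveValid(a, b) and cropToValid(convolveFull(a, b), 4, 2) yield {4, 7, 10}. A small reference like this can also serve as a CPU baseline when comparing the for and gfor results.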
Pavan Yalamanchili,
ArrayFire
--
~ If it is not broken, you have not tried hard enough ~

Postby rm9 » Wed Feb 12, 2014 9:05 am

Thanks, it worked but it seems very slow compared to gfor.

Postby pavanky » Wed Feb 12, 2014 11:27 am

Oops, I forgot that FFTs are slow for sizes that are not powers of 2. Here is a fixed version.

Code: Select all
// from http://graphics.stanford.edu/~seander/bithacks.html#RoundUpPowerOf2
static inline unsigned RoundUpPowerOf2(unsigned x)
{
    x = x - 1;
    x = x | (x >> 1);
    x = x | (x >> 2);
    x = x | (x >> 4);
    x = x | (x >> 8);
    x = x | (x >> 16);
    return x + 1;
}


array ConvolveValid4d(const array& a, const array& b) {
    unsigned full0 = a.dims(0) + b.dims(0) - 1;
    unsigned full1 = a.dims(1) + b.dims(1) - 1;
    full0 = RoundUpPowerOf2(full0);
    full1 = RoundUpPowerOf2(full1);

    int tileA = a.dims(2);
    int tileB = b.dims(2);
    array A = constant(0, full0, full1, tileA);
    array B = constant(0, full0, full1, tileB);

    A(seq(a.dims(0)), seq(a.dims(1)), span) = a;
    B(seq(b.dims(0)), seq(b.dims(1)), span) = b;

    A = moddims(A, full0 * full1, tileA);
    B = moddims(B, full0 * full1, 1, tileB);

    array AA = tile(A, 1, 1, tileB);
    array BB = tile(B, 1, tileA, 1);

    AA = moddims(AA, full0, full1, tileA * tileB);
    BB = moddims(BB, full0, full1, tileA * tileB);
    array C = real(ifft2(fft2(AA) * fft2(BB)));

    int start0 = b.dims(0) - 1;
    int start1 = b.dims(1) - 1;
    seq output_dim0 = start0 + seq(a.dims(0) - b.dims(0) + 1);
    seq output_dim1 = start1 + seq(a.dims(1) - b.dims(1) + 1);
    array c = C(output_dim0, output_dim1, span);
    return c;
}


Please do not compare the performance with the gfor version, especially while it is giving the wrong answers; it may not be performing all of the requested operations.
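
To see what the padding amounts to for the sizes in this thread, the sketch below repeats the RoundUpPowerOf2 helper from the code above so that it compiles on its own, and adds a hypothetical paddedFftSize wrapper. It assumes a 32-bit unsigned type and inputs of at least 1 (the bit trick returns 0 for an input of 0).

```cpp
#include <cassert>

// Same bit trick as in the code above, repeated so this snippet stands alone.
// Assumes a 32-bit unsigned value; returns 0 for x == 0.
static inline unsigned RoundUpPowerOf2(unsigned x) {
    x = x - 1;
    x = x | (x >> 1);
    x = x | (x >> 2);
    x = x | (x >> 4);
    x = x | (x >> 8);
    x = x | (x >> 16);
    return x + 1;
}

// Hypothetical helper for illustration: padded FFT length along one
// dimension for a full convolution of inputs with lengths a_len and b_len.
unsigned paddedFftSize(unsigned a_len, unsigned b_len) {
    assert(a_len >= 1 && b_len >= 1);
    return RoundUpPowerOf2(a_len + b_len - 1);
}
```

Here paddedFftSize(128, 8) is 256 and paddedFftSize(64, 64) is 128, so each padded tile is 256 x 128 instead of 135 x 127. Values that are already powers of 2 are returned unchanged, for example RoundUpPowerOf2(128) is 128.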
Pavan Yalamanchili,
ArrayFire
--
~ If it is not broken, you have not tried hard enough ~

Postby rm9 » Wed Feb 12, 2014 11:48 am

I see, thanks. I will eagerly wait for the next version. :)

Postby pavanky » Fri Feb 14, 2014 2:27 pm

Hi Ran,

We narrowed down the problem to the convolution working with two different tile sizes.

i.e. you have matrix A of size [a1, a2, a3] and matrix B of size [b1, b2, b3] and you want to generate matrix C of size [c1, c2, a3, b3].

When you pass in the input matrix "data" as 3D and index "W" inside gfor, this is exactly what is happening. This operation is not supported, and we should not have failed silently.

Right now, instead of raising an error, it operates only on the first tile of "data" and skips the rest, which causes the wrong values. We cannot support this kind of operation easily; we should have errored out gracefully instead of silently producing incorrect results.

As an alternative, you can run a for loop over "data" (because its size is only 10) and a gfor over "W".

So it would be something along these lines.

Code: Select all
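   // Assumes cx was allocated beforehand as a 4D array with dims
   // [c0, c1, data.dims(2), W.dims(2)].
   gfor(array j, W.dims(2)) {
      for (int i = 0; i < data.dims(2); i++) {
         cx(span, span, i, j) = ConvolveValid3d2d(data(span, span, i), W(span, span, j));
      }
      array c = cx(span, span, span, j);
      // Flatten each 2D result into a column, as in the for-loop version.
      output_gfor(span, span, j) = moddims(c, c.dims(0) * c.dims(1), c.dims(2));
      output_gfor(span, span, j) += b(j);
   }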


P.S. I have not tested this code. This is merely a suggestion. We will also look into why the fft method is failing for you. That method has more parallelism and should provide a better speedup.
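
To make the shape bookkeeping concrete, here is a minimal plain C++ sketch (no ArrayFire, direct summation) of the all-pairs operation described above: each of the a3 tiles of A is convolved with each of the b3 tiles of B, giving a3 * b3 output tiles. Stack3 and convolveValidAllPairs are hypothetical names, and the column-major storage with the 4D result flattened to a3 * b3 tiles is an assumption made for this illustration.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical column-major stack of 2D tiles (first index varies fastest).
struct Stack3 {
    std::size_t d0, d1, d2;
    std::vector<float> v;
    Stack3(std::size_t rows, std::size_t cols, std::size_t tiles)
        : d0(rows), d1(cols), d2(tiles), v(rows * cols * tiles, 0.0f) {}
    float& at(std::size_t i, std::size_t j, std::size_t k) {
        return v[i + d0 * (j + d1 * k)];
    }
    float at(std::size_t i, std::size_t j, std::size_t k) const {
        return v[i + d0 * (j + d1 * k)];
    }
};

// Valid 2D convolution of every tile of a with every tile of b.
// The logical result is [c0, c1, a3, b3]; it is stored here as a stack of
// a3 * b3 tiles, where the pair (i, j) lands in tile i + a3 * j.
Stack3 convolveValidAllPairs(const Stack3& a, const Stack3& b) {
    assert(a.d0 >= b.d0 && a.d1 >= b.d1 && b.d0 >= 1 && b.d1 >= 1);
    const std::size_t c0 = a.d0 - b.d0 + 1;
    const std::size_t c1 = a.d1 - b.d1 + 1;
    Stack3 c(c0, c1, a.d2 * b.d2);
    for (std::size_t j = 0; j < b.d2; ++j)          // filter tile (the gfor axis)
        for (std::size_t i = 0; i < a.d2; ++i)      // data tile (the inner for axis)
            for (std::size_t y = 0; y < c1; ++y)
                for (std::size_t x = 0; x < c0; ++x) {
                    float acc = 0.0f;
                    for (std::size_t q = 0; q < b.d1; ++q)
                        for (std::size_t p = 0; p < b.d0; ++p)
                            acc += a.at(x + b.d0 - 1 - p, y + b.d1 - 1 - q, i)
                                 * b.at(p, q, j);
                    c.at(x, y, i + a.d2 * j) = acc;
                }
    return c;
}
```

A routine that only handled i == 0 would fill one out of every a3 output tiles and leave the others untouched, which is consistent with the symptom described above. The loop order mirrors the suggested workaround: the outer loop over j corresponds to the gfor over "W" and the loop over i to the plain for over "data".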
Pavan Yalamanchili,
ArrayFire
--
~ If it is not broken, you have not tried hard enough ~

Postby pavanky » Fri Feb 14, 2014 2:30 pm

Well I just realized that we can do something internally to make it easier for you as well :) We will keep you informed.

--
Pavan
Pavan Yalamanchili,
ArrayFire
--
~ If it is not broken, you have not tried hard enough ~

Postby rm9 » Sun Feb 16, 2014 5:36 am

Thanks Pavan. Unfortunately I couldn't get the code to work; I still get a large numeric difference between the gfor and for versions.
Here is my code, I hope I understood your idea correctly.
Code: Select all
   array output2(constant(0, output_size0_ * output_size1_, data.dims(2), output_maps));
   gfor(array j, output_maps) {
   //for (int j = 0; j < output_maps; ++j) {
      array c(constant(0, output_size0_, output_size1_, data.dims(2)));
      for (int b = 0; b < data.dims(2); ++b) {
         array cx = ConvolveValid2d(data(span, span, b), W(span, span, j));
         c(span, span, b) = cx;
      }
      c = moddims(c, output_size0_ * output_size1_, data.dims(2));
      output2(span, span, j) = c;
      output2(span, span, j) += b(j);
   }
   output2 = moddims(output2, output_size0_, output_size1_, data.dims(2), output_maps);
   cout << af::sum<float>(af::abs(output - output2)) << endl;


I'd be happy to hear more suggestions :)
Thanks!

Postby pavanky » Thu Feb 20, 2014 6:28 pm

Hi Ran,

Just wanted to provide an update. The for + gfor code is working in a local branch, but does not work in the released version. However the local branch is failing for some other cases. We are working towards resolving the issue ASAP.

Sorry about the inconvenience.
Pavan Yalamanchili,
ArrayFire
--
~ If it is not broken, you have not tried hard enough ~

Postby rm9 » Fri Feb 21, 2014 4:59 am

Thanks!

Postby rm9 » Sat Feb 22, 2014 6:29 am

I just wanted to mention that I think an option for a 'valid' convolution could help performance, since it would not have to calculate all the values of a full one. Am I right?
Thanks.
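
A rough way to gauge the potential saving, assuming a direct (non-FFT) convolution and counting only output samples: a full convolution produces a + b - 1 samples per dimension, a valid one a - b + 1. The sketch below is plain C++ with hypothetical helper names and uses the sizes from the test program earlier in the thread.

```cpp
#include <cassert>

// Hypothetical helpers for illustration: number of output samples per
// 2D tile pair for a full and for a valid convolution.
long fullOutputCount(long a0, long a1, long b0, long b1) {
    return (a0 + b0 - 1) * (a1 + b1 - 1);
}

// Assumes the filter is no larger than the data in either dimension.
long validOutputCount(long a0, long a1, long b0, long b1) {
    assert(a0 >= b0 && a1 >= b1);
    return (a0 - b0 + 1) * (a1 - b1 + 1);
}
```

For 128 x 64 data and an 8 x 64 filter this gives 17145 full outputs versus 121 valid ones per tile pair. Whether that turns into an actual speedup depends on the implementation; an FFT-based path typically computes the full result and crops it afterwards.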