af_accum_I causes error: unspecified launch failure (4)

[Old posts from the commercial version of ArrayFire] Discussion of ArrayFire using CUDA or OpenCL.

Postby simonrus » Thu Jun 05, 2014 3:25 pm

Hello,

I want to test the scan algorithm with integer values (s32).
The following example works on my computer when NSAMPLES is small (<4000),
but when I increase NSAMPLES to 8000, it crashes with the message:
"CUDA runtime error: unspecified launch failure (4)"

I use a GTX 580 on Ubuntu 12.04 LTS 64-bit, with CUDA 5.5 and driver 319.37.

Could you please help me understand what is wrong with the code below? I have already spent three days trying to eliminate this crash and have not found the cause.

Unfortunately, the ArrayFire documentation does not describe the batch parameter of the af_accum_I function in detail. Could you please explain what this parameter means and how it can be used to compute a cumulative sum over very large arrays (>1 billion elements)?

Code:
#include <cstdlib>
#include <stdio.h>
#include <arrayfire.h>
#include <af/utils.h>

using namespace af;


#define NSAMPLES        8000
#define BATCHSIZE       400

int main(int argc, char** argv) {
    try {
        int device = 0;
        af::deviceset(device);
        //af::info();

        dim4 dims(NSAMPLES, 1);

        array signal = constant(1, dims); //contains 1.0,1.0,1.0,1.0,1.0.....
        array signal_i = signal.as(s32); //contains array 1,1,1,1,1.....

        array result_i(dims, s32);      //allocate memory for result

        unsigned dimensions[] = {NSAMPLES, 1};
        int32_t *d_In = signal_i.device<int32_t>();
        int32_t *d_Out = result_i.device<int32_t>();

        //run C interface
        af_accum_I(d_Out, 2, dimensions , d_In, BATCHSIZE, 0, sum_t);

        signal_i.unlock();
        result_i.unlock();

        print(result_i); //should be 1,2,3,4,5,6...
    } catch (af::exception& e) {
        printf("%s\n", e.what());
    }
    return 0;
}


Thank you in advance!
simonrus
 

Re: af_accum_I causes error: unspecified launch failure (4)

Postby shehzan » Fri Jun 06, 2014 1:06 pm

Hi

The batch parameter is the C interface's counterpart of gfor: where you would use a gfor loop in the C++ interface, you pass a batch parameter in the C interface. Your example scans a single array without gfor, so batch should be 0.

Here is your code, fixed:
Code:
        // Use:
        array signal_i = constant(1, dims, af::s32);
        //array signal = constant(1, dims); //contains 1.0,1.0,1.0,1.0,1.0.....
        //array signal_i = signal.as(s32); //contains array 1,1,1,1,1.....

        array result_i(dims, s32);      //allocate memory for result

        // Can also use: (not required in C++ interface)
        //const unsigned* dimensions = dims.dims();
        unsigned dimensions[] = {NSAMPLES, 1};

        // C++ interface: (requires float arrays)
        //result_i = accum(signal_i, 1, sum_t);
        int32_t *d_In = signal_i.device<int32_t>();
        int32_t *d_Out = result_i.device<int32_t>();

        // Changed dimension 2->1 and BATCHSIZE->0
        //run C interface
        af_accum_I(d_Out, 1, dimensions , d_In, 0, 0, sum_t);

        signal_i.unlock();
        result_i.unlock();

        print(result_i); //should be 1,2,3,4,5,6...
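
To check the GPU output against the expected 1,2,3,... sequence at any NSAMPLES, a plain CPU reference can help. The following is only a sketch in standard C++ and does not use ArrayFire at all; the names reference_accum and first_mismatch are made up for illustration, and it assumes the device result has already been copied back into a host vector.

```cpp
#include <algorithm>
#include <cstddef>
#include <cstdint>
#include <numeric>
#include <vector>

// CPU reference for an inclusive scan: out[i] = in[0] + ... + in[i].
// Hypothetical helper, not part of ArrayFire.
std::vector<int32_t> reference_accum(const std::vector<int32_t>& in) {
    std::vector<int32_t> out(in.size());
    std::partial_sum(in.begin(), in.end(), out.begin());
    return out;
}

// Returns the index of the first element where the two vectors differ,
// or -1 if they are identical (same length, same values).
std::ptrdiff_t first_mismatch(const std::vector<int32_t>& got,
                              const std::vector<int32_t>& want) {
    const std::size_t n = std::min(got.size(), want.size());
    for (std::size_t i = 0; i < n; ++i) {
        if (got[i] != want[i]) return static_cast<std::ptrdiff_t>(i);
    }
    // Equal over the common prefix: differ only if the lengths differ.
    return got.size() == want.size() ? -1 : static_cast<std::ptrdiff_t>(n);
}
```

Calling first_mismatch(gpu_result, reference_accum(input)) then reports where the two results diverge, which is handy for seeing whether a failure starts at a particular array size.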



While testing this, we found that the C++ interface of accum currently works only with float arrays. We will fix this in the next update.
Was this also the reason you were not using the C++ interface, or was there another reason?
----
Shehzan
Developer
AccelerEyes
shehzan
 

Re: af_accum_I causes error: unspecified launch failure (4)

Postby simonrus » Wed Jun 11, 2014 4:54 am

Hello,

Thank you for your reply! The provided fix works well.

I've implemented a CIC filter using the ArrayFire library. Since the CIC integrator requires two's complement arithmetic to deal with overflow, I use s32 to represent 32.16 fixed point.

Fixed point is also very useful for getting bit-exact results that match other SW/HW implementations. It would be great if you added a C++ interface for the s32 type.
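
To illustrate why the integrator needs wraparound arithmetic, here is a minimal CPU sketch of a single-stage CIC decimator (one integrator, decimation by R, one comb with differential delay 1). It is plain C++, not my ArrayFire implementation; the function name cic_decimate and the parameter R are made up for this example. It assumes every output value fits in 32 bits, in which case overflow inside the integrator cancels out in the comb.

```cpp
#include <cstddef>
#include <cstdint>
#include <vector>

// Single-stage CIC decimator sketch (hypothetical helper).
// The accumulator is unsigned so that overflow wraps modulo 2^32,
// which is well defined in C++ and matches two's complement hardware.
std::vector<int32_t> cic_decimate(const std::vector<int32_t>& x, std::size_t R) {
    std::vector<int32_t> y;
    if (R == 0) return y;  // no valid decimation factor

    uint32_t acc = 0;   // integrator state, allowed to wrap
    uint32_t prev = 0;  // previous decimated integrator output (comb delay)
    for (std::size_t n = 0; n < x.size(); ++n) {
        acc += static_cast<uint32_t>(x[n]);  // integrator at the input rate
        if ((n + 1) % R == 0) {              // keep every R-th sample
            const uint32_t diff = acc - prev;  // comb, also modulo 2^32
            prev = acc;
            // Reinterpret as signed; this is modular on two's complement
            // targets (guaranteed since C++20).
            y.push_back(static_cast<int32_t>(diff));
        }
    }
    return y;
}
```

Each output equals the sum of the R input samples in its block, even after the integrator has wrapped around many times, as long as that block sum itself fits in an s32.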

With Best Regards,
Sergey
simonrus
 

