Convolve and CUDA Runtime error (11)

[Old posts from the commercial version of ArrayFire] Discussion of ArrayFire using CUDA or OpenCL.

Moderator: pavanky

Postby neuralPanther » Thu Sep 25, 2014 11:59 am

Hi All,

My software reads from a host array, copies the data to the GPU, and then convolves it.

Regardless of the length of the data to be convolved (from 32k points to 8 million points), the 222nd call to the convolve function fails with: "src/gena/gi_mem.cpp:453: CUDA runtime error: invalid argument (11)"
The filter is 64 elements long.

Here's some of the pertinent code:
//setup the arrays
af::array data =af::constant(0.0,dataSize,1,af::f64);

// "Process Loop"
//copy the host memory
d_Data = (double*)data.device<double>();
checkCudaErrors(cudaMemset(d_Data,0,dataSize*sizeof(double)));
checkCudaErrors(cudaMemcpy(d_Data,hostData,dataSize*sizeof(double),cudaMemcpyHostToDevice));
data.unlock();

//************************* Do other stuff *********************************************************************

// call the convolve
B  = af::constant(0,dataSize + kernel.elements(), af::f64);
try {
      B = af::convolve(data, kernel, false);
} catch (af::exception& e) {
      fprintf(stderr, "%s\n", e.what());
}
// when counter is 222 I hit the exception
counter++;


The code effectively loops from the host memory copy through the convolve (I set up and allocate the arrays only once) until the host memory is completely processed.
Is there any reason why calling the convolve would result in an exception after a certain number of calls?

Thank you!
neuralPanther

Re: Convolve and CUDA Runtime error (11) [SOLVED?]

Postby neuralPanther » Thu Sep 25, 2014 5:53 pm

I think I found a solution to this:

When I initialized my memory on the GPU, I loaded a device array with my filter coefficients like so:
float* tempPtr = af::array::alloc<float>(64);
h_n = af::array(64, tempPtr, af::afDevice);

checkCudaErrors(cudaMemcpy(h_n.device<float>(),host_h_n,64*sizeof(float),cudaMemcpyHostToDevice));
h_n.unlock();
af::array::free(tempPtr );


Then further down in my program I used this h_n as I discussed before:

B = af::convolve(data,h_n,false);


The issue resolved itself when I instead used:
B = af::fir(N, host_h_n, data);


My only conclusion is that my GPU array (h_n above) somehow became corrupted over repeated calls (on the order of 2000) to the convolve function.

If there's a different/better explanation I would love to know so I can avoid the same problem in the future.
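
One way to test the corruption hypothesis is to compute the same convolution on the host and compare it with what comes back from the GPU on each pass. The following is only a sketch in plain C++ and does not use ArrayFire at all; convolveFull and maxAbsDiff are hypothetical helper names made up for this illustration. It computes the full linear convolution (data length + filter length - 1 samples), so the GPU result may need to be trimmed or offset to match before comparing.

```cpp
#include <algorithm>
#include <cmath>
#include <cstddef>
#include <limits>
#include <vector>

// Hypothetical helper: full linear convolution of x with h on the host.
// The result has x.size() + h.size() - 1 samples.
std::vector<double> convolveFull(const std::vector<double>& x,
                                 const std::vector<double>& h) {
    if (x.empty() || h.empty()) return {};
    std::vector<double> y(x.size() + h.size() - 1, 0.0);
    for (std::size_t i = 0; i < x.size(); ++i) {
        for (std::size_t j = 0; j < h.size(); ++j) {
            y[i + j] += x[i] * h[j];
        }
    }
    return y;
}

// Hypothetical helper: largest absolute element-wise difference.
// Returns infinity when the lengths differ, so a size mismatch is never
// mistaken for agreement.
double maxAbsDiff(const std::vector<double>& a, const std::vector<double>& b) {
    if (a.size() != b.size()) {
        return std::numeric_limits<double>::infinity();
    }
    double worst = 0.0;
    for (std::size_t i = 0; i < a.size(); ++i) {
        worst = std::max(worst, std::fabs(a[i] - b[i]));
    }
    return worst;
}
```

If the filter coefficients are copied back to the host before each call and compared against host_h_n with maxAbsDiff, a value that starts at zero and becomes nonzero on some later pass would suggest the device copy of the filter changed between calls. A value that stays at zero would point away from corruption of h_n.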
neuralPanther

