My software reads data from a host array, copies it to the GPU, and then convolves it.
Regardless of the length of the data being convolved (anywhere from 32k points to 8 million points), the 222nd call to the convolve function fails with: "src/gena/gi_mem.cpp:453: CUDA runtime error: invalid argument (11)"
The filter is 64 elements long.
Here's some of the pertinent code:
//setup the arrays
af::array data =af::constant(0.0,dataSize,1,af::f64);
// "Process Loop"
//copy the host memory
d_Data = (double*)data.device<double>();
checkCudaErrors(cudaMemset(d_Data,0,dataSize*sizeof(double)));
checkCudaErrors(cudaMemcpy(d_Data,hostData,dataSize*sizeof(double),cudaMemcpyHostToDevice));
data.unlock();
//************************* Do other stuff *********************************************************************
// call the convolve
B = af::constant(0,dataSize + kernel.elements(), af::f64);
try{
B = af::convolve(data,kernel,false);
} catch (af::exception& e)
{
fprintf(stderr, "%s\n", e.what());
}
// when counter is 222 I hit the exception
counter++;
The code effectively loops from the host memory copy through the convolve (I set up and allocate the arrays only once) until all of the host memory has been processed.
Is there any reason why calling the convolve would result in an exception after a certain number of calls?
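One way to narrow this down might be to run each step of the loop on its own (only the device()/unlock() copy, or only the convolve) and record the first call that throws. Below is a minimal sketch in plain C++. The names `firstFailingCall` and `fakeStep` are hypothetical and not part of ArrayFire or CUDA. `fakeStep` only simulates a failure on call 222 so the harness can be tried without a GPU. The sketch also assumes the exception being thrown derives from `std::exception`; if `af::exception` does not in this version, it would have to be caught explicitly.

```cpp
#include <exception>
#include <stdexcept>
#include <string>

// Hypothetical helper: calls `step` up to `maxCalls` times and returns the
// 1-based index of the first call that throws, or 0 if no call threw.
// The exception text of the failing call is stored in `message`.
template <typename Step>
long firstFailingCall(long maxCalls, Step step, std::string& message)
{
    for (long call = 1; call <= maxCalls; ++call) {
        try {
            step(call);
        } catch (const std::exception& e) {
            message = e.what();
            return call;
        }
    }
    message.clear();
    return 0;
}

// Stand-in for one loop iteration. Replace the body with the real step under
// test; here it just simulates the failure described above on call 222.
inline void fakeStep(long call)
{
    if (call == 222) {
        throw std::runtime_error("invalid argument (11)");
    }
}
```

Calling `firstFailingCall(1000, fakeStep, message)` reports which call failed. If the real convolve still fails on the same call when it runs alone, the copy step is probably not involved. If the failing call number changes, that points toward something accumulating across iterations.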
Thank you!