Jacket OpenCL gsingle() issue

[Old posts from the commercial version of ArrayFire] Issues and comments for download and installation. Getting up and running.

Postby woolly » Mon Feb 13, 2012 12:48 pm

I made an earlier post, but it has not yet been approved for the forum. In the meantime I figured out what the problem was: I had to convert my variables to single precision first, otherwise wrap-around occurs, I guess. I thought that gsingle would take care of that. I'm sorry not to have noticed it before. It is also nowhere to be found on your website or in the wiki. Maybe you should add it to the wiki page for the gsingle function, or actually put a check for the correct data type in that function... It cost me a few hours to figure this out, having never had to deal with data type conversion issues. So: please don't approve the post I made before, but maybe update the documentation for beginners like me. Thanks.

I have an AMD E-350 with HD6310 GPU. The ginfo command gives:

Scanning system for OpenCL devices....

Arrayfire (OpenCL alpha)
Device0: Loveland (in use)
Device1: AMD E-350 Processor

When I run gsingle(ones(3)) I get the following:

ans =

0 1.8750 0
1.8750 0 1.8750
0 1.8750 0

What is that???

But I seem to be able to run the mandelbrot example just fine, although it is much slower than the CPU. The GPU does 6 frames per second and the CPU 27.

Re: Jacket OpenCL gsingle() issue

Postby jaideep » Mon Feb 13, 2012 3:21 pm

Jacket OpenCL only supports single-precision real and complex inputs. The following code should work:
>> gsingle(ones(3,'single')) % ones(3) is in double-precision
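
One plausible explanation for the 0 / 1.8750 pattern reported above, offered as an assumption since nothing in this thread confirms it: the 8-byte doubles were read as 4-byte singles without any conversion. On a little-endian machine the double 1.0 is stored as the 32-bit words 0x00000000 and 0x3FF00000, which read as the singles 0 and 1.875. A minimal CPU-only sketch using MATLAB's built-in typecast (no Jacket required; the variable names are illustrative):

```matlab
% Assumption: the alpha build interprets the raw bytes of a double array
% as single-precision values instead of converting them.
x   = ones(3);                       % double precision, 9 elements
raw = typecast(x(:), 'single');      % 18x1 single: same bytes, reinterpreted
garbled = reshape(raw(1:9), 3, 3)    % first 9 values, filled column by column
% On a little-endian machine this displays:
%          0    1.8750         0
%     1.8750         0    1.8750
%          0    1.8750         0
```

By contrast, single(x) converts the values and yields a 3x3 array of ones, which is why passing ones(3,'single') avoids the problem.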
Jaideep Singh
Software Engineer
AccelerEyes

Re: Jacket OpenCL gsingle() issue

Postby woolly » Tue Feb 14, 2012 2:57 pm

Single precision (also on CUDA) makes things much slower in case your data originates from doubles, because you have to do
Y = gsingle(single(X));
every time you want to use GPU for computation for new data X.

And if you want to put it back after modification on the GPU you have to do
X = double(single(Y));

I checked with the profiler, and the gsingle, single and double functions take almost all of the time (the latter two are displayed as built-in). Compared to CPU execution the actual computation is indeed faster, but with this overhead there is no gain from using the GPU. I guess that's also why the Mandelbrot example runs so slowly; is it a bad example? Does CUDA have the same problem? Or does that example run faster on CUDA?
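
If all the intermediate steps can run in single precision, one way to reduce this overhead is to convert once, chain the work, and convert back once at the end. The following is a CPU-only sketch of that pattern: single() stands in for gsingle(single(...)), and the array size and update step are made up for illustration. Whether this pays off with Jacket OpenCL is not established in this thread.

```matlab
X  = rand(512);            % double-precision input (illustrative size)
Xs = single(X);            % convert once; with Jacket: gsingle(single(X))
for k = 1:10
    Xs = Xs .* 0.5 + 1;    % single array with double scalars stays single
end
Xd = double(Xs);           % convert back once; with Jacket: double(single(Y))
```

The conversions then cost the same no matter how many steps run in between, instead of being paid on every step.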

Adapting a program for GPU computation commonly requires things like the above. Especially if you want to use existing Matlab functions within your code, which often require double precision inputs. The alternative is to rewrite the Matlab function. That way you can also somehow circumvent any functions which are not overloaded by Jacket. Probably there are also Matlab functions that don't do any checks and only have child-functions that are in the Jacket package. For example the dot function doesn't seem to have anything not in Jacket, it's just sum(A.*B), isn't it? Do you have a list maybe of these known Jacket compatible Matlab functions? Would be handy.
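
For real vectors, dot(A,B) and sum(A.*B) do agree. For complex vectors, MATLAB's dot conjugates the first argument, so the equivalent expression is sum(conj(A).*B). A small CPU-only check with made-up values (whether Jacket overloads every function involved is not confirmed here):

```matlab
% Real vectors: dot matches the plain elementwise product sum.
A = single([1 2 3]);
B = single([4 5 6]);
d_builtin = dot(A, B);               % 32
d_manual  = sum(A .* B);             % 32

% Complex vectors: dot conjugates its first argument.
Ac = single([1+2i, 3-1i]);
Bc = single([2-1i, 1+1i]);
dc_builtin = dot(Ac, Bc);            % 2 - 1i
dc_manual  = sum(conj(Ac) .* Bc);    % 2 - 1i
dc_naive   = sum(Ac .* Bc);          % 8 + 5i, not the same as dot
```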

That said, only the main CUDA version has lots of functions. Any idea when you will be implementing more OpenCL ones?? And what about double precision? :)

Please comment. Did I understand this ok, for a beginner? I hope so :)

Re: Jacket OpenCL gsingle() issue

Postby jaideep » Thu Feb 16, 2012 12:55 pm

Jacket OpenCL is still in the alpha stage. We are working on adding more functionality (including double-precision support) and improving the performance of supported functions.

Since Matlab supports single-precision, you could do:
Code:
A_cpu = single(A_gpu) % to pull data back and in single-precision.


Jacket CUDA is highly optimized and you should see a significant performance speedup running Mandelbrot with it on CUDA-capable GPUs.
Jaideep Singh
Software Engineer
AccelerEyes

