GPUmat vs Jacket - what a difference?

Postby vitaly » Fri Jun 05, 2009 5:30 am

I'm sorry if this is a silly question. In what ways is Jacket better than GPUmat, and in what ways is GPUmat better? I already understand GPUmat; I want to find out whether it makes sense to try Jacket, and whether I would gain anything in speed or in capabilities...
vitaly
 
Posts: 41
Joined: Fri Jun 05, 2009 5:16 am

Re: GPUmat vs Jacket - what a difference?

Postby znoop333 » Thu Jun 11, 2009 10:32 am

I saw GPUmat reviewed on http://gpgpu.org, and it does look very similar to Jacket. Reading through their documentation, I find that GPUmat has less support for mixed GPU/CPU statements. I like how cublasGetVector lets you see their low level data format, and it's interesting to see they let you call routines like cublasSgemm directly from Matlab.

GPUmat is free, but it's probably full of bugs, the same ones that I'm sure AccelerEyes has run into. They support far fewer Matlab functions; I'd say they're a year (?) behind Jacket.

I wonder if writing Matlab wrappers for the CUBLAS functions is patentable. Seems unlikely. Maybe just a matter of time before Nvidia makes their own library of wrappers? I'd rather see Nvidia invest more resources into implementing more BLAS routines and other high level functions.
Department of Biomed. Engr.
Case Western Reserve University

Jacket/GPUmat/GPUlib
GeForce 9500 GT, 1024 MB VRAM, Capability 1.1
MATLAB R2008A. Win XP SP2 - 32-bit, Core2Duo, 2.1 GHz, 3 GB RAM
znoop333
 
Posts: 16
Joined: Mon Aug 25, 2008 1:31 pm

Re: GPUmat vs Jacket - what a difference?

Postby vishy » Thu Jun 11, 2009 12:18 pm

This appears to be a new offering in the space of GPU MATLAB tools... if you happen to try both out, we'd love to hear your feedback. Any information we can learn about how to improve Jacket is very helpful.
--------------------------------------------------
Vish Venugopalakrishnan
Software Engineer (Q/A)
AccelerEyes LLC
vishy.v@accelereyes.com

vishy
 
Posts: 371
Joined: Thu Apr 16, 2009 11:46 am

Re: GPUmat vs Jacket - what a difference?

Postby vitaly » Mon Jun 15, 2009 11:50 am

Yes, it is a new product. Since it starts from the same CUDA foundation, I think it should lead to the same results as Jacket.
Without enough information about the methods used in GPUmat and Jacket, I would like to understand whether either one offers a way to write more advanced code.

That is, one can solve a problem quickly with code that runs slowly, or one can write different, lower-level code that runs fast.

Is the double data type unsupported in both GPUmat and Jacket?

To compare them I would have to buy Jacket, and I can do that only if I see clear advantages of Jacket over GPUmat. For now I want to understand whether there is a reason to move to Jacket.
I have to say that the level of technical support for GPUmat is better. I could not find a detailed description of the installation or of the commands for Jacket.

I want to base my choice on code performance. As I understand it, Jacket has more commands than GPUmat, but that is a matter of time: if GPUmat develops faster than Jacket (because GPUmat is new), this will not matter in the comparison. Speed is more important to me.

I am waiting for GT300, so I need to make a choice by the end of the year.
vitaly
 
Posts: 41
Joined: Fri Jun 05, 2009 5:16 am

Re: GPUmat vs Jacket - what a difference?

Postby malcolm » Tue Jun 16, 2009 10:15 pm

Hi Vitaly,
Jacket v1.1 now supports native types single, double, uint32, int32, and logical out on the GPU.
-jm

Code:
>> ginfo
Jacket v1.1 (build XXXX)  data: 0 CPU-used, 0 GPU-used, 3987 GPU-free (in MB)
GPU0 (enabled) Quadro FX 5800, 1265 MHz, 4095 MB VRAM, Capability 1.3
>> A = gsingle(1:4);
>> B = gdouble(1:4);
>> C = A > 2;
>> D = guint32(B);
>> E = B + 1i;
>> whos
Name      Size            Bytes  Class       Attributes

  A         1x4               508  gsingle
  B         1x4               508  gdouble
  C         1x4              1048  glogical
  D         1x4               984  guint32
  E         1x4              1052  gdouble complex
James Malcolm (malcolm@accelereyes.com)
malcolm
 
Posts: 505
Joined: Sat Jun 14, 2008 11:00 pm

Re: GPUmat vs Jacket - what a difference?

Postby vitaly » Wed Jun 17, 2009 12:59 am

Hi JM!

Thanks for the answer and attention.

However, I have found that only Jacket v1.0.3 is available for purchase.

What is the price of Jacket v1.1?

What is the GPU's performance in double precision (I am interested in matrix multiplication)? Is it no worse than the CPU in double precision? (In case it matters, I have a GTX250 for tests.)

Also, please tell me: is the efficiency (productivity) of your code close to its limit? Can there be improvements in the future? What potential for performance gains do you see in your code?

What do you think about GT300?

Thanks in advance.
Vitaly
vitaly
 
Posts: 41
Joined: Fri Jun 05, 2009 5:16 am

Re: GPUmat vs Jacket - what a difference?

Postby malcolm » Wed Jun 17, 2009 4:54 am

Hi Vitaly,
You're correct, v1.1 is not quite out yet, but we're packaging it up and testing as I write this.

We'll reset all the trial licenses so you can give this one a test again. The pricing information has been updated somewhat because we added a floating licensing mechanism to make sharing among groups easier.

Double-precision examples are included for performance testing. We haven't yet done double-precision benchmarking of IEEE compliance on end-to-end examples; however, here's a quick test we did: generate a double precision 3x3 matrix and push/pull it through both Matlab and Jacket single and double precision conversions (below). Based on user feedback, we anticipate adjusting things here and there in the coming months. Let us know what you think.

As for productivity, we are constantly hearing from users who have tried and given up on coding directly in CUDA themselves. Of course, that's the only way to get the best performance, but as most people find out, it's often not worth the time in development.

Please give v1.1 a try on your GTX. You should get an email when v1.1 is released in the next day or so. We'd really like to hear what you think. This is the biggest upgrade we've ever rolled out.
-jm

Code:
>> N    %%%  standard double-precision CPU Matlab variable
N =
   1.407449003374228  -1.303193638472176  -0.582186939674376
  -0.381293350245835   0.656453029287915  -1.658488523670904
   1.292504363290397   0.079473728930311   1.179880057296001
>> N_cpu_single = single(N)
N_cpu_single =
   1.4074490  -1.3031937  -0.5821869
  -0.3812934   0.6564530  -1.6584885
   1.2925043   0.0794737   1.1798800
>> N_gpu_single = gsingle(N)
N_gpu_single =
   1.4074490  -1.3031937  -0.5821869
  -0.3812934   0.6564530  -1.6584885
   1.2925043   0.0794737   1.1798800
>> N_gpu_double = gdouble(N)
N_gpu_double =
   1.407449003374228  -1.303193638472176  -0.582186939674376
  -0.381293350245835   0.656453029287915  -1.658488523670904
   1.292504363290397   0.079473728930311   1.179880057296001

%% Compute the norm differences between some of these:
>> norm(N(:) - N_gpu_double(:))
ans =
     0              %% double precision GPU result
>> norm(N(:) - N_gpu_single(:))
ans =
     8.302859345298218e-08    %% single precision GPU result
>> norm(N(:) - N_cpu_single(:))
ans =
   8.3028596e-08     %% single precision CPU result
James Malcolm (malcolm@accelereyes.com)
malcolm
 
Posts: 505
Joined: Sat Jun 14, 2008 11:00 pm

Re: GPUmat vs Jacket - what a difference?

Postby vitaly » Wed Jun 17, 2009 7:26 am

Hi!
Thanks for the full and useful answer.

I work in the field of neurotechnology. My algorithm takes one month to run on a Core2Duo 4 GHz processor. Reducing this time tenfold (with GT300) would therefore be a desirable result for me. I have already tried switching to the single data type and got a catastrophic loss of accuracy. So the only way for me to use the power of the GPU is to work with double data.

I suggest trying this code, if possible.

n=3000;
A=rand(n,n);
G=gdouble(A);
tic
R=A*A;
toc
tic
S=G*G;
toc
sum(sum(abs(R-double(S))))

It is clear that gdouble should be substantially slower than gsingle, but it will be interesting to see whether it is faster than the ordinary CPU.

Please tell me what options there are for speeding up combined CPU+GPU work. As I understand it, CPU speed does not strongly affect the overall speed? And should the speed of system memory be increased (to speed up data transfer from CPU to GPU)? Would the triple-channel memory of the Core i7 be the best option?

I will wait for the chance to test v1.1 on the GTX250.

Thanks.
vitaly
 
Posts: 41
Joined: Fri Jun 05, 2009 5:16 am

Re: GPUmat vs Jacket - what a difference?

Postby znoop333 » Thu Jul 02, 2009 4:13 pm

I spent some time using GPUmat today. I converted my code from Jacket syntax to GPUmat with only the following changes:

1. search and replace "gsingle" with "GPUsingle"
2. remove gforce() calls, then put GPUsync() on a following line
3. "hybrid" statements aren't supported in GPUmat, so anywhere I had a gsingle+something else, I had to wrap the second term with GPUsingle().

That was it. It wasn't hard at all. I was not able to compare execution time between GPUmat and Jacket because my trial license expired again, but I would expect the execution times to be similar.

GPUmat does not have double precision support, but my GPU doesn't have Compute capability 1.3 so it doesn't matter.
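
For readers who have not seen either toolbox, the three changes listed above can be sketched roughly as follows. This is only an illustration: the variable names and the `useGPUmat` switch are made up for the example, and both branches assume that `gsingle`/`gforce` and `GPUsingle`/`GPUsync` behave as described in this post.

```matlab
% Hypothetical switch for this illustration only; set to true when GPUmat
% (rather than Jacket) is the toolbox on the MATLAB path.
useGPUmat = false;

x = rand(256);                 % ordinary CPU matrix
y = rand(256);                 % second CPU operand

if ~useGPUmat
    % Jacket version
    a = gsingle(x);            % push x to the GPU as single precision
    b = a + y;                 % "hybrid" statement: GPU operand + CPU operand
    gforce(b);                 % force the pending GPU computation to finish
else
    % GPUmat version, after the three changes
    a = GPUsingle(x);          % change 1: gsingle -> GPUsingle
    b = a + GPUsingle(y);      % change 3: wrap the CPU operand explicitly
    GPUsync;                   % change 2: GPUsync on the following line, no gforce
end
```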
znoop333
 
Posts: 16
Joined: Mon Aug 25, 2008 1:31 pm

Re: GPUmat vs Jacket - what a difference?

Postby znoop333 » Mon Jul 06, 2009 10:15 am

I found yet another CUBLAS wrapper for Matlab. GPUlib is sold by Tech-X Corporation. It looks like it was NIH funded, so it's licensed as open source: free for academics, but the commercial license is $495. I think it was originally developed for IDL, not Matlab.

I'm still trying to figure out their exact feature set. I'm not as familiar with IDL. Definitely no GFOR.
znoop333
 
Posts: 16
Joined: Mon Aug 25, 2008 1:31 pm

Re: GPUmat vs Jacket - what a difference?

Postby natael » Mon Feb 15, 2010 9:57 pm

Hi,

I've been comparing GPUmat and Jacket for a little while now. On a single machine, standard Jacket (without GFX, without MGL, without HPC) is, as far as I've tested, equivalent to GPUmat. But that's a lot of "without", so I've since tested the MGL and HPC products as well.
See here: viewtopic.php?f=7&t=1206
However, I disagree with the claim that GPUmat is probably full of bugs. On the other hand, it doesn't offer many "end-usage" functions. For single machines, benchmarks, etc., it's a nice way to advertise the capabilities of GPU computing. But I went through all of its functions in one long night, and I was still hungry for more.
One advantage of GPUmat was that I didn't have warm-up problems. With my example code, there was absolutely no difference between running it the first time or the second time, and they might have an interesting point there. Maybe it's because there are far fewer functions, probably all "pre-compiled", so it costs less to load? I have no idea...

To me, Jacket has the advantage of being more "professional". Our institute is already considering getting it for its ability to parallelize across GPU clusters and across multiple GPUs on a single motherboard (HPC, MGL...), which GPUmat cannot do (yet?). For now, GPUmat only lets you choose which one (and only one) of the GPUs in your system to use for the session. That can be useful for multiple users, each running on one GPU among several.
In Jacket, GFX sounds promising too, and would be another advantage. I wish it were more advanced; that would make the whole thing much more convincing. But I assume it's just a matter of time.

Raphael
AccelerEyes Jacket v1.3.0 (build 3890)
CUDA driver: 256.35, CUDA toolkit 2.3
GPU0 GeForce GTS 250, 1792 MHz, 1023 MB VRAM, Compute 1.1 (single)
OS: Linux Ubuntu 10.04
RAM: 6 GB
CPU : AMD athlon 64 X2 6000+
natael
 
Posts: 24
Joined: Fri Jan 29, 2010 8:19 pm
Location: Max Planck Institute for Solar System Research, Germany

Re: GPUmat vs Jacket - what a difference?

Postby few » Sat Feb 20, 2010 2:16 pm

I ran the code listed in vitaly's post (quoted below) to compare performance on my system with Jacket 1.2.2 and CUDA 2.3. Here is some info about my system:

(4 core Intel Core i7 CPU 975 @ 3.33GHz, 12 GB RAM, with Tesla C1060 4GB GPU)

Jacket v1.2.2 (build 3170) data: 0 CPU-used, 0 GPU-used, 4044 GPU-free (in MB)
GPU0 (enabled) Tesla C1060, 1265 MHz, 4095 MB VRAM, Compute 1.3 (single,double) (in use)
GPU1 (enabled) Quadro FX 380, 1074 MHz, 255 MB VRAM, Compute 1.1 (single)


Here is the code I ran first, with results; the GPU is much faster than the CPU:
n=3000;
A=rand(n,n);
G=gdouble(A);
tic
R=A*A;
toc
tic
S=G*G;
toc
sum(sum(abs(R-double(S))))
Elapsed time is 2.440815 seconds.
Elapsed time is 0.028268 seconds.
ans =
8.3630e-006


Note that the times vary quite a bit from run to run (before and after warming).

Here is a comparison with single-precision speed and results:
n=3000;
A=single(rand(n,n));
G=gsingle(A);
tic
R=A*A;
toc
tic
S=G*G;
toc
sum(sum(abs(R-single(S))))
Elapsed time is 1.211856 seconds.
Elapsed time is 0.019458 seconds.
ans =
4.4995e+003


In case you didn't notice: the residual magnitude is 10^3 in the previous example (10^-6 in the first example). Ouch. That's why I can't use single precision in my work either.
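
The summed absolute difference grows with both the number of elements and their magnitude, so it can help to also look at the residual on a relative scale. A minimal CPU-only sketch, assuming plain MATLAB `single` arithmetic as a stand-in for the GPU run (the names `absResid` and `relResid` are made up for the example):

```matlab
n = 3000;
A = rand(n, n);
R = A * A;                              % double-precision reference
S = single(A) * single(A);              % same product in single precision

D = abs(R - double(S));                 % element-wise absolute error
absResid = sum(D(:));                   % the quantity reported in this post
relResid = absResid / sum(abs(R(:)));   % scaled by the total magnitude of R

fprintf('absolute: %g   relative: %g\n', absResid, relResid);
```

Whether the relative figure is acceptable still depends on the application.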

Here is how the processing time scaled when I moved to much larger matrices:
n=10000;
A=rand(n,n);
G=gdouble(A);
tic
R=A*A;
toc
tic
S=G*G;
toc
sum(sum(abs(R-double(S))))
Elapsed time is 55.508999 seconds.
Elapsed time is 0.382455 seconds.
ans =
5.7276e-004


A 10^-4 residual isn't too bad for computations involving 100-million-element matrices. I think it's around the accuracy that my work requires.

Finally I wrapped the code so that I could calculate the CPU time, GPU time, and residual for the calculations for n = 1000:1000:10000. Note: I only ran the code once, since it takes a while to run, but I've already run lots of gpu code and warmed jacket since starting matlab. Here are the results:
[CPUtime', GPUtime', Resid']
ans =
0.0959, 0.0063, 0.0000
0.7258, 0.0155, 0.0000
1.6517, 0.0255, 0.0000
4.5068, 0.0652, 0.0000
6.8876, 0.0932, 0.0001
15.9968, 0.1334, 0.0001
19.5711, 0.1863, 0.0002
28.1169, 0.3501, 0.0002
49.7120, 0.3253, 0.0004
60.9479, 0.3726, 0.0006

(I inserted commas so that the times are distinguishable- whitespace gets eaten)

vitaly wrote:
Hi!
Thanks for the full and useful answer.

I work in the field of neurotechnology. My algorithm takes one month to run on a Core2Duo 4 GHz processor. Reducing this time tenfold (with GT300) would therefore be a desirable result for me. I have already tried switching to the single data type and got a catastrophic loss of accuracy. So the only way for me to use the power of the GPU is to work with double data.

I suggest trying this code, if possible.

n=3000;
A=rand(n,n);
G=gdouble(A);
tic
R=A*A;
toc
tic
S=G*G;
toc
sum(sum(abs(R-double(S))))

It is clear that gdouble should be substantially slower than gsingle, but it will be interesting to see whether it is faster than the ordinary CPU.

Please tell me what options there are for speeding up combined CPU+GPU work. As I understand it, CPU speed does not strongly affect the overall speed? And should the speed of system memory be increased (to speed up data transfer from CPU to GPU)? Would the triple-channel memory of the Core i7 be the best option?

I will wait for the chance to test v1.1 on the GTX250.

Thanks.
Last edited by few on Sun Feb 21, 2010 11:31 am, edited 1 time in total.
few
 
Posts: 6
Joined: Fri Nov 13, 2009 1:28 am

Re: GPUmat vs Jacket - what a difference?

Postby Lars1 » Sun Feb 21, 2010 6:11 am

Hi.

The posted code:

Code:
n=3000;
A=rand(n,n);
G=gdouble(A);
tic
R=A*A;
toc
tic
S=G*G;
toc
sum(sum(abs(R-double(S))))


gives misleading timings because some gforce calls are missing. I suggest using the following:

Code:
n=3000;
A=rand(n,n);
G=gdouble(A);
gforce(G); gforce;  % Ensures that G is formed on the GPU and any CPU/GPU sync is done
tic
R=A*A;
toc
tic
S=G*G;
gforce(S); % Need to force computation of the matrix S here
toc
sum(sum(abs(R-double(S))))


Some benchmarks for gsingle are available here: http://www.accelereyes.com/wiki/index.php?title=MTIMES_(Matrix). Matrix multiply is unfortunately not something Jacket does very fast. The speed-ups you measured seem very large only because, with lazy execution, S is not actually computed until the line "sum(sum(abs(R-double(S))))", which is outside the timed region. I will add double precision to the benchmarks as soon as possible.
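
The wrapper used for the n = 1000:1000:10000 sweep earlier in the thread was not posted. A minimal sketch of how such a sweep might look with the gforce calls included, assuming Jacket's gdouble and gforce behave as in the code above (the names CPUtime, GPUtime and Resid follow the earlier post; the loop itself is a guess, not the original code):

```matlab
sizes   = 1000:1000:10000;
CPUtime = zeros(1, numel(sizes));
GPUtime = zeros(1, numel(sizes));
Resid   = zeros(1, numel(sizes));

for k = 1:numel(sizes)
    n = sizes(k);
    A = rand(n, n);
    G = gdouble(A);
    gforce(G);                 % make sure G is on the GPU before timing starts

    tic
    R = A * A;                 % CPU multiply
    CPUtime(k) = toc;

    tic
    S = G * G;                 % GPU multiply (lazy until forced)
    gforce(S);                 % force the computation inside the timed region
    GPUtime(k) = toc;

    % Summed absolute difference, as in the posted code
    Resid(k) = sum(sum(abs(R - double(S))));
end

[CPUtime', GPUtime', Resid']
```

Running the loop twice and keeping the second set of timings would reduce the warm-up effects mentioned earlier in the thread.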

BR Torben
--
Editor of "Torben's Corner" - http://wiki.accelereyes.com/wiki/index.php/Torben's_Corner
Cluster: 2 x X5670 + 20 x X5570 | 18 x C2070 & 15 x GTX580
Colfax CXT2000i: GTX465 & GTX580 | FX3800 & 4000 | C1060 & C2050
Lars1
 
Posts: 140
Joined: Thu Jul 23, 2009 7:28 am

