Tuesday, October 20, 2009

Questions about CULA

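There is very little documentation on CULA's SVD, so I sent an email to ask (strictly speaking, this is not something I should be the one doing...). The important point is that culaSVD is, as I suspected, blocking.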
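To make "blocking" concrete: the call does not return until the result is ready, so if the host needs to do something else in the meantime, one option is to push the call onto a worker thread. Below is a minimal sketch of that idea in Python. Note that blocking_svd_stand_in is a hypothetical placeholder that only sleeps; it is not the CULA API, and whether this pattern suits a real application depends on the thread-safety notes in the reply below.

```python
import time
from concurrent.futures import ThreadPoolExecutor


def blocking_svd_stand_in(seconds):
    """Hypothetical stand-in for a blocking call such as culaSgesvd.

    It only sleeps for the given time; no SVD is computed here.
    """
    time.sleep(seconds)
    return "done"


def run_in_background(seconds):
    """Run the blocking call on a worker thread while the caller keeps working."""
    progress = []
    with ThreadPoolExecutor(max_workers=1) as pool:
        future = pool.submit(blocking_svd_stand_in, seconds)
        # The calling thread is free here; only the worker thread is blocked.
        progress.append("host work while SVD runs")
        # Wait until the blocking call has returned.
        result = future.result()
    progress.append(result)
    return progress
```

Calling run_in_background(0.05) returns the host-side entry first and the worker's result second, which is the ordering you would expect when the host work overlaps the blocking call.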

-----
Hello, to answer your questions
1) These are job codes that control which singular vectors the routine calculates and where they are returned. The code 'A' says to calculate the full U/Vt matrices. Users will typically use the code that calculates the minimum required results in order to lessen computation time.

CULA routines follow closely the LAPACK standard for routine naming and arguments. Full documentation of the routines will be in CULA 1.1, but in the meantime we can use this: http://www.intel.com/software/products/mkl/docs/webhelp/lse/functn_gesvd.html Note that our routines have cut the work and info parameters from the list.

2) culaSgesvd blocks. You will find that runtimes are sufficiently long that the blocking will not impair performance.

3) Yes, CULA is thoroughly thread-safe. The only function that can be adversely affected is culaGetErrorInfo(), which returns the most recent Info code triggered by any thread (which may be a different thread from the one currently executing). Info codes are normally errors in your data or arguments, so these can be debugged in a single-threaded environment most of the time.

For each thread you have three options -
1) Pre-bind each thread with a call to cudaSetDevice, using the CUDA toolkit. This is the most flexible, allowing you to choose which GPU each thread binds to.
2) Bind by calling culaInitialize() in each thread. CULA will bind to the GPU it determines is best, but this will result in each thread binding to the same GPU.
3) Do neither #1 nor #2 and accept CUDA's default binding, which is always device zero. Note that in order to use CULA, your first thread will still need to call culaInitialize().

Regards,
John
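
A note of my own on answer 1. The reply only mentions the code 'A', but since CULA is said to follow LAPACK, I am assuming the other gesvd job codes ('S', 'O', 'N') carry their LAPACK meaning. The sketch below is plain Python, not CULA code, and the function name is my own. It just tabulates which output arrays each code asks for and how large they are for an m x n input, which makes it easier to see why picking the smallest sufficient code saves work.

```python
def gesvd_output_shapes(jobu, jobvt, m, n):
    """Shapes of the separate output arrays under the LAPACK gesvd job codes.

    'A': all singular vectors, 'S': the first min(m, n) of them,
    'O': vectors overwrite the input matrix (no separate array),
    'N': vectors are not computed.
    Assumption: CULA accepts the same codes as LAPACK.
    """
    k = min(m, n)
    u_shapes = {"A": (m, m), "S": (m, k), "O": None, "N": None}
    vt_shapes = {"A": (n, n), "S": (k, n), "O": None, "N": None}
    if jobu not in u_shapes or jobvt not in vt_shapes:
        raise ValueError("job codes must be one of 'A', 'S', 'O', 'N'")
    if jobu == "O" and jobvt == "O":
        # LAPACK does not allow both sets of vectors to overwrite the input.
        raise ValueError("jobu and jobvt cannot both be 'O'")
    return {
        "singular_values": (k,),
        "U": u_shapes[jobu],
        "VT": vt_shapes[jobvt],
    }
```

For a tall 1000 x 10 matrix, 'A' asks for a 1000 x 1000 U while 'S' asks for only 1000 x 10, so the choice of code matters quite a bit.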

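Another note of my own, on option 1 of the per-thread choices. The reply says each thread can pre-bind to a GPU with cudaSetDevice. One simple policy, which is my assumption and not something the reply prescribes, is to assign devices round-robin by thread index. The sketch below shows only that assignment logic in Python; bind_device_stand_in is a hypothetical placeholder that records the chosen device per thread and does not call CUDA.

```python
import threading

# Per-thread storage, standing in for the per-thread device binding.
_bound = threading.local()


def bind_device_stand_in(device_id):
    """Hypothetical placeholder for cudaSetDevice(device_id)."""
    _bound.device = device_id


def worker(index, device_count, results):
    # Option 1: bind this thread to a device before doing any GPU work.
    bind_device_stand_in(index % device_count)
    # A blocking CULA call would go here in a real program.
    results[index] = _bound.device


def run_workers(thread_count, device_count):
    """Start the threads and report which device each one bound to."""
    results = [None] * thread_count
    threads = [
        threading.Thread(target=worker, args=(i, device_count, results))
        for i in range(thread_count)
    ]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return results
```

With four threads and two devices, run_workers(4, 2) spreads the threads evenly, in contrast to options 2 and 3, where every thread ends up on the same GPU.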