[PD-dev] CUDA discussion
czhenry at gmail.com
Tue Nov 3 14:12:04 CET 2009
> top-down design issues:
> 1. The essential CUDA<->Pd functions should be made separate from
> CUDA based Pd externals, with a separate header file, and compilable
> to shared and static libraries.
> 2. The set of CUDA<->Pd extensions needs to be able to manage
> multiple devices, including device query, initialization and setting
> global parameter sets per GPU. Most likely, this means a custom data
> structure and object based method system.
> 3. Compilation: how to create the build system and handle
> dependencies for a library of CUDA based externals, especially
> management of the CUDA runtime (cudart) and CUBLAS libraries.
> 4. Testing and initialization. At setup time, a CUDA based external
> should be able to find out if it is legal and ready to run.
> 5. Abstraction of major device and memory operations. What makes up
> a sufficient and complete set of operations? This is a list that is
> most likely to be grown through experimentation, but a good
> preliminary list of operations will help get things started on the
> right footing.
> 6. Performance. How to profile or benchmark and make comparisons
> between implementations? The single greatest performance issue that I
> have identified is caching on GPU. Host<->device memory transfers can
> be eliminated in some cases, allowing CUDA based externals to follow
> one another in the DSP tree with faster scheduling and runtime
7. Namespace. It should be possible to duplicate existing objects
under names that vary from the originals in one uniform way.
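
As a rough sketch of point 7, one way to get unified names is a single
helper that derives the CUDA variant's name from the existing object's
name. The "cu_" prefix and the function name pdcuda_variant_name are
only assumptions for illustration, nothing here is settled, and the
code deliberately touches neither the Pd nor the CUDA API.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical shared prefix for every CUDA-backed variant of an object. */
#define PDCUDA_PREFIX "cu_"

/*
 * Build the name of the CUDA variant of an existing object,
 * e.g. "osc~" becomes "cu_osc~".
 *
 * Returns 0 on success. Returns -1 if `base` is empty or if the
 * result (including the terminating NUL) does not fit in `out`;
 * in the second case `out` is set to the empty string.
 */
int pdcuda_variant_name(const char *base, char *out, size_t outlen)
{
    assert(base != NULL && out != NULL);

    if (base[0] == '\0' || outlen == 0)
        return -1;

    /* snprintf reports the length it wanted to write, so a value
       >= outlen means the name was truncated. */
    int n = snprintf(out, outlen, "%s%s", PDCUDA_PREFIX, base);
    if (n < 0 || (size_t)n >= outlen) {
        out[0] = '\0';
        return -1;
    }
    return 0;
}
```

With this, "osc~" maps to "cu_osc~", and the resulting string could be
passed to gensym() when the class is registered in the external's setup
routine. Keeping the rule in one function means every external in the
library follows the same scheme, whatever prefix or suffix is chosen.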