
> CUDA Runtime: The runtime library (libcudart) that applications link against.

That library is actually a rather poor idea. If you're writing a CUDA application, I strongly recommend avoiding the "runtime API". It provides partial access to the actual CUDA driver and its API, which is 'simpler' in the sense that you don't explicitly create "contexts", but:

* It hides or limits a lot of the functionality.

* Its actual behavior vis-a-vis contexts is not at all simple and is likely to make your life more difficult down the road.

* It's not some clean interface that's much more convenient to use.

So, either go with the driver, or consider my CUDA API wrappers library [1], which _does_ offer a clean, unified, modern (well, C++11'ish) RAII/CADRe interface. And it covers much more than the runtime API, to boot: JIT compilation of CUDA (nvrtc) and PTX (nvptx_compiler), profiling (nvtx), etc.
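For concreteness, here is roughly the context bookkeeping the runtime API does behind your back on first use (a minimal driver-API sketch; error handling abbreviated, and it assumes a CUDA toolkit and driver are installed):

    #include <cuda.h>
    #include <cstdio>

    int main() {
        CUdevice dev;
        CUcontext ctx;

        // The runtime API performs these steps implicitly, against the
        // device's "primary context", on the first runtime call you make.
        if (cuInit(0) != CUDA_SUCCESS) { std::fprintf(stderr, "cuInit failed\n"); return 1; }
        if (cuDeviceGet(&dev, 0) != CUDA_SUCCESS) return 1;
        if (cuCtxCreate(&ctx, 0, dev) != CUDA_SUCCESS) return 1;

        // ... module loading, allocations, kernel launches go here ...

        cuCtxDestroy(ctx);
        return 0;
    }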

> Driver API ... provides direct access to GPU functionality.

Well, I wouldn't go that far, it's not that direct. Let's call it: "Less indirect"...

[1] : https://github.com/eyalroz/cuda-api-wrappers/



If you do this, you forego both backwards and forwards compatibility. You must follow the driver release cadence exactly, and rebuild all of your code for every driver you want to support when a new release happens, or you risk subtle breakage. NVIDIA guarantees nothing in terms of breakage for you.

Probably the worst part of this: for the most part, in practice, it will work just fine. Until it doesn’t. You will have lots of fun debugging subtle bugs in a closed-source black box, which reproduces only against certain driver API header versions, which potentially does not match the version of the actual driver API DSO you’ve dlopened, and which only produces problems when mixed with certain Linux kernel versions.

(I have the exact opposite opinion; people reach too eagerly for the driver API when they don’t need it. Almost everything that can be done with the driver API can be done with the runtime API. If you absolutely must use the driver API, which I doubt, you should at least resolve the function pointers through cudaGetDriverEntryPointByVersion.)
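(For illustration, resolving a driver entry point that way looks roughly like this - signature per CUDA 12.5+, worth double-checking against your toolkit's headers:)

    #include <cuda_runtime.h>

    // Ask the runtime for the driver entry point matching a specific
    // CUDA version's ABI, instead of linking the symbol directly.
    void* fn = nullptr;
    cudaDriverEntryPointQueryResult status;
    cudaError_t err = cudaGetDriverEntryPointByVersion(
        "cuStreamGetCaptureInfo",   // unversioned symbol name
        &fn,
        12030,                      // request CUDA 12.3's ABI variant
        cudaEnableDefault,
        &status);
    if (err == cudaSuccess && status == cudaDriverEntryPointSuccess) {
        // cast fn to the matching function-pointer type before calling
    }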


Disagree, because:

* The Runtime API is also a black box - it's just differently shaped.

* CUDA runtime APIs are also incompatible with CUDA drivers which are significantly older. Although TBH I have not checked that compatibility range recently.

* C++ is a compiled language. So, yes, in some cases, you need to recompile. But - less than you might think. Specifically, the driver API headers use macros to direct your API function names to versioned names. For example:

    #define cuStreamGetCaptureInfo              __CUDA_API_PTSZ(cuStreamGetCaptureInfo_v3)

  and this versioned function will typically be available also when the signature changes to v4 (in this example, it seems two versions backwards are available in CUDA 13.0).
* ... meaning also that you don't have to "follow the driver release cadence exactly". But even if you want to follow it - a breaking change only lands every couple of years, when a major CUDA version is released and the API changes rather than merely gaining functionality. And as for actual under-the-hood behavior while observing the same API - that can change whether you're using the driver or the runtime API.

* Finally, if you want something more stable, more portable, that doesn't change frequently - OpenCL can also be considered rather than CUDA.
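To make the versioned-symbol point concrete: the older _vN entry points stay exported by newer drivers, so you can bypass the macro and pin to one explicitly (a sketch; cuCtxCreate_v2 shown as one versioned symbol that current headers still declare):

    #include <cuda.h>

    CUdevice dev;
    CUcontext ctx;
    cuInit(0);
    cuDeviceGet(&dev, 0);
    // cuda.h's macro maps cuCtxCreate to the newest versioned symbol;
    // calling the v2 entry point directly pins you to that ABI, which
    // newer drivers continue to export.
    CUresult res = cuCtxCreate_v2(&ctx, 0, dev);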



