
> CUDA Runtime: The runtime library (libcudart) that applications link against.

That library is actually a rather poor idea. If you're writing a CUDA application, I strongly recommend avoiding the "runtime API". It provides partial access to the actual CUDA driver and its API, which is 'simpler' in the sense that you don't explicitly create "contexts", but:

* It hides or limits a lot of the functionality.

* Its actual behavior vis-a-vis contexts is not at all simple and is likely to make your life more difficult down the road.

* It's not some clean interface that's much more convenient to use.

So, either go with the driver, or consider my CUDA API wrappers library [1], which _does_ offer a clean, unified, modern (well, C++11'ish) RAII/CADRe interface. And it covers much more than the runtime API, to boot: JIT compilation of CUDA (nvrtc) and PTX (nvptx_compiler), profiling (nvtx), etc.
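For concreteness, here is roughly the context bookkeeping the runtime API does behind your back on first use (a minimal driver-API sketch; error handling abbreviated, and it assumes a CUDA toolkit and driver are installed):

    #include <cuda.h>
    #include <cstdio>

    int main() {
        CUdevice dev;
        CUcontext ctx;

        // The runtime API performs these steps implicitly, against the
        // device's "primary context", on the first runtime call you make.
        if (cuInit(0) != CUDA_SUCCESS) { std::fprintf(stderr, "cuInit failed\n"); return 1; }
        if (cuDeviceGet(&dev, 0) != CUDA_SUCCESS) return 1;
        if (cuCtxCreate(&ctx, 0, dev) != CUDA_SUCCESS) return 1;

        // ... module loading, allocations, kernel launches go here ...

        cuCtxDestroy(ctx);
        return 0;
    }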

> Driver API ... provides direct access to GPU functionality.

Well, I wouldn't go that far, it's not that direct. Let's call it: "Less indirect"...

[1] : https://github.com/eyalroz/cuda-api-wrappers/



If you do this, you forego both backwards and forwards compatibility. You must follow the driver release cadence exactly, and rebuild all of your code for every driver you want to support when a new release happens, or you risk subtle breakage. NVIDIA guarantees nothing in terms of breakage for you.

Probably the worst part of this: for the most part, in practice, it will work just fine. Until it doesn’t. You will have lots of fun debugging subtle bugs in a closed-source black box, which reproduces only against certain driver API header versions, which potentially does not match the version of the actual driver API DSO you’ve dlopened, and which only produces problems when mixed with certain Linux kernel versions.

(I have the exact opposite opinion; people reach too eagerly for the driver API when they don’t need it. Almost everything that can be done with the driver API can be done with the runtime API. If you absolutely must use the driver API, which I doubt, you should at least resolve the function pointers through cudaGetDriverEntryPointByVersion.)
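(For illustration, resolving a driver entry point that way looks roughly like this - signature per CUDA 12.5+, worth double-checking against your toolkit's headers:)

    #include <cuda_runtime.h>

    // Ask the runtime for the driver entry point matching a specific
    // CUDA version's ABI, instead of linking the symbol directly.
    void* fn = nullptr;
    cudaDriverEntryPointQueryResult status;
    cudaError_t err = cudaGetDriverEntryPointByVersion(
        "cuStreamGetCaptureInfo",   // unversioned symbol name
        &fn,
        12030,                      // request CUDA 12.3's ABI variant
        cudaEnableDefault,
        &status);
    if (err == cudaSuccess && status == cudaDriverEntryPointSuccess) {
        // cast fn to the matching function-pointer type before calling
    }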


Disagree, because:

* The Runtime API is also a black box - it's just differently shaped.

* CUDA runtime APIs are also incompatible with CUDA drivers which are significantly older. Although TBH I have not checked that compatibility range recently.

* C++ is a compiled language. So, yes, in some cases, you need to recompile. But - less than you might think. Specifically, the driver API headers use macros to direct your API function names to versioned names. For example:

    #define cuStreamGetCaptureInfo              __CUDA_API_PTSZ(cuStreamGetCaptureInfo_v3)

  and this versioned function will typically be available also when the signature changes to v4 (in this example, it seems two versions backwards are available in CUDA 13.0).
* ... meaning also that you don't have to "follow the driver release cadence exactly". But even if you want to follow it - a breaking change only lands every couple of years, when a major CUDA version is released and the API changes rather than merely gaining functionality. And as for actual under-the-hood behavior while observing the same API - that can change whether you're using the driver or the runtime API.

* Finally, if you want something more stable, more portable, that doesn't change frequently - OpenCL can also be considered rather than CUDA.
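To make the versioned-symbol point concrete: the older _vN entry points stay exported by newer drivers, so you can bypass the macro and pin to one explicitly (a sketch; cuCtxCreate_v2 shown as one versioned symbol that current headers still declare):

    #include <cuda.h>

    CUdevice dev;
    CUcontext ctx;
    cuInit(0);
    cuDeviceGet(&dev, 0);
    // cuda.h's macro maps cuCtxCreate to the newest versioned symbol;
    // calling the v2 entry point directly pins you to that ABI, which
    // newer drivers continue to export.
    CUresult res = cuCtxCreate_v2(&ctx, 0, dev);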



