Setting up Visual Studio IntelliSense for CUDA kernel calls

I LOVED Randy's solution. I'll match and raise using C preprocessor variadic macros:

#ifdef __INTELLISENSE__
#define CUDA_KERNEL(...)
#else
#define CUDA_KERNEL(...) <<< __VA_ARGS__ >>>
#endif

Usage examples:

my_kernel1 CUDA_KERNEL(NUM_BLOCKS, BLOCK_WIDTH)();
my_kernel2 CUDA_KERNEL(NUM_BLOCKS, BLOCK_WIDTH, SHMEM, STREAM)(param1, param2);

From VS 2015 and CUDA 7 onwards you can add these two includes before any others, provided your files have the .cu extension:

#include "cuda_runtime.h"
#include "device_launch_parameters.h"

No need for macros or anything else. After that, IntelliSense should work as expected.


Wow, lots of dust on this thread. I came up with a macro fix (well, more like workaround...) for this that I thought I would share:

// nvcc does not seem to like variadic macros, so we have to define
// one for each kernel parameter list:
#ifdef __CUDACC__
#define KERNEL_ARGS2(grid, block) <<< grid, block >>>
#define KERNEL_ARGS3(grid, block, sh_mem) <<< grid, block, sh_mem >>>
#define KERNEL_ARGS4(grid, block, sh_mem, stream) <<< grid, block, sh_mem, stream >>>
#else
#define KERNEL_ARGS2(grid, block)
#define KERNEL_ARGS3(grid, block, sh_mem)
#define KERNEL_ARGS4(grid, block, sh_mem, stream)
#endif

// Now launch your kernel using the appropriate macro:
kernel KERNEL_ARGS2(dim3(nBlockCount), dim3(nThreadCount)) (param1); 
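
To see what a non-nvcc parser (such as IntelliSense) ends up with, here is a minimal host-only sketch. It assumes a plain C++ compiler, so __CUDACC__ is not defined and the macro expands to nothing. fake_kernel and launch_demo are hypothetical stand-ins invented for this illustration; a real __global__ kernel would still need nvcc.

```cpp
// Same guard as above: only nvcc defines __CUDACC__.
#ifdef __CUDACC__
#define KERNEL_ARGS2(grid, block) <<< grid, block >>>
#else
#define KERNEL_ARGS2(grid, block)
#endif

// Hypothetical stand-in for a kernel: an ordinary function, so that a
// host compiler can build this file. It writes its result through a
// pointer, the way a kernel would.
void fake_kernel(int* out, int value) { *out = value * 2; }

int launch_demo() {
    int result = 0;
    // Without __CUDACC__ the preprocessor turns this line into:
    //     fake_kernel (&result, 21);
    // The grid and block arguments (2 and 64) are simply discarded.
    fake_kernel KERNEL_ARGS2(2, 64) (&result, 21);
    return result;
}
```

The launch line is then an ordinary function call, which is all the C++ parser needs in order to check parameters and offer completion.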

I prefer this method because for some reason I always lose the '<<<' in my code, but the macro gets some help via syntax coloring :).


Visual Studio provides IntelliSense for C++. The trick from the rocket scientist's blog basically relies on the similarity CUDA C has to C++, nothing more.

In the C++ language, the proper parsing of angle brackets is troublesome. You've got < as less-than and as the template delimiter, and << as shift. Remember that not long ago we had to put a space between the closing brackets of nested template declarations.
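
A small sketch of that overloading in plain C++ (C++11 or later assumed). The Constant template and the three function names are made up for this illustration:

```cpp
#include <vector>

// Hypothetical helper: wraps a compile-time integer.
template <int N>
struct Constant { static constexpr int value = N; };

int shift_demo() {
    int x = 256;
    // Here every ">>" is a right shift: 256 -> 64 -> 32.
    return x >> 2 >> 1;
}

int nested_template_demo() {
    // Here ">>" closes two template argument lists. Before C++11 this
    // had to be written "std::vector<std::vector<int> >".
    std::vector<std::vector<int>> grid(3, std::vector<int>(4, 0));
    return static_cast<int>(grid.size() * grid[0].size());
}

int parenthesized_shift_demo() {
    // A shift inside a template argument needs parentheses; otherwise
    // the first '>' would be taken as the end of the argument list.
    return Constant<(8 >> 2)>::value;
}
```

The same characters mean three different things depending on context, and <<< >>> adds one more case that a standard C++ parser knows nothing about.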

So if the guy at NVIDIA who came up with this syntax was not a language expert, happened to choose the worst possible delimiter, and then tripled it, well, you're going to have trouble. It's amazing that IntelliSense works at all when it sees this.

The only way I know to get full IntelliSense in CUDA is to switch from the Runtime API to the Driver API. The C++ is just C++, and the CUDA is still (sort of) C++, but there is no <<<>>> badness for the language parser to work around.