Can I compile a CUDA program without having a CUDA device?

The answer to your question is YES.

The nvcc compiler driver does not depend on the physical presence of a device, so you can compile CUDA code even without a CUDA-capable GPU. Be warned, however, that, as remarked by Robert Crovella, the CUDA driver library libcuda.so (cuda.lib on Windows) comes with the NVIDIA driver and not with the CUDA toolkit installer. This means that code requiring the driver API (whose entry points are prefixed with cu, see Appendix H of the CUDA C Programming Guide) needs a forced installation of a "recent" driver on a machine that has no NVIDIA GPU. To do that, run the driver installer separately; its --help command line switch lists the available options.
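To make the driver-library point concrete, here is a minimal sketch of a program that uses only the driver API. The file name driver_check.cu and the helper driver_ok are my own, purely illustrative names. The header cuda.h ships with the toolkit, so the compile step should work anywhere; it is the -lcuda link step that needs the library from the NVIDIA driver.

```cuda
// driver_check.cu (hypothetical file name)
// Build: nvcc driver_check.cu -o driver_check -lcuda
// The -lcuda link step is the part that needs libcuda.so (cuda.lib on Windows).
#include <cstdio>
#include <cuda.h>  // driver API declarations; this header ships with the toolkit

// Illustrative helper: report a failed driver API call and return false.
static bool driver_ok(CUresult rc, const char *what)
{
    if (rc == CUDA_SUCCESS) return true;
    std::printf("%s failed with CUresult %d\n", what, static_cast<int>(rc));
    return false;
}

int main()
{
    // cuInit must precede any other driver API call.
    if (!driver_ok(cuInit(0), "cuInit")) return 1;

    int count = 0;
    if (!driver_ok(cuDeviceGetCount(&count), "cuDeviceGetCount")) return 1;

    std::printf("driver API sees %d device(s)\n", count);
    return 0;
}
```

A program that sticks to the runtime API (the cuda-prefixed functions) does not need the -lcuda switch at build time.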

Following the same rationale, you can compile CUDA code for one architecture on a machine whose GPU has a different architecture. For example, you can compile code for a GeForce GT 540M (compute capability 2.1) on a machine hosting a GT 210 (compute capability 1.2).
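As a rough sketch of what that looks like in practice, the target architecture is selected with an nvcc switch at compile time, independently of whatever GPU (if any) is installed. The file name scale.cu and the function names are made up for illustration, and sm_21 is only accepted by older toolkits, so substitute an architecture your nvcc version supports.

```cuda
// scale.cu (hypothetical file name)
// Compile-only builds; the installed GPU, if any, plays no role here:
//   nvcc -arch=sm_21 -c   scale.cu -o scale.o     (object code for compute capability 2.1)
//   nvcc -arch=sm_21 -ptx scale.cu -o scale.ptx   (intermediate PTX, handy for inspection)
// Assumption: your toolkit still accepts sm_21; newer ones do not.
#include <cuda_runtime.h>

// Callable from both host and device, so it can be checked on the CPU too.
__host__ __device__ inline float scale_value(float x, float s)
{
    return x * s;
}

// Multiply every element of data by s, one thread per element.
__global__ void scale(float *data, float s, int n)
{
    int i = blockIdx.x * blockDim.x + threadIdx.x;
    if (i < n) data[i] = scale_value(data[i], s);
}
```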

Of course, in both cases (no GPU, or a GPU with a different architecture), you will not be able to run the code successfully.
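If you want the program to fail gracefully instead of crashing in the no-GPU case, one option is to ask the runtime how many devices it sees before doing anything else. This is only a sketch: device_check.cu and describe_devices are illustrative names of mine, and the check does not catch an architecture mismatch, which would instead surface as an error at kernel launch (visible through cudaGetLastError).

```cuda
// device_check.cu (hypothetical file name)
// Build: nvcc device_check.cu -o device_check
#include <cstdio>
#include <cuda_runtime.h>

// Illustrative helper: turn the result of cudaGetDeviceCount into a short message.
const char *describe_devices(cudaError_t err, int count)
{
    if (err != cudaSuccess) return "CUDA runtime reported an error";
    if (count == 0)         return "no CUDA-capable device found";
    return "at least one CUDA-capable device found";
}

int main()
{
    int count = 0;
    cudaError_t err = cudaGetDeviceCount(&count);

    std::printf("%s", describe_devices(err, count));
    if (err != cudaSuccess) std::printf(" (%s)", cudaGetErrorString(err));
    std::printf("\n");

    // A non-zero exit code signals that kernels cannot be run on this machine.
    return (err == cudaSuccess && count > 0) ? 0 : 1;
}
```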

In early versions of CUDA, it was possible to compile code in an emulation mode and run the compiled code on a CPU, but device emulation has been deprecated for some time. If you don't have a CUDA-capable device but want to run CUDA code, you can try gpuocelot (though I don't have any experience with it).