Multi Nvidia GPU overclocking for computations (CUDA)

I was never able to get this to work by hand-editing xorg.conf. What did work was running the following command, which sets it all up for you:

sudo nvidia-xconfig -a --cool-bits=28 --allow-empty-initial-configuration

Then edit xorg.conf (for me that was sudo vi /etc/X11/xorg.conf) and prepend "#" to each line containing the AllowEmptyInitialConfiguration option, which is how the --allow-empty-initial-configuration flag appears in the file, to comment it out.
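If you would rather not do that edit by hand in vi, here is a minimal sketch of the same step with sed. It assumes nvidia-xconfig wrote the option as AllowEmptyInitialConfiguration; the function name comment_out_option is just something I made up for illustration. It keeps a .bak copy of the file and leaves lines that are already comments alone.

```shell
#!/bin/sh
# comment_out_option FILE  (hypothetical helper name)
# Prepend "#" to every line mentioning AllowEmptyInitialConfiguration,
# skipping lines that are already commented out. A backup is kept as FILE.bak.
comment_out_option() {
    sed -i.bak '/^[[:space:]]*#/!s/^\(.*AllowEmptyInitialConfiguration\)/#\1/' "$1"
}

# Example for the real file (needs root, so call sed directly under sudo):
#   sudo sed -i.bak '/^[[:space:]]*#/!s/^\(.*AllowEmptyInitialConfiguration\)/#\1/' /etc/X11/xorg.conf
```

Check the result with grep -n AllowEmpty /etc/X11/xorg.conf before rebooting.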

Reboot.

Then to overclock run:

/usr/bin/nvidia-settings

To restore your settings after a reboot, create an executable file containing the commands below and call it from Startup Applications. The commands set the GPU clock offset and tell each GPU to prefer maximum performance. My example sets the offset to 50. Don't set the offset too high in the file for the GPU that drives your display until you know for sure what value you want, or you may end up with a system where the display won't work:

nvidia-settings -a "[gpu:0]/GpuPowerMizerMode=1"
nvidia-settings -a "[gpu:0]/GPUGraphicsClockOffset[3]=50"

nvidia-settings -a "[gpu:1]/GpuPowerMizerMode=1"
nvidia-settings -a "[gpu:1]/GPUGraphicsClockOffset[3]=50"

nvidia-settings -a "[gpu:2]/GpuPowerMizerMode=1"
nvidia-settings -a "[gpu:2]/GPUGraphicsClockOffset[3]=50"

nvidia-settings -a "[gpu:3]/GpuPowerMizerMode=1"
nvidia-settings -a "[gpu:3]/GPUGraphicsClockOffset[3]=50"
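With more cards the startup file gets repetitive, so here is a rough sketch that generates the same per-GPU commands in a loop. The function name print_oc_commands and its two parameters (GPU count and offset) are my own placeholders; the 4 and 50 just mirror the example above. It only prints the commands so you can look them over before running anything.

```shell
#!/bin/sh
# print_oc_commands COUNT OFFSET  (hypothetical helper name)
# Print the nvidia-settings commands for GPUs 0..COUNT-1.
print_oc_commands() {
    count=$1
    offset=$2
    i=0
    while [ "$i" -lt "$count" ]; do
        # The attribute is quoted so the shell does not treat [gpu:N] as a glob pattern.
        printf 'nvidia-settings -a "[gpu:%d]/GpuPowerMizerMode=1"\n' "$i"
        printf 'nvidia-settings -a "[gpu:%d]/GPUGraphicsClockOffset[3]=%d"\n' "$i" "$offset"
        i=$((i + 1))
    done
}

# Print the commands for 4 GPUs with an offset of 50 (nothing is applied here).
print_oc_commands 4 50
```

Once the printed commands look right, pipe the output to sh to apply them, for example print_oc_commands 4 50 | sh from the startup script.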

If you want to overclock the memory too, use:

nvidia-settings -a "[gpu:0]/GPUMemoryTransferRateOffset[3]=800"

On a related note, you can also change the power limit of the cards. To see the valid range, enter an out-of-range value such as 1000; nvidia-smi rejects it and reports the limits it will accept:

sudo -n nvidia-smi -i 0 --persistence-mode=1
sudo -n nvidia-smi -i 0 --power-limit=145
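Instead of provoking an error to learn the range, you can ask nvidia-smi for it. The sketch below assumes your driver supports the power.min_limit and power.max_limit query fields; the helper names watts_in_range and set_power_limit_all are hypothetical. It applies the requested limit to every GPU whose reported range allows it and skips the rest. Nothing runs until you call set_power_limit_all yourself.

```shell
#!/bin/sh
# watts_in_range VALUE MIN MAX  (hypothetical helper name)
# Succeed when MIN <= VALUE <= MAX. awk does the comparison because
# nvidia-smi reports the limits as decimals such as 125.00.
watts_in_range() {
    awk -v v="$1" -v lo="$2" -v hi="$3" \
        'BEGIN { exit !(v + 0 >= lo + 0 && v + 0 <= hi + 0) }'
}

# set_power_limit_all WATTS  (hypothetical helper name, not called automatically)
# Apply WATTS to each GPU if it lies inside that GPU's reported range.
set_power_limit_all() {
    watts=$1
    nvidia-smi --query-gpu=index,power.min_limit,power.max_limit \
               --format=csv,noheader,nounits |
    while IFS=', ' read -r idx lo hi; do
        if watts_in_range "$watts" "$lo" "$hi"; then
            # stdin is redirected so sudo cannot swallow the remaining CSV lines.
            sudo -n nvidia-smi -i "$idx" --persistence-mode=1 </dev/null
            sudo -n nvidia-smi -i "$idx" --power-limit="$watts" </dev/null
        else
            echo "GPU $idx: $watts W is outside $lo-$hi W, skipping" >&2
        fi
    done
}
```

For example, set_power_limit_all 145 would repeat the two commands above for every card that accepts 145 W.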

And to just display the current power usage and limit:

nvidia-smi

An alternative is to change xorg.conf by hand and add a virtual X screen for each of the cards (even those not connected to a monitor); that approach also solved the issue.

Basically, you want to have a server layout section with all of your real and virtual screens:

Section "ServerLayout"  
    Identifier    "Layout0"     
#   Our real monitor
    Screen 0      "Screen0" 0 0     
#   Our virtual monitors
    Screen 1      "Screen1"     
    Screen 2      "Screen2"
#    ....
    Screen 3      "Screen3"     
    InputDevice   "Keyboard0" "CoreKeyboard"
    InputDevice   "Mouse0"    "CorePointer" 
EndSection

Then, for each of your cards, you can put in (almost) identical "Screen", "Monitor" and "Device" sections, differing only by their identifiers. These are written as N in the following, but should be replaced by the card number: 0, 1, etc. Note that at least the parameters for the real monitor should correspond to what you currently have in your xorg.conf file; in the following I have CRT since it's an old VGA monitor.

Section "Screen"
    Identifier     "ScreenN"
    Device         "DeviceN"
    Monitor        "MonitorN"
    DefaultDepth 24
    Option         "ConnectedMonitor" "CRT"
    Option         "Coolbits" "5"
    Option         "TwinView" "0"
    Option         "Stereo" "0"
    Option         "metamodes" "nvidia-auto-select +0+0"
    SubSection     "Display"
       Depth 24
    EndSubSection
EndSection



Section "Monitor"
    Identifier     "MonitorN"
    VendorName     "Unknown"
    ModelName      "CRT-N"
    HorizSync       28.0 - 33.0
    VertRefresh     43.0 - 72.0
    Option         "DPMS"
EndSection

Section "Device"
    Identifier     "DeviceN"
    Driver         "nvidia"
    VendorName     "NVIDIA Corporation"
    BoardName      "Your Card name here"
    BusID          "PCI:X:Y:Z"
EndSection
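Since the three sections only differ by N and the bus ID, they can be stamped out from a template. This is a sketch under a few assumptions: the function name print_card_sections is made up, the option values are copied from the template above rather than tuned for your hardware, and the bus ID and board name are whatever you pass in. Running nvidia-xconfig --query-gpu-info should list the PCI BusID of each card; keep in mind that xorg.conf expects decimal numbers while lspci prints hexadecimal.

```shell
#!/bin/sh
# print_card_sections N BUSID BOARDNAME  (hypothetical helper name)
# Print the Screen, Monitor and Device sections for card number N.
print_card_sections() {
    n=$1
    busid=$2
    board=$3
    cat <<EOF
Section "Screen"
    Identifier     "Screen$n"
    Device         "Device$n"
    Monitor        "Monitor$n"
    DefaultDepth    24
    Option         "ConnectedMonitor" "CRT"
    Option         "Coolbits" "5"
    Option         "TwinView" "0"
    Option         "Stereo" "0"
    Option         "metamodes" "nvidia-auto-select +0+0"
    SubSection     "Display"
        Depth       24
    EndSubSection
EndSection

Section "Monitor"
    Identifier     "Monitor$n"
    VendorName     "Unknown"
    ModelName      "CRT-$n"
    HorizSync       28.0 - 33.0
    VertRefresh     43.0 - 72.0
    Option         "DPMS"
EndSection

Section "Device"
    Identifier     "Device$n"
    Driver         "nvidia"
    VendorName     "NVIDIA Corporation"
    BoardName      "$board"
    BusID          "$busid"
EndSection
EOF
}

# Example (placeholder values): print_card_sections 1 "PCI:2:0:0" "Your Card name here"
```

Redirect the output to a scratch file and paste it into xorg.conf by hand, so you can still adjust the entry for the real monitor.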
