Throttling processes by GPU temperature
While rendering some GPU-intensive OpenGL stuff, I got scared when my graphics card hit 90 °C, so I paused the process until it had cooled down. I got fed up with pausing and resuming it by hand, so I wrote this small script:
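#!/bin/bash
# Usage: pass the PIDs of the processes to throttle as arguments.

# Start with the processes stopped; they resume once the GPU is cool enough.
kill -s SIGSTOP "${@}"
running=0
stop_threshold=85   # pause above this temperature (°C)
cont_threshold=75   # resume below this temperature (°C)
while true
do
    # Extract the number from the "GPU Current Temp : NN C" line.
    temperature="$(nvidia-smi -q -d TEMPERATURE | grep 'GPU Current Temp' | sed 's/^.*: \(.*\) C$/\1/')"
    if (( running ))
    then
        if (( temperature > stop_threshold ))
        then
            echo "STOP ${temperature} > ${stop_threshold}"
            kill -s SIGSTOP "${@}"
            running=0
        fi
    else
        if (( temperature < cont_threshold ))
        then
            echo "CONT ${temperature} < ${cont_threshold}"
            kill -s SIGCONT "${@}"
            running=1
        fi
    fi
    sleep 1
done | ts   # ts (from moreutils) timestamps each log line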
If you want to run it yourself, I advise checking the output of nvidia-smi on your system, because its manual page says the format isn't stable. I also suggest monitoring the temperature yourself, at least until you're sure it's working OK for you. Usage is simple: pass the PIDs of the processes you want to throttle on the command line. Typically these would be OpenGL applications (or Vulkan / OpenCL / CUDA / whatever else they come up with next). Note that the script stops the processes as soon as it starts and only resumes them once the temperature drops below the lower threshold.
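Since the human-readable report isn't a stable format, one possible way to make the temperature reading less fragile is nvidia-smi's query interface (--query-gpu=temperature.gpu with --format=csv,noheader,nounits), which prints one bare number per GPU. The sketch below is not part of my original script: max_temp is a hypothetical helper name, and taking the hottest GPU on a multi-GPU machine is my assumption about what you'd want. Treat it as a starting point and check it against your own driver's output.

```shell
#!/bin/bash
# max_temp (hypothetical helper): reads one temperature per line on stdin
# and prints the highest one. Lines that are not plain integers, such as
# "[N/A]", are ignored. Returns 1 if no usable reading was seen.
max_temp() {
    local max=-1 t
    while read -r t
    do
        [[ "${t}" =~ ^[0-9]+$ ]] || continue
        t=$(( 10#${t} ))   # force base 10 so a leading zero is not read as octal
        if (( t > max ))
        then
            max="${t}"
        fi
    done
    (( max >= 0 )) || return 1
    echo "${max}"
}

# Assumed use inside the polling loop (needs an NVIDIA driver to run):
#   temperature="$(nvidia-smi --query-gpu=temperature.gpu --format=csv,noheader,nounits | max_temp)"
```

If max_temp fails, temperature ends up empty, so it is probably wise to check for that before comparing against the thresholds. To throttle a renderer you would then run something like ./gpu-throttle.sh $(pgrep myrenderer), where both the script name and the process name are placeholders.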