tensorflow-gpu is not working with Blas GEMM launch failed

I had a very similar problem. For me it coincided with an NVIDIA driver update, so I thought the driver was at fault, but changing the driver had no effect. What eventually worked for me was clearing the NVIDIA cache:

sudo rm -rf ~/.nv/
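
If you would rather not delete the cache outright, a more cautious variant is to move it aside so it can be restored later. This is only a sketch: the function name backup_nv_cache and the .bak.<timestamp> naming are made up for illustration, and it assumes the cache lives in ~/.nv as in the command above.

```shell
#!/usr/bin/env bash
# Hypothetical helper (not part of any NVIDIA tooling): move the cache
# directory aside instead of deleting it, so it can be restored if needed.
backup_nv_cache() {
    # Default to ~/.nv; a different path can be passed as the first argument.
    local cache_dir="${1:-$HOME/.nv}"
    cache_dir="${cache_dir%/}"   # drop a trailing slash, if any

    if [ -d "$cache_dir" ]; then
        # Timestamp suffix keeps repeated backups from colliding.
        local backup="${cache_dir}.bak.$(date +%s)"
        mv "$cache_dir" "$backup" && echo "Moved $cache_dir to $backup"
    else
        echo "No cache directory at $cache_dir"
    fi
}
```

Running backup_nv_cache with no arguments acts on ~/.nv. If the directory is owned by root (for example after running a CUDA program with sudo), the mv may need sudo as well. Once TensorFlow works again, the backup can be removed.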

I found this suggestion in the NVIDIA developer forum: https://devtalk.nvidia.com/default/topic/1007071/cuda-setup-and-installation/cuda-error-when-running-matrixmulcublas-sample-ubuntu-16-04/post/5169223/

I suspect that after the driver update there were still some compiled files left over from the old version that were incompatible with the new driver, or that were corrupted during the update. Assumptions aside, this solved the problem for me.