Once the decision was made to move to ROS 2, the next step is to upgrade my Ubuntu installation to Ubuntu Bionic Beaver 18.04 LTS. I could upgrade in place, but given that I’m basically rebuilding my system for new infrastructure I decided to take this opportunity to upgrade to a larger SSD and restart from scratch.
As soon as my Ubuntu installation was up and running, I immediately went to revisit the most problematic portion of my previous installation: the version-matching adventure to install GPU-accelerated Tensorflow. If anything goes horribly wrong, this is the best time to flatten the disk and try until I get it right. As it turned out, that was a good call.
Taking to heart the feedback given by people like myself, Google has streamlined TensorFlow installation and even includes an option to run within a Docker container. This packages all of the various software libraries (from Nvidia’s CUDA Toolkit to Google’s TensorFlow itself) into a single integrated package. This is in fact their recommended procedure today for GPU support, with the words:
This setup only requires the NVIDIA GPU drivers.
When it comes to Linux and specialty device drivers, “only” rarely actually turns out to be so. I went online for further resources and found this page offering three options for installing Nvidia drivers on Ubuntu 18.04. Since I like living on the bleeding edge and have little to lose on a freshly installed disk, I tried the manual installation of Nvidia driver files first.
It was not user friendly, the script raised errors that pointed me to log file… but the log file did not contain any information I found relevant for diagnosis. On a lark (again, very little to lose) I selected “continue anyway” options for the process to complete. This probably meant the installation has gone off the rails, but I wanted to see what I end with. After reboot I can tell my video driver has been changed, because it only ran on a single monitor and had flickering visual artifacts.
Well, that didn’t work.
I then tried to install drivers from ppa:graphics-drivers/ppa
but that process encountered problems it didn’t know how to solve. Not being familiar with Ubuntu mechanics, I only had approximate understanding of the error messages. What it really told me was “you should probably reformat and restart now” which I did.
Once Ubuntu 18.04 was reinstalled, I tried the ppa:graphics-drivers/ppa
option again and this time it successfully installed the latest driver with zero errors and zero drama. I even maintained the use of both monitors without any flickering visual artifacts.
With that success, I installed Docker Community Edition for Ubuntu followed by Nvidia container runtime for Docker, both of which installed smoothly.
Once the infrastructure was in place, I was able to run a GPU-enabled TensorFlow docker containers on my machine, executing a simple program.
This process is still not great, but at least it is getting smoother. Maybe I’ll revisit this procedure again in another year to find an easier process. In the meantime, I’m back up and running with latest TensorFlow on my Ubuntu 18.04.