Building a High Performance GPU Computing Workstation – Part IV
- Update ASUS Motherboard BIOS to most current UEFI version
- Install Windows 10 Pro via USB to NVMe Drive
- Partition (shrink) NVMe Drive in Windows to free up space for Ubuntu
- Format & Partition 8TB data drive
- Create a bootable USB drive with Ubuntu 16.04 LTS
- Turn off Secure Boot in UEFI BIOS
- Install Ubuntu 16.04 LTS via USB drive to NVMe free partition
- Install Nvidia 1080Ti Drivers in Ubuntu
- Test both Windows and Ubuntu for proper function
- Install CUDA and cuDNN in Ubuntu & update ~/.profile
- Install Tensorflow, Keras, pytorch, and jupyter in Ubuntu
- Run MNIST, & Deep Dream, and small VGG network
- Modify GRUB boot loader (optional) & configure system
Finally, the fun stuff!
First, CUDA. I’m choosing to use CUDA 8.0 with cuDNN 5.1 as my setup reference. Please be aware that if you install CUDA via the Debian package with apt-get, you may get an older, incompatible version of the NVIDIA driver, which will make you tear your hair out. (Note: as of November 2017, Nvidia fixed this & included driver 384 in the Debian package after I complained loudly. Thank you.) The real way to do this is to download the CUDA 8.0 Linux runfile package from NVIDIA here, and move it to the desktop. You might need to tweak the permissions on the file to allow it to run on your computer, like I did. Open terminal and type:
cd Desktop
sudo chmod +x ~/Desktop/cuda_8.0.61_375.26_linux.run   (exact filename may be different)
sudo sh ~/Desktop/cuda_8.0.61_375.26_linux.run
And the shell script will take you through the install. Note that it will ask you if you want to install a driver. Answer no. The rest I answered yes to. CUDA 8.0 installs into the directory /usr/local/cuda-8.0.
You want to set your paths up now for CUDA and cuDNN. That means setting the PATH and LD_LIBRARY_PATH environment variables so that programs can find CUDA & cuDNN:
export CUDA_HOME=/usr/local/cuda
export PATH=/usr/local/cuda-8.0/bin${PATH:+:${PATH}}
export LD_LIBRARY_PATH=/usr/local/cuda-8.0/lib64${LD_LIBRARY_PATH:+:${LD_LIBRARY_PATH}}
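If you want to make sure the current session actually picked those up, here is a quick sanity-check sketch of my own (not part of NVIDIA's instructions) you can run with python3; it assumes the default CUDA 8.0 install location:
import os
# Sanity check (my own sketch): confirm the CUDA environment variables point at a real install
print("CUDA_HOME =", os.environ.get("CUDA_HOME"))
print("nvcc present:", os.path.isfile("/usr/local/cuda-8.0/bin/nvcc"))
print("lib64 on LD_LIBRARY_PATH:", "/usr/local/cuda-8.0/lib64" in os.environ.get("LD_LIBRARY_PATH", ""))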
cuDNN generally requires you to sign up with NVIDIA as a developer (amusing for us MD’s to consider ourselves ‘devs’) … or you can just borrow it from the nice folks at fast.ai by using the following script.
This will load cuDNN 5.1 from fast.ai’s web servers. Installing cuDNN is just copying two files into the CUDA directory. Some people on the web have suggested it’s OK to install different cuDNN versions, particularly with higher CUDA versions, as they each live in their own directories – I have no idea, haven’t tried it yet.
wget "http://files.fast.ai/files/cudnn.tgz" -O "cudnn.tgz"
tar -zxf cudnn.tgz
sudo cp -P cuda/lib64/libcudnn* /usr/local/cuda/lib64
sudo cp -P cuda/include/cudnn.h /usr/local/cuda/include
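To double-check the copy landed where CUDA expects it, here is a small sketch of mine (not from fast.ai) that reads the version defines out of cudnn.h; it assumes the /usr/local/cuda/include path used above:
import re
# Read the cuDNN version #defines out of the header we just copied (my own check)
header = open("/usr/local/cuda/include/cudnn.h").read()
version = [re.search(r"#define CUDNN_%s\s+(\d+)" % name, header).group(1)
           for name in ("MAJOR", "MINOR", "PATCHLEVEL")]
print("cuDNN version:", ".".join(version))   # for this setup it should report 5.1.x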
To make sure this setup works more than once (i.e., persists across logins), you will need to edit your ~/.profile file to include the PATH settings above.
Open up Terminal, and type gedit ~/.profile. Now enter the following lines and exit with a save:
export CUDA_HOME=/usr/local/cuda
export PATH=/usr/local/cuda-8.0/bin${PATH:+:${PATH}}
export LD_LIBRARY_PATH=/usr/local/cuda-8.0/lib64${LD_LIBRARY_PATH:+:${LD_LIBRARY_PATH}}
Now install Tensorflow & Keras. Out of concern for reproducibility of research and getting the system actually running, I opted to go with Tensorflow 1.2.1, as it is the last version that worked with cuDNN 5.1 and CUDA 8.0. Newer versions of Tensorflow require higher cuDNN/CUDA versions – I’m letting the AI/ML community test these out a bit more before I migrate forward. Lastly, I’m using Python 3.5.
Google suggests one should use virtual environments, Anaconda, or Docker… When I tried the virtualenvs, I only got the CPU-only Tensorflow version working with them. I really wanted to like Anaconda, because of its intelligent package design and its included editors, but I just couldn’t reliably get it, jupyter, and PyCharm working together (I did get its IDE, Spyder, working). Docker was very difficult and awkward to work with, and I again couldn’t get it working in Windows 7 Pro. I preliminarily tried these packages in Jan 2016, and I know there has been a lot of development since then, so they are probably better than when I tried and failed initially. Since the purpose of this build is to run GPU Tensorflow, I’m just going to own it and avoid all this virtualization, etc. I haven’t had any dependency problems yet.
Tensorflow has good installation instructions for the newest version. I installed Tensorflow 1.2.1 GPU as it was the latest version compatible with CUDA 8 and cuDNN 5.1. Compared to everything we have been through, Keras, jupyter, and pytorch are a snap to install. While Keras can use either Theano or Tensorflow as its backend engine, the Theano devs said Theano will sunset at 1.0, so there is no reason to use it over TF. You should install some other packages as well:
sudo apt-get install python3-numpy python3-dev python3-pip python3-wheel
sudo apt-get install libcupti-dev
sudo pip3 install scipy scikit-learn pillow h5py matplotlib
sudo pip3 install --ignore-installed --upgrade https://storage.googleapis.com/tensorflow/linux/gpu/tensorflow_gpu-1.2.1-cp35-cp35m-linux_x86_64.whl
sudo pip3 install keras
sudo pip3 install jupyter
sudo pip3 install http://download.pytorch.org/whl/cu80/torch-0.2.0.post3-cp35-cp35m-manylinux1_x86_64.whl
sudo pip3 install torchvision
And you should be ready to go. Other dependencies might be necessary depending on your flavor of linux/ubuntu, but you should be prompted. In general, probably no harm adding packages if asked.
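Before firing up jupyter, a quick sanity check of my own from a python3 prompt doesn’t hurt; it just confirms the GPU builds imported cleanly and reports versions:
# Post-install sanity check (my addition): versions and GPU visibility
import tensorflow as tf
import keras
import torch
print("Tensorflow:", tf.__version__)        # expecting 1.2.1
print("Keras:", keras.__version__)
print("pytorch:", torch.__version__)         # expecting a 0.2.0 build
print("CUDA visible to pytorch:", torch.cuda.is_available())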
Open terminal, cd Desktop, and run jupyter notebook. You can now test Tensorflow by running this short script in a new notebook:
import tensorflow as tf
hello = tf.constant('Hello, TensorFlow!')
sess = tf.Session()
print(sess.run(hello))
which should output: b'Hello, TensorFlow!' (the b just means Python 3 is showing a byte string). But that’s boring.
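Slightly less boring, and another sketch of my own: you can ask Tensorflow to log device placement and confirm the math actually lands on the 1080 Ti rather than the CPU:
import tensorflow as tf
# Pin a small op to the GPU and log where Tensorflow actually places it (my own check)
with tf.device('/gpu:0'):
    a = tf.constant([[1.0, 2.0], [3.0, 4.0]])
    b = tf.constant([[1.0, 0.0], [0.0, 1.0]])
    c = tf.matmul(a, b)
sess = tf.Session(config=tf.ConfigProto(log_device_placement=True))
print(sess.run(c))   # the placement log (in the terminal you launched jupyter from) should show MatMul on gpu:0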
It is a rite of passage for machine learners to run MNIST, so the tensorflow package includes a few of them. The first you can run is a simple MNIST classifier, so fire up jupyter, head over to the MNIST for ML Beginners TF page, and start copying & pasting.
That sort of works, but doesn’t give you the visceral satisfaction of seeing training epochs scroll by on your screen, so head over to Deep MNIST for Experts next and copy, paste & run it. Experience deep learning joy, at last.
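If you would rather run something self-contained than copy/paste from the tutorial pages, here is a minimal Keras MNIST sketch of my own (not the TF tutorial network); with the GPU working, the epochs fly by:
from keras.datasets import mnist
from keras.models import Sequential
from keras.layers import Dense
from keras.utils import to_categorical

# Tiny fully-connected MNIST model (my sketch, not the TF tutorial code)
(x_train, y_train), (x_test, y_test) = mnist.load_data()
x_train = x_train.reshape(60000, 784).astype('float32') / 255
x_test = x_test.reshape(10000, 784).astype('float32') / 255
y_train, y_test = to_categorical(y_train), to_categorical(y_test)

model = Sequential([Dense(512, activation='relu', input_shape=(784,)),
                    Dense(10, activation='softmax')])
model.compile(optimizer='rmsprop', loss='categorical_crossentropy', metrics=['accuracy'])
model.fit(x_train, y_train, epochs=5, batch_size=128, validation_data=(x_test, y_test))
print("test loss, accuracy:", model.evaluate(x_test, y_test))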
And finally, so you can justify spending $4K and lots of time on this project, run Deep Dream in Keras so you feel like you accomplished something. See? Pretty!
Lastly, if you want, you can download the CIFAR-10 dataset and use the VGG variant I used to put the machine through its paces in the testing-our-build blogpost.
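That link has the real thing; purely to illustrate the general shape of such a network (conv-conv-pool VGG-style blocks), here is a rough sketch of my own on CIFAR-10. It is not the exact variant from the testing post:
from keras.datasets import cifar10
from keras.models import Sequential
from keras.layers import Conv2D, MaxPooling2D, Flatten, Dense, Dropout
from keras.utils import to_categorical

# Illustrative VGG-style stack on CIFAR-10 (not the variant from the testing post)
(x_train, y_train), (x_test, y_test) = cifar10.load_data()
x_train, x_test = x_train.astype('float32') / 255, x_test.astype('float32') / 255
y_train, y_test = to_categorical(y_train, 10), to_categorical(y_test, 10)

model = Sequential([
    Conv2D(64, (3, 3), activation='relu', padding='same', input_shape=(32, 32, 3)),
    Conv2D(64, (3, 3), activation='relu', padding='same'),
    MaxPooling2D((2, 2)),
    Conv2D(128, (3, 3), activation='relu', padding='same'),
    Conv2D(128, (3, 3), activation='relu', padding='same'),
    MaxPooling2D((2, 2)),
    Flatten(),
    Dense(256, activation='relu'),
    Dropout(0.5),
    Dense(10, activation='softmax')])
model.compile(optimizer='adam', loss='categorical_crossentropy', metrics=['accuracy'])
model.fit(x_train, y_train, epochs=10, batch_size=128, validation_data=(x_test, y_test))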
P.S. – Now that you have a dual-boot system, you can edit the GRUB bootloader so that it is a bit more user-friendly (I always look away, and it reboots into whatever the default is). There is a neat little GUI program called GRUB Customizer that can be used for this purpose, and I have:
sudo add-apt-repository ppa:danielrichter2007/grub-customizer
sudo apt-get update
sudo apt-get install grub-customizer
Then either type grub-customizer in a terminal or launch it from the software hub. Set it up how you like. Enjoy!
NOTE: There is an awful lot of information in these four blogposts, which I wrote about 1-2 months after the build. I took notes and have checked websites to provide what I think is some pretty accurate information. But given the scope and depth of it all, it’s possible that I might have gotten one or two things slightly off. Maybe not. If you see a definite error, please message me and I will consider correcting and updating. Thanks for reading, and if you decide to build this system, I wish you all the best of luck!