Install GPU TensorFlow on AWS Ubuntu 16.04

 TensorFlow™ is an open source software library for numerical computation using data flow graphs. Nodes in the graph represent mathematical operations, while the graph edges represent the multidimensional data arrays (tensors) communicated between them.

On a typical system, there are multiple computing devices. In TensorFlow, the supported device types are CPU and GPU. For highly parallel workloads, GPUs can deliver 10 to 100 times the throughput of traditional CPUs, which is one of the main reasons graphics cards now power some of the most advanced neural networks used in deep learning.

Setting up the environment is often the hardest part of getting started with deep learning, so hopefully you will find this step-by-step guide helpful.

Launch a GPU-enabled Ubuntu 16.04 AWS instance

Choose an Amazon Machine Image (AMI) – Ubuntu Server 16.04 LTS

Choose an instance type

The smallest GPU-enabled machine is p2.xlarge

You can find more details here.

Work through the remaining wizard steps: Configure Instance Details, Add Storage (choose the storage size), Add Tags, Configure Security Group, then Review Instance Launch, and click Launch.

Open the terminal on your local machine and connect to the remote machine (ssh -i)
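The connection command can be sketched as follows. The key file name and host name are hypothetical placeholders, and the sketch assumes the default `ubuntu` login user of the official Ubuntu AMIs.

```python
import shlex


def ssh_command(key_path, host, user="ubuntu"):
    """Build the argument list for `ssh -i <key> <user>@<host>`."""
    return ["ssh", "-i", key_path, "{}@{}".format(user, host)]


if __name__ == "__main__":
    # Hypothetical key file and public DNS name: substitute your own.
    cmd = ssh_command("my-key.pem", "ec2-203-0-113-10.compute-1.amazonaws.com")
    # Print a copy-pasteable command line.
    print(" ".join(shlex.quote(part) for part in cmd))
```

The printed line can be pasted into the local terminal. The same `-i` key option applies to `scp` when copying files to the instance later on.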

Update the package lists so apt knows about available upgrades and about packages newly added to the repositories

sudo apt-get --assume-yes update

Install the newer versions of the packages

sudo apt-get --assume-yes upgrade

Install the CUDA 8 drivers

CUDA is a parallel computing platform and application programming interface (API) model created by Nvidia. GPU-accelerated CUDA libraries enable drop-in acceleration across multiple domains such as linear algebra, image and video processing, deep learning and graph analytics.

Verify that you have a CUDA-Capable GPU

lspci | grep -i nvidia
00:1e.0 3D controller: NVIDIA Corporation GK210GL [Tesla K80] (rev a1)
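If you want to script this check, the filter can be reproduced in a few lines of Python. This is a minimal sketch that assumes `lspci` is on the PATH; the function names are illustrative only.

```python
import subprocess


def nvidia_devices(lspci_output):
    """Return the lspci lines that mention NVIDIA (case-insensitive)."""
    return [line.strip() for line in lspci_output.splitlines()
            if "nvidia" in line.lower()]


def read_lspci():
    """Run lspci and return its text output, or "" if it cannot be run."""
    try:
        return subprocess.check_output(["lspci"]).decode("utf-8", "replace")
    except (OSError, subprocess.CalledProcessError):
        return ""


if __name__ == "__main__":
    found = nvidia_devices(read_lspci())
    print(found if found else "No NVIDIA device reported by lspci")
```

An empty result on a p2 instance usually means the instance type is not GPU-enabled, so it is worth re-checking the launch settings before going further.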

Verify You Have a Supported Version of Linux

uname -m && cat /etc/*release

x86_64
DISTRIB_ID=Ubuntu
…..

The x86_64 line indicates you are running on a 64-bit system. The remainder gives information about your distribution.
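The same check can be automated. The sketch below assumes the release information comes from `/etc/lsb-release`, which on Ubuntu holds `DISTRIB_ID` and `DISTRIB_RELEASE` entries; the helper names are hypothetical.

```python
import platform


def parse_release(text):
    """Parse KEY=value lines (as found in /etc/*release) into a dict."""
    info = {}
    for line in text.splitlines():
        key, sep, value = line.partition("=")
        if sep:
            info[key.strip()] = value.strip().strip('"')
    return info


def is_supported(machine, release):
    """True for 64-bit x86 Ubuntu 16.04, the platform this guide targets."""
    return (machine == "x86_64"
            and release.get("DISTRIB_ID") == "Ubuntu"
            and release.get("DISTRIB_RELEASE") == "16.04")


if __name__ == "__main__":
    try:
        with open("/etc/lsb-release") as fh:
            release = parse_release(fh.read())
    except IOError:
        release = {}
    # platform.machine() reports the same architecture string as `uname -m`.
    print(is_supported(platform.machine(), release))
```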

 Verify the System Has gcc Installed

gcc --version

If the output is “The program ‘gcc’ is currently not installed. You can install it by typing: sudo apt install gcc”, install gcc and check the version again:

sudo apt-get install gcc

gcc --version

gcc (Ubuntu 5.4.0-6ubuntu1~16.04.5) 5.4.0 20160609

….
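For scripted setups, the version can be pulled out of that first line. This is a rough sketch that assumes the version number follows the parenthesised distribution string, as in the output above; other gcc builds may format the line differently.

```python
import re
import subprocess


def parse_gcc_version(first_line):
    """Extract (major, minor, patch) from the first line of `gcc --version`."""
    match = re.search(r"\)\s+(\d+)\.(\d+)\.(\d+)", first_line)
    if match is None:
        return None
    return tuple(int(part) for part in match.groups())


def installed_gcc_version():
    """Return the installed gcc version tuple, or None if gcc is unavailable."""
    try:
        out = subprocess.check_output(["gcc", "--version"])
    except (OSError, subprocess.CalledProcessError):
        return None
    lines = out.decode("utf-8", "replace").splitlines()
    return parse_gcc_version(lines[0]) if lines else None


if __name__ == "__main__":
    print(installed_gcc_version())
```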

Verify the System has the Correct Kernel Headers and Development Packages Installed

uname -r

4.4.0-1038-aws
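The kernel release string determines which headers package matches the running kernel. On Ubuntu that package is conventionally named `linux-headers-<release>`; the sketch below builds the corresponding install command under that assumption, with illustrative function names.

```python
import platform


def headers_package(kernel_release):
    """Name of the Ubuntu headers package for a given kernel release."""
    return "linux-headers-{}".format(kernel_release)


def install_command(kernel_release):
    """Argument list for installing the matching kernel headers."""
    return ["sudo", "apt-get", "install", headers_package(kernel_release)]


if __name__ == "__main__":
    # platform.release() reports the same string as `uname -r`.
    print(" ".join(install_command(platform.release())))
```

For the kernel shown above this yields the package name linux-headers-4.4.0-1038-aws.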

CUDA support

Download the CUDA-8 driver (CUDA 9 is not yet supported by TensorFlow 1.4)

The driver can be downloaded from here:

Alternatively, download it directly to the remote machine:

wget -O ./cuda-repo-ubuntu1604-8-0-local-ga2_8.0.61-1_amd64.deb https://developer.nvidia.com/compute/cuda/8.0/Prod2/local_installers/cuda-repo-ubuntu1604-8-0-local-ga2_8.0.61-1_amd64-deb

Download patch 2 as well:

wget -O ./cuda-repo-ubuntu1604-8-0-local-cublas-performance-update_8.0.61-1_amd64.deb https://developer.nvidia.com/compute/cuda/8.0/Prod2/patches/2/cuda-repo-ubuntu1604-8-0-local-cublas-performance-update_8.0.61-1_amd64-deb

Install the CUDA 8 driver and patch 2

Install the downloaded .deb files with dpkg. Each one registers a local apt repository under /var rather than installing CUDA itself

sudo dpkg -i cuda-repo-ubuntu1604-8-0-local-ga2_8.0.61-1_amd64.deb

sudo dpkg -i cuda-repo-ubuntu1604-8-0-local-cublas-performance-update_8.0.61-1_amd64.deb

apt-key is used to manage the list of keys used by apt to authenticate packages. Packages which have been authenticated using these keys will be considered trusted.

sudo apt-key add /var/cuda-repo-8-0-local-ga2/7fa2af80.pub
sudo apt-key add /var/cuda-repo-8-0-local-cublas-performance-update/7fa2af80.pub

sudo apt-get update

Install the cuda package from the local repositories registered above

sudo apt-get install cuda

Once completed (~10 min), reboot the system to load the NVIDIA drivers.

sudo shutdown -r now
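Once the instance is back up, it can be worth confirming that the driver actually loaded before moving on. The sketch below queries `nvidia-smi`, which is installed alongside the NVIDIA driver, and parses its CSV output. The helper names are hypothetical, and an empty list simply means the tool could not be run.

```python
import subprocess

# Ask nvidia-smi for one CSV row per GPU: "<name>, <driver version>".
QUERY = ["nvidia-smi", "--query-gpu=name,driver_version",
         "--format=csv,noheader"]


def parse_gpu_rows(csv_text):
    """Turn `name, driver_version` CSV rows into a list of dicts."""
    rows = []
    for line in csv_text.splitlines():
        fields = [field.strip() for field in line.split(",")]
        if len(fields) == 2 and all(fields):
            rows.append({"name": fields[0], "driver_version": fields[1]})
    return rows


def query_gpus():
    """Return the GPUs reported by nvidia-smi, or [] if it is unavailable."""
    try:
        out = subprocess.check_output(QUERY)
    except (OSError, subprocess.CalledProcessError):
        return []
    return parse_gpu_rows(out.decode("utf-8", "replace"))


if __name__ == "__main__":
    print(query_gpus())
```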

Install cuDNN v6.0

The NVIDIA CUDA® Deep Neural Network library (cuDNN) is a GPU-accelerated library of primitives for deep neural networks. cuDNN provides highly tuned implementations for standard routines such as forward and backward convolution, pooling, normalization, and activation layers.

Download the cuDNN v6.0 library

The library can be downloaded from here. Please note that you will need to register first.

Copy the archive to the AWS machine (scp -r -i)

Extract the cuDNN files and copy them to the target directory

tar xvzf cudnn-8.0-linux-x64-v6.0.tgz  

sudo cp -P cuda/include/cudnn.h /usr/local/cuda/include

sudo cp -P cuda/lib64/libcudnn* /usr/local/cuda/lib64

sudo chmod a+r /usr/local/cuda/include/cudnn.h /usr/local/cuda/lib64/libcudnn*
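To confirm the files landed in the right place, you can read the version back from the copied header. This sketch assumes `cudnn.h` defines the `CUDNN_MAJOR`, `CUDNN_MINOR` and `CUDNN_PATCHLEVEL` macros, as the cuDNN 6 header does; the function name is illustrative.

```python
import re


def cudnn_version(header_text):
    """Read the CUDNN_MAJOR/MINOR/PATCHLEVEL #defines from cudnn.h text."""
    parts = []
    for name in ("CUDNN_MAJOR", "CUDNN_MINOR", "CUDNN_PATCHLEVEL"):
        pattern = r"^\s*#define\s+{}\s+(\d+)".format(name)
        match = re.search(pattern, header_text, re.MULTILINE)
        if match is None:
            return None
        parts.append(int(match.group(1)))
    return tuple(parts)


if __name__ == "__main__":
    try:
        with open("/usr/local/cuda/include/cudnn.h") as fh:
            print(cudnn_version(fh.read()))
    except IOError:
        print("cudnn.h not found under /usr/local/cuda/include")
```

A major version of 6 is what this guide expects.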

Update your .bashrc file

nano ~/.bashrc

Add the following lines to the end of the file:

export CUDA_HOME=/usr/local/cuda

export LD_LIBRARY_PATH=${CUDA_HOME}/lib64:$LD_LIBRARY_PATH

export PATH=${CUDA_HOME}/bin:${PATH}

Save the file and exit, then run source ~/.bashrc (or open a new shell) so that the changes take effect.
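A quick way to confirm the variables are visible to new processes is to inspect the environment from Python. This is a minimal sketch that assumes the `/usr/local/cuda` location used above; `check_cuda_env` is a hypothetical helper, not part of any library.

```python
import os


def check_cuda_env(env, cuda_home="/usr/local/cuda"):
    """Return a list of problems with the CUDA-related environment variables."""
    problems = []
    if env.get("CUDA_HOME") != cuda_home:
        problems.append("CUDA_HOME is not set to {}".format(cuda_home))
    if cuda_home + "/lib64" not in env.get("LD_LIBRARY_PATH", "").split(":"):
        problems.append("LD_LIBRARY_PATH does not include {}/lib64".format(cuda_home))
    if cuda_home + "/bin" not in env.get("PATH", "").split(":"):
        problems.append("PATH does not include {}/bin".format(cuda_home))
    return problems


if __name__ == "__main__":
    issues = check_cuda_env(os.environ)
    for issue in issues:
        print(issue)
    if not issues:
        print("CUDA environment variables look fine")
```

If any problem is reported, the shell that launched Python most likely has not picked up the edited ~/.bashrc yet.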

Install TensorFlow

Install the libcupti-dev library

The libcupti-dev library is the NVIDIA CUDA Profiling Tools Interface (CUPTI). This library provides advanced profiling support. To install this library, issue the following command:

sudo apt-get install libcupti-dev

Install pip

Pip is a package management system used to install and manage software packages written in Python, such as those published on the Python Package Index (PyPI).

sudo apt-get install python-pip

sudo pip install --upgrade pip

Install the tensorflow-gpu package

sudo pip install tensorflow-gpu

Test the installation

Run the following in the Python interpreter:

from tensorflow.python.client import device_lib

def get_available_gpus():
    # list_local_devices() returns every device TensorFlow can see (CPU and GPU)
    local_device_protos = device_lib.list_local_devices()
    # Keep only the names of the GPU devices
    return [x.name for x in local_device_protos if x.device_type == 'GPU']

get_available_gpus()

The output should look similar to this:

2017-11-22 03:18:15.187419: I tensorflow/core/platform/cpu_feature_guard.cc:137] Your CPU supports instructions that this TensorFlow binary was not compiled to use: SSE4.1 SSE4.2 AVX AVX2 FMA

2017-11-22 03:18:17.986516: I tensorflow/stream_executor/cuda/cuda_gpu_executor.cc:892] successful NUMA node read from SysFS had negative value (-1), but there must be at least one NUMA node, so returning NUMA node zero

2017-11-22 03:18:17.986867: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1030] Found device 0 with properties:

name: Tesla K80 major: 3 minor: 7 memoryClockRate(GHz): 0.8235

pciBusID: 0000:00:1e.0

totalMemory: 11.17GiB freeMemory: 11.10GiB

2017-11-22 03:18:17.986896: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1120] Creating TensorFlow device (/device:GPU:0) -> (device: 0, name: Tesla K80, pci bus id: 0000:00:1e.0, compute capability: 3.7)

[u'/device:GPU:0']