TensorFlow™ is an open source software library for numerical computation using data flow graphs. Nodes in the graph represent mathematical operations, while the graph edges represent the multidimensional data arrays (tensors) communicated between them.
On a typical system, there are multiple computing devices. In TensorFlow, the supported device types are CPU and GPU. For the highly parallel arithmetic that dominates neural network training, GPUs can offer roughly 10 to 100 times the computational power of traditional CPUs, which is one of the main reasons why graphics cards are currently being used to power some of the most advanced deep learning networks.
Setting up the environment is often the hardest part of getting started with deep learning, so hopefully you will find this step-by-step guide helpful.
Launch a GPU-enabled Ubuntu 16.04 AWS instance
Choose an Amazon Machine Image (AMI) – Ubuntu Server 16.04 LTS

Choose an instance type
The smallest GPU-enabled machine is p2.xlarge

You can find more details here.
Work through the remaining wizard steps: Configure Instance Details, Add Storage (choose the storage size), Add Tags, Configure Security Group and Review Instance Launch, then click Launch.

Open the terminal on your local machine and connect to the remote machine (ssh -i)
Refresh the package lists so that apt knows about available upgrades and about packages newly added to the repositories
sudo apt-get --assume-yes update
Install the newer versions of the packages
sudo apt-get --assume-yes upgrade
Install the CUDA 8 drivers
CUDA is a parallel computing platform and application programming interface (API) model created by Nvidia. GPU-accelerated CUDA libraries enable drop-in acceleration across multiple domains such as linear algebra, image and video processing, deep learning and graph analytics.
Verify that you have a CUDA-Capable GPU
lspci | grep -i nvidia
00:1e.0 3D controller: NVIDIA Corporation GK210GL [Tesla K80] (rev a1)
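If you want to script this check rather than read the output by eye, the filtering step can be sketched in Python 3. This is only an illustration: the function name find_nvidia_devices is hypothetical, and the second sample line is made-up filler to show that non-NVIDIA devices are ignored.

```python
def find_nvidia_devices(lspci_output):
    """Return the lspci lines that mention NVIDIA (case-insensitive, like grep -i)."""
    return [line.strip() for line in lspci_output.splitlines()
            if "nvidia" in line.lower()]


# First line copied from the output above; the second line is invented filler.
sample = (
    "00:1e.0 3D controller: NVIDIA Corporation GK210GL [Tesla K80] (rev a1)\n"
    "00:02.0 VGA compatible controller: Example Vendor Example Device\n"
)

devices = find_nvidia_devices(sample)
print(devices if devices else "No NVIDIA device found")
```

On the instance itself, the text passed to the function would be the captured output of lspci instead of the sample string.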
Verify You Have a Supported Version of Linux
uname -m && cat /etc/*release
x86_64
DISTRIB_ID=Ubuntu
…..
The x86_64 line indicates you are running on a 64-bit system. The remainder gives information about your distribution.
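The same two facts (architecture and distribution) can be read from Python 3, which may be handy in a provisioning script. The sketch below is an assumption-laden illustration: parse_release and is_64bit_x86 are hypothetical helper names, and the DISTRIB_RELEASE line in the sample is assumed rather than taken from the truncated output above.

```python
import platform


def parse_release(text):
    """Parse KEY=value lines (as printed by cat /etc/*release) into a dict."""
    info = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, _, value = line.partition("=")
        info[key] = value.strip('"')
    return info


def is_64bit_x86(machine=None):
    """True when the machine string (default: this host) is x86_64."""
    return (machine or platform.machine()) == "x86_64"


# DISTRIB_ID comes from the output above; DISTRIB_RELEASE is an assumed sample line.
sample = "DISTRIB_ID=Ubuntu\nDISTRIB_RELEASE=16.04\n"
release = parse_release(sample)
print(release["DISTRIB_ID"], release["DISTRIB_RELEASE"])
print("64-bit x86 host:", is_64bit_x86())
```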
Verify the System Has gcc Installed
gcc --version
If the output is "The program 'gcc' is currently not installed. You can install it by typing: sudo apt install gcc", install gcc and run the check again:
sudo apt-get install gcc
gcc --version
gcc (Ubuntu 5.4.0-6ubuntu1~16.04.5) 5.4.0 20160609
….
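If you need the compiler version as numbers, for example to compare it against the versions a given CUDA release lists as supported, the first line of the output can be parsed as follows. This is a minimal Python 3 sketch; parse_gcc_version is a hypothetical helper name and the regular expression assumes the "gcc (vendor tag) X.Y.Z" layout shown above.

```python
import re


def parse_gcc_version(first_line):
    """Extract (major, minor, patch) from the first line of gcc --version."""
    # The version is the first dotted triple after the parenthesised vendor tag.
    match = re.search(r"\)\s+(\d+)\.(\d+)\.(\d+)", first_line)
    if match is None:
        return None
    return tuple(int(part) for part in match.groups())


line = "gcc (Ubuntu 5.4.0-6ubuntu1~16.04.5) 5.4.0 20160609"
print(parse_gcc_version(line))  # (5, 4, 0)
```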
Verify the System has the Correct Kernel Headers and Development Packages Installed
uname -r
4.4.0-1038-aws

Download the CUDA-8 driver (CUDA 9 is not yet supported by TensorFlow 1.4)
The driver can be downloaded from here:


Or, downloaded directly to the remote machine:
wget -O ./cuda-repo-ubuntu1604-8-0-local-ga2_8.0.61-1_amd64.deb https://developer.nvidia.com/compute/cuda/8.0/Prod2/local_installers/cuda-repo-ubuntu1604-8-0-local-ga2_8.0.61-1_amd64-deb
Downloading patch 2 as well:
wget -O ./cuda-repo-ubuntu1604-8-0-local-cublas-performance-update_8.0.61-1_amd64.deb https://developer.nvidia.com/compute/cuda/8.0/Prod2/patches/2/cuda-repo-ubuntu1604-8-0-local-cublas-performance-update_8.0.61-1_amd64-deb
Install the CUDA 8 driver and patch 2
Extract, analyse, unpack and install the downloaded .deb files
sudo dpkg -i cuda-repo-ubuntu1604-8-0-local-ga2_8.0.61-1_amd64.deb
sudo dpkg -i cuda-repo-ubuntu1604-8-0-local-cublas-performance-update_8.0.61-1_amd64.deb
apt-key is used to manage the list of keys used by apt to authenticate packages. Packages which have been authenticated using these keys will be considered trusted.
sudo apt-key add /var/cuda-repo-8-0-local-ga2/7fa2af80.pub
sudo apt-key add /var/cuda-repo-8-0-local-cublas-performance-update/7fa2af80.pub
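sudo apt-get update
Install the CUDA packages from the local repositories that were just added
sudo apt-get install cuda
Once completed (~10 min), reboot the system to load the NVIDIA drivers.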
sudo shutdown -r now
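After the reboot, one way to confirm that the driver is loaded is to query nvidia-smi, the utility that ships with the NVIDIA driver. The Python 3 sketch below is an optional extra and not part of the original steps: the helper names are hypothetical, and the driver version in the sample row is a placeholder, not a value taken from this guide.

```python
import shutil
import subprocess

QUERY = ["nvidia-smi", "--query-gpu=name,driver_version", "--format=csv,noheader"]


def parse_gpu_rows(csv_text):
    """Turn 'name, driver_version' CSV rows into a list of (name, driver) tuples."""
    rows = []
    for line in csv_text.splitlines():
        if not line.strip():
            continue
        name, _, driver = line.rpartition(",")
        rows.append((name.strip(), driver.strip()))
    return rows


def query_gpus():
    """Return GPU rows, or an empty list if nvidia-smi is missing or fails."""
    if shutil.which("nvidia-smi") is None:
        return []
    result = subprocess.run(QUERY, stdout=subprocess.PIPE,
                            stderr=subprocess.PIPE, universal_newlines=True)
    if result.returncode != 0:
        return []
    return parse_gpu_rows(result.stdout)


# Placeholder row to show the parsed shape; the version number is made up.
print(parse_gpu_rows("Tesla K80, 000.00\n"))
print(query_gpus() or "nvidia-smi not found or no GPU reported")
```

An empty result after the reboot would suggest that the driver did not load and that the installation steps above are worth rechecking.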
Install cuDNN v6.0
The NVIDIA CUDA® Deep Neural Network library (cuDNN) is a GPU-accelerated library of primitives for deep neural networks. cuDNN provides highly tuned implementations for standard routines such as forward and backward convolution, pooling, normalization, and activation layers.
Download the cuDNN v6.0 library
The library can be downloaded from here; please note that you will need to register first.

Copy the downloaded archive to the AWS machine (scp -r -i)
Extract the cuDNN files and copy them to the target directory
tar xvzf cudnn-8.0-linux-x64-v6.0.tgz
sudo cp -P cuda/include/cudnn.h /usr/local/cuda/include
sudo cp -P cuda/lib64/libcudnn* /usr/local/cuda/lib64
sudo chmod a+r /usr/local/cuda/include/cudnn.h /usr/local/cuda/lib64/libcudnn*
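To double-check which cuDNN version ended up in /usr/local/cuda/include, you can read the version macros that cudnn.h defines. The following Python 3 sketch assumes the header contains CUDNN_MAJOR, CUDNN_MINOR and CUDNN_PATCHLEVEL #define lines; parse_cudnn_version is a hypothetical helper name and the patch level in the sample text is only an example value.

```python
import re


def parse_cudnn_version(header_text):
    """Read the CUDNN_MAJOR/MINOR/PATCHLEVEL #defines from the text of cudnn.h."""
    parts = []
    for macro in ("CUDNN_MAJOR", "CUDNN_MINOR", "CUDNN_PATCHLEVEL"):
        match = re.search(r"#define\s+" + macro + r"\s+(\d+)", header_text)
        if match is None:
            return None
        parts.append(int(match.group(1)))
    return tuple(parts)


# Sample header text; on the instance, read /usr/local/cuda/include/cudnn.h instead.
sample_header = (
    "#define CUDNN_MAJOR      6\n"
    "#define CUDNN_MINOR      0\n"
    "#define CUDNN_PATCHLEVEL 21\n"
)
print(parse_cudnn_version(sample_header))  # (6, 0, 21)
```

For this guide the first two numbers should be 6 and 0, matching the cuDNN v6.0 archive extracted above.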
Update your bash file
nano ~/.bashrc
Add the following lines to the end of the bash file:
export CUDA_HOME=/usr/local/cuda
export LD_LIBRARY_PATH=${CUDA_HOME}/lib64:$LD_LIBRARY_PATH
export PATH=${CUDA_HOME}/bin:${PATH}

Save the file and exit, then run source ~/.bashrc (or open a new terminal session) so that the new variables take effect.
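A quick way to confirm that the three variables are visible to new processes is to inspect the environment from Python 3. This is a minimal sketch under the assumption that CUDA lives in /usr/local/cuda as in the lines above; check_cuda_env is a hypothetical helper name.

```python
import os


def check_cuda_env(env, cuda_home="/usr/local/cuda"):
    """Return a list of human-readable problems; an empty list means all is well."""
    problems = []
    if env.get("CUDA_HOME") != cuda_home:
        problems.append("CUDA_HOME is not set to " + cuda_home)
    if cuda_home + "/lib64" not in env.get("LD_LIBRARY_PATH", "").split(":"):
        problems.append("LD_LIBRARY_PATH is missing " + cuda_home + "/lib64")
    if cuda_home + "/bin" not in env.get("PATH", "").split(":"):
        problems.append("PATH is missing " + cuda_home + "/bin")
    return problems


# Check the environment of the current process and report anything missing.
found = check_cuda_env(os.environ)
for problem in found:
    print(problem)
if not found:
    print("CUDA environment variables look fine")
```

If anything is reported here from a fresh shell, the export lines were probably not picked up from ~/.bashrc.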
Install TensorFlow
Install the libcupti-dev library
The libcupti-dev library is the NVIDIA CUDA Profiling Tools Interface (CUPTI). This library provides advanced profiling support. To install this library, issue the following command:
sudo apt-get install libcupti-dev
Install pip
Pip is a package management system used to install and manage software packages written in Python which can be found in the Python Package Index (PyPI).
sudo apt-get install python-pip
sudo pip install --upgrade pip
Install TensorFlow
sudo pip install tensorflow-gpu
Test the installation
Run the following within the Python command line:
from tensorflow.python.client import device_lib

def get_available_gpus():
    # List every device TensorFlow can see and keep only the GPUs
    local_device_protos = device_lib.list_local_devices()
    return [x.name for x in local_device_protos if x.device_type == 'GPU']

get_available_gpus()
The output should look similar to this:
2017-11-22 03:18:15.187419: I tensorflow/core/platform/cpu_feature_guard.cc:137] Your CPU supports instructions that this TensorFlow binary was not compiled to use: SSE4.1 SSE4.2 AVX AVX2 FMA
2017-11-22 03:18:17.986516: I tensorflow/stream_executor/cuda/cuda_gpu_executor.cc:892] successful NUMA node read from SysFS had negative value (-1), but there must be at least one NUMA node, so returning NUMA node zero
2017-11-22 03:18:17.986867: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1030] Found device 0 with properties:
name: Tesla K80 major: 3 minor: 7 memoryClockRate(GHz): 0.8235
pciBusID: 0000:00:1e.0
totalMemory: 11.17GiB freeMemory: 11.10GiB
2017-11-22 03:18:17.986896: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1120] Creating TensorFlow device (/device:GPU:0) -> (device: 0, name: Tesla K80, pci bus id: 0000:00:1e.0, compute capability: 3.7)
[u'/device:GPU:0']
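If you capture this log in a setup script, the device name and compute capability can be pulled out of the "name: ... major: ... minor: ..." line. The Python 3 sketch below is illustrative only: parse_compute_capability is a hypothetical helper name, and the regular expression assumes the log layout shown above, which may differ between TensorFlow versions.

```python
import re


def parse_compute_capability(log_text):
    """Find 'name: <gpu> major: X minor: Y' in the log; return (name, 'X.Y')."""
    match = re.search(r"name:\s*(.+?)\s+major:\s*(\d+)\s+minor:\s*(\d+)", log_text)
    if match is None:
        return None
    name, major, minor = match.groups()
    return name, major + "." + minor


log_line = "name: Tesla K80 major: 3 minor: 7 memoryClockRate(GHz): 0.8235"
print(parse_compute_capability(log_line))  # ('Tesla K80', '3.7')
```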