Install Theano and CUDA Toolkit 7.5 on OSX
Following these steps will give you the following DNN environment with Python 2.7 on OS X 10.11.4:
CUDA Toolkit 7.5
The NVIDIA® CUDA® Toolkit provides a comprehensive development environment for C and C++ developers building GPU-accelerated applications. The CUDA Toolkit includes a compiler for NVIDIA GPUs, math libraries, and tools for debugging and optimizing the performance of your applications. You’ll also find programming guides, user manuals, API reference, and other documentation to help you get started quickly accelerating your application with GPUs.
cuDNN 5 Release Candidate
The NVIDIA CUDA® Deep Neural Network library (cuDNN) is a GPU-accelerated library of primitives for deep neural networks. cuDNN provides highly tuned implementations for standard routines such as forward and backward convolution, pooling, normalization, and activation layers. cuDNN is part of the NVIDIA Deep Learning SDK.
Theano
Theano is a Python library that allows you to define, optimize, and evaluate mathematical expressions involving multi-dimensional arrays efficiently. Theano features:
- tight integration with NumPy – Use numpy.ndarray in Theano-compiled functions.
- transparent use of a GPU – Perform data-intensive calculations up to 140x faster than with CPU (float32 only).
- efficient symbolic differentiation – Theano does your derivatives for functions with one or many inputs.
- speed and stability optimizations – Get the right answer for log(1+x) even when x is really tiny.
- dynamic C code generation – Evaluate expressions faster.
- extensive unit-testing and self-verification – Detect and diagnose many types of errors.
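The log(1+x) point in the list above can be seen without Theano at all. The following is a minimal sketch in plain Python using only the standard math module; it is not Theano code, and the function names are made up for illustration. It only shows the rounding problem that a stability optimization of this kind is meant to avoid.

```python
import math


def naive_log1p(x):
    # Computes log(1 + x) literally; 1 + x rounds to exactly 1.0 when x is tiny.
    return math.log(1.0 + x)


def stable_log1p(x):
    # math.log1p evaluates log(1 + x) without forming 1 + x explicitly.
    return math.log1p(x)


x = 1e-20
print("naive :", naive_log1p(x))
print("stable:", stable_log1p(x))
```

For x = 1e-20 the naive version loses the value entirely (it returns 0.0), while the stable version keeps a result close to x.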
Installation
1- Make sure pip is using the right version of Python
$ python --version
Python 2.7.11 :: Anaconda 2.5.0 (x86_64)
$ pip --version
pip 8.1.1 from /Users/RLO/anaconda/envs/py27/lib/python2.7/site-packages (python 2.7)
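If you prefer to check this from a script instead of by eye, the comparison could look roughly like the sketch below. It assumes that pip keeps printing its version in the "pip X from PATH (python Y)" shape shown above; the helper names parse_pip_version and pip_matches_interpreter are hypothetical, not part of pip.

```python
import re
import sys


def parse_pip_version(line):
    """Split a `pip --version` line into (pip version, location, python version)."""
    match = re.match(r"pip (\S+) from (.+) \(python (\d+\.\d+)\)", line.strip())
    if match is None:
        raise ValueError("unrecognized pip --version output: %r" % line)
    return match.groups()


def pip_matches_interpreter(line, version_info=sys.version_info):
    # Compare the Python version pip reports with the interpreter's major.minor.
    _, _, pip_python = parse_pip_version(line)
    return pip_python == "%d.%d" % (version_info[0], version_info[1])


# Sample line copied from the output shown in this step.
sample = ("pip 8.1.1 from /Users/RLO/anaconda/envs/py27/lib/python2.7/"
          "site-packages (python 2.7)")
print(parse_pip_version(sample))
print(pip_matches_interpreter(sample, version_info=(2, 7)))
```

Running pip as "python -m pip" is another way to make sure that pip belongs to the interpreter you are calling.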
2- Install nose:
$ pip install nose
2.1 – If you later see this error in the Theano tests:
ERROR: Failure: ImportError (No module named nose_parameterized)
you should install nose-parameterized, using the same pip as above:
$ pip install nose-parameterized
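To find out up front whether the test dependencies are importable, a small check along these lines may help. It is only a sketch: missing_modules is a hypothetical helper, and the two module names are the import names of the nose and nose-parameterized packages mentioned in this step.

```python
import importlib


def missing_modules(names):
    """Return the subset of `names` that cannot be imported."""
    missing = []
    for name in names:
        try:
            importlib.import_module(name)
        except ImportError:
            missing.append(name)
    return missing


# An empty list means both test dependencies are importable.
print(missing_modules(["nose", "nose_parameterized"]))
```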
3- Install the latest version of Theano:
$ pip install --upgrade --no-deps git+git://github.com/Theano/Theano.git
4- Install the GPU-related packages (a two-step process)
4.1 – Install CUDA Toolkit 7.5.
Download and install the CUDA Toolkit 7.5 package from the official link.
4.2 – Install cuDNN
Next, register with NVIDIA to be able to download cuDNN, which is a GPU-accelerated library of primitives for deep neural networks.
Manual step: after downloading, uncompress the package and copy the header file and the libraries to include and lib, respectively, under the root directory of the CUDA Toolkit (e.g. /usr/local/cuda).
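After the manual copy it can be worth confirming that the files landed where expected. The sketch below assumes that the header is called cudnn.h and that the library files start with libcudnn (for example libcudnn.dylib); check these names against the package you downloaded. check_cudnn_files is a hypothetical helper.

```python
import glob
import os


def check_cudnn_files(cuda_root):
    """Report whether the cuDNN header and libraries are under cuda_root."""
    header = os.path.join(cuda_root, "include", "cudnn.h")
    # Assumption: the cuDNN libraries are named libcudnn*; adjust the
    # pattern if the files in your package are named differently.
    libraries = glob.glob(os.path.join(cuda_root, "lib", "libcudnn*"))
    return {"header": os.path.isfile(header), "libraries": sorted(libraries)}


# /usr/local/cuda is the example root directory mentioned above.
print(check_cudnn_files("/usr/local/cuda"))
```

If "header" is False or "libraries" is empty, the copy step most likely targeted a different directory.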
5- Set environment variables for CUDA and cuDNN
There are two ways to set the environment variables for CUDA and cuDNN: point the paths to /Developer/NVIDIA/CUDA-7.5 or to /usr/local/cuda. I ended up using the first option.
You can either edit ~/.bash_profile directly or append the lines from the command line; I prefer to edit the file directly.
# add CUDA tools to command path
export CUDA_ROOT=/Developer/NVIDIA/CUDA-7.5
export PATH=$CUDA_ROOT/bin:${PATH}
export CPATH=$CUDA_ROOT/include:${CPATH}
export LIBRARY_PATH=$CUDA_ROOT/lib:$LIBRARY_PATH
export DYLD_LIBRARY_PATH=$CUDA_ROOT/lib:$DYLD_LIBRARY_PATH
export LD_LIBRARY_PATH=$CUDA_ROOT/lib:$LD_LIBRARY_PATH
Reload your ~/.bash_profile to apply the settings right away:
$ source ~/.bash_profile
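Once the profile has been reloaded, one way to confirm that the variables took effect is to inspect them from Python in a new shell. This is a sketch under the assumption that you used /Developer/NVIDIA/CUDA-7.5 as the root, as in this step; cuda_env_report is a hypothetical helper.

```python
import os


def cuda_env_report(environ, cuda_root):
    """Map each variable to True if it contains the expected CUDA directory."""
    expected = {
        "PATH": os.path.join(cuda_root, "bin"),
        "CPATH": os.path.join(cuda_root, "include"),
        "LIBRARY_PATH": os.path.join(cuda_root, "lib"),
        "DYLD_LIBRARY_PATH": os.path.join(cuda_root, "lib"),
    }
    report = {}
    for name, directory in expected.items():
        # Each variable is a list of directories separated by os.pathsep.
        entries = environ.get(name, "").split(os.pathsep)
        report[name] = directory in entries
    return report


result = cuda_env_report(os.environ, "/Developer/NVIDIA/CUDA-7.5")
for name, ok in sorted(result.items()):
    print("%-18s %s" % (name, "ok" if ok else "missing"))
```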
6- Optional: Install PyCUDA
PyCUDA lets you access NVIDIA's CUDA parallel computation API from Python. Several wrappers of the CUDA API already exist.
$ pip install pycuda
7- Test Theano
To make sure that Theano is using cuDNN and the GPU and that everything works correctly, run the test suite with the following command (it can take a long time the first time it is executed, up to a couple of hours):
$ python -c "import theano; theano.test(verbose=3)"
You should see “Using gpu device” in the first lines.
Using gpu device 0: GeForce GT 650M (CNMeM is disabled, cuDNN 5004)
Note: Depending on your graphics card (built-in or external GPU), you may see some errors when Theano tries to allocate memory.
Go to the energy settings (System Preferences > Energy Saver) and turn off automatic graphics switching to keep your NVIDIA card turned on.
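When the test output is long, the "Using gpu device" banner can also be picked apart with a small script. The pattern below is a sketch that only covers the two banner shapes that appear in this post (with and without a cuDNN version); other Theano versions or settings may print something different. parse_gpu_banner is a hypothetical helper.

```python
import re

# Assumption: the banner looks like the lines quoted in this post.
BANNER = re.compile(
    r"Using gpu device (?P<index>\d+): (?P<name>.+?) "
    r"\(CNMeM is (?P<cnmem>\w+)(?:, cuDNN (?P<cudnn>\d+))?\)"
)


def parse_gpu_banner(line):
    """Return the banner fields as a dict, or None if the line is not a banner."""
    match = BANNER.search(line)
    if match is None:
        return None
    info = match.groupdict()
    info["index"] = int(info["index"])
    return info


print(parse_gpu_banner(
    "Using gpu device 0: GeForce GT 650M (CNMeM is disabled, cuDNN 5004)"))
```

A "cudnn" value of None means the banner did not mention cuDNN, which suggests Theano did not pick it up.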
8- Testing usage
Now we can run some test code to see whether Theano works as expected.
python code
test.py:
from theano import function, config, shared, sandbox
import theano.tensor as T
import numpy
import time
vlen = 10 * 30 * 768 # 10 x #cores x # threads per core
iters = 1000
rng = numpy.random.RandomState(22)
x = shared(numpy.asarray(rng.rand(vlen), config.floatX))
f = function([], T.exp(x))
print(f.maker.fgraph.toposort())
t0 = time.time()
for i in range(iters):
    r = f()
t1 = time.time()
print("Looping %d times took %f seconds" % (iters, t1 - t0))
print("Result is %s" % (r,))
if numpy.any([isinstance(x.op, T.Elemwise) for x in f.maker.fgraph.toposort()]):
    print('Used the cpu')
else:
    print('Used the gpu')
Let’s run this code on CPU and GPU, separately.
CPU case:
$ THEANO_FLAGS='device=cpu' python test.py
[Elemwise{exp,no_inplace}(<TensorType(float32, vector)>)]
Looping 1000 times took 14.474722 seconds
Result is [ 1.23178029 1.61879337 1.52278066 ..., 2.20771813 2.29967761
1.62323284]
Used the cpu
GPU case:
$ THEANO_FLAGS='device=gpu' python test.py
Using gpu device 0: GeForce GTX 660M (CNMeM is disabled)
[GpuElemwise{exp,no_inplace}(<CudaNdarrayType(float32, vector)>), HostFromGpu(GpuElemwise{exp,no_inplace}.0)]
Looping 1000 times took 0.517552 seconds
Result is [ 1.23178029 1.61879349 1.52278066 ..., 2.20771813 2.29967761
1.62323296]
Used the gpu
Please note that Theano has to compile the Python code into C++/CUDA code the first time it runs on the GPU. The results shown above therefore come from the second run.
Finally, the runtime is greatly reduced when the GPU is used, from about 14.5 seconds on the CPU to about 0.5 seconds on the GPU, roughly 28 times faster : )
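To put a number on the difference, the two timing lines printed by test.py can be compared directly. The sketch below uses the two "Looping ..." lines copied from the CPU and GPU runs above; loop_seconds is a hypothetical helper, and your own timings will differ with your hardware.

```python
import re


def loop_seconds(line):
    """Extract the elapsed seconds from a 'Looping N times took S seconds' line."""
    match = re.search(r"Looping \d+ times took ([0-9.]+) seconds", line)
    if match is None:
        raise ValueError("not a timing line: %r" % line)
    return float(match.group(1))


# Timing lines copied from the CPU and GPU runs in this post.
cpu = loop_seconds("Looping 1000 times took 14.474722 seconds")
gpu = loop_seconds("Looping 1000 times took 0.517552 seconds")
speedup = cpu / gpu
print("GPU speedup: %.1fx" % speedup)
```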
ipython notebook
You can test the usage of GPU with the following ipython notebook
9- Optional (not necessary in most cases): Configure Theano and GPU usage
Theano does not create a configuration file by itself, but it has default values for all its configuration flags. You only need such a file if you want to modify the default values.
You can create the file ~/.theanorc in your home directory to modify the default Theano settings for GPU usage.
See this page for more details: http://deeplearning.net/software/theano/library/config.html
For example, my ~/.theanorc is the following:
[global]
floatX = float32
device = gpu
force_device = True
optimizer_including = cudnn

[nvcc]
fastmath = True
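The same settings can also be passed for a single run through THEANO_FLAGS, as was done with device=cpu and device=gpu earlier. The sketch below converts a .theanorc-style file into such a string; it assumes that options outside [global] are written with their section name as a prefix (for example nvcc.fastmath). theanorc_to_flags is a hypothetical helper, and the demo works on a temporary copy of the example settings, not on your real ~/.theanorc.

```python
import os
import tempfile

try:
    import configparser  # Python 3
except ImportError:
    import ConfigParser as configparser  # Python 2.7

# Copy of the example settings from this step, used only for the demo.
SAMPLE = """[global]
floatX = float32
device = gpu
force_device = True
optimizer_including = cudnn

[nvcc]
fastmath = True
"""


def theanorc_to_flags(path):
    """Render the options in a .theanorc-style file as a THEANO_FLAGS string."""
    parser = configparser.RawConfigParser()
    parser.optionxform = str  # keep option names such as floatX case-sensitive
    parser.read([path])
    flags = []
    for section in parser.sections():
        for key, value in parser.items(section):
            # Options outside [global] are prefixed with their section name.
            name = key if section == "global" else "%s.%s" % (section, key)
            flags.append("%s=%s" % (name, value))
    return ",".join(flags)


with tempfile.NamedTemporaryFile("w", suffix=".theanorc", delete=False) as handle:
    handle.write(SAMPLE)
flags = theanorc_to_flags(handle.name)
os.remove(handle.name)
print("THEANO_FLAGS=" + flags)
```

The printed string can be placed in front of a command in the same way as the THEANO_FLAGS='device=gpu' example used for test.py.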