Installing Caffe + CUDA 8.0 + cuDNN (GPU version) on Ubuntu 16.04

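Preface

I started out with an Ubuntu virtual machine in VirtualBox on Windows 10, and because I wanted something pretty I installed Ubuntu 17.04 GNOME. It had only just been released, was full of bugs, and cost me a lot of time, so I switched to Ubuntu 16.04 LTS. A virtual machine cannot use the GPU build, though, so everything was painfully slow. I put up with that, since it was only a VM. Then, the day before yesterday, MATLAB 2017a on Windows 10 pushed my CPU usage to 97.5% within a few minutes of starting work. I rebooted hoping that would help, and instead Windows 10 crashed outright, probably because background processes had not exited and the shutdown was unclean. So I went all in and set up dual boot, which also meant configuring everything from scratch. Honestly, if I did not still need Windows for some of my work, I would drop it entirely. Enough rambling; let's install Caffe.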

Setup:
Windows 10 + Ubuntu 16.04 dual boot
CPU: i5-4300H
GPU: NVIDIA GTX 950M

Official guides:
[caffe installation guideline](http://caffe.berkeleyvision.org/install_apt.html)
[caffe installation guideline(dependencies installation)](http://caffe.berkeleyvision.org/install_apt.html)

1. Install the NVIDIA graphics driver
My Ubuntu installation is fresh, so the first step is to refresh the package index:

sudo apt update

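Ubuntu 16.04 LTS uses Nouveau as the default GPU driver, so we first need to install NVIDIA's proprietary driver. Open "Software & Updates", select the NVIDIA driver on the "Additional Drivers" tab, apply the changes, and reboot once they have finished.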

2. Install CUDA
Download CUDA from NVIDIA's download page.
I chose linux-x86_64_Ubuntu16.04-runfile (local)-Base Installer.
Once the download finishes, run:

sudo sh cuda_8.0.27_linux.run

The installer then asks whether to install the graphics driver:

Install NVIDIA Accelerated Graphics Driver for Linux-x86_64 375.26?

Answer no (we already installed a newer version above), then accept the defaults for everything else.

Logging to /tmp/cuda_install_27233.log
Using more to view the EULA.
End User License Agreement
--------------------------
License
...
...
--------------------------
Do you accept the previously read EULA?
accept/decline/quit: accept
Install NVIDIA Accelerated Graphics Driver for Linux-x86_64 375.26?
(y)es/(n)o/(q)uit: n
# because I had already installed version 375.66
Install the CUDA 8.0 Toolkit?
(y)es/(n)o/(q)uit: y
Enter Toolkit Location
 [ default is /usr/local/cuda-8.0 ]:
# I chose the default to avoid unexpected errors
Do you want to install a symbolic link at /usr/local/cuda?
(y)es/(n)o/(q)uit: y
Install the CUDA 8.0 Samples?
(y)es/(n)o/(q)uit: y
Enter CUDA Samples Location
 [ default is /home/leerw ]:
# default too

Some people hit an "unsupported compiler" error at this point, caused by a g++ version that is too new. I did not run into it; if you do, see xuzhongxiong's blog.
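If you want to know in advance whether you are likely to hit that compiler error, here is a minimal sketch. It assumes CUDA 8.0 rejects GCC major versions above 5 (that is my understanding of the check in its headers, so please confirm against your own install). The helper names `gcc_major` and `gcc_ok_for_cuda8` are made up for this example.

```shell
# Extract the major version from a string such as "5.4.0" or "7".
gcc_major() {
    printf '%s\n' "$1" | cut -d. -f1
}

# Succeed when the given GCC version string should pass the CUDA 8.0 check.
# Assumption: CUDA 8.0 refuses GCC major versions greater than 5.
gcc_ok_for_cuda8() {
    [ "$(gcc_major "$1")" -le 5 ]
}

# Usage (the check only runs when gcc is installed):
if command -v gcc >/dev/null 2>&1; then
    ver=$(gcc -dumpversion)
    if gcc_ok_for_cuda8 "$ver"; then
        echo "gcc $ver should be accepted by CUDA 8.0"
    else
        echo "gcc $ver is probably too new for CUDA 8.0"
    fi
fi
```

On a stock Ubuntu 16.04 the system compiler is from the GCC 5 series, which is presumably why I never saw the error.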

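After the installation finishes, set the environment variables:

sudo vi /etc/profile

Append these two lines:

export PATH=/usr/local/cuda-8.0/bin:$PATH
export LD_LIBRARY_PATH=/usr/local/cuda-8.0/lib64:$LD_LIBRARY_PATH

Then refresh the linker cache. Note that ldconfig does not reload /etc/profile; the exported variables take effect at the next login, or immediately in the current shell after running source /etc/profile.

sudo ldconfig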
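One small annoyance with plain export lines is that every time the file is sourced the same directory gets prepended again. A minimal sketch of an alternative, assuming you are happy to put a helper function in your profile (`prepend_once` is a name I made up, not something CUDA provides):

```shell
# Prepend DIR ($1) to the colon-separated list VALUE ($2) unless it is
# already present. Prints the resulting list; changes no variable by itself.
prepend_once() {
    dir=$1
    value=$2
    case ":$value:" in
        *":$dir:"*) printf '%s\n' "$value" ;;
        *) if [ -n "$value" ]; then
               printf '%s\n' "$dir:$value"
           else
               printf '%s\n' "$dir"
           fi ;;
    esac
}

PATH=$(prepend_once /usr/local/cuda-8.0/bin "$PATH")
LD_LIBRARY_PATH=$(prepend_once /usr/local/cuda-8.0/lib64 "${LD_LIBRARY_PATH:-}")
export PATH LD_LIBRARY_PATH
```

It also avoids a trailing colon when LD_LIBRARY_PATH was empty beforehand; an empty entry in that list is treated as the current directory, which is rarely what you want.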

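This step may print an error:

/sbin/ldconfig.real: /usr/lib/nvidia-375/libEGL.so.1 is not a symbolic link

/sbin/ldconfig.real: /usr/lib32/nvidia-375/libEGL.so.1 is not a symbolic link

How do we fix it? It is probably caused by a conflict between different versions of libEGL. Move the offending files aside and replace them with symbolic links to the versioned library. The 375.39 suffix below must match the file actually present in your driver directory: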

sudo mv /usr/lib/nvidia-375/libEGL.so.1 /usr/lib/nvidia-375/libEGL.so.1.org
sudo mv /usr/lib32/nvidia-375/libEGL.so.1 /usr/lib32/nvidia-375/libEGL.so.1.org
sudo ln -s /usr/lib/nvidia-375/libEGL.so.375.39 /usr/lib/nvidia-375/libEGL.so.1
sudo ln -s /usr/lib32/nvidia-375/libEGL.so.375.39 /usr/lib32/nvidia-375/libEGL.so.1

This is a known issue that other people have reported as well.
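The version suffix in the ln -s commands is easy to get wrong: the installer log earlier mentions driver 375.66, while the commands above use 375.39. Here is a rough sketch that looks up the real file instead of hard-coding it. `find_real_libegl` is a hypothetical helper, and the three-digit pattern assumes a driver series like 375.

```shell
# Print the first regular (non-symlink) file named libEGL.so.<NNN>.<minor>
# inside the directory given as $1; return non-zero when none exists.
find_real_libegl() {
    for f in "$1"/libEGL.so.[0-9][0-9][0-9].[0-9]*; do
        if [ -f "$f" ] && [ ! -L "$f" ]; then
            printf '%s\n' "$f"
            return 0
        fi
    done
    return 1
}

# Usage: only report the target, leaving the mv / ln -s steps to you.
for d in /usr/lib/nvidia-375 /usr/lib32/nvidia-375; do
    if target=$(find_real_libegl "$d"); then
        echo "link target for $d/libEGL.so.1: $target"
    fi
done
```

The printed path is what should go into the ln -s commands in place of the libEGL.so.375.39 file name.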

Next, let's test the installation.
Before that, install a few libraries that the next step needs:

sudo apt-get install freeglut3-dev build-essential libx11-dev libxmu-dev libxi-dev  libglu1-mesa libglu1-mesa-dev libgl1-mesa-glx

OK, now start the test:

cd /usr/local/cuda/samples
ls
0_Simple     2_Graphics  4_Finance      6_Advanced       common    Makefile
1_Utilities  3_Imaging   5_Simulations  7_CUDALibraries  EULA.txt
sudo make -j8

make[1]: Entering directory '/usr/local/cuda-8.0/samples/0_Simple/matrixMul_nvrtc'
make[1]: Entering directory '/usr/local/cuda-8.0/samples/0_Simple/simpleZeroCopy'
make[1]: Entering directory '/usr/local/cuda-8.0/samples/0_Simple/simpleMultiGPU'
make[1]: Entering directory '/usr/local/cuda-8.0/samples/0_Simple/simplePitchLinearTexture'
...
...
make[1]: Leaving directory '/usr/local/cuda-8.0/samples/3_Imaging/dxtc'

cd ./bin/x86_64/linux/release/
./deviceQuery

./deviceQuery Starting...

 CUDA Device Query (Runtime API) version (CUDART static linking)

Detected 1 CUDA Capable device(s)

Device 0: "GeForce GTX 950M"
  CUDA Driver Version / Runtime Version          8.0 / 8.0
  CUDA Capability Major/Minor version number:    5.0
  Total amount of global memory:                 2003 MBytes (2100232192 bytes)
  ( 5) Multiprocessors,(128) CUDA Cores/MP:     640 CUDA Cores
  GPU Max Clock rate:                            1124 MHz (1.12 GHz)
  Memory Clock rate:                             1001 Mhz
  Memory Bus Width:                              128-bit
  L2 Cache Size:                                 2097152 bytes
  Maximum Texture Dimension Size (x,y,z)         1D=(65536),2D=(65536,65536),3D=(4096,4096,4096)
  Maximum Layered 1D Texture Size,(num) layers  1D=(16384),2048 layers
  Maximum Layered 2D Texture Size,(num) layers  2D=(16384,16384),2048 layers
  Total amount of constant memory:               65536 bytes
  Total amount of shared memory per block:       49152 bytes
  Total number of registers available per block: 65536
  Warp size:                                     32
  Maximum number of threads per multiprocessor:  2048
  Maximum number of threads per block:           1024
  Max dimension size of a thread block (x,y,z): (1024,1024,64)
  Max dimension size of a grid size    (x,y,z): (2147483647,65535,65535)
  Maximum memory pitch:                          2147483647 bytes
  Texture alignment:                             512 bytes
  Concurrent copy and kernel execution:          Yes with 1 copy engine(s)
  Run time limit on kernels: Yes
  Integrated GPU sharing Host Memory:            No
  Support host page-locked memory mapping:       Yes
  Alignment requirement for Surfaces:            Yes
  Device has ECC support:                        Disabled
  Device supports Unified Addressing (UVA):      Yes
  Device PCI Domain ID / Bus ID / location ID:   0 / 1 / 0
  Compute Mode:
     < Default (multiple host threads can use ::cudaSetDevice() with device simultaneously) >

deviceQuery,CUDA Driver = CUDART,CUDA Driver Version = 8.0,CUDA Runtime Version = 8.0,NumDevs = 1,Device0 = GeForce GTX 950M
Result = PASS // if you see this,congratulations!
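If you rerun this check often (after a driver update, for example), the final line is the only part worth looking at. A small sketch that tests for it; `device_query_passed` is a hypothetical helper, and the binary path assumes the samples were built in the default location used above.

```shell
# Read deviceQuery output on stdin; succeed only if it reports "Result = PASS".
device_query_passed() {
    grep -q '^Result = PASS'
}

# Usage, assuming the samples were built into the directory used above:
bin=/usr/local/cuda/samples/bin/x86_64/linux/release/deviceQuery
if [ -x "$bin" ]; then
    if "$bin" | device_query_passed; then
        echo "CUDA runtime can see the GPU"
    else
        echo "deviceQuery did not pass" >&2
    fi
fi
```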

3. Install cuDNN
Download it from the official site.

I selected this version, "https://developer.nvidia.com/compute/machine-learning/cudnn/secure/v7/#prod/8.0_20170802/cudnn-8.0-linux-x64-v7-tgz", because my CUDA version is 8.0.
    Extract the tar.gz; this yields a folder named "cuda".
    Copy some of its files into the local CUDA installation:

# in the "cuda" folder
sudo cp lib64/lib* /usr/local/cuda/lib64/
sudo cp include/cudnn.h /usr/local/cuda/include/
cd /usr/local/cuda/lib64/
sudo chmod +r libcudnn.so.7.0.1
# create the symbolic links
sudo ln -sf libcudnn.so.7.0.1 libcudnn.so.7
sudo ln -sf libcudnn.so.7 libcudnn.so
sudo ldconfig
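To double-check which cuDNN version ended up in the CUDA directory, you can read it back from the header that was just copied. A minimal sketch, assuming cudnn.h contains `#define CUDNN_MAJOR`, `CUDNN_MINOR`, and `CUDNN_PATCHLEVEL` lines (true for the v7 header as far as I know); `cudnn_version` is a made-up helper name.

```shell
# Print MAJOR.MINOR.PATCHLEVEL as defined in the cudnn.h given as $1.
cudnn_version() {
    awk '
        $1 == "#define" && $2 == "CUDNN_MAJOR"      { major = $3 }
        $1 == "#define" && $2 == "CUDNN_MINOR"      { minor = $3 }
        $1 == "#define" && $2 == "CUDNN_PATCHLEVEL" { patch = $3 }
        END { print major "." minor "." patch }
    ' "$1"
}

# Usage: report the version of the installed header, if there is one.
header=/usr/local/cuda/include/cudnn.h
if [ -r "$header" ]; then
    echo "cuDNN header version: $(cudnn_version "$header")"
fi
```

The result should agree with the libcudnn.so.7.0.1 file name used in the symbolic links.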

cuDNN is now installed.

4. Install OpenCV 3

We fetch the OpenCV 3 source from git to get the latest version:

git clone https://github.com/opencv/opencv
Cloning into 'opencv'...
remote: Counting objects: 210094, done.
remote: Compressing objects: 100% (5/5), done.
remote: Total 210094 (delta 0), reused 1 (delta 0), pack-reused 210089
Receiving objects: 100% (210094/210094), 429.85 MiB | 245.00 KiB/s, done.
Resolving deltas: 100% (145370/145370), done.
Checking connectivity... done.
Checking out files: 100% (5371/5371), done.

Now we start the build:

# in the "opencv" folder
mkdir build
# cmake must already be installed; if it is not, it is easy to find and install from Ubuntu Software
cd build/
cmake -D CMAKE_BUILD_TYPE=Release -D CMAKE_INSTALL_PREFIX=/usr/local ..

You may see an error:
...compiling...
-- Checking for module 'libavresample'
-- No package 'libavresample' found
-- Checking for module 'libgphoto2'
-- No package 'libgphoto2' found
-- IPPICV: Download: ippicv_2017u2_lnx_intel64_20170418.tgz
-- =======================================================================
 Couldn't download files from the Internet.
 Please check the Internet access on this host.
=======================================================================

CMake Warning at cmake/OpenCVDownload.cmake:188 (message):
 IPPICV: Download Failed: 6;"Couldn't resolve host name"

 For details please refer to the download log file:

 /media/leerw/办公/ubuntu_caffe/opencv/build/CMakeDownloadLog.txt

# let's look at the log file
################################ contents ################################
use_cache "/media/leerw/办公/ubuntu_caffe/opencv/.cache"
do_unpack "ippicv_2017u2_lnx_intel64_20170418.tgz" "87cbdeb627415d8e4bc811156289fa3a" "https://raw.githubusercontent.com/opencv/opencv_3rdparty/a62e20676a60ee0ad6581e217fe7e4bada3b95db/ippicv/ippicv_2017u2_lnx_intel64_20170418.tgz" "/media/leerw/办公/ubuntu_caffe/opencv/build/3rdparty/ippicv"
#check_md5 "/media/leerw/办公/ubuntu_caffe/opencv/.cache/ippicv/87cbdeb627415d8e4bc811156289fa3a-ippicv_2017u2_lnx_intel64_20170418.tgz"
#mismatch_md5 "/media/leerw/办公/ubuntu_caffe/opencv/.cache/ippicv/87cbdeb627415d8e4bc811156289fa3a-ippicv_2017u2_lnx_intel64_20170418.tgz" "d41d8cd98f00b204e9800998ecf8427e"
#delete "/media/leerw/办公/ubuntu_caffe/opencv/.cache/ippicv/87cbdeb627415d8e4bc811156289fa3a-ippicv_2017u2_lnx_intel64_20170418.tgz"
#cmake_download "/media/leerw/办公/ubuntu_caffe/opencv/.cache/ippicv/87cbdeb627415d8e4bc811156289fa3a-ippicv_2017u2_lnx_intel64_20170418.tgz" "https://raw.githubusercontent.com/opencv/opencv_3rdparty/a62e20676a60ee0ad6581e217fe7e4bada3b95db/ippicv/ippicv_2017u2_lnx_intel64_20170418.tgz"
#######################################################################

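The download failed. No problem: download the archive by hand from the URL shown in the log (thanks to the web page that pointed this out, a great help).
As the log shows, the build looks for the archive under opencv/.cache/ippicv/ with its MD5 sum prefixed to the file name, so copy it there under that name and rerun cmake:

mkdir -p opencv/.cache/ippicv
cp ippicv_2017u2_lnx_intel64_20170418.tgz opencv/.cache/ippicv/87cbdeb627415d8e4bc811156289fa3a-ippicv_2017u2_lnx_intel64_20170418.tgz
# in the opencv/build/ folder
cmake -D CMAKE_BUILD_TYPE=Release -D CMAKE_INSTALL_PREFIX=/usr/local ..
sudo make install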
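The cache file name in the log has the form `<md5>-<file name>` inside a `.cache/ippicv/` subdirectory. Rather than typing the hash by hand, here is a small sketch that computes it; `cache_with_md5_name` is a hypothetical helper, and the naming rule is inferred from the log lines above rather than from OpenCV's documentation.

```shell
# Copy archive $1 into cache directory $2 under the name "<md5>-<basename>",
# and print the resulting path.
cache_with_md5_name() {
    src=$1
    cache_dir=$2
    sum=$(md5sum "$src" | cut -d' ' -f1)
    dest="$cache_dir/$sum-$(basename "$src")"
    mkdir -p "$cache_dir"
    cp "$src" "$dest"
    printf '%s\n' "$dest"
}

# Usage from the directory that holds both the archive and the opencv clone:
archive=ippicv_2017u2_lnx_intel64_20170418.tgz
if [ -f "$archive" ] && [ -d opencv ]; then
    cache_with_md5_name "$archive" opencv/.cache/ippicv
fi
```

For a good download the printed name should start with 87cbdeb627415d8e4bc811156289fa3a, the hash the log expects. The d41d8cd98f00b204e9800998ecf8427e in the mismatch line is the MD5 of an empty file, which means the failed download left a zero-byte file behind.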

OpenCV 3 is now built and installed. Let's test it from Python.
I use the Python 2.7 that ships with the system.

>>> import cv2
 Traceback (most recent call last):
  File "<stdin>",line 1,in <module>
 ImportError: No module named cv2

This happens because the opencv-python package is not installed:

pip install opencv-python
 Collecting opencv-python
  Downloading opencv_python-3.3.0.9-cp27-cp27mu-manylinux1_x86_64.whl (8.8MB)
    100% |████████████████████████████████| 8.8MB 68kB/s 
 Collecting numpy>=1.11.1 (from opencv-python)
  Downloading numpy-1.13.1-cp27-cp27mu-manylinux1_x86_64.whl (16.6MB)
    100% |████████████████████████████████| 16.6MB 50kB/s 
 Installing collected packages: numpy,opencv-python
 Successfully installed numpy opencv-python

If you do not have pip, install it first:

sudo apt install python-pip

Test again:

>>> import cv2
>>> print("OpenCV installation succeeded!")
OpenCV installation succeeded!

5. Install Caffe

First, clone it from git:

git clone https://github.com/BVLC/caffe

Install Boost:

sudo apt-get install -y --no-install-recommends libboost-all-dev

Install BLAS:

sudo apt-get install libatlas-base-dev

Install the other required libraries:

sudo apt-get install libprotobuf-dev libleveldb-dev libsnappy-dev libopencv-dev libboost-all-dev libhdf5-serial-dev \
libgflags-dev libgoogle-glog-dev liblmdb-dev protobuf-compiler
sudo pip install scikit-image protobuf

This produced an error:

Collecting scikit-image
  Downloading scikit_image-0.13.0-cp27-cp27mu-manylinux1_x86_64.whl (33.7MB)
    85% |███████████████████████████▍    | 28.9MB 32kB/s eta 0:02:32Exception:
Traceback (most recent call last):
  File "/home/leerw/.local/lib/python2.7/site-packages/pip/basecommand.py",line 215,in main
    status = self.run(options,args)
  File "/home/leerw/.local/lib/python2.7/site-packages/pip/commands/install.py",line 324,in run
    requirement_set.prepare_files(finder)
  File "/home/leerw/.local/lib/python2.7/site-packages/pip/req/req_set.py",line 380,in prepare_files
    ignore_dependencies=self.ignore_dependencies))
  File "/home/leerw/.local/lib/python2.7/site-packages/pip/req/req_set.py",line 620,in _prepare_file
    session=self.session,hashes=hashes)
  File "/home/leerw/.local/lib/python2.7/site-packages/pip/download.py",line 821,in unpack_url
    hashes=hashes
  File "/home/leerw/.local/lib/python2.7/site-packages/pip/download.py",line 659,in unpack_http_url
    hashes)
  File "/home/leerw/.local/lib/python2.7/site-packages/pip/download.py",line 882,in _download_http_url
    _download_url(resp,link,content_file,hashes)
  File "/home/leerw/.local/lib/python2.7/site-packages/pip/download.py",line 603,in _download_url
    hashes.check_against_chunks(downloaded_chunks)
  File "/home/leerw/.local/lib/python2.7/site-packages/pip/utils/hashes.py",line 46,in check_against_chunks
    for chunk in chunks:
  File "/home/leerw/.local/lib/python2.7/site-packages/pip/download.py",line 571,in written_chunks
    for chunk in chunks:
  File "/home/leerw/.local/lib/python2.7/site-packages/pip/utils/ui.py",line 139,in iter
    for x in it:
  File "/home/leerw/.local/lib/python2.7/site-packages/pip/download.py",line 560,in resp_read
    decode_content=False):
  File "/home/leerw/.local/lib/python2.7/site-packages/pip/_vendor/requests/packages/urllib3/response.py",line 357,in stream
    data = self.read(amt=amt,decode_content=decode_content)
  File "/home/leerw/.local/lib/python2.7/site-packages/pip/_vendor/requests/packages/urllib3/response.py",in read
    flush_decoder = True
  File "/usr/lib/python2.7/contextlib.py",line 35,in __exit__
    self.gen.throw(type,value,traceback)
  File "/home/leerw/.local/lib/python2.7/site-packages/pip/_vendor/requests/packages/urllib3/response.py",line 246,in _error_catcher
    raise ReadTimeoutError(self._pool,None,'Read timed out.')
ReadTimeoutError: HTTPSConnectionPool(host='pypi.python.org',port=443): Read timed out.

No problem; it was a network issue, so I simply retried.
Then go to caffe/python and install all the necessary Python packages:
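On a flaky connection, retrying by hand gets old quickly. A minimal sketch of a generic retry wrapper; `retry` is a helper I made up for this post, not part of pip. The commented usage line passes pip's `--default-timeout` option to raise the read timeout.

```shell
# Run a command up to $1 times, pausing $2 seconds between attempts.
# Returns 0 on the first success and 1 if every attempt fails.
retry() {
    attempts=$1
    delay=$2
    shift 2
    n=1
    while [ "$n" -le "$attempts" ]; do
        if "$@"; then
            return 0
        fi
        echo "attempt $n of $attempts failed: $*" >&2
        n=$((n + 1))
        if [ "$n" -le "$attempts" ]; then
            sleep "$delay"
        fi
    done
    return 1
}

# Usage (left commented out because it needs network access and sudo):
# retry 3 10 sudo pip install --default-timeout=100 scikit-image protobuf
```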

for req in $(cat requirements.txt); do sudo pip install $req; done

Next we edit the Makefile and Makefile.config in the caffe directory.
Start with Makefile.config by copying Makefile.config.example to it:

# in the "caffe" folder
cp Makefile.config.example Makefile.config

# USE_CUDNN := 1
Uncomment it:
USE_CUDNN := 1

# OPENCV_VERSION := 3
Uncomment it:
OPENCV_VERSION := 3

PYTHON_INCLUDE := /usr/include/python2.7 \
		/usr/lib/python2.7/dist-packages/numpy/core/include
Change it to:
PYTHON_INCLUDE := /usr/include/python2.7 \
		/usr/local/lib/python2.7/dist-packages/numpy/core/include

# WITH_PYTHON_LAYER := 1
Uncomment it:
WITH_PYTHON_LAYER := 1

INCLUDE_DIRS := $(PYTHON_INCLUDE) /usr/local/include
Change it to:
INCLUDE_DIRS := $(PYTHON_INCLUDE) /usr/local/include /usr/include/hdf5/serial/

Next, the Makefile.

Change the LIBRARIES += line that lists the HDF5 libraries to:
LIBRARIES += glog gflags protobuf boost_system boost_filesystem m hdf5_serial_hl hdf5_serial
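The three "uncomment" edits to Makefile.config (USE_CUDNN, OPENCV_VERSION, WITH_PYTHON_LAYER) can also be scripted, which helps if you reconfigure often. A rough sketch; `enable_settings` is a hypothetical helper, and it only handles lines of the form `# NAME := value`, so the PYTHON_INCLUDE, INCLUDE_DIRS, and LIBRARIES changes still need to be made by hand.

```shell
# Uncomment "# NAME := value" lines in file $1 for every NAME listed after it.
# The file is rewritten through a temporary copy instead of sed -i.
enable_settings() {
    file=$1
    shift
    for name in "$@"; do
        tmp=$(mktemp)
        sed "s/^#[[:space:]]*\($name[[:space:]]*:=.*\)\$/\1/" "$file" > "$tmp"
        cat "$tmp" > "$file"
        rm -f "$tmp"
    done
}

# Usage inside the caffe directory, after copying the example config:
if [ -f Makefile.config ]; then
    enable_settings Makefile.config USE_CUDNN OPENCV_VERSION WITH_PYTHON_LAYER
fi
```

Running it twice is harmless, because a line that is already uncommented no longer matches the pattern.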

Now run make:

make all -j8
make test -j8
make runtest -j8
############################ if you see this,congratulations! #################################
[       OK ] ProtoTest.TestSerialization (0 ms)
[----------] 1 test from ProtoTest (0 ms total)

[----------] 2 tests from GemmTest/1,where TypeParam = double
[ RUN      ] GemmTest/1.TestGemvCPUGPU
[       OK ] GemmTest/1.TestGemvCPUGPU (1 ms)
[ RUN      ] GemmTest/1.TestGemmCPUGPU
[       OK ] GemmTest/1.TestGemmCPUGPU (0 ms)
[----------] 2 tests from GemmTest/1 (1 ms total)

[----------] 3 tests from TanHLayerTest/1,where TypeParam = caffe::CPUDevice<double>
[ RUN      ] TanHLayerTest/1.TestTanH
[       OK ] TanHLayerTest/1.TestTanH (0 ms)
[ RUN      ] TanHLayerTest/1.TestTanHOverflow
[       OK ] TanHLayerTest/1.TestTanHOverflow (0 ms)
[ RUN      ] TanHLayerTest/1.TestTanHGradient
[       OK ] TanHLayerTest/1.TestTanHGradient (2 ms)
[----------] 3 tests from TanHLayerTest/1 (2 ms total)

[----------] 3 tests from ThresholdLayerTest/1,where TypeParam = caffe::CPUDevice<double>
[ RUN      ] ThresholdLayerTest/1.TestSetup
[       OK ] ThresholdLayerTest/1.TestSetup (0 ms)
[ RUN      ] ThresholdLayerTest/1.Test
[       OK ] ThresholdLayerTest/1.Test (0 ms)
[ RUN      ] ThresholdLayerTest/1.Test2
[       OK ] ThresholdLayerTest/1.Test2 (0 ms)
[----------] 3 tests from ThresholdLayerTest/1 (0 ms total)

[----------] Global test environment tear-down
[==========] 2101 tests from 277 test cases ran. (381562 ms total)
[  PASSED  ] 2101 tests.
################################# our caffe installation is over################################

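Finally, set up pycaffe:

vim ~/.bashrc

Append this at the end of the file:

export PYTHONPATH=/media/leerw/办公/caffe/python:$PYTHONPATH
(Replace /media/leerw/办公/caffe/python with your own path. The path must not contain Chinese characters, or you will suffer the pain I describe at the end of this post.)

Apply it: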

source ~/.bashrc
make clean
make pycaffe -j8

If you hit the following error:

CXX/LD -o python/caffe/_caffe.so python/caffe/_caffe.cpp
python/caffe/_caffe.cpp:10:31: fatal error: numpy/arrayobject.h: No such file or directory
compilation terminated.

it is because numpy is not installed:

pip install numpy
# if you get a permission error
sudo pip install numpy

Note that the Caffe path must not contain Chinese characters; otherwise Python keeps reporting that it cannot import caffe. (It still hurts to think about it!)
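Since this cost me so much time, here is a small sketch that checks a path before you put it into PYTHONPATH. `path_is_ascii` and the `CAFFE_PYTHON_DIR` variable are names I made up for the example; the check simply deletes every ASCII byte and sees whether anything is left.

```shell
# Succeed when the path in $1 contains only ASCII bytes.
path_is_ascii() {
    rest=$(printf '%s' "$1" | LC_ALL=C tr -d '\001-\177')
    [ -z "$rest" ]
}

# Usage: check the directory you are about to add to PYTHONPATH.
# CAFFE_PYTHON_DIR is a hypothetical variable; it defaults to ./python.
caffe_python_dir=${CAFFE_PYTHON_DIR:-$PWD/python}
if path_is_ascii "$caffe_python_dir"; then
    echo "path looks safe: $caffe_python_dir"
else
    echo "path contains non-ASCII characters: $caffe_python_dir" >&2
fi
```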

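That completes the installation of the GPU version of Caffe. Have fun! If this post helped you, feel free to repost it; please credit the author and link to my blog. Thanks!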