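1. Overview
This deployment uses 5 machines: 4 form the CDH cluster and 1 serves as the internal package repository. The repository machine (an existing Ubuntu 14.04 server) can reach the public network, so it can be set up ahead of time; the 4 CDH cluster servers cannot reach the public network. The hosts entries for the servers involved are as follows: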
192.168.0.230 cdh01
192.168.0.231 cdh02
192.168.0.232 cdh03
192.168.0.233 cdh04
192.168.0.237 inner-source
2. Building the internal repository
Download the cloudera-manager packages (5.10):
URL: https://archive.cloudera.com/cm5/ubuntu/trusty/amd64/cm/pool/contrib/e/enterprise/
Packages to download:
cloudera-manager-server_5.10.0-1.cm5100.p0.85~trusty-cm5_all.deb
cloudera-manager-daemons_5.10.0-1.cm5100.p0.85~trusty-cm5_all.deb
cloudera-manager-agent_5.10.0-1.cm5100.p0.85~trusty-cm5_amd64.deb
Download the CDH parcel:
URL: http://archive.cloudera.com/cdh5/parcels/5.10.0/
Files to download:
CDH-5.10.0-1.cdh5.10.0.p0.41-trusty.parcel
CDH-5.10.0-1.cdh5.10.0.p0.41-trusty.parcel.sha1
manifest.json
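Since the parcel is a large download, it may be worth checking it against the .sha1 file before copying it to the repository. The following is a minimal sketch; the function name verify_parcel is hypothetical, and it assumes the first whitespace-delimited field of the .sha1 file is the hex digest.

```shell
# Compare a parcel's SHA-1 digest with the digest stored in "<parcel>.sha1".
# Assumption: the first field on the first line of the .sha1 file is the hex digest.
verify_parcel() {
    local parcel="$1" expected actual
    [ -f "$parcel" ] && [ -f "$parcel.sha1" ] || return 2
    expected=$(awk 'NR == 1 { print $1 }' "$parcel.sha1")
    actual=$(sha1sum "$parcel" | awk '{ print $1 }')
    if [ "$expected" = "$actual" ]; then
        echo "OK: $parcel"
    else
        echo "MISMATCH: $parcel" >&2
        return 1
    fi
}
```

For example: verify_parcel CDH-5.10.0-1.cdh5.10.0.p0.41-trusty.parcel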
Download the oracle-j2sdk1.7 package from:
https://archive.cloudera.com/cm5/ubuntu/trusty/amd64/cm/pool/contrib/o/oracle-j2sdk1.7/
Download the MySQL packages:
URL: https://dev.mysql.com/downloads/mysql/
Files to download (extract the archive to get all the .deb files):
mysql-server_5.7.17-1ubuntu12.04_amd64.deb-bundle.tar
Other dependency packages. These do not have to be downloaded up front. Once the internal repository is working, run apt-get install for cloudera-manager-daemons, cloudera-manager-server and cloudera-manager-agent on the repository machine, collect every package left in /var/cache/apt/archives, and refresh the repository so that the machines without public network access can fetch the dependencies from it:
lsb-base psmisc bash libsasl2-modules zlib1g libsqlite3-0 libfuse2 fuse rpcbind libxslt1.1 libsasl2-modules-gssapi-mit libmysql-java python-urllib3, and so on
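Collecting the dependency packages from the apt cache could be scripted roughly as follows. This is only a sketch: collect_debs is a hypothetical helper, and both directories are parameters so that it can be tried on scratch directories first.

```shell
# Copy .deb files from an apt cache directory into the pool directory,
# skipping files that are already present. Prints the number of files copied.
collect_debs() {
    local src="$1" dst="$2" deb copied=0
    mkdir -p "$dst" || return 1
    for deb in "$src"/*.deb; do
        [ -e "$deb" ] || continue      # no .deb files: the glob stays literal
        if [ ! -e "$dst/$(basename "$deb")" ]; then
            cp "$deb" "$dst/" && copied=$((copied + 1))
        fi
    done
    echo "$copied"
}
```

On the repository machine this would be called as root, for example: collect_debs /var/cache/apt/archives /data/soft/pool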
Create the internal repository (run on the repository machine, 192.168.0.237):
Install dpkg-dev:
$sudo apt-get install dpkg-dev -y
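Generate Packages.gz:
$sudo -i
#mkdir -p /data/soft/pool
Copy all of the downloaded .deb packages into /data/soft/pool, then run:
#cd /data
#dpkg-scanpackages soft/pool | gzip > soft/Packages.gz
Re-run these commands whenever dependency packages are added, so that a fresh Packages.gz is generated.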
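Because Packages.gz has to be regenerated by hand, it is easy to end up with packages in the pool that the index does not list. The sketch below reports such files; missing_from_index is a hypothetical helper, and it relies on the Filename: field of each index entry holding the path relative to the directory where dpkg-scanpackages was run.

```shell
# List .deb files under the pool that have no "Filename:" entry in Packages.gz.
# Usage: missing_from_index <repo_root> <pool_dir_relative_to_root> <index_file>
missing_from_index() {
    local root="$1" pool="$2" index="$3" indexed
    indexed=$(gzip -dc "$index" | sed -n 's/^Filename: //p')
    ( cd "$root" && find "$pool" -name '*.deb' | sort ) | while IFS= read -r deb; do
        # Print the path only when no index line matches it exactly.
        printf '%s\n' "$indexed" | grep -qxF -- "$deb" || printf '%s\n' "$deb"
    done
}
```

For example: missing_from_index /data soft/pool /data/soft/Packages.gz. Empty output means every package in the pool is indexed.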
Install and configure apache2:
$sudo apt-get install apache2
$sudo mkdir /data/soft/cloudera
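Copy all of the CDH parcel files and manifest.json into /data/soft/cloudera.
$cd /var/www/html
$sudo ln -s /data
Opening http://inner-source/data/ in a browser should now list the downloaded packages. Here inner-source is the hostname of the repository server, or the name mapped to it in the hosts file.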
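3. Preparing the environment on all CDH servers
Install Ubuntu 14.04.10 on every server, using UltraISO to create a bootable Ubuntu 14.04.10 USB drive. Do not use the Ubuntu 14.04.5 image: the installer fails to load the CD. Use the same root password on all servers, and the same username and password for the account created during installation. Every step in this section applies to all CDH servers.
Set the same time zone and time on all CDH servers:
$date -R #show the time and time zone; every server should be in UTC+8 (+0800)
$sudo date -s 10:17:20 #set the correct time
Configure the hosts file on every server as follows: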
192.168.0.230 cdh01
192.168.0.231 cdh02
192.168.0.232 cdh03
192.168.0.233 cdh04
192.168.0.237 inner-source
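Since the same entries have to be added on every server, appending them with a small idempotent helper avoids duplicate lines when the step is repeated. This is a sketch only; add_host is a hypothetical function, and the hosts file path is a parameter so it can be tested on a copy.

```shell
# Append "IP name" to a hosts file unless the name is already mapped
# on a non-comment line.
add_host() {
    local ip="$1" name="$2" file="$3"
    if awk -v n="$name" '
        !/^[[:space:]]*#/ { for (i = 2; i <= NF; i++) if ($i == n) found = 1 }
        END { exit !found }' "$file"; then
        return 0                        # already present, nothing to do
    fi
    printf '%s %s\n' "$ip" "$name" >> "$file"
}
```

For example, as root: add_host 192.168.0.237 inner-source /etc/hosts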
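Configure the apt sources on all CDH servers:
None of the CDH servers can reach the public network, so comment out the official sources and add the internal repository before deploying CDH; otherwise updating the package lists stalls on long request timeouts. The official sources can be uncommented again once the installation is finished. Run the following commands: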
$sudo cp /etc/apt/sources.list /etc/apt/sources.list.bak
$sudo vim /etc/apt/sources.list.d/innersources.list
Add the following line to the file:
deb http://inner-source/data soft/
$sudo apt-get update
$sudo vim /etc/apt/apt.conf
Add the following line to the newly created file; note the trailing semicolon:
APT::Get::AllowUnauthenticated 1 ;
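The text mentions commenting out the official sources but gives no command for it. One possible way, sketched here with a hypothetical disable_sources helper and assuming GNU sed, is to prefix every active deb or deb-src line with "# ". The backup made with cp beforehand remains the way to restore the original file.

```shell
# Comment out every active "deb"/"deb-src" line in an apt sources file.
# Lines that are already comments are left untouched.
disable_sources() {
    local file="$1"
    sed -i 's/^[[:space:]]*deb/# &/' "$file"
}
```

For example, as root: disable_sources /etc/apt/sources.list, followed by apt-get update.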
Enable SSH remote login for root; the root password is the same on all CDH servers:
$sudo passwd root
$sudo vi /etc/ssh/sshd_config
Change the file as follows:
#PermitRootLogin without-password
PermitRootLogin yes
$sudo service ssh restart #restart the ssh service
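When the same change has to be made on several servers, the sshd_config edit can be done non-interactively. The following is a sketch with a hypothetical permit_root_login helper, assuming GNU sed; it rewrites any existing PermitRootLogin line (commented or not) and appends one if none exists.

```shell
# Set "PermitRootLogin yes" in an sshd_config-style file.
permit_root_login() {
    local file="$1"
    local pattern='^[[:space:]]*#?[[:space:]]*PermitRootLogin[[:space:]]'
    if grep -qE "$pattern" "$file"; then
        sed -i -E "s/${pattern}.*/PermitRootLogin yes/" "$file"
    else
        printf 'PermitRootLogin yes\n' >> "$file"
    fi
}
```

For example, as root: permit_root_login /etc/ssh/sshd_config, then restart the ssh service.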
Install the JDK on all servers:
$sudo apt-get install oracle-j2sdk1.7
Configure JAVA_HOME:
$sudo vim /etc/profile #append the following to the end of the file
export JAVA_HOME=/usr/lib/jvm/java-7-oracle-cloudera
export JRE_HOME=${JAVA_HOME}/jre
export CLASSPATH=${JAVA_HOME}/lib:${JRE_HOME}/lib:${CLASSPATH}
export PATH=${JAVA_HOME}/bin:${PATH}
$source /etc/profile
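A quick sanity check after sourcing the profile is to confirm that JAVA_HOME really points at a JDK. A minimal sketch, with a hypothetical check_java_home helper that defaults to the current JAVA_HOME:

```shell
# Succeed only if <dir>/bin/java exists and is executable.
# With no argument, the current value of JAVA_HOME is checked.
check_java_home() {
    local home="${1:-${JAVA_HOME:-}}"
    if [ -x "$home/bin/java" ]; then
        echo "JAVA_HOME ok: $home"
    else
        echo "no executable java under: $home" >&2
        return 1
    fi
}
```

Running check_java_home on each server should print the path configured above.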
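Install the JDBC driver on all servers (MySQL is used as the database):
$sudo apt-get install libmysql-java
4. Installing Cloudera Manager on the master node
Install the MySQL database (master node):
$sudo apt-get install mysql-server
$sudo service mysql stop
$sudo sh -c 'mv /var/lib/mysql/ib_logfile* /tmp/' #move the old ib_logfile files out of the data directory
$sudo vim /etc/mysql/my.cnf #edit the configuration file so that it reads as follows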
[client]
port = 3306
socket = /var/run/mysqld/mysqld.sock
[mysqld_safe]
socket = /var/run/mysqld/mysqld.sock
nice = 0
[mysqld]
server-id=1
user = mysql
pid-file = /var/run/mysqld/mysqld.pid
socket = /var/run/mysqld/mysqld.sock
port = 3306
basedir = /usr
datadir = /var/lib/mysql
tmpdir = /tmp
lc-messages-dir = /usr/share/mysql
skip-external-locking
log_error = /var/log/mysql/error.log
transaction-isolation = READ-COMMITTED
# Disabling symbolic-links is recommended to prevent assorted security risks;
# to do so,uncomment this line:
# symbolic-links = 0
key_buffer_size = 32M
max_allowed_packet = 32M
thread_stack = 256K
thread_cache_size = 64
query_cache_limit = 8M
query_cache_size = 64M
query_cache_type = 1
max_connections = 550
#expire_logs_days = 10
#max_binlog_size = 100M
#log_bin should be on a disk with enough free space. Replace '/var/lib/mysql/mysql_binary_log' with an appropriate path for your system
#and chown the specified folder to the mysql user.
log_bin=/var/lib/mysql/mysql_binary_log
# For MySQL version 5.1.8 or later. For older versions, reference MySQL documentation for configuration help.
binlog_format = mixed
read_buffer_size = 2M
read_rnd_buffer_size = 16M
sort_buffer_size = 8M
join_buffer_size = 8M
# InnoDB settings
innodb_file_per_table = 1
innodb_flush_log_at_trx_commit = 2
innodb_log_buffer_size = 64M
innodb_buffer_pool_size = 4G
innodb_thread_concurrency = 8
innodb_flush_method = O_DIRECT
innodb_log_file_size = 512M
sql_mode=STRICT_ALL_TABLES
$sudo service mysql start
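With a configuration file this long, it can help to read back an individual setting from a specific section to confirm the edit landed where intended. The sketch below uses a hypothetical cnf_value helper that only understands simple "key = value" lines; it does not follow include directives.

```shell
# Print the value of <key> inside [<section>] of a my.cnf-style file.
cnf_value() {
    local section="$1" key="$2" file="$3"
    awk -F'=' -v s="[$section]" -v k="$key" '
        /^[[:space:]]*\[/ { gsub(/[[:space:]]/, "", $0); in_s = ($0 == s); next }
        in_s && $0 !~ /^[[:space:]]*#/ {
            name = $1
            gsub(/[[:space:]]/, "", name)
            if (name == k) {
                val = $0
                sub(/^[^=]*=[[:space:]]*/, "", val)   # drop "key ="
                sub(/[[:space:]]+$/, "", val)         # drop trailing blanks
                print val
                exit
            }
        }' "$file"
}
```

For example: cnf_value mysqld innodb_log_file_size /etc/mysql/my.cnf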
Log in to MySQL as root (the root password is set during installation) and run the following SQL; change the passwords as needed:
create database cmf DEFAULT CHARACTER SET utf8;
grant all on cmf.* TO 'cmf'@'%' IDENTIFIED BY 'password';
create database Metastore DEFAULT CHARACTER SET utf8;
grant all on Metastore.* TO 'hive'@'%' IDENTIFIED BY 'password';
create database hue DEFAULT CHARACTER SET utf8;
grant all on hue.* TO 'hue'@'%' IDENTIFIED BY 'password';
create database rman DEFAULT CHARACTER SET utf8;
grant all on rman.* TO 'rman'@'%' IDENTIFIED BY 'password';
create database oozie DEFAULT CHARACTER SET utf8;
grant all on oozie.* TO 'oozie'@'%' IDENTIFIED BY 'password';
flush privileges;
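The five create/grant pairs follow one pattern, so they could also be generated from a list of database and user names. This is a sketch with hypothetical helpers db_sql and all_db_sql; it uses a single password for every user and does not escape quotes, so the password must not contain a single quote.

```shell
# Print the CREATE DATABASE / GRANT pair for one service database.
db_sql() {
    local db="$1" user="$2" pass="$3"
    printf "create database %s DEFAULT CHARACTER SET utf8;\n" "$db"
    printf "grant all on %s.* TO '%s'@'%%' IDENTIFIED BY '%s';\n" "$db" "$user" "$pass"
}

# Print the statements for all databases used in this guide (database:user pairs).
all_db_sql() {
    local pass="$1" pair
    for pair in cmf:cmf Metastore:hive hue:hue rman:rman oozie:oozie; do
        db_sql "${pair%%:*}" "${pair##*:}" "$pass"
    done
    printf 'flush privileges;\n'
}
```

The output can be reviewed first and then piped into the client, for example: all_db_sql 'password' | mysql -u root -p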
Install Cloudera Manager:
$sudo apt-get install cloudera-manager-daemons cloudera-manager-server
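$sudo vim /etc/cloudera-scm-server/db.properties #set the database user settings to the cmf database user created above. setupType is INIT at first: leave it unchanged for the first start, then after cloudera-scm-server has started, change it as shown below and restart cloudera-scm-server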
#com.cloudera.cmf.db.setupType=INIT
com.cloudera.cmf.db.setupType=EXTERNAL
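The switch from INIT to EXTERNAL can also be made without an editor. A minimal sketch, assuming GNU sed; set_setup_type is a hypothetical helper that rewrites the active setupType line or appends one if the file has none.

```shell
# Set com.cloudera.cmf.db.setupType to <value> in a db.properties-style file.
set_setup_type() {
    local value="$1" file="$2"
    if grep -q '^com\.cloudera\.cmf\.db\.setupType=' "$file"; then
        sed -i "s/^com\.cloudera\.cmf\.db\.setupType=.*/com.cloudera.cmf.db.setupType=$value/" "$file"
    else
        printf 'com.cloudera.cmf.db.setupType=%s\n' "$value" >> "$file"
    fi
}
```

For example, as root after the first start: set_setup_type EXTERNAL /etc/cloudera-scm-server/db.properties, followed by a restart of cloudera-scm-server.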
Start cloudera-scm-server:
$sudo service cloudera-scm-server start