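This series has five posts. I collected the steps from articles found online, built the cluster myself, and wrote down the procedure and commands for reference.
Series links:
1. Hadoop installation: http://www.jb51.cc/article/p-pwoyusmt-bod.html
2. ZooKeeper installation: http://www.jb51.cc/article/p-kiwxozhi-bod.html
3. HBase installation: http://www.jb51.cc/article/p-nksepega-bod.html
4. Spark installation: https://my.oschina.net/u/988386/blog/802073
5. Remote debugging from Eclipse on Windows: http://www.jb51.cc/article/p-hcslhcoq-bod.html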
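- Preparation
- Prepare two machines running Ubuntu 16.04.
- Install and configure JDK 1.7.
- Download the software: hadoop-2.7.3.tar.gz, zookeeper-3.4.9.tar.gz, hbase-1.1.7-bin.tar.gz, spark-2.0.2-bin-hadoop2.7.tgz, scala-2.11.8.tgz. (Note: scala-2.12.x requires JDK 8.)
- Host network configuration: set the host name and the hosts file on each machine so that the two machines can ping each other by host name.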
No.  Hostname  IP
1    d155      192.168.158.155
2    d156      192.168.158.156
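The host name mapping can be checked before going further. The following is a minimal sketch, assuming the two entries from the table; `check_hosts_file` is a hypothetical helper (not part of any tool) that only inspects a hosts-format file and does not touch the network.

```shell
#!/bin/sh
# Hypothetical helper: verify that a hosts-format file maps each expected
# host name to the expected IP. Arguments: file, then "ip name" pairs.
check_hosts_file() {
    file=$1
    shift
    rc=0
    for pair in "$@"; do
        ip=${pair% *}      # text before the space
        name=${pair#* }    # text after the space
        # Look for a non-comment line whose first field is the IP and
        # whose remaining fields include the host name.
        if awk -v ip="$ip" -v name="$name" '
            /^[ \t]*#/ { next }
            $1 == ip { for (i = 2; i <= NF; i++) if ($i == name) found = 1 }
            END { exit (found ? 0 : 1) }' "$file"
        then
            echo "ok: $name -> $ip"
        else
            echo "missing: $name -> $ip"
            rc=1
        fi
    done
    return $rc
}

# Example, to be run on each machine:
#   check_hosts_file /etc/hosts "192.168.158.155 d155" "192.168.158.156 d156"
#   ping -c 1 d155 && ping -c 1 d156
```

The file check catches typos in the hosts file; the ping lines in the example then confirm that the names actually resolve and the machines are reachable.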
- Installation
- Create the hadoop user with the password hdp (script below).
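    #!/bin/bash
    # Create the hadoop user with a home directory and bash as the login shell.
    # -p expects an already-encrypted password, not plain text.
    sudo useradd -m hadoop -s /bin/bash -p mJ6D7vaH7GsrM
    # Add the user to the sudo group.
    sudo adduser hadoop sudo
    sudo apt-get update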
- Set up passwordless SSH login so that d155 can ssh to both d155 and d156 without a password (script below).
    #!/bin/bash
    su hadoop <<EOF
    if [ ! -f ~/.ssh/id_rsa ]
    then
        echo "no id_rsa file, creating one with ssh-keygen:"
        # First connection to localhost; this creates ~/.ssh and records the host key.
        ssh -o stricthostkeychecking=no localhost
        ssh-keygen -t rsa -P "" -f ~/.ssh/id_rsa
    else
        echo "id_rsa file exists, sending it to the remote servers"
    fi
    echo "copying the public key to the machines to log in to"
    ssh-copy-id -i hadoop@d155
    ssh-copy-id -i hadoop@d156
    exit;
    EOF
Once this is done, you can ssh from d155 to d155 and d156 directly. (The ssh command must be run as the hadoop user.)
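Whether the key-based login really works without a prompt can be checked with a short loop. This is a sketch under the assumption that the user is `hadoop` and the hosts are d155 and d156 as above; `check_passwordless_ssh` is a hypothetical helper name. `BatchMode=yes` makes ssh fail instead of asking for a password, so a missing key shows up as a failure rather than a hanging prompt.

```shell
#!/bin/sh
# Hypothetical helper: returns 0 only if every given host accepts
# key-based login for the hadoop user without any interactive prompt.
check_passwordless_ssh() {
    failed=0
    for host in "$@"; do
        # -n keeps ssh from reading stdin; "true" is a no-op remote command.
        if ssh -n -o BatchMode=yes -o ConnectTimeout=5 "hadoop@$host" true; then
            echo "ok: $host"
        else
            echo "FAILED: $host"
            failed=1
        fi
    done
    return $failed
}

# Example, run as the hadoop user on d155:
#   check_passwordless_ssh d155 d156
```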
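- Install Hadoop and set the environment variables. The steps are the same on both machines (script below).
Run the script with sudo -E ./xxxx.sh; note the -E flag, which keeps the current environment variables.
Afterwards, run source /etc/profile so that the updated profile takes effect in your shell.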
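    #!/bin/bash
    PATH_FILE="/etc/profile"
    # Full path of the tarball
    HADOOP_TAR="/home/hdp/Downloads/hadoop-2.7.3.tar.gz"
    HADOOP_INSTALL_HOME="/usr/local"

    # Install Hadoop: remove any previous installation first
    if [ -d "$HADOOP_INSTALL_HOME/hadoop" ]
    then
        sudo rm -rf "$HADOOP_INSTALL_HOME/hadoop"
    fi
    # Unpack Hadoop
    sudo tar -zxvf "$HADOOP_TAR" -C "$HADOOP_INSTALL_HOME"
    # Rename the directory
    sudo mv "$HADOOP_INSTALL_HOME/hadoop-2.7.3" "$HADOOP_INSTALL_HOME/hadoop"
    # Make the hadoop user the owner
    sudo chown -R hadoop "$HADOOP_INSTALL_HOME/hadoop"
    # Set the environment variables (only if HADOOP_HOME is not already set)
    if [ -z "$HADOOP_HOME" ]
    then
        # tee runs under sudo, so the append works even if this shell is not root
        echo "export HADOOP_HOME=\"$HADOOP_INSTALL_HOME/hadoop\"" | sudo tee -a "$PATH_FILE" > /dev/null
        echo "export PATH=\"\${HADOOP_HOME}/bin:\$PATH\"" | sudo tee -a "$PATH_FILE" > /dev/null
        # Reload the profile (affects this script's shell only)
        source /etc/profile
    fi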
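The install script decides whether to append the export lines by testing `$HADOOP_HOME` in the invoking environment, which is presumably why the `-E` flag matters. An alternative is to test the profile file itself, so that re-running the script never adds duplicate lines. A minimal sketch, where `append_once` is a hypothetical helper:

```shell
#!/bin/sh
# Hypothetical helper: append a line to a file only if that exact line
# is not already present, so the step can be repeated safely.
append_once() {
    line=$1
    file=$2
    # -x matches whole lines, -F treats the line as a fixed string.
    grep -qxF -- "$line" "$file" 2>/dev/null || printf '%s\n' "$line" >> "$file"
}

# Example (as root), using the values from the install script:
#   append_once 'export HADOOP_HOME="/usr/local/hadoop"' /etc/profile
#   append_once 'export PATH="${HADOOP_HOME}/bin:$PATH"' /etc/profile
```

With this approach the result does not depend on how sudo passes the environment, at the cost of an exact-match comparison: a line that differs only in whitespace or quoting counts as a different line.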
- Configure hadoop-env.sh
Add the JDK environment variable:
    export JAVA_HOME=/usr/lib/jvm/java  # adjust to your JDK path
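If the JDK path is not known, it can usually be derived from the java binary on the PATH. The sketch below assumes the usual layout in which the binary lives at `<home>/bin/java`; `java_home_from_bin` is a hypothetical helper, and `readlink -f` in the example is the GNU coreutils form available on Ubuntu.

```shell
#!/bin/sh
# Hypothetical helper: given the resolved path of a java binary,
# print the directory two levels up (<home>/bin/java -> <home>).
java_home_from_bin() {
    dirname "$(dirname "$1")"
}

# Example: follow symlinks from the java on PATH, then derive the home directory.
#   java_home_from_bin "$(readlink -f "$(command -v java)")"
```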
- Configure core-site.xml
    <configuration>
      <property>
        <name>hadoop.tmp.dir</name>
        <value>file:/usr/local/hadoop/tmp</value>
        <description>A base for other temporary directories.</description>
      </property>
      <property>
        <name>io.file.buffer.size</name>
        <value>131072</value>
      </property>
      <property>
        <name>fs.defaultFS</name>
        <value>hdfs://d155:9000</value>
      </property>
    </configuration>
- Configure hdfs-site.xml
    <configuration>
      <property>
        <name>dfs.namenode.secondary.http-address</name>
        <value>d155:9001</value>
      </property>
      <property>
        <name>dfs.namenode.name.dir</name>
        <value>file:/usr/local/hadoop/dfs/name</value>
      </property>
      <property>
        <name>dfs.datanode.data.dir</name>
        <value>file:/usr/local/hadoop/dfs/data</value>
      </property>
      <property>
        <name>dfs.replication</name>
        <value>2</value>
      </property>
      <property>
        <name>dfs.webhdfs.enabled</name>
        <value>true</value>
      </property>
      <property>
        <name>dfs.permissions</name>
        <value>false</value>
      </property>
    </configuration>
- Configure mapred-site.xml (Hadoop 2.7.3 ships only mapred-site.xml.template; copy it to mapred-site.xml first if the file does not exist)
    <configuration>
      <property>
        <name>mapreduce.framework.name</name>
        <value>yarn</value>
      </property>
      <property>
        <name>mapreduce.jobhistory.address</name>
        <value>d155:10020</value>
      </property>
      <property>
        <name>mapreduce.jobhistory.webapp.address</name>
        <value>d155:19888</value>
      </property>
    </configuration>
- Configure yarn-site.xml
    <configuration>
      <property>
        <name>yarn.nodemanager.aux-services</name>
        <value>mapreduce_shuffle</value>
      </property>
      <property>
        <name>yarn.nodemanager.aux-services.mapreduce_shuffle.class</name>
        <value>org.apache.hadoop.mapred.ShuffleHandler</value>
      </property>
      <property>
        <name>yarn.resourcemanager.address</name>
        <value>d155:8032</value>
      </property>
      <property>
        <name>yarn.resourcemanager.scheduler.address</name>
        <value>d155:8030</value>
      </property>
      <property>
        <name>yarn.resourcemanager.resource-tracker.address</name>
        <value>d155:8031</value>
      </property>
      <property>
        <name>yarn.resourcemanager.admin.address</name>
        <value>d155:8033</value>
      </property>
      <property>
        <name>yarn.resourcemanager.webapp.address</name>
        <value>d155:8088</value>
      </property>
    </configuration>
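Since the same configuration has to be present on both machines, it can help to read individual values back out of the XML files and compare them. The function below is a deliberately naive sketch: `get_hadoop_property` is a hypothetical helper that does plain text matching rather than real XML parsing, so it ignores XML comments and assumes every `<name>` is followed by its own `<value>`. Once Hadoop is installed, `hdfs getconf -confKey <key>` reports the value Hadoop actually resolves.

```shell
#!/bin/sh
# Hypothetical helper: print the <value> that follows <name>KEY</name>
# in a Hadoop-style XML config file. Text matching only, not an XML parser.
get_hadoop_property() {
    awk -v key="$1" '
        { text = text " " $0 }          # join the file into one string
        END {
            needle = "<name>" key "</name>"
            pos = index(text, needle)   # literal search, dots are not wildcards
            if (pos == 0) exit 1
            rest = substr(text, pos + length(needle))
            if (match(rest, /<value>[^<]*<\/value>/) == 0) exit 1
            # Strip the 7-character <value> and 8-character </value> tags.
            print substr(rest, RSTART + 7, RLENGTH - 15)
        }' "$2"
}

# Example, using the install location from this post:
#   get_hadoop_property fs.defaultFS /usr/local/hadoop/etc/hadoop/core-site.xml
```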
- Configure yarn-env.sh: add the JAVA_HOME variable at the top of the file
    export JAVA_HOME=/usr/lib/jvm/java  # adjust to your JDK path
- Start Hadoop
- Format the NameNode (the hdfs command lives in bin, not sbin)
    $ /usr/local/hadoop/bin/hdfs namenode -format
Start and stop commands:
    /usr/local/hadoop/sbin/start-all.sh
    /usr/local/hadoop/sbin/stop-all.sh
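Check that the installation succeeded:
    hadoop@d155$ jps
If d155 lists processes such as ResourceManager, SecondaryNameNode, and NameNode, the startup succeeded, for example:
    2212 ResourceManager
    2484 Jps
    1917 NameNode
    2078 SecondaryNameNode
    hadoop@d156$ jps
If d156 lists processes such as DataNode and NodeManager, the startup succeeded, for example:
    17153 DataNode
    17334 Jps
    17241 NodeManager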
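The jps check can also be scripted. The sketch below assumes the daemon layout described above (NameNode, SecondaryNameNode, and ResourceManager on d155; DataNode and NodeManager on d156); `missing_daemons` is a hypothetical helper that reads jps output on standard input and compares the second column against the expected names.

```shell
#!/bin/sh
# Hypothetical helper: read jps output on stdin and report which of the
# expected daemon names (given as arguments) are not running.
missing_daemons() {
    running=$(awk '{ print $2 }')   # second column of jps output is the class name
    missing=""
    for name in "$@"; do
        printf '%s\n' "$running" | grep -qxF -- "$name" || missing="$missing $name"
    done
    # Succeed silently when nothing is missing; otherwise list the names and fail.
    [ -z "$missing" ] && return 0
    echo "missing:$missing"
    return 1
}

# Example:
#   jps | missing_daemons NameNode SecondaryNameNode ResourceManager   # on d155
#   jps | missing_daemons DataNode NodeManager                         # on d156
```

The comparison is case-sensitive and matches whole names, so the names must be written exactly as jps prints them.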