1. Environment Preparation and Versions
1. Linux Version
CentOS release 6.8 (Final)
Linux version 2.6.32-642.el6.x86_64 ([email protected]) (gcc version 4.4.7 20120313 (Red Hat 4.4.7-17) (GCC) ) #1 SMP Tue May 10 17:27:01 UTC 2016
ISO image: CentOS-6.8-x86_64-minimal.iso
2. JDK Version
java version "1.8.0_131"
Java(TM) SE Runtime Environment (build 1.8.0_131-b11)
Java HotSpot(TM) 64-Bit Server VM (build 25.131-b11, mixed mode)
Environment variables: open /etc/profile (vim /etc/profile), append the lines below, then run source /etc/profile:
export JAVA_HOME=/home/software/jdk/jdk1.8.0_131
export PATH=$PATH:$JAVA_HOME/bin
export CLASSPATH=.:$JAVA_HOME/lib/tools.jar:$JAVA_HOME/lib/rt.jar
3. Hadoop Version
hadoop-2.7.7
2. Node Layout
Note the /etc/hosts entries:
192.168.1.211 z1
192.168.1.212 z2
192.168.1.213 z3
192.168.1.214 z4
3. Passwordless SSH Between Hosts
1. Generate a key pair:
$ ssh-keygen -t dsa -P '' -f ~/.ssh/id_dsa
2. Append the public key to ~/.ssh/authorized_keys:
$ cat ~/.ssh/id_dsa.pub >> ~/.ssh/authorized_keys
For passwordless login to another host, first run ssh-keygen -t dsa -P '' -f ~/.ssh/id_dsa on the target host as well, then transfer the local public key to the target with scp and append it to that host's ~/.ssh/authorized_keys.
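The distribution step above can be sketched as follows. The real commands appear as comments; the runnable part below simulates the append on a temp directory standing in for the remote host's ~/.ssh, so the sketch executes anywhere (the key material is a labelled fake):

```shell
# Real commands (run from z1, targeting z2):
#   scp ~/.ssh/id_dsa.pub root@z2:/tmp/z1_id_dsa.pub
#   ssh root@z2 'cat /tmp/z1_id_dsa.pub >> ~/.ssh/authorized_keys && chmod 600 ~/.ssh/authorized_keys'

# Local simulation of the remote append:
remote_ssh=$(mktemp -d)                               # stand-in for z2's ~/.ssh
printf 'ssh-dss AAAAB3exampleonly root@z1\n' > /tmp/z1_id_dsa.pub   # fake public key
cat /tmp/z1_id_dsa.pub >> "$remote_ssh/authorized_keys"
chmod 600 "$remote_ssh/authorized_keys"               # sshd rejects loose permissions
grep -c 'root@z1' "$remote_ssh/authorized_keys"       # prints 1: key is present
```

The chmod matters: sshd silently ignores authorized_keys files that are writable by anyone other than the owner.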
4. Installing ZooKeeper
1. zoo.cfg Configuration
# The number of milliseconds of each tick
tickTime=2000
# The number of ticks that the initial
# synchronization phase can take
initLimit=10
# The number of ticks that can pass between
# sending a request and getting an acknowledgement
syncLimit=5
# the directory where the snapshot is stored.
# do not use /tmp for storage, /tmp here is just
# example sakes.
dataDir=/opt/zookeeper
# the port at which the clients will connect
clientPort=2181
# the maximum number of client connections.
# increase this if you need to handle more clients
#maxClientCnxns=60
#
# Be sure to read the maintenance section of the
# administrator guide before turning on autopurge.
#
# http://zookeeper.apache.org/doc/current/zookeeperAdmin.html#sc_maintenance
#
# The number of snapshots to retain in dataDir
#autopurge.snapRetainCount=3
# Purge task interval in hours
# Set to "0" to disable auto purge feature
#autopurge.purgeInterval=1
server.1=z1:2888:3888
server.2=z2:2888:3888
server.3=z3:2888:3888
2. Create a myid File Under the Configured dataDir
dataDir=/opt/zookeeper
Create a myid file under this path on every ZooKeeper host; its content must match that machine's server number in zoo.cfg:
server.1=z1:2888:3888
server.2=z2:2888:3888
server.3=z3:2888:3888
For example, on z1 create a myid file in /opt/zookeeper whose content is: 1
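Concretely, the myid step is a one-liner per host. In this sketch, /tmp/zookeeper-demo stands in for dataDir=/opt/zookeeper so it runs anywhere:

```shell
# Write this host's id into dataDir/myid; it must match the server.N line
# in zoo.cfg for this machine (1 on z1, 2 on z2, 3 on z3).
datadir=/tmp/zookeeper-demo    # stand-in for dataDir=/opt/zookeeper
mkdir -p "$datadir"
echo 1 > "$datadir/myid"       # on z1; write 2 on z2, 3 on z3
cat "$datadir/myid"            # prints 1
```

If myid is missing or does not match a server.N entry, the ZooKeeper server refuses to join the ensemble.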
3. Starting ZooKeeper
Go into the bin directory of the ZooKeeper installation and run ./zkServer.sh start on every ZooKeeper host. Then run ./zkServer.sh status to verify the ensemble came up (one host should report leader, the others follower).
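Starting all three nodes can be driven from one host over the passwordless SSH set up earlier. This is a dry-run sketch (it only prints the commands, so it executes anywhere); the install path is an assumption, and dropping the echo capture would run them for real:

```shell
# Print the start command for each ZooKeeper host; pipe each line to sh,
# or run without the capture, to actually start the daemons over ssh.
ZK_BIN=/home/software/zookeeper/bin   # hypothetical install path
zk_cmds=$(for host in z1 z2 z3; do
  echo "ssh $host $ZK_BIN/zkServer.sh start"
done)
printf '%s\n' "$zk_cmds"
```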
5. Installing Hadoop
1. hadoop-env.sh Configuration
export JAVA_HOME=/home/software/jdk/jdk1.8.0_131
2. hdfs-site.xml Configuration
<configuration>
  <property>
    <name>dfs.nameservices</name>
    <value>bigdata</value>
  </property>
  <property>
    <name>dfs.ha.namenodes.bigdata</name>
    <value>nn1,nn2</value>
  </property>
  <property>
    <name>dfs.namenode.rpc-address.bigdata.nn1</name>
    <value>z1:8020</value>
  </property>
  <property>
    <name>dfs.namenode.rpc-address.bigdata.nn2</name>
    <value>z2:8020</value>
  </property>
  <property>
    <name>dfs.namenode.http-address.bigdata.nn1</name>
    <value>z1:50070</value>
  </property>
  <property>
    <name>dfs.namenode.http-address.bigdata.nn2</name>
    <value>z2:50070</value>
  </property>
  <property>
    <name>dfs.namenode.shared.edits.dir</name>
    <value>qjournal://z2:8485;z3:8485;z4:8485/bigdata</value>
  </property>
  <property>
    <name>dfs.client.failover.proxy.provider.bigdata</name>
    <value>org.apache.hadoop.hdfs.server.namenode.ha.ConfiguredFailoverProxyProvider</value>
  </property>
  <property>
    <name>dfs.ha.fencing.methods</name>
    <value>sshfence</value>
  </property>
  <property>
    <name>dfs.ha.fencing.ssh.private-key-files</name>
    <value>/root/.ssh/id_dsa</value>
  </property>
  <property>
    <name>dfs.journalnode.edits.dir</name>
    <value>/opt/journaldata</value>
  </property>
  <property>
    <name>dfs.ha.automatic-failover.enabled</name>
    <value>true</value>
  </property>
</configuration>
3. core-site.xml Configuration
<configuration>
  <property>
    <name>fs.defaultFS</name>
    <value>hdfs://bigdata</value>
  </property>
  <property>
    <name>ha.zookeeper.quorum</name>
    <value>z1:2181,z2:2181,z3:2181</value>
  </property>
  <property>
    <name>hadoop.tmp.dir</name>
    <value>/home/workspace/hadoop-2.7.7</value>
  </property>
</configuration>
4. slaves Configuration
z2
z3
z4
6. Starting the Cluster
1. Step 1
On the JournalNode hosts (z2, z3, z4), start the JournalNode daemon:
Run: ./hadoop-daemon.sh start journalnode
2. Step 2
Pick one of the NameNode hosts and, on it:
Run ./hdfs namenode -format to format that NameNode.
Run ./hadoop-daemon.sh start namenode to start the freshly formatted NameNode.
3. Step 3
On the other NameNode host:
Run: ./hdfs namenode -bootstrapStandby
4. Step 4
Run ./stop-dfs.sh to stop all HDFS services.
5. Step 5
Run ./hdfs zkfc -formatZK to initialize the automatic-failover state in ZooKeeper.
6. Step 6
Run ./start-dfs.sh to start the whole cluster.
7. Step 7: Accessing the Cluster
Open http://192.168.1.211:50070 in a browser to reach the NameNode web UI.
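The first-time bootstrap order above matters (JournalNodes before format, bootstrapStandby before zkfc -formatZK), so here is the whole sequence recapped as a dry run. Each entry is "host(s) :: command"; on a real cluster you would run each command on the listed host(s) instead of just printing it:

```shell
# Dry-run recap of the HA bootstrap sequence; printing keeps it runnable anywhere.
steps='z2,z3,z4 :: hadoop-daemon.sh start journalnode
z1 :: hdfs namenode -format
z1 :: hadoop-daemon.sh start namenode
z2 :: hdfs namenode -bootstrapStandby
z1 :: stop-dfs.sh
z1 :: hdfs zkfc -formatZK
z1 :: start-dfs.sh'
printf '%s\n' "$steps"
```

Note that formatting is a first-time-only operation; rerunning -format or -formatZK on a live cluster destroys its metadata.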
7. Installing MapReduce
1. mapred-site.xml Configuration
<configuration>
  <property>
    <name>mapreduce.framework.name</name>
    <value>yarn</value>
  </property>
  <property>
    <name>mapreduce.jobhistory.address</name>
    <value>z1:10020</value>
    <description>MapReduce JobHistory Server IPC host:port</description>
  </property>
  <property>
    <name>mapreduce.jobhistory.webapp.address</name>
    <value>z1:19888</value>
    <description>MapReduce JobHistory Server Web UI host:port</description>
  </property>
  <property>
    <name>mapreduce.jobhistory.intermediate-done-dir</name>
    <value>/home/mr-history/tmp</value>
    <description>Directory where history files are written by MapReduce jobs</description>
  </property>
  <property>
    <name>mapreduce.jobhistory.done-dir</name>
    <value>/home/mr-history/done</value>
    <description>Directory where history files are managed by the MR JobHistory Server</description>
  </property>
</configuration>
2. yarn-site.xml Configuration
<configuration>
  <property>
    <name>yarn.nodemanager.aux-services</name>
    <value>mapreduce_shuffle</value>
  </property>
  <property>
    <name>yarn.resourcemanager.ha.enabled</name>
    <value>true</value>
  </property>
  <property>
    <name>yarn.resourcemanager.cluster-id</name>
    <value>bigdata</value>
  </property>
  <property>
    <name>yarn.resourcemanager.ha.rm-ids</name>
    <value>rm1,rm2</value>
  </property>
  <property>
    <name>yarn.resourcemanager.hostname.rm1</name>
    <value>z1</value>
  </property>
  <property>
    <name>yarn.resourcemanager.hostname.rm2</name>
    <value>z2</value>
  </property>
  <property>
    <name>yarn.resourcemanager.zk-address</name>
    <value>z1:2181,z2:2181,z3:2181</value>
  </property>
  <property>
    <name>yarn.log-aggregation-enable</name>
    <value>true</value>
  </property>
  <property>
    <name>yarn.log.server.url</name>
    <value>http://z1:19888/jobhistory/logs</value>
  </property>
</configuration>
3. Starting MapReduce
Go into the sbin directory and run ./start-yarn.sh to start YARN.
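One caveat worth knowing: in Hadoop 2.x, start-yarn.sh starts the ResourceManager only on the node it is invoked from (plus the NodeManagers on the slaves), so with RM HA the standby ResourceManager on z2 is typically started by hand. A dry-run sketch of the full YARN startup, in the same "host :: command" style:

```shell
# Dry run: print where each YARN start command would run.
yarn_steps='z1 :: start-yarn.sh
z2 :: yarn-daemon.sh start resourcemanager'
printf '%s\n' "$yarn_steps"
```

Afterwards, `yarn rmadmin -getServiceState rm1` (and rm2) can confirm which ResourceManager is active.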
4. Starting the JobHistoryServer
Go into the sbin directory and run ./mr-jobhistory-daemon.sh start historyserver, then open http://z1:19888 in a browser to view the job history UI.
Installation complete! Personally tested and working. Thanks for reading, and don't forget to leave a like!