Machine Planning
The cluster uses three machines; their roles, as configured below, are:
node-2: NameNode, ResourceManager
node-3: DataNode, SecondaryNameNode, NodeManager
node-4: DataNode, NodeManager, JobHistory Server
Environment Preparation
Install the JDK
1. Install JDK 8 on all machines
2. Configure the environment variables:
vi /etc/profile
JAVA_HOME=/usr/local/jdk1.8.0_152
PATH=$PATH:$JAVA_HOME/bin
export JAVA_HOME
export PATH
source /etc/profile
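A quick way to confirm the JDK is on the PATH after sourcing the profile:
java -version
# should report: java version "1.8.0_152"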
Configure Passwordless SSH Login
1) Start on node-2
2) Generate a key pair: ssh-keygen -t rsa
3) From the /root/.ssh directory, copy the public key to each machine:
ssh-copy-id node-2
ssh-copy-id node-3
ssh-copy-id node-4
4) Repeat the steps above on node-3 and node-4
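Passwordless login can be verified from any node; each command should print the remote hostname without prompting for a password:
ssh node-3 hostname
ssh node-4 hostname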
Disable the Firewall and SELinux
1) Disable the firewall
systemctl stop firewalld
systemctl disable firewalld
2) Disable SELinux
vi /etc/selinux/config and set SELINUX=disabled
3) Reboot the system: reboot
4) Perform the steps above on every machine
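After the reboot, both can be confirmed off:
systemctl is-active firewalld   # should print "inactive"
getenforce                      # should print "Disabled"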
Install Hadoop
1. Create directories. On node-2, node-3, and node-4, create the following (a one-line command follows the list):
/mnt/data/hadoop/pid
/mnt/data/hadoop/tmp
/mnt/data/hadoop/dfs/name
/mnt/data/hadoop/dfs/data
/mnt/data/hadoop/dfs/namesecondary
/mnt/data/hadoop/dfs/edits
/mnt/data/hadoop/logs
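With bash brace expansion, a single mkdir -p call creates all seven directories:
mkdir -p /mnt/data/hadoop/{pid,tmp,logs} /mnt/data/hadoop/dfs/{name,data,namesecondary,edits}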
2. Extract hadoop-3.2.2.tar.gz to /opt/software/hadoop-3.2.2
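Assuming the tarball sits in the current directory (its top-level directory is hadoop-3.2.2, so extracting into /opt/software yields the path above):
mkdir -p /opt/software
tar -zxf hadoop-3.2.2.tar.gz -C /opt/software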
3. Configure the Hadoop environment variables. On node-2, node-3, and node-4:
vi /etc/profile and append the following:
HADOOP_HOME=/opt/software/hadoop-3.2.2
PATH=$PATH:$HADOOP_HOME/bin
export HADOOP_HOME
export PATH
Apply the changes: source /etc/profile
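A quick sanity check that the binaries resolve (note the start/stop scripts used later live in $HADOOP_HOME/sbin, which is why they are invoked with the sbin/ prefix below):
hadoop version   # should report Hadoop 3.2.2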
4. Configure hadoop-env.sh (this file and the config files in the following steps live in $HADOOP_HOME/etc/hadoop/)
export JAVA_HOME=/usr/local/jdk1.8.0_152
export HDFS_NAMENODE_USER=root
export HDFS_DATANODE_USER=root
export HDFS_SECONDARYNAMENODE_USER=root
export YARN_RESOURCEMANAGER_USER=root
export YARN_NODEMANAGER_USER=root
export HADOOP_PID_DIR=/mnt/data/hadoop/pid
export HADOOP_LOG_DIR=/mnt/data/hadoop/logs
5. Configure core-site.xml. Note that fs.defaultFS must use the NameNode RPC port (8020 by default in Hadoop 3); 9870 is the NameNode web UI port listed at the end of this guide and cannot double as the RPC address.
<configuration>
<property>
<name>fs.defaultFS</name>
<value>hdfs://node-2:8020</value>
</property>
<property>
<name>hadoop.tmp.dir</name>
<value>/mnt/data/hadoop/tmp</value>
</property>
</configuration>
6. Configure hdfs-site.xml
<configuration>
<property>
<name>dfs.namenode.secondary.http-address</name>
<value>node-3:9868</value>
</property>
<property>
<name>dfs.replication</name>
<value>2</value>
</property>
<property>
<name>dfs.namenode.name.dir</name>
<value>/mnt/data/hadoop/dfs/name</value>
</property>
<property>
<name>dfs.datanode.data.dir</name>
<value>/mnt/data/hadoop/dfs/data</value>
</property>
<property>
<name>dfs.namenode.checkpoint.dir</name>
<value>/mnt/data/hadoop/dfs/namesecondary</value>
</property>
<property>
<name>dfs.namenode.checkpoint.edits.dir</name>
<value>/mnt/data/hadoop/dfs/edits</value>
</property>
<property>
<name>dfs.datanode.handler.count</name>
<value>30</value>
</property>
<property>
<name>dfs.namenode.handler.count</name>
<value>30</value>
</property>
</configuration>
7. Configure mapred-site.xml
<configuration>
<property>
<name>mapreduce.framework.name</name>
<value>yarn</value>
</property>
</configuration>
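On Hadoop 3.x, MapReduce jobs submitted to YARN also need HADOOP_MAPRED_HOME in their environment, and because the JobHistory Server is started on node-4 below, pinning its addresses avoids clients falling back to the 0.0.0.0 defaults. A sketch of the extra properties (the values assume the install path used in this guide):
<property>
<name>yarn.app.mapreduce.am.env</name>
<value>HADOOP_MAPRED_HOME=/opt/software/hadoop-3.2.2</value>
</property>
<property>
<name>mapreduce.map.env</name>
<value>HADOOP_MAPRED_HOME=/opt/software/hadoop-3.2.2</value>
</property>
<property>
<name>mapreduce.reduce.env</name>
<value>HADOOP_MAPRED_HOME=/opt/software/hadoop-3.2.2</value>
</property>
<property>
<name>mapreduce.jobhistory.address</name>
<value>node-4:10020</value>
</property>
<property>
<name>mapreduce.jobhistory.webapp.address</name>
<value>node-4:19888</value>
</property>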
8. Configure yarn-site.xml
<configuration>
<property>
<name>yarn.nodemanager.aux-services</name>
<value>mapreduce_shuffle</value>
</property>
<property>
<name>yarn.log-aggregation-enable</name>
<value>true</value>
</property>
<property>
<name>yarn.log.server.url</name>
<value>http://node-4:19888/jobhistory/logs</value>
</property>
<property>
<name>yarn.log-aggregation.retain-seconds</name>
<value>604800</value>
</property>
</configuration>
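The sample above does not pin the ResourceManager host; without it, the NodeManagers on node-3 and node-4 would look for a ResourceManager on their own host (the 0.0.0.0 default). Since start-yarn.sh is run on node-2 below, you will likely also want:
<property>
<name>yarn.resourcemanager.hostname</name>
<value>node-2</value>
</property>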
9. Configure workers
vi workers
node-3
node-4
10. Configure log4j.properties, pointing the log directory at the path created earlier:
hadoop.log.dir=/mnt/data/hadoop/logs
11. Copy /opt/software/hadoop-3.2.2 to node-3 and node-4:
scp -r /opt/software/hadoop-3.2.2 root@node-3:/opt/software/hadoop-3.2.2
scp -r /opt/software/hadoop-3.2.2 root@node-4:/opt/software/hadoop-3.2.2
Format and Start Hadoop
1. Format the NameNode. On node-2 (from $HADOOP_HOME), run:
bin/hdfs namenode -format
2. Start HDFS. On node-2, run:
sbin/start-dfs.sh
To stop: sbin/stop-dfs.sh
3. Start YARN. On node-2, run:
sbin/start-yarn.sh
To stop: sbin/stop-yarn.sh
4. Start the JobHistory Server. On node-4, run:
bin/mapred --daemon start historyserver
To stop: bin/mapred --daemon stop historyserver
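If everything came up, jps on each node should show roughly the following daemons, matching the roles implied by the configuration above:
# node-2
jps   # NameNode, ResourceManager
# node-3
jps   # DataNode, SecondaryNameNode, NodeManager
# node-4
jps   # DataNode, NodeManager, JobHistoryServer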
Web UIs
- HDFS NameNode: http://node-2:9870/
- YARN ResourceManager: http://node-2:8088/
- MapReduce JobHistory Server: http://node-4:19888/
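As a final smoke test, one of the bundled examples can be run from node-2 (the jar path below is the standard location inside the Hadoop 3.2.2 distribution):
cd /opt/software/hadoop-3.2.2
bin/hadoop jar share/hadoop/mapreduce/hadoop-mapreduce-examples-3.2.2.jar pi 2 10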