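Prepare four virtual machines with OL 7.7 installed and assign them the static IPs 192.168.168.11, 192.168.168.12, 192.168.168.13 and 192.168.168.14. 192.168.168.11 is the master and the other three are slaves. The master runs the NameNode and is also a DataNode; 192.168.168.14 is a DataNode and also runs the Secondary NameNode.
First, edit /etc/hostname on each machine so that the hostnames are master, slave1, slave2 and slave3.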
Then add the following entries to /etc/hosts on every machine:
192.168.168.11 master
192.168.168.12 slave1
192.168.168.13 slave2
192.168.168.14 slave3
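Since the same four entries have to be added on every node, a small idempotent helper can make the step repeatable. The following is only a sketch: `add_cluster_hosts` is a hypothetical function name, and it assumes the entries use exactly the single-space format shown above.

```shell
# Sketch: append the cluster entries to a hosts file, skipping any that
# are already present. add_cluster_hosts is a hypothetical helper name.
add_cluster_hosts() {
    local hosts_file="$1"
    local entry
    while IFS= read -r entry; do
        # -x matches the whole line, -F treats the entry as a literal string
        if ! grep -qxF "$entry" "$hosts_file" 2>/dev/null; then
            printf '%s\n' "$entry" >> "$hosts_file"
        fi
    done <<'EOF'
192.168.168.11 master
192.168.168.12 slave1
192.168.168.13 slave2
192.168.168.14 slave3
EOF
}

# Example (as root, on each node):
#   add_cluster_hosts /etc/hosts
```

Running it a second time leaves the file unchanged, so it is safe to rerun after adding a node.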
Next, uninstall the bundled OpenJDK and install the Sun JDK instead; see https://www.cnblogs.com/yongestcat/p/13222963.html
Set up passwordless SSH login to the local machine:
ssh-keygen -t rsa -P '' -f ~/.ssh/id_rsa
cat ~/.ssh/id_rsa.pub >> ~/.ssh/authorized_keys
chmod 0600 ~/.ssh/authorized_keys
Set up trust between the nodes.
On the master, copy the public key to each slave:
scp ~/.ssh/id_rsa.pub hadoop@slave1:/home/hadoop/
scp ~/.ssh/id_rsa.pub hadoop@slave2:/home/hadoop/
scp ~/.ssh/id_rsa.pub hadoop@slave3:/home/hadoop/
On each slave, append the master's public key to that node's authorized_keys:
cat ~/id_rsa.pub >> ~/.ssh/authorized_keys
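The append above assumes that ~/.ssh already exists on the slave and has permissions sshd will accept. A slightly more defensive version of the same step might look like the sketch below; `install_pubkey` is a hypothetical helper name, and the 700/600 modes follow the usual OpenSSH expectations rather than anything stated in this post.

```shell
# Sketch: append a public key to authorized_keys exactly once, creating
# the .ssh directory with restrictive permissions if it is missing.
# install_pubkey is a hypothetical helper name.
install_pubkey() {
    local pubkey_file="$1"
    local ssh_dir="${2:-$HOME/.ssh}"
    local auth_file="$ssh_dir/authorized_keys"
    local key

    key="$(cat "$pubkey_file")" || return 1
    [ -n "$key" ] || return 1

    mkdir -p "$ssh_dir"
    chmod 700 "$ssh_dir"
    touch "$auth_file"
    chmod 600 "$auth_file"

    # Append only if this exact key line is not already present
    grep -qxF "$key" "$auth_file" || printf '%s\n' "$key" >> "$auth_file"
}

# Example, on each slave:
#   install_pubkey ~/id_rsa.pub
```

Afterwards, `ssh slave1` from the master should log in without asking for a password.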
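Install Hadoop on the master. The binary tarball unpacks into a hadoop-3.2.1 directory, which is then renamed to hadoop:
sudo tar -xzvf ~/hadoop-3.2.1.tar.gz -C /usr/local
cd /usr/local
sudo mv hadoop-3.2.1/ ./hadoop
sudo chown -R hadoop: ./hadoop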
Add the following to .bashrc and make it take effect (for example with source ~/.bashrc):
export HADOOP_HOME=/usr/local/hadoop
export PATH=$PATH:$HADOOP_HOME/bin:$HADOOP_HOME/sbin
export HADOOP_CONF_DIR=$HADOOP_HOME/etc/hadoop
Cluster configuration. The configuration files are in the /usr/local/hadoop/etc/hadoop directory.
Edit core-site.xml:
<configuration>
    <property>
        <name>hadoop.tmp.dir</name>
        <value>file:/usr/local/hadoop/tmp</value>
        <description>Abase for other temporary directories.</description>
    </property>
    <property>
        <name>fs.defaultFS</name>
        <value>hdfs://master:9000</value>
    </property>
</configuration>
Edit hdfs-site.xml:
<configuration>
    <property>
        <name>dfs.namenode.name.dir</name>
        <value>/home/hadoop/data/nameNode</value>
    </property>
    <property>
        <name>dfs.datanode.data.dir</name>
        <value>/home/hadoop/data/dataNode</value>
    </property>
    <property>
        <name>dfs.replication</name>
        <value>3</value>
    </property>
    <property>
        <name>dfs.secondary.http.address</name>
        <value>slave3:50090</value>
    </property>
</configuration>
Edit mapred-site.xml:
<configuration>
    <property>
        <name>mapreduce.framework.name</name>
        <value>yarn</value>
    </property>
    <property>
        <name>yarn.app.mapreduce.am.env</name>
        <value>HADOOP_MAPRED_HOME=$HADOOP_HOME</value>
    </property>
    <property>
        <name>mapreduce.map.env</name>
        <value>HADOOP_MAPRED_HOME=$HADOOP_HOME</value>
    </property>
    <property>
        <name>mapreduce.reduce.env</name>
        <value>HADOOP_MAPRED_HOME=$HADOOP_HOME</value>
    </property>
</configuration>
Edit yarn-site.xml:
<configuration>
    <property>
        <name>yarn.resourcemanager.hostname</name>
        <value>master</value>
    </property>
    <property>
        <name>yarn.nodemanager.aux-services</name>
        <value>mapreduce_shuffle</value>
    </property>
</configuration>
Edit hadoop-env.sh: find the JAVA_HOME setting and change it to the JDK directory:
export JAVA_HOME=/usr/lib/jvm/jdk1.8.0_191
Edit workers so that it lists one host per line:
[hadoop@master /usr/local/hadoop/etc/hadoop]$ vim workers
master
slave1
slave2
slave3
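The start scripts reach each host named in workers over SSH, so a name that is misspelled or missing from /etc/hosts only shows up as a failure at startup. A quick cross-check of the two files can catch that earlier. This is a sketch only; `check_workers_in_hosts` is a hypothetical helper name, and it assumes hostnames are resolved through the hosts file rather than DNS.

```shell
# Sketch: verify that every host listed in the workers file has an entry
# in the hosts file. check_workers_in_hosts is a hypothetical helper name.
check_workers_in_hosts() {
    local workers_file="$1"
    local hosts_file="${2:-/etc/hosts}"
    local worker missing=0

    # The "|| [ -n ... ]" part handles a last line without a trailing newline
    while IFS= read -r worker || [ -n "$worker" ]; do
        [ -z "$worker" ] && continue
        # A hosts line is "IP name [alias...]"; look for the name in fields 2+
        if ! awk -v h="$worker" '
                /^[[:space:]]*#/ { next }
                { for (i = 2; i <= NF; i++) if ($i == h) found = 1 }
                END { exit (found ? 0 : 1) }' "$hosts_file"; then
            echo "missing from $hosts_file: $worker" >&2
            missing=1
        fi
    done < "$workers_file"

    return "$missing"
}

# Example:
#   check_workers_in_hosts /usr/local/hadoop/etc/hadoop/workers /etc/hosts
```

The function returns a non-zero status and names the offending host on stderr if any worker has no hosts entry.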
Finally, copy the configured /usr/local/hadoop directory to the other nodes:
sudo scp -r /usr/local/hadoop/ slave1:/usr/local/
sudo scp -r /usr/local/hadoop/ slave2:/usr/local/
sudo scp -r /usr/local/hadoop/ slave3:/usr/local/
Then, on each of those nodes, change the owner of the directory to hadoop.
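The ownership change can be done by logging in to each slave and running the same chown used on the master, or from the master in a loop. The sketch below takes the second route. It assumes the hadoop user can run sudo on the slaves; `run` and `fix_owner_on_slaves` are hypothetical helper names, and setting DRY_RUN prints the commands instead of executing them.

```shell
# Sketch: run the ownership fix on every slave from the master.
# run and fix_owner_on_slaves are hypothetical helper names.
run() {
    if [ -n "${DRY_RUN:-}" ]; then
        # Dry run: show the command instead of executing it
        printf '%s\n' "$*"
    else
        "$@"
    fi
}

fix_owner_on_slaves() {
    local host
    for host in slave1 slave2 slave3; do
        # -t allocates a terminal so that sudo on the remote side can prompt
        run ssh -t "hadoop@$host" sudo chown -R hadoop: /usr/local/hadoop
    done
}

# Preview the commands first, then run them for real:
#   DRY_RUN=1 fix_owner_on_slaves
#   fix_owner_on_slaves
```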
Turn off the firewall:
sudo systemctl stop firewalld
sudo systemctl disable firewalld
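Format HDFS. Do this once before the first start and never again afterwards. Run it on the master, because the command initializes the NameNode metadata directory on the machine where it is executed (the older hadoop namenode -format form still works but prints a deprecation warning):
/usr/local/hadoop/bin/hdfs namenode -format
If the output contains "successfully formatted", the format succeeded.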
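Start HDFS on the cluster with start-dfs.sh.
Use the jps command to check which daemons are running.
The cluster can be monitored in a browser through port 9870 on the master: http://192.168.168.11:9870/
The cluster status can also be checked from the command line with hadoop dfsadmin -report (deprecated in favor of hdfs dfsadmin -report, as the warnings in the output note):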
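With the roles described at the start of this post, the master should show NameNode and DataNode after start-dfs.sh, slave1 and slave2 should show DataNode, and slave3 should show DataNode and SecondaryNameNode. Checking this by eye is easy to get wrong, so here is a small sketch that tests the jps output for the expected names. `check_daemons` is a hypothetical helper name; it relies on jps printing one "pid name" pair per line.

```shell
# Sketch: read jps output on stdin and report any expected daemon that
# is not listed. check_daemons is a hypothetical helper name.
check_daemons() {
    local jps_output daemon missing=0
    jps_output="$(cat)"

    for daemon in "$@"; do
        # Compare the second column exactly, so that "NameNode" does not
        # match a line that only contains "SecondaryNameNode"
        if ! printf '%s\n' "$jps_output" |
                awk -v d="$daemon" '$2 == d { ok = 1 } END { exit (ok ? 0 : 1) }'; then
            echo "not running: $daemon" >&2
            missing=1
        fi
    done

    return "$missing"
}

# On the master:  jps | check_daemons NameNode DataNode
# On slave3:      jps | check_daemons DataNode SecondaryNameNode
```

The exact column match is a deliberate choice: a plain substring search for NameNode would also succeed on slave3, where only the Secondary NameNode runs.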
[hadoop@master ~]$ hadoop dfsadmin -report
WARNING: Use of this script to execute dfsadmin is deprecated.
WARNING: Attempting to execute replacement "hdfs dfsadmin" instead.

Configured Capacity: 201731358720 (187.88 GB)
Present Capacity: 162921230336 (151.73 GB)
DFS Remaining: 162921181184 (151.73 GB)
DFS Used: 49152 (48 KB)
DFS Used%: 0.00%
Replicated Blocks:
    Under replicated blocks: 0
    Blocks with corrupt replicas: 0
    Missing blocks: 0
    Missing blocks (with replication factor 1): 0
    Low redundancy blocks with highest priority to recover: 0
    Pending deletion blocks: 0
Erasure Coded Block Groups:
    Low redundancy block groups: 0
    Block groups with corrupt internal blocks: 0
    Missing block groups: 0
    Low redundancy blocks with highest priority to recover: 0
    Pending deletion blocks: 0

-------------------------------------------------
Live datanodes (4):

Name: 192.168.168.11:9866 (master)
Hostname: master
Decommission Status : Normal
Configured Capacity: 50432839680 (46.97 GB)
DFS Used: 12288 (12 KB)
Non DFS Used: 9796546560 (9.12 GB)
DFS Remaining: 40636280832 (37.85 GB)
DFS Used%: 0.00%
DFS Remaining%: 80.58%
Configured Cache Capacity: 0 (0 B)
Cache Used: 0 (0 B)
Cache Remaining: 0 (0 B)
Cache Used%: 100.00%
Cache Remaining%: 0.00%
Xceivers: 1
Last contact: Fri Jul 03 11:14:44 CST 2020
Last Block Report: Fri Jul 03 11:10:35 CST 2020
Num of Blocks: 0

Name: 192.168.168.12:9866 (slave1)
Hostname: slave1
Decommission Status : Normal
Configured Capacity: 50432839680 (46.97 GB)
DFS Used: 12288 (12 KB)
Non DFS Used: 9710411776 (9.04 GB)
DFS Remaining: 40722415616 (37.93 GB)
DFS Used%: 0.00%
DFS Remaining%: 80.75%
Configured Cache Capacity: 0 (0 B)
Cache Used: 0 (0 B)
Cache Remaining: 0 (0 B)
Cache Used%: 100.00%
Cache Remaining%: 0.00%
Xceivers: 1
Last contact: Fri Jul 03 11:14:44 CST 2020
Last Block Report: Fri Jul 03 11:10:35 CST 2020
Num of Blocks: 0

Name: 192.168.168.13:9866 (slave2)
Hostname: slave2
Decommission Status : Normal
Configured Capacity: 50432839680 (46.97 GB)
DFS Used: 12288 (12 KB)
Non DFS Used: 9657286656 (8.99 GB)
DFS Remaining: 40775540736 (37.98 GB)
DFS Used%: 0.00%
DFS Remaining%: 80.85%
Configured Cache Capacity: 0 (0 B)
Cache Used: 0 (0 B)
Cache Remaining: 0 (0 B)
Cache Used%: 100.00%
Cache Remaining%: 0.00%
Xceivers: 1
Last contact: Fri Jul 03 11:14:44 CST 2020
Last Block Report: Fri Jul 03 11:10:35 CST 2020
Num of Blocks: 0

Name: 192.168.168.14:9866 (slave3)
Hostname: slave3
Decommission Status : Normal
Configured Capacity: 50432839680 (46.97 GB)
DFS Used: 12288 (12 KB)
Non DFS Used: 9645883392 (8.98 GB)
DFS Remaining: 40786944000 (37.99 GB)
DFS Used%: 0.00%
DFS Remaining%: 80.87%
Configured Cache Capacity: 0 (0 B)
Cache Used: 0 (0 B)
Cache Remaining: 0 (0 B)
Cache Used%: 100.00%
Cache Remaining%: 0.00%
Xceivers: 1
Last contact: Fri Jul 03 11:14:44 CST 2020
Last Block Report: Fri Jul 03 11:10:35 CST 2020
Num of Blocks: 0

[hadoop@master ~]$
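The line of the report that matters most right after setup is "Live datanodes (4):", since all four hosts in workers should have registered. A sketch of pulling that number out of the report for a scripted check follows. `live_datanodes` and `expect_live_datanodes` are hypothetical helper names, and the pattern assumes the report keeps the line format seen in the output above.

```shell
# Sketch: extract the live DataNode count from a dfsadmin report on stdin.
# live_datanodes and expect_live_datanodes are hypothetical helper names.
live_datanodes() {
    # Print only the number inside "Live datanodes (N):"
    sed -n 's/^Live datanodes (\([0-9][0-9]*\)):.*/\1/p'
}

expect_live_datanodes() {
    local expected="$1"
    local actual
    actual="$(live_datanodes)"
    if [ "$actual" != "$expected" ]; then
        echo "expected $expected live datanodes, found '${actual}'" >&2
        return 1
    fi
}

# Example:
#   hdfs dfsadmin -report | expect_live_datanodes 4
```

If a DataNode failed to start, the count comes back lower than expected and the check returns a non-zero status.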
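start-yarn.sh starts YARN, which can be monitored through port 8088 on the master.
To start the whole cluster, HDFS and YARN together:
/usr/local/hadoop/sbin/start-all.sh
To stop the cluster:
/usr/local/hadoop/sbin/stop-all.sh
That's all. I am recording the process here for future reference.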