Environment Setup

Source: https://www.cnblogs.com/qw77/p/18102996


Chapter 4: Hadoop Configuration Files and Parameters

Lab 1: Hadoop Fully Distributed Configuration

1.1 Objectives

After completing this lab, you should be able to:

  • Configure fully distributed Hadoop
  • Install fully distributed Hadoop
  • Understand the meaning of the Hadoop configuration file parameters

1.2 Requirements

  • Be familiar with installing fully distributed Hadoop
  • Understand the purpose of the Hadoop configuration files

1.3 Procedure

1.3.1 Task 1: Install Hadoop on the Master Node

1.3.1.1 Step 1: Extract the JDK and hadoop-2.7.1.tar.gz archives into /usr/local/src
[root@master ~]# tar zvxf jdk-8u152-linux-x64.tar.gz -C /usr/local/src/

[root@master ~]# tar zvxf hadoop-2.7.1.tar.gz -C /usr/local/src/
1.3.1.2 Step 2: Rename the hadoop-2.7.1 directory to hadoop (and jdk1.8.0_152 to jdk)
[root@master ~]# cd /usr/local/src/
[root@master src]# ls
hadoop-2.7.1  jdk1.8.0_152
[root@master src]# mv hadoop-2.7.1/ hadoop
[root@master src]# mv jdk1.8.0_152/ jdk
[root@master src]# ls
hadoop  jdk
1.3.1.3 Step 3: Configure the Hadoop environment variables

[root@master ~]# vi /etc/profile.d/hadoop.sh

Note: environment variables were already configured in Chapter 2 for the single-node Hadoop system; delete that earlier configuration before adding the lines below.

# Add the following lines
export JAVA_HOME=/usr/local/src/jdk
export HADOOP_HOME=/usr/local/src/hadoop
export PATH=${JAVA_HOME}/bin:${HADOOP_HOME}/bin:${HADOOP_HOME}/sbin:$PATH
1.3.1.4 Step 4: Apply the Hadoop environment variables
[root@master ~]# source /etc/profile.d/hadoop.sh 
[root@master ~]# echo $PATH
/usr/local/src/jdk/bin:/usr/local/src/hadoop/bin:/usr/local/src/hadoop/sbin:/usr/local/sbin:/usr/local/bin:/usr/sbin:/usr/bin:/root/bin
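
As an optional sanity check (not part of the original steps), verify in the same shell that the variables resolve; both commands should succeed without errors:

[root@master ~]# echo $HADOOP_HOME
[root@master ~]# hadoop version
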
1.3.1.5 Step 5: Edit the hadoop-env.sh configuration file
[root@master ~]# vi /usr/local/src/hadoop/etc/hadoop/hadoop-env.sh

# Add the following line
export JAVA_HOME=/usr/local/src/jdk

1.3.2 Task 2: Configure hdfs-site.xml

Edit the hdfs-site.xml configuration file with the following command.

[root@master ~]# vi /usr/local/src/hadoop/etc/hadoop/hdfs-site.xml

# Add the following properties between the <configuration> and </configuration> tags
<configuration>
		<property>
				<name>dfs.namenode.name.dir</name>
				<value>file:/usr/local/src/hadoop/dfs/name</value>
		</property>
		<property>
				<name>dfs.datanode.data.dir</name>
				<value>file:/usr/local/src/hadoop/dfs/data</value>
		</property>
		<property>
				<name>dfs.replication</name>
				<value>2</value>
		</property>
</configuration>

Create the directories:
[root@master ~]# mkdir -p /usr/local/src/hadoop/dfs/{name,data}

Hadoop's distributed file system HDFS normally stores data redundantly, with a replication factor of 3; that is, each piece of data is kept as three replicas. Since this cluster has only two DataNodes, the dfs.replication setting above sets the number of HDFS replicas to 2.
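
Once the cluster is up and running (Chapter 5), the effective replication of a stored file can be checked with fsck; a hedged example, assuming a file /input/data.txt exists:

[hadoop@master ~]$ hdfs fsck /input/data.txt -files -blocks
# each block should be reported with "repl=2" when dfs.replication=2 is in effect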

1.3.3 Task 3: Configure core-site.xml

[root@master ~]# vi /usr/local/src/hadoop/etc/hadoop/core-site.xml

# Add the following properties between the <configuration> and </configuration> tags

<configuration>
		<property>
				<name>fs.defaultFS</name>
				<value>hdfs://master:9000</value>
		</property>
		<property>
				<name>io.file.buffer.size</name>
		<value>131072</value>
		</property>
		<property>
				<name>hadoop.tmp.dir</name>
				<value>file:/usr/local/src/hadoop/tmp</value>
		</property>
</configuration>

# After saving the configuration, create the directory
[root@master ~]# mkdir -p /usr/local/src/hadoop/tmp

If hadoop.tmp.dir is not configured, the system uses the default temporary directory /tmp/hadoop-hadoop. That directory is wiped on every Linux reboot, so the Hadoop file system would have to be reformatted after each reboot or Hadoop would fail to run.
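
After the cluster has been formatted and started (Chapter 5), a simple way to confirm the configured directory is actually being used is to look inside it; an optional sketch (the exact contents vary by Hadoop version and which daemons have run):

[hadoop@master ~]$ ls /usr/local/src/hadoop/tmp
# non-empty after HDFS/YARN have run; if it stays empty, hadoop.tmp.dir was not picked up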

1.3.4 Task 4: Configure mapred-site.xml

[root@master ~]# cd /usr/local/src/hadoop/etc/hadoop/
[root@master hadoop]# cp mapred-site.xml.template mapred-site.xml

# Add the following properties between the <configuration> and </configuration> tags

<configuration>
		<property>
				<name>mapreduce.framework.name</name>
		<value>yarn</value>
		</property>
		<property>
				<name>mapreduce.jobhistory.address</name>
				<value>master:10020</value>
		</property>
		<property>
				<name>mapreduce.jobhistory.webapp.address</name>
				<value>master:19888</value>
		</property>
</configuration>
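
The two jobhistory addresses above are only answered while the JobHistory Server daemon is running; it is not started by the HDFS/YARN startup scripts used later. If needed, it can be started manually, as in this optional sketch:

[hadoop@master ~]$ mr-jobhistory-daemon.sh start historyserver
[hadoop@master ~]$ jps
# the jps output should now also include a JobHistoryServer process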

1.3.5 Task 5: Configure yarn-site.xml

[root@master hadoop]# vi /usr/local/src/hadoop/etc/hadoop/yarn-site.xml

# Add the following properties between the <configuration> and </configuration> tags

<configuration>
<!-- Site specific YARN configuration properties -->
		<property>
				<name>yarn.resourcemanager.address</name>
				<value>master:8032</value>
		</property>
		<property>
				<name>yarn.resourcemanager.scheduler.address</name>
				<value>master:8030</value>
		</property>
		<property>
				<name>yarn.resourcemanager.webapp.address</name>
				<value>master:8088</value>
		</property>
		<property>
				<name>yarn.resourcemanager.resource-tracker.address</name>
				<value>master:8031</value>
		</property>
		<property>
				<name>yarn.resourcemanager.admin.address</name>
				<value>master:8033</value>
		</property>
		<property>
				<name>yarn.nodemanager.aux-services</name>
				<value>mapreduce_shuffle</value>
		</property>
		<property>
			  <name>yarn.nodemanager.aux-services.mapreduce.shuffle.class</name>
			  <value>org.apache.hadoop.mapred.ShuffleHandler</value>
		</property>
</configuration>
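
After the cluster has been started (Chapter 5), these addresses can be exercised from the command line. For example, NodeManagers register with the ResourceManager through the resource-tracker address, and the registered nodes can then be listed; an optional check:

[hadoop@master ~]$ yarn node -list
# each slave should appear in state RUNNING once its NodeManager has registered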

1.3.6 Task 6: Other Hadoop Configuration

1.3.6.1 Step 1: Configure the masters file
# Edit the masters configuration file
[root@master ~]# vi /usr/local/src/hadoop/etc/hadoop/masters

# Add the following line
10.10.10.128
1.3.6.2 Step 2: Configure the slaves file
# Edit the slaves configuration file
[root@master ~]# vi /usr/local/src/hadoop/etc/hadoop/slaves

# Delete localhost and add the following lines
10.10.10.129
10.10.10.130
1.3.6.3 Step 3: Create a user and change directory ownership
# Create the user
[root@master ~]# useradd hadoop 
[root@master ~]# echo 'hadoop' | passwd --stdin hadoop
Changing password for user hadoop.
passwd: all authentication tokens updated successfully.

# Change directory ownership
[root@master ~]# chown -R hadoop.hadoop /usr/local/src/
[root@master ~]# cd /usr/local/src/
[root@master src]# ll
total 0
drwxr-xr-x 11 hadoop hadoop 171 Mar 27 01:51 hadoop
drwxr-xr-x  8 hadoop hadoop 255 Sep 14  2017 jdk
1.3.6.4 Step 4: Configure passwordless SSH from master to all slave nodes
[root@master ~]# ssh-keygen -t rsa

Generating public/private rsa key pair.
Enter file in which to save the key (/root/.ssh/id_rsa): 
Created directory '/root/.ssh'.
Enter passphrase (empty for no passphrase): 
Enter same passphrase again: 
Your identification has been saved in /root/.ssh/id_rsa.
Your public key has been saved in /root/.ssh/id_rsa.pub.
The key fingerprint is:
SHA256:Ibeslip4Bo9erREJP37u7qhlwaEeMOCg8DlJGSComhk root@master
The key's randomart image is:
(RSA 2048 randomart image garbled in extraction; omitted)

[root@master ~]# ssh-copy-id root@slave1
/usr/bin/ssh-copy-id: INFO: Source of key(s) to be installed: "/root/.ssh/id_rsa.pub"
The authenticity of host 'slave1 (10.10.10.129)' can't be established.
ECDSA key fingerprint is SHA256:Z643OMlGh0yMEc5i85oZ7c21NHdkzSZD9hY9K39xzP4.
ECDSA key fingerprint is MD5:e0:ef:47:5f:ad:75:9a:44:08:bc:f2:10:8e:d6:53:4a.
Are you sure you want to continue connecting (yes/no)? yes
/usr/bin/ssh-copy-id: INFO: attempting to log in with the new key(s), to filter out any that are already installed
/usr/bin/ssh-copy-id: INFO: 1 key(s) remain to be installed -- if you are prompted now it is to install the new keys
root@slave1's password: 
Number of key(s) added: 1
Now try logging into the machine, with:   "ssh 'root@slave1'"
and check to make sure that only the key(s) you wanted were added.

[root@master ~]# ssh-copy-id root@slave2
/usr/bin/ssh-copy-id: INFO: Source of key(s) to be installed: "/root/.ssh/id_rsa.pub"
The authenticity of host 'slave2 (10.10.10.130)' can't be established.
ECDSA key fingerprint is SHA256:Z643OMlGh0yMEc5i85oZ7c21NHdkzSZD9hY9K39xzP4.
ECDSA key fingerprint is MD5:e0:ef:47:5f:ad:75:9a:44:08:bc:f2:10:8e:d6:53:4a.
Are you sure you want to continue connecting (yes/no)? yes
/usr/bin/ssh-copy-id: INFO: attempting to log in with the new key(s), to filter out any that are already installed
/usr/bin/ssh-copy-id: INFO: 1 key(s) remain to be installed -- if you are prompted now it is to install the new keys
root@slave2's password: 
Number of key(s) added: 1  
Now try logging into the machine, with:   "ssh 'root@slave2'"
and check to make sure that only the key(s) you wanted were added.
   
[root@master ~]# ssh slave1
Last login: Sun Mar 27 02:58:38 2022 from master
[root@slave1 ~]# exit
logout
Connection to slave1 closed.

[root@master ~]# ssh slave2
Last login: Sun Mar 27 00:26:12 2022 from 10.10.10.1
[root@slave2 ~]# exit
logout
Connection to slave2 closed.
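
The manual logins above can also be checked in one pass; a small optional loop:

[root@master ~]# for h in slave1 slave2; do ssh root@$h hostname; done
# prints each slave's hostname with no password prompt if the keys were installed correctly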
1.3.6.5 Step 5: Sync all files under /usr/local/src/ to every slave node
[root@master ~]# scp -r /usr/local/src/* root@slave1:/usr/local/src/

[root@master ~]# scp -r /usr/local/src/* root@slave2:/usr/local/src/

[root@master ~]# scp /etc/profile.d/hadoop.sh root@slave1:/etc/profile.d/
hadoop.sh                                   100%  151    45.9KB/s   00:00 
   
[root@master ~]# scp /etc/profile.d/hadoop.sh root@slave2:/etc/profile.d/
hadoop.sh                                   100%  151    93.9KB/s   00:00    
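
To confirm that the copy reached both slaves, the remote directories can be listed from master; an optional check:

[root@master ~]# for h in slave1 slave2; do ssh root@$h 'ls /usr/local/src'; done
# both nodes should list: hadoop  jdk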
1.3.6.6 Step 6: Run the following commands on all slave nodes
(1) On slave1

[root@slave1 ~]# useradd hadoop 
[root@slave1 ~]# echo 'hadoop' | passwd --stdin hadoop 
Changing password for user hadoop.
passwd: all authentication tokens updated successfully.

[root@slave1 ~]# chown -R hadoop.hadoop /usr/local/src/
[root@slave1 ~]# ll /usr/local/src/
total 0
drwxr-xr-x 11 hadoop hadoop 171 Mar 27 03:07 hadoop
drwxr-xr-x  8 hadoop hadoop 255 Mar 27 03:07 jdk

[root@slave1 ~]# source /etc/profile.d/hadoop.sh 

[root@slave1 ~]# echo $PATH
/usr/local/src/jdk/bin:/usr/local/src/hadoop/bin:/usr/local/src/hadoop/sbin:/usr/local/sbin:/usr/local/bin:/usr/sbin:/usr/bin:/root/bin

(2) On slave2

[root@slave2 ~]# useradd hadoop
[root@slave2 ~]# echo 'hadoop' | passwd --stdin hadoop
Changing password for user hadoop.
passwd: all authentication tokens updated successfully.

[root@slave2 ~]# chown -R hadoop.hadoop /usr/local/src/
[root@slave2 ~]# ll /usr/local/src/
total 0
drwxr-xr-x 11 hadoop hadoop 171 Mar 27 03:09 hadoop
drwxr-xr-x  8 hadoop hadoop 255 Mar 27 03:09 jdk

[root@slave2 ~]# source /etc/profile.d/hadoop.sh 

[root@slave2 ~]# echo $PATH
/usr/local/src/jdk/bin:/usr/local/src/hadoop/bin:/usr/local/src/hadoop/sbin:/usr/local/sbin:/usr/local/bin:/usr/sbin:/usr/bin:/root/bin

Chapter 5: Running the Hadoop Cluster

Lab 1: Running the Hadoop Cluster

1.1 Objectives

After completing this lab, you should be able to:

  • Check the running state of Hadoop
  • Format the Hadoop file system
  • View the Hadoop Java processes
  • View the HDFS report
  • View node status
  • Stop the Hadoop processes

1.2 Requirements

  • Be familiar with checking the running state of Hadoop
  • Be familiar with stopping the Hadoop processes

1.3 Procedure

1.3.1 Task 1: Format Hadoop

1.3.1.1 Step 1: Format the NameNode

Formatting erases the data on the NameNode. HDFS must be formatted before it is started for the first time; subsequent starts must not format again, otherwise the DataNode processes will be missing. Also, once HDFS has been run, the Hadoop working directory (set to /usr/local/src/hadoop/tmp in this book) contains data; if you need to reformat, you must first delete the data in the working directory, otherwise formatting will run into problems.
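
If a re-format ever becomes necessary, the working data described above must be cleared first. A hedged sketch, assuming the directory layout configured in Chapter 4 and that all Hadoop daemons are stopped:

[root@master ~]# rm -rf /usr/local/src/hadoop/tmp/*
[root@master ~]# rm -rf /usr/local/src/hadoop/dfs/name/* /usr/local/src/hadoop/dfs/data/*
# the dfs/data directories on the slave nodes must be cleared in the same way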

Run the following commands to format the NameNode:

[root@master ~]# su - hadoop 
Last login: Fri Apr  1 23:34:46 CST 2022 on pts/1

[hadoop@master ~]$ cd /usr/local/src/hadoop/
[hadoop@master hadoop]$ ./bin/hdfs namenode -format
22/04/02 01:22:42 INFO namenode.NameNode: STARTUP_MSG: 
/************************************************************

1.3.1.2 Step 2: Start the NameNode

[hadoop@master hadoop]$ hadoop-daemon.sh start namenode
namenode running as process 11868. Stop it first.

The message above indicates the NameNode was already running; when it is not, the command starts the daemon and prints the path of its log file.

1.3.2 Task 2: View the Java Processes

After startup completes, use the jps command to check whether it succeeded. jps is a Java-supplied command that lists the PIDs of all current Java processes.

[hadoop@master hadoop]$ jps
12122 Jps
11868 NameNode

1.3.2.1 Step 1: Switch to the hadoop user

[hadoop@master ~]$ su - hadoop 
Password: 
Last login: Sat Apr  2 01:22:13 CST 2022 on pts/1
Last failed login: Sat Apr  2 04:47:08 CST 2022 on pts/1
There was 1 failed login attempt since the last successful login.

1.3.3 Task 3: View the HDFS Report

No DataNodes are running yet, so all capacity values in the report below are zero.

[hadoop@master ~]$ hdfs dfsadmin -report
Configured Capacity: 0 (0 B)
Present Capacity: 0 (0 B)
DFS Remaining: 0 (0 B)
DFS Used: 0 (0 B)
DFS Used%: NaN%
Under replicated blocks: 0
Blocks with corrupt replicas: 0
Missing blocks: 0
Missing blocks (with replication factor 1): 0

-------------------------------------------------

1.3.3.1 Step 1: Generate keys for the hadoop user

[hadoop@master ~]$ ssh-keygen -t rsa
Generating public/private rsa key pair.
Enter file in which to save the key (/home/hadoop/.ssh/id_rsa): 
Created directory '/home/hadoop/.ssh'.
Enter passphrase (empty for no passphrase): 
Enter same passphrase again: 
Your identification has been saved in /home/hadoop/.ssh/id_rsa.
Your public key has been saved in /home/hadoop/.ssh/id_rsa.pub.
The key fingerprint is:
SHA256:nW/cVxmRp5Ht9TKGT61OmGbhQtkBdpHyS5prGhx24pI hadoop@master.example.com
The key's randomart image is:
(RSA 2048 randomart image garbled in extraction; omitted)

[hadoop@master ~]$ ssh-copy-id slave1
/bin/ssh-copy-id: INFO: Source of key(s) to be installed: "/home/hadoop/.ssh/id_rsa.pub"
The authenticity of host 'slave1 (10.10.10.129)' can't be established.
ECDSA key fingerprint is SHA256:BE2tM2BCeGBc6aGRKBTbMTh80VP9noFKzqDknL+0Jes.
ECDSA key fingerprint is MD5:a2:25:9c:bc:d0:df:fc:ec:44:4a:c0:10:26:f2:ef:c7.
Are you sure you want to continue connecting (yes/no)? yes
/bin/ssh-copy-id: INFO: attempting to log in with the new key(s), to filter out any that are already installed
/bin/ssh-copy-id: INFO: 1 key(s) remain to be installed -- if you are prompted now it is to install the new keys
hadoop@slave1's password: 

Number of key(s) added: 1

Now try logging into the machine, with:   "ssh 'slave1'"
and check to make sure that only the key(s) you wanted were added.

[hadoop@master ~]$ ssh-copy-id slave2
/bin/ssh-copy-id: INFO: Source of key(s) to be installed: "/home/hadoop/.ssh/id_rsa.pub"
The authenticity of host 'slave2 (10.10.10.130)' can't be established.
ECDSA key fingerprint is SHA256:BE2tM2BCeGBc6aGRKBTbMTh80VP9noFKzqDknL+0Jes.
ECDSA key fingerprint is MD5:a2:25:9c:bc:d0:df:fc:ec:44:4a:c0:10:26:f2:ef:c7.
Are you sure you want to continue connecting (yes/no)? yes
/bin/ssh-copy-id: INFO: attempting to log in with the new key(s), to filter out any that are already installed
/bin/ssh-copy-id: INFO: 1 key(s) remain to be installed -- if you are prompted now it is to install the new keys
hadoop@slave2's password: 

Number of key(s) added: 1

Now try logging into the machine, with:   "ssh 'slave2'"
and check to make sure that only the key(s) you wanted were added.

[hadoop@master ~]$ ssh-copy-id master
/bin/ssh-copy-id: INFO: Source of key(s) to be installed: "/home/hadoop/.ssh/id_rsa.pub"
The authenticity of host 'master (10.10.10.128)' can't be established.
ECDSA key fingerprint is SHA256:BE2tM2BCeGBc6aGRKBTbMTh80VP9noFKzqDknL+0Jes.
ECDSA key fingerprint is MD5:a2:25:9c:bc:d0:df:fc:ec:44:4a:c0:10:26:f2:ef:c7.
Are you sure you want to continue connecting (yes/no)? yes
/bin/ssh-copy-id: INFO: attempting to log in with the new key(s), to filter out any that are already installed
/bin/ssh-copy-id: INFO: 1 key(s) remain to be installed -- if you are prompted now it is to install the new keys
hadoop@master's password: 

Number of key(s) added: 1

Now try logging into the machine, with:   "ssh 'master'"
and check to make sure that only the key(s) you wanted were added.

1.3.4 Task 4: Stop HDFS (stop-dfs.sh)

[hadoop@master ~]$ stop-dfs.sh 
Stopping namenodes on [master]
master: stopping namenode
10.10.10.129: no datanode to stop
10.10.10.130: no datanode to stop
Stopping secondary namenodes [0.0.0.0]
The authenticity of host '0.0.0.0 (0.0.0.0)' can't be established.
ECDSA key fingerprint is SHA256:BE2tM2BCeGBc6aGRKBTbMTh80VP9noFKzqDknL+0Jes.
ECDSA key fingerprint is MD5:a2:25:9c:bc:d0:df:fc:ec:44:4a:c0:10:26:f2:ef:c7.
Are you sure you want to continue connecting (yes/no)? yes
0.0.0.0: Warning: Permanently added '0.0.0.0' (ECDSA) to the list of known hosts.
0.0.0.0: no secondarynamenode to stop

1.3.4.1 Restart and Verify

[hadoop@master ~]$ start-dfs.sh 
Starting namenodes on [master]
master: starting namenode, logging to /usr/local/src/hadoop/logs/hadoop-hadoop-namenode-master.example.com.out
10.10.10.129: starting datanode, logging to /usr/local/src/hadoop/logs/hadoop-hadoop-datanode-slave1.out
10.10.10.130: starting datanode, logging to /usr/local/src/hadoop/logs/hadoop-hadoop-datanode-slave2.out
Starting secondary namenodes [0.0.0.0]
0.0.0.0: starting secondarynamenode, logging to /usr/local/src/hadoop/logs/hadoop-hadoop-secondarynamenode-master.example.com.out

[hadoop@master ~]$ start-yarn.sh 
starting yarn daemons
starting resourcemanager, logging to /usr/local/src/hadoop/logs/yarn-hadoop-resourcemanager-master.example.com.out
10.10.10.129: starting nodemanager, logging to /usr/local/src/hadoop/logs/yarn-hadoop-nodemanager-slave1.out
10.10.10.130: starting nodemanager, logging to /usr/local/src/hadoop/logs/yarn-hadoop-nodemanager-slave2.out

[hadoop@master ~]$ jps
12934 NameNode
13546 Jps
13131 SecondaryNameNode
13291 ResourceManager

If ResourceManager is visible on master and NodeManager on the slaves, the startup succeeded:
[hadoop@master ~]$ jps
12934 NameNode
13546 Jps
13131 SecondaryNameNode
13291 ResourceManager

[root@slave1 ~]# jps
11906 NodeManager
11797 DataNode
12037 Jps

[root@slave2 ~]# jps
12758 NodeManager
12648 DataNode
12889 Jps

[hadoop@master ~]$ hdfs dfs -mkdir /input
[hadoop@master ~]$ hdfs dfs -ls /
Found 1 items
drwxr-xr-x   - hadoop supergroup          0 2022-04-02 05:18 /input
[hadoop@master ~]$ mkdir ~/input
[hadoop@master ~]$ vim ~/input/data.txt
Hello World
Hello Hadoop
Hello Huasan

[hadoop@master ~]$ hdfs dfs -put ~/input/data.txt /input
[hadoop@master ~]$ hdfs dfs -cat /input/data.txt
Hello World
Hello Hadoop
Hello Huasan
[hadoop@master ~]$ hadoop jar /usr/local/src/hadoop/share/hadoop/mapreduce/hadoop-mapreduce-examples-2.7.1.jar wordcount /input/data.txt /output
22/04/02 05:31:20 INFO client.RMProxy: Connecting to ResourceManager at /0.0.0.0:8032
22/04/02 05:31:21 INFO input.FileInputFormat: Total input paths to process : 1
22/04/02 05:31:21 INFO mapreduce.JobSubmitter: number of splits:1
22/04/02 05:31:21 INFO mapreduce.JobSubmitter: Submitting tokens for job: job_1648846845675_0001
22/04/02 05:31:22 INFO impl.YarnClientImpl: Submitted application application_1648846845675_0001
22/04/02 05:31:22 INFO mapreduce.Job: The url to track the job: http://master:8088/proxy/application_1648846845675_0001/
22/04/02 05:31:22 INFO mapreduce.Job: Running job: job_1648846845675_0001
22/04/02 05:31:30 INFO mapreduce.Job: Job job_1648846845675_0001 running in uber mode : false
22/04/02 05:31:30 INFO mapreduce.Job:  map 0% reduce 0%
22/04/02 05:31:38 INFO mapreduce.Job:  map 100% reduce 0%
22/04/02 05:31:42 INFO mapreduce.Job:  map 100% reduce 100%
22/04/02 05:31:42 INFO mapreduce.Job: Job job_1648846845675_0001 completed successfully
22/04/02 05:31:42 INFO mapreduce.Job: Counters: 49
    File System Counters
            FILE: Number of bytes read=56
            FILE: Number of bytes written=230931
            FILE: Number of read operations=0
            FILE: Number of large read operations=0
            FILE: Number of write operations=0
            HDFS: Number of bytes read=136
            HDFS: Number of bytes written=34
            HDFS: Number of read operations=6
            HDFS: Number of large read operations=0
            HDFS: Number of write operations=2
    Job Counters 
            Launched map tasks=1
            Launched reduce tasks=1
            Data-local map tasks=1
            Total time spent by all maps in occupied slots (ms)=5501
            Total time spent by all reduces in occupied slots (ms)=1621
            Total time spent by all map tasks (ms)=5501
            Total time spent by all reduce tasks (ms)=1621
            Total vcore-seconds taken by all map tasks=5501
            Total vcore-seconds taken by all reduce tasks=1621
            Total megabyte-seconds taken by all map tasks=5633024
            Total megabyte-seconds taken by all reduce tasks=1659904
    Map-Reduce Framework
            Map input records=3
            Map output records=6
            Map output bytes=62
            Map output materialized bytes=56
            Input split bytes=98
            Combine input records=6
            Combine output records=4
            Reduce input groups=4
            Reduce shuffle bytes=56
            Reduce input records=4
            Reduce output records=4
            Spilled Records=8
            Shuffled Maps =1
            Failed Shuffles=0
            Merged Map outputs=1
            GC time elapsed (ms)=572
            CPU time spent (ms)=1860
            Physical memory (bytes) snapshot=428474368
            Virtual memory (bytes) snapshot=4219695104
            Total committed heap usage (bytes)=284164096
    Shuffle Errors
            BAD_ID=0
            CONNECTION=0
            IO_ERROR=0
            WRONG_LENGTH=0
            WRONG_MAP=0
            WRONG_REDUCE=0
    File Input Format Counters 
            Bytes Read=38
    File Output Format Counters 
            Bytes Written=34

[hadoop@master ~]$ hdfs dfs -cat /output/part-r-00000
Hadoop  1
Hello   3
Huasan  1
World   1
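
Listing the /output directory shows the complete result set; wordcount writes a _SUCCESS marker plus one part file per reducer:

[hadoop@master ~]$ hdfs dfs -ls /output
# expected entries: /output/_SUCCESS and /output/part-r-00000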

Chapter 6: Hive Component Installation and Configuration

Lab 1: Hive Component Installation and Configuration

1.1. Objectives

After completing this lab, you should be able to:

  • Install and configure the Hive component
  • Initialize and start the Hive component

1.2. Requirements

  • Be familiar with installing and configuring the Hive component
  • Understand how to initialize and start the Hive component

1.3. Procedure

1.3.1. Task 1: Download and Extract the Installation Files

1.3.1.1. Step 1: Base Environment and Installation Preparation

The Hive component runs on top of a Hadoop system, so before installing Hive you must make sure Hadoop is running normally. This chapter installs the Hive component on the master node of the fully distributed Hadoop system deployed earlier.
The deployment plan and package paths for the Hive component are as follows:

(1) A fully distributed Hadoop system is already installed in the current environment.

(2) MySQL is installed locally (account root, password Password123$); the packages are under /opt/software/mysql-5.7.18.

(3) The MySQL port is 3306.

(4) The MySQL JDBC driver package is /opt/software/mysql-connector-java-5.1.47.jar, on top of which the Hive metadata store is set up.

(5) The Hive package is /opt/software/apache-hive-2.0.0-bin.tar.gz.

1.3.1.2. Step 2: Extract the Installation Files

(1) As the root user, extract the Hive package
/opt/software/apache-hive-2.0.0-bin.tar.gz into /usr/local/src.

[root@master ~]# tar -zxvf /opt/software/apache-hive-2.0.0-bin.tar.gz -C /usr/local/src/

(2) Rename the extracted apache-hive-2.0.0-bin folder to hive.

[root@master ~]# mv /usr/local/src/apache-hive-2.0.0-bin/ /usr/local/src/hive/

(3) Change the owner and group of the hive directory to hadoop.

[root@master ~]# chown -R hadoop:hadoop /usr/local/src/hive 

1.3.2. Task 2: Set Up the Hive Environment

1.3.2.1. Step 1: Remove the MariaDB Database

Hive stores its metadata in a MySQL database, so before deploying the Hive component you must first install MySQL on the Linux system and configure its character set, run the security initialization, and set up remote access permissions. Log in as the root user and perform the following steps:

(1) Stop the Linux firewall and disable it at system boot.

[root@master ~]# systemctl stop firewalld
[root@master ~]# systemctl disable firewalld

(2) Remove the MariaDB packages that ship with the Linux system.

1) First, check whether MariaDB is installed on the system.

    [root@master ~]# rpm -qa | grep mariadb

2) Remove the MariaDB packages. Nothing was found in this environment, so there is nothing to remove.

1.3.2.2. Step 2: Install the MySQL Database

(1) Install the MySQL common, libs, and client packages, in that order.

[root@master ~]# cd /opt/software/mysql-5.7.18/

[root@master mysql-5.7.18]# rpm -ivh mysql-community-common-5.7.18-1.el7.x86_64.rpm
warning: mysql-community-common-5.7.18-1.el7.x86_64.rpm: Header V3 DSA/SHA1 Signature, key ID 5072e1f5: NOKEY
Preparing...  ################################# [100%]
package mysql-community-common-5.7.18-1.el7.x86_64 is already installed

[root@master mysql-5.7.18]# rpm -ivh mysql-community-libs-5.7.18-1.el7.x86_64.rpm
warning: mysql-community-libs-5.7.18-1.el7.x86_64.rpm: Header V3 DSA/SHA1 Signature, key ID 5072e1f5: NOKEY
Preparing...  ################################# [100%]
package mysql-community-libs-5.7.18-1.el7.x86_64 is already installed

[root@master mysql-5.7.18]# rpm -ivh mysql-community-client-5.7.18-1.el7.x86_64.rpm
warning: mysql-community-client-5.7.18-1.el7.x86_64.rpm: Header V3 DSA/SHA1 Signature, key ID 5072e1f5: NOKEY
Preparing...  ################################# [100%]
package mysql-community-client-5.7.18-1.el7.x86_64 is already installed

(2) Install the mysql server package.

[root@master mysql-5.7.18]# rpm -ivh mysql-community-server-5.7.18-1.el7.x86_64.rpm 
warning: mysql-community-server-5.7.18-1.el7.x86_64.rpm: Header V3 DSA/SHA1 Signature, key ID 5072e1f5: NOKEY
Preparing...  ################################# [100%]
package mysql-community-server-5.7.18-1.el7.x86_64 is already installed

(3) Modify the MySQL database configuration by adding the configuration items shown in Table 6-1 to the /etc/my.cnf file.

Add the following settings directly below the symbolic-links=0 line in /etc/my.cnf.

default-storage-engine=innodb
innodb_file_per_table
collation-server=utf8_general_ci
init-connect='SET NAMES utf8'
character-set-server=utf8

(4) Start the MySQL database.

[root@master ~]# systemctl start mysqld

(5) Check the MySQL database status. If the mysqld process state is active (running), MySQL is running normally.

If the mysqld process state is failed, MySQL failed to start; in that case, check the /etc/my.cnf file for errors.

[root@master ~]# systemctl status mysqld
● mysqld.service - MySQL Server
   Loaded: loaded (/usr/lib/systemd/system/mysqld.service; enabled; vendor preset: disabled)
   Active: active (running) since Sun 2022-04-10 22:54:39 CST; 1h 0min ago
 Docs: man:mysqld(8)
   http://dev.mysql.com/doc/refman/en/using-systemd.html
 Main PID: 929 (mysqld)
   CGroup: /system.slice/mysqld.service
   └─929 /usr/sbin/mysqld --daemonize --pid-file=/var/run/mysqld/my...

Apr 10 22:54:35 master systemd[1]: Starting MySQL Server...
Apr 10 22:54:39 master systemd[1]: Started MySQL Server.

(6) Look up the default MySQL password.

[root@master ~]# cat /var/log/mysqld.log | grep password
2022-04-08T16:20:04.456271Z 1 [Note] A temporary password is generated for root@localhost: 0yf>>yWdMd8_

The default password is generated randomly during installation, so it differs from one installation to the next.

(7) Initialize the MySQL database.

Run the mysql_secure_installation command to initialize the MySQL database. During initialization you set the login password for the database root user; it must satisfy the security policy (upper- and lower-case letters, digits, and special characters). Password123$ can be used.

The following interactive prompts appear during initialization:

1) "Change the password for root ? ((Press y|Y for Yes, any other key for No)" asks whether to change the root password; type y and press Enter.

2) "Do you wish to continue with the password provided?(Press y|Y for Yes, any other key for No)" asks whether to keep the password just entered; type y and press Enter.

3) "Remove anonymous users? (Press y|Y for Yes, any other key for No)" asks whether to remove anonymous users; type y and press Enter.

4) "Disallow root login remotely? (Press y|Y for Yes, any other key for No)" asks whether to forbid remote root logins; type n and press Enter so that root can still log in remotely.

5) "Remove test database and access to it? (Press y|Y for Yes, any other key for No)" asks whether to remove the test database; type y and press Enter.

6) "Reload privilege tables now? (Press y|Y for Yes, any other key for No)" asks whether to reload the privilege tables; type y and press Enter.

The mysql_secure_installation run looks like this:

[root@master ~]# mysql_secure_installation

Securing the MySQL server deployment.

Enter password for user root: 
The 'validate_password' plugin is installed on the server.
The subsequent steps will run with the existing configuration
of the plugin.
Using existing password for root.

Estimated strength of the password: 100 
Change the password for root ? ((Press y|Y for Yes, any other key for No) : y

New password: 

Re-enter new password: 

Estimated strength of the password: 100 
Do you wish to continue with the password provided?(Press y|Y for Yes, any other key for No) : y
By default, a MySQL installation has an anonymous user,
allowing anyone to log into MySQL without having to have
a user account created for them. This is intended only for
testing, and to make the installation go a bit smoother.
You should remove them before moving into a production
environment.

Remove anonymous users? (Press y|Y for Yes, any other key for No) : y
Success.

Normally, root should only be allowed to connect from
'localhost'. This ensures that someone cannot guess at
the root password from the network.

Disallow root login remotely? (Press y|Y for Yes, any other key for No) : n

 ... skipping.
By default, MySQL comes with a database named 'test' that
anyone can access. This is also intended only for testing,
and should be removed before moving into a production
environment.

Remove test database and access to it? (Press y|Y for Yes, any other key for No) : y
 - Dropping test database...
Success.

 - Removing privileges on test database...
Success.

Reloading the privilege tables will ensure that all changes
made so far will take effect immediately.

Reload privilege tables now? (Press y|Y for Yes, any other key for No) : y
Success.

All done! 

(8) Grant the root user access to MySQL from localhost and from remote hosts.

[root@master ~]# mysql -u root -p
Enter password: 
Welcome to the MySQL monitor.  Commands end with ; or \g.
Your MySQL connection id is 9
Server version: 5.7.18 MySQL Community Server (GPL)

Copyright (c) 2000, 2017, Oracle and/or its affiliates. All rights reserved.

Oracle is a registered trademark of Oracle Corporation and/or its
affiliates. Other names may be trademarks of their respective
owners.

Type 'help;' or '\h' for help. Type '\c' to clear the current input statement.

mysql> grant all privileges on *.* to root@'localhost' identified by 'Password123$'; 
Query OK, 0 rows affected, 1 warning (0.00 sec)

mysql> grant all privileges on *.* to root@'%' identified by 'Password123$';
Query OK, 0 rows affected, 1 warning (0.00 sec)

mysql> flush privileges;
Query OK, 0 rows affected (0.00 sec)

mysql> select user,host from mysql.user where user='root';
+------+-----------+
| user | host  |
+------+-----------+
| root | % |
| root | localhost |
+------+-----------+
2 rows in set (0.00 sec)

mysql> exit;
Bye
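
As an optional check that the /etc/my.cnf settings from Step 2 took effect, the server variables can be queried non-interactively (using the root password set above):

[root@master ~]# mysql -uroot -pPassword123$ -e "SHOW VARIABLES LIKE 'character_set_server';"
# the value should be utf8
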
1.3.2.3. Step 3: Configure the Hive Component

(1) Set the Hive environment variables and make them take effect.

[root@master ~]# vim /etc/profile

export HIVE_HOME=/usr/local/src/hive
export PATH=$PATH:$HIVE_HOME/bin

[root@master ~]# source /etc/profile

(2) Modify the Hive configuration file.

Switch to the hadoop user to perform the following Hive configuration steps.
Copy the hive-default.xml.template file in /usr/local/src/hive/conf to hive-site.xml.

[root@master ~]# su - hadoop 
Last login: Sun Apr 10 23:27:25 CS

[hadoop@master ~]$ cp /usr/local/src/hive/conf/hive-default.xml.template  /usr/local/src/hive/conf/hive-site.xml

(3) Edit the hive-site.xml file with vi so that Hive connects to the MySQL database, and set the Hive temporary file storage path.

[hadoop@master ~]$ vi /usr/local/src/hive/conf/hive-site.xml

1) Set the MySQL database connection.

<name>javax.jdo.option.ConnectionURL</name>
<value>jdbc:mysql://master:3306/hive?createDatabaseIfNotExist=true&amp;useSSL=false</value>
<description>JDBC connect string for a JDBC metastore</description>

2) Configure the password of the MySQL root user.

<property>
<name>javax.jdo.option.ConnectionPassword</name>
<value>Password123$</value>
<description>password to use against metastore database</description>
</property>

3) Verify metadata store version consistency. If the value is already false by default, no change is needed.

<property>
<name>hive.metastore.schema.verification</name>
<value>false</value>
<description>
Enforce metastore schema version consistency.
True: Verify that version information stored in metastore matches with one from Hive jars. Also disable automatic schema migration attempt. Users are required to manually migrate schema after Hive upgrade which ensures proper metastore schema migration.
False: Warn if the version information stored in metastore doesn't match with one from in Hive jars.
</description>
</property>

4) Configure the database driver.

<property>
<name>javax.jdo.option.ConnectionDriverName</name>
<value>com.mysql.jdbc.Driver</value>
<description>Driver class name for a JDBC metastore</description>
</property>

5) Set the database user name javax.jdo.option.ConnectionUserName to root.

<property>
<name>javax.jdo.option.ConnectionUserName</name>
<value>root</value>
<description>Username to use against metastore database</description>
</property>

6) Replace ${system:java.io.tmpdir}/${system:user.name} in the following locations with the /usr/local/src/hive/tmp directory and its subdirectories.

The following four settings need to be replaced:

<name>hive.querylog.location</name>
<value>/usr/local/src/hive/tmp</value>
<description>Location of Hive run time structured log file</description>

<name>hive.exec.local.scratchdir</name>
<value>/usr/local/src/hive/tmp</value>

<name>hive.downloaded.resources.dir</name>
<value>/usr/local/src/hive/tmp/resources</value>

<name>hive.server2.logging.operation.log.location</name>
<value>/usr/local/src/hive/tmp/operation_logs</value>

7) Create the temporary folder tmp in the Hive installation directory.

[hadoop@master ~]$ mkdir /usr/local/src/hive/tmp 

At this point, Hive component installation and configuration is complete.
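
A quick optional way to double-check the four path replacements from step 6 is to grep the edited file:

[hadoop@master ~]$ grep -n '/usr/local/src/hive/tmp' /usr/local/src/hive/conf/hive-site.xml
# four matching <value> lines are expected; any remaining occurrences of
# ${system:java.io.tmpdir} in value fields still need to be replaced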

1.3.2.4. Step 4: Initialize the Hive Metastore

1) Copy the MySQL database driver (/opt/software/mysql-connector-java-5.1.46.jar) into the lib directory of the Hive installation.

[hadoop@master ~]$ cp /opt/software/mysql-connector-java-5.1.46.jar /usr/local/src/hive/lib/ 

2) Restart Hadoop.

[hadoop@master ~]$ stop-all.sh 
This script is Deprecated. Instead use stop-dfs.sh and stop-yarn.sh
Stopping namenodes on [master]
master: stopping namenode
10.10.10.129: stopping datanode
10.10.10.130: stopping datanode
Stopping secondary namenodes [0.0.0.0]
0.0.0.0: stopping secondarynamenode
stopping yarn daemons
stopping resourcemanager
10.10.10.129: stopping nodemanager
10.10.10.130: stopping nodemanager
no proxyserver to stop

[hadoop@master ~]$ start-all.sh 
This script is Deprecated. Instead use start-dfs.sh and start-yarn.sh
Starting namenodes on [master]
master: starting namenode, logging to /usr/local/src/hadoop/logs/hadoop-hadoop-namenode-master.out
10.10.10.129: starting datanode, logging to /usr/local/src/hadoop/logs/hadoop-hadoop-datanode-slave1.out
10.10.10.130: starting datanode, logging to /usr/local/src/hadoop/logs/hadoop-hadoop-datanode-slave2.out
Starting secondary namenodes [0.0.0.0]
0.0.0.0: starting secondarynamenode, logging to /usr/local/src/hadoop/logs/hadoop-hadoop-secondarynamenode-master.out
starting yarn daemons
starting resourcemanager, logging to /usr/local/src/hadoop/logs/yarn-hadoop-resourcemanager-master.out
10.10.10.130: starting nodemanager, logging to /usr/local/src/hadoop/logs/yarn-hadoop-nodemanager-slave2.out
10.10.10.129: starting nodemanager, logging to /usr/local/src/hadoop/logs/yarn-hadoop-nodemanager-slave1.out

3) Initialize the database.

[hadoop@master ~]$ schematool -initSchema -dbType mysql 
which: no hbase in (/usr/local/src/jdk/bin:/usr/local/src/hadoop/bin:/usr/local/src/hadoop/sbin:/usr/local/bin:/bin:/usr/bin:/usr/local/sbin:/usr/sbin:/usr/local/src/hive/bin:/home/hadoop/.local/bin:/home/hadoop/bin)
SLF4J: Class path contains multiple SLF4J bindings.
SLF4J: Found binding in [jar:file:/usr/local/src/hive/lib/hive-jdbc-2.0.0-standalone.jar!/org/slf4j/impl/StaticLoggerBinder.class]
SLF4J: Found binding in [jar:file:/usr/local/src/hive/lib/log4j-slf4j-impl-2.4.1.jar!/org/slf4j/impl/StaticLoggerBinder.class]
SLF4J: Found binding in [jar:file:/usr/local/src/hadoop/share/hadoop/common/lib/slf4j-log4j12-1.7.10.jar!/org/slf4j/impl/StaticLoggerBinder.class]
SLF4J: See http://www.slf4j.org/codes.html#multiple_bindings for an explanation.
SLF4J: Actual binding is of type [org.apache.logging.slf4j.Log4jLoggerFactory]
Metastore connection URL:jdbc:mysql://master:3306/hive?createDatabaseIfNotExist=true&useSSL=false
Metastore Connection Driver :com.mysql.jdbc.Driver
Metastore connection User:   root
Mon Apr 11 00:46:32 CST 2022 WARN: Establishing SSL connection without server's identity verification is not recommended. According to MySQL 5.5.45+, 5.6.26+ and 5.7.6+ requirements SSL connection must be established by default if explicit option isn't set. For compliance with existing applications not using SSL the verifyServerCertificate property is set to 'false'. You need either to explicitly disable SSL by setting useSSL=false, or set useSSL=true and provide truststore for server certificate verification.
Starting metastore schema initialization to 2.0.0
Initialization script hive-schema-2.0.0.mysql.sql
Password123$
Password123$
No current connection
org.apache.hadoop.hive.metastore.HiveMetaException: Schema initialization FAILED! Metastore state would be inconsistent !!

4) Start Hive.

Note: the schematool run above ended with "Schema initialization FAILED"; the metastore connection settings in hive-site.xml should be rechecked and schematool rerun until it reports success, otherwise Hive has no usable metastore.

[hadoop@master hive]$ hive
SLF4J: Class path contains multiple SLF4J bindings.
SLF4J: Found binding in [jar:file:/usr/local/src/hive/lib/hive-jdbc-2.0.0-standalone.jar!/org/slf4j/impl/StaticLoggerBinder.class]
SLF4J: Found binding in [jar:file:/usr/local/src/hive/lib/log4j-slf4j-impl-2.4.1.jar!/org/slf4j/impl/StaticLoggerBinder.class]
SLF4J: Found binding in [jar:file:/usr/local/src/hadoop/share/hadoop/common/lib/slf4j-log4j12-1.7.10.jar!/org/slf4j/impl/StaticLoggerBinder.class]
SLF4J: See http://www.slf4j.org/codes.html#multiple_bindings for an explanation.
SLF4J: Actual binding is of type [org.apache.logging.slf4j.Log4jLoggerFactory]

Logging initialized using configuration in jar:file:/usr/local/src/hive/lib/hive-common-2.0.0.jar!/hive-log4j2.properties
Fri May 20 18:51:50 CST 2022 WARN: Establishing SSL connection without server's identity verification is not recommended. According to MySQL 5.5.45+, 5.6.26+ and 5.7.6+ requirements SSL connection must be established by default if explicit option isn't set. For compliance with existing applications not using SSL the verifyServerCertificate property is set to 'false'. You need either to explicitly disable SSL by setting useSSL=false, or set useSSL=true and provide truststore for server certificate verification.
Fri May 20 18:51:50 CST 2022 WARN: Establishing SSL connection without server's identity verification is not recommended. According to MySQL 5.5.45+, 5.6.26+ and 5.7.6+ requirements SSL connection must be established by default if explicit option isn't set. For compliance with existing applications not using SSL the verifyServerCertificate property is set to 'false'. You need either to explicitly disable SSL by setting useSSL=false, or set useSSL=true and provide truststore for server certificate verification.
Fri May 20 18:51:50 CST 2022 WARN: Establishing SSL connection without server's identity verification is not recommended. According to MySQL 5.5.45+, 5.6.26+ and 5.7.6+ requirements SSL connection must be established by default if explicit option isn't set. For compliance with existing applications not using SSL the verifyServerCertificate property is set to 'false'. You need either to explicitly disable SSL by setting useSSL=false, or set useSSL=true and provide truststore for server certificate verification.
Fri May 20 18:51:50 CST 2022 WARN: Establishing SSL connection without server's identity verification is not recommended. According to MySQL 5.5.45+, 5.6.26+ and 5.7.6+ requirements SSL connection must be established by default if explicit option isn't set. For compliance with existing applications not using SSL the verifyServerCertificate property is set to 'false'. You need either to explicitly disable SSL by setting useSSL=false, or set useSSL=true and provide truststore for server certificate verification.
Fri May 20 18:51:52 CST 2022 WARN: Establishing SSL connection without server's identity verification is not recommended. According to MySQL 5.5.45+, 5.6.26+ and 5.7.6+ requirements SSL connection must be established by default if explicit option isn't set. For compliance with existing applications not using SSL the verifyServerCertificate property is set to 'false'. You need either to explicitly disable SSL by setting useSSL=false, or set useSSL=true and provide truststore for server certificate verification.
Fri May 20 18:51:52 CST 2022 WARN: Establishing SSL connection without server's identity verification is not recommended. According to MySQL 5.5.45+, 5.6.26+ and 5.7.6+ requirements SSL connection must be established by default if explicit option isn't set. For compliance with existing applications not using SSL the verifyServerCertificate property is set to 'false'. You need either to explicitly disable SSL by setting useSSL=false, or set useSSL=true and provide truststore for server certificate verification.
Fri May 20 18:51:52 CST 2022 WARN: Establishing SSL connection without server's identity verification is not recommended. According to MySQL 5.5.45+, 5.6.26+ and 5.7.6+ requirements SSL connection must be established by default if explicit option isn't set. For compliance with existing applications not using SSL the verifyServerCertificate property is set to 'false'. You need either to explicitly disable SSL by setting useSSL=false, or set useSSL=true and provide truststore for server certificate verification.
Fri May 20 18:51:52 CST 2022 WARN: Establishing SSL connection without server's identity verification is not recommended. According to MySQL 5.5.45+, 5.6.26+ and 5.7.6+ requirements SSL connection must be established by default if explicit option isn't set. For compliance with existing applications not using SSL the verifyServerCertificate property is set to 'false'. You need either to explicitly disable SSL by setting useSSL=false, or set useSSL=true and provide truststore for server certificate verification.
Hive-on-MR is deprecated in Hive 2 and may not be available in the future versions. Consider using a different execution engine (i.e. spark, tez) or using Hive 1.X releases.
hive> 

Chapter 7: ZooKeeper Component Installation and Configuration

Lab 1: ZooKeeper Component Installation and Configuration

1.1. Objectives

After completing this lab, you should be able to:

  • Download and install ZooKeeper
  • Understand the ZooKeeper configuration options
  • Start ZooKeeper

1.2. Requirements

  • Understand the ZooKeeper configuration options
  • Be familiar with starting ZooKeeper

1.3. Procedure

1.3.1 Task 1: Configure Time Synchronization
[root@master ~]# yum -y install chrony

[root@master ~]# cat /etc/chrony.conf 
# Use public servers from the pool.ntp.org project.
# Please consider joining the pool (http://www.pool.ntp.org/join.html).
server time1.aliyun.com iburst 
 
[root@master ~]# systemctl restart chronyd.service 
[root@master ~]# systemctl enable chronyd.service 

[root@master ~]# date 
Fri Apr 15 15:40:14 CST 2022
[root@slave1 ~]# yum -y install chrony

[root@slave1 ~]# cat /etc/chrony.conf 
# Use public servers from the pool.ntp.org project.
# Please consider joining the pool (http://www.pool.ntp.org/join.html).
server time1.aliyun.com iburst

[root@slave1 ~]# systemctl restart chronyd.service
[root@slave1 ~]# systemctl enable chronyd.service

[root@slave1 ~]# date
Fri Apr 15 15:40:17 CST 2022  
[root@slave2 ~]# yum -y install chrony

[root@slave2 ~]# cat /etc/chrony.conf 
# Use public servers from the pool.ntp.org project.
# Please consider joining the pool (http://www.pool.ntp.org/join.html).
server time1.aliyun.com iburst

[root@slave2 ~]# systemctl restart chronyd.service
[root@slave2 ~]# systemctl enable chronyd.service 

[root@slave2 ~]# date
Fri Apr 15 15:40:20 CST 2022
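
Whether chrony is actually synchronizing can be checked on any node; an optional verification:

[root@master ~]# chronyc sources -v
# the time1.aliyun.com entry should be marked with '*' once the clock is synchronized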
1.3.2 Task 2: Download and Install ZooKeeper

The latest ZooKeeper release can be obtained from the official site http://hadoop.apache.org/zookeeper/. The ZooKeeper component must be compatible with the Hadoop environment it is installed into.

Note: the firewall must be disabled on every node, or connection problems will occur.

1. The ZooKeeper package zookeeper-3.4.8.tar.gz has already been placed in the /opt/software directory of the Linux system.

2. Extract the package to the target directory; run the following commands on the master node.

[root@master ~]# tar xf /opt/software/zookeeper-3.4.8.tar.gz -C /usr/local/src/

[root@master ~]# cd /usr/local/src/
[root@master src]# mv zookeeper-3.4.8/ zookeeper
1.3.3 Task 3: ZooKeeper Configuration Options
1.3.3.1 Step 1: Master Node Configuration

(1) Create the data and logs folders in the ZooKeeper installation directory.

[root@master src]# cd /usr/local/src/zookeeper/
[root@master zookeeper]# mkdir data logs

(2) Write each node's ID into its myid file; every node gets a different number: master gets 1, slave1 gets 2, and slave2 gets 3.

[root@master zookeeper]# echo '1' > /usr/local/src/zookeeper/data/myid

(3) Edit the zoo.cfg configuration file.

[root@master zookeeper]# cd /usr/local/src/zookeeper/conf/
[root@master conf]# cp zoo_sample.cfg zoo.cfg

Change the dataDir parameter as follows:

[root@master conf]# vi zoo.cfg 
dataDir=/usr/local/src/zookeeper/data

(4) Append the following entries to the end of zoo.cfg; they define the ports used by the three ZooKeeper nodes (2888 for follower-to-leader connections, 3888 for leader election).

server.1=master:2888:3888
server.2=slave1:2888:3888
server.3=slave2:2888:3888

(5) Change the owner of the ZooKeeper installation directory to the hadoop user.

[root@master conf]# chown -R hadoop:hadoop /usr/local/src/ 
1.3.3.2 Step 2: Slave Node Configuration

(1) Copy the ZooKeeper installation directory from the master node to the two slave nodes.

[root@master ~]# scp -r /usr/local/src/zookeeper slave1:/usr/local/src/
[root@master ~]# scp -r /usr/local/src/zookeeper slave2:/usr/local/src/

(2) On slave1, change the owner of the zookeeper directory to the hadoop user.

[root@slave1 ~]# chown -R hadoop:hadoop /usr/local/src/
[root@slave1 ~]# ll /usr/local/src/
total 4
drwxr-xr-x. 12 hadoop hadoop  183 Apr  2 18:11 hadoop
drwxr-xr-x   9 hadoop hadoop  183 Apr 15 16:37 hbase
drwxr-xr-x.  8 hadoop hadoop  255 Apr  2 18:06 jdk
drwxr-xr-x  12 hadoop hadoop 4096 Apr 22 15:31 zookeeper

(3) On slave1, set the node's myid to 2.

[root@slave1 ~]# echo 2 > /usr/local/src/zookeeper/data/myid

(4) On slave2, change the owner of the zookeeper directory to the hadoop user.

[root@slave2 ~]# chown -R hadoop:hadoop /usr/local/src/

(5) On slave2, set the node's myid to 3.

[root@slave2 ~]# echo 3 > /usr/local/src/zookeeper/data/myid
1.3.3.3 Step 3: System Environment Variables

Add the environment variable configuration on all three nodes: master, slave1, and slave2.

[root@master conf]# vi /etc/profile.d/zookeeper.sh
export ZOOKEEPER_HOME=/usr/local/src/zookeeper
export PATH=${ZOOKEEPER_HOME}/bin:$PATH

[root@master ~]# scp /etc/profile.d/zookeeper.sh slave1:/etc/profile.d/
zookeeper.sh                                100%   87    42.3KB/s   00:00

[root@master ~]# scp /etc/profile.d/zookeeper.sh slave2:/etc/profile.d/
zookeeper.sh                                100%   87    50.8KB/s   00:00
1.3.4 Task 4: Start ZooKeeper

ZooKeeper must be started as the hadoop user.

(1) Start ZooKeeper on master, slave1, and slave2 with the zkServer.sh start command.

[root@master ~]# su - hadoop 
Last login: Fri Apr 15 21:54:17 CST 2022 on pts/0

[hadoop@master ~]$ jps
3922 Jps

[hadoop@master ~]$ zkServer.sh start
ZooKeeper JMX enabled by default
Using config: /usr/local/src/zookeeper/bin/../conf/zoo.cfg
Starting zookeeper ... STARTED

[hadoop@master ~]$ jps
3969 Jps
3950 QuorumPeerMain

[root@slave1 ~]# su - hadoop 
Last login: Fri Apr 15 22:06:47 CST 2022 on pts/0

[hadoop@slave1 ~]$ jps
1370 Jps

[hadoop@slave1 ~]$ zkServer.sh start
ZooKeeper JMX enabled by default
Using config: /usr/local/src/zookeeper/bin/../conf/zoo.cfg
Starting zookeeper ... STARTED

[hadoop@slave1 ~]$ jps
1395 QuorumPeerMain
1421 Jps

[root@slave2 ~]# su - hadoop 
Last login: Fri Apr 15 16:25:52 CST 2022 on pts/1

[hadoop@slave2 ~]$ jps
1336 Jps

[hadoop@slave2 ~]$ zkServer.sh start
ZooKeeper JMX enabled by default
Using config: /usr/local/src/zookeeper/bin/../conf/zoo.cfg
Starting zookeeper ... STARTED

[hadoop@slave2 ~]$ jps
1361 QuorumPeerMain
1387 Jps

(2) After all three nodes have started, check the ZooKeeper running status on each node.

[hadoop@master conf]$ zkServer.sh status
ZooKeeper JMX enabled by default
Using config: /usr/local/src/zookeeper/bin/../conf/zoo.cfg
Mode: follower

[hadoop@slave1 ~]$ zkServer.sh status
ZooKeeper JMX enabled by default
Using config: /usr/local/src/zookeeper/bin/../conf/zoo.cfg
Mode: leader
[hadoop@slave2 conf]$ zkServer.sh status
ZooKeeper JMX enabled by default
Using config: /usr/local/src/zookeeper/bin/../conf/zoo.cfg
Mode: follower
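
With one leader and two followers reported, the ensemble can be smoke-tested from any node with the ZooKeeper CLI; an optional sketch, assuming the default clientPort 2181 from zoo_sample.cfg:

[hadoop@master ~]$ zkCli.sh -server master:2181
# at the prompt, `ls /` should return at least [zookeeper]; type `quit` to leave the shell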

Chapter 8: HBase Component Installation and Configuration

Lab 1: HBase Component Installation and Configuration

1.1 Objectives

After completing this lab, you should be able to:

  • Install and configure HBase

  • Use common HBase shell commands

