Environment Setup

Source: https://www.cnblogs.com/qw77/p/18102996


Chapter 4: Hadoop Configuration Files and Parameters

Lab 1: Hadoop Fully Distributed Configuration

1.1 Objectives

After completing this lab, you should be able to:

  • Configure Hadoop in fully distributed mode
  • Install Hadoop in fully distributed mode
  • Explain the meaning of the parameters in the Hadoop configuration files

1.2 Requirements

  • Be familiar with the fully distributed Hadoop installation
  • Understand the purpose of the Hadoop configuration files

1.3 Procedure

1.3.1 Task 1: Install Hadoop on the Master Node

1.3.1.1 Step 1: Extract the jdk-8u152-linux-x64.tar.gz and hadoop-2.7.1.tar.gz packages to the /usr/local/src directory
[root@master ~]# tar zvxf jdk-8u152-linux-x64.tar.gz -C /usr/local/src/

[root@master ~]# tar zvxf hadoop-2.7.1.tar.gz -C /usr/local/src/
1.3.1.2 Step 2: Rename the hadoop-2.7.1 folder to hadoop
[root@master ~]# cd /usr/local/src/
[root@master src]# ls
hadoop-2.7.1  jdk1.8.0_152
[root@master src]# mv hadoop-2.7.1/ hadoop
[root@master src]# mv jdk1.8.0_152/ jdk
[root@master src]# ls
hadoop  jdk
1.3.1.3 Step 3: Configure the Hadoop environment variables

[root@master ~]# vi /etc/profile.d/hadoop.sh

Note: environment variables were already configured for the single-node Hadoop installation in Chapter 2. Delete that earlier configuration before adding the lines below.

# Add the following lines
export JAVA_HOME=/usr/local/src/jdk
export HADOOP_HOME=/usr/local/src/hadoop
export PATH=${JAVA_HOME}/bin:${HADOOP_HOME}/bin:${HADOOP_HOME}/sbin:$PATH
1.3.1.4 Step 4: Apply the Hadoop environment variables
[root@master ~]# source /etc/profile.d/hadoop.sh 
[root@master ~]# echo $PATH
/usr/local/src/jdk/bin:/usr/local/src/hadoop/bin:/usr/local/src/hadoop/sbin:/usr/local/sbin:/usr/local/bin:/usr/sbin:/usr/bin:/root/bin
1.3.1.5 Step 5: Edit the hadoop-env.sh configuration file
[root@master ~]# vi /usr/local/src/hadoop/etc/hadoop/hadoop-env.sh

# Add the following line
export JAVA_HOME=/usr/local/src/jdk

1.3.2 Task 2: Configure the hdfs-site.xml parameters

Run the following command to edit the hdfs-site.xml configuration file.

[root@master ~]# vi /usr/local/src/hadoop/etc/hadoop/hdfs-site.xml

# Add the following properties between the <configuration> and </configuration> tags
<configuration>
    <property>
        <name>dfs.namenode.name.dir</name>
        <value>file:/usr/local/src/hadoop/dfs/name</value>
    </property>
    <property>
        <name>dfs.datanode.data.dir</name>
        <value>file:/usr/local/src/hadoop/dfs/data</value>
    </property>
    <property>
        <name>dfs.replication</name>
        <value>2</value>
    </property>
</configuration>

Create the directories
[root@master ~]# mkdir -p /usr/local/src/hadoop/dfs/{name,data}

Hadoop's distributed filesystem, HDFS, normally stores data redundantly; the replication factor usually defaults to 3, meaning each piece of data is kept in three copies. Here the dfs.replication setting is changed so that HDFS keeps 2 replicas of each file.
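The repetitive <property> stanzas in these configuration files can also be generated from name/value pairs instead of typed by hand. A minimal sketch, where hprop is a hypothetical helper (not part of Hadoop):

```shell
# hprop is a hypothetical helper that prints one Hadoop <property> stanza.
hprop() {
  printf '    <property>\n        <name>%s</name>\n        <value>%s</value>\n    </property>\n' "$1" "$2"
}

# Emit the three hdfs-site.xml properties used above.
hprop dfs.namenode.name.dir file:/usr/local/src/hadoop/dfs/name
hprop dfs.datanode.data.dir file:/usr/local/src/hadoop/dfs/data
hprop dfs.replication 2
```

The output can be pasted between the <configuration> tags, which avoids mismatched tags from manual editing.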

1.3.3 Task 3: Configure the core-site.xml parameters

[root@master ~]# vi /usr/local/src/hadoop/etc/hadoop/core-site.xml

# Add the following properties between the <configuration> and </configuration> tags

<configuration>
    <property>
        <name>fs.defaultFS</name>
        <value>hdfs://master:9000</value>
    </property>
    <property>
        <name>io.file.buffer.size</name>
        <value>131072</value>
    </property>
    <property>
        <name>hadoop.tmp.dir</name>
        <value>file:/usr/local/src/hadoop/tmp</value>
    </property>
</configuration>

# After saving the configuration, create the directory
[root@master ~]# mkdir -p /usr/local/src/hadoop/tmp

If the hadoop.tmp.dir parameter is not set, the system falls back to the default temporary directory /tmp/hadoop-hadoop. That directory is deleted on every Linux reboot, so the Hadoop filesystem would have to be reformatted after each restart; otherwise Hadoop fails to run.
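Before the first format it can help to confirm that every directory named in these configs actually exists. A hedged sketch, where missing_dirs is an illustrative helper rather than a Hadoop tool:

```shell
# missing_dirs: print each argument that is not an existing directory,
# and return the number of missing entries (illustrative helper).
missing_dirs() {
  m=0
  for d in "$@"; do
    if [ ! -d "$d" ]; then
      echo "missing: $d"
      m=$((m+1))
    fi
  done
  return "$m"
}

# Directories created earlier in this chapter:
missing_dirs /usr/local/src/hadoop/tmp \
             /usr/local/src/hadoop/dfs/name \
             /usr/local/src/hadoop/dfs/data \
  || echo "create the directories listed above before formatting"
```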

1.3.4 Task 4: Configure mapred-site.xml

[root@master ~]# cd /usr/local/src/hadoop/etc/hadoop/
[root@master hadoop]# cp mapred-site.xml.template mapred-site.xml

# Add the following properties between the <configuration> and </configuration> tags

<configuration>
    <property>
        <name>mapreduce.framework.name</name>
        <value>yarn</value>
    </property>
    <property>
        <name>mapreduce.jobhistory.address</name>
        <value>master:10020</value>
    </property>
    <property>
        <name>mapreduce.jobhistory.webapp.address</name>
        <value>master:19888</value>
    </property>
</configuration>

1.3.5 Task 5: Configure yarn-site.xml

[root@master hadoop]# vi /usr/local/src/hadoop/etc/hadoop/yarn-site.xml

# Add the following properties between the <configuration> and </configuration> tags

<configuration>
<!-- Site specific YARN configuration properties -->
    <property>
        <name>yarn.resourcemanager.address</name>
        <value>master:8032</value>
    </property>
    <property>
        <name>yarn.resourcemanager.scheduler.address</name>
        <value>master:8030</value>
    </property>
    <property>
        <name>yarn.resourcemanager.webapp.address</name>
        <value>master:8088</value>
    </property>
    <property>
        <name>yarn.resourcemanager.resource-tracker.address</name>
        <value>master:8031</value>
    </property>
    <property>
        <name>yarn.resourcemanager.admin.address</name>
        <value>master:8033</value>
    </property>
    <property>
        <name>yarn.nodemanager.aux-services</name>
        <value>mapreduce_shuffle</value>
    </property>
    <property>
        <name>yarn.nodemanager.aux-services.mapreduce_shuffle.class</name>
        <value>org.apache.hadoop.mapred.ShuffleHandler</value>
    </property>
</configuration>

1.3.6 Task 6: Other Hadoop Configuration

1.3.6.1 Step 1: Configure the masters file
# Edit the masters configuration file
[root@master ~]# vi /usr/local/src/hadoop/etc/hadoop/masters

# Add the following line
10.10.10.128
1.3.6.2 Step 2: Configure the slaves file
# Edit the slaves configuration file
[root@master ~]# vi /usr/local/src/hadoop/etc/hadoop/slaves

# Delete localhost, then add the following lines
10.10.10.129
10.10.10.130
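Forgetting to remove localhost from the slaves file is a common mistake, and it can be checked by a script. A small sketch, where check_workers is an illustrative helper, not part of Hadoop:

```shell
# check_workers: fail if the given file still contains a localhost line,
# or contains no entries at all (illustrative helper for masters/slaves files).
check_workers() {
  f="$1"
  if grep -qx 'localhost' "$f"; then
    echo "error: $f still lists localhost"
    return 1
  fi
  if ! grep -q '[^[:space:]]' "$f"; then
    echo "error: $f is empty"
    return 1
  fi
}
```

For example: check_workers /usr/local/src/hadoop/etc/hadoop/slaves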
1.3.6.3 Step 3: Create a user and change directory ownership
# Create the user
[root@master ~]# useradd hadoop 
[root@master ~]# echo 'hadoop' | passwd --stdin hadoop
Changing password for user hadoop.
passwd: all authentication tokens updated successfully.

# Change directory ownership
[root@master ~]# chown -R hadoop.hadoop /usr/local/src/
[root@master ~]# cd /usr/local/src/
[root@master src]# ll
total 0
drwxr-xr-x 11 hadoop hadoop 171 Mar 27 01:51 hadoop
drwxr-xr-x  8 hadoop hadoop 255 Sep 14  2017 jdk
1.3.6.4 Step 4: Configure passwordless SSH login from master to all slave nodes
[root@master ~]# ssh-keygen -t rsa

Generating public/private rsa key pair.
Enter file in which to save the key (/root/.ssh/id_rsa): 
Created directory '/root/.ssh'.
Enter passphrase (empty for no passphrase): 
Enter same passphrase again: 
Your identification has been saved in /root/.ssh/id_rsa.
Your public key has been saved in /root/.ssh/id_rsa.pub.
The key fingerprint is:
SHA256:Ibeslip4Bo9erREJP37u7qhlwaEeMOCg8DlJGSComhk root@master
The key's randomart image is:
+---[RSA 2048]----+
|B.oo |
|Oo.o |
|=o=.  . o|
|E.=.o  + o   |
|.* BS|
|* o =  o |
| * * o+  |
|o O *o   |
|.=.+==   |
+----[SHA256]-----+

[root@master ~]# ssh-copy-id root@slave1
/usr/bin/ssh-copy-id: INFO: Source of key(s) to be installed: "/root/.ssh/id_rsa.pub"
The authenticity of host 'slave1 (10.10.10.129)' can't be established.
ECDSA key fingerprint is SHA256:Z643OMlGh0yMEc5i85oZ7c21NHdkzSZD9hY9K39xzP4.
ECDSA key fingerprint is MD5:e0:ef:47:5f:ad:75:9a:44:08:bc:f2:10:8e:d6:53:4a.
Are you sure you want to continue connecting (yes/no)? yes
/usr/bin/ssh-copy-id: INFO: attempting to log in with the new key(s), to filter out any that are already installed
/usr/bin/ssh-copy-id: INFO: 1 key(s) remain to be installed -- if you are prompted now it is to install the new keys
root@slave1's password: 
Number of key(s) added: 1
Now try logging into the machine, with:   "ssh 'root@slave1'"
and check to make sure that only the key(s) you wanted were added.

[root@master ~]# ssh-copy-id root@slave2
/usr/bin/ssh-copy-id: INFO: Source of key(s) to be installed: "/root/.ssh/id_rsa.pub"
The authenticity of host 'slave2 (10.10.10.130)' can't be established.
ECDSA key fingerprint is SHA256:Z643OMlGh0yMEc5i85oZ7c21NHdkzSZD9hY9K39xzP4.
ECDSA key fingerprint is MD5:e0:ef:47:5f:ad:75:9a:44:08:bc:f2:10:8e:d6:53:4a.
Are you sure you want to continue connecting (yes/no)? yes
/usr/bin/ssh-copy-id: INFO: attempting to log in with the new key(s), to filter out any that are already installed
/usr/bin/ssh-copy-id: INFO: 1 key(s) remain to be installed -- if you are prompted now it is to install the new keys
root@slave2's password: 
Number of key(s) added: 1  
Now try logging into the machine, with:   "ssh 'root@slave2'"
and check to make sure that only the key(s) you wanted were added.
   
[root@master ~]# ssh slave1
Last login: Sun Mar 27 02:58:38 2022 from master
[root@slave1 ~]# exit
logout
Connection to slave1 closed.

[root@master ~]# ssh slave2
Last login: Sun Mar 27 00:26:12 2022 from 10.10.10.1
[root@slave2 ~]# exit
logout
Connection to slave2 closed.
1.3.6.5 Step 5: Sync all files under /usr/local/src/ to all slave nodes
[root@master ~]# scp -r /usr/local/src/* root@slave1:/usr/local/src/

[root@master ~]# scp -r /usr/local/src/* root@slave2:/usr/local/src/

[root@master ~]# scp /etc/profile.d/hadoop.sh root@slave1:/etc/profile.d/
hadoop.sh                                   100%  151    45.9KB/s   00:00 
   
[root@master ~]# scp /etc/profile.d/hadoop.sh root@slave2:/etc/profile.d/
hadoop.sh                                   100%  151    93.9KB/s   00:00    
1.3.6.6 Step 6: Run the following commands on all slave nodes
(1) On slave1

[root@slave1 ~]# useradd hadoop 
[root@slave1 ~]# echo 'hadoop' | passwd --stdin hadoop 
Changing password for user hadoop.
passwd: all authentication tokens updated successfully.

[root@slave1 ~]# chown -R hadoop.hadoop /usr/local/src/
[root@slave1 ~]# ll /usr/local/src/
total 0
drwxr-xr-x 11 hadoop hadoop 171 Mar 27 03:07 hadoop
drwxr-xr-x  8 hadoop hadoop 255 Mar 27 03:07 jdk

[root@slave1 ~]# source /etc/profile.d/hadoop.sh 

[root@slave1 ~]# echo $PATH
/usr/local/src/jdk/bin:/usr/local/src/hadoop/bin:/usr/local/src/hadoop/sbin:/usr/local/sbin:/usr/local/bin:/usr/sbin:/usr/bin:/root/bin

(2) On slave2

[root@slave2 ~]# useradd hadoop
[root@slave2 ~]# echo 'hadoop' | passwd --stdin hadoop
Changing password for user hadoop.
passwd: all authentication tokens updated successfully.

[root@slave2 ~]# chown -R hadoop.hadoop /usr/local/src/
[root@slave2 ~]# ll /usr/local/src/
total 0
drwxr-xr-x 11 hadoop hadoop 171 Mar 27 03:09 hadoop
drwxr-xr-x  8 hadoop hadoop 255 Mar 27 03:09 jdk

[root@slave2 ~]# source /etc/profile.d/hadoop.sh 

[root@slave2 ~]# echo $PATH
/usr/local/src/jdk/bin:/usr/local/src/hadoop/bin:/usr/local/src/hadoop/sbin:/usr/local/sbin:/usr/local/bin:/usr/sbin:/usr/bin:/root/bin

Chapter 5: Running the Hadoop Cluster

Lab 1: Running the Hadoop Cluster

1.1 Objectives

After completing this lab, you should be able to:

  • Check the running state of Hadoop
  • Format the Hadoop filesystem
  • Inspect the Hadoop Java processes
  • View the Hadoop HDFS report
  • Check the status of the Hadoop nodes
  • Stop the Hadoop processes

1.2 Requirements

  • Be familiar with checking the running state of Hadoop
  • Be familiar with stopping the Hadoop processes

1.3 Procedure

1.3.1 Task 1: Format the Hadoop Filesystem

1.3.1.1 Step 1: Format the NameNode

Formatting clears the data on the NameNode. Format HDFS only before its first start; do not format again on later starts, or the DataNode processes will go missing. Also, once HDFS has run, the Hadoop working directory (set to /usr/local/src/hadoop/tmp in this guide) contains data. If you do need to reformat, delete the data under the working directory first, or the format will fail.
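The cleanup described above can be captured in a small helper that empties the working directories before a re-format. This is a destructive sketch under the path layout of this guide; clean_hadoop_workdirs is hypothetical, and you would still run `hdfs namenode -format` afterwards:

```shell
# clean_hadoop_workdirs: empty and recreate the Hadoop working directories
# (tmp, dfs/name, dfs/data) so a re-format starts from a clean state.
# DESTRUCTIVE: all HDFS data under these paths is deleted.
clean_hadoop_workdirs() {
  base="${1:?usage: clean_hadoop_workdirs /usr/local/src/hadoop}"
  for d in tmp dfs/name dfs/data; do
    rm -rf "${base:?}/${d}"
    mkdir -p "${base}/${d}"
  done
}
```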

Run the following commands to format the NameNode:

[root@master ~]# su - hadoop 
Last login: Fri Apr  1 23:34:46 CST 2022 on pts/1

[hadoop@master ~]$ cd /usr/local/src/hadoop/
[hadoop@master hadoop]$ ./bin/hdfs namenode -format
22/04/02 01:22:42 INFO namenode.NameNode: STARTUP_MSG: 
/************************************************************

1.3.1.2 Step 2: Start the NameNode

[hadoop@master hadoop]$ hadoop-daemon.sh start namenode
namenode running as process 11868. Stop it first.

1.3.2 Task 2: View the Java Processes

After startup, use the jps command to check whether it succeeded. jps is a Java utility that lists the PIDs of all current Java processes.

[hadoop@master hadoop]$ jps
12122 Jps
11868 NameNode
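Scripts can check the jps output for a specific daemon instead of reading it by eye. A minimal sketch, where daemon_listed is an illustrative filter fed by jps on a real node:

```shell
# daemon_listed: read `jps` output on stdin and succeed only if the
# named daemon appears (illustrative helper).
daemon_listed() {
  awk '{print $2}' | grep -qx "$1"
}

# On a running master you would pipe real output:  jps | daemon_listed NameNode
printf '12122 Jps\n11868 NameNode\n' | daemon_listed NameNode && echo "NameNode is up"
```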

1.3.2.1 Step 1: Switch to the hadoop user

[hadoop@master ~]$ su - hadoop 
Password: 
Last login: Sat Apr  2 01:22:13 CST 2022 on pts/1
Last failed login: Sat Apr  2 04:47:08 CST 2022 on pts/1
There was 1 failed login attempt since the last successful login.

1.3.3 Task 3: View the HDFS Report

[hadoop@master ~]$ hdfs dfsadmin -report
Configured Capacity: 0 (0 B)
Present Capacity: 0 (0 B)
DFS Remaining: 0 (0 B)
DFS Used: 0 (0 B)
DFS Used%: NaN%
Under replicated blocks: 0
Blocks with corrupt replicas: 0
Missing blocks: 0
Missing blocks (with replication factor 1): 0

-------------------------------------------------

1.3.3.1 Step 1: Generate SSH keys for the hadoop user

[hadoop@master ~]$ ssh-keygen -t rsa
Generating public/private rsa key pair.
Enter file in which to save the key (/home/hadoop/.ssh/id_rsa): 
Created directory '/home/hadoop/.ssh'.
Enter passphrase (empty for no passphrase): 
Enter same passphrase again: 
Your identification has been saved in /home/hadoop/.ssh/id_rsa.
Your public key has been saved in /home/hadoop/.ssh/id_rsa.pub.
The key fingerprint is:
SHA256:nW/cVxmRp5Ht9TKGT61OmGbhQtkBdpHyS5prGhx24pI [email protected]
The key's randomart image is:
+---[RSA 2048]----+
|  o.oo +.|
| ...o o.=|
|   = o *+|
| .o.* * *|
|S.+= O =.|
|   = ++oB.+ .|
|  E +  =+o. .|
|   . .o.  .. |
|.o   |
+----[SHA256]-----+

[hadoop@master ~]$ ssh-copy-id slave1
/bin/ssh-copy-id: INFO: Source of key(s) to be installed: "/home/hadoop/.ssh/id_rsa.pub"
The authenticity of host 'slave1 (10.10.10.129)' can't be established.
ECDSA key fingerprint is SHA256:BE2tM2BCeGBc6aGRKBTbMTh80VP9noFKzqDknL+0Jes.
ECDSA key fingerprint is MD5:a2:25:9c:bc:d0:df:fc:ec:44:4a:c0:10:26:f2:ef:c7.
Are you sure you want to continue connecting (yes/no)? yes
/bin/ssh-copy-id: INFO: attempting to log in with the new key(s), to filter out any that are already installed
/bin/ssh-copy-id: INFO: 1 key(s) remain to be installed -- if you are prompted now it is to install the new keys
hadoop@slave1's password: 

Number of key(s) added: 1

Now try logging into the machine, with:   "ssh 'slave1'"
and check to make sure that only the key(s) you wanted were added.

[hadoop@master ~]$ ssh-copy-id slave2
/bin/ssh-copy-id: INFO: Source of key(s) to be installed: "/home/hadoop/.ssh/id_rsa.pub"
The authenticity of host 'slave2 (10.10.10.130)' can't be established.
ECDSA key fingerprint is SHA256:BE2tM2BCeGBc6aGRKBTbMTh80VP9noFKzqDknL+0Jes.
ECDSA key fingerprint is MD5:a2:25:9c:bc:d0:df:fc:ec:44:4a:c0:10:26:f2:ef:c7.
Are you sure you want to continue connecting (yes/no)? yes
/bin/ssh-copy-id: INFO: attempting to log in with the new key(s), to filter out any that are already installed
/bin/ssh-copy-id: INFO: 1 key(s) remain to be installed -- if you are prompted now it is to install the new keys
hadoop@slave2's password: 

Number of key(s) added: 1

Now try logging into the machine, with:   "ssh 'slave2'"
and check to make sure that only the key(s) you wanted were added.

[hadoop@master ~]$ ssh-copy-id master
/bin/ssh-copy-id: INFO: Source of key(s) to be installed: "/home/hadoop/.ssh/id_rsa.pub"
The authenticity of host 'master (10.10.10.128)' can't be established.
ECDSA key fingerprint is SHA256:BE2tM2BCeGBc6aGRKBTbMTh80VP9noFKzqDknL+0Jes.
ECDSA key fingerprint is MD5:a2:25:9c:bc:d0:df:fc:ec:44:4a:c0:10:26:f2:ef:c7.
Are you sure you want to continue connecting (yes/no)? yes
/bin/ssh-copy-id: INFO: attempting to log in with the new key(s), to filter out any that are already installed
/bin/ssh-copy-id: INFO: 1 key(s) remain to be installed -- if you are prompted now it is to install the new keys
hadoop@master's password: 

Number of key(s) added: 1

Now try logging into the machine, with:   "ssh 'master'"
and check to make sure that only the key(s) you wanted were added.

1.3.4 Task 4: Stop HDFS with stop-dfs.sh

[hadoop@master ~]$ stop-dfs.sh 
Stopping namenodes on [master]
master: stopping namenode
10.10.10.129: no datanode to stop
10.10.10.130: no datanode to stop
Stopping secondary namenodes [0.0.0.0]
The authenticity of host '0.0.0.0 (0.0.0.0)' can't be established.
ECDSA key fingerprint is SHA256:BE2tM2BCeGBc6aGRKBTbMTh80VP9noFKzqDknL+0Jes.
ECDSA key fingerprint is MD5:a2:25:9c:bc:d0:df:fc:ec:44:4a:c0:10:26:f2:ef:c7.
Are you sure you want to continue connecting (yes/no)? yes
0.0.0.0: Warning: Permanently added '0.0.0.0' (ECDSA) to the list of known hosts.
0.0.0.0: no secondarynamenode to stop

1.3.4.1 Restart and verify

[hadoop@master ~]$ start-dfs.sh 
Starting namenodes on [master]
master: starting namenode, logging to /usr/local/src/hadoop/logs/hadoop-hadoop-namenode-master.example.com.out
10.10.10.129: starting datanode, logging to /usr/local/src/hadoop/logs/hadoop-hadoop-datanode-slave1.out
10.10.10.130: starting datanode, logging to /usr/local/src/hadoop/logs/hadoop-hadoop-datanode-slave2.out
Starting secondary namenodes [0.0.0.0]
0.0.0.0: starting secondarynamenode, logging to /usr/local/src/hadoop/logs/hadoop-hadoop-secondarynamenode-master.example.com.out

[hadoop@master ~]$ start-yarn.sh 
starting yarn daemons
starting resourcemanager, logging to /usr/local/src/hadoop/logs/yarn-hadoop-resourcemanager-master.example.com.out
10.10.10.129: starting nodemanager, logging to /usr/local/src/hadoop/logs/yarn-hadoop-nodemanager-slave1.out
10.10.10.130: starting nodemanager, logging to /usr/local/src/hadoop/logs/yarn-hadoop-nodemanager-slave2.out

[hadoop@master ~]$ jps
12934 NameNode
13546 Jps
13131 SecondaryNameNode
13291 ResourceManager

If you see ResourceManager on the master and NodeManager on the slaves, the startup succeeded.
[hadoop@master ~]$ jps
12934 NameNode
13546 Jps
13131 SecondaryNameNode
13291 ResourceManager

[root@slave1 ~]# jps
11906 NodeManager
11797 DataNode
12037 Jps

[root@slave2 ~]# jps
12758 NodeManager
12648 DataNode
12889 Jps

[hadoop@master ~]$ hdfs dfs -mkdir /input
[hadoop@master ~]$ hdfs dfs -ls /
Found 1 items
drwxr-xr-x   - hadoop supergroup          0 2022-04-02 05:18 /input
[hadoop@master ~]$ mkdir ~/input
[hadoop@master ~]$ vim ~/input/data.txt
Hello World
Hello Hadoop
Hello Huasan

[hadoop@master ~]$ hdfs dfs -put ~/input/data.txt 
.bash_logout       .bashrc            .oracle_jre_usage/ .viminfo           
.bash_profile      input/             .ssh/              
[hadoop@master ~]$ hdfs dfs -put ~/input/data.txt /input
[hadoop@master ~]$ hdfs dfs -cat /input/data.txt
Hello World
Hello Hadoop
Hello Huasan
[hadoop@master ~]$ hadoop jar /usr/local/src/hadoop/share/hadoop/mapreduce/hadoop-mapreduce-examples-2.7.1.jar wordcount /input/data.txt /output
22/04/02 05:31:20 INFO client.RMProxy: Connecting to ResourceManager at /0.0.0.0:8032
22/04/02 05:31:21 INFO input.FileInputFormat: Total input paths to process : 1
22/04/02 05:31:21 INFO mapreduce.JobSubmitter: number of splits:1
22/04/02 05:31:21 INFO mapreduce.JobSubmitter: Submitting tokens for job: job_1648846845675_0001
22/04/02 05:31:22 INFO impl.YarnClientImpl: Submitted application application_1648846845675_0001
22/04/02 05:31:22 INFO mapreduce.Job: The url to track the job: http://master:8088/proxy/application_1648846845675_0001/
22/04/02 05:31:22 INFO mapreduce.Job: Running job: job_1648846845675_0001
22/04/02 05:31:30 INFO mapreduce.Job: Job job_1648846845675_0001 running in uber mode : false
22/04/02 05:31:30 INFO mapreduce.Job:  map 0% reduce 0%
22/04/02 05:31:38 INFO mapreduce.Job:  map 100% reduce 0%
22/04/02 05:31:42 INFO mapreduce.Job:  map 100% reduce 100%
22/04/02 05:31:42 INFO mapreduce.Job: Job job_1648846845675_0001 completed successfully
22/04/02 05:31:42 INFO mapreduce.Job: Counters: 49
    File System Counters
            FILE: Number of bytes read=56
            FILE: Number of bytes written=230931
            FILE: Number of read operations=0
            FILE: Number of large read operations=0
            FILE: Number of write operations=0
            HDFS: Number of bytes read=136
            HDFS: Number of bytes written=34
            HDFS: Number of read operations=6
            HDFS: Number of large read operations=0
            HDFS: Number of write operations=2
    Job Counters 
            Launched map tasks=1
            Launched reduce tasks=1
            Data-local map tasks=1
            Total time spent by all maps in occupied slots (ms)=5501
            Total time spent by all reduces in occupied slots (ms)=1621
            Total time spent by all map tasks (ms)=5501
            Total time spent by all reduce tasks (ms)=1621
            Total vcore-seconds taken by all map tasks=5501
            Total vcore-seconds taken by all reduce tasks=1621
            Total megabyte-seconds taken by all map tasks=5633024
            Total megabyte-seconds taken by all reduce tasks=1659904
    Map-Reduce Framework
            Map input records=3
            Map output records=6
            Map output bytes=62
            Map output materialized bytes=56
            Input split bytes=98
            Combine input records=6
            Combine output records=4
            Reduce input groups=4
            Reduce shuffle bytes=56
            Reduce input records=4
            Reduce output records=4
            Spilled Records=8
            Shuffled Maps =1
            Failed Shuffles=0
            Merged Map outputs=1
            GC time elapsed (ms)=572
            CPU time spent (ms)=1860
            Physical memory (bytes) snapshot=428474368
            Virtual memory (bytes) snapshot=4219695104
            Total committed heap usage (bytes)=284164096
    Shuffle Errors
            BAD_ID=0
            CONNECTION=0
            IO_ERROR=0
            WRONG_LENGTH=0
            WRONG_MAP=0
            WRONG_REDUCE=0
    File Input Format Counters 
            Bytes Read=38
    File Output Format Counters 
            Bytes Written=34

[hadoop@master ~]$ hdfs dfs -cat /output/part-r-00000
Hadoop  1
Hello   3
Huasan  1
World   1
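The wordcount result can be cross-checked locally with standard shell tools, recounting the same three lines of data.txt:

```shell
# Recount the words of data.txt with coreutils to cross-check the MapReduce job.
# Expected counts match part-r-00000: Hadoop 1, Hello 3, Huasan 1, World 1.
printf 'Hello World\nHello Hadoop\nHello Huasan\n' \
  | tr ' ' '\n' | sort | uniq -c
```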

Chapter 6: Hive Component Installation and Configuration

Lab 1: Hive Component Installation and Configuration

1.1. Objectives

After completing this lab, you should be able to:

  • Install and configure the Hive component
  • Initialize and start the Hive component

1.2. Requirements

  • Be familiar with installing and configuring the Hive component
  • Understand how the Hive component is initialized and started

1.3. Procedure

1.3.1. Task 1: Download and Extract the Installation Files

1.3.1.1. Step 1: Base environment and installation preparation

Hive runs on top of a Hadoop system, so before installing the Hive component, make sure Hadoop is running normally. This chapter installs Hive on the master node of the fully distributed Hadoop system deployed earlier.
The Hive deployment plan and package paths are as follows:

(1) A fully distributed Hadoop system is already installed in the current environment.

(2) MySQL is installed locally (account root, password Password123$); the packages are under /opt/software/mysql-5.7.18.

(3) MySQL uses port 3306.

(4) The MySQL JDBC driver is /opt/software/mysql-connector-java-5.1.47.jar; it is used for the Hive metadata store.

(5) The Hive package is /opt/software/apache-hive-2.0.0-bin.tar.gz.

1.3.1.2. Step 2: Extract the installation files

(1) As the root user, extract the Hive package /opt/software/apache-hive-2.0.0-bin.tar.gz to /usr/local/src.

[root@master ~]# tar -zxvf /opt/software/apache-hive-2.0.0-bin.tar.gz -C /usr/local/src/

(2) Rename the extracted apache-hive-2.0.0-bin folder to hive.

[root@master ~]# mv /usr/local/src/apache-hive-2.0.0-bin/ /usr/local/src/hive/

(3) Change the owner and group of the hive directory to hadoop.

[root@master ~]# chown -R hadoop:hadoop /usr/local/src/hive 

1.3.2. Task 2: Set Up the Hive Environment

1.3.2.1. Step 1: Remove the MariaDB database

Hive stores its metadata in a MySQL database, so before deploying the Hive component, MySQL must be installed on the Linux system and configured: character set, security initialization, and remote access privileges. Log in as root and perform the following steps:

(1) Stop the Linux system firewall and disable it at boot.

[root@master ~]# systemctl stop firewalld
[root@master ~]# systemctl disable firewalld

(2) Remove the MariaDB packages bundled with the Linux system.

1) First check whether MariaDB is installed.

    [root@master ~]# rpm -qa | grep mariadb

2) Remove the MariaDB packages.
Nothing was installed here, so there is nothing to remove.

1.3.2.2. Step 2: Install the MySQL database

(1) Install the MySQL database packages in the following order: mysql common, mysql libs, mysql client.

[root@master ~]# cd /opt/software/mysql-5.7.18/

[root@master mysql-5.7.18]# rpm -ivh mysql-community-common-5.7.18-1.el7.x86_64.rpm
warning: mysql-community-common-5.7.18-1.el7.x86_64.rpm: Header V3 DSA/SHA1 Signature, key ID 5072e1f5: NOKEY
Preparing...  ################################# [100%]
package mysql-community-common-5.7.18-1.el7.x86_64 is already installed

[root@master mysql-5.7.18]# rpm -ivh mysql-community-libs-5.7.18-1.el7.x86_64.rpm
warning: mysql-community-libs-5.7.18-1.el7.x86_64.rpm: Header V3 DSA/SHA1 Signature, key ID 5072e1f5: NOKEY
Preparing...  ################################# [100%]
package mysql-community-libs-5.7.18-1.el7.x86_64 is already installed

[root@master mysql-5.7.18]# rpm -ivh mysql-community-client-5.7.18-1.el7.x86_64.rpm
warning: mysql-community-client-5.7.18-1.el7.x86_64.rpm: Header V3 DSA/SHA1 Signature, key ID 5072e1f5: NOKEY
Preparing...  ################################# [100%]
package mysql-community-client-5.7.18-1.el7.x86_64 is already installed

(2) Install the mysql server package.

[root@master mysql-5.7.18]# rpm -ivh mysql-community-server-5.7.18-1.el7.x86_64.rpm 
warning: mysql-community-server-5.7.18-1.el7.x86_64.rpm: Header V3 DSA/SHA1 Signature, key ID 5072e1f5: NOKEY
Preparing...  ################################# [100%]
package mysql-community-server-5.7.18-1.el7.x86_64 is already installed

(3) Modify the MySQL configuration by adding the settings shown in Table 6-1 to the /etc/my.cnf file.

Add the following lines directly below the symbolic-links=0 line in /etc/my.cnf:

default-storage-engine=innodb
innodb_file_per_table
collation-server=utf8_general_ci
init-connect='SET NAMES utf8'
character-set-server=utf8
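These lines can also be inserted by a script rather than by hand. A hedged sketch: add_mysql_settings is a hypothetical helper that places the settings directly below the symbolic-links=0 line of the given file:

```shell
# add_mysql_settings: insert the MySQL options from this guide directly
# below the symbolic-links=0 line of the given my.cnf (hypothetical helper).
add_mysql_settings() {
  cnf="$1"
  awk -v q="'" '
    { print }
    /^symbolic-links=0/ {
      print "default-storage-engine=innodb"
      print "innodb_file_per_table"
      print "collation-server=utf8_general_ci"
      print "init-connect=" q "SET NAMES utf8" q
      print "character-set-server=utf8"
    }
  ' "$cnf" > "${cnf}.new" && mv "${cnf}.new" "$cnf"
}
```

For example: add_mysql_settings /etc/my.cnf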

(4) Start the MySQL database.

[root@master ~]# systemctl start mysqld

(5) Check the MySQL database status. If the mysqld process state is active (running), MySQL is running normally.

If the mysqld state is failed, MySQL failed to start; in that case, check the /etc/my.cnf file.

[root@master ~]# systemctl status mysqld
● mysqld.service - MySQL Server
   Loaded: loaded (/usr/lib/systemd/system/mysqld.service; enabled; vendor preset: disabled)
   Active: active (running) since Sun 2022-04-10 22:54:39 CST; 1h 0min ago
 Docs: man:mysqld(8)
   http://dev.mysql.com/doc/refman/en/using-systemd.html
 Main PID: 929 (mysqld)
   CGroup: /system.slice/mysqld.service
   └─929 /usr/sbin/mysqld --daemonize --pid-file=/var/run/mysqld/my...

Apr 10 22:54:35 master systemd[1]: Starting MySQL Server...
Apr 10 22:54:39 master systemd[1]: Started MySQL Server.

(6) Look up the default MySQL password.

[root@master ~]# cat /var/log/mysqld.log | grep password
2022-04-08T16:20:04.456271Z 1 [Note] A temporary password is generated for root@localhost: 0yf>>yWdMd8_

The default password is generated randomly at installation, so it differs on every install.

(7) Initialize the MySQL database.

Run the mysql_secure_installation command to initialize MySQL. During initialization you set the database root password, which must satisfy the security rules (upper- and lowercase letters, digits, and special characters); here it is set to Password123$.

The initialization asks the following interactive questions:

1) Change the password for root ? ((Press y|Y for Yes, any other key for No): whether to change the root password; type y and press Enter.

2) Do you wish to continue with the password provided?(Press y|Y for Yes, any other key for No): whether to keep the password you entered; type y and press Enter.

3) Remove anonymous users? (Press y|Y for Yes, any other key for No): whether to remove anonymous users; type y and press Enter.

4) Disallow root login remotely? (Press y|Y for Yes, any other key for No): whether to forbid remote root login; type n and press Enter to keep remote root login allowed.

5) Remove test database and access to it? (Press y|Y for Yes, any other key for No): whether to remove the test database; type y and press Enter.

6) Reload privilege tables now? (Press y|Y for Yes, any other key for No): whether to reload the privilege tables; type y and press Enter.

The mysql_secure_installation session looks like this:

[root@master ~]# mysql_secure_installation

Securing the MySQL server deployment.

Enter password for user root: 
The 'validate_password' plugin is installed on the server.
The subsequent steps will run with the existing configuration
of the plugin.
Using existing password for root.

Estimated strength of the password: 100 
Change the password for root ? ((Press y|Y for Yes, any other key for No) : y

New password: 

Re-enter new password: 

Estimated strength of the password: 100 
Do you wish to continue with the password provided?(Press y|Y for Yes, any other key for No) : y
By default, a MySQL installation has an anonymous user,
allowing anyone to log into MySQL without having to have
a user account created for them. This is intended only for
testing, and to make the installation go a bit smoother.
You should remove them before moving into a production
environment.

Remove anonymous users? (Press y|Y for Yes, any other key for No) : y
Success.

Normally, root should only be allowed to connect from
'localhost'. This ensures that someone cannot guess at
the root password from the network.

Disallow root login remotely? (Press y|Y for Yes, any other key for No) : n

 ... skipping.
By default, MySQL comes with a database named 'test' that
anyone can access. This is also intended only for testing,
and should be removed before moving into a production
environment.

Remove test database and access to it? (Press y|Y for Yes, any other key for No) : y
 - Dropping test database...
Success.

 - Removing privileges on test database...
Success.

Reloading the privilege tables will ensure that all changes
made so far will take effect immediately.

Reload privilege tables now? (Press y|Y for Yes, any other key for No) : y
Success.

All done! 

(8) Grant the root user privileges to access the MySQL database both locally and remotely.

[root@master ~]# mysql -u root -p
Enter password: 
Welcome to the MySQL monitor.  Commands end with ; or \g.
Your MySQL connection id is 9
Server version: 5.7.18 MySQL Community Server (GPL)

Copyright (c) 2000, 2017, Oracle and/or its affiliates. All rights reserved.

Oracle is a registered trademark of Oracle Corporation and/or its
affiliates. Other names may be trademarks of their respective
owners.

Type 'help;' or '\h' for help. Type '\c' to clear the current input statement.

mysql> grant all privileges on *.* to root@'localhost' identified by 'Password123$'; 
Query OK, 0 rows affected, 1 warning (0.00 sec)

mysql> grant all privileges on *.* to root@'%' identified by 'Password123$';
Query OK, 0 rows affected, 1 warning (0.00 sec)

mysql> flush privileges;
Query OK, 0 rows affected (0.00 sec)

mysql> select user,host from mysql.user where user='root';
+------+-----------+
| user | host  |
+------+-----------+
| root | % |
| root | localhost |
+------+-----------+
2 rows in set (0.00 sec)

mysql> exit;
Bye
1.3.2.3. Step 3: Configure the Hive component

(1) Set the Hive environment variables and apply them.

[root@master ~]# vim /etc/profile

export HIVE_HOME=/usr/local/src/hive
export PATH=$PATH:$HIVE_HOME/bin

[root@master ~]# source /etc/profile

(2) Modify the Hive configuration file.

Switch to the hadoop user to perform the following Hive configuration steps.
Copy the hive-default.xml.template file under /usr/local/src/hive/conf to hive-site.xml.

[root@master ~]# su - hadoop 
Last login: Sun Apr 10 23:27:25 CS

[hadoop@master ~]$ cp /usr/local/src/hive/conf/hive-default.xml.template  /usr/local/src/hive/conf/hive-site.xml

(3) Edit the hive-site.xml file with vi to connect Hive to the MySQL database and to set the Hive temporary file path.

[hadoop@master ~]$ vi /usr/local/src/hive/conf/hive-site.xml

1) Set the MySQL database connection.

<name>javax.jdo.option.ConnectionURL</name>
<value>jdbc:mysql://master:3306/hive?createDatabaseIfNotExist=true&amp;useSSL=false</value>
<description>JDBC connect string for a JDBC metastore</description>

2) Set the MySQL root password.

<property>
<name>javax.jdo.option.ConnectionPassword</name>
<value>Password123$</value>
<description>password to use against metastore database</description>
</property>

3) Check metastore schema version consistency. If this already defaults to false, no change is needed.

<property>
<name>hive.metastore.schema.verification</name>
<value>false</value>
<description>
Enforce metastore schema version consistency.
True: Verify that version information stored in the metastore is compatible with the Hive jars. Also disable automatic schema migration.
False: Warn if the version information stored in the metastore doesn't match the version of the Hive jars.
</description>
</property>

4) Configure the database driver.

<property>
<name>javax.jdo.option.ConnectionDriverName</name>
<value>com.mysql.jdbc.Driver</value>
<description>Driver class name for a JDBC metastore</description>
</property>

5) Set the database user name javax.jdo.option.ConnectionUserName to root.

<property>
<name>javax.jdo.option.ConnectionUserName</name>
<value>root</value>
<description>Username to use against metastore database</description>
</property>

6) Replace ${system:java.io.tmpdir}/${system:user.name} in the following locations with the /usr/local/src/hive/tmp directory and its subdirectories.

The following four settings need to be changed:

<name>hive.querylog.location</name>
<value>/usr/local/src/hive/tmp</value>
<description>Location of Hive run time structured log file</description>

<name>hive.exec.local.scratchdir</name>
<value>/usr/local/src/hive/tmp</value>

<name>hive.downloaded.resources.dir</name>
<value>/usr/local/src/hive/tmp/resources</value>

<name>hive.server2.logging.operation.log.location</name>
<value>/usr/local/src/hive/tmp/operation_logs</value>
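Because these values share the ${system:java.io.tmpdir}/${system:user.name} prefix, much of the replacement can be done in one sed pass. A hedged sketch assuming GNU sed; fix_hive_tmpdirs is hypothetical, and you should still verify the resulting hive-site.xml, since some of the four properties may use a different placeholder form in the template and need a manual edit:

```shell
# fix_hive_tmpdirs: rewrite every ${system:java.io.tmpdir}/${system:user.name}
# occurrence in the given file to /usr/local/src/hive/tmp (hypothetical helper,
# GNU sed assumed for the -i flag).
fix_hive_tmpdirs() {
  sed -i 's|\${system:java\.io\.tmpdir}/\${system:user\.name}|/usr/local/src/hive/tmp|g' "$1"
}
```

For example: fix_hive_tmpdirs /usr/local/src/hive/conf/hive-site.xml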

7) Create the tmp temporary folder in the Hive installation directory.

[hadoop@master ~]$ mkdir /usr/local/src/hive/tmp 

At this point, the Hive component is installed and configured.

1.3.2.4. Step 4: Initialize the Hive metastore

1) Copy the MySQL JDBC driver (/opt/software/mysql-connector-java-5.1.46.jar) into the lib directory of the Hive installation;

[hadoop@master ~]$ cp /opt/software/mysql-connector-java-5.1.46.jar /usr/local/src/hive/lib/ 

2) Restart Hadoop.

[hadoop@master ~]$ stop-all.sh 
This script is Deprecated. Instead use stop-dfs.sh and stop-yarn.sh
Stopping namenodes on [master]
master: stopping namenode
10.10.10.129: stopping datanode
10.10.10.130: stopping datanode
Stopping secondary namenodes [0.0.0.0]
0.0.0.0: stopping secondarynamenode
stopping yarn daemons
stopping resourcemanager
10.10.10.129: stopping nodemanager
10.10.10.130: stopping nodemanager
no proxyserver to stop

[hadoop@master ~]$ start-all.sh 
This script is Deprecated. Instead use start-dfs.sh and start-yarn.sh
Starting namenodes on [master]
master: starting namenode, logging to /usr/local/src/hadoop/logs/hadoop-hadoop-namenode-master.out
10.10.10.129: starting datanode, logging to /usr/local/src/hadoop/logs/hadoop-hadoop-datanode-slave1.out
10.10.10.130: starting datanode, logging to /usr/local/src/hadoop/logs/hadoop-hadoop-datanode-slave2.out
Starting secondary namenodes [0.0.0.0]
0.0.0.0: starting secondarynamenode, logging to /usr/local/src/hadoop/logs/hadoop-hadoop-secondarynamenode-master.out
starting yarn daemons
starting resourcemanager, logging to /usr/local/src/hadoop/logs/yarn-hadoop-resourcemanager-master.out
10.10.10.130: starting nodemanager, logging to /usr/local/src/hadoop/logs/yarn-hadoop-nodemanager-slave2.out
10.10.10.129: starting nodemanager, logging to /usr/local/src/hadoop/logs/yarn-hadoop-nodemanager-slave1.out

3) Initialize the database.

[hadoop@master ~]$ schematool -initSchema -dbType mysql 
which: no hbase in (/usr/local/src/jdk/bin:/usr/local/src/hadoop/bin:/usr/local/src/hadoop/sbin:/usr/local/bin:/bin:/usr/bin:/usr/local/sbin:/usr/sbin:/usr/local/src/hive/bin:/home/hadoop/.local/bin:/home/hadoop/bin)
SLF4J: Class path contains multiple SLF4J bindings.
SLF4J: Found binding in [jar:file:/usr/local/src/hive/lib/hive-jdbc-2.0.0-standalone.jar!/org/slf4j/impl/StaticLoggerBinder.class]
SLF4J: Found binding in [jar:file:/usr/local/src/hive/lib/log4j-slf4j-impl-2.4.1.jar!/org/slf4j/impl/StaticLoggerBinder.class]
SLF4J: Found binding in [jar:file:/usr/local/src/hadoop/share/hadoop/common/lib/slf4j-log4j12-1.7.10.jar!/org/slf4j/impl/StaticLoggerBinder.class]
SLF4J: See http://www.slf4j.org/codes.html#multiple_bindings for an explanation.
SLF4J: Actual binding is of type [org.apache.logging.slf4j.Log4jLoggerFactory]
Metastore connection URL:     jdbc:mysql://master:3306/hive?createDatabaseIfNotExist=true&useSSL=false
Metastore Connection Driver :com.mysql.jdbc.Driver
Metastore connection User:   root
Mon Apr 11 00:46:32 CST 2022 WARN: Establishing SSL connection without server's identity verification is not recommended. According to MySQL 5.5.45+, 5.6.26+ and 5.7.6+ requirements SSL connection must be established by default if explicit option isn't set. For compliance with existing applications not using SSL the verifyServerCertificate property is set to 'false'. You need either to explicitly disable SSL by setting useSSL=false, or set useSSL=true and provide truststore for server certificate verification.
Starting metastore schema initialization to 2.0.0
Initialization script hive-schema-2.0.0.mysql.sql
Password123$
Password123$
No current connection
org.apache.hadoop.hive.metastore.HiveMetaException: Schema initialization FAILED! Metastore state would be inconsistent !!

Note: this initialization attempt failed ("No current connection" above indicates that schematool could not connect to MySQL). Verify the JDBC connection settings in hive-site.xml, in particular javax.jdo.option.ConnectionPassword, and re-run schematool -initSchema -dbType mysql until it completes successfully before starting Hive.

4) Start Hive.

[hadoop@master hive]$ hive
SLF4J: Class path contains multiple SLF4J bindings.
SLF4J: Found binding in [jar:file:/usr/local/src/hive/lib/hive-jdbc-2.0.0-standalone.jar!/org/slf4j/impl/StaticLoggerBinder.class]
SLF4J: Found binding in [jar:file:/usr/local/src/hive/lib/log4j-slf4j-impl-2.4.1.jar!/org/slf4j/impl/StaticLoggerBinder.class]
SLF4J: Found binding in [jar:file:/usr/local/src/hadoop/share/hadoop/common/lib/slf4j-log4j12-1.7.10.jar!/org/slf4j/impl/StaticLoggerBinder.class]
SLF4J: See http://www.slf4j.org/codes.html#multiple_bindings for an explanation.
SLF4J: Actual binding is of type [org.apache.logging.slf4j.Log4jLoggerFactory]

Logging initialized using configuration in jar:file:/usr/local/src/hive/lib/hive-common-2.0.0.jar!/hive-log4j2.properties
Fri May 20 18:51:50 CST 2022 WARN: Establishing SSL connection without server's identity verification is not recommended. According to MySQL 5.5.45+, 5.6.26+ and 5.7.6+ requirements SSL connection must be established by default if explicit option isn't set. For compliance with existing applications not using SSL the verifyServerCertificate property is set to 'false'. You need either to explicitly disable SSL by setting useSSL=false, or set useSSL=true and provide truststore for server certificate verification.
(the same SSL warning is printed several more times, once per metastore connection)
Hive-on-MR is deprecated in Hive 2 and may not be available in the future versions. Consider using a different execution engine (i.e. spark, tez) or using Hive 1.X releases.
hive> 

Chapter 7: ZooKeeper Component Installation and Configuration

Experiment 1: ZooKeeper Installation and Configuration

1.1 Experiment Objectives

After completing this experiment, you should be able to:

  • Download and install ZooKeeper
  • Master the ZooKeeper configuration options
  • Start ZooKeeper

1.2 Experiment Requirements

  • Understand the ZooKeeper configuration options
  • Be familiar with starting ZooKeeper

1.3 Experiment Procedure

1.3.1 Task 1: Configure Time Synchronization

Install chrony on all three nodes, replace the default pool servers in /etc/chrony.conf with a single reachable time server (time1.aliyun.com here), then restart and enable the chronyd service.
[root@master ~]# yum -y install chrony

[root@master ~]# cat /etc/chrony.conf 
# Use public servers from the pool.ntp.org project.
# Please consider joining the pool (http://www.pool.ntp.org/join.html).
server time1.aliyun.com iburst 
 
[root@master ~]# systemctl restart chronyd.service 
[root@master ~]# systemctl enable chronyd.service 

[root@master ~]# date 
Fri Apr 15 15:40:14 CST 2022
[root@slave1 ~]# yum -y install chrony

[root@slave1 ~]# cat /etc/chrony.conf 
# Use public servers from the pool.ntp.org project.
# Please consider joining the pool (http://www.pool.ntp.org/join.html).
server time1.aliyun.com iburst

[root@slave1 ~]# systemctl restart chronyd.service
[root@slave1 ~]# systemctl enable chronyd.service

[root@slave1 ~]# date
Fri Apr 15 15:40:17 CST 2022  
[root@slave2 ~]# yum -y install chrony

[root@slave2 ~]# cat /etc/chrony.conf 
# Use public servers from the pool.ntp.org project.
# Please consider joining the pool (http://www.pool.ntp.org/join.html).
server time1.aliyun.com iburst

[root@slave2 ~]# systemctl restart chronyd.service
[root@slave2 ~]# systemctl enable chronyd.service 

[root@slave2 ~]# date
Fri Apr 15 15:40:20 CST 2022
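The three per-node edits above differ only in the prompt; what matters is that every node ends up with the same single server line. A minimal sketch that checks a chrony.conf for exactly one configured server (run against a demo file under /tmp here, since editing the real /etc/chrony.conf requires root):

```shell
# Demo stand-in for /etc/chrony.conf after editing.
cat > /tmp/chrony-demo.conf <<'EOF'
# Use public servers from the pool.ntp.org project.
server time1.aliyun.com iburst
EOF

# Count configured server lines; each node should have exactly one.
grep -c '^server ' /tmp/chrony-demo.conf   # → 1
```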
1.3.2 Task 2: Download and Install ZooKeeper

The latest ZooKeeper release can be obtained from the project site http://hadoop.apache.org/zookeeper/. The ZooKeeper version you install must be compatible with your Hadoop environment.

Note: the firewall must be disabled on every node, otherwise connection problems will occur.

1. The ZooKeeper installation package zookeeper-3.4.8.tar.gz has been placed in the /opt/software directory of the Linux system.

2. Extract the package to the target directory by running the following commands on the Master node.

[root@master ~]# tar xf /opt/software/zookeeper-3.4.8.tar.gz -C /usr/local/src/

[root@master ~]# cd /usr/local/src/
[root@master src]# mv zookeeper-3.4.8/ zookeeper
1.3.3 Task 3: ZooKeeper Configuration Options
1.3.3.1 Step 1: Configure the Master Node

(1) Create the data and logs folders under the ZooKeeper installation directory.

[root@master src]# cd /usr/local/src/zookeeper/
[root@master zookeeper]# mkdir data logs

(2) Write the node's identifier into the myid file on each node; the number differs per node: write 1 on master, 2 on slave1, and 3 on slave2.

[root@master zookeeper]# echo '1' > /usr/local/src/zookeeper/data/myid

(3) Modify the configuration file zoo.cfg.

[root@master zookeeper]# cd /usr/local/src/zookeeper/conf/
[root@master conf]# cp zoo_sample.cfg zoo.cfg

Change the dataDir parameter as follows:

[root@master conf]# vi zoo.cfg 
dataDir=/usr/local/src/zookeeper/data

(4) Append the following parameters to the end of zoo.cfg; they describe the ports used by the three ZooKeeper nodes.

server.1=master:2888:3888
server.2=slave1:2888:3888
server.3=slave2:2888:3888
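In each server.N line, 2888 is the port followers use to connect to the leader and 3888 is the leader-election port. Since the three lines differ only in the ID and hostname, they can also be generated in a loop; a minimal sketch that writes to a demo file (the real target would be conf/zoo.cfg):

```shell
cfg=/tmp/zoo-demo.cfg
: > "$cfg"                       # start from an empty demo file
id=1
for host in master slave1 slave2; do
  # server.<myid>=<hostname>:<quorum port>:<election port>
  echo "server.${id}=${host}:2888:3888" >> "$cfg"
  id=$((id + 1))
done
cat "$cfg"
```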

(5) Change the owner of the ZooKeeper installation directory to the hadoop user.

[root@master conf]# chown -R hadoop:hadoop /usr/local/src/ 
1.3.3.2 Step 2: Configure the Slave Nodes

(1) Copy the ZooKeeper installation directory from the Master node to the two Slave nodes (addressed as node1 and node2 in the commands below).

[root@master ~]# scp -r /usr/local/src/zookeeper node1:/usr/local/src/
[root@master ~]# scp -r /usr/local/src/zookeeper node2:/usr/local/src/

(2) On slave1, change the owner of the zookeeper directory to the hadoop user.

[root@slave1 ~]# chown -R hadoop:hadoop /usr/local/src/
[root@slave1 ~]# ll /usr/local/src/
total 4
drwxr-xr-x. 12 hadoop hadoop  183 Apr  2 18:11 hadoop
drwxr-xr-x   9 hadoop hadoop  183 Apr 15 16:37 hbase
drwxr-xr-x.  8 hadoop hadoop  255 Apr  2 18:06 jdk
drwxr-xr-x  12 hadoop hadoop 4096 Apr 22 15:31 zookeeper

(3) On slave1, set the node's myid to 2.

[root@slave1 ~]# echo 2 > /usr/local/src/zookeeper/data/myid

(4) On slave2, change the owner of the zookeeper directory to the hadoop user.

[root@slave2 ~]# chown -R hadoop:hadoop /usr/local/src/

(5) On slave2, set the node's myid to 3.

[root@slave2 ~]# echo 3 > /usr/local/src/zookeeper/data/myid
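The number in each node's data/myid must match the N of that node's server.N entry in zoo.cfg; this is how a starting node identifies which line in the shared configuration describes itself. A minimal sketch of that correspondence check, using demo paths under /tmp instead of the real installation:

```shell
demo=/tmp/zk-id-demo
mkdir -p "$demo/data"

# The same three server lines configured earlier.
printf 'server.1=master:2888:3888\nserver.2=slave1:2888:3888\nserver.3=slave2:2888:3888\n' > "$demo/zoo.cfg"

# On slave1 the myid file contains 2 ...
echo 2 > "$demo/data/myid"

# ... so it must select the server.2 line.
grep "^server.$(cat "$demo/data/myid")=" "$demo/zoo.cfg"   # → server.2=slave1:2888:3888
```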
1.3.3.3 Step 3: Configure the System Environment Variables

Add the environment variable configuration on all three nodes: master, slave1, and slave2.

[root@master conf]# vi /etc/profile.d/zookeeper.sh
export ZOOKEEPER_HOME=/usr/local/src/zookeeper
export PATH=${ZOOKEEPER_HOME}/bin:$PATH

[root@master ~]# scp /etc/profile.d/zookeeper.sh node1:/etc/profile.d/
zookeeper.sh                                  100%   87    42.3KB/s   00:00

[root@master ~]# scp /etc/profile.d/zookeeper.sh node2:/etc/profile.d/
zookeeper.sh                                  100%   87    50.8KB/s   00:00
1.3.4 Task 4: Start ZooKeeper

ZooKeeper must be started as the hadoop user.

(1) Start ZooKeeper on each of master, slave1, and slave2 with the zkServer.sh start command.

[root@master ~]# su - hadoop 
Last login: Fri Apr 15 21:54:17 CST 2022 on pts/0

[hadoop@master ~]$ jps
3922 Jps

[hadoop@master ~]$ zkServer.sh start
ZooKeeper JMX enabled by default
Using config: /usr/local/src/zookeeper/bin/../conf/zoo.cfg
Starting zookeeper ... STARTED

[hadoop@master ~]$ jps
3969 Jps
3950 QuorumPeerMain

[root@slave1 ~]# su - hadoop 
Last login: Fri Apr 15 22:06:47 CST 2022 on pts/0

[hadoop@slave1 ~]$ jps
1370 Jps

[hadoop@slave1 ~]$ zkServer.sh start
ZooKeeper JMX enabled by default
Using config: /usr/local/src/zookeeper/bin/../conf/zoo.cfg
Starting zookeeper ... STARTED

[hadoop@slave1 ~]$ jps
1395 QuorumPeerMain
1421 Jps

[root@slave2 ~]# su - hadoop 
Last login: Fri Apr 15 16:25:52 CST 2022 on pts/1

[hadoop@slave2 ~]$ jps
1336 Jps

[hadoop@slave2 ~]$ zkServer.sh start
ZooKeeper JMX enabled by default
Using config: /usr/local/src/zookeeper/bin/../conf/zoo.cfg
Starting zookeeper ... STARTED

[hadoop@slave2 ~]$ jps
1361 QuorumPeerMain
1387 Jps

(2) After all three nodes have started, check the ZooKeeper running state on each node.

[hadoop@master conf]$ zkServer.sh status
ZooKeeper JMX enabled by default
Using config: /usr/local/src/zookeeper/bin/../conf/zoo.cfg
Mode: follower

[hadoop@slave1 ~]$ zkServer.sh status
ZooKeeper JMX enabled by default
Using config: /usr/local/src/zookeeper/bin/../conf/zoo.cfg
Mode: leader

[hadoop@slave2 conf]$ zkServer.sh status
ZooKeeper JMX enabled by default
Using config: /usr/local/src/zookeeper/bin/../conf/zoo.cfg
Mode: follower
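In a healthy quorum exactly one node reports Mode: leader and the rest report Mode: follower (which node wins the election can vary between runs). For scripted checks, the Mode line can be extracted from the status output; a minimal sketch, fed from a captured sample string since the real command needs a running quorum:

```shell
# Sample output as printed by `zkServer.sh status` above.
status_output='ZooKeeper JMX enabled by default
Using config: /usr/local/src/zookeeper/bin/../conf/zoo.cfg
Mode: leader'

# Keep only the value after "Mode: ".
mode=$(printf '%s\n' "$status_output" | awk -F': ' '/^Mode:/ {print $2}')
echo "$mode"   # → leader
```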

Chapter 8: HBase Component Installation and Configuration

Experiment 1: HBase Installation and Configuration

1.1 Experiment Objectives

After completing this experiment, you should be able to:

  • Install and configure HBase

  • Use common HBase shell commands

