1. Host planning
| Host name | IP address | Operating system | Deployed software | Running processes | Notes |
| mini01 | 172.16.1.11 (internal), 10.0.0.11 (external) | CentOS 7.5 | Jdk-8, zookeeper-3.4.5, Hadoop2.7.6, hbase-2.0.2, kafka_2.11-2.0.0, spark-2.4.0-hadoop2.7 (master) | QuorumPeerMain | |
| mini02 | 172.16.1.12 (internal), 10.0.0.12 (external) | CentOS 7.5 | Jdk-8, zookeeper-3.4.5, Hadoop2.7.6, hbase-2.0.2, kafka_2.11-2.0.0, spark-2.4.0-hadoop2.7 (master) | QuorumPeerMain | |
| mini03 | 172.16.1.13 (internal), 10.0.0.13 (external) | CentOS 7.5 | Jdk-8, zookeeper-3.4.5, Hadoop2.7.6, hbase-2.0.2, kafka_2.11-2.0.0, spark-2.4.0-hadoop2.7 | QuorumPeerMain | |
| mini04 | 172.16.1.14 (internal), 10.0.0.14 (external) | CentOS 7.5 | Jdk-8, zookeeper-3.4.5, Hadoop2.7.6, hbase-2.0.2, spark-2.4.0-hadoop2.7 | QuorumPeerMain | |
| mini05 | 172.16.1.15 (internal), 10.0.0.15 (external) | CentOS 7.5 | Jdk-8, zookeeper-3.4.5, Hadoop2.7.6, hbase-2.0.2, spark-2.4.0-hadoop2.7 | QuorumPeerMain | |
Note
High availability is achieved by relying on ZooKeeper and starting at least two Master nodes.
2. Passwordless login
Set up key-based passwordless SSH login from mini01 and mini02 to mini01, mini02, mini03, mini04 and mini05.
See the article: Hadoop2.7.6_01_部署
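The details live in the referenced article, so the following is only a minimal sketch of the idea. It assumes OpenSSH's ssh-copy-id is installed, that the yun user already has a key pair (for example one created with ssh-keygen), and that the same user exists on every host. The names run, DRY_RUN and setup_passwordless are hypothetical helpers introduced for illustration. By default the script only prints the commands it would run.

```shell
#!/bin/sh
# DRY_RUN=1 (the default in this sketch) prints each command instead of running it.
DRY_RUN="${DRY_RUN:-1}"

run() {
  if [ "$DRY_RUN" = "1" ]; then
    echo "$*"
  else
    "$@"
  fi
}

# Hypothetical helper: $1 = remote user, remaining arguments = target hosts.
setup_passwordless() {
  user="$1"
  shift
  for host in "$@"; do
    # ssh-copy-id appends the local public key to the remote authorized_keys file.
    run ssh-copy-id "${user}@${host}"
  done
}

# Intended to be run once on mini01 and once on mini02.
setup_passwordless yun mini01 mini02 mini03 mini04 mini05
```

Setting DRY_RUN=0 would execute the commands for real; each host then prompts for the password one last time.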
3. JDK (Java 8)
See the article: Hadoop2.7.6_01_部署
4. ZooKeeper deployment
See the article: zookeeper-02 部署
Then start the ZooKeeper service.
5. Spark deployment steps
5.1. Spark installation
    [yun@mini01 software]$ pwd
    /app/software
    [yun@mini01 software]$ ll
    total 238572
    -rw-r--r-- 1 yun yun 227893062 Nov 19 21:24 spark-2.4.0-bin-hadoop2.7.tgz
    [yun@mini01 software]$ tar xf spark-2.4.0-bin-hadoop2.7.tgz
    [yun@mini01 software]$ mv spark-2.4.0-bin-hadoop2.7 /app/
    [yun@mini01 software]$ cd /app/
    [yun@mini01 ~]$ ln -s spark-2.4.0-bin-hadoop2.7/ spark
    [yun@mini01 ~]$ ll -d spark-*
    drwxr-xr-x 13 yun yun 211 Oct 29 14:36 spark-2.4.0-bin-hadoop2.7
    lrwxrwxrwx 1 yun yun 26 Nov 24 14:23 spark -> spark-2.4.0-bin-hadoop2.7/
5.2. Environment variable changes
According to the plan, the environment variables must be changed on mini01, mini02, mini03, mini04 and mini05.
    # Root privileges are required to add the environment variables
    [root@mini01 ~]# tail /etc/profile
    ………………
    # Spark environment variables
    export SPARK_HOME="/app/spark"
    export PATH=$SPARK_HOME/bin:$SPARK_HOME/sbin:$PATH

    [root@mini01 ~]# logout
    [yun@mini01 conf]$ source /etc/profile    # reload the environment variables
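Because the same two lines have to be added on all five hosts, it can help to make the change idempotent. The sketch below is one possible way to do that; add_spark_env is a hypothetical helper name, and the demonstration writes to a scratch file rather than to /etc/profile.

```shell
#!/bin/sh
# Hypothetical helper: append the Spark variables to the file given as $1,
# but only if SPARK_HOME is not already exported there.
add_spark_env() {
  profile_file="$1"
  if grep -q 'export SPARK_HOME=' "$profile_file" 2>/dev/null; then
    return 0
  fi
  # The quoted heredoc delimiter keeps $SPARK_HOME and $PATH unexpanded.
  cat >> "$profile_file" <<'EOF'

# Spark environment variables
export SPARK_HOME="/app/spark"
export PATH=$SPARK_HOME/bin:$SPARK_HOME/sbin:$PATH
EOF
}

# Demonstration on a scratch file; the second call changes nothing.
scratch="$(mktemp)"
add_spark_env "$scratch"
add_spark_env "$scratch"
cat "$scratch"
rm -f "$scratch"
```

On a real host the call would be add_spark_env /etc/profile, run as root, followed by source /etc/profile in the yun user's shell.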
5.3. Configuration changes
    [yun@mini01 conf]$ pwd
    /app/spark/conf
    [yun@mini01 conf]$ cp -a spark-env.sh.template spark-env.sh
    [yun@mini01 conf]$ tail spark-env.sh    # edit the environment configuration
    # Options for native BLAS, like Intel MKL, OpenBLAS, and so on.
    # You might get better performance to enable these options if using native BLAS (see SPARK-21305).
    # - MKL_NUM_THREADS=1 Disable multi-threading of Intel MKL
    # - OPENBLAS_NUM_THREADS=1 Disable multi-threading of OpenBLAS

    # Add the following configuration
    # Set JAVA_HOME
    export JAVA_HOME=/app/jdk
    # -Dspark.deploy.recoveryMode=ZOOKEEPER    # use ZooKeeper for recovery when a failure occurs
    # -Dspark.deploy.zookeeper.url=mini01:2181,mini02:2181,mini03:2181,mini04:2181,mini05:2181    # ZooKeeper connection information
    # -Dspark.deploy.zookeeper.dir=/spark    # ZooKeeper directory in which Spark stores its data
    export SPARK_DAEMON_JAVA_OPTS="-Dspark.deploy.recoveryMode=ZOOKEEPER -Dspark.deploy.zookeeper.url=mini01:2181,mini02:2181,mini03:2181,mini04:2181,mini05:2181 -Dspark.deploy.zookeeper.dir=/spark"
    # Maximum memory each Worker may use; my virtual machines only have 2 GB.
    # On a real server with 128 GB you could set this to 100 GB.
    # So here it is set to 1024m (or 1g).
    export SPARK_WORKER_MEMORY=1024m
    # Maximum number of CPU cores each Worker may use; my virtual machines only have one.
    # On a real server with 32 cores you could set this to 32.
    export SPARK_WORKER_CORES=1
    # Port for submitting applications; 7077 is already the default, change it here if needed.
    export SPARK_MASTER_PORT=7077

    [yun@mini01 conf]$ pwd
    /app/spark/conf
    [yun@mini01 conf]$ cp -a slaves.template slaves
    [yun@mini01 conf]$ tail slaves    # edit the slaves configuration
    # distributed under the License is distributed on an "AS IS" BASIS,
    # WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
    # See the License for the specific language governing permissions and
    # limitations under the License.
    #

    # A Spark Worker will be started on each of the machines listed below.
    mini03
    mini04
    mini05
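The long SPARK_DAEMON_JAVA_OPTS value is easy to mistype, since a misspelled property name is silently ignored. As an illustration only, the value could be assembled from the host list; zk_url and spark_ha_opts are hypothetical helper names, while the three property names and the /spark directory are the ones used in spark-env.sh above.

```shell
#!/bin/sh
# Hypothetical helper: build "host:port,host:port,..." from a port and host names.
zk_url() {
  port="$1"
  shift
  url=""
  for host in "$@"; do
    # Prefix a comma only when url already holds an entry.
    url="${url:+${url},}${host}:${port}"
  done
  printf '%s\n' "$url"
}

# Hypothetical helper: $1 = ZooKeeper URL, $2 = ZooKeeper directory for Spark.
spark_ha_opts() {
  printf '%s\n' "-Dspark.deploy.recoveryMode=ZOOKEEPER -Dspark.deploy.zookeeper.url=$1 -Dspark.deploy.zookeeper.dir=$2"
}

SPARK_DAEMON_JAVA_OPTS="$(spark_ha_opts "$(zk_url 2181 mini01 mini02 mini03 mini04 mini05)" /spark)"
export SPARK_DAEMON_JAVA_OPTS
echo "$SPARK_DAEMON_JAVA_OPTS"
```

Whether generating the string is worth it for five hosts is a matter of taste; the main benefit is that the property names are written once.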
Configuration notes
# -Dspark.deploy.zookeeper.dir=/spark    # ZooKeeper directory in which Spark stores its data
    [yun@mini05 ~]$ zkCli.sh    # enter the ZooKeeper command line (check this after Spark has started)
    [zk: localhost:2181(CONNECTED) 0] ls /    # the /spark entry is the directory configured in spark-env.sh
    [cluster, brokers, zookeeper, yarn-leader-election, hadoop-ha, admin, isr_change_notification, log_dir_event_notification, controller_epoch, spark, consumers, latest_producer_id_block, config, hbase]
    [zk: localhost:2181(CONNECTED) 1] ls /spark
    [leader_election, master_status]
    [zk: localhost:2181(CONNECTED) 2] ls /spark/master_status
    [worker_worker-20181125113658-172.16.1.13-18433, worker_worker-20181125113658-172.16.1.14-14175, worker_worker-20181125113658-172.16.1.15-8887]
    [zk: localhost:2181(CONNECTED) 3] ls /spark/leader_election
    [_c_6c6d0c36-3017-4354-a05c-9414a78d79e2-latch-0000000000, _c_04ceffff-b763-454a-b3f1-7fb56f56fa84-latch-0000000001]
5.4. Distribute to the other machines
Distribute Spark to mini02, mini03, mini04 and mini05.
mini01 and mini02 act as the masters.
    [yun@mini01 ~]$ scp -pr spark-2.4.0-bin-hadoop2.7/ yun@mini02:/app    # copy to mini02
    [yun@mini01 ~]$ scp -pr spark-2.4.0-bin-hadoop2.7/ yun@mini03:/app    # copy to mini03
    [yun@mini01 ~]$ scp -pr spark-2.4.0-bin-hadoop2.7/ yun@mini04:/app    # copy to mini04
    [yun@mini01 ~]$ scp -pr spark-2.4.0-bin-hadoop2.7/ yun@mini05:/app    # copy to mini05
Run the following on mini02, mini03, mini04 and mini05 (mini04 is shown as the example):
    [yun@mini04 ~]$ pwd
    /app
    [yun@mini04 ~]$ ll -d spark-2.4.0-bin-hadoop2.7
    drwxr-xr-x 13 yun yun 211 Oct 29 14:36 spark-2.4.0-bin-hadoop2.7
    [yun@mini04 ~]$ ln -s spark-2.4.0-bin-hadoop2.7/ spark
    [yun@mini04 ~]$ ll -d spark-*
    drwxr-xr-x 13 yun yun 211 Oct 29 14:36 spark-2.4.0-bin-hadoop2.7
    lrwxrwxrwx 1 yun yun 26 Nov 24 23:39 spark -> spark-2.4.0-bin-hadoop2.7/
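The copy and symlink steps repeat per host, so they can be written as a loop. The sketch below only prints the commands, which keeps it safe to run anywhere; print_distribute_cmds is a hypothetical helper, and creating the symlink over ssh instead of logging in to each host is an assumption that relies on the passwordless login from section 2.

```shell
#!/bin/sh
spark_dir="spark-2.4.0-bin-hadoop2.7"

# Hypothetical helper: print the scp and symlink commands for every host given.
print_distribute_cmds() {
  for host in "$@"; do
    echo "scp -pr /app/${spark_dir}/ yun@${host}:/app"
    echo "ssh yun@${host} 'cd /app && ln -s ${spark_dir}/ spark'"
  done
}

print_distribute_cmds mini02 mini03 mini04 mini05
```

After reviewing the printed commands, the output could be piped to sh on mini01 to execute them.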
5.5. Start Spark
5.5.1. On mini01
    [yun@mini01 sbin]$ pwd
    /app/spark/sbin
    [yun@mini01 sbin]$ ./start-all.sh    # use the stop-all.sh script to shut down
    starting org.apache.spark.deploy.master.Master, logging to /app/spark/logs/spark-yun-org.apache.spark.deploy.master.Master-1-mini01.out
    mini03: starting org.apache.spark.deploy.worker.Worker, logging to /app/spark/logs/spark-yun-org.apache.spark.deploy.worker.Worker-1-mini03.out
    mini04: starting org.apache.spark.deploy.worker.Worker, logging to /app/spark/logs/spark-yun-org.apache.spark.deploy.worker.Worker-1-mini04.out
    mini05: starting org.apache.spark.deploy.worker.Worker, logging to /app/spark/logs/spark-yun-org.apache.spark.deploy.worker.Worker-1-mini05.out
    [yun@mini01 ~]$
    [yun@mini01 ~]$ jps    # check the process status
    4033 QuorumPeerMain
    4683 Jps
    4575 Master
5.5.2. On mini02
    [yun@mini02 sbin]$ pwd
    /app/spark/sbin
    [yun@mini02 sbin]$ ./start-master.sh
    starting org.apache.spark.deploy.master.Master, logging to /app/spark/logs/spark-yun-org.apache.spark.deploy.master.Master-1-mini02.out
    [yun@mini02 sbin]$ jps    # check the process status
    2914 Master
    2999 Jps
    2313 QuorumPeerMain
5.5.3. Check processes on mini03
    [yun@mini03 ~]$ jps
    2824 Jps
    2558 QuorumPeerMain
    2766 Worker
5.5.4. Check processes on mini04
    [yun@mini04 ~]$ jps
    2931 Jps
    2824 Worker
    2555 QuorumPeerMain
5.5.5. Check processes on mini05
    [yun@mini05 ~]$ jps
    2806 Jps
    2747 Worker
    2527 QuorumPeerMain
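Checking five jps listings by eye gets tedious, so the check can be scripted. The following sketch reads jps-style output ("pid name" per line) from standard input and reports whether a given process name is present; has_process is a hypothetical helper, and the sample text is the mini01 listing shown above.

```shell
#!/bin/sh
# Hypothetical helper: succeed if the process name $1 appears in the second
# column of the jps output read from stdin.
has_process() {
  awk -v name="$1" '$2 == name { found = 1 } END { exit (found ? 0 : 1) }'
}

# Sample input copied from the mini01 listing.
sample="4033 QuorumPeerMain
4683 Jps
4575 Master"

if printf '%s\n' "$sample" | has_process Master; then
  echo "Master found"
fi
```

On a live host the same helper would be fed from the real command, for example jps | has_process Worker on mini03, mini04 and mini05.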
5.6. Browser access
    http://mini01:8080/
    http://mini02:8080/
Note
If we stop the Spark master on mini01, after a short wait the master on mini02 changes from STANDBY to ALIVE.
If the master on mini01 is then started again, it comes up in STANDBY state.
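The failover can also be observed from the command line instead of the browser. The sketch below assumes that the standalone master web UI serves a JSON summary at /json/ with a top-level "status" field (ALIVE or STANDBY); treat that endpoint and field name as assumptions to verify against your Spark version. parse_master_status and CHECK_LIVE are hypothetical names, and the live query only runs when CHECK_LIVE=1 is set.

```shell
#!/bin/sh
# Hypothetical helper: extract the value of the "status" field from JSON on stdin.
parse_master_status() {
  sed -n 's/.*"status"[[:space:]]*:[[:space:]]*"\([A-Z_]*\)".*/\1/p' | head -n 1
}

# Query both masters only when explicitly requested (requires curl and a running cluster).
if [ "${CHECK_LIVE:-0}" = "1" ]; then
  for host in mini01 mini02; do
    status="$(curl -s --max-time 5 "http://${host}:8080/json/" | parse_master_status)"
    echo "${host}: ${status:-unreachable}"
  done
fi
```

Running it before and after stopping the master on mini01 should show the roles swap, matching what the web pages display.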