I. Install the Ubuntu operating system
Reference: https://www.cnblogs.com/Alier/p/6337151.html
II. Download Hadoop and Hive
Hadoop: https://hadoop.apache.org/releases.html
Hive: http://hive.apache.org/downloads.html
III. Install Hadoop
1. Preparation
```shell
sudo useradd -m hadoop -s /bin/bash   # create the hadoop user
sudo passwd hadoop                    # set a password for hadoop (you will be asked to type it twice)
sudo adduser hadoop sudo              # give the hadoop user administrator privileges
su - hadoop                           # switch the current user to hadoop
sudo apt-get update
```
2. Install SSH and set up passwordless login
```shell
sudo apt-get install openssh-server    # install the SSH server
ssh localhost                          # log in over SSH; type "yes" at the first prompt
exit                                   # log out of the ssh localhost session
cd ~/.ssh/                             # if you cannot enter this directory, run ssh localhost once
ssh-keygen -t rsa
cat ./id_rsa.pub >> ./authorized_keys  # authorize the key
ssh localhost                          # this time no password should be required
```
```
hadoop@ge-hadoop:~$ ssh localhost
Welcome to Ubuntu 16.04.2 LTS (GNU/Linux 4.8.0-36-generic x86_64)
 * Documentation:  https://help.ubuntu.com
 * Management:     https://landscape.canonical.com
 * Support:        https://ubuntu.com/advantage
312 packages can be updated.
14 updates are security updates.
Last login: Mon Mar 11 21:37:12 2019 from 127.0.0.1
hadoop@ge-hadoop:~$
```
If you see output like the above, passwordless login is working.
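If ssh localhost still prompts for a password, a common cause is that sshd silently ignores the key when `~/.ssh` or `authorized_keys` is writable by others. A minimal sketch of the required permission modes, demonstrated on a throwaway directory standing in for `~/.ssh` so it is safe to run anywhere:

```shell
# sshd requires strict permissions before it will honor authorized_keys.
demo=$(mktemp -d)                        # throwaway stand-in for $HOME
mkdir -p "$demo/.ssh"
touch "$demo/.ssh/authorized_keys"
chmod 700 "$demo/.ssh"                   # the directory must be 700
chmod 600 "$demo/.ssh/authorized_keys"   # the key file must be 600
dir_mode=$(stat -c '%a' "$demo/.ssh")
key_mode=$(stat -c '%a' "$demo/.ssh/authorized_keys")
echo "$dir_mode $key_mode"               # 700 600
rm -rf "$demo"
```

In practice you would run the two chmod commands against your real `~/.ssh` directory.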
3. Install and configure the Java environment
Java was already set up on the author's machine when this article was written, so the installation steps cannot be shown here.
Reference: https://www.linuxidc.com/Linux/2015-01/112030.htm
The key part is configuring the Java environment variables, as follows:
```shell
sudo gedit ~/.bashrc        # add the following to this file
export JAVA_HOME=/usr/java/jdk1.8.0_201   # your Java installation path

sudo gedit /etc/profile     # add the following to this file
export JAVA_HOME=/usr/java/jdk1.8.0_201
export JRE_HOME=${JAVA_HOME}/jre
export CLASSPATH=.:${JAVA_HOME}/lib:${JRE_HOME}/lib
export PATH=${JAVA_HOME}/bin:$PATH
```
JAVA_HOME is the path where you installed Java.
After editing both files, run the following to make the changes take effect immediately:
```shell
source ~/.bashrc
source /etc/profile
```
Then, in a terminal, run:
```
hadoop@ge-hadoop:~$ java -version
java version "1.8.0_201"
Java(TM) SE Runtime Environment (build 1.8.0_201-b09)
Java HotSpot(TM) 64-Bit Server VM (build 25.201-b09, mixed mode)
```
This output shows the installation and configuration succeeded.
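Before moving on, it can save debugging time to confirm that JAVA_HOME really points at a JDK, since a mistyped path is a frequent cause of Hadoop startup failures later. A small sketch of such a check; the real path (`/usr/java/jdk1.8.0_201` above) is replaced here by a throwaway directory so the logic itself can be demonstrated:

```shell
# Report whether a candidate JAVA_HOME contains an executable java binary.
check_java_home() {
  [ -x "$1/bin/java" ] && echo "ok" || echo "no java binary under $1"
}

fake=$(mktemp -d)                    # stands in for your real JAVA_HOME
mkdir -p "$fake/bin"
touch "$fake/bin/java" && chmod +x "$fake/bin/java"

r1=$(check_java_home "$fake")        # ok
r2=$(check_java_home "$fake/none")   # no java binary under ...
echo "$r1"
echo "$r2"
rm -rf "$fake"
```

In practice, call `check_java_home "$JAVA_HOME"` after sourcing your profile.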
4. Install and configure Hadoop
Extract the Hadoop archive into /usr/local:
```shell
sudo tar -zxvf hadoop-2.9.1.tar.gz -C /usr/local
cd /usr/local
sudo mv hadoop-2.9.1 hadoop
sudo chown -R hadoop ./hadoop   # change the owner of the files (recursively)
```
Add the Hadoop environment variables:
```shell
sudo gedit /etc/profile   # add the following lines
export HADOOP_HOME=/usr/local/hadoop
export PATH=.:$HADOOP_HOME/bin:$JAVA_HOME/bin:$PATH
```
Run source /etc/profile to make the changes take effect immediately.
Check the Hadoop version:
```
hadoop@ge-hadoop:~$ hadoop version
Hadoop 2.9.1
Subversion https://github.com/apache/hadoop.git -r e30710aea4e6e55e69372929106cf119af06fd0e
Compiled by root on 2018-04-16T09:33Z
Compiled with protoc 2.5.0
From source with checksum 7d6d2b655115c6cc336d662cc2b919bd
This command was run using /usr/local/hadoop/share/hadoop/common/hadoop-common-2.9.1.jar
```
This output shows the installation succeeded.
5. Configure pseudo-distributed mode
This mainly involves editing the Hadoop configuration files.
```shell
cd /usr/local/hadoop/etc/hadoop   # configuration file directory
sudo vim hadoop-env.sh            # add the following line to this file
export JAVA_HOME=/usr/java/jdk1.8.0_201
```
Next edit hdfs-site.xml:
```shell
sudo vim hdfs-site.xml
```
and change the file to:
```xml
<configuration>
    <property>
        <name>dfs.replication</name>
        <value>1</value>
    </property>
    <property>
        <name>dfs.namenode.name.dir</name>
        <value>file:/usr/local/hadoop/tmp/dfs/name</value>
    </property>
    <property>
        <name>dfs.datanode.data.dir</name>
        <value>file:/usr/local/hadoop/tmp/dfs/data</value>
    </property>
    <property>
        <name>dfs.secondary.http.address</name>
        <value>127.0.0.1:50090</value>
    </property>
</configuration>
```
Next, modify core-site.xml:
```xml
<configuration>
    <property>
        <name>hadoop.tmp.dir</name>
        <value>file:/usr/local/hadoop/tmp</value>
        <description>Abase for other temporary directories.</description>
    </property>
    <property>
        <name>fs.defaultFS</name>
        <value>hdfs://localhost:9000</value>
    </property>
</configuration>
```
Finally, format the NameNode:
```shell
./bin/hdfs namenode -format   # running this command more than once tends to cause errors
```
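The usual failure mode of running the format more than once is a clusterID mismatch: the DataNode keeps the previous clusterID in its VERSION file and then refuses to start against the newly formatted NameNode. The common (destructive!) fix is to clear hadoop.tmp.dir before re-formatting. A sketch, demonstrated on a throwaway directory standing in for /usr/local/hadoop/tmp so it is safe to run:

```shell
# Simulate the state left behind by a previous format, then wipe it.
tmp=$(mktemp -d)                              # stands in for hadoop.tmp.dir
mkdir -p "$tmp/dfs/name/current" "$tmp/dfs/data/current"
touch "$tmp/dfs/name/current/VERSION"         # VERSION files hold the clusterID
touch "$tmp/dfs/data/current/VERSION"

rm -rf "$tmp/dfs"                             # wipe old state before re-formatting
leftover=$(ls "$tmp" | wc -l)
echo "$leftover"                              # 0
rm -rf "$tmp"
```

Against the real directory this means `rm -rf /usr/local/hadoop/tmp` followed by a fresh `./bin/hdfs namenode -format`; note that this deletes all data stored in HDFS.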
```shell
./sbin/start-dfs.sh   # start the services
jps                   # check which daemons are running
```
```
hadoop@ge-hadoop:/usr/local/hadoop$ jps
5632 ResourceManager
5457 SecondaryNameNode
6066 Jps
5238 DataNode
5113 NameNode
5756 NodeManager
hadoop@ge-hadoop:/usr/local/hadoop$
```
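A pseudo-distributed HDFS setup is healthy when at least NameNode, DataNode and SecondaryNameNode all appear in jps; a missing daemon is the first thing to look for when something fails later. A small sketch of checking for them, run here against the captured output above rather than a live jps call:

```shell
# Captured jps output from the session above, used as sample input.
jps_out="5632 ResourceManager
5457 SecondaryNameNode
5238 DataNode
5113 NameNode
5756 NodeManager"

missing=0
for d in NameNode DataNode SecondaryNameNode; do
  # grep -q only sets the exit status; report any daemon that is absent.
  echo "$jps_out" | grep -q "$d" || { echo "$d missing"; missing=1; }
done
echo "missing=$missing"   # missing=0
```

On a live system, replace the `jps_out` string with `jps_out=$(jps)`.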
After a successful start, you can open the web UI at http://localhost:50070 to view NameNode and DataNode information, and also browse the files in HDFS online.
IV. Install and configure Hive
1. Preparation
Install MySQL.
Download mysql-connector-java from https://dev.mysql.com/downloads/connector/j/ (preferably a version that matches your MySQL release, otherwise the connection is likely to fail).
2. Configure MySQL
```shell
mysql -u root -p   # log in to MySQL as root
```
```sql
create database hive;
use hive;
create table user(Host char(20), User char(10), Password char(20));
insert into user(Host,User,Password) values("localhost","hive","hive");  -- record a hive user whose password is hive
FLUSH PRIVILEGES;
GRANT ALL PRIVILEGES ON *.* TO 'hive'@'localhost' IDENTIFIED BY 'hive';
FLUSH PRIVILEGES;
```
3. Install and configure Hive
```shell
tar -zxvf apache-hive-2.3.4-bin.tar.gz -C /usr/local/
cd /usr/local
sudo mv apache-hive-2.3.4-bin hive
```
```shell
sudo vim /etc/profile   # add the following lines
export HIVE_HOME=/usr/local/hive
export PATH=$PATH:$HIVE_HOME/bin
```
After saving, remember to run source /etc/profile so the changes take effect.
Copy a few of the templates under hive/conf:
```shell
cp hive-env.sh.template hive-env.sh
cp hive-default.xml.template hive-site.xml
```
Edit hive-env.sh to point at the Hadoop installation path:
```shell
HADOOP_HOME=/usr/local/hadoop
```
Edit hive-site.xml to set the database connection details:
```xml
<property>
    <name>javax.jdo.option.ConnectionURL</name>
    <value>jdbc:mysql://localhost:3306/hive?createDatabaseIfNotExist=true</value>
    <description>JDBC connect string for a JDBC metastore</description>
</property>
<property>
    <name>javax.jdo.option.ConnectionDriverName</name>
    <value>com.mysql.jdbc.Driver</value>
    <description>Driver class name for a JDBC metastore</description>
</property>
<property>
    <name>javax.jdo.option.ConnectionUserName</name>
    <value>hive</value>
    <description>username to use against metastore database</description>
</property>
<property>
    <name>javax.jdo.option.ConnectionPassword</name>
    <value>hive</value>
    <description>password to use against metastore database</description>
</property>
```
It is best to find the corresponding properties already in the configuration file and edit their values in place; otherwise errors are likely.
Edit the hive-config.sh file under hive/bin:
```shell
export JAVA_HOME=/usr/java/jdk1.8.0_201
export HADOOP_HOME=/usr/local/hadoop
export HIVE_HOME=/usr/local/hive
```
Extract mysql-connector-java-5.1.47.tar.gz:
```shell
tar -zxvf mysql-connector-java-5.1.47.tar.gz -C /usr/local
```
Then copy the mysql-connector-java-5.1.47.jar from the extracted directory into hive/lib.
Initialize the Hive metastore database:
```shell
schematool -dbType mysql -initSchema
```
Start Hive:
```
hadoop@ge-hadoop:/usr/local/hive$ bin/hive
Logging initialized using configuration in jar:file:/usr/local/hive/lib/hive-common-2.3.4.jar!/hive-log4j2.properties Async: true
Hive-on-MR is deprecated in Hive 2 and may not be available in the future versions. Consider using a different execution engine (i.e. spark, tez) or using Hive 1.X releases.
hive>
```
That completes the setup. Of course, plenty of errors can come up along the way; if you run into problems, feel free to leave a comment.