HQL DML 主要涉到對Hive表中數據操作,包含有:load、INSERT、DELETE、EXPORT and IMPORT,詳細資料參見:https://cwiki.apache.org/confluence/display/Hive/LanguageManual+DML 目錄: Loadin ...
- HQL DML 主要涉到對Hive表中數據操作,包含有:load、INSERT、DELETE、EXPORT and IMPORT,詳細資料參見:https://cwiki.apache.org/confluence/display/Hive/LanguageManual+DML
目錄:
- Loading files into tables
- Inserting data into Hive Tables from queries
- Writing data into the filesystem from queries
- Inserting values into tables from SQL
- Delete
- 應用Demo
Loading files into tables:
- 語法:LOAD DATA [LOCAL] INPATH 'filepath' [OVERWRITE] INTO TABLE tablename [PARTITION (partcol1=val1, partcol2=val2 ...)]
- 示例代碼如下:
#創建表 CREATE TABLE web_log(viewTime INT, userid BIGINT, url STRING, referrer STRING, ip STRING) ROW FORMAT DELIMITED FIELDS TERMINATED BY '\t'; #導入文件數據 LOAD DATA LOCAL INPATH '/usr/zhu/table.txt' OVERWRITE INTO TABLE web_log;
View Code - OVERWRITE:
- 目標表(或者分區)中的內容(如果有)會被刪除,然後再將 filepath 指向的文件/目錄中的內容添加到表/分區中
- 如果目標表(分區)已經有一個文件,並且文件名和 filepath 中的文件名衝突,那麼現有的文件會被新文件所替代
Inserting data into Hive Tables from queries:
- 語法:INSERT OVERWRITE TABLE tablename1 [PARTITION] select_statement1 FROM from_statement;
INSERT INTO TABLE tablename1 [PARTITION] select_statement1 FROM from_statement; - 示例代碼:
#創建結構相同的表 create table empDemo as employee; #插入數據 insert into table empDemo select * from employee; #覆蓋插入數據 insert overwrite table empDemo select * from employee;
View Code
Writing data into the filesystem from queries:
- 語法:INSERT OVERWRITE [LOCAL] DIRECTORY directory1 SELECT ... FROM ...
- 示例代碼:
INSERT OVERWRITE LOCAL DIRECTORY './tmp/zhu' SELECT * FROM employee;
View Code
Inserting values into tables from SQL:
- 語法:INSERT INTO TABLE tablename VALUES values_row [, values_row ...]
示例代碼:
#單行插入 insert into table employee values('001','001','tgzhu'); #多行插入 insert into table employee values('004','004','WangWu'),('005','005','ZhaoZhao');View Code
Delete:
- 語法:DELETE FROM tablename [WHERE expression]
應用Demo:
- 以一個實際的應用Demo對Hive 的 DDL、DML 進行說明,過程如下
- 創建與HBase關聯的外部表,HQL如下:
CREATE EXTERNAL TABLE if not exists Hive_CM_EvcRealTimeData( Rowkey string, RealTimeData_CarNo string, RealTimeData_Time string, RealTimeData_Speed decimal(20,8), RealTimeData_Mileage decimal(20,8), RealTimeData_HighestVoltageBatteryOrd int, RealTimeData_Latitude decimal(20,8), RealTimeData_Longitude decimal(20,8) ) STORED BY 'org.apache.hadoop.hive.hbase.HBaseStorageHandler' WITH SERDEPROPERTIES('hbase.columns.mapping' = ':key,d:RealTimeData_CarNo, ata_Time,d:RealTimeData_Speed,d:RealTimeData_Mileage,d:RealTimeData_HighestVoltageBatteryOrd,d:RealTimeData_Latitude,d:RealTimeData_Longitude') TBLPROPERTIES('hbase.table.name' = 'CM_EvcRealTimeData')
View Code - 創建一個用來保存計算結果的hive實表,如下:
CREATE TABLE if not exists Hive_CM_CarDailyRpt( CarNo string, DTime string, OnLineCount int, RunCount int, Mileage decimal(20,8), MaxSpeed decimal(20,8), totalPower decimal(20,8), AverageSpeed decimal(20,8), CDI_BatteryFlag string, CDI_CoordinatorFlag string ) STORED AS TEXTFILEView Code
- 計算並將結果插入實表,如下:
set hive.execution.engine = tez; Insert overwrite table Hive_CM_CarDailyRpt select CarNo,DTime, CONVERT(int,SUM(CT)) as OnLineCount , CONVERT(int,SUM(CTSPEED)) as RunCount, CONVERT(decimal(18,2),MAX(MILE)-MIN(MILE)) as Mileage , CONVERT(decimal(18,2),MAX(SPEED)) as MaxSpeed, ((MAX(MILE)-MIN(MILE))*0.2) as totalPower, case when SUM(CTSPEED)>0 then CONVERT(decimal(18,2),((MAX(MILE)-MIN(MILE))/SUM(CTSPEED))) else 0 end as AverageSpeed, case when SUM(RealTimeData_HighestVoltageBatteryOrd)>0 then '0' else '1' end as BatteryFlag, case when (SUM(RealTimeData_Latitude) + SUM(RealTimeData_Longitude)) >0 then '0' else '1' end as LatitudeFlag, from ( SELECT REALTIMEDATA_CARNO AS CARNO, substring(RealTimeData_Time,1,8) as DTime, 1 AS CT, CASE WHEN REALTIMEDATA_SPEED>0 THEN 1 ELSE 0 END AS CTSPEED, CASE WHEN REALTIMEDATA_MILEAGE=0 THEN NULL ELSE REALTIMEDATA_MILEAGE END AS MILE, CASE WHEN REALTIMEDATA_SPEED>200 then 0 else REALTIMEDATA_SPEED end AS SPEED, RealTimeData_HighestVoltageBatteryOrd, RealTimeData_Latitude,RealTimeData_Longitude FROM CM_EvcRealTimeData ) t group by CarNo,DTimeView Code
- 再將計算結果轉存關係型資料庫或HBase