hive的join查詢 語法 實例: left join/right join 左查詢就是不管能不能匹配到右邊的表,都會輸出左面的表,未匹配到的項目為空 | name | age | | | | | wang | 13 | | zhao | 14 | | name | score | | | | | ...
hive的join查詢
語法
join_table:
table_reference [INNER] JOIN table_factor [join_condition]
| table_reference {LEFT|RIGHT|FULL} [OUTER] JOIN table_reference join_condition
| table_reference LEFT SEMI JOIN table_reference join_condition
| table_reference CROSS JOIN table_reference [join_condition] (as of Hive 0.10)
table_reference:
table_factor
| join_table
table_factor:
tbl_name [alias]
| table_subquery alias
| ( table_references )
join_condition:
ON expression
// join查詢不支持where,改用on來替代
實例:
select count(distinct a.id) from table_1 a join table_2 b on (a.id=b.id) ;
left join/right join
左查詢就是不管能不能匹配到右邊的表,都會輸出左面的表,未匹配到的項目為空
name | age |
---|---|
wang | 13 |
zhao | 14 |
name | score |
---|---|
wang | 145 |
li | 44 |
select a.name, b.score from t1 a join t2 b on (a.aid=b.aid);
的結果是:
name | score |
---|---|
wang | 145 |
zhao | NULL |
註意點
- 註意,下麵這些操作是非法的:
CREATE TABLE a (k1 string, v1 string);
CREATE TABLE b (k2 string, v2 string);
SELECT k1, v1, k2, v2
FROM a JOIN b ON k1 = k2;
// 選擇每個屬性的時候需要制定是哪個表,否則會認為有歧義
- 只支持相等查詢的join操作
SELECT a.* FROM a JOIN b ON (a.id <> b.id)
// 非法操作
SELECT a.* FROM a LEFT OUTER JOIN b ON (a.id <> b.id)
// 合法,此處是left outer join
- 可以支持多個表的查詢
SELECT a.val, b.val, c.val FROM a JOIN b ON (a.key = b.key1) JOIN c ON (c.key = b.key2)