Starting from the Hadoop cluster created in HDP 2.4 Installation (5): Cluster and Component Installation, we modify the default configuration so that HBase stores its data in Azure Blob Storage.
Contents:
- Overview
- Configuration
- Verification
Overview:
- hadoop-azure provides the integration between Hadoop and Azure Blob Storage. It requires the hadoop-azure.jar package, which is already included by default in the HDP 2.4 installation package, as shown in the figure below:
- Once the configuration is in place, all data read and written is stored in the Azure Blob Storage account.
- Multiple Azure Blob Storage accounts can be configured, and the integration implements the standard Hadoop FileSystem interface.
- File system paths are referenced as URLs with the wasb scheme.
- Tested on both Linux and Windows. Tested at scale.
- Azure Blob Storage involves three concepts:
- Storage Account: All access is done through a storage account
- Container: A container is a grouping of multiple blobs. A storage account may have multiple containers. In Hadoop, an entire file system hierarchy is stored in a single container. It is also possible to configure multiple containers, effectively presenting multiple file systems that can be referenced using distinct URLs.
- Blob: A file of any type and size. In Hadoop, files are stored in blobs. The internal implementation also uses blobs to persist the file system hierarchy and other metadata
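Putting the three together, every path under wasb is addressed by a URL of the following shape (shown here with the Azure China endpoint suffix and the container/account names used later in this post):

wasb://<container>@<storage account>.blob.core.chinacloudapi.cn/<path inside the container>
wasb://[email protected]/hbase/data/default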
Configuration:
- In the Azure China portal (https://manage.windowsazure.cn), create a Blob storage account; in this example it is named localhbase, as shown in the figure below.
- Configure the credentials/access key for Azure Blob Storage and switch the default file system by editing the local Hadoop core-site.xml file with the following content:
<property>
  <name>fs.defaultFS</name>
  <value>wasb://[email protected]</value>
</property>
<property>
  <name>fs.azure.account.key.localhbase.blob.core.chinacloudapi.cn</name>
  <value>YOUR ACCESS KEY</value>
</property>
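With only these two properties in place, a quick sanity check is to list the container root through the wasb scheme from any cluster node; the container and account names below are the ones from the example configuration:

# should succeed and show an empty (or near-empty) listing on a fresh container
hdfs dfs -ls wasb://[email protected]/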
- In most Hadoop clusters the core-site.xml file is world-readable. For better security, the access key can be stored in encrypted form and decrypted at runtime by a program you configure. In that scenario the configuration looks as follows (an optional, security-oriented setup):
<property>
  <name>fs.azure.account.keyprovider.localhbase.blob.core.chinacloudapi.cn</name>
  <value>org.apache.hadoop.fs.azure.ShellDecryptionKeyProvider</value>
</property>
<property>
  <name>fs.azure.account.key.localhbase.blob.core.chinacloudapi.cn</name>
  <value>YOUR ENCRYPTED ACCESS KEY</value>
</property>
<property>
  <name>fs.azure.shellkeyprovider.script</name>
  <value>PATH TO DECRYPTION PROGRAM</value>
</property>
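ShellDecryptionKeyProvider invokes the script configured in fs.azure.shellkeyprovider.script with the encrypted key passed as the last argument, and treats the script's standard output as the decrypted key. A minimal sketch of such a script is shown below; the openssl cipher and passphrase file are illustrative assumptions, not part of hadoop-azure:

#!/bin/sh
# Hypothetical key-decryption script for ShellDecryptionKeyProvider.
# $1 is the encrypted storage key; the plain-text key must be printed to stdout.
ENCRYPTED_KEY="$1"
# Assumption: the key was previously encrypted with openssl using a passphrase
# file readable only by the hadoop service account.
echo "$ENCRYPTED_KEY" | openssl enc -d -aes-256-cbc -a -pass file:/etc/hadoop/conf/.wasbkey.pass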
- The Azure Blob Storage interface for Hadoop supports two kinds of blobs: block blobs and page blobs. Block blobs are the default kind and are a good fit for most big-data use cases, such as input data for Hive, Pig, and analytical MapReduce jobs.
- Page blob handling in hadoop-azure was introduced to support HBase log files. Page blobs can be written any number of times, whereas block blobs can only be appended to 50,000 times before you run out of blocks and writes start to fail. That does not work for HBase logs, so page blob support was introduced to overcome this limitation.
- Page blobs can be up to 1 TB in size, larger than the maximum 200 GB size for block blobs.
- To have the files you create stored as page blobs, set the configuration variable fs.azure.page.blob.dir to a comma-separated list of folder names:
<property>
  <name>fs.azure.page.blob.dir</name>
  <value>/hbase/WALs,/hbase/oldWALs,/mapreducestaging,/hbase/MasterProcWALs,/atshistory,/tezstaging,/ams/hbase</value>
</property>
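If you want to double-check that WAL files really end up as page blobs, one option is to inspect a blob's type with the Azure CLI. The command below is only a sketch: the WAL file name is a placeholder, and for Azure China the CLI must first be switched to the China cloud environment:

# Assumes: az cloud set --name AzureChinaCloud
az storage blob show \
    --account-name localhbase \
    --account-key "YOUR ACCESS KEY" \
    --container-name hbase \
    --name "hbase/WALs/<some WAL file>" \
    --query properties.blobType        # expected output: "PageBlob"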
Verification:
- All of the parameters above are configured in Ambari; restart the services that depend on them.
- Command: hdfs dfs -ls /hbase/data/default. As shown in the figure below, there is no data yet.
- Follow HBase (3): Importing Azure HDInsight HBase Table Data into a Local HBase to import the test table data; once finished it looks like the figure below:
- Command: ./hbase hbck -repair -ignorePreCheckPermission
- Command: hbase shell
- Inspect the data; if it looks like the figure below, everything is OK.
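A few hbase shell commands that are handy for this check; the table name test_table below is a placeholder for whatever table was imported in the previous step:

# run inside `hbase shell`
list                              # the imported table should show up
scan 'test_table', {LIMIT => 5}   # print a few rows
count 'test_table'                # row count should match the source table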
- Verify the data with our home-grown query tool, as shown in the figure below; development of this tool is covered in the next chapter.
- Reference: https://hadoop.apache.org/docs/current/hadoop-azure/index.html