Background: (test environment) there are only two machines, one namenode and one datanode. With a single-node cluster you cannot really see any effect, so we attach a datanode on the namenode host to get two nodes. See the end for the drawbacks.
The procedure is very simple (for adding a standalone node, see: http://www.cnblogs.com/pu20065226/p/8493316.html)
1. Edit the slaves file on the namenode and add the new node's entry
[hadoop@hadoop-master hadoop]$ pwd
/usr/hadoop/hadoop-2.7.5/etc/hadoop
[hadoop@hadoop-master hadoop]$ cat slaves
slave1
hadoop-master
[hadoop@hadoop-master hadoop]$
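The edit above can be sketched as a small idempotent script. This is a hypothetical sketch using a scratch file in place of etc/hadoop/slaves; the node name matches the post:

```shell
# Sketch (scratch file stands in for etc/hadoop/slaves): append the
# new node only if it is not already listed, then show the result.
SLAVES=./slaves.demo
printf 'slave1\n' > "$SLAVES"      # existing content before the edit
NODE=hadoop-master                 # node being added in this post
grep -qx "$NODE" "$SLAVES" || echo "$NODE" >> "$SLAVES"
cat "$SLAVES"
```

Running it twice leaves the file unchanged, which is why the `grep -qx` guard is there.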
2. Start the datanode and nodemanager processes on the new node
Before running the commands below, confirm that the host to be added does not appear in the etc/hadoop/excludes file on the namenode or on the existing datanodes.
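That check can be scripted. A minimal sketch, using a scratch file in place of the real excludes file (the filename and node name are taken from the post):

```shell
# Sketch (scratch file stands in for etc/hadoop/excludes): refuse to
# proceed if the host to be added is still listed as excluded.
EXCLUDES=./excludes.demo
: > "$EXCLUDES"                    # empty excludes file in this demo
NODE=hadoop-master
if grep -qx "$NODE" "$EXCLUDES"; then
  echo "$NODE is excluded; remove it and run hdfs dfsadmin -refreshNodes first"
else
  echo "$NODE not excluded; safe to start daemons"
fi
```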
[hadoop@slave2 hadoop-2.7.5]$ sbin/hadoop-daemon.sh start datanode
starting datanode, logging to /usr/hadoop/hadoop-2.7.5/logs/hadoop-hadoop-datanode-slave2.out
[hadoop@slave2 hadoop-2.7.5]$ sbin/yarn-daemon.sh start nodemanager
starting nodemanager, logging to /usr/hadoop/hadoop-2.7.5/logs/yarn-hadoop-nodemanager-slave2.out
[hadoop@slave2 hadoop-2.7.5]$ jps
91284 SecondaryNameNode
90979 NameNode
91519 ResourceManager
41768 DataNode
41899 NodeManager
41999 Jps
[hadoop@slave2 ~]$
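A quick way to confirm step 2 succeeded is to scan the jps output for the two expected processes. A sketch, with the sample output above standing in for a live `jps` call:

```shell
# Sketch: check jps output for the two daemons started in step 2.
# The here-string sample mirrors the DataNode/NodeManager lines above.
jps_out='41768 DataNode
41899 NodeManager
41999 Jps'
for proc in DataNode NodeManager; do
  if echo "$jps_out" | grep -q "$proc"; then
    echo "$proc running"
  else
    echo "$proc MISSING"
  fi
done
```

On the real node you would replace the sample variable with `jps_out=$(jps)`.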
3. Refresh the node list on the NameNode
[hadoop@hadoop-master ~]$ hdfs dfsadmin -refreshNodes
Refresh nodes successful
[hadoop@hadoop-master ~]$ sbin/start-balancer.sh
4. Check the current cluster status on the namenode and confirm the node has joined normally
[hadoop@hadoop-master hadoop-2.7.5]$ hdfs dfsadmin -report
Configured Capacity: 58663657472 (54.63 GB)
Present Capacity: 35990061540 (33.52 GB)
DFS Remaining: 35989540864 (33.52 GB)
DFS Used: 520676 (508.47 KB)
DFS Used%: 0.00%
Under replicated blocks: 12
Blocks with corrupt replicas: 0
Missing blocks: 0
Missing blocks (with replication factor 1): 0

-------------------------------------------------
Live datanodes (2):

Name: 192.168.48.129:50010 (hadoop-master)
Hostname: hadoop-master
Decommission Status : Normal
Configured Capacity: 38588669952 (35.94 GB)
DFS Used: 213476 (208.47 KB)
Non DFS Used: 16331292188 (15.21 GB)
DFS Remaining: 22257164288 (20.73 GB)
DFS Used%: 0.00%
DFS Remaining%: 57.68%
Configured Cache Capacity: 0 (0 B)
Cache Used: 0 (0 B)
Cache Remaining: 0 (0 B)
Cache Used%: 100.00%
Cache Remaining%: 0.00%
Xceivers: 1
Last contact: Mon Mar 19 19:54:45 PDT 2018

Name: 192.168.48.132:50010 (slave1)
Hostname: slave1
Decommission Status : Normal
Configured Capacity: 20074987520 (18.70 GB)
DFS Used: 307200 (300 KB)
Non DFS Used: 6342303744 (5.91 GB)
DFS Remaining: 13732376576 (12.79 GB)
DFS Used%: 0.00%
DFS Remaining%: 68.41%
Configured Cache Capacity: 0 (0 B)
Cache Used: 0 (0 B)
Cache Remaining: 0 (0 B)
Cache Used%: 100.00%
Cache Remaining%: 0.00%
Xceivers: 1
Last contact: Mon Mar 19 19:54:46 PDT 2018
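The report is verbose; a small awk filter can reduce it to one line per datanode. A sketch, using sample lines taken from the report above in place of a live `hdfs dfsadmin -report` call:

```shell
# Sketch: extract "hostname remaining%" pairs from dfsadmin -report text.
# The sample variable mirrors the Hostname / DFS Remaining% lines above.
report='Hostname: hadoop-master
DFS Remaining%: 57.68%
Hostname: slave1
DFS Remaining%: 68.41%'
echo "$report" | awk -F': ' '/^Hostname/{h=$2} /^DFS Remaining%/{print h, $2}'
```

On a real cluster, pipe `hdfs dfsadmin -report` straight into the same awk command.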
View in the NameNode web UI
Drawbacks (sourced from the web): First, the NameNode keeps the state of the file namespace in memory, such as which file block lives on which datanode. In a larger hadoop cluster there are a great many blocks, so this takes a large amount of NameNode memory; the NameNode's compute resources are therefore not going to waste. Second, for a long-running cluster, the NameNode keeps writing namespace changes to the edits log, which also grows large over time; as long as the NameNode's storage is planned sensibly, its storage is not wasted either.
What matters for a hadoop cluster is keeping the namenode running stably over the long term. Putting a datanode on the namenode adds to the namenode's load: the datanode consumes a lot of disk I/O and network bandwidth, which can make HDFS respond slowly, raise the error rate, and force extensive error recovery, all of which hurts cluster stability.
As for whether the namenode wastes resources: the namenode must maintain two levels of relations for the whole cluster: first, the directory tree and file metadata; second, the mapping from blocks to datanodes. For a cluster of any real size this consumes substantial memory and CPU. The namenode also persists the first-level relations to the image file and uses the edit log to make sure the data is durable, which takes considerable storage; meanwhile, a large number of datanodes, and possibly many clients, are communicating with the namenode over the network. In short, namenode resources are not wasted!
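The memory point above can be made concrete with a back-of-envelope estimate. This sketch uses a commonly cited rule of thumb of roughly 150 bytes of NameNode heap per namespace object (file, directory, or block); that figure is an assumption, not something stated in the post:

```shell
# Back-of-envelope sketch (rule of thumb, not from the post):
# NameNode heap ~ number of namespace objects * ~150 bytes each.
objects=10000000          # e.g. 10 million files + blocks (assumed)
bytes_per_object=150      # commonly cited approximation
echo $(( objects * bytes_per_object / 1024 / 1024 )) MB
```

Even at this modest scale the estimate lands well over a gigabyte of heap, which is why the namenode's memory is considered well used rather than wasted.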