Building an HA Hadoop cluster with an active/passive namenode pair.
The namenode is the controller for HDFS reads and writes. In HA mode the active namenode writes its edits to the journalnodes' log; the standby then reads this log and applies the edits to its own copy of the namespace.
Create a new Hadoop cluster named lab1 on 3 KVMs, all running instances of zookeeper and journalnode. Node 1 will be the active namenode; node 2 will be the standby namenode.
All hosts will run datanodes with a replication factor of 2.
A git repo hosting an Ansible deployment is to come.
Configuration
core-site.xml : Config for HA HDFS cluster naming.
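A minimal sketch of what core-site.xml could contain for this layout, written out by a small shell function. fs.defaultFS and ha.zookeeper.quorum are standard Hadoop properties; the function name write_core_site, the third host name hadoop-n3 and the zookeeper port 2181 are assumptions for illustration, not taken from the real deployment.

```shell
# Sketch only: writes a minimal HA core-site.xml to the path given as $1.
# Assumptions: third host is called hadoop-n3, zookeeper listens on 2181.
write_core_site() {
  cat > "$1" <<'EOF'
<?xml version="1.0"?>
<configuration>
  <!-- Clients address the logical nameservice, not a single namenode host -->
  <property>
    <name>fs.defaultFS</name>
    <value>hdfs://lab1</value>
  </property>
  <!-- Zookeeper ensemble used by the ZK failover controllers -->
  <property>
    <name>ha.zookeeper.quorum</name>
    <value>hadoop-n1:2181,hadoop-n2:2181,hadoop-n3:2181</value>
  </property>
</configuration>
EOF
}
```

Call it with the target path, for example write_core_site "$HADOOP_CONF_DIR/core-site.xml", and copy the same file to every node.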
hdfs-site.xml : Config for HDFS. Active/passive namenode config. Minimum 3-node cluster, so the journalnode quorum survives the loss of one node. Replication factor of 2 for storage.
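A minimal sketch of an hdfs-site.xml for this setup, again written by a shell function. The property names are standard Hadoop HA settings; the namenode IDs nn1 and nn2, the host hadoop-n3, the default ports, the journal directory and the no-op fencing method are all assumptions and should be adapted.

```shell
# Sketch only: writes a minimal HA hdfs-site.xml to the path given as $1.
# Assumptions: namenode IDs nn1/nn2, host hadoop-n3, default ports,
# journal directory /var/lib/hadoop/journal, and a no-op fencing method.
write_hdfs_site() {
  cat > "$1" <<'EOF'
<?xml version="1.0"?>
<configuration>
  <!-- Logical nameservice and its two namenodes -->
  <property><name>dfs.nameservices</name><value>lab1</value></property>
  <property><name>dfs.ha.namenodes.lab1</name><value>nn1,nn2</value></property>
  <property><name>dfs.namenode.rpc-address.lab1.nn1</name><value>hadoop-n1:8020</value></property>
  <property><name>dfs.namenode.rpc-address.lab1.nn2</name><value>hadoop-n2:8020</value></property>

  <!-- Shared edit log on the three journalnodes -->
  <property>
    <name>dfs.namenode.shared.edits.dir</name>
    <value>qjournal://hadoop-n1:8485;hadoop-n2:8485;hadoop-n3:8485/lab1</value>
  </property>
  <property><name>dfs.journalnode.edits.dir</name><value>/var/lib/hadoop/journal</value></property>

  <!-- Client-side failover and automatic failover through zookeeper -->
  <property>
    <name>dfs.client.failover.proxy.provider.lab1</name>
    <value>org.apache.hadoop.hdfs.server.namenode.ha.ConfiguredFailoverProxyProvider</value>
  </property>
  <property><name>dfs.ha.automatic-failover.enabled</name><value>true</value></property>
  <!-- A fencing method must be set; this one does nothing -->
  <property><name>dfs.ha.fencing.methods</name><value>shell(/bin/true)</value></property>

  <!-- Two copies of every block -->
  <property><name>dfs.replication</name><value>2</value></property>
</configuration>
EOF
}
```

The no-op fencing method is only reasonable here because the journalnodes accept writes from one namenode at a time; a real deployment may prefer sshfence.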
Initialise a new HDFS cluster
v2
Note: zookeeper must be running on all 3 nodes.
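One possible way to check this from any host is the zookeeper srvr command over the client port. The sketch below assumes nc is installed, that zookeeper listens on 2181, and that the third host is called hadoop-n3; the function names are made up for illustration.

```shell
# Print the zookeeper mode (leader, follower or standalone) of one server.
# $1 = host, $2 = client port (assumed default 2181).
zk_mode() {
  printf 'srvr\n' | nc -w 2 "$1" "${2:-2181}" | sed -n 's/^Mode: //p'
}

# Check every host given as an argument; stop at the first one that
# does not answer.
check_zk_all() {
  for host in "$@"; do
    mode=$(zk_mode "$host")
    if [ -z "$mode" ]; then
      echo "$host: zookeeper not answering" >&2
      return 1
    fi
    echo "$host: $mode"
  done
}
```

Usage: check_zk_all hadoop-n1 hadoop-n2 hadoop-n3. A healthy ensemble should report one leader and two followers.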
Start the journalnode on all 3 servers.
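A sketch of this step, assuming the Hadoop sbin directory is on the PATH. hadoop-daemon.sh is the Hadoop 2 style launcher; on Hadoop 3 the equivalent is hdfs --daemon start journalnode. The function names are illustrative only.

```shell
# Run on each of the three servers.
start_journalnode() {
  hadoop-daemon.sh start journalnode
}

# Succeeds if a JournalNode JVM shows up in the jps process list.
journalnode_running() {
  jps | grep -q 'JournalNode'
}
```

Usage on each node: start_journalnode && journalnode_running.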
Initialise and start the namenode on hadoop-n1 (active)
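A sketch of what this step might look like on hadoop-n1. Formatting the failover state in zookeeper with hdfs zkfc -formatZK is an assumption about the procedure; it is normally needed once when automatic failover is used. The function name is illustrative.

```shell
# Run once on hadoop-n1, with the journalnodes already up.
# WARNING: -format wipes any existing HDFS metadata; new clusters only.
init_active_namenode() {
  hdfs namenode -format &&      # new namespace, also initialises the journalnodes
  hdfs zkfc -formatZK &&        # create the HA znode in zookeeper
  hadoop-daemon.sh start namenode
}
```

On Hadoop 3 the last command would be hdfs --daemon start namenode.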
Initialise and start namenode on hadoop-n2 (standby)
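A sketch of the standby step, assuming the namenode on hadoop-n1 is already running so that its metadata can be copied. The function name is illustrative.

```shell
# Run once on hadoop-n2.
init_standby_namenode() {
  hdfs namenode -bootstrapStandby &&   # copy the namespace from the running namenode
  hadoop-daemon.sh start namenode
}
```

The standby must not be formatted with -format; bootstrapping keeps both namenodes on the same namespace and cluster ID.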
Start all DFS services. This starts the ZK failover controllers, after which the namenodes become active/passive.
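A sketch of this last step followed by a state check. It assumes start-dfs.sh is on the PATH and that the namenode IDs configured in dfs.ha.namenodes.lab1 are passed as arguments (nn1 and nn2 are used here as hypothetical IDs). The function name is illustrative.

```shell
# Start every DFS daemon, then print the HA state of each namenode ID
# passed as an argument.
start_dfs_and_report() {
  start-dfs.sh || return 1
  for nn in "$@"; do
    printf '%s: %s\n' "$nn" "$(hdfs haadmin -getServiceState "$nn")"
  done
}
```

Usage: start_dfs_and_report nn1 nn2. One ID should report active and the other standby once the failover controllers have held their election.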