Hadoop HDFS/YARN HA cluster
Description
Building an HA Hadoop cluster with an active/standby NameNode pair.
The NameNode is the controller for HDFS reads and writes. In HA mode the active NameNode writes its edit log to the JournalNodes; the standby tails this log and applies it to its own copy of the namespace, so it can take over with current state.
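This edit-log flow can be observed directly on any JournalNode. Assuming the dfs.journalnode.edits.dir configured below (/data1/hadoop/hdfs/jn) and the lab1 nameservice, the segments the standby tails look like:

```shell
# Finalised edit segments plus the in-progress one the active NN is writing;
# the standby replays these to stay current.
ls /data1/hadoop/hdfs/jn/lab1/current/
# edits_0000000000000000001-0000000000000000012
# edits_inprogress_0000000000000000013
# committed-txid  VERSION  ...
```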
https://hadoop.apache.org/docs/r2.9.2/hadoop-project-dist/hadoop-common/ClusterSetup.html
Solution
Create a new Hadoop cluster named lab1: 3 KVMs, all running instances of ZooKeeper and JournalNode. node1 will be the active NameNode, node2 the standby NameNode.
All hosts will run DataNodes with a replication factor of 2.
Git repo hosting the Ansible deploy to come.
ansible-playbook hadoop.yml --diff -i inventory -u root -e "bootstrap=True"
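The inventory referenced by the playbook command is not shown; a minimal sketch matching the hosts used in this lab might look like the following (the group name is an assumption, the real playbook may organise hosts differently):

```ini
; inventory - hypothetical layout for the lab1 cluster
[hadoop]
hadoop-n1.home.lan
hadoop-n2.home.lan
hadoop-n3.home.lan
```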
Configuration
core-site.xml : Sets the default filesystem to the HA nameservice (lab1) rather than a single namenode host.
<configuration>
<property>
<name>fs.defaultFS</name>
<value>hdfs://lab1</value>
</property>
</configuration>
hdfs-site.xml : Config for HDFS. Active/standby NameNode pair, a 3-node JournalNode/ZooKeeper quorum (the minimum that tolerates one failure), and a replication factor of 2 for storage.
<configuration>
<property>
<name>dfs.nameservices</name>
<value>lab1</value>
</property>
<property>
<name>dfs.permissions</name>
<value>false</value>
</property>
<property>
<name>dfs.ha.namenodes.lab1</name>
<value>nn1,nn2</value>
</property>
<property>
<name>dfs.namenode.rpc-address.lab1.nn1</name>
<value>hadoop-n1.home.lan:8020</value>
</property>
<property>
<name>dfs.namenode.rpc-address.lab1.nn2</name>
<value>hadoop-n2.home.lan:8020</value>
</property>
<property>
<name>dfs.namenode.http-address.lab1.nn1</name>
<value>hadoop-n1.home.lan:50070</value>
</property>
<property>
<name>dfs.namenode.http-address.lab1.nn2</name>
<value>hadoop-n2.home.lan:50070</value>
</property>
<property>
<name>dfs.namenode.shared.edits.dir</name>
<value>qjournal://hadoop-n1.home.lan:8485;hadoop-n2.home.lan:8485;hadoop-n3.home.lan:8485/lab1</value>
</property>
<property>
<name>dfs.journalnode.edits.dir</name>
<value>/data1/hadoop/hdfs/jn</value>
</property>
<property>
<name>dfs.client.failover.proxy.provider.lab1</name>
<value>org.apache.hadoop.hdfs.server.namenode.ha.ConfiguredFailoverProxyProvider</value>
</property>
<property>
<name>dfs.ha.fencing.methods</name>
<value>sshfence</value>
</property>
<!--
<property>
<name>dfs.ha.fencing.ssh.private-key-files</name>
<value>/root/.ssh/id_rsa</value>
</property>
-->
<property>
<name>dfs.ha.automatic-failover.enabled</name>
<value>true</value>
</property>
<property>
<name>dfs.namenode.name.dir</name>
<value>file:///data1/hadoop/hdfs/nn</value>
</property>
<property>
<name>dfs.datanode.data.dir</name>
<value>file:///data1/hadoop/hdfs/dn</value>
</property>
<property>
<name>dfs.replication</name>
<value>2</value>
</property>
<property>
<name>ha.zookeeper.quorum</name>
<value>hadoop-n1.home.lan:2181,hadoop-n2.home.lan:2181,hadoop-n3.home.lan:2181</value>
</property>
</configuration>
Initialise a new HDFS cluster
Note: zookeeper must be running on all 3 nodes.
Start the journalnode on all 3 servers.
hadoop-daemon.sh start journalnode
Initialise and start the namenode on hadoop-n1 (active)
hdfs namenode -format
hadoop-daemon.sh start namenode
Initialise and start namenode on hadoop-n2 (standby)
hdfs namenode -bootstrapStandby
hadoop-daemon.sh start namenode
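With dfs.ha.automatic-failover.enabled set to true, the failover state in ZooKeeper must be initialised once before the failover controllers first start. Run this on one namenode (it creates the HA znode for lab1 in the ZooKeeper quorum):

```shell
# One-time setup: create the /hadoop-ha/lab1 znode used by the ZKFCs
hdfs zkfc -formatZK
```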
Start all DFS services; this starts the ZK failover controllers, and the namenodes then become active/standby.
start-dfs.sh
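Namenode state can also be checked from the CLI, using the namenode IDs configured above (nn1, nn2):

```shell
hdfs haadmin -getServiceState nn1   # should report: active
hdfs haadmin -getServiceState nn2   # should report: standby
```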
Confirm status of namenodes
Active:
http://hadoop-n1.home.lan:50070/dfshealth.html#tab-overview
Standby:
http://hadoop-n2.home.lan:50070/dfshealth.html#tab-overview
Confirm presence of datanodes:
http://hadoop-n1.home.lan:50070/dfshealth.html#tab-datanode
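Datanode membership can also be confirmed without the web UI:

```shell
# Lists live datanodes, total/remaining capacity, and per-node usage
hdfs dfsadmin -report
```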
Review status of a datanode:
http://hadoop-n1.home.lan:50075/datanode.html
Upload a file via Utilities » Browse the file system:
http://hadoop-n1.home.lan:50070/explorer.html#/
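The same round trip works from the shell; the /tmp path and /etc/hosts source file here are just examples:

```shell
hdfs dfs -mkdir -p /tmp          # create a directory in HDFS
hdfs dfs -put /etc/hosts /tmp/   # upload a local file
hdfs dfs -ls /tmp                # confirm it landed
hdfs dfs -cat /tmp/hosts         # read it back through the nameservice
```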
TODO Yarn
Start yarn services:
[root@hadoop-n1 sbin]# ./start-yarn.sh
starting yarn daemons
starting resourcemanager, logging to /opt/hadoop/logs/yarn-root-resourcemanager-hadoop-n1.home.lan.out
localhost: starting nodemanager, logging to /opt/hadoop/logs/yarn-root-nodemanager-hadoop-n1.home.lan.out
Check yarn scheduler status:
http://hadoop-n1.home.lan:8088/cluster
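A quick way to exercise the scheduler once nodemanagers are registered is the bundled examples jar. The jar path assumes the /opt/hadoop install seen in the logs above; the version suffix varies by release:

```shell
yarn node -list   # confirm registered nodemanagers before submitting
# Submit a small MapReduce pi-estimation job: 2 maps, 10 samples each
yarn jar /opt/hadoop/share/hadoop/mapreduce/hadoop-mapreduce-examples-*.jar pi 2 10
```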