Why do we need to decommission a data node?
- For maintenance of the datanode host, such as patching or hardware replacement.
- To retire a datanode server when it completes its life cycle and is categorized as DNP (do not permit).
How to decommission a data node?
- Add the hostname of the node to decommission to <HADOOP_CONF_DIR>/dfs.exclude. If there is more than one node to decommission, list them one per line. An example of <HADOOP_CONF_DIR> is /etc/hadoop/conf.
- If your cluster also uses a <HADOOP_CONF_DIR>/dfs.include file, make sure the nodes being decommissioned are removed from that list.
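- Switch to the HDFS user and ask the NameNode to re-read the include and exclude files: su - <HDFS USER> (example: su - hdfs); hdfs dfsadmin -refreshNodes
- Note: the user can vary by environment; use the user that runs HDFS on your cluster.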
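- Monitor the node while decommissioning is in progress and wait until its status changes to “Decommissioned”. The state can be monitored from the datanodes page of the NameNode web UI (http://NameNode_FQDN:50070). Port 50070 is the Hadoop 2.x default; Hadoop 3.x defaults to 9870.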
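
The steps above can be sketched as a few small shell functions. This is only an illustration under stated assumptions: the function names, the example hostname dn05.example.com, and the default hdfs user are hypothetical, and the file paths follow the /etc/hadoop/conf example rather than any particular cluster. The helper that edits the include file rewrites it through a temporary file, so check ownership and permissions afterwards.

```shell
#!/bin/sh
# Sketch of the decommission steps. Function names, hostnames, and the
# default "hdfs" user are assumptions; adjust them for your cluster.

# Add a host to the exclude file unless it is already listed.
add_to_exclude() {
    host="$1"; exclude_file="$2"
    touch "$exclude_file"
    grep -qxF "$host" "$exclude_file" || echo "$host" >> "$exclude_file"
}

# Remove a host from the include file, if that file exists.
remove_from_include() {
    host="$1"; include_file="$2"
    [ -f "$include_file" ] || return 0
    # grep -v exits non-zero when no lines remain, so ignore its status.
    grep -vxF "$host" "$include_file" > "$include_file.tmp" || true
    mv "$include_file.tmp" "$include_file"
}

# Ask the NameNode to re-read the include/exclude files, then print the
# datanode report, which contains a "Decommission Status" line per node.
refresh_and_report() {
    hdfs_user="${1:-hdfs}"
    su - "$hdfs_user" -c "hdfs dfsadmin -refreshNodes"
    su - "$hdfs_user" -c "hdfs dfsadmin -report"
}

# Example usage (run as root on the NameNode host):
#   add_to_exclude dn05.example.com /etc/hadoop/conf/dfs.exclude
#   remove_from_include dn05.example.com /etc/hadoop/conf/dfs.include
#   refresh_and_report hdfs
```

Keeping the file edits separate from the refresh makes it possible to review dfs.exclude before the NameNode acts on it, and re-running add_to_exclude for the same host does not create duplicate entries.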