Decommissioning a DataNode in a Hadoop cluster

Why do we need to decommission a DataNode?

  1. For maintenance of the DataNode host, such as OS patching or hardware replacement.
  2. To retire a DataNode server that has completed its life cycle and been categorized as DNP (do not permit).

How to decommission a DataNode?

  1. Add the hostname of the node to be decommissioned to <HADOOP_CONF_DIR>/dfs.exclude. If more than one node is to be decommissioned, list them separated by newlines. An example of <HADOOP_CONF_DIR> is /etc/hadoop/conf.
  2. If your cluster uses an include file (<HADOOP_CONF_DIR>/dfs.include) for HDFS, make sure the nodes being decommissioned are also removed from that list.
  3. Switch to the HDFS superuser, e.g. su - hdfs, and run: hdfs dfsadmin -refreshNodes
    1. Note: the user can vary by environment; use the right user for yours.
  4. Monitor the decommissioning progress until the node's status changes to "Decommissioned". The state can be monitored from the DataNodes page of the NameNode web UI (http://NameNode_FQDN:50070). A sketch of the complete command sequence follows this list.
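Prerequisite worth checking before step 1: the NameNode's hdfs-site.xml must point at the exclude file via the dfs.hosts.exclude property, or -refreshNodes will have nothing to act on. The entry looks like this, with the path matching the example above:

    <property>
      <name>dfs.hosts.exclude</name>
      <value>/etc/hadoop/conf/dfs.exclude</value>
    </property>

A minimal interactive sketch of the whole procedure, assuming /etc/hadoop/conf as <HADOOP_CONF_DIR> and dn3.example.com as a hypothetical node being decommissioned:

    # Run as the HDFS superuser (often 'hdfs'; this can vary by environment).
    su - hdfs

    # 1. Add the node to the exclude file (one hostname per line).
    echo "dn3.example.com" >> /etc/hadoop/conf/dfs.exclude

    # 2. If an include file is in use, drop the node from it as well.
    sed -i '/dn3.example.com/d' /etc/hadoop/conf/dfs.include

    # 3. Ask the NameNode to re-read the include/exclude files.
    hdfs dfsadmin -refreshNodes

    # 4. Check the node's state; it moves from "Decommission in progress"
    #    to "Decommissioned" once all of its blocks are re-replicated.
    #    (Report layout may differ slightly by Hadoop version.)
    hdfs dfsadmin -report | grep -A 3 "Hostname: dn3.example.com"

Once the report (or the NameNode UI) shows "Decommissioned", it is safe to stop the DataNode process and take the host down for patching or retirement.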

Reference:

http://docs.hortonworks.com/HDPDocuments/HDP2/HDP-2.3.0/bk_Sys_Admin_Guides/content/ref-a179736c-eb7c-4dda-b3b4-6f3a778bd8c8.1.html


