Decommissioning a DataNode or NodeManager in a Hadoop cluster

Why do we need to decommission a node?

  1. Maintenance of the node's host, such as OS patching or hardware replacement.
  2. Retiring a server that has completed its life cycle and is categorized as DNP (do not permit).
  3. Upgrading hardware from lower-configuration to higher-configuration servers.

How to decommission a DataNode?

  • On the NameNode host machine, edit the <HADOOP_CONF_DIR>/dfs.exclude file and add the hostnames of the DataNodes to be decommissioned (one per line).

    where <HADOOP_CONF_DIR> is the directory for storing the Hadoop configuration files. For example, /etc/hadoop/conf.
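As a sketch, populating the exclude file might look like this. The hostnames and the demo directory are hypothetical; in production, <HADOOP_CONF_DIR> would typically be /etc/hadoop/conf:

```shell
# Demo directory stands in for <HADOOP_CONF_DIR>; use /etc/hadoop/conf
# (or your cluster's actual config dir) in practice.
HADOOP_CONF_DIR=/tmp/hadoop-dfs-demo
mkdir -p "$HADOOP_CONF_DIR"

# One DataNode hostname per line (hypothetical hosts)
cat >> "$HADOOP_CONF_DIR/dfs.exclude" <<'EOF'
datanode03.example.com
datanode07.example.com
EOF

cat "$HADOOP_CONF_DIR/dfs.exclude"
```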

  • Update the NameNode with the new set of excluded DataNodes. On the NameNode host machine, execute the following command:
    su <HDFS_USER> 
    hdfs dfsadmin -refreshNodes

    where <HDFS_USER> is the user owning the HDFS services. For example, hdfs.

  • Open the NameNode web UI (http://<NameNode_FQDN>:50070) and navigate to the DataNodes page. Check whether the state has changed to Decommission In Progress for the DataNodes being decommissioned.
  • When all the DataNodes report their state as Decommissioned (on the DataNodes page, or on the Decommissioned Nodes page at http://<NameNode_FQDN>:8088/cluster/nodes/decommissioned), all of the blocks have been replicated. You can then shut down the decommissioned nodes.
  • If your cluster utilizes a dfs.include file, remove the decommissioned nodes from the <HADOOP_CONF_DIR>/dfs.include file on the NameNode host machine, then execute the following command:
    su <HDFS_USER> 
    hdfs dfsadmin -refreshNodes
    Note:
    If no dfs.include file is specified, all DataNodes are considered to be included in the cluster (unless excluded in the dfs.exclude file). The dfs.hosts and dfs.hosts.exclude properties in hdfs-site.xml are used to specify the dfs.include and dfs.exclude files.
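For reference, the two properties mentioned in the note would be wired up in hdfs-site.xml roughly like this (the file paths shown are illustrative; adjust them to your cluster):

```xml
<!-- hdfs-site.xml: illustrative values, not a definitive configuration -->
<property>
  <name>dfs.hosts</name>
  <value>/etc/hadoop/conf/dfs.include</value>
</property>
<property>
  <name>dfs.hosts.exclude</name>
  <value>/etc/hadoop/conf/dfs.exclude</value>
</property>
```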

https://docs.hortonworks.com/HDPDocuments/HDP2/HDP-2.3.0/bk_Sys_Admin_Guides/content/ref-a179736c-eb7c-4dda-b3b4-6f3a778bd8c8.1.html

 

How to decommission a NodeManager?

  1. The yarn-site.xml file has two relevant configurations, yarn.resourcemanager.nodes.include-path and yarn.resourcemanager.nodes.exclude-path, which point to the yarn.include and yarn.exclude files.
  2. Add the hostname of the node to decommission to <HADOOP_CONF_DIR>/yarn.exclude. If more than one node is to be decommissioned, list them separated by newlines. An example of <HADOOP_CONF_DIR> is /etc/hadoop/conf.
  3. If you use a <HADOOP_CONF_DIR>/yarn.include file for the YARN ResourceManager, make sure the decommissioned nodes are removed from that list.
  4. Switch to the YARN user and refresh the node lists:
    su <YARN_USER>
    yarn rmadmin -refreshNodes

    Note: the user can vary per environment (for example, yarn); use the right user.
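Steps 2 through 4 above can be sketched as follows. The hostname and demo directory are hypothetical, and the refreshNodes call is left commented out since it needs a live ResourceManager:

```shell
# Demo directory stands in for <HADOOP_CONF_DIR>; hostname is hypothetical.
HADOOP_CONF_DIR=/tmp/hadoop-yarn-demo
mkdir -p "$HADOOP_CONF_DIR"

# Step 2: one NodeManager hostname per line in yarn.exclude
echo "nodemanager05.example.com" >> "$HADOOP_CONF_DIR/yarn.exclude"

# Step 3: drop the node from yarn.include, if that file is in use
if [ -f "$HADOOP_CONF_DIR/yarn.include" ]; then
  grep -v '^nodemanager05.example.com$' "$HADOOP_CONF_DIR/yarn.include" \
    > "$HADOOP_CONF_DIR/yarn.include.tmp" \
  && mv "$HADOOP_CONF_DIR/yarn.include.tmp" "$HADOOP_CONF_DIR/yarn.include"
fi

# Step 4: as the YARN user, tell the ResourceManager to re-read the lists
# (requires a running cluster, so it is shown here but not executed):
# su - yarn -c 'yarn rmadmin -refreshNodes'
```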
  Reference:

http://docs.hortonworks.com/HDPDocuments/HDP2/HDP-2.3.0/bk_Sys_Admin_Guides/content/ref-5981b2ae-bdc1-4eeb-8d01-fa2c088edf83.1.html

Author: rajukv

Hadoop (Big Data) Architect and Hadoop Security Architect who designs and builds Hadoop systems to meet the needs of various data science projects.
