Install Yarn on Ubuntu Cluster via Scripts

There is beautiful blog on using scripts to install YARN on Ubuntu

You can use the script to install hadoop on centos systems.

I come across these steps when reading Arun C Murthy and Vinod Kumar Vavillapalli’s  book Apache Hadoop Yarn.



Extend AWS RHEL 6 instance root file system size

AWS: Free tier t2.micro 2 instance A and B

OS: RHEL 6.6

root FS: Ext4

Filesystem      Size  Used Avail Use% Mounted on
/dev/xvda1      5.8G  4.1G  1.5G  74% /

Using: AWS CLI (They can be done easily from aws console with web ui)

Task: Root file system of instance B from 6G to available space.

Disclaimer: Please be careful while performing these steps as they are too risk and can cause loss of entire system.


  1.  Stop the instance B #aws ec2 stop-instances –instance-ids <instance B ID>
  2. Detach the root volume from instance B  #aws ec2 detach-volume –volume-id=”vol-xxxxxxxxx”
  3. Create snashot (For recovery in case you last the file system while doing below steps) $aws ec2 create-snapshot –volume-id “vol-xxxxxxxxx” –description “instance B root vol backup”

–Note: This is an extremly important to create snapshot for the safety to recover

  1. Attach to Instance A #aws ec2 attach-volume –volume-id=”vol-0d61acbe3bff8d115″ –instance-id=”i-05c133fd050dbd04d” –device=”/dev/sdf”
  2. Connect to the instance A through ssh $ssh -i “mykey.pem” ec2-user@<public ip of instanc A>
  3. Use command “lsblk” to verify the list of devices $sudo lsblk
    xvda    202:0    0  10G  0 disk
    └─xvda1 202:1    0   6G  0 part /
    xvdb    202:16   0  10G  0 disk /hadoop
    xvdf    202:80   0  30G  0 disk
    └─xvdf1 202:81   0  6G  0 part
  4.  Using parted, recreated the 6G partition xvdf1 to 10G $sudo parted /dev/xvdf
  5.  Change units to sectors (parted) unit s
  6. Print the current partion
    (parted) print
    Model: Xen Virtual Block Device (xvd)
    Disk /dev/xvdf: 62914560s
    Sector size (logical/physical): 512B/512B
    Partition Table: gpt
    Number Start End Size File system Name Flags
    1 2048s 12584959s 12582912s ext4 ext4 boot
  7. Delete the partion (parted) rm 1
  8. Create new partition (parted) mkpart ext4 2048s 100%  Warning: You requested a partition from 2048s to 62914559s. The closest location we can manage is 2048s to 20971486s.Is this still acceptable to you?Yes/No? Yes
  9. Print New Partition and verify(parted) print
    Model: Xen Virtual Block Device (xvd)
    Disk /dev/xvdf: 62914560s
    Sector size (logical/physical): 512B/512B
    Partition Table: gpt

    Number Start End Size File system Name Flags
    1 2048s 20971486s 20969439s ext4 ext4

  10. Make it bootable  (parted) set 1 boot on
  11. Quit from parted (parted) quit
  12. Run e2fsck $sudo e2fsck -f /dev/xvdf1
    e2fsck 1.41.12 (17-May-2010)
    Pass 1: Checking inodes, blocks, and sizes
    Pass 2: Checking directory structure
    Pass 3: Checking directory connectivity
    Pass 4: Checking reference counts
    Pass 5: Checking group summary information
    /dev/xvdf1: 58700/393216 files (0.1% non-contiguous), 631205/1572864 blocks
  13.  Extend using resize2fs command $sudo resize2fs /dev/xvdf1
    resize2fs 1.41.12 (17-May-2010)
    Resizing the filesystem on /dev/xvdf1 to 2621179 (4k) blocks.
    The filesystem on /dev/xvdf1 is now 2621179 blocks long.
  14. Verify attaching the extended partition $sudo mount -t ext4 /dev/xvdf1 /mnt
  15. Verify the file system with ls and $sudo df -h /mnt
    Filesystem Size Used Avail Use% Mounted on
    /dev/xvdf1 9.8G 2.2G 7.1G 24% /mnt

Now detach the ebs volume from instance A and attach it to instance B as root device i.e. “/dev/sda”

16. aws ec2 detach-volume –volume-id=”vol-*********”

17. aws ec2 attach-volume –volume-id=”vol-*********” –instance-id=”Instance B id” –device=”/dev/sda1
Note: attaching device as /dev/sda1 is import, as aws treats that as root device.

18. Start the instance B

aws ec2 start-instances –instance-id=”Instance B id”

19. Connect to the instance B through ssh and check file system size.

$ssh -i “mykey.pem” ec2-user@<public ip of instanc B>

df -h
Filesystem Size Used Avail Use% Mounted on
/dev/xvda1 9.8G 2.2G 7.1G 24% /
tmpfs 498M 0 498M 0% /dev/shm




Hive Tuning

10 Best Practices for Apache Hive


Set Up LDAP Authentication – Ambari

vi /etc/ambari-server/conf/

ambari-server setup-ldap
Using python  /usr/bin/python
Setting up LDAP properties...
Primary URL* {host:port} (
Secondary URL {host:port} :
Use SSL* [true/false] (false):
User object class* (user):
User name attribute* (sAMAccountName):
Group object class* (group):
Group name attribute* (cn):
Group member attribute* (member):
Distinguished name attribute* (distinguishedName):
Base DN* (dc=abc,dc=com):
Referral method [follow/ignore] :
Bind anonymously* [true/false] (false):
Manager DN* (cn=<AD service account>,OU=Hadoop,OU=Applications,DC=abc,DC=com):
Enter Manager Password* :
Re-enter password:
Review Settings
authentication.ldap.managerDn: cn=<AD service account>,OU=Hadoop,OU=Applications,DC=abc,DC=com
authentication.ldap.managerPassword: *****
Save settings [y/n] (y)? y
Ambari Server 'setup-ldap' completed successfully.

To Sync the groups.
vi groups.csv
<add all the ad groups which need to be sync with ambari>

ambari-server sync-ldap --groups groups.csv

To Sync the users. create users.csv file with list of ad user accounts separated by comma.
ambari-server sync-ldap --user users.csv

ambari-server setup-security

ambari-server setup-security
Using python  /usr/bin/python
Security setup options…
Choose one of the following options:
[1] Enable HTTPS for Ambari server.
[2] Encrypt passwords stored in file.
[3] Setup Ambari kerberos JAAS configuration.
[4] Setup truststore.
[5] Import certificate to truststore.
Enter choice, (1-5): 3
Setting up Ambari kerberos JAAS configuration to access secured Hadoop daemons…
Enter ambari server’s kerberos principal name (ambari@EXAMPLE.COM): qa-ambari-server@ABC.COM
Enter keytab path for ambari server’s kerberos principal: /root/qa-ambari-server.keytab
Ambari Server ‘setup-security’ completed successfully.