- Daily Operations
- Decommissioning Data Nodes
- Cleaning Up a CORRUPT Filesystem
- Restoring from a Checkpoint
- Fixing Stuck and Under-Replicated Files
- Port Forwarding for the Hadoop Web Interface
- Running the Balancer
This article answers many common operational questions for those who want to understand HDFS in depth.
The following steps need to be performed on all HiveServer2, Hive Metastore, and Hive client nodes.
- Create the folder “/usr/hdp/184.108.40.206-121/hive/auxlib” if it does not already exist
- Copy the custom-built jar “customserde.jar” into this folder
- Restart the Hive service
- Verify with “ps -ef | grep hive | grep customserde”. The Hive process should have loaded this jar, with its full path listed under “--hiveconf hive.aux.jars.path=”
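On each node, the steps above can be sketched as a small shell snippet. The HDP version directory is the one used on this cluster, while the function names and the assumption that the jar sits in the current working directory are illustrative; the Hive restart itself is normally done from Ambari, so the cluster-side invocations are shown commented out:

```shell
# Target auxlib folder (substitute your own /usr/hdp/<version> path).
AUXLIB="/usr/hdp/184.108.40.206-121/hive/auxlib"

deploy_serde() {                 # steps 1-2: create the folder, copy the jar in
  mkdir -p "$1" && cp customserde.jar "$1"/
}

verify_serde() {                 # step 4: check a ps line for the aux jar path
  grep -q 'hive\.aux\.jars\.path=.*customserde\.jar'
}

# Run on each Hive node (needs root), then restart Hive (step 3) from Ambari:
# deploy_serde "$AUXLIB"
# ps -ef | grep '[h]ive' | verify_serde && echo "customserde.jar loaded"
```

The `grep '[h]ive'` bracket trick keeps the grep process itself out of the `ps` output, so the verification does not match its own command line.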
We recently upgraded from HDP 2.3.2 to 2.5.3 using the rolling upgrade method. While the upgrade sat in the Paused state, pending commit, HDFS kept preserving all deleted data, even when the -skipTrash option was used.
“hdfs dfs -du -s -h” reported the correct size after deletion, but “hdfs dfsadmin -report” showed a higher value.
The root cause, identified while working with Hortonworks support: during a rolling upgrade, each DataNode keeps deleted blocks in a “trash” directory on the disks allocated for DFS storage.
This space is reclaimed once the rolling upgrade is committed, either through Ambari or manually with “hdfs dfsadmin -rollingUpgrade finalize”.
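A sketch of how to confirm the discrepancy comes from the rolling-upgrade trash and then reclaim the space. The data-dir paths below are illustrative assumptions, not values from this cluster; substitute your own dfs.datanode.data.dir entries. The cluster-side commands are shown commented out:

```shell
# Per-DataNode: deleted blocks are parked under each block pool's trash
# directory inside every configured data dir until the upgrade is finalized.
trash_usage() {
  for d in "$@"; do
    du -sh "$d"/current/BP-*/trash 2>/dev/null
  done
}

# hdfs dfs -du -s -h /                     # logical usage: drops right after the delete
# hdfs dfsadmin -report | grep 'DFS Used'  # raw usage: still counts the trashed blocks
# trash_usage /grid/0/hadoop/hdfs/data /grid/1/hadoop/hdfs/data
#
# Reclaim the space by committing (irreversible -- rollback is no longer possible):
# hdfs dfsadmin -rollingUpgrade finalize   # or click Finalize in Ambari
```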
hadoop jar /usr/hdp/220.127.116.11-1245/hadoop-mapreduce/hadoop-mapreduce-examples.jar teragen 10000000000 /teraInput
# hdfs dfs -mv /teraInput /user/root/10000000
# hadoop jar /usr/hdp/18.104.22.168-1245/hadoop-mapreduce/hadoop-mapreduce-examples.jar terasort /teraInput /teraOutput
# hdfs dfs -mv /teraInput /teraOutput
# hadoop jar /usr/hdp/22.214.171.124-1245/hadoop-mapreduce/hadoop-mapreduce-examples.jar teravalidate /teraOutput /teraValidate
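A rough sketch tying the three stages together, with the row count parameterized. The jar path reuses the location from the commands above; each TeraGen row is a fixed 100 bytes, so the 10,000,000,000 rows used here amount to roughly 1 TB of input. The cluster-side invocations are shown commented out:

```shell
EXAMPLES="/usr/hdp/220.127.116.11-1245/hadoop-mapreduce/hadoop-mapreduce-examples.jar"
ROWS=10000000000                 # number of 100-byte rows teragen will write
BYTES=$((ROWS * 100))
echo "teragen will write $BYTES bytes (~$((BYTES / 1024 / 1024 / 1024)) GiB)"

# hadoop jar "$EXAMPLES" teragen      "$ROWS"     /teraInput
# hadoop jar "$EXAMPLES" terasort     /teraInput  /teraOutput
# hadoop jar "$EXAMPLES" teravalidate /teraOutput /teraValidate
# hdfs dfs -cat /teraValidate/part-r-00000   # any error records here mean the output was not fully sorted
```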