MySQL Backup using mysqldump

  1. Create a script "mysql_backup.sh" and schedule it in cron to run the backup job.

0 3 * * * /var/lib/mysql/mysql_backup.sh >> /var/log/mysql_backup/mysql_backup.log 2>> /var/log/mysql_backup/mysql_backup.err
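Before the cron entry can run, the script must be executable and the log and backup directories must exist. A minimal one-time setup sketch, assuming the paths used in the cron entry and the script below:

# create the log and backup directories (paths assumed from the cron entry and script)
mkdir -p /var/log/mysql_backup /logs/mysql_backup
# the script contains the backup user's password, so keep it readable only by its owner
chmod 700 /var/lib/mysql/mysql_backup.sh
chown mysql:mysql /var/lib/mysql/mysql_backup.sh /logs/mysql_backup /var/log/mysql_backup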

2. SCRIPT:

#!/bin/bash
# Script run from cron to take MySQL database backups using mysqldump
# Author: rajukv
# Date:
Host=slavemysqlserver.mydomain.com
BKPDir=/logs/mysql_backup
USER=mysql_bkpuser
PASS='*******'
# Log locations (the cron entry redirects stdout/stderr to these files)
LOG=/var/log/mysql_backup/mysql_backup.log
LOG_ERROR=/var/log/mysql_backup/mysql_backup.err

# mysqldump command for regular databases
Dump="/usr/bin/mysqldump -u $USER -p$PASS -h $Host --skip-extended-insert --force"
# mysqldump command with --skip-lock-tables, used for the schema databases that cannot be locked
Dump_skip_lock_tables="/usr/bin/mysqldump -u $USER -p$PASS -h $Host --skip-extended-insert --force --skip-lock-tables"
MySQL=/usr/bin/mysql

Today=$(date "+%a")
echo ">>>Starting mysql backup at $(date)"

# Get a list of all databases (drop the "Database" header line)
Databases=$(echo "SHOW DATABASES" | $MySQL -u $USER -p$PASS -h $Host | grep -v Database)

for db in $Databases; do
    file="$BKPDir/$Host-$db-$Today.sql.gz"
    echo "Backing up '$db' from '$Host' on '$Today' to:"
    echo "  $file"
    if [ "$db" == "information_schema" ] || [ "$db" == "performance_schema" ]; then
        $Dump_skip_lock_tables $db | gzip > $file
    else
        $Dump $db | gzip > $file
    fi
done
echo ">>>End of mysql backup at $(date)"

3. Backup file verification

-rw-r--r-- 1 mysql mysql 6424169 Sep 18 03:00 slavemysqlserver.mydomain.com-information_schema-Sun.sql.gz
-rw-r--r-- 1 mysql mysql  546263 Sep 18 03:02 slavemysqlserver.mydomain.com-performance_schema-Sun.sql.gz
-rw-r--r-- 1 mysql mysql   28365 Sep 18 03:04 slavemysqlserver.mydomain.com-sys-Sun.sql.gz
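To confirm a dump is actually restorable, you can test the gzip archive and load it into a scratch database. A hedged sketch, where the database name "mydb" and the scratch database "mydb_restore" are illustrative, not part of the script above:

# check that the compressed dump is not corrupted
gzip -t /logs/mysql_backup/slavemysqlserver.mydomain.com-mydb-Sun.sql.gz
# restore into a scratch database (mydb_restore is an illustrative name) to verify the dump loads cleanly
mysql -u mysql_bkpuser -p -h slavemysqlserver.mydomain.com -e "CREATE DATABASE IF NOT EXISTS mydb_restore"
gunzip < /logs/mysql_backup/slavemysqlserver.mydomain.com-mydb-Sun.sql.gz | mysql -u mysql_bkpuser -p -h slavemysqlserver.mydomain.com mydb_restore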

 


HADOOP:HDFS: Recover files deleted in HDFS from .Trash

When files or directories are deleted, Hadoop moves them to the .Trash directory in the user's home directory, provided the trash feature is enabled, i.e. the trash deletion interval (fs.trash.interval) is set to a non-zero value.

hadoop fs -rm -r -f /user/root/employee

15/07/26 05:12:14 INFO fs.TrashPolicyDefault: Namenode trash configuration: Deletion interval = 360 minutes, Emptier interval = 0 minutes.

Moved: 'hdfs://sandbox.hortonworks.com:8020/user/root/employee' to trash at: hdfs://sandbox.hortonworks.com:8020/user/root/.Trash/Current

# hadoop fs -ls /user/root/employee

ls: `/user/root/employee': No such file or directory

Notes on .Trash:

The Hadoop trash feature helps prevent accidental deletion of files and directories. If trash is enabled and a file or directory is deleted using the Hadoop shell, the file is moved to the .Trash directory in the user’s home directory instead of being deleted. Deleted files are initially moved to the Current sub-directory of the .Trash directory, and their original path is preserved. Files in .Trash are permanently removed after a user-configurable time interval. The interval setting also enables trash checkpointing, where the Current directory is periodically renamed using a timestamp. Files and directories in the trash can be restored simply by moving them to a location outside the .Trash directory.
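For example, the employee directory deleted above can be recovered with a simple move out of .Trash (a sketch; the path under Current mirrors the original path, and may sit under a timestamped checkpoint directory instead if a checkpoint has already run):

# move the deleted directory back out of the trash
hadoop fs -mv /user/root/.Trash/Current/user/root/employee /user/root/employee
hadoop fs -ls /user/root/employee
# to permanently empty the trash immediately instead, use: hadoop fs -expunge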

Where is the configuration located ?

# grep -ri -a1 trash /etc/hadoop/conf/

/etc/hadoop/conf/core-site.xml: <name>fs.trash.interval</name>

/etc/hadoop/conf/core-site.xml- <value>360</value>
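The full property definition in core-site.xml looks like the block below; the value is the deletion interval in minutes (360 here, i.e. 6 hours), and 0 disables trash:

<property>
  <name>fs.trash.interval</name>
  <value>360</value>
</property>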

In Cloudera Manager (CDH) you can enable/disable trash and configure the interval:

http://www.cloudera.com/content/cloudera/en/documentation/cloudera-manager/v4-latest/Cloudera-Manager-Managing-Clusters/cmmc_hdfs_trash.html

From Ambari you can change the setting using the path below:

HDFS==>Configs==>Advanced Core Site ==>fs.trash.interval