MYSQL Backup using mysqldump

  1. Create a script “mysql_backup.sh” and set in cron to run the jobs.

0 3 * * * /var/lib/mysql/mysql_backup.sh >> /var/log/mysql_backup/mysql_backup.log 2>> /var/log/mysql_backup/mysql_backup.err

2. SCRIPT:

#!/bin/bash
#Script to run on cron for taking mysql db backups
#Author: rajukv
#Date:
Host=slavemysqlserver.mydomain.com
BKPDir=/logs/mysql_backup
USER=mysql_bkpuser
PASS=’*******’
LOG=/var/log/mysql_backup/mysql_backup.log
LOG_ERROR=/var/log/mysql_backup/mysql_backup.log

#Set MYSQL Dump Command
Dump=”/usr/bin/mysqldump -u $USER -p$PASS -h $Host –skip-extended-insert –force”
#Set MYSQL DUMP COMMAND with –skip-lock-tables options
Dump_skip_lock_tables=”/usr/bin/mysqldump -u $USER -p$PASS -h $Host –skip-extended-insert –force –skip-lock-tables”
MySQL=/usr/bin/mysql

Today=$(date “+%a”)
echo ‘>>>Starting mysql backup at `date`’

# Get a list of all databases
Databases=$(echo “SHOW DATABASES” | $MySQL -u $USER -p$PASS -h $Host|grep -v Database)

for db in $Databases; do
date=`date`
file=”$BKPDir/$Host-$db-$Today.sql.gz”
echo “Backing up ‘$db’ from ‘$Host’ on ‘$Today’ to: ”
echo ” $file”
if [ “$db” == “information_schema” ] || [ “$db” == “performance_schema” ]; then
$Dump_skip_lock_tables -h $Host $db | gzip > $file
else
$Dump -h $Host $db | gzip > $file
fi
done
echo ‘>>>End of mysql backup at `date`’

3. Backup file verification

-rw-r–r– 1 mysql mysql    6424169 Sep 18 03:00 slavemysqlserver.mydomain.com-information_schema-Sun.sql.gz

-rw-r–r– 1 mysql mysql 546263 Sep 18 03:02 slavemysqlserver.mydomain.com-performance_schema-Sun.sql.gz

-rw-r–r– 1 mysql mysql      28365 Sep 18 03:04 slavemysqlserver.mydomain.com-sys-Sun.sql.gz

 

SQOOP ( pull and push data from/to RDBMS, EDW and files from/to Hadoop hdfs)

What is Sqoop ?

Sqoop is a tool designed to transfer data between Hadoop and relational databases. You can use Sqoop to import data from a relational database management system (RDBMS) such as MySQL or Oracle into the Hadoop Distributed File System (HDFS), transform the data in Hadoop MapReduce, and then export the data back into an RDBMS.

Sqoop automates most of this process, relying on the database to describe the schema for the data to be imported. Sqoop uses MapReduce to import and export the data, which provides parallel operation as well as fault tolerance.

https://sqoop.apache.org/docs/1.4.2/SqoopUserGuide.html

sqoop import (exmaples)

  1. import “salaries” table from database test in localhost

sqoop import –connect jdbc:mysql://localhost/test –table salaries –username root

hadoop fs -ls /user/root/salaries (this will have 4 map files under /user/root/salaries directory)

hadoop fs -cat /user/root/salaries/part-m-00000 (This will list content of the database in csv format)

2. importing selected columns using single map task into salaries2 directory

sqoop import –connect jdbc:mysql://localhost/test –table salaries –username root –columns salary,age -m 1 –target-dir /user/root/salaries2

hadoop fs -ls /user/root/salaries2/ (This will show only one map file as we used -m 1 option)
hadoop fs -cat /user/root/slaries2/part-m-00000  (this will show only two columns i.e. salary and age)

3. sqoop query option to import data of employees with salary > 90000 and sort out into 2 files based on gender

sqoop import –connect jdbc:mysql://localhost/test –query “SELECT * FROM salaries s where s.salary >=90000 AND \$CONDITIONS” –username root –split-by gender -m2 –target-dir /user/root/salaries3

hadoop fs -ls /user/root/salaries3
hadoop fs -cat /user/root/salaries3/part-m-00000
hadoop fs -cat /user/root/salaries3/part-m-00001

# sqoop import –connect jdbc:mysql://localhost/employees –query “SELECT emp_no,salary FROM salaries WHERE \$CONDITIONS” -m1 –target-dir /user/root/employee –username root

15/07/26 04:56:48 INFO mapreduce.ImportJobBase: Transferred 34.5346 MB in 49.9978 seconds (707.2999 KB/sec)
15/07/26 04:56:48 INFO mapreduce.ImportJobBase: Retrieved 2844047 records.
[root@sandbox employees_db]# hadoop fs -ls /user/root/employee
Found 2 items
-rw-r–r–   1 root hdfs          0 2015-07-26 04:56 /user/root/employee/_SUCCESS
-rw-r–r–   1 root hdfs   36212147 2015-07-26 04:56 /user/root/employee/part-m-00000

# sqoop import –connect jdbc:mysql://localhost/employees –query “SELECT emp_no,salary FROM salaries WHERE salary >=10000 AND \$CONDITIONS” -m1 –target-dir /user/root/employee –username root

Import from Oracle:

sqoop import --connect jdbc:oracle:thin:@localhost:1521/orcl --username MOVIEDEMO --password welcome1 --table ACTIVITY

sqoop export (Examples)

hadoop fs -mkdir /user/root/salarydata

hadoop fs -put salarydata.txt /user/root/salarydata/

mysql test < salaries2.sql

sqoop export –connect jdbc:mysql://localhost/test –table salaries2 –username root –export-dir /user/root/salarydata/

mysql> select * from salaries2 limit 10;
+——–+——+——–+———+
| gender | age  | salary | zipcode |
+——–+——+——–+———+
| M      |   52 |  85000 |   95102 |
| M      |   60 |  78000 |   94040 |
| F      |   74 |  89000 |   94040 |
| F      |   87 |  44000 |   95103 |
| F      |   74 |   2000 |   95103 |
| M      |   66 |  52000 |   95050 |
| F      |   62 |   9000 |   94040 |
| M      |   95 |  31000 |   95105 |
| F      |   90 |  39000 |   95050 |
| F      |   12 |      0 |   94041 |
+——–+——+——–+———+
10 rows in set (0.00 sec)

Export to Oracle database:

sqoop export --connect jdbc:oracle:thin:@localhost:1521/orcl --username MOVIEDEMO --password --table ACTIVITY_FILTERED --export-dir FILTERED_ACTIVITIES

SQOOP NOTES:

sqoop by defaults creates 4 map only tasks, this can be changed using -m option or –split-by options

sqoop import using –query then –split-by should be mentioned else it will fail.

You can import data in one of two file formats: delimited text or SequenceFiles.

sqoop export has 3 methods (update, insert and call)

sqoop supported database

Database version --direct support? connect string matches
HSQLDB 1.8.0+ No jdbc:hsqldb:*//
MySQL 5.0+ Yes jdbc:mysql://
Oracle 10.2.0+ No jdbc:oracle:*//
PostgreSQL 8.3+ Yes (import only) jdbc:postgresql://

Also it has connectors  to Teradata, Netezza and Microsoft SQL Server R2

Reference:

https://sqoop.apache.org/docs/1.4.6/SqoopUserGuide.html

https://www.rittmanmead.com/blog/2014/03/using-sqoop-for-loading-oracle-data-into-hadoop-on-the-bigdatalite-vm/

 

Intro to mysql (Install, load sample employee database and uninstall mysql)

Installing mysql on Centos or ubuntu

CentOS:

yum -y install mysql mysql-server

service start mysqld

Ubuntu

$ sudo apt-get install mysql-server

How to download and load sample employee database

# wget -c https://launchpad.net/test-db/employees-db-1/1.0.6/+download/employees_db-dump-files-1.0.5.tar.bz2

# tar -xjf employees_db-dump-files-1.0.5.tar.bz2

# cd employees_db

Found only dump files, hence download another copy for sql files to load data.

# wget -c https://launchpad.net/test-db/employees-db-1/1.0.6/+download/employees_db-code-1.0.6.tar.bz2

# tar -xjf employees_db-code-1.0.6.tar.bz2

# cd employees_db;ls

Changelog employees_partitioned.sql load_departments.dump load_employees.dump objects.sql

README employees_partitioned2.sql load_dept_emp.dump load_salaries.dump test_employees_md5.sql

employees.sql employees_partitioned3.sql load_dept_manager.dump load_titles.dump test_employees_sha.sql

Load data

# mysql test < employees.sql

Verify the database
#mysql
if privilege not set, pass user and password details

#mysql –user=your-user-name –password=your-password
mysql> show databases;

+——————–+

| Database |

+——————–+

| information_schema |

| employees |

| hive |

| mysql |

| ranger |

| ranger_audit |

| test |

+——————–+
mysql> use employees;

mysql> show tables;

+———————+

| Tables_in_employees |

+———————+

| departments |

| dept_emp |

| dept_manager |

| employees |

| salaries |

| titles |

+———————+

6 rows in set (0.00 sec)
mysql> select * from departments limit 10;

+———+——————–+

| dept_no | dept_name |

+———+——————–+

| d009 | Customer Service |

| d005 | Development |

| d002 | Finance |

| d003 | Human Resources |

| d001 | Marketing |

| d004 | Production |

| d006 | Quality Management |

| d008 | Research |

| d007 | Sales |

+———+——————–+

To see schema of a table use describe

mysql> describe titles;

+———–+————-+——+—–+———+——-+

| Field | Type | Null | Key | Default | Extra |

+———–+————-+——+—–+———+——-+

| emp_no | int(11) | NO | PRI | NULL | |

| title | varchar(50) | NO | PRI | NULL | |

| from_date | date | NO | PRI | NULL | |

| to_date | date | YES | | NULL | |

+———–+————-+——+—–+———+——-+

Uninstall mysql in ubuntu

$ sudo apt-get autoremove mysql-server

Reference:

CentOS:

yum -y install mysql mysql-server

service start mysqld

Ubuntu

user1@ubuntu1:~$ sudo apt-get install mysql-server 

[sudo] password for user1: ******

Reading package lists… Done

Building dependency tree

Reading state information… Done

The following packages were automatically installed and are no longer required:

libgtkspell0 pidgin-data python-support

Use ‘apt-get autoremove’ to remove them.

The following extra packages will be installed:

libdbd-mysql-perl libdbi-perl libhtml-template-perl libmysqlclient18

libterm-readkey-perl mysql-client-5.5 mysql-client-core-5.5 mysql-common

mysql-server-5.5 mysql-server-core-5.5

Suggested packages:

libmldbm-perl libnet-daemon-perl libplrpc-perl libsql-statement-perl

libipc-sharedcache-perl tinyca mailx

The following NEW packages will be installed:

libdbd-mysql-perl libdbi-perl libhtml-template-perl libmysqlclient18

libterm-readkey-perl mysql-client-5.5 mysql-client-core-5.5 mysql-common

mysql-server mysql-server-5.5 mysql-server-core-5.5

0 upgraded, 11 newly installed, 0 to remove and 74 not upgraded.

Need to get 9,513 kB of archives.

After this operation, 96.5 MB of additional disk space will be used.

Do you want to continue? [Y/n] Y

Get:1 http://id.archive.ubuntu.com/ubuntu/ trusty-updates/main mysql-common all 5.5.44-0ubuntu0.14.04.1 [13.9 kB]

Get:2 http://id.archive.ubuntu.com/ubuntu/ trusty-updates/main libmysqlclient18 amd64 5.5.44-0ubuntu0.14.04.1 [596 kB]

Get:3 http://id.archive.ubuntu.com/ubuntu/ trusty/main libdbi-perl amd64 1.630-1 [879 kB]

Get:4 http://id.archive.ubuntu.com/ubuntu/ trusty/main libdbd-mysql-perl amd64 4.025-1 [99.3 kB]

Get:5 http://id.archive.ubuntu.com/ubuntu/ trusty/main libterm-readkey-perl amd64 2.31-1 [27.4 kB]

Get:6 http://id.archive.ubuntu.com/ubuntu/ trusty-updates/main mysql-client-core-5.5 amd64 5.5.44-0ubuntu0.14.04.1 [702 kB]

Get:7 http://id.archive.ubuntu.com/ubuntu/ trusty-updates/main mysql-client-5.5 amd64 5.5.44-0ubuntu0.14.04.1 [1,573 kB]

Get:8 http://id.archive.ubuntu.com/ubuntu/ trusty-updates/main mysql-server-core-5.5 amd64 5.5.44-0ubuntu0.14.04.1 [3,589 kB]

Get:9 http://id.archive.ubuntu.com/ubuntu/ trusty-updates/main mysql-server-5.5 amd64 5.5.44-0ubuntu0.14.04.1 [1,955 kB]

Get:10 http://id.archive.ubuntu.com/ubuntu/ trusty/main libhtml-template-perl all 2.95-1 [65.5 kB]

Get:11 http://id.archive.ubuntu.com/ubuntu/ trusty-updates/main mysql-server all 5.5.44-0ubuntu0.14.04.1 [12.2 kB]

Fetched 9,513 kB in 20s (471 kB/s)

Preconfiguring packages …

Selecting previously unselected package mysql-common.

(Reading database … 301431 files and directories currently installed.)

Preparing to unpack …/mysql-common_5.5.44-0ubuntu0.14.04.1_all.deb …

Unpacking mysql-common (5.5.44-0ubuntu0.14.04.1) …

Selecting previously unselected package libmysqlclient18:amd64.

Preparing to unpack …/libmysqlclient18_5.5.44-0ubuntu0.14.04.1_amd64.deb …

Unpacking libmysqlclient18:amd64 (5.5.44-0ubuntu0.14.04.1) …

Selecting previously unselected package libdbi-perl.

Preparing to unpack …/libdbi-perl_1.630-1_amd64.deb …

Unpacking libdbi-perl (1.630-1) …

Selecting previously unselected package libdbd-mysql-perl.

Preparing to unpack …/libdbd-mysql-perl_4.025-1_amd64.deb …

Unpacking libdbd-mysql-perl (4.025-1) …

Selecting previously unselected package libterm-readkey-perl.

Preparing to unpack …/libterm-readkey-perl_2.31-1_amd64.deb …

Unpacking libterm-readkey-perl (2.31-1) …

Selecting previously unselected package mysql-client-core-5.5.

Preparing to unpack …/mysql-client-core-5.5_5.5.44-0ubuntu0.14.04.1_amd64.deb …

Unpacking mysql-client-core-5.5 (5.5.44-0ubuntu0.14.04.1) …

Selecting previously unselected package mysql-client-5.5.

Preparing to unpack …/mysql-client-5.5_5.5.44-0ubuntu0.14.04.1_amd64.deb …

Unpacking mysql-client-5.5 (5.5.44-0ubuntu0.14.04.1) …

Selecting previously unselected package mysql-server-core-5.5.

Preparing to unpack …/mysql-server-core-5.5_5.5.44-0ubuntu0.14.04.1_amd64.deb …

Unpacking mysql-server-core-5.5 (5.5.44-0ubuntu0.14.04.1) …

Processing triggers for man-db (2.6.7.1-1ubuntu1) …

Setting up mysql-common (5.5.44-0ubuntu0.14.04.1) …

Selecting previously unselected package mysql-server-5.5.

(Reading database … 301787 files and directories currently installed.)

Preparing to unpack …/mysql-server-5.5_5.5.44-0ubuntu0.14.04.1_amd64.deb …

Unpacking mysql-server-5.5 (5.5.44-0ubuntu0.14.04.1) …

Selecting previously unselected package libhtml-template-perl.

Preparing to unpack …/libhtml-template-perl_2.95-1_all.deb …

Unpacking libhtml-template-perl (2.95-1) …

Selecting previously unselected package mysql-server.

Preparing to unpack …/mysql-server_5.5.44-0ubuntu0.14.04.1_all.deb …

Unpacking mysql-server (5.5.44-0ubuntu0.14.04.1) …

Processing triggers for man-db (2.6.7.1-1ubuntu1) …

Processing triggers for ureadahead (0.100.0-16) …

ureadahead will be reprofiled on next reboot

Setting up libmysqlclient18:amd64 (5.5.44-0ubuntu0.14.04.1) …

Setting up libdbi-perl (1.630-1) …

Setting up libdbd-mysql-perl (4.025-1) …

Setting up libterm-readkey-perl (2.31-1) …

Setting up mysql-client-core-5.5 (5.5.44-0ubuntu0.14.04.1) …

Setting up mysql-client-5.5 (5.5.44-0ubuntu0.14.04.1) …

Setting up mysql-server-core-5.5 (5.5.44-0ubuntu0.14.04.1) …

Setting up mysql-server-5.5 (5.5.44-0ubuntu0.14.04.1) …

150725 22:57:13 [Warning] Using unique option prefix key_buffer instead of key_buffer_size is deprecated and will be removed in a future release. Please use the full name instead.

150725 22:57:13 [Note] /usr/sbin/mysqld (mysqld 5.5.44-0ubuntu0.14.04.1) starting as process 15359 …

mysql start/running, process 15491

Setting up libhtml-template-perl (2.95-1) …

Processing triggers for ureadahead (0.100.0-16) …

Setting up mysql-server (5.5.44-0ubuntu0.14.04.1) …

Processing triggers for libc-bin (2.19-0ubuntu6.6) …

How to download and load sample employee database

# wget -c https://launchpad.net/test-db/employees-db-1/1.0.6/+download/employees_db-dump-files-1.0.5.tar.bz2 

–2015-07-25 14:41:08– https://launchpad.net/test-db/employees-db-1/1.0.6/+download/employees_db-dump-files-1.0.5.tar.bz2

Resolving launchpad.net… 91.189.89.223, 91.189.89.222

Connecting to launchpad.net|91.189.89.223|:443… connected.

HTTP request sent, awaiting response… 302 Moved Temporarily

Location: https://launchpadlibrarian.net/24493789/employees_db-dump-files-1.0.5.tar.bz2 [following]

–2015-07-25 14:41:09– https://launchpadlibrarian.net/24493789/employees_db-dump-files-1.0.5.tar.bz2

Resolving launchpadlibrarian.net… 91.189.89.229, 91.189.89.228

Connecting to launchpadlibrarian.net|91.189.89.229|:443… connected.

HTTP request sent, awaiting response… 200 OK

Length: 26706427 (25M) [text/plain]

Saving to: `employees_db-dump-files-1.0.5.tar.bz2′

100%[======================================================================================================>] 26,706,427 422K/s in 58s

2015-07-25 14:42:12 (451 KB/s) – `employees_db-dump-files-1.0.5.tar.bz2′ saved [26706427/26706427]

# tar -xjf employees_db-dump-files-1.0.5.tar.bz2 
# cd employees_db
# ls

load_departments.dump load_dept_emp.dump load_dept_manager.dump load_employees.dump load_salaries.dump load_titles.dump

Found only dump files, hence download another copy for sql files to load data.

# wget -c https://launchpad.net/test-db/employees-db-1/1.0.6/+download/employees_db-code-1.0.6.tar.bz2 

–2015-07-25 14:43:35– https://launchpad.net/test-db/employees-db-1/1.0.6/+download/employees_db-code-1.0.6.tar.bz2

Resolving launchpad.net… 91.189.89.223, 91.189.89.222

Connecting to launchpad.net|91.189.89.223|:443… connected.

HTTP request sent, awaiting response… 302 Moved Temporarily

Location: https://launchpadlibrarian.net/24493350/employees_db-code-1.0.6.tar.bz2 [following]

–2015-07-25 14:43:37– https://launchpadlibrarian.net/24493350/employees_db-code-1.0.6.tar.bz2

Resolving launchpadlibrarian.net… 91.189.89.229, 91.189.89.228

Connecting to launchpadlibrarian.net|91.189.89.229|:443… connected.

HTTP request sent, awaiting response… 200 OK

Length: 6511 (6.4K) [text/plain]

Saving to: `employees_db-code-1.0.6.tar.bz2′

100%[======================================================================================================>] 6,511 –.-K/s in 0s

2015-07-25 14:43:38 (30.4 MB/s) – `employees_db-code-1.0.6.tar.bz2′ saved [6511/6511]

# tar -xjf employees_db-code-1.0.6.tar.bz2 
# cd employees_db;ls 

Changelog employees_partitioned.sql load_departments.dump load_employees.dump objects.sql

README employees_partitioned2.sql load_dept_emp.dump load_salaries.dump test_employees_md5.sql

employees.sql employees_partitioned3.sql load_dept_manager.dump load_titles.dump test_employees_sha.sql

Load data

# mysql test < employees.sql 

INFO

CREATING DATABASE STRUCTURE

INFO

storage engine: InnoDB

INFO

LOADING departments

INFO

LOADING employees

INFO

LOADING dept_emp

INFO

LOADING dept_manager

INFO

LOADING titles

INFO

LOADING salaries

Verify the database

#mysql 

if privilege not set, pass user and password details

#mysql --user=your-user-name --password=your-password 

mysql> show databases;

+——————–+

| Database |

+——————–+

| information_schema |

| employees |

| hive |

| mysql |

| ranger |

| ranger_audit |

| test |

+——————–+

mysql> use employees;

mysql> show tables;

+———————+

| Tables_in_employees |

+———————+

| departments |

| dept_emp |

| dept_manager |

| employees |

| salaries |

| titles |

+———————+

6 rows in set (0.00 sec)

mysql> select * from departments limit 10;

+———+——————–+

| dept_no | dept_name |

+———+——————–+

| d009 | Customer Service |

| d005 | Development |

| d002 | Finance |

| d003 | Human Resources |

| d001 | Marketing |

| d004 | Production |

| d006 | Quality Management |

| d008 | Research |

| d007 | Sales |

+———+——————–+

To see schema of a table use describe

mysql> describe titles;

+———–+————-+——+—–+———+——-+

| Field | Type | Null | Key | Default | Extra |

+———–+————-+——+—–+———+——-+

| emp_no | int(11) | NO | PRI | NULL | |

| title | varchar(50) | NO | PRI | NULL | |

| from_date | date | NO | PRI | NULL | |

| to_date | date | YES | | NULL | |

+———–+————-+——+—–+———+——-+

Uninstall mysql in ubuntu

user1@ubuntu1:~$ sudo apt-get autoremove mysql-server 

Reading package lists… Done

Building dependency tree

Reading state information… Done

Package ‘mysql-server’ is not installed, so not removed

The following packages will be REMOVED:

libdbd-mysql-perl libdbi-perl libgtkspell0 libmysqlclient18

libterm-readkey-perl mysql-client-5.5 mysql-client-core-5.5 mysql-common

mysql-server-5.5 mysql-server-core-5.5 pidgin-data python-support

0 upgraded, 0 newly installed, 12 to remove and 74 not upgraded.

After this operation, 100 MB disk space will be freed.

Do you want to continue? [Y/n] y

(Reading database … 301881 files and directories currently installed.)

Removing mysql-server-5.5 (5.5.44-0ubuntu0.14.04.1) …

mysql stop/waiting

Removing mysql-client-5.5 (5.5.44-0ubuntu0.14.04.1) …

Removing libdbd-mysql-perl (4.025-1) …

Removing libdbi-perl (1.630-1) …

Removing libgtkspell0 (2.0.16-1ubuntu7) …

Removing libmysqlclient18:amd64 (5.5.44-0ubuntu0.14.04.1) …

Removing libterm-readkey-perl (2.31-1) …

Removing mysql-client-core-5.5 (5.5.44-0ubuntu0.14.04.1) …

Removing mysql-common (5.5.44-0ubuntu0.14.04.1) …

Removing mysql-server-core-5.5 (5.5.44-0ubuntu0.14.04.1) …

Removing pidgin-data (1:2.10.9-0ubuntu3.2) …

Removing python-support (1.0.15) …

Processing triggers for man-db (2.6.7.1-1ubuntu1) …

Processing triggers for libc-bin (2.19-0ubuntu6.6) …

Processing triggers for hicolor-icon-theme (0.13-1) …