hadoop:sqoop:export update

1. Create a table in the MySQL database

mysql> CREATE TABLE foo(
->     id INT NOT NULL PRIMARY KEY,
->     msg VARCHAR(32),
->     bar INT);

2. Create a data file in HDFS

vi foo.txt
0,where are you,42
1,how are you,100
2,where do you go,67

hdfs dfs -put foo.txt /user/root
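The --input-fields-terminated-by ',' option used in the next step tells Sqoop how to split each of these lines into columns. In Python terms the parsing amounts to roughly this (illustrative only, not Sqoop's actual code):

```python
# Split one line of foo.txt the way --input-fields-terminated-by ',' directs Sqoop to.
line = "0,where are you,42"
fields = line.split(",")
print(fields)  # ['0', 'where are you', '42']
```

Note that a literal comma inside the msg column would break this simple split, which is why Sqoop also offers options such as --input-enclosed-by for quoted fields.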

3. Export from HDFS into the MySQL foo table created in step 1

[root@sandbox ~]# sqoop export --connect jdbc:mysql://sandbox.hortonworks.com:3306/employees --table foo --export-dir /user/root/foo.txt --input-fields-terminated-by ',' --update-key id --username root -P
Warning: /usr/hdp/2.3.0.0-2130/accumulo does not exist! Accumulo imports will fail.
Please set $ACCUMULO_HOME to the root of your Accumulo installation.
16/02/09 03:15:23 INFO sqoop.Sqoop: Running Sqoop version: 1.4.6.2.3.0.0-2130
Enter password: 
16/02/09 03:15:28 INFO manager.MySQLManager: Preparing to use a MySQL streaming resultset.
SLF4J: Class path contains multiple SLF4J bindings.
SLF4J: Found binding in [jar:file:/usr/hdp/2.3.0.0-2130/hadoop/lib/slf4j-log4j12-1.7.10.jar!/org/slf4j/impl/StaticLoggerBinder.class]
SLF4J: Found binding in [jar:file:/usr/hdp/2.3.0.0-2130/zookeeper/lib/slf4j-log4j12-1.6.1.jar!/org/slf4j/impl/StaticLoggerBinder.class]
SLF4J: See http://www.slf4j.org/codes.html#multiple_bindings for an explanation.
SLF4J: Actual binding is of type [org.slf4j.impl.Log4jLoggerFactory]
16/02/09 03:15:28 INFO tool.CodeGenTool: Beginning code generation
16/02/09 03:15:28 INFO manager.SqlManager: Executing SQL statement: SELECT t.* FROM `foo` AS t LIMIT 1
16/02/09 03:15:29 INFO manager.SqlManager: Executing SQL statement: SELECT t.* FROM `foo` AS t LIMIT 1
16/02/09 03:15:29 INFO orm.CompilationManager: HADOOP_MAPRED_HOME is /usr/hdp/2.3.0.0-2130/hadoop-mapreduce
Note: /tmp/sqoop-root/compile/ac651fa36762de6388a2b994c0a40cbc/foo.java uses or overrides a deprecated API.
Note: Recompile with -Xlint:deprecation for details.
16/02/09 03:15:34 INFO orm.CompilationManager: Writing jar file: /tmp/sqoop-root/compile/ac651fa36762de6388a2b994c0a40cbc/foo.jar
16/02/09 03:15:34 INFO mapreduce.ExportJobBase: Beginning export of foo
16/02/09 03:15:39 INFO impl.TimelineClientImpl: Timeline service address: http://sandbox.hortonworks.com:8188/ws/v1/timeline/
16/02/09 03:15:39 INFO client.RMProxy: Connecting to ResourceManager at sandbox.hortonworks.com/10.0.2.15:8050
16/02/09 03:15:43 INFO input.FileInputFormat: Total input paths to process : 1
16/02/09 03:15:43 INFO input.FileInputFormat: Total input paths to process : 1
16/02/09 03:15:43 INFO mapreduce.JobSubmitter: number of splits:4
16/02/09 03:15:43 INFO mapreduce.JobSubmitter: Submitting tokens for job: job_1454676435965_0015
16/02/09 03:15:44 INFO impl.YarnClientImpl: Submitted application application_1454676435965_0015
16/02/09 03:15:44 INFO mapreduce.Job: The url to track the job: http://sandbox.hortonworks.com:8088/proxy/application_1454676435965_0015/
16/02/09 03:15:44 INFO mapreduce.Job: Running job: job_1454676435965_0015
16/02/09 03:15:57 INFO mapreduce.Job: Job job_1454676435965_0015 running in uber mode : false
16/02/09 03:15:57 INFO mapreduce.Job:  map 0% reduce 0%
16/02/09 03:16:21 INFO mapreduce.Job:  map 50% reduce 0%
16/02/09 03:16:24 INFO mapreduce.Job:  map 75% reduce 0%
16/02/09 03:16:25 INFO mapreduce.Job:  map 100% reduce 0%
16/02/09 03:16:27 INFO mapreduce.Job: Job job_1454676435965_0015 completed successfully
16/02/09 03:16:27 INFO mapreduce.Job: Counters: 30
    File System Counters
        FILE: Number of bytes read=0
        FILE: Number of bytes written=570836
        FILE: Number of read operations=0
        FILE: Number of large read operations=0
        FILE: Number of write operations=0
        HDFS: Number of bytes read=777
        HDFS: Number of bytes written=0
        HDFS: Number of read operations=19
        HDFS: Number of large read operations=0
        HDFS: Number of write operations=0
    Job Counters 
        Launched map tasks=4
        Data-local map tasks=4
        Total time spent by all maps in occupied slots (ms)=94152
        Total time spent by all reduces in occupied slots (ms)=0
        Total time spent by all map tasks (ms)=94152
        Total vcore-seconds taken by all map tasks=94152
        Total megabyte-seconds taken by all map tasks=23538000
    Map-Reduce Framework
        Map input records=3
        Map output records=3
        Input split bytes=606
        Spilled Records=0
        Failed Shuffles=0
        Merged Map outputs=0
        GC time elapsed (ms)=849
        CPU time spent (ms)=9970
        Physical memory (bytes) snapshot=573116416
        Virtual memory (bytes) snapshot=3275403264
        Total committed heap usage (bytes)=421003264
    File Input Format Counters 
        Bytes Read=0
    File Output Format Counters 
        Bytes Written=0
16/02/09 03:16:27 INFO mapreduce.ExportJobBase: Transferred 777 bytes in 49.2857 seconds (15.7652 bytes/sec)
16/02/09 03:16:27 INFO mapreduce.ExportJobBase: Exported 3 records.

4. Verify with a query in MySQL. Although Sqoop reported 3 records exported, nothing was written: in the default updateonly mode each record becomes an UPDATE keyed on the --update-key column, and against the empty foo table every UPDATE matched zero rows.
mysql> select * from foo;
Empty set (0.00 sec)
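Why updateonly wrote nothing can be shown outside Sqoop: each exported record turns into an UPDATE ... WHERE id = ?, and an UPDATE against an empty table simply matches no rows. A minimal sketch using Python's bundled sqlite3 as a stand-in for MySQL (illustrative, not Sqoop's generated code):

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE foo (id INTEGER PRIMARY KEY, msg TEXT, bar INTEGER)")

records = [(0, "where are you", 42), (1, "how are you", 100), (2, "where do you go", 67)]

# updateonly: every record becomes an UPDATE keyed on the --update-key column.
for rid, msg, bar in records:
    cur = conn.execute("UPDATE foo SET msg = ?, bar = ? WHERE id = ?", (msg, bar, rid))
    # cur.rowcount is 0 here: nothing matches in an empty table, so nothing is written.

print(conn.execute("SELECT COUNT(*) FROM foo").fetchone()[0])  # 0 -- same result as step 4
```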

5. Now retry with --update-mode set to allowinsert

 sqoop export --connect jdbc:mysql://sandbox.hortonworks.com:3306/employees --table foo --export-dir /user/root/foo.txt --input-fields-terminated-by ',' --update-key id --update-mode allowinsert --username root -P


With --update-mode changed from the default updateonly to allowinsert, the export inserted the records.
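With the MySQL connector, allowinsert is typically implemented as an upsert (INSERT ... ON DUPLICATE KEY UPDATE): records whose --update-key value matches an existing row are updated, and the rest are inserted. A rough sketch using SQLite's equivalent ON CONFLICT clause (illustrative only, not Sqoop's actual statements):

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE foo (id INTEGER PRIMARY KEY, msg TEXT, bar INTEGER)")
conn.execute("INSERT INTO foo VALUES (1, 'old message', -1)")  # pretend one row already exists

records = [(0, "where are you", 42), (1, "how are you", 100), (2, "where do you go", 67)]

# allowinsert: upsert -- update the matching row for id 1, insert ids 0 and 2.
for row in records:
    conn.execute(
        "INSERT INTO foo (id, msg, bar) VALUES (?, ?, ?) "
        "ON CONFLICT(id) DO UPDATE SET msg = excluded.msg, bar = excluded.bar",
        row,
    )

for row in conn.execute("SELECT * FROM foo ORDER BY id"):
    print(row)  # all three records present, id 1 overwritten
```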

6. Verify with a SQL query

select * from foo;
+----+-----------------+------+
| id | msg             | bar  |
+----+-----------------+------+
|  2 | where do you go |   67 |
|  1 | how are you     |  100 |
|  0 | where are you   |   42 |
+----+-----------------+------+
3 rows in set (0.00 sec)

Author: rajukv

Hadoop (Big Data) Architect and Hadoop Security Architect; designs and builds Hadoop systems for a range of data science projects.
