hive> CREATE EXTERNAL TABLE h2 (id int, name STRING, ts TIMESTAMP) ROW FORMAT DELIMITED FIELDS TERMINATED BY ',' STORED AS TEXTFILE location '/user/it1/sqin5';
OK
Time taken: 1.582 seconds
hive> select count(*) from h2;
Query ID = it1_20160125133417_82ddd40f-568a-46d5-b260-2906f597c39e
Total jobs = 1
Launching Job 1 out of 1

Status: Running (Executing on YARN cluster with App id application_1453391437654_0052)

--------------------------------------------------------------------------------
        VERTICES      STATUS  TOTAL  COMPLETED  RUNNING  PENDING  FAILED  KILLED
--------------------------------------------------------------------------------
Map 1                SUCCEEDED      0          0        0        0       0       0
Reducer 2 ......     SUCCEEDED      1          1        0        0       0       0
--------------------------------------------------------------------------------
VERTICES: 02/02  [==========================>>] 100%  ELAPSED TIME: 6.36 s
--------------------------------------------------------------------------------
OK
0
Time taken: 10.466 seconds, Fetched: 1 row(s)
hive> quit;
[it1@sandbox input]$ sqoop import --connect jdbc:mysql://localhost:3306/test --driver com.mysql.jdbc.Driver --username it1 --password hadoop --table st1 --target-dir sqin5 -m 1 --incremental append -check-column id
Warning: /usr/hdp/2.3.2.0-2950/accumulo does not exist! Accumulo imports will fail.
Please set $ACCUMULO_HOME to the root of your Accumulo installation.
16/01/25 13:36:01 INFO sqoop.Sqoop: Running Sqoop version: 1.4.6.2.3.2.0-2950
16/01/25 13:36:01 WARN tool.BaseSqoopTool: Setting your password on the command-line is insecure. Consider using -P instead.
16/01/25 13:36:01 WARN sqoop.ConnFactory: Parameter --driver is set to an explicit driver however appropriate connection manager is not being set (via --connection-manager). Sqoop is going to fall back to org.apache.sqoop.manager.GenericJdbcManager. Please specify explicitly which connection manager should be used next time.
16/01/25 13:36:01 INFO manager.SqlManager: Using default fetchSize of 1000
16/01/25 13:36:01 INFO tool.CodeGenTool: Beginning code generation
SLF4J: Class path contains multiple SLF4J bindings.
SLF4J: Found binding in [jar:file:/usr/hdp/2.3.2.0-2950/hadoop/lib/slf4j-log4j12-1.7.10.jar!/org/slf4j/impl/StaticLoggerBinder.class]
SLF4J: Found binding in [jar:file:/usr/hdp/2.3.2.0-2950/zookeeper/lib/slf4j-log4j12-1.6.1.jar!/org/slf4j/impl/StaticLoggerBinder.class]
SLF4J: See http://www.slf4j.org/codes.html#multiple_bindings for an explanation.
SLF4J: Actual binding is of type [org.slf4j.impl.Log4jLoggerFactory]
16/01/25 13:36:02 INFO manager.SqlManager: Executing SQL statement: SELECT t.* FROM st1 AS t WHERE 1=0
16/01/25 13:36:02 INFO manager.SqlManager: Executing SQL statement: SELECT t.* FROM st1 AS t WHERE 1=0
16/01/25 13:36:02 INFO orm.CompilationManager: HADOOP_MAPRED_HOME is /usr/hdp/2.3.2.0-2950/hadoop-mapreduce
Note: /tmp/sqoop-it1/compile/1eeac2cd815470e1a32c23bd2f156b9f/st1.java uses or overrides a deprecated API.
Note: Recompile with -Xlint:deprecation for details.
16/01/25 13:36:05 INFO orm.CompilationManager: Writing jar file: /tmp/sqoop-it1/compile/1eeac2cd815470e1a32c23bd2f156b9f/st1.jar
16/01/25 13:36:07 INFO tool.ImportTool: Maximal id query for free form incremental import: SELECT MAX(id) FROM st1
16/01/25 13:36:07 INFO tool.ImportTool: Incremental import based on column id
16/01/25 13:36:07 INFO tool.ImportTool: Upper bound value: 5000
16/01/25 13:36:07 INFO mapreduce.ImportJobBase: Beginning import of st1
16/01/25 13:36:07 INFO manager.SqlManager: Executing SQL statement: SELECT t.* FROM st1 AS t WHERE 1=0
16/01/25 13:36:08 INFO impl.TimelineClientImpl: Timeline service address: http://sandbox.hortonworks.com:8188/ws/v1/timeline/
16/01/25 13:36:09 INFO client.RMProxy: Connecting to ResourceManager at sandbox.hortonworks.com/10.0.2.15:8050
16/01/25 13:36:11 INFO db.DBInputFormat: Using read commited transaction isolation
16/01/25 13:36:11 INFO mapreduce.JobSubmitter: number of splits:1
16/01/25 13:36:11 INFO mapreduce.JobSubmitter: Submitting tokens for job: job_1453391437654_0053
16/01/25 13:36:12 INFO impl.YarnClientImpl: Submitted application application_1453391437654_0053
16/01/25 13:36:12 INFO mapreduce.Job: The url to track the job: http://sandbox.hortonworks.com:8088/proxy/application_1453391437654_0053/
16/01/25 13:36:12 INFO mapreduce.Job: Running job: job_1453391437654_0053
16/01/25 13:36:19 INFO mapreduce.Job: Job job_1453391437654_0053 running in uber mode : false
16/01/25 13:36:19 INFO mapreduce.Job:  map 0% reduce 0%
16/01/25 13:36:27 INFO mapreduce.Job:  map 100% reduce 0%
16/01/25 13:36:27 INFO mapreduce.Job: Job job_1453391437654_0053 completed successfully
16/01/25 13:36:27 INFO mapreduce.Job: Counters: 30
        File System Counters
                FILE: Number of bytes read=0
                FILE: Number of bytes written=145472
                FILE: Number of read operations=0
                FILE: Number of large read operations=0
                FILE: Number of write operations=0
                HDFS: Number of bytes read=87
                HDFS: Number of bytes written=154293
                HDFS: Number of read operations=4
                HDFS: Number of large read operations=0
                HDFS: Number of write operations=2
        Job Counters
                Launched map tasks=1
                Other local map tasks=1
                Total time spent by all maps in occupied slots (ms)=4191
                Total time spent by all reduces in occupied slots (ms)=0
                Total time spent by all map tasks (ms)=4191
                Total vcore-seconds taken by all map tasks=4191
                Total megabyte-seconds taken by all map tasks=1047750
        Map-Reduce Framework
                Map input records=5000
                Map output records=5000
                Input split bytes=87
                Spilled Records=0
                Failed Shuffles=0
                Merged Map outputs=0
                GC time elapsed (ms)=36
                CPU time spent (ms)=1320
                Physical memory (bytes) snapshot=151392256
                Virtual memory (bytes) snapshot=826109952
                Total committed heap usage (bytes)=132644864
        File Input Format Counters
                Bytes Read=0
        File Output Format Counters
                Bytes Written=154293
16/01/25 13:36:27 INFO mapreduce.ImportJobBase: Transferred 150.6768 KB in 19.2572 seconds (7.8244 KB/sec)
16/01/25 13:36:27 INFO mapreduce.ImportJobBase: Retrieved 5000 records.
16/01/25 13:36:27 INFO util.AppendUtils: Appending to directory sqin5
16/01/25 13:36:27 INFO tool.ImportTool: Incremental import complete! To run another incremental import of all data following this import, supply the following arguments:
16/01/25 13:36:27 INFO tool.ImportTool:  --incremental append
16/01/25 13:36:27 INFO tool.ImportTool:   --check-column id
16/01/25 13:36:27 INFO tool.ImportTool:   --last-value 5000
16/01/25 13:36:27 INFO tool.ImportTool: (Consider saving this with 'sqoop job --create')
[it1@sandbox input]$ hive
SLF4J: Class path contains multiple SLF4J bindings.
SLF4J: Found binding in [jar:file:/usr/hdp/2.3.2.0-2950/hadoop/lib/slf4j-log4j12-1.7.10.jar!/org/slf4j/impl/StaticLoggerBinder.class]
SLF4J: Found binding in [jar:file:/usr/hdp/2.3.2.0-2950/spark/lib/spark-assembly-1.4.1.2.3.2.0-2950-hadoop2.7.1.2.3.2.0-2950.jar!/org/slf4j/impl/StaticLoggerBinder.class]
SLF4J: See http://www.slf4j.org/codes.html#multiple_bindings for an explanation.
SLF4J: Actual binding is of type [org.slf4j.impl.Log4jLoggerFactory]
WARNING: Use "yarn jar" to launch YARN applications.
SLF4J: Class path contains multiple SLF4J bindings.
SLF4J: Found binding in [jar:file:/usr/hdp/2.3.2.0-2950/hadoop/lib/slf4j-log4j12-1.7.10.jar!/org/slf4j/impl/StaticLoggerBinder.class]
SLF4J: Found binding in [jar:file:/usr/hdp/2.3.2.0-2950/spark/lib/spark-assembly-1.4.1.2.3.2.0-2950-hadoop2.7.1.2.3.2.0-2950.jar!/org/slf4j/impl/StaticLoggerBinder.class]
SLF4J: See http://www.slf4j.org/codes.html#multiple_bindings for an explanation.
SLF4J: Actual binding is of type [org.slf4j.impl.Log4jLoggerFactory]

Logging initialized using configuration in file:/etc/hive/2.3.2.0-2950/0/hive-log4j.properties
hive> select count(*) from h2;
Query ID = it1_20160125133734_329ffc2e-882b-4f97-85cf-3740d11e0bd2
Total jobs = 1
Launching Job 1 out of 1

Status: Running (Executing on YARN cluster with App id application_1453391437654_0054)

--------------------------------------------------------------------------------
        VERTICES      STATUS  TOTAL  COMPLETED  RUNNING  PENDING  FAILED  KILLED
--------------------------------------------------------------------------------
Map 1 ..........     SUCCEEDED      1          1        0        0       0       0
Reducer 2 ......     SUCCEEDED      1          1        0        0       0       0
--------------------------------------------------------------------------------
VERTICES: 02/02  [==========================>>] 100%  ELAPSED TIME: 8.24 s
--------------------------------------------------------------------------------
OK
5000
Time taken: 19.838 seconds, Fetched: 1 row(s)
hive> quit;
[it1@sandbox input]$ mysql
Welcome to the MySQL monitor.  Commands end with ; or \g.
Your MySQL connection id is 1625
Server version: 5.1.73 Source distribution

Copyright (c) 2000, 2013, Oracle and/or its affiliates. All rights reserved.

Oracle is a registered trademark of Oracle Corporation and/or its
affiliates. Other names may be trademarks of their respective
owners.

Type 'help;' or '\h' for help.
Type '\c' to clear the current input statement.

mysql> use test;
Reading table information for completion of table and column names
You can turn off this feature to get a quicker startup with -A

Database changed
mysql> load data local infile '/home/it1/input/in1' into table st1 FIELDS TERMINATED BY ',';
Query OK, 900 rows affected, 900 warnings (0.00 sec)
Records: 900  Deleted: 0  Skipped: 0  Warnings: 900

mysql> select count(*) from st1;
+----------+
| count(*) |
+----------+
|     5900 |
+----------+
1 row in set (0.00 sec)

mysql> quit;
Bye
[it1@sandbox input]$ sqoop import --connect jdbc:mysql://localhost:3306/test --driver com.mysql.jdbc.Driver --username it1 --password hadoop --table st1 --target-dir sqin5 -m 1 --incremental append -check-column id --last-value 5000
Warning: /usr/hdp/2.3.2.0-2950/accumulo does not exist! Accumulo imports will fail.
Please set $ACCUMULO_HOME to the root of your Accumulo installation.
16/01/25 13:38:17 INFO sqoop.Sqoop: Running Sqoop version: 1.4.6.2.3.2.0-2950
16/01/25 13:38:17 WARN tool.BaseSqoopTool: Setting your password on the command-line is insecure. Consider using -P instead.
16/01/25 13:38:17 WARN sqoop.ConnFactory: Parameter --driver is set to an explicit driver however appropriate connection manager is not being set (via --connection-manager). Sqoop is going to fall back to org.apache.sqoop.manager.GenericJdbcManager. Please specify explicitly which connection manager should be used next time.
16/01/25 13:38:17 INFO manager.SqlManager: Using default fetchSize of 1000
16/01/25 13:38:17 INFO tool.CodeGenTool: Beginning code generation
SLF4J: Class path contains multiple SLF4J bindings.
SLF4J: Found binding in [jar:file:/usr/hdp/2.3.2.0-2950/hadoop/lib/slf4j-log4j12-1.7.10.jar!/org/slf4j/impl/StaticLoggerBinder.class]
SLF4J: Found binding in [jar:file:/usr/hdp/2.3.2.0-2950/zookeeper/lib/slf4j-log4j12-1.6.1.jar!/org/slf4j/impl/StaticLoggerBinder.class]
SLF4J: See http://www.slf4j.org/codes.html#multiple_bindings for an explanation.
SLF4J: Actual binding is of type [org.slf4j.impl.Log4jLoggerFactory]
16/01/25 13:38:18 INFO manager.SqlManager: Executing SQL statement: SELECT t.* FROM st1 AS t WHERE 1=0
16/01/25 13:38:18 INFO manager.SqlManager: Executing SQL statement: SELECT t.* FROM st1 AS t WHERE 1=0
16/01/25 13:38:18 INFO orm.CompilationManager: HADOOP_MAPRED_HOME is /usr/hdp/2.3.2.0-2950/hadoop-mapreduce
Note: /tmp/sqoop-it1/compile/29970e827f2a30d416a8a879fe7c67e9/st1.java uses or overrides a deprecated API.
Note: Recompile with -Xlint:deprecation for details.
16/01/25 13:38:21 INFO orm.CompilationManager: Writing jar file: /tmp/sqoop-it1/compile/29970e827f2a30d416a8a879fe7c67e9/st1.jar
16/01/25 13:38:23 INFO tool.ImportTool: Maximal id query for free form incremental import: SELECT MAX(id) FROM st1
16/01/25 13:38:23 INFO tool.ImportTool: Incremental import based on column id
16/01/25 13:38:23 INFO tool.ImportTool: Lower bound value: 5000
16/01/25 13:38:23 INFO tool.ImportTool: Upper bound value: 5999
16/01/25 13:38:23 INFO mapreduce.ImportJobBase: Beginning import of st1
16/01/25 13:38:23 INFO manager.SqlManager: Executing SQL statement: SELECT t.* FROM st1 AS t WHERE 1=0
16/01/25 13:38:24 INFO impl.TimelineClientImpl: Timeline service address: http://sandbox.hortonworks.com:8188/ws/v1/timeline/
16/01/25 13:38:24 INFO client.RMProxy: Connecting to ResourceManager at sandbox.hortonworks.com/10.0.2.15:8050
16/01/25 13:38:27 INFO db.DBInputFormat: Using read commited transaction isolation
16/01/25 13:38:27 INFO mapreduce.JobSubmitter: number of splits:1
16/01/25 13:38:28 INFO mapreduce.JobSubmitter: Submitting tokens for job: job_1453391437654_0055
16/01/25 13:38:28 INFO impl.YarnClientImpl: Submitted application application_1453391437654_0055
16/01/25 13:38:28 INFO mapreduce.Job: The url to track the job: http://sandbox.hortonworks.com:8088/proxy/application_1453391437654_0055/
16/01/25 13:38:28 INFO mapreduce.Job: Running job: job_1453391437654_0055
16/01/25 13:38:39 INFO mapreduce.Job: Job job_1453391437654_0055 running in uber mode : false
16/01/25 13:38:39 INFO mapreduce.Job:  map 0% reduce 0%
16/01/25 13:38:47 INFO mapreduce.Job:  map 100% reduce 0%
16/01/25 13:38:47 INFO mapreduce.Job: Job job_1453391437654_0055 completed successfully
16/01/25 13:38:47 INFO mapreduce.Job: Counters: 30
        File System Counters
                FILE: Number of bytes read=0
                FILE: Number of bytes written=145489
                FILE: Number of read operations=0
                FILE: Number of large read operations=0
                FILE: Number of write operations=0
                HDFS: Number of bytes read=87
                HDFS: Number of bytes written=27900
                HDFS: Number of read operations=4
                HDFS: Number of large read operations=0
                HDFS: Number of write operations=2
        Job Counters
                Launched map tasks=1
                Other local map tasks=1
                Total time spent by all maps in occupied slots (ms)=4529
                Total time spent by all reduces in occupied slots (ms)=0
                Total time spent by all map tasks (ms)=4529
                Total vcore-seconds taken by all map tasks=4529
                Total megabyte-seconds taken by all map tasks=1132250
        Map-Reduce Framework
                Map input records=900
                Map output records=900
                Input split bytes=87
                Spilled Records=0
                Failed Shuffles=0
                Merged Map outputs=0
                GC time elapsed (ms)=55
                CPU time spent (ms)=1100
                Physical memory (bytes) snapshot=148291584
                Virtual memory (bytes) snapshot=825331712
                Total committed heap usage (bytes)=133693440
        File Input Format Counters
                Bytes Read=0
        File Output Format Counters
                Bytes Written=27900
16/01/25 13:38:47 INFO mapreduce.ImportJobBase: Transferred 27.2461 KB in 23.5963 seconds (1.1547 KB/sec)
16/01/25 13:38:47 INFO mapreduce.ImportJobBase: Retrieved 900 records.
16/01/25 13:38:47 INFO util.AppendUtils: Appending to directory sqin5
16/01/25 13:38:47 INFO util.AppendUtils: Using found partition 1
16/01/25 13:38:47 INFO tool.ImportTool: Incremental import complete! To run another incremental import of all data following this import, supply the following arguments:
16/01/25 13:38:47 INFO tool.ImportTool:  --incremental append
16/01/25 13:38:47 INFO tool.ImportTool:   --check-column id
16/01/25 13:38:47 INFO tool.ImportTool:   --last-value 5999
16/01/25 13:38:47 INFO tool.ImportTool: (Consider saving this with 'sqoop job --create')
[it1@sandbox input]$ hive
SLF4J: Class path contains multiple SLF4J bindings.
SLF4J: Found binding in [jar:file:/usr/hdp/2.3.2.0-2950/hadoop/lib/slf4j-log4j12-1.7.10.jar!/org/slf4j/impl/StaticLoggerBinder.class]
SLF4J: Found binding in [jar:file:/usr/hdp/2.3.2.0-2950/spark/lib/spark-assembly-1.4.1.2.3.2.0-2950-hadoop2.7.1.2.3.2.0-2950.jar!/org/slf4j/impl/StaticLoggerBinder.class]
SLF4J: See http://www.slf4j.org/codes.html#multiple_bindings for an explanation.
SLF4J: Actual binding is of type [org.slf4j.impl.Log4jLoggerFactory]
WARNING: Use "yarn jar" to launch YARN applications.
SLF4J: Class path contains multiple SLF4J bindings.
SLF4J: Found binding in [jar:file:/usr/hdp/2.3.2.0-2950/hadoop/lib/slf4j-log4j12-1.7.10.jar!/org/slf4j/impl/StaticLoggerBinder.class]
SLF4J: Found binding in [jar:file:/usr/hdp/2.3.2.0-2950/spark/lib/spark-assembly-1.4.1.2.3.2.0-2950-hadoop2.7.1.2.3.2.0-2950.jar!/org/slf4j/impl/StaticLoggerBinder.class]
SLF4J: See http://www.slf4j.org/codes.html#multiple_bindings for an explanation.
SLF4J: Actual binding is of type [org.slf4j.impl.Log4jLoggerFactory]

Logging initialized using configuration in file:/etc/hive/2.3.2.0-2950/0/hive-log4j.properties
hive> select count(*) from h2;
Query ID = it1_20160125133915_2b827500-d43e-46f7-95e2-ccf986cb8d7a
Total jobs = 1
Launching Job 1 out of 1

Status: Running (Executing on YARN cluster with App id application_1453391437654_0056)

--------------------------------------------------------------------------------
        VERTICES      STATUS  TOTAL  COMPLETED  RUNNING  PENDING  FAILED  KILLED
--------------------------------------------------------------------------------
Map 1 ..........     SUCCEEDED      1          1        0        0       0       0
Reducer 2 ......     SUCCEEDED      1          1        0        0       0       0
--------------------------------------------------------------------------------
VERTICES: 02/02  [==========================>>] 100%  ELAPSED TIME: 5.55 s
--------------------------------------------------------------------------------
OK
5900
Time taken: 15.945 seconds, Fetched: 1 row(s)
hive> quit;
[it1@sandbox input]$ hdfs dfs -ls sqin5
Found 2 items
-rw-r--r--   3 it1 IT     154293 2016-01-25 13:36 sqin5/part-m-00000
-rw-r--r--   3 it1 IT      27900 2016-01-25 13:38 sqin5/part-m-00001
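The ImportTool hint at the end of each run ("Consider saving this with 'sqoop job --create'") points at a cleaner workflow than re-typing --last-value by hand: a saved Sqoop job stores the last imported id in the Sqoop metastore and updates it after every run. A sketch of how that could look on this sandbox (the job name inc_st1 is invented here; the other arguments mirror the command used above, with -P replacing the on-command-line password as the earlier warning advises):

```shell
# Hypothetical saved job; requires the same MySQL and cluster as above.
sqoop job --create inc_st1 -- import \
  --connect jdbc:mysql://localhost:3306/test \
  --driver com.mysql.jdbc.Driver \
  --username it1 -P \
  --table st1 --target-dir sqin5 -m 1 \
  --incremental append --check-column id --last-value 5999

# Each later run imports only rows above the stored last-value,
# then records the new maximum automatically:
sqoop job --exec inc_st1
```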
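The numbers in the transcript reconcile, which is a quick way to confirm the append worked: the two imports retrieved 5000 and 900 rows, matching the final count(*) of 5900 in both MySQL and Hive, and the two part-m files add up to the bytes Sqoop reported writing. A small check of that arithmetic (not part of the original session):

```shell
# Cross-check the figures reported in the logs above.
echo "total rows:  $((5000 + 900))"        # matches the final count(*) of 5900
echo "total bytes: $((154293 + 27900))"    # sum of part-m-00000 and part-m-00001
# Sqoop's reported rate for the first import is bytes / 1024 / seconds:
awk 'BEGIN { printf "rate: %.4f KB/sec\n", 154293 / 1024 / 19.2572 }'   # 7.8244, as logged
```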