Member since 12-03-2017
3 Posts
0 Kudos Received
0 Solutions
12-03-2017 04:01 AM
@Shu Hi, I added the --merge-key option to get past the error below. I also can't use --last-value, because I don't know the value.

ERROR tool.ImportTool: Error during import: --merge-key or --append is required when using --incremental lastmodified and the output directory exists.
12-03-2017 01:14 AM
Mode: lastmodified

mysql> describe orders;
+-------------------+-------------+------+-----+---------+----------------+
| Field             | Type        | Null | Key | Default | Extra          |
+-------------------+-------------+------+-----+---------+----------------+
| order_id          | int(11)     | NO   | PRI | NULL    | auto_increment |
| order_date        | datetime    | NO   |     | NULL    |                |
| order_customer_id | int(11)     | NO   |     | NULL    |                |
| order_status      | varchar(45) | NO   |     | NULL    |                |
+-------------------+-------------+------+-----+---------+----------------+
4 rows in set (0.00 sec)

Import the orders table into HDFS:

sqoop import \
--connect jdbc:mysql://quickstart:3306/retail_db \
--username retail_dba \
--password cloudera \
--table orders \
--split-by order_id \
--target-dir /user/sqoop/orders \
--as-textfile

After import:

[cloudera@quickstart lib]$ hadoop fs -ls -R /user/sqoop/orders
-rw-r--r-- 1 cloudera supergroup 0 2017-12-02 16:01 /user/sqoop/orders/_SUCCESS
-rw-r--r-- 1 cloudera supergroup 741597 2017-12-02 16:01 /user/sqoop/orders/part-m-00000
-rw-r--r-- 1 cloudera supergroup 753022 2017-12-02 16:01 /user/sqoop/orders/part-m-00001
-rw-r--r-- 1 cloudera supergroup 752368 2017-12-02 16:01 /user/sqoop/orders/part-m-00002
-rw-r--r-- 1 cloudera supergroup 752940 2017-12-02 16:01 /user/sqoop/orders/part-m-00003

Update order data:

mysql> select * from orders where order_id=10;
+----------+---------------------+-------------------+-----------------+
| order_id | order_date          | order_customer_id | order_status    |
+----------+---------------------+-------------------+-----------------+
|       10 | 2013-07-25 00:00:00 |              5648 | PENDING_PAYMENT |
+----------+---------------------+-------------------+-----------------+
1 row in set (0.00 sec)

mysql> update orders set order_status='CLOSED', order_date=now() where order_id=10;
Query OK, 1 row affected (0.01 sec)
Rows matched: 1  Changed: 1  Warnings: 0

mysql> select * from orders where order_id=10;
+----------+---------------------+-------------------+--------------+
| order_id | order_date          | order_customer_id | order_status |
+----------+---------------------+-------------------+--------------+
|       10 | 2017-12-02 16:19:23 |              5648 | CLOSED       |
+----------+---------------------+-------------------+--------------+
1 row in set (0.00 sec)

Import additional data:

sqoop import \
--connect jdbc:mysql://quickstart:3306/retail_db \
--username retail_dba \
--password cloudera \
--table orders \
--split-by order_id \
--check-column order_date \
--merge-key order_id \
--incremental lastmodified \
--target-dir /user/sqoop/orders \
--as-textfile

Output:

[cloudera@quickstart lib]$ hadoop fs -ls -R /user/sqoop/orders
-rw-r--r-- 1 cloudera cloudera 0 2017-12-02 16:07 /user/sqoop/orders/_SUCCESS
-rw-r--r-- 1 cloudera cloudera 2999918 2017-12-02 16:07 /user/sqoop/orders/part-r-00000

Question: The old files in the HDFS directory (/user/sqoop/orders/part-m-00000 through part-m-00003) were deleted. If this is an incremental import, why is Sqoop deleting the old files?
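
A possible explanation, offered as a sketch rather than a confirmed answer: when --merge-key is given and the target directory already exists, Sqoop appears to import the new rows to a temporary location, run a merge job that keeps the newest record per key, and then swap the merged result in for the old directory. The part-r prefix in the output listing is consistent with a file written by a reducer. The shell below imitates that flow with awk on a few rows. The rows for order_id 9 and 11, the temporary paths, and the variable names are made up for illustration, and this is not Sqoop's actual code.

```shell
#!/usr/bin/env bash
set -euo pipefail

# Scratch area standing in for HDFS (hypothetical paths, local disk only).
work=$(mktemp -d)
trap 'rm -rf "$work"' EXIT

# Existing target directory: map-side part files from the first import.
# Rows 9 and 11 are invented sample data; row 10 matches the post.
mkdir -p "$work/orders"
printf '%s\n' '9,2013-07-25 00:00:00,5657,PENDING_PAYMENT' \
              '10,2013-07-25 00:00:00,5648,PENDING_PAYMENT' > "$work/orders/part-m-00000"
printf '%s\n' '11,2013-07-25 00:00:00,918,PAYMENT_REVIEW'   > "$work/orders/part-m-00001"

# Newly imported rows, written somewhere other than the target directory.
mkdir -p "$work/delta"
printf '%s\n' '10,2017-12-02 16:19:23,5648,CLOSED' > "$work/delta/part-m-00000"

# Merge step: read old rows first and new rows last, so that for each
# merge key (column 1, order_id) the most recently read record wins.
mkdir -p "$work/merged"
cat "$work"/orders/part-m-* "$work"/delta/part-m-* \
  | awk -F',' '{ latest[$1] = $0 } END { for (k in latest) print latest[k] }' \
  | sort -t',' -k1,1n > "$work/merged/part-r-00000"

# Swap step: the merged output replaces the old directory wholesale,
# which is why the original part-m-* files are no longer listed.
rm -rf "$work/orders"
mv "$work/merged" "$work/orders"

ls "$work/orders"
cat "$work/orders/part-r-00000"
```

In this sketch the old files disappear but no rows do: the single output file holds every order, with order 10 in its updated form. The byte counts in the post point the same way. The four original files add up to 741597 + 753022 + 752368 + 752940 = 2999927 bytes, and the new part-r-00000 is 2999918 bytes. The 9-byte difference matches PENDING_PAYMENT (15 characters) becoming CLOSED (6 characters) in one row.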
Labels: Apache Sqoop