
Importing a large amount of data with Sqoop: exchange of opinions


May I ask everyone a few questions?
I am importing data from MySQL into HBase with a Sqoop command.
The table being imported, CustOrders, has 20,000,000 records in total (1739 MB).
The Sqoop import command is as follows:

sqoop import --connect jdbc:mysql://127.0.0.1:3306/ec_website \
--username root \
--password cloudera \
--table CustOrders \
--hbase-table CustOrders \
--column-family CustOrders \
--hbase-row-key 1COID \
-m 1

The import has already taken more than 91 hours.
Is this a normal import speed, and how long should an import of this size take?
What are the recommendations for importing large amounts of data (> 1 GB)?


Specifications:
Virtual machine: Cloudera Quickstart Virtual Machine (running on Oracle VM VirtualBox, download URL: https://www.cloudera.com/downloads/quickstart_vms/5-13.html)
(1) Version: 5.13.0-0
(2) Number of processors: 2
(3) Controller: IDE Controller
(4) Memory space: 11400 MB
(5) Hard drive: 64.00 GB storage, solid state drive
(6) OS: Red Hat, 64-bit
(7) MySQL 5.1.73
(8) HBase 1.2.0-cdh5.13.0
(9) Hadoop 2.6.0-cdh5.13.0
(10) Sqoop 1.4.6-cdh5.13.0


Supplementary notes on importing a large amount of data with Sqoop


As mentioned above, I used Sqoop to import the relational table CustOrders from MySQL into HBase, but several of the results still do not make sense:
(1) With one mapper, more than 1 million records were imported in 7 hours, but with two mappers, only 38,000 records were imported in 7 hours.
(2) With two mappers, 38,000 records were imported in 7 hours, yet only 23,000 records were imported in 10.3 hours.
(3) With one mapper, more than 1 million records were imported after 7 hours, and the count was still only a little over 1 million after 99 hours.

Are there any possible reasons behind this, or does anyone have other opinions?
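For reference, the throughput implied by the figures above can be worked out with a short script. This is only a rough sketch: the helper names `rate` and `eta_hours` are made up for illustration, and "more than 1 million" is treated as exactly 1,000,000, so the one-mapper rates are lower bounds.

```shell
#!/bin/sh
# Rough throughput check for the import runs described above.
# Assumption: "more than 1 million" is taken as exactly 1000000 records.

# rate <records> <hours>: average records per second over the run
rate() {
    awk -v r="$1" -v h="$2" 'BEGIN { printf "%.2f\n", r / (h * 3600) }'
}

# eta_hours <total> <records> <hours>: hours needed to import <total>
# records at the average rate of <records> per <hours>
eta_hours() {
    awk -v t="$1" -v r="$2" -v h="$3" 'BEGIN { printf "%.0f\n", t * h / r }'
}

echo "1 mapper,  7 h:    $(rate 1000000 7) records/s"
echo "2 mappers, 7 h:    $(rate 38000 7) records/s"
echo "2 mappers, 10.3 h: $(rate 23000 10.3) records/s"
echo "1 mapper,  99 h:   $(rate 1000000 99) records/s"
echo "All 20,000,000 records at the 1-mapper 7 h rate: $(eta_hours 20000000 1000000 7) hours"
```

Under these assumptions the first 7 hours with one mapper average roughly 40 records per second, against roughly 1.5 records per second with two mappers, and the whole table would need about 140 hours even if the initial one-mapper rate had been sustained.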