Nir
New Contributor
Posts: 5
Registered: ‎01-22-2016

Sqoop jobs not progressing while running 4 sqoop jobs in parallel

Hi,

We have the Cloudera QuickStart VM set up on Oracle VirtualBox. We are performing a sort of stress test: how fast can data be ingested when running several parallel imports? We have many clients, so during the POC we would like to check how much load we can put on parallel data ingestion.

While I was importing data from a single SQL Server instance with a single sqoop job, it worked fine. Now, while trying to import data from 4 SQL Server databases on two SQL Server instances (running on two different machines) using 4 separate sqoop import commands, no job finishes. All of them show 0% map, 0% reduce.
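For reference, each of the parallel imports was launched with a command along these lines. This is an illustrative sketch: the host, database, credentials, and table names are placeholders, not the actual values used.

```shell
# Hypothetical example of one of the four parallel Sqoop imports.
# Host, port, database, user, and table names are placeholders.
sqoop import \
  --connect "jdbc:sqlserver://sqlhost1:1433;databaseName=SalesDB" \
  --username sqoop_user -P \
  --table Orders \
  --hive-import \
  --num-mappers 4 &
```

The trailing `&` backgrounds the command so the four imports can run concurrently from the same shell.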

 

Our Oracle VM VirtualBox setup has the following resources:

1. RAM : 24 GB

2. Disk : 256 GB

3. Processors assigned : 6 (Hyper-threading not enabled)

 

What could be the potential reasons for the data ingestion failure? As you can see, the jobs started running on Feb 12, 2016, and today (Feb 15, 2016) they are still running. The data is around 5 GB, which has to be imported through Hive sqoop jobs.

 

sqoop_stuck_jobs.png

Do I need to assign more resources to the VM?

Moving to a real physical cluster is not an option for us yet.

 

Thanks

 

Posts: 1,826
Kudos: 406
Solutions: 292
Registered: ‎07-31-2013

Re: Sqoop jobs not progressing while running 4 sqoop jobs in parallel

A hang at 0% usually indicates that your MR2 ApplicationMaster is running but its task containers are not being allocated. On single-node/pseudo-distributed installs, the most common cause is a badly configured yarn.nodemanager.resource.memory-mb property on the NodeManagers. What is its current value, and have you tried increasing it?
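As a sketch, on a single-node VM you would set this in yarn-site.xml. The values below are illustrative for a 24 GB VM, not prescribed settings; tune them to leave headroom for the OS and other services.

```xml
<!-- yarn-site.xml: illustrative values for a 24 GB single-node VM -->
<property>
  <name>yarn.nodemanager.resource.memory-mb</name>
  <value>16384</value> <!-- total memory YARN may hand out as containers on this node -->
</property>
<property>
  <name>yarn.scheduler.maximum-allocation-mb</name>
  <value>4096</value> <!-- largest single container request the scheduler will grant -->
</property>
```

If this value is smaller than the memory a single AM plus its map containers request, jobs will sit at 0% waiting for containers that can never be scheduled, which matches the symptom described above.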