Sqoop jobs not progressing while running 4 sqoop jobs in parallel

New Contributor

Hi,

We have a Cloudera QuickStart setup on an Oracle VirtualBox VM. We are performing a kind of stress test: how fast can data be ingested when running 5 parallel imports? We have many clients, so during the POC we would like to check how much parallel ingestion load we can put on the setup.

While I was importing data from a single SQL Server instance with a single Sqoop job, it worked fine. Now that I am trying to import data from 4 SQL Server databases on two SQL Server instances (running on two different machines) using 4 different sqoop import commands, none of the jobs finishes. All of them show 0% map, 0% reduce.
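For reference, each of the four parallel imports is invoked roughly like the sketch below; the connection string, credentials, table, and Hive table names are placeholders, not the actual values from our environment:

```shell
# One of the four parallel imports (all placeholders are illustrative).
sqoop import \
  --connect "jdbc:sqlserver://<sqlserver-host>:1433;databaseName=<db>" \
  --username <user> \
  --password-file /user/cloudera/<db>.password \
  --table <table> \
  --num-mappers 4 \
  --hive-import \
  --hive-table <hive_table>
```

All four commands are launched at roughly the same time from separate shells.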

 

We have the Oracle VirtualBox VM set up with the following resources:

1. RAM: 24 GB

2. Disk: 256 GB

3. Processors assigned: 6 (hyperthreading not enabled)

 

What could be the potential reasons for the data ingestion stalling? As you can see in the attached screenshot, the jobs started running on Feb 12, 2016, and today (Feb 15, 2016) they are still running. The data is around 5 GB, which has to be imported through Hive Sqoop jobs.

 

[Attachment: sqoop_stuck_jobs.png]

Do I need to assign more resources to the VM?

Moving to a real physical cluster is not an option for us yet.

 

Thanks

 

1 REPLY

Re: Sqoop jobs not progressing while running 4 sqoop jobs in parallel

Master Guru
A hang at 0% usually indicates that your MR2 ApplicationMaster is running but its containers are not being allocated. On single-node/pseudo-distributed installs, the most common cause is a badly configured yarn.nodemanager.resource.memory-mb on the NodeManager: if the value is too low, YARN cannot satisfy the container requests, and the jobs wait indefinitely. What is this property currently set to, and have you tried increasing it?
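As a sketch of what to check (the exact numbers are assumptions for a 24 GB VM, leaving headroom for the OS, CM agents, and other daemons; tune them to your setup), the relevant properties in yarn-site.xml look like this:

```xml
<!-- yarn-site.xml (illustrative values, not a recommendation) -->

<!-- Total memory this NodeManager offers to containers.
     20480 MB assumes a 24 GB VM with ~4 GB reserved for everything else. -->
<property>
  <name>yarn.nodemanager.resource.memory-mb</name>
  <value>20480</value>
</property>

<!-- Largest single container request the scheduler will grant;
     must be <= the NodeManager total above, or requests can never be satisfied. -->
<property>
  <name>yarn.scheduler.maximum-allocation-mb</name>
  <value>4096</value>
</property>
```

If four jobs each request containers whose sizes sum to more than the NodeManager total, the later requests simply queue, which matches the 0%/0% symptom you describe.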