
I want to run two reducers in Sqoop. What is the query used in Sqoop to achieve this?



Re: I want to run two reducer in Sqoop? What is the query used in Sqoop to achieve this?


@Shajahan Anwarbasha

Sqoop runs map-only jobs, so no reducer is initialized in a Sqoop import.

I don't think there is a way to run a reducer phase in a Sqoop job.

If you only want to end up with two files in HDFS, then use

--num-mappers 2 //sqoop job with 2 map tasks, each writing one output file
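
For context, here is a rough sketch of where that flag sits in a full import command. The JDBC URL, user, password file, table, split column, and target directory are all made-up placeholders, not values from the question. The --split-by option is included because Sqoop needs a column to divide the rows between the two mappers when the table has no primary key.

```shell
# Hypothetical values for illustration only; replace with your own.
NUM_MAPPERS=2
TARGET_DIR=/data/staging/orders

if command -v sqoop >/dev/null 2>&1; then
  # Map-only import; each of the 2 map tasks writes one part-m-* file.
  sqoop import \
    --connect "jdbc:mysql://dbhost.example.com/sales" \
    --username etl_user \
    --password-file /user/etl/.db_password \
    --table orders \
    --split-by order_id \
    --target-dir "$TARGET_DIR" \
    --num-mappers "$NUM_MAPPERS"
else
  echo "sqoop not found on PATH; skipping import" >&2
fi
```

With two mappers the target directory should end up with two part-m-* files, assuming the split column spreads the rows across both tasks.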

If you really do need a reduce phase, one possible way is to import the data into a staging directory with Sqoop and then trigger a Hive job that inserts the data from the staging table into the final table. You can control the reducers of that Hive job in any of these ways:

set mapred.reduce.tasks=2; //fixes the number of reducers at 2

(or)

set hive.exec.reducers.bytes.per.reducer=1000000; //input bytes handled per reducer; change the value according to your requirements

(or)

use distribute by and sort by in the insert query that loads the final table, which forces a reduce phase.
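
A minimal sketch of that staging-to-final step, run through the Hive CLI, might look like the following. The table names orders_staging and orders_final and the column customer_id are hypothetical, and it assumes both tables already exist with matching schemas. It combines the reducer setting with distribute by and sort by, although either one on its own may be enough for your case.

```shell
# Hypothetical table and column names; adjust to your schema.
HQL='
SET mapred.reduce.tasks=2;
INSERT OVERWRITE TABLE orders_final
SELECT *
FROM orders_staging
DISTRIBUTE BY customer_id
SORT BY customer_id;
'

if command -v hive >/dev/null 2>&1; then
  # DISTRIBUTE BY sends rows with the same key to the same reducer;
  # SORT BY orders the rows within each reducer.
  hive -e "$HQL"
else
  echo "hive not found on PATH; statements that would run:" >&2
  echo "$HQL"
fi
```

On newer Hadoop versions mapred.reduce.tasks is a deprecated alias for mapreduce.job.reduces, so either name should set the reducer count.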


