I want to run two reducers in Sqoop. What is the query used in Sqoop to achieve this?

Re: I want to run two reducers in Sqoop. What is the query used in Sqoop to achieve this?

@Shajahan Anwarbasha

A Sqoop import runs as a map-only job, so no reducer is initialized.

I don't think there is a way to run a reduce phase in a Sqoop job.

If your goal is just to get two files into HDFS, then use

--num-mappers 2

which runs the Sqoop job with two map tasks; each map task writes its own output file.
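For context, here is a minimal sketch of a full import command that uses this option. The JDBC URL, user, table, split column, and target directory are all hypothetical placeholders, and the small `run` helper is only there so the command can be printed for review before it is executed on a host where Sqoop is installed.

```shell
#!/bin/sh
# All connection details below are hypothetical; replace them with your own.
JDBC_URL="jdbc:mysql://dbhost:3306/sales"
DB_USER="etl_user"
TABLE="orders"
SPLIT_COLUMN="order_id"                  # column Sqoop uses to divide rows between mappers
TARGET_DIR="/user/etl/staging/orders"    # HDFS staging directory

# Print the command by default; set DRY_RUN=0 to actually execute it.
DRY_RUN="${DRY_RUN:-1}"
run() {
  if [ "$DRY_RUN" = "1" ]; then
    echo "$*"
  else
    "$@"
  fi
}

# Two map tasks, so the import should leave two part-m-* files in TARGET_DIR.
run sqoop import \
  --connect "$JDBC_URL" \
  --username "$DB_USER" \
  -P \
  --table "$TABLE" \
  --split-by "$SPLIT_COLUMN" \
  --target-dir "$TARGET_DIR" \
  --num-mappers 2
```

The `--split-by` column matters when the table has no primary key, because Sqoop needs a column to partition the rows across the two mappers.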

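If you do need a reduce phase, one option is to import the data into a staging directory with Sqoop and then trigger a Hive job that runs with two reducers to load the final table. You can fix the number of reducers with

set mapred.reduce.tasks=2;

(newer Hadoop releases call this property mapreduce.job.reduces)

(or)

let Hive derive the number of reducers from the input size with

set hive.exec.reducers.bytes.per.reducer=1000000;

where the value is the number of input bytes handled by each reducer, so change it according to your requirements.

(or)

use DISTRIBUTE BY and SORT BY in the INSERT query that loads data from the staging table into the final table, which forces a reduce stage.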
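As a rough sketch of how the staging-to-final load could look, the script below writes the Hive statements to a file and passes it to `hive -f`. The table names `orders_staging` and `orders_final` and the column `customer_id` are hypothetical, and the sketch assumes both tables already exist, have the same columns, and that Hive is using the MapReduce execution engine.

```shell
#!/bin/sh
# Hypothetical location for the generated Hive script.
HQL_FILE="${TMPDIR:-/tmp}/load_orders_final.hql"

# Print the command by default; set DRY_RUN=0 to actually execute it.
DRY_RUN="${DRY_RUN:-1}"
run() {
  if [ "$DRY_RUN" = "1" ]; then
    echo "$*"
  else
    "$@"
  fi
}

# Table and column names are assumptions made for illustration only.
cat > "$HQL_FILE" <<'EOF'
-- Ask for exactly two reduce tasks in this session.
SET mapred.reduce.tasks=2;

-- DISTRIBUTE BY sends rows with the same key to the same reducer;
-- SORT BY orders the rows within each reducer.
INSERT OVERWRITE TABLE orders_final
SELECT *
FROM orders_staging
DISTRIBUTE BY customer_id
SORT BY customer_id;
EOF

run hive -f "$HQL_FILE"
```

Pointing `orders_staging` at the Sqoop target directory (for example as an external table) keeps the import and the reduce-side load as two separate steps.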