Set up static sqoop jarlib folder similar to oozie

Sqoop 1 has a nice option called --skip-dist-cache that stops Sqoop from copying its dependencies into the distributed cache every time it launches a MapReduce2 job. By default the application jar is always copied, and a "libjars" sub-directory is created for those dependencies (hundreds of MB of files, which slows down every execution).

As far as I can tell the jar dependencies are linked here:

 /opt/cloudera/parcels/CDH/lib/sqoop/lib/

All nodes already have this path thanks to Cloudera Manager (CM), but it would take little effort to push the jars to a fixed HDFS folder once per upgrade for use with the distributed cache. Oozie apparently does this very thing, but how others can do the same doesn't seem to be documented.
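
For reference, the one-time push after an upgrade might look roughly like the sketch below. The local path is the one listed above; the HDFS target /user/sqoop/share/lib, the DRY_RUN switch, and the function names are hypothetical and only for illustration. It only stages the jars in HDFS. It does not make Sqoop read its distributed cache from there, which is the part I'm asking about.

```shell
#!/usr/bin/env bash
# Sketch: stage Sqoop's dependency jars in a fixed HDFS folder after an upgrade.
# SQOOP_LIB_DIR is the parcel path from this post.
# HDFS_LIB_DIR is a hypothetical target folder, not a Sqoop or CDH default.
SQOOP_LIB_DIR="${SQOOP_LIB_DIR:-/opt/cloudera/parcels/CDH/lib/sqoop/lib}"
HDFS_LIB_DIR="${HDFS_LIB_DIR:-/user/sqoop/share/lib}"

# Run a command, or only print it when DRY_RUN=1 (hypothetical helper).
run() {
  if [ "${DRY_RUN:-0}" = "1" ]; then
    echo "$*"
  else
    "$@"
  fi
}

stage_sqoop_jars() {
  # Create the target folder if it does not exist yet.
  run hdfs dfs -mkdir -p "$HDFS_LIB_DIR" || return 1
  # Upload all jars, overwriting same-named files from the previous version.
  run hdfs dfs -put -f "$SQOOP_LIB_DIR"/*.jar "$HDFS_LIB_DIR"/ || return 1
}
```

Running `DRY_RUN=1 stage_sqoop_jars` after sourcing the script prints the two hdfs commands without touching the cluster; dropping DRY_RUN performs the upload. Note that -put -f overwrites files with the same name but does not delete jars that were removed in the new version.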

Does anyone know how to set a local or HDFS path for Sqoop's MR distributed cache? Maybe it is an unexposed setting in the MR2/YARN job that Sqoop creates on the destination side.
