sqoop warehouse dir argument with hive overwrite argument

I am using the --warehouse-dir argument to stage data in HDFS before Sqoop loads it into Hive. I run all my Sqoop jobs through Oozie.
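For context, a staged Hive import of this shape might look like the following sketch; the connection string, table name, and paths are placeholders I made up, not details from the original post:

```sh
# Hypothetical example: connection details, credentials, and paths are placeholders.
# --warehouse-dir stages the imported data under /user/etl/staging/<table> in HDFS;
# --hive-import then loads that staged data into the Hive table.
sqoop import \
  --connect jdbc:mysql://db.example.com/sales \
  --username etl_user \
  --password-file /user/etl/.db_password \
  --table orders \
  --warehouse-dir /user/etl/staging \
  --hive-import \
  --hive-overwrite \
  --hive-drop-import-delims \
  -m 4
```

On a retry, the staging directory /user/etl/staging/orders left behind by the failed attempt is what triggers the "output directory already exists" error described below.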

Now, if the task fails for some reason, Oozie reattempts it. The problem is that the warehouse directory created by the previous attempt still exists, so the re-attempt fails with "output directory already exists".
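One common way to make the retry idempotent on the Oozie side is a &lt;prepare&gt; block on the Sqoop action, which deletes the stale staging directory before every attempt. This is a sketch under assumed names and paths, not the poster's actual workflow:

```xml
<!-- Hypothetical Oozie sqoop action; action name, properties, and paths are placeholders. -->
<action name="sqoop-import-orders">
    <sqoop xmlns="uri:oozie:sqoop-action:0.2">
        <job-tracker>${jobTracker}</job-tracker>
        <name-node>${nameNode}</name-node>
        <!-- <prepare> runs before every attempt, including re-attempts,
             so a directory left behind by a failed run is removed first. -->
        <prepare>
            <delete path="${nameNode}/user/etl/staging/orders"/>
        </prepare>
        <command>import --connect ${jdbcUrl} --table orders --warehouse-dir /user/etl/staging --hive-import --hive-drop-import-delims</command>
    </sqoop>
    <ok to="end"/>
    <error to="fail"/>
</action>
```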

I understand I could use the --direct argument to skip the intermediate HDFS staging step, but I also need the --hive-drop-import-delims argument, and that combination is not supported for Hive imports. Any advice would be appreciated; this is important.

1 REPLY

Hey @Simran kaur!

Not sure if you have already solved your problem, but my advice would be to add the --append argument to your Sqoop job.
If I understand your problem correctly, with --append you shouldn't hit the 'output directory already exists' error.
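As a sketch, the suggestion amounts to adding --append so each run appends new files under the warehouse directory instead of failing on an existing one. The connection details and table are placeholders; note also that some Sqoop versions reject --append combined with --hive-import, so check the documentation for your version:

```sh
# Hypothetical example with placeholder connection details.
# --append imports into a temporary directory, then moves the files
# into the existing dataset rather than failing because it exists.
sqoop import \
  --connect jdbc:mysql://db.example.com/sales \
  --table orders \
  --warehouse-dir /user/etl/staging \
  --append \
  -m 4
```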

Hope this helps! 🙂
