output directory already exists error in sqoop

Expert Contributor

This is what my typical sqoop job looks like:

sqoop job -Dmapred.reduce.tasks=3 --meta-connect jdbc:hsqldb:hsql://IP:16000/sqoop --create job_name -- import --driver com.mysql.jdbc.Driver --connect 'jdbc:mysql://ip2/erp?zeroDateTimeBehavior=convertToNull&serverTimezone=IST' --username username --password 'PASS' --table orders --merge-key order_num --split-by order_num --hive-import --hive-overwrite --hive-database Erp --hive-drop-import-delims --null-string '\\N' --null-non-string '\\N' --fields-terminated-by '\001' --input-null-string '\\N' --input-null-non-string '\\N' --input-fields-terminated-by '\001' --m 12

This is how I execute my jobs in Oozie:

job --meta-connect jdbc:hsqldb:hsql://ip:16000/sqoop --exec JOB_NAME -- --warehouse-dir folder/Snapshots/${DATE}

I recently started getting the error "output directory already exists", no matter what timestamp I pass in the $DATE variable. This is probably caused by a server process restarting. Yesterday I could see the NodeManager restarting over and over, but that is not the case today, and the error still occurs. It shows up randomly on any Sqoop job run through Oozie.

I add --warehouse-dir folder/Snapshots/${DATE} when executing the job precisely so that I never get "output directory already exists", but the error started appearing yesterday out of nowhere.
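
To check whether the directory really exists before a run, something like the following could help. It is only a sketch: the helper names snapshot_dir, table_dir, and dir_exists and the per-attempt suffix are hypothetical, not part of Sqoop or Oozie. It relies on Sqoop placing the imported files under <warehouse-dir>/<table> and on hdfs dfs -test -d, which returns 0 when the path is a directory.

```shell
#!/bin/sh
# Hypothetical helpers for inspecting the import directory of one run.

# Build the --warehouse-dir path: <base>/<stamp>_<attempt>.
# The attempt suffix is an assumption added here so that a retried
# action would not reuse the path of an earlier attempt.
snapshot_dir() {
    printf '%s/%s_%s\n' "$1" "$2" "$3"
}

# Sqoop writes the table data to <warehouse-dir>/<table>,
# so this is the directory that must not exist yet.
table_dir() {
    printf '%s/%s\n' "$1" "$2"
}

# Returns 0 if the path already exists as a directory in HDFS.
# Needs the hdfs client on PATH; nothing is called until this is invoked.
dir_exists() {
    hdfs dfs -test -d "$1"
}
```

For example, running dir_exists "$(table_dir "$(snapshot_dir folder/Snapshots "$DATE" 1)" orders)" right before the job would show whether an earlier attempt with the same $DATE left the directory behind.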

Currently I do not see any alerts about services acting up; however, the NameNode pause duration is concerning at regular intervals. How do I fix this?


Expert Contributor

Maybe you can use the following option to overwrite: