
sqoop import - how to specify directory for work files?

I am running a sqoop import job from Oracle into a Hive table and I am receiving this message:

Error: org.apache.hadoop.hdfs.protocol.DSQuotaExceededException: The DiskSpace quota of /user/xxxxxx is exceeded: quota = 1649267441664 B = 1.50 TB but diskspace consumed = 1649444058918 B = 1.50 TB

I would like to redirect the job's output to a directory with a larger space quota, but I haven't been able to figure out the option to do that. Can someone help?



I have the answer now. Use the --warehouse-dir option on sqoop import to specify a directory location.

Example in sqoop script:
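A minimal sketch of what that can look like (the connection string, credentials, table name, and staging path below are placeholders, not values from the original post):

```shell
# Hypothetical values throughout; only --warehouse-dir is the point here.
sqoop import \
  --connect jdbc:oracle:thin:@//dbhost:1521/ORCL \
  --username myuser \
  --table MYSCHEMA.MYTABLE \
  --hive-import \
  --warehouse-dir /bigquota/sqoop-staging
```

--warehouse-dir sets the parent HDFS directory under which Sqoop creates a per-table subdirectory for its intermediate files, so pointing it at a path with a larger quota moves the staging data out of /user/xxxxxx.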



And be sure the --warehouse-dir directory you select has a large enough space quota in HDFS.

I don't believe my previous conclusion was correct. I think what I need to do is set the Hive configuration parameter hive.exec.scratchdir and have the sqoop script pick up the config value. However, I do not see a way to do this in the sqoop script. Can anyone help?
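Sqoop has no dedicated flag for Hive's scratch directory. One thing worth trying (an assumption, not verified across Sqoop versions) is passing it as a generic Hadoop property; generic -D arguments must appear immediately after the tool name, before any tool-specific options:

```shell
# Hypothetical: /bigquota/tmp is a placeholder path with a larger quota.
# Generic -D properties must precede tool-specific options like --connect.
sqoop import \
  -D hive.exec.scratchdir=/bigquota/tmp \
  --connect jdbc:oracle:thin:@//dbhost:1521/ORCL \
  --table MYSCHEMA.MYTABLE \
  --hive-import
```

If the embedded Hive session does not honor the property passed this way, setting hive.exec.scratchdir in hive-site.xml on the node running Sqoop is the more reliable route.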

Expert Contributor
@Carol Elliott

Create a folder on the HDFS filesystem and set a space quota on that folder; a sample is shown below:

[hdfs@XXXX ~]$ hdfs dfs -count -q /user/ruser/quota
        none    inf    none           inf            1    0    0    /user/ruser/quota
[hdfs@XXXX ~]$ hdfs dfsadmin -setSpaceQuota 40g /user/ruser/quota
[hdfs@XXXX ~]$ hdfs dfs -count -q /user/ruser/quota
        none    inf    42949672960    42949672960    1    0    0    /user/ruser/quota

Expert Contributor

@Carol Elliott

Can you try the --target-dir option? This will import the files into the /dest directory:

sqoop import --connect <connect-str> --table <tableName> --target-dir /dest