Created 03-16-2017 08:39 PM
I am running a sqoop import job from Oracle into a Hive table and it is failing with this message:
Error: org.apache.hadoop.hdfs.protocol.DSQuotaExceededException: The DiskSpace quota of /user/xxxxxx is exceeded: quota = 1649267441664 B = 1.50 TB but diskspace consumed = 1649444058918 B = 1.50 TB
I would like to redirect the output from the job to a directory with a larger space quota, but I haven't been able to figure out the option to do that. Can someone help?
Thanks.
Created 03-16-2017 10:03 PM
I have the answer now. Use the --warehouse-dir option with a directory location for sqoop import.
Example in sqoop script:
--warehouse-dir
/data/groups/yourgroup
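For context, here is a minimal sketch of what the full command line might look like; the JDBC URL, username, and table name are placeholders, not values from this thread:
# the imported table lands under /data/groups/yourgroup/MYTABLE (a subdirectory named after the table)
sqoop import \
  --connect jdbc:oracle:thin:@//dbhost:1521/ORCL \
  --username myuser -P \
  --table MYTABLE \
  --warehouse-dir /data/groups/yourgroup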
Created 03-16-2017 10:05 PM
And be sure the --warehouse-dir directory you select has a large enough space quota in Hadoop.
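A quick way to verify that before re-running the job (the path is just the example directory from the previous post):
hdfs dfs -count -q /data/groups/yourgroup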
Created 03-17-2017 04:00 AM
I don't believe my previous conclusion was correct. I think what I need to do is set the Hive configuration parameter hive.exec.scratchdir and have the sqoop script pick up the config value. However, I do not see a way to do this in the sqoop script. Can anyone help?
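In case it's useful, one way to inspect (and, per session, try to override) that property is through the Hive CLI directly; whether the Hive step that sqoop kicks off actually honors the override depends on how Hive is invoked on the cluster, and the scratch path below is only a placeholder:
# Print the scratch directory Hive is currently using
hive -e 'set hive.exec.scratchdir;'
# Attempt a per-session override (placeholder path)
hive --hiveconf hive.exec.scratchdir=/data/groups/yourgroup/scratch -e 'set hive.exec.scratchdir;'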
Created 03-17-2017 05:09 AM
Create a folder on the HDFS filesystem and set the space quota on that folder; a sample is shown below:
[hdfs@XXXX ~]$ hdfs dfs -count -q /user/ruser/quota
        none    inf    none    inf    1    0    0    /user/ruser/quota
[hdfs@XXXX ~]$ hdfs dfsadmin -setSpaceQuota 40g /user/ruser/quota
[hdfs@XXXX ~]$ hdfs dfs -count -q /user/ruser/quota
        none    inf    42949672960    42949672960    1    0    0    /user/ruser/quota
Created 03-17-2017 06:26 AM
Can you try the '--target-dir' option? This will import the files into the /dest directory:
sqoop import --connect <connect-str> --table <tableName> --target-dir /dest
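Worth noting, as I read the Sqoop documentation: --target-dir names the exact HDFS directory the files are written to, while --warehouse-dir is a parent directory under which Sqoop creates a per-table subdirectory, and the two options cannot be used together in the same import.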