I am running a Sqoop import job from Oracle into a Hive table and I am receiving the following message:

Error: org.apache.hadoop.hdfs.protocol.DSQuotaExceededException: The DiskSpace quota of /user/xxxxxx is exceeded: quota = 1649267441664 B = 1.50 TB but diskspace consumed = 1649444058918 B = 1.50 TB
I would like to redirect the output from the job to a directory with a larger space quota, but I haven't been able to figure out the option to do that. Can someone help me figure out how to do that?
I have the answer now: use the --warehouse-dir option with a directory location for sqoop import.
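A minimal sketch of such an invocation, assuming a generic Oracle JDBC connection; the host, table name, and target path here are hypothetical placeholders, not values from the original job:

```shell
# Hypothetical example: import an Oracle table, writing the intermediate
# HDFS output under a parent directory with a larger space quota.
# --warehouse-dir sets the parent HDFS directory; Sqoop creates a
# subdirectory named after the table beneath it.
sqoop import \
  --connect jdbc:oracle:thin:@//dbhost:1521/ORCL \
  --username myuser \
  --password-file /user/myuser/.password \
  --table MYSCHEMA.MYTABLE \
  --hive-import \
  --hive-table mydb.mytable \
  --warehouse-dir /data/staging/sqoop
```

With --warehouse-dir, the import data lands under /data/staging/sqoop/MYTABLE instead of the default location under the user's home directory, so the home-directory quota is not consumed.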
I don't believe my previous conclusion was correct. I think what I actually need to do is set the Hive configuration parameter hive.exec.scratchdir and have the sqoop script pick up that config value. However, I do not see a way to do this in the sqoop script. Can anyone help?
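Sqoop does accept generic Hadoop -D property arguments immediately after the tool name, so one approach worth trying is passing the scratch directory that way. Note this is an unverified sketch: I am not certain the embedded Hive invocation honors a property passed through Sqoop's generic arguments, and the path shown is hypothetical.

```shell
# Unverified sketch: generic -D options must come directly after "import",
# before any tool-specific options.
sqoop import \
  -Dhive.exec.scratchdir=/data/staging/hive-scratch \
  --connect jdbc:oracle:thin:@//dbhost:1521/ORCL \
  --username myuser \
  --password-file /user/myuser/.password \
  --table MYSCHEMA.MYTABLE \
  --hive-import \
  --hive-table mydb.mytable
```

If that does not propagate to Hive, setting hive.exec.scratchdir in hive-site.xml on the node running the job should.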
Create a folder on the HDFS filesystem and set a space quota on that folder; a sample is shown below:
[hdfs@XXXX ~]$ hdfs dfs -count -q /user/ruser/quota
        none    inf    none    inf    1    0    0    /user/ruser/quota
[hdfs@XXXX ~]$ hdfs dfsadmin -setSpaceQuota 40g /user/ruser/quota
[hdfs@XXXX ~]$ hdfs dfs -count -q /user/ruser/quota
        none    inf    42949672960    42949672960    1    0    0    /user/ruser/quota