Our application has multiple jobs that use ‘insert overwrite’, and this is filling up the trash dir (under the home dir, by default) with deleted records. Because quotas are enforced, this pushes us over the size threshold. Is there any option to skip the trash in Hive on a per-session basis, or any option to change the trash dir location per session?
When you perform INSERT OVERWRITE into a table, the old data is moved to the trash and kept there for some duration.
To keep the data from moving into the trash and free up the space immediately, set the table property auto.purge to true:
TBLPROPERTIES ("auto.purge"="true")
Use ("auto.purge"="false") to go back to moving the old data to the trash.
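For anyone unsure where the property goes, here is a minimal sketch of applying it to an existing table and then checking it. The database and table name (my_db.my_table) are hypothetical placeholders, not something from the original question, so substitute your own.

```sql
-- Hypothetical table name; replace my_db.my_table with your own table.
-- Mark the table so that overwritten data is deleted instead of moved to .Trash.
ALTER TABLE my_db.my_table SET TBLPROPERTIES ("auto.purge"="true");

-- Confirm the property is actually set on the table.
SHOW TBLPROPERTIES my_db.my_table("auto.purge");
```

As far as I know, Hive honours this property only for managed (internal) tables, and support for it on INSERT OVERWRITE was added in Hive 2.3.0, so an older version or an external table may still send data to the trash.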
Hope it helps!
Hi, I am facing the same issue. The table holds a very large amount of data, and during insert overwrite the files are placed in my user directory, /user/anjali/.Trash, causing the Hive action in Oozie to fail after a 1.5 hr run. Please help. The table is external, and even after I changed it to an internal table, auto.purge = true is not working.
@AnjaliRocks , As this is an older post, you would have a better chance of receiving a resolution by starting a new thread. This will also be an opportunity to provide details specific to your environment that could aid others in assisting you with a more accurate answer to your question.
@VidyaSargur , I had started a new thread for the issue but received no solution, so I was digging through old posts.
This issue is causing trouble in my org and I am unable to solve it.
@paras , Thanks a lot for your reply. The solution you had provided was for spark oozie action.
I was able to solve this using the same configuration --conf spark.hadoop.dfs.user.home.dir.prefix=/tmp 2 days ago.
This was during the ingestion part of the flow. So ultimately my Sqoop and Spark jobs are redirecting any .Trash to my tmp directory, which has enough quota. Now I am facing this issue with the Hive action, where I am not sure of a configuration equivalent to --conf spark.hadoop.dfs.user.home.dir.prefix=/appldigi/tmp or
Can you please guide me on this? I am unable to solve it.
I am trying to execute a HiveQL insert overwrite script. I have already tried the auto.purge = true option, which is not working.