How to skip trash when using INSERT OVERWRITE in Hive
Labels: Apache Hive
Created 11-26-2018 11:48 AM
Our application has multiple jobs that go the INSERT OVERWRITE route, and that is filling up the trash directory (under the home directory, by default) with deleted records. Since quotas are enforced, this is causing us to exceed the threshold size. Is there any option to skip the trash in Hive on a per-session basis, or any option to change the trash directory location per session?
Created 11-26-2018 12:19 PM
When you perform an INSERT OVERWRITE into a table, the old data is moved to the trash and kept there for a configurable duration.
To avoid moving data into the trash and free up the space immediately, set auto.purge=true on the table:
TBLPROPERTIES ("auto.purge"="true") (the default, "false", keeps the trash behavior)
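A minimal HiveQL sketch of the property in use (the table names here are illustrative, not from this thread):

```sql
-- Create a managed table whose overwritten files skip the HDFS trash:
CREATE TABLE sales (id INT, amount DOUBLE)
TBLPROPERTIES ("auto.purge"="true");

-- Or enable it on an existing table:
ALTER TABLE sales SET TBLPROPERTIES ("auto.purge"="true");

-- Old files are now deleted immediately instead of landing in .Trash:
INSERT OVERWRITE TABLE sales SELECT * FROM staging_sales;
```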
Thanks.
Hope it helps!
Created 11-26-2018 04:01 PM
Much appreciated.
Created 11-28-2018 08:08 PM
@pradeep kammella: If you found this answer helpful, please take a moment to log in and click the "accept" link on the answer. Thanks!
Created 06-27-2020 12:39 PM
Hi, I am facing the same issue. The table holds very large data, and while doing INSERT OVERWRITE, the deleted files are placed in my user directory, /user/anjali/.Trash, causing the Hive action in Oozie to fail after a 1.5-hour run. Please help. The table is external, and even after I changed it to an internal table, auto.purge=true is not working.
Created 06-29-2020 12:45 AM
@AnjaliRocks , As this is an older post, you would have a better chance of receiving a resolution by starting a new thread. This will also be an opportunity to provide details specific to your environment that could aid others in assisting you with a more accurate answer to your question.
Regards,
Vidya Sargur, Community Manager
Was your question answered? Make sure to mark the answer as the accepted solution.
If you find a reply useful, say thanks by clicking on the thumbs up button.
Created 06-29-2020 01:02 AM
@VidyaSargur, I had started a new thread for this issue but received no solution, so I was digging through old posts.
My thread: https://community.cloudera.com/t5/Support-Questions/Trash-space-issue-during-insert-overwrite-in-Hiv...
This issue is causing trouble in my org, and I am unable to solve it.
Created 06-29-2020 01:16 AM
@AnjaliRocks, I see that our expert @paras has responded to your thread. Could you please check whether his response is helpful? Feel free to @mention him with any further queries.
Regards,
Vidya Sargur, Community Manager
Was your question answered? Make sure to mark the answer as the accepted solution.
If you find a reply useful, say thanks by clicking on the thumbs up button.
Created 06-29-2020 02:03 AM
Replied on the new thread.
Created 06-29-2020 09:12 AM
@paras, thanks a lot for your reply. The solution you provided was for the Spark Oozie action.
Two days ago I was able to solve that part using the same configuration, --conf spark.hadoop.dfs.user.home.dir.prefix=/tmp.
That was during the ingestion part of the flow, so my Sqoop and Spark jobs now redirect any .Trash to my tmp directory, which has enough quota. Now I am facing this issue with the Hive action, where I am not sure of a configuration equivalent to --conf spark.hadoop.dfs.user.home.dir.prefix=/appldigi/tmp or
-Dyarn.app.mapreduce.am.staging-dir=/tmp.
Can you please guide me on this? I am unable to solve it.
I am trying to execute a HiveQL script that does an INSERT OVERWRITE. I have already tried the auto.purge=true option, but it is not working.
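For illustration only, a hedged sketch of the analogous per-session attempt in HiveQL. It is not confirmed anywhere in this thread: dfs.user.home.dir.prefix is the HDFS client property behind the Spark workaround above, the table names are made up, and whether a given Hive deployment allows setting this property per session is an assumption.

```sql
-- Untested sketch: point the HDFS home-dir prefix (which decides where
-- /user/<name>/.Trash lives) at a directory with enough quota before the
-- overwrite. Hive may reject this SET if the property is not on its
-- configuration whitelist.
SET dfs.user.home.dir.prefix=/tmp;
INSERT OVERWRITE TABLE sales SELECT * FROM staging_sales;
```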
