is it safe to remove the /tmp/hive/hive folder?


Is it safe to remove the /tmp/hive/hive folder from HDFS?


For example:


hdfs dfs -rm -r /tmp/hive/hive


The reason is that /tmp/hive/hive holds thousands of files, and we can't delete them.


hdfs dfs -ls /tmp/hive/
Found 7 items
drwx------   - admin     hdfs          0 2019-03-05 12:00 /tmp/hive/admin
drwx------   - drt       hdfs          0 2019-06-16 14:02 /tmp/hive/drt
drwx------   - ambari-qa hdfs          0 2019-06-16 15:11 /tmp/hive/ambari-qa
drwx------   - anonymous hdfs          0 2019-06-16 08:57 /tmp/hive/anonymous
drwx------   - hdfs      hdfs          0 2019-06-13 08:42 /tmp/hive/hdfs
drwx------   - hive      hdfs          0 2019-06-13 10:58 /tmp/hive/hive
drwx------   - root      hdfs          0 2018-07-17 23:37 /tmp/hive/root
Michael-Bronson
1 ACCEPTED SOLUTION

Master Mentor

@Michael Bronson

According to the Apache Hive docs, there are some parameters and tools available for dealing with this kind of issue. I have not personally tested those tools, but it looks like they were introduced long ago to deal with a similar issue, as part of https://issues.apache.org/jira/browse/HIVE-13429


For example, I see that the Hive config "hive.exec.scratchdir" points to the "/tmp/hive" dir.

Can you please check and let us know what value is set for the parameter "hive.scratchdir.lock"? (If it is not set, the default should be "false".) Additionally, you might want to read about the "hive.server2.clear.dangling.scratchdir" and "hive.start.cleanup.scratchdir" parameters of the Hive Server config.
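One way to check those values, as a rough sketch: the helper below reads a property out of a hive-site.xml file. The function name get_hive_prop and the path /etc/hive/conf/hive-site.xml are assumptions for illustration only; on an Ambari-managed cluster, the value shown in the Ambari Hive config page is the one to trust. An empty result means the property is not set in that file, so the default applies.

```shell
# get_hive_prop NAME FILE
# Prints the <value> that follows <name>NAME</name> in a Hadoop-style XML config.
# Assumes one tag per line (or name and value on the same line).
get_hive_prop() {
  awk -v key="$1" '
    index($0, "<name>" key "</name>") { found = 1 }
    found && /<value>/ {
      sub(/.*<value>/, "")      # drop everything up to the opening tag
      sub(/<\/value>.*/, "")    # drop the closing tag and anything after it
      print
      exit
    }
  ' "$2"
}

# Example calls (the path is an assumption, adjust to your install):
# get_hive_prop hive.scratchdir.lock /etc/hive/conf/hive-site.xml
# get_hive_prop hive.server2.clear.dangling.scratchdir /etc/hive/conf/hive-site.xml
# get_hive_prop hive.start.cleanup.scratchdir /etc/hive/conf/hive-site.xml
```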


Please refer to link [1] to learn more about those parameters.

Link [2] also mentions a tool, "cleardanglingscratchdir", which you may want to read more about.

# hive --service cleardanglingscratchdir [-r] [-v] [-s scratchdir]
    -r      dry-run mode, which produces a list on console
    -v      verbose mode, which prints extra debugging information
    -s      if you are using non-standard scratch directory
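
A cautious way to use the tool is a dry run first, then the real cleanup once the list looks right. Below is a minimal sketch that uses only the flags from the usage text above. The wrapper name clear_dangling and the HIVE_BIN override are made up for illustration, and this has not been tested against a live cluster.

```shell
# clear_dangling MODE [SCRATCHDIR]
# MODE "dry-run" adds -r (list only); any other MODE runs the real cleanup.
# SCRATCHDIR is optional and maps to -s for a non-standard scratch directory.
# HIVE_BIN is a hypothetical override so the wrapper can be tried without Hive.
clear_dangling() {
  mode="$1"
  scratchdir="${2:-}"
  set -- --service cleardanglingscratchdir -v
  if [ "$mode" = "dry-run" ]; then
    set -- "$@" -r
  fi
  if [ -n "$scratchdir" ]; then
    set -- "$@" -s "$scratchdir"
  fi
  "${HIVE_BIN:-hive}" "$@"
}

# Typical sequence:
# clear_dangling dry-run      # review the list printed on the console
# clear_dangling delete       # then run without -r
```

Running the dry run first matters here because the scratch directory can also hold data for sessions that are still active.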


[1] https://cwiki.apache.org/confluence/display/Hive/Configuration+Properties#ConfigurationProperties-hi....

[2] https://cwiki.apache.org/confluence/display/Hive/Setting+Up+HiveServer2#SettingUpHiveServer2-ClearDa...




12 REPLIES 12

Master Mentor

@Michael Bronson

I can't think of any specific item to check at this point. As long as you are able to run your Hive queries without any issue and the Hive service checks are also passing, I think we should be good.


@Dear KJay


So, to summarize:

When we set the following


hive.server2.clear.dangling.scratchdir=true
hive.start.cleanup.scratchdir=true


and then restart the Hive service from Ambari,


do you think this configuration will be able to delete the old folders under /tmp/hive/hive, even though there are millions of them?


Michael-Bronson

New Contributor

@Michael Bronson

Yes, you can delete /tmp/hive/hive if it is taking up HDFS space. It is better to schedule a script that cleans up the directory every 15 days, with e-mail notifications enabled so that you get the alerts and warnings. I did the same in my organization because of a storage crisis.
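
A scheduled cleanup along those lines could look roughly like the sketch below. It is not the script from this post, only an illustration: the helper name older_than and the 15-day cutoff are assumptions, and the column positions are taken from the `hdfs dfs -ls` listing shown in the question. Deleting the scratch directory of a session that is still running can break its queries, so pick a cutoff longer than your longest-running job.

```shell
# older_than CUTOFF
# Reads `hdfs dfs -ls` output on stdin and prints the paths of directories
# whose modification date (YYYY-MM-DD, column 6) is before CUTOFF.
# Columns assumed: perms, replicas, owner, group, size, date, time, path.
older_than() {
  awk -v cutoff="$1" '
    # Appending "" forces a string comparison; ISO dates sort correctly as text.
    $1 ~ /^d/ && ($6 "") < (cutoff "") { print $8 }
  '
}

# Possible use from cron (needs GNU date and GNU xargs; review the list first):
# cutoff=$(date -d '15 days ago' +%F)
# hdfs dfs -ls /tmp/hive/hive | older_than "$cutoff"            # preview only
# hdfs dfs -ls /tmp/hive/hive | older_than "$cutoff" \
#   | xargs -r -n 50 hdfs dfs -rm -r -skipTrash                 # real delete
```

Batching the deletes through xargs keeps each `hdfs dfs -rm` call to a manageable number of paths when the directory holds a very large number of entries.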

Thank you.