
is it safe to remove the /tmp/hive/hive folder?


Is it safe to remove the /tmp/hive/hive folder from HDFS, for example with the following command?


hdfs dfs -rm -r /tmp/hive/hive


The reason is that /tmp/hive/hive contains thousands of files and we cannot delete them.


hdfs dfs -ls /tmp/hive/
Found 7 items
drwx------   - admin     hdfs          0 2019-03-05 12:00 /tmp/hive/admin
drwx------   - drt       hdfs          0 2019-06-16 14:02 /tmp/hive/drt
drwx------   - ambari-qa hdfs          0 2019-06-16 15:11 /tmp/hive/ambari-qa
drwx------   - anonymous hdfs          0 2019-06-16 08:57 /tmp/hive/anonymous
drwx------   - hdfs      hdfs          0 2019-06-13 08:42 /tmp/hive/hdfs
drwx------   - hive      hdfs          0 2019-06-13 10:58 /tmp/hive/hive
drwx------   - root      hdfs          0 2018-07-17 23:37 /tmp/hive/root
Michael-Bronson
Accepted Solutions

Re: is it safe to remove the /tmp/hive/hive folder?

Super Mentor

@Michael Bronson

According to the Apache Hive docs, there are some parameters and tools for dealing with this kind of issue. I have not tested these tools personally, but it looks like they were introduced long ago to address a similar problem, as part of https://issues.apache.org/jira/browse/HIVE-13429


For example, I see that the Hive config "hive.exec.scratchdir" points to the "/tmp/hive" dir.

Can you please check and let us know the value of the parameter "hive.scratchdir.lock"? (If it is not set, the default value is "false".) You might also want to read about the "hive.server2.clear.dangling.scratchdir" and "hive.start.cleanup.scratchdir" parameters of the Hive Server config.
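
One way to check these values from a shell is sketched below. It assumes the hive CLI is on the PATH and can open a session on this cluster; the function name show_scratchdir_settings is only illustrative. The values printed reflect the configuration the client session sees, which may differ from what HiveServer2 itself is running with.

```shell
# Print the effective value of each scratch-directory setting named above.
# Assumption: `hive -e "set <key>;"` is available; it normally prints key=value
# (or reports the key as undefined if it is not known to this Hive version).
show_scratchdir_settings() {
  for key in hive.exec.scratchdir \
             hive.scratchdir.lock \
             hive.server2.clear.dangling.scratchdir \
             hive.start.cleanup.scratchdir; do
    hive -e "set ${key};"
  done
}
```

Calling show_scratchdir_settings as a user who can start a Hive session should print one line per parameter.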


Please refer to link [1] to learn more about those parameters.

Link [2] also mentions a tool, "cleardanglingscratchdir", which you may want to read more about.

# hive --service cleardanglingscratchdir [-r] [-v] [-s scratchdir]
    -r      dry-run mode, which produces a list on console
    -v      verbose mode, which prints extra debugging information
    -s      if you are using non-standard scratch directory
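
A cautious way to use the tool is to do the dry run first and delete only after reviewing the list. The sketch below wraps the usage line above in two helper functions; the function names are hypothetical, only the -r, -v and -s options shown above are used, and it has not been tested on this cluster.

```shell
# List dangling scratch directories without deleting anything (-r = dry run).
# Extra options, e.g. -s <scratchdir> for a non-standard location, are passed through.
scratch_dry_run() {
  hive --service cleardanglingscratchdir -r -v "$@"
}

# Remove dangling scratch directories. Run only after reviewing the dry-run list.
scratch_clean() {
  hive --service cleardanglingscratchdir -v "$@"
}
```

Typical use would be to run scratch_dry_run as a user that owns the scratch directories (for example hive), inspect the output, and then run scratch_clean with the same arguments.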


[1] https://cwiki.apache.org/confluence/display/Hive/Configuration+Properties#ConfigurationProperties-hi....

[2] https://cwiki.apache.org/confluence/display/Hive/Setting+Up+HiveServer2#SettingUpHiveServer2-ClearDa...




Re: is it safe to remove the /tmp/hive/hive folder?

We tried the following to remove files that are older than 10 days, but because there are so many files, nothing was deleted at all.



hdfs dfs -ls /tmp/hive/hive   |   tr -s " "    |    cut -d' ' -f6-8    |     grep "^[0-9]"    |    awk 'BEGIN{ MIN=14400; LAST=60*MIN; "date +%s" | getline NOW } { cmd="date -d'\''"$1" "$2"'\'' +%s"; cmd | getline WHEN; DIFF=NOW-WHEN; if(DIFF > LAST){ print "Deleting: "$3; system("hdfs dfs -rm -r "$3) }}'
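
One possible (unconfirmed) reason the one-liner above stalls is that it starts a separate date process and a separate hdfs client for every entry. A lighter variant is sketched below, under these assumptions: GNU date and GNU xargs are available, paths contain no spaces, and the function names older_than and purge_old_scratch are made up for illustration. It compares the listing's date column as a string and deletes many paths per hdfs call. It still deletes manually, so it can remove a scratch directory that is in use if a session has been running longer than the cutoff.

```shell
# Print the path column of `hdfs dfs -ls` lines whose date is before the cutoff.
# Listing dates are YYYY-MM-DD, so plain string comparison orders them correctly.
# Assumes paths contain no spaces.
older_than() {
  awk -v cutoff="$1" 'NF >= 8 && $6 ~ /^[0-9]/ && $6 < cutoff { print $8 }'
}

# Delete entries under a directory that are older than N days, 500 paths per hdfs call.
# -skipTrash is a choice made in this sketch; drop it to keep deleted data in the trash.
purge_old_scratch() {
  dir="$1"
  days="$2"
  cutoff=$(date -d "${days} days ago" +%F)   # GNU date
  hdfs dfs -ls "$dir" | older_than "$cutoff" | xargs -r -n 500 hdfs dfs -rm -r -skipTrash
}
```

For the case in this thread the call would be purge_old_scratch /tmp/hive/hive 10. To preview what would be removed, pipe the listing through older_than alone and inspect the paths first.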




Michael-Bronson


Re: is it safe to remove the /tmp/hive/hive folder?

@Jay, what does hive.scratchdir.lock mean when it is set to false?

Michael-Bronson

Re: is it safe to remove the /tmp/hive/hive folder?

Second, is it safe to delete the folder with the following command?

hdfs dfs -rm -r /tmp/hive/hive

Michael-Bronson

Re: is it safe to remove the /tmp/hive/hive folder?

Super Mentor

@Michael Bronson

"hive.scratchdir.lock" : When true, holds a lock file in the scratch directory. If a Hive process dies and accidentally leaves a dangling scratchdir behind, the cleardanglingscratchdir tool will remove it.

When false, does not create a lock file and therefore the cleardanglingscratchdir tool cannot remove any dangling scratch directories.



Regarding your second question, whether it is safe to delete the /tmp/hive/hive folder:

I do not think that we should do it on our own. The whole purpose of the following JIRA was to introduce a tool like "cleardanglingscratchdir" to remove the scratch contents safely: https://issues.apache.org/jira/browse/HIVE-13429


Re: is it safe to remove the /tmp/hive/hive folder?

For your info: we had actually already deleted this folder before you posted your answer. After we restarted the Hive service in Ambari, it created the /tmp/hive/hive folder again.

Michael-Bronson

Re: is it safe to remove the /tmp/hive/hive folder?

Super Mentor

@Michael Bronson

Cleaning up the Hive scratch directory manually may not be safe in a multi-user environment (where multiple users might be running Hive queries concurrently), since it can accidentally remove a scratch directory that is in use.

Re: is it safe to remove the /tmp/hive/hive folder?

@Jay, you said "I do not think that we should do it on our own". I agree, but we do not have a choice: under /tmp/hive/hive we have millions of folders and we cannot delete them individually. After we deleted the folder from HDFS, we saw that a Hive restart created the /tmp/hive/hive folder again. Do you have any advice on what we need to check after this brutal action?

Michael-Bronson