<?xml version="1.0" encoding="UTF-8"?>
<rss xmlns:content="http://purl.org/rss/1.0/modules/content/" xmlns:dc="http://purl.org/dc/elements/1.1/" xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#" xmlns:taxo="http://purl.org/rss/1.0/modules/taxonomy/" version="2.0">
  <channel>
    <title>question Re: Do we have any script which we can use to clean /tmp/hive/ dir frequently on hdfs. Because it is consuming space in TB. in Support Questions</title>
    <link>https://community.cloudera.com/t5/Support-Questions/Do-we-have-any-script-which-we-can-use-to-clean-tmp-hive-dir/m-p/156974#M119387</link>
    <description>&lt;P&gt;&lt;A rel="user" href="https://community.cloudera.com/users/12793/trainings.html" nodeid="12793"&gt;@Gurmukh Singh&lt;/A&gt;: I tried this script and I am not getting anything except the output below.&lt;/P&gt;&lt;P&gt;[user@server2~]$ ./cleanup.sh &lt;/P&gt;&lt;P&gt;Usage: dir_diff.sh [30]&lt;/P&gt;&lt;P&gt;My script has the same content as the one you posted.&lt;/P&gt;</description>
    <pubDate>Wed, 21 Sep 2016 17:16:43 GMT</pubDate>
    <dc:creator>SK1</dc:creator>
    <dc:date>2016-09-21T17:16:43Z</dc:date>
    <item>
      <title>Do we have any script which we can use to clean /tmp/hive/ dir frequently on hdfs. Because it is consuming space in TB.</title>
      <link>https://community.cloudera.com/t5/Support-Questions/Do-we-have-any-script-which-we-can-use-to-clean-tmp-hive-dir/m-p/156965#M119378</link>
      <description>&lt;P&gt;Do we have any script which we can use to clean the /tmp/hive/ dir on HDFS frequently? It is consuming terabytes of space.&lt;/P&gt;&lt;P&gt;I have gone through the one below, but I am looking for a shell script.&lt;/P&gt;&lt;P&gt;&lt;A href="https://github.com/nmilford/clean-hadoop-tmp/blob/master/clean-hadoop-tmp" target="_blank"&gt;https://github.com/nmilford/clean-hadoop-tmp/blob/master/clean-hadoop-tmp&lt;/A&gt;&lt;/P&gt;</description>
      <pubDate>Wed, 24 Feb 2016 21:12:35 GMT</pubDate>
      <guid>https://community.cloudera.com/t5/Support-Questions/Do-we-have-any-script-which-we-can-use-to-clean-tmp-hive-dir/m-p/156965#M119378</guid>
      <dc:creator>SK1</dc:creator>
      <dc:date>2016-02-24T21:12:35Z</dc:date>
    </item>
    <item>
      <title>Re: Do we have any script which we can use to clean /tmp/hive/ dir frequently on hdfs. Because it is consuming space in TB.</title>
      <link>https://community.cloudera.com/t5/Support-Questions/Do-we-have-any-script-which-we-can-use-to-clean-tmp-hive-dir/m-p/156966#M119379</link>
      <description>&lt;P&gt;Create a file /scripts/myLogCleaner.sh (or whatever).&lt;/P&gt;&lt;P&gt;Add the following command, which deletes all files that have "log" in the name and are older than a day:&lt;/P&gt;&lt;P&gt;find /tmp/hive -name '*log*' -mtime +1 -exec rm {} \;&lt;/P&gt;&lt;P&gt;Then schedule it with cron:&lt;/P&gt;&lt;P&gt;crontab -e&lt;/P&gt;&lt;P&gt;0 0 * * * /scripts/myLogCleaner.sh&lt;/P&gt;&lt;P&gt;This will start the cleaner every day at midnight.&lt;/P&gt;&lt;P&gt;(Obviously just one out of approximately 3 million different ways to do it &lt;span class="lia-unicode-emoji" title=":slightly_smiling_face:"&gt;🙂&lt;/span&gt;)&lt;/P&gt;&lt;P&gt;Edit: ah, not the logs of the Hive CLI but the scratch dir of Hive. That makes it a bit harder since there is no hadoop find. Weird that it grows so big; it should clean up after itself unless the command line interface or task gets killed.&lt;/P&gt;</description>
      <pubDate>Wed, 24 Feb 2016 21:19:48 GMT</pubDate>
      <guid>https://community.cloudera.com/t5/Support-Questions/Do-we-have-any-script-which-we-can-use-to-clean-tmp-hive-dir/m-p/156966#M119379</guid>
      <dc:creator>bleonhardi</dc:creator>
      <dc:date>2016-02-24T21:19:48Z</dc:date>
    </item>
    <item>
      <title>Re: Do we have any script which we can use to clean /tmp/hive/ dir frequently on hdfs. Because it is consuming space in TB.</title>
      <link>https://community.cloudera.com/t5/Support-Questions/Do-we-have-any-script-which-we-can-use-to-clean-tmp-hive-dir/m-p/156967#M119380</link>
      <description>&lt;P&gt;This is on HDFS, Benjamin. I mean the same approach, just with HDFS commands instead of the local FS.&lt;/P&gt;</description>
      <pubDate>Wed, 24 Feb 2016 21:23:22 GMT</pubDate>
      <guid>https://community.cloudera.com/t5/Support-Questions/Do-we-have-any-script-which-we-can-use-to-clean-tmp-hive-dir/m-p/156967#M119380</guid>
      <dc:creator>aervits</dc:creator>
      <dc:date>2016-02-24T21:23:22Z</dc:date>
    </item>
    <item>
      <title>Re: Do we have any script which we can use to clean /tmp/hive/ dir frequently on hdfs. Because it is consuming space in TB.</title>
      <link>https://community.cloudera.com/t5/Support-Questions/Do-we-have-any-script-which-we-can-use-to-clean-tmp-hive-dir/m-p/156968#M119381</link>
      <description>&lt;P&gt;&lt;A rel="user" href="https://community.cloudera.com/users/168/bleonhardi.html" nodeid="168"&gt;@Benjamin Leonhardi&lt;/A&gt;: I can do it easily on the local file system, but I am looking for the /tmp/hive dir on HDFS.&lt;/P&gt;&lt;P&gt;So do we have anything like this for HDFS?&lt;/P&gt;</description>
      <pubDate>Wed, 24 Feb 2016 21:37:24 GMT</pubDate>
      <guid>https://community.cloudera.com/t5/Support-Questions/Do-we-have-any-script-which-we-can-use-to-clean-tmp-hive-dir/m-p/156968#M119381</guid>
      <dc:creator>SK1</dc:creator>
      <dc:date>2016-02-24T21:37:24Z</dc:date>
    </item>
    <item>
      <title>Re: Do we have any script which we can use to clean /tmp/hive/ dir frequently on hdfs. Because it is consuming space in TB.</title>
      <link>https://community.cloudera.com/t5/Support-Questions/Do-we-have-any-script-which-we-can-use-to-clean-tmp-hive-dir/m-p/156969#M119382</link>
      <description>&lt;P&gt;That would be the time when I start writing some Python magic parsing the timestamp from the hadoop fs -ls output. Or, to be faster, a small Java program doing the same with the FileSystem API.&lt;/P&gt;&lt;P&gt;Someone apparently already did the first approach with a shell script. Replace the echo with a hadoop fs -rm -r -f and you might be good. But I didn't test it, obviously ...&lt;/P&gt;&lt;P&gt;&lt;A href="http://stackoverflow.com/questions/12613848/finding-directories-older-than-n-days-in-hdfs"&gt;http://stackoverflow.com/questions/12613848/finding-directories-older-than-n-days-in-hdfs&lt;/A&gt;&lt;/P&gt;</description>
      <pubDate>Wed, 24 Feb 2016 21:43:01 GMT</pubDate>
      <guid>https://community.cloudera.com/t5/Support-Questions/Do-we-have-any-script-which-we-can-use-to-clean-tmp-hive-dir/m-p/156969#M119382</guid>
      <dc:creator>bleonhardi</dc:creator>
      <dc:date>2016-02-24T21:43:01Z</dc:date>
    </item>
    <item>
      <title>Re: Do we have any script which we can use to clean /tmp/hive/ dir frequently on hdfs. Because it is consuming space in TB.</title>
      <link>https://community.cloudera.com/t5/Support-Questions/Do-we-have-any-script-which-we-can-use-to-clean-tmp-hive-dir/m-p/156970#M119383</link>
      <description>&lt;P&gt;&lt;A rel="user" href="https://community.cloudera.com/users/168/bleonhardi.html" nodeid="168"&gt;@Benjamin Leonhardi&lt;/A&gt; Yep, I did that a while ago with the Java HDFS API. Look up the paths, identify the age of the files, delete.&lt;/P&gt;</description>
      <pubDate>Wed, 24 Feb 2016 21:44:26 GMT</pubDate>
      <guid>https://community.cloudera.com/t5/Support-Questions/Do-we-have-any-script-which-we-can-use-to-clean-tmp-hive-dir/m-p/156970#M119383</guid>
      <dc:creator>aervits</dc:creator>
      <dc:date>2016-02-24T21:44:26Z</dc:date>
    </item>
    <item>
      <title>Re: Do we have any script which we can use to clean /tmp/hive/ dir frequently on hdfs. Because it is consuming space in TB.</title>
      <link>https://community.cloudera.com/t5/Support-Questions/Do-we-have-any-script-which-we-can-use-to-clean-tmp-hive-dir/m-p/156971#M119384</link>
      <description>&lt;P&gt;&lt;A rel="user" href="https://community.cloudera.com/users/2273/saurabhmcakiet.html" nodeid="2273"&gt;@Saurabh Kumar&lt;/A&gt;&lt;/P&gt;&lt;P&gt;I have not used this, but it is worth trying.&lt;/P&gt;&lt;P&gt;&lt;A href="https://issues.apache.org/jira/browse/HADOOP-8989" target="_blank"&gt;https://issues.apache.org/jira/browse/HADOOP-8989&lt;/A&gt;&lt;/P&gt;</description>
      <pubDate>Wed, 24 Feb 2016 21:54:09 GMT</pubDate>
      <guid>https://community.cloudera.com/t5/Support-Questions/Do-we-have-any-script-which-we-can-use-to-clean-tmp-hive-dir/m-p/156971#M119384</guid>
      <dc:creator>rahulpathak109</dc:creator>
      <dc:date>2016-02-24T21:54:09Z</dc:date>
    </item>
    <item>
      <title>Re: Do we have any script which we can use to clean /tmp/hive/ dir frequently on hdfs. Because it is consuming space in TB.</title>
      <link>https://community.cloudera.com/t5/Support-Questions/Do-we-have-any-script-which-we-can-use-to-clean-tmp-hive-dir/m-p/156972#M119385</link>
      <description>&lt;P&gt;&lt;A rel="user" href="https://community.cloudera.com/users/2273/saurabhmcakiet.html" nodeid="2273"&gt;@Saurabh Kumar&lt;/A&gt; To add to this, you could investigate third party dev projects such as &lt;A href="https://github.com/nmilford/clean-hadoop-tmp" target="_blank"&gt;https://github.com/nmilford/clean-hadoop-tmp&lt;/A&gt;&lt;/P&gt;</description>
      <pubDate>Fri, 08 Apr 2016 22:05:22 GMT</pubDate>
      <guid>https://community.cloudera.com/t5/Support-Questions/Do-we-have-any-script-which-we-can-use-to-clean-tmp-hive-dir/m-p/156972#M119385</guid>
      <dc:creator>iroberts</dc:creator>
      <dc:date>2016-04-08T22:05:22Z</dc:date>
    </item>
    <item>
      <title>Re: Do we have any script which we can use to clean /tmp/hive/ dir frequently on hdfs. Because it is consuming space in TB.</title>
      <link>https://community.cloudera.com/t5/Support-Questions/Do-we-have-any-script-which-we-can-use-to-clean-tmp-hive-dir/m-p/156973#M119386</link>
      <description>&lt;P&gt;
	You can do:
&lt;/P&gt;
&lt;P&gt;
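	#!/bin/bash
	# Remove HDFS directories under /tmp/ that are older than the given number of days.
	# Requires GNU date for the -d option.
	usage="Usage: dir_diff.sh [days]"

	# Print the usage line and stop if no day count was given.
	if [ ! "$1" ]
	then
	  echo "$usage"
	  exit 1
	fi

	now=$(date +%s)
	# Keep only directory entries (lines starting with "d") from the recursive listing.
	hadoop fs -ls -R /tmp/ | grep "^d" | while read -r f; do
	  # Field 6 of the listing is the modification date, field 8 is the path.
	  dir_date=$(echo "$f" | awk '{print $6}')
	  difference=$(( ( now - $(date -d "$dir_date" +%s) ) / (24 * 60 * 60) ))
	  if [ "$difference" -gt "$1" ]; then
	    hadoop fs -rm -r "$(echo "$f" | awk '{print $8}')"
	  fi
	done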
&lt;/P&gt;
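&lt;P&gt;
	As a rough sketch, the age check in the script above can be split into small functions and tried on sample listing lines before any delete is wired in. The names age_days and list_old_dirs are hypothetical, the column layout assumes the usual eight fields of hadoop fs -ls output, and GNU date is assumed for the -d option.
&lt;/P&gt;

```shell
#!/bin/bash
# Sketch only: age_days and list_old_dirs are hypothetical helper names.
# Assumes GNU date (for -d) and the eight-column "hadoop fs -ls" layout:
# permissions, replication, owner, group, size, date, time, path.

# Print the whole number of days between a YYYY-MM-DD date and an epoch time.
# The date is read as UTC so the result does not depend on the local time zone.
age_days() {
  local dir_date="$1" now="$2" dir_epoch
  dir_epoch=$(date -u -d "$dir_date" +%s) || return 1
  echo $(( (now - dir_epoch) / 86400 ))
}

# Read listing lines on stdin and print the paths of directories whose
# modification date is more than max_days old. Prints only; deletes nothing.
list_old_dirs() {
  local max_days="$1" now="$2"
  local perms repl owner group size mdate mclock path age
  while read -r perms repl owner group size mdate mclock path; do
    # Skip anything that is not a directory entry, such as the
    # "Found N items" header or plain files.
    case "$perms" in
      d*) ;;
      *) continue ;;
    esac
    # Skip lines whose date column cannot be parsed.
    age=$(age_days "$mdate" "$now") || continue
    if [ "$age" -gt "$max_days" ]; then
      echo "$path"
    fi
  done
}
```

&lt;P&gt;
	A possible dry run would be: hadoop fs -ls /tmp/hive | list_old_dirs 30 "$(date +%s)". This only prints paths; passing them to hadoop fs -rm -r is left as a deliberate manual step.
&lt;/P&gt;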
&lt;P&gt;
	Replace /tmp/ with the directories or files you need to clean up.
&lt;/P&gt;</description>
      <pubDate>Wed, 31 Aug 2016 06:24:57 GMT</pubDate>
      <guid>https://community.cloudera.com/t5/Support-Questions/Do-we-have-any-script-which-we-can-use-to-clean-tmp-hive-dir/m-p/156973#M119386</guid>
      <dc:creator>trainings</dc:creator>
      <dc:date>2016-08-31T06:24:57Z</dc:date>
    </item>
    <item>
      <title>Re: Do we have any script which we can use to clean /tmp/hive/ dir frequently on hdfs. Because it is consuming space in TB.</title>
      <link>https://community.cloudera.com/t5/Support-Questions/Do-we-have-any-script-which-we-can-use-to-clean-tmp-hive-dir/m-p/156974#M119387</link>
      <description>&lt;P&gt;&lt;A rel="user" href="https://community.cloudera.com/users/12793/trainings.html" nodeid="12793"&gt;@Gurmukh Singh&lt;/A&gt;: I tried this script and I am not getting anything except the output below.&lt;/P&gt;&lt;P&gt;[user@server2~]$ ./cleanup.sh &lt;/P&gt;&lt;P&gt;Usage: dir_diff.sh [30]&lt;/P&gt;&lt;P&gt;My script has the same content as the one you posted.&lt;/P&gt;</description>
      <pubDate>Wed, 21 Sep 2016 17:16:43 GMT</pubDate>
      <guid>https://community.cloudera.com/t5/Support-Questions/Do-we-have-any-script-which-we-can-use-to-clean-tmp-hive-dir/m-p/156974#M119387</guid>
      <dc:creator>SK1</dc:creator>
      <dc:date>2016-09-21T17:16:43Z</dc:date>
    </item>
    <item>
      <title>Re: Do we have any script which we can use to clean /tmp/hive/ dir frequently on hdfs. Because it is consuming space in TB.</title>
      <link>https://community.cloudera.com/t5/Support-Questions/Do-we-have-any-script-which-we-can-use-to-clean-tmp-hive-dir/m-p/156975#M119388</link>
      <description>&lt;P&gt;&lt;A rel="user" href="https://community.cloudera.com/users/2273/saurabhmcakiet.html" nodeid="2273"&gt;@Saurabh&lt;/A&gt; the script takes an argument: the number of days &lt;span class="lia-unicode-emoji" title=":slightly_smiling_face:"&gt;🙂&lt;/span&gt;&lt;/P&gt;&lt;P&gt;So, if you want to look for files older than 10 days, run: # ./cleanup.sh 10&lt;/P&gt;</description>
      <pubDate>Thu, 22 Sep 2016 19:51:55 GMT</pubDate>
      <guid>https://community.cloudera.com/t5/Support-Questions/Do-we-have-any-script-which-we-can-use-to-clean-tmp-hive-dir/m-p/156975#M119388</guid>
      <dc:creator>trainings</dc:creator>
      <dc:date>2016-09-22T19:51:55Z</dc:date>
    </item>
    <item>
      <title>Re: Do we have any script which we can use to clean /tmp/hive/ dir frequently on hdfs. Because it is consuming space in TB.</title>
      <link>https://community.cloudera.com/t5/Support-Questions/Do-we-have-any-script-which-we-can-use-to-clean-tmp-hive-dir/m-p/156976#M119389</link>
      <description>&lt;P&gt;&lt;A rel="user" href="https://community.cloudera.com/users/12793/trainings.html" nodeid="12793"&gt;@Gurmukh Singh&lt;/A&gt;: Thanks, I just tested it in the following way and it is working fine. We can change hadoop fs -ls to hadoop fs -rm -r and set the required dir.&lt;/P&gt;&lt;P&gt;#!/bin/bash&lt;/P&gt;&lt;P&gt;usage="Usage: dir_diff.sh [days]"&lt;/P&gt;&lt;P&gt;if [ ! "$1" ]&lt;/P&gt;&lt;P&gt;then&lt;/P&gt;&lt;P&gt;echo $usage&lt;/P&gt;&lt;P&gt;exit 1&lt;/P&gt;&lt;P&gt;fi&lt;/P&gt;&lt;P&gt;now=$(date +%s)&lt;/P&gt;&lt;P&gt;hadoop fs -ls /zone_encr2/ | grep "^d" | while read f; do&lt;/P&gt;&lt;P&gt;dir_date=`echo $f | awk '{print $6}'`&lt;/P&gt;&lt;P&gt;difference=$(( ( $now - $(date -d "$dir_date" +%s) ) / (24 * 60 * 60 ) ))&lt;/P&gt;&lt;P&gt;if [ $difference -gt $1 ]; then&lt;/P&gt;&lt;P&gt;    hadoop fs -ls `echo $f | awk '{ print $8 }'`;&lt;/P&gt;&lt;P&gt;fi&lt;/P&gt;&lt;P&gt;done&lt;/P&gt;</description>
      <pubDate>Sat, 24 Sep 2016 14:18:49 GMT</pubDate>
      <guid>https://community.cloudera.com/t5/Support-Questions/Do-we-have-any-script-which-we-can-use-to-clean-tmp-hive-dir/m-p/156976#M119389</guid>
      <dc:creator>SK1</dc:creator>
      <dc:date>2016-09-24T14:18:49Z</dc:date>
    </item>
    <item>
      <title>Re: Do we have any script which we can use to clean /tmp/hive/ dir frequently on hdfs. Because it is consuming space in TB.</title>
      <link>https://community.cloudera.com/t5/Support-Questions/Do-we-have-any-script-which-we-can-use-to-clean-tmp-hive-dir/m-p/156977#M119390</link>
      <description>&lt;P&gt;&lt;A rel="user" href="https://community.cloudera.com/users/2273/saurabhmcakiet.html" nodeid="2273"&gt;@Saurabh&lt;/A&gt;&lt;/P&gt;&lt;P&gt;Yes, the script I gave used the "hadoop fs -ls" command, because many people do not understand what it does; they simply copy the script, run it, and then complain that they lost data.&lt;/P&gt;&lt;P&gt;The problem is that most people call themselves Hadoop admins but have never worked as Linux system admins/engineers &lt;span class="lia-unicode-emoji" title=":slightly_smiling_face:"&gt;🙂&lt;/span&gt;&lt;/P&gt;</description>
      <pubDate>Tue, 11 Oct 2016 13:52:30 GMT</pubDate>
      <guid>https://community.cloudera.com/t5/Support-Questions/Do-we-have-any-script-which-we-can-use-to-clean-tmp-hive-dir/m-p/156977#M119390</guid>
      <dc:creator>trainings</dc:creator>
      <dc:date>2016-10-11T13:52:30Z</dc:date>
    </item>
    <item>
      <title>Re: Do we have any script which we can use to clean /tmp/hive/ dir frequently on hdfs. Because it is consuming space in TB.</title>
      <link>https://community.cloudera.com/t5/Support-Questions/Do-we-have-any-script-which-we-can-use-to-clean-tmp-hive-dir/m-p/156978#M119391</link>
      <description>&lt;P&gt;Can someone help me with this as well?&lt;/P&gt;&lt;P&gt;&lt;A href="https://community.hortonworks.com/questions/243908/major-compaction-failure.html"&gt;https://community.hortonworks.com/questions/243908/major-compaction-failure.html&lt;/A&gt;&lt;/P&gt;</description>
      <pubDate>Thu, 11 Apr 2019 22:27:10 GMT</pubDate>
      <guid>https://community.cloudera.com/t5/Support-Questions/Do-we-have-any-script-which-we-can-use-to-clean-tmp-hive-dir/m-p/156978#M119391</guid>
      <dc:creator>SY0C64110</dc:creator>
      <dc:date>2019-04-11T22:27:10Z</dc:date>
    </item>
  </channel>
</rss>

