Created 07-07-2016 10:13 AM
		Created 07-07-2016 10:24 PM
I'm assuming you are referring to the /tmp/ directory in HDFS. You can use the command below to clean it up, and cron it to run every week.
hadoop fs -rm -r '/tmp/*'
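To schedule it, a crontab entry along these lines would do (a sketch; the Sunday 02:00 schedule is just an example, and the glob is quoted so it is expanded by HDFS rather than by the local shell against your local /tmp):

# m h dom mon dow  command -- every Sunday at 02:00
0 2 * * 0 hadoop fs -rm -r '/tmp/*'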
Created 07-08-2016 06:34 AM
Thank you so much, Rahul... so if I delete the HDFS /tmp directory, it won't affect my current jobs?
Created 07-08-2016 05:19 PM
You shouldn't wipe the entire /tmp directory; that would indeed affect your currently running jobs.
There's no built-in way to do that, but you can cron a job that deletes files/directories older than x days.
You'll find some examples around; here is a quick-and-dirty but effective shell script for cleaning up files only:
#!/bin/bash
usage="Usage: dir_diff.sh [days]"
if [ -z "$1" ]; then
  echo "$usage"
  exit 1
fi

now=$(date +%s)

# List files only (lines starting with "-"); directory entries start with "d".
hadoop fs -ls -R /tmp/ | grep "^-" | while read -r f; do
  # Column 6 of "hadoop fs -ls" output is the modification date (YYYY-MM-DD).
  file_date=$(echo "$f" | awk '{print $6}')
  # Age of the file in whole days.
  difference=$(( (now - $(date -d "$file_date" +%s)) / (24 * 60 * 60) ))
  if [ "$difference" -gt "$1" ]; then
    # The last column is the full HDFS path.
    hdfs dfs -rm -f "$(echo "$f" | awk '{print $NF}')"
  fi
done
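For example, to delete /tmp files older than 7 days (the script path here is hypothetical; adjust it to wherever you save the script):

chmod +x dir_diff.sh
./dir_diff.sh 7
# or schedule it daily at 03:00 via cron:
0 3 * * * /path/to/dir_diff.sh 7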