
Having an issue with /tmp directory removal

Rising Star


I need to remove the contents of the /tmp/hive/hive directory, which contains too many folders; I guess most of them are empty.

bash-4.2$ hadoop fs -count /tmp/hive/hive
2097194 18 2581710 /tmp/hive/hive

(The columns are DIR_COUNT, FILE_COUNT, CONTENT_SIZE, and PATHNAME, so that is over two million directories but only 18 files.)

I tried using a script that was referenced in the question below.

The script is located on GitHub:

The script is written in Ruby. When I run it, I receive an exception. How can I delete these directories? Do you have any suggestions?

The exception from the Ruby script is:

-bash-4.2$ ./clean-hadoop-tmp
Dropping lock file. (/var/tmp/clean-hadoop-tmp.lock)
Scanning for directories in HDFS' /tmp older than 1800 seconds.
./clean-hadoop-tmp:38:in `block in <main>': undefined method `each' for nil:NilClass (NoMethodError)
from ./clean-hadoop-tmp:35:in `each'
from ./clean-hadoop-tmp:35:in `<main>'

Should I modify this script, and if so, how? Are there any other ways?


Rising Star

The initial solution I tried was the script in the answer below.

I modified this script to work on the /tmp/hive/hive path, but it failed with an OOM error.


@Sedat Kestepe

There is an empty directory inside /tmp/; listing it returns no entry lines, which the script does not handle. Just modify the code like this and it will work:

target_dirs.each do |tdir|
    # The directory path is the 8th field of each 'hadoop fs -ls' line
    target_dir = tdir.split(" ")[7]
    # Drop the first two lines of output; for an empty directory the
    # slice returns nil, so fall back to an empty array
    dir_list = `hadoop fs -ls #{target_dir}`.split("\n")[2..-1]
    sub_dirs = dir_list || []
    sub_dirs.each do |sdir|
        # ... rest of the original per-subdirectory handling ...
    end
end
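For what it's worth, the NoMethodError can be reproduced without a cluster. When `hadoop fs -ls` prints nothing past its "Found N items" header (an empty directory), the [2..-1] slice starts past the end of the array and returns nil rather than an empty array — a minimal sketch of what happens:

```ruby
# Simulated `hadoop fs -ls` output for an empty directory: only the header line.
output = "Found 0 items"

# Dropping the first two lines: the range begins past the end of the
# 1-element array, so Array#[] returns nil, not [].
dir_list = output.split("\n")[2..-1]
p dir_list  # => nil

# Calling .each on nil raises the NoMethodError reported in the question.
begin
  dir_list.each { |d| puts d }
rescue NoMethodError => e
  puts "raised: #{e.class}"
end
```

This is why guarding the slice result (or substituting an empty array) makes the script skip empty directories instead of crashing.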

Rising Star

Good idea. I felt bad and lazy for not inspecting the code before.

However, I got an OOM error again.

Thank you anyway. I will try to spare some time to take a look at the code, and I will update the question if I make progress.
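One likely contributor to the OOM with two million subdirectories is slurping the entire listing into memory with backticks before processing it. A sketch of a lower-memory alternative — streaming the listing line by line and deleting in batches. This is an assumption about the workload, not the original script's method; rm_commands and clean_tmp_hive are hypothetical helpers, and it assumes hadoop fs is on the PATH:

```ruby
BATCH_SIZE = 100

# Build one `hadoop fs -rm -r -skipTrash` command per batch of paths,
# instead of issuing a single call with millions of arguments.
def rm_commands(paths, batch_size = BATCH_SIZE)
  paths.each_slice(batch_size).map do |batch|
    "hadoop fs -rm -r -skipTrash #{batch.join(' ')}"
  end
end

# Stream the `hadoop fs -ls` output line by line rather than capturing it
# with backticks, so the full listing never sits in Ruby's memory at once.
def clean_tmp_hive(root = '/tmp/hive/hive')
  paths = []
  IO.popen(['hadoop', 'fs', '-ls', root]) do |io|
    io.each_line do |line|
      fields = line.split(' ')
      next unless fields.size >= 8  # skip the "Found N items" header
      paths << fields[7]            # path is the 8th field, as in the script
    end
  end
  rm_commands(paths).each { |cmd| system(cmd) }
end
```

If the OOM is in the hadoop CLI's own JVM rather than in Ruby, raising the client heap (e.g. via HADOOP_CLIENT_OPTS) may also be needed.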

Rising Star

What I did was edit the code, giving '/tmp/hive/hive' as the root directory. I should have given '/tmp' in the first place.
