02-27-2017 08:38 AM
Does Cloudera Navigator have an option to identify objects that have been unused for a given period (for example, more than 6 months or 1 year)?
The objects can be HDFS files, Hive/Impala tables, Oozie workflows, datasets, etc.
This is my requirement: our non-prod environment is used by multiple users for different purposes such as dev, test, etc. Sometimes they use a common user ID and user space to create databases, create/import tables, etc. Once a task is finished, they move on to the next one without cleaning up the old databases, tables, and files, which become garbage after a few days.
This has accumulated into a large amount of garbage (with a replication factor of 3). I want to identify the databases, tables, and files that have not been used for more than 6 months or 1 year and delete them (with proper approval...).
Is this possible with Navigator? Are there any other options or ideas?
02-27-2017 09:24 AM
Upon further analysis, I've noticed that Navigator policies might help with this.
It seems that I need to write a search query, so let me try to write one... In the meantime, it would be great if someone could share a query for the above scenario.
01-09-2018 04:03 AM
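I don't know whether this problem was resolved, but since the post is still open, here is the kind of search query you may need. As written, it matches non-deleted files that were last accessed within the past 30 days, so adjust the date range to fit your retention window: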
lastAccessed:[NOW/DAY-30DAYS TO NOW/DAY+1DAY] AND type:file AND deleted:FALSE
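Since the original question asks for objects that have NOT been used for 6 months or a year, the range probably needs to be inverted so that it ends at the cutoff instead of starting there. The sketch below only builds the query string; it does not call Navigator. It assumes that Navigator search accepts the usual Solr open-ended range form `[* TO ...]` with date math, which should be verified in the Navigator search box before use. The function name `stale_file_query` is made up for illustration.

```python
def stale_file_query(min_idle_days: int) -> str:
    """Build a Solr-style query string for files whose last access is older
    than `min_idle_days` days.

    Assumption: the search backend accepts an open-ended range `[* TO ...]`
    combined with date math such as NOW/DAY-180DAYS.
    """
    if min_idle_days <= 0:
        raise ValueError("min_idle_days must be positive")
    return (
        f"lastAccessed:[* TO NOW/DAY-{min_idle_days}DAYS] "
        "AND type:file AND deleted:FALSE"
    )


# Example cutoffs from the question: roughly 6 months and 1 year.
print(stale_file_query(180))
print(stale_file_query(365))
```

The resulting string can be pasted into the search box or used as the search query of a policy; the results should still be reviewed and approved before anything is deleted.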
I hope it helps you!