Cloudera Navigator - Identify unused db, tables, files, folders, etc

Champion

Hi

 

Does Cloudera Navigator have an option to identify objects that have been unused for a particular period (for example, more than 6 months or 1 year)?

 

The objects can be HDFS files, Hive/Impala tables, Oozie, datasets, etc.

 

This is my requirement: our non-prod environment is used by multiple users for different purposes such as dev and test. Sometimes they use a common user ID and user space to create databases, create or import tables, etc. After finishing a task, they move on to the next one without cleaning up the old databases, tables, and files, which become garbage after a few days.

 

This garbage has accumulated and is now taking up a lot of space (with a replication factor of 3). I want to identify the databases, tables, and files that have not been used for more than 6 months or 1 year and delete them (with proper approval).

 

Is this possible with Navigator? Are there any other options or ideas?

 

Thanks

Kumar

 

1 ACCEPTED SOLUTION

Explorer

Hello,

 

I don't know whether this problem was resolved, but since the post is still open, here is the kind of search query you may need:

 

lastAccessed:[NOW/DAY-30DAYS TO NOW/DAY+1DAY] AND type:file AND deleted:FALSE
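
As written, this range matches files that were accessed within the last 30 days. To list files that have not been accessed for a given period instead, the range would need to end at the cutoff date rather than start there. The sketch below builds such a query string in Python. It is only an illustration: `build_unused_query` is a hypothetical helper, and it assumes that Navigator search accepts an open-ended range of the form `[* TO ...]`, which is standard Solr range syntax but has not been verified in this thread.

```python
def build_unused_query(days_unused: int, entity_type: str = "file") -> str:
    """Build a search string for entities last accessed more than
    `days_unused` days ago (hypothetical helper, not a Navigator API)."""
    if days_unused <= 0:
        raise ValueError("days_unused must be a positive number of days")
    # Same date-math style as the query above, e.g. NOW/DAY-180DAYS
    cutoff = f"NOW/DAY-{days_unused}DAYS"
    # Assumption: an open lower bound (*) is accepted, as in standard Solr
    return (
        f"lastAccessed:[* TO {cutoff}] "
        f"AND type:{entity_type} AND deleted:FALSE"
    )


if __name__ == "__main__":
    print(build_unused_query(180))  # roughly 6 months
    print(build_unused_query(365))  # roughly 1 year
```

The resulting string could then be pasted into the Navigator search box and checked against a few known files before anything is deleted.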

 

I hope it helps you!

 


4 REPLIES 4

Champion

Upon further analysis, I've noticed that "Navigator policies" might help with this:

 

https://www.cloudera.com/documentation/enterprise/5-5-x/topics/navigator_policies.html

 

It seems that I need to write a search query; let me try to write one. In the meantime, it would be great if someone could share a query for the above scenario.

 

 

New Contributor

Has anyone had luck getting this query right? Please share some examples.

 

Thanks
Rahul

New Contributor

This command is not clear.