I need to be able to find directories and files at a specified path that have 'world read' set. From there, I'm hoping to generate and execute HDFS commands to unset 'world read' on the files (for governance/access compliance).
I'm thinking I'll have to use Python and the RESTful API, but I was hoping someone could provide some guidance. If there's an easier way to unset 'world read' on a large number of files, I'd greatly appreciate hearing about it.
You should be able to achieve that with HdfsFindTool; see http://www.cloudera.com/documentation/archive/search/1-3-0/Cloudera-Search-User-Guide/csug_hdfsfindt...
hadoop jar /opt/cloudera/parcels/CDH/jars/search-mr-1.0.0-cdh5.5.1.jar org.apache.solr.hadoop.HdfsFindTool -find "/" -perm -o=r
So, I'm pretty new to the Cloudera/Hadoop world...
If I wanted to find all files in /temp_example that have world read set, and unset world read on all those files - would I do something like...
hadoop jar /opt/cloudera/parcels/CDH/jars/search-mr-1.0.0-cdh5.5.1.jar org.apache.solr.hadoop.HdfsFindTool -find "/temp_example" -perm -o=r
I'm assuming that "-perm -o=r" is looking for files with world read set? If so, I'm guessing the above command would then output all files/directories with world read set - what then would I need to add to the command to unset world read?
Thanks for your assistance by the way - again, very new to this world :)
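On the "-perm -o=r" question: with find-style semantics, the leading hyphen means "at least these bits are set", so the expression should match anything whose 'other' read bit is on. The find command only lists matches; it does not change permissions, so the chmod is a separate step. Below is a minimal Python sketch of that second step, not a tested procedure for your cluster. It assumes the find output has been saved as plain text with one path per line (I have not verified the exact output format of HdfsFindTool, so check a few lines first). The function names has_world_read and chmod_commands and the batch_size parameter are made up for illustration. It only builds "hdfs dfs -chmod o-r" command strings so they can be reviewed before anything is run.

```python
import shlex
import stat
from typing import List


def has_world_read(mode: int) -> bool:
    """Return True if the 'other' read bit is set in a permission mode.

    This is the bit that a find-style "-perm -o=r" test looks at.
    """
    return bool(mode & stat.S_IROTH)


def chmod_commands(find_output: str, batch_size: int = 100) -> List[str]:
    """Build 'hdfs dfs -chmod o-r' commands from one-path-per-line text.

    Assumption: each non-empty line of find_output is a single HDFS path.
    Paths are shell-quoted, and grouped into batches so that no single
    command line gets too long.
    """
    paths = [line.strip() for line in find_output.splitlines() if line.strip()]
    commands = []
    for start in range(0, len(paths), batch_size):
        batch = paths[start:start + batch_size]
        quoted = " ".join(shlex.quote(p) for p in batch)
        commands.append(f"hdfs dfs -chmod o-r {quoted}")
    return commands
```

For example, chmod_commands("/temp_example/a.txt\n/temp_example/b\n") gives a single command, "hdfs dfs -chmod o-r /temp_example/a.txt /temp_example/b". One thing to keep in mind: removing 'other' read from a directory also stops those users from listing it, so you may want to review directories separately from files.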