Member since
07-24-2019
07-29-2021
11:08 PM
@smkmuthu, Has the reply helped resolve your issue? If so, please mark the appropriate reply as the solution, as it will make it easier for others to find the answer in the future. If you are still experiencing the issue, can you provide the information as requested?
08-06-2019
08:23 AM
You can use a script like this to create snapshots of old and new files, i.e. one listing of paths older than 3 days and one of paths newer than 3 days. Just make sure you use the correct path to the Cloudera jars. In the case of CDH 5.15:

#!/bin/bash
# Timestamp written in front of every row produced by this run
now=$(date +"%Y-%m-%dT%H:%M:%S")
# Remove the previous report files (-f: no error if none exist yet)
hdfs dfs -rm -f /data/cleanup_report/part=older3days/*
hdfs dfs -rm -f /data/cleanup_report/part=newer3days/*
# -type d limits the listing to directories; each row is prefixed with the timestamp and a label
hadoop jar /opt/cloudera/parcels/CDH/jars/search-mr-1.0.0-cdh5.15.1.jar org.apache.solr.hadoop.HdfsFindTool -find /data -type d -mtime +3 | sed "s/^/${now}\tolder3days\t/" | hadoop fs -put - /data/cleanup_report/part=older3days/data.csv
hadoop jar /opt/cloudera/parcels/CDH/jars/search-mr-1.0.0-cdh5.15.1.jar org.apache.solr.hadoop.HdfsFindTool -find /data -type d -mtime -3 | sed "s/^/${now}\tnewer3days\t/" | hadoop fs -put - /data/cleanup_report/part=newer3days/data.csv

Then create an external table with partitions on top of this HDFS folder.
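For the external table, something along these lines might work. This is only a sketch: the table name cleanup_report, the column names, the temporary file location and the RUN_HIVE switch are my own assumptions and not part of the original script. The three columns match the tab-separated rows the script writes (timestamp, label, path), and the partition column is called part because the folders are named part=older3days and part=newer3days.

```shell
#!/bin/bash
# Sketch only: table name, column names and RUN_HIVE are hypothetical.
ddl_file="${TMPDIR:-/tmp}/cleanup_report.hql"

# Write the DDL to a file; the quoted 'EOF' keeps '\t' literal for Hive.
cat > "$ddl_file" <<'EOF'
CREATE EXTERNAL TABLE IF NOT EXISTS cleanup_report (
  report_time STRING,  -- timestamp prefix added by sed
  age_label   STRING,  -- older3days or newer3days
  path        STRING   -- path printed by HdfsFindTool
)
PARTITIONED BY (part STRING)
ROW FORMAT DELIMITED FIELDS TERMINATED BY '\t'
STORED AS TEXTFILE
LOCATION '/data/cleanup_report';

-- Register the existing part=... folders as partitions
MSCK REPAIR TABLE cleanup_report;
EOF

# Only talk to Hive when explicitly asked to (hypothetical switch).
if [ "${RUN_HIVE:-0}" = "1" ]; then
  hive -f "$ddl_file"
else
  echo "DDL written to $ddl_file"
fi
```

The timestamp is kept as STRING because the ISO format with the T separator is not what a Hive TIMESTAMP column expects by default. Run it with RUN_HIVE=1 on a node that has the Hive client, or pass the generated file to beeline instead.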