
Is there an API to query files changed after a specified time?


Hello all,

 
I want to list all the files in HDFS that were changed after a specified time. Is there an API or some other way in Hadoop to query for those files?
 
Thanks,
Jack Chen

Re: Is there an API to query files changed after a specified time?


Jack,

 

As far as I know, HDFS does not offer this functionality yet. There is a JIRA, HADOOP-8989, about adding this feature to the command-line tools, but it has not been resolved. The "hadoop dfs" command in the latest CDH versions provides a "-stat" switch that reports the last modified time (mtime) of any file or directory in HDFS, so you could write a script around that. I also recall some third-party tools on GitHub that provide a "find"-style capability for HDFS, but I can't remember any specific names right now.
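
To give an idea of what such a script might look like, here is a rough Python sketch. It is only an illustration, not a tested tool. Instead of calling "-stat" once per file, it parses the output of a recursive listing ("hadoop fs -ls -R"), which already includes each entry's modification date and time. It assumes the usual eight-column listing layout (permissions, replication, owner, group, size, date, time, path) and that your version accepts "-ls -R" (older releases use "-lsr"). The function names are made up for this example.

```python
import subprocess
from datetime import datetime


def parse_ls_line(line):
    """Return (mtime, path) for a file entry of `hadoop fs -ls -R`, else None.

    Assumed layout: perms, replication, owner, group, size, date, time, path.
    """
    parts = line.strip().split(None, 7)
    # Skip summary lines such as "Found 3 items" and directory entries.
    if len(parts) < 8 or parts[0].startswith("d"):
        return None
    try:
        mtime = datetime.strptime(parts[5] + " " + parts[6], "%Y-%m-%d %H:%M")
    except ValueError:
        return None
    return mtime, parts[7]


def changed_since(ls_output, cutoff):
    """Return paths of files whose listed mtime is later than `cutoff`."""
    paths = []
    for line in ls_output.splitlines():
        parsed = parse_ls_line(line)
        if parsed is not None and parsed[0] > cutoff:
            paths.append(parsed[1])
    return paths


def list_changed(root, cutoff):
    """Run a recursive listing under `root` and filter it by `cutoff`.

    Requires the `hadoop` command on PATH; use "-lsr" on older releases.
    """
    output = subprocess.check_output(
        ["hadoop", "fs", "-ls", "-R", root], universal_newlines=True
    )
    return changed_since(output, cutoff)
```

For example, list_changed("/user/jack", datetime(2013, 5, 1)) would return the files under that (hypothetical) directory modified after midnight on 1 May 2013. Note that the listing only shows times to the minute, so the comparison is no finer than that, and a full recursive listing can be slow on a large namespace.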