How to check whether a file was updated in HDFS using Java?

New Contributor

I need to create a process that checks whether files in a specified HDFS directory have been updated. An update should trigger an action, and the process should run indefinitely.

I tried using getAccessTime and getModificationTime, but without success. Is there some prerequisite for using those methods, or is there another viable approach that does not require keeping a hash of each file?
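For reference, a polling approach based on modification times can be sketched as below. This is only a minimal illustration, not a confirmed fix for the problem above. ModificationTracker is a hypothetical helper (not part of Hadoop) that compares two snapshots of path -> modification time. With Hadoop, such a snapshot could be built from FileSystem.listStatus(new Path(dir)), using FileStatus.getPath().toString() and FileStatus.getModificationTime() for each entry. Note that getAccessTime is usually a poor signal, because the NameNode only records access times with the precision set by dfs.namenode.accesstime.precision (one hour by default).

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

/**
 * Hypothetical helper (not part of Hadoop): remembers the modification time
 * of each path seen in the previous poll and reports what changed.
 */
class ModificationTracker {
    // path -> modification time (ms since epoch) from the previous poll
    private final Map<String, Long> lastSeen = new HashMap<>();

    /**
     * Takes the current snapshot and returns the paths that are new or whose
     * modification time differs from the previous call. On the first call
     * every path is reported, since nothing has been seen yet.
     */
    public List<String> update(Map<String, Long> snapshot) {
        List<String> changed = new ArrayList<>();
        for (Map.Entry<String, Long> entry : snapshot.entrySet()) {
            Long previous = lastSeen.get(entry.getKey());
            if (previous == null || !previous.equals(entry.getValue())) {
                changed.add(entry.getKey());
            }
        }
        // Replace the remembered state so deleted files are forgotten too.
        lastSeen.clear();
        lastSeen.putAll(snapshot);
        return changed;
    }
}
```

A scheduler would call update(...) at a fixed interval with a fresh snapshot and trigger the action for every returned path. Only timestamps are kept per file, so no content hashes are needed.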


Re: How to check whether a file was updated in HDFS using Java?

Super Mentor

@lucasvenez 

I have never personally used the "org.apache.hadoop.hdfs.inotify.Event" API, but it looks like it can fulfil your requirement. According to the documentation, it can notify a client when a file is created, appended to, closed, renamed, and so on:

/**
 * Events sent by the inotify system. Note that no events are necessarily sent
 * when a file is opened for read (although a MetadataUpdateEvent will be sent
 * if the atime is updated).
 */
@InterfaceAudience.Public
@InterfaceStability.Unstable
public abstract class Event {
  public static enum EventType {
    CREATE, CLOSE, APPEND, RENAME, METADATA, UNLINK
  }
  // ... (remainder of the class omitted in this excerpt)
}

https://github.com/apache/hadoop/blob/release-2.7.3-RC1/hadoop-hdfs-project/hadoop-hdfs/src/main/jav...

 


Example:

https://gist.github.com/sadikovi/5f71e022e062150ecb9077a0442d16a2
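In case it helps, here is a rough sketch of how the inotify stream could be consumed from Java. It is untested against a live cluster and rests on several assumptions: Hadoop 2.7 or later (where take() returns an EventBatch), a client running as an HDFS superuser (the inotify stream is, as far as I know, restricted to superusers), and placeholder values for the NameNode URI and the watched directory. The class name HdfsDirectoryWatcher is made up for this example.

```java
import java.net.URI;

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.hdfs.DFSInotifyEventInputStream;
import org.apache.hadoop.hdfs.client.HdfsAdmin;
import org.apache.hadoop.hdfs.inotify.Event;
import org.apache.hadoop.hdfs.inotify.EventBatch;

public class HdfsDirectoryWatcher {
    public static void main(String[] args) throws Exception {
        // Assumptions: placeholder NameNode URI and directory; adjust to your cluster.
        URI nameNode = URI.create("hdfs://namenode.example.com:8020");
        String watchedDir = "/data/incoming/";

        HdfsAdmin admin = new HdfsAdmin(nameNode, new Configuration());
        DFSInotifyEventInputStream stream = admin.getInotifyEventStream();

        while (true) {
            // Blocks until the NameNode has new events to deliver.
            EventBatch batch = stream.take();
            for (Event event : batch.getEvents()) {
                String path = null;
                switch (event.getEventType()) {
                    case CREATE:
                        path = ((Event.CreateEvent) event).getPath();
                        break;
                    case APPEND:
                        path = ((Event.AppendEvent) event).getPath();
                        break;
                    case CLOSE:
                        path = ((Event.CloseEvent) event).getPath();
                        break;
                    case RENAME:
                        path = ((Event.RenameEvent) event).getDstPath();
                        break;
                    default:
                        break; // METADATA and UNLINK are ignored in this sketch
                }
                // The stream covers the whole namespace, so filter by prefix.
                if (path != null && path.startsWith(watchedDir)) {
                    System.out.println(event.getEventType() + " " + path);
                }
            }
        }
    }
}
```

The println call is where the triggered action would go. Reacting to CLOSE is probably the safest signal that a writer has finished updating a file.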
