
Copy latest files from HDFS to local file system


I would like to know how I can copy only the files modified in the last 15 minutes from HDFS to the local file system.

For the moment, I have the following command, which copies files from HDFS to the local file system without any check on the creation/modification date:


 $HADOOP_FS fs -copyToLocal hdfs://node:8020/sourcePath/ /destinationPath/

I have also tried to use the HDFS find tool with -exec, but it throws a NullPointerException:

hadoop jar $HADOOP_FIND_JAR org.apache.solr.hadoop.HdfsFindTool -find $INPUT_DATASET -type f -name $INPUT_FILENAMES -mmin -15 -exec $HADOOP_FS fs -copyToLocal {} /destinationPath/ \;
-find: Fatal internal error
java.lang.NullPointerException
        at org.apache.hadoop.fs.shell.find.Exec.initialise(Exec.java:109)
        at org.apache.hadoop.fs.shell.find.BaseExpression.initialise(BaseExpression.java:64)
        at org.apache.hadoop.fs.shell.Find.processArguments(Find.java:383)
        at org.apache.hadoop.fs.shell.Command.processRawArguments(Command.java:190)
        at org.apache.hadoop.fs.shell.Command.run(Command.java:154)
        at org.apache.hadoop.fs.FsShell.run(FsShell.java:255)
        at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:70)
        at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:84)
        at org.apache.solr.hadoop.HdfsFindTool.main(HdfsFindTool.java:43)
        at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
        at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
        at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
        at java.lang.reflect.Method.invoke(Method.java:606)
        at org.apache.hadoop.util.RunJar.main(RunJar.java:212)

I suppose the problem is with -exec, since the command works fine without that option.


Any help is very much appreciated!

Thank you!


Re: Copy latest files from HDFS to local file system


What arguments are you passing after -exec? Judging by the stack trace, it seems to be receiving nothing.

Also, I don't think you can run a Linux command there, only an HDFS command, and I'm not aware of any HDFS command that filters by time.
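
One possible workaround that avoids -exec entirely is to do the time filtering on the client side, by parsing the listing from `hadoop fs -ls -R`. The following is only a rough sketch, not a tested solution for your cluster. It assumes GNU `date`, the usual 8-column `-ls` layout (permissions, replication, owner, group, size, date, time, path), and that the listing timestamps are in the same time zone as the local shell. The function names `filter_recent` and `copy_recent` are made up for illustration, and plain `hadoop` stands in for your `$HADOOP_FS` variable.

```shell
#!/usr/bin/env bash
# Sketch only: assumes GNU date and the standard `hadoop fs -ls` column layout:
#   perms  replication  owner  group  size  yyyy-MM-dd  HH:mm  path

# Hypothetical helper. Reads `-ls` output on stdin and prints the paths of
# regular files whose modification time is at or after the cutoff
# (epoch seconds, passed as $1).
filter_recent() {
  local cutoff=$1 perms repl owner group size day time path ts
  while read -r perms repl owner group size day time path; do
    # Keep regular files only; this also skips the "Found N items" header,
    # directories (perms start with "d"), and blank lines.
    case $perms in -*) ;; *) continue ;; esac
    ts=$(date -d "$day $time" +%s 2>/dev/null) || continue
    if [ "$ts" -ge "$cutoff" ]; then
      printf '%s\n' "$path"
    fi
  done
}

# Hypothetical wrapper. Copies files under HDFS path $1 that were modified
# in the last $3 minutes (default 15) into local directory $2.
copy_recent() {
  local src=$1 dest=$2 minutes=${3:-15} cutoff
  cutoff=$(( $(date +%s) - minutes * 60 ))
  # The listing only has minute resolution, so round the cutoff down to the
  # minute; this may include files up to one minute older than requested.
  cutoff=$(( cutoff - cutoff % 60 ))
  hadoop fs -ls -R "$src" | filter_recent "$cutoff" |
    while IFS= read -r path; do
      hadoop fs -copyToLocal "$path" "$dest"
    done
}

# Example call (paths taken from the question):
#   copy_recent hdfs://node:8020/sourcePath/ /destinationPath/ 15
```

Note that this copies every matching file flat into the destination directory, and `-copyToLocal` will fail for a file that already exists there. Since you say the find command works without -exec, piping its output into the same `while read` loop might be another option, assuming it prints one path per line.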