
Copy files within HDFS based on the modified time or access time


Expert Contributor

I have to write a script that moves CSV files from one location in HDFS to a staging location in HDFS, based on date. For now I have to move the files from April 2nd, 2016. Later I have to schedule the script so that new files are picked up every hour and moved to the staging location. Hive tables are created on top of this staging location.

Accepted Solutions
Re: copy files within hdfs based on the modified time or access time

Cloudera Employee

1) To move the files dated 2 April 2016 into another folder in HDFS:

for f in $(hdfs dfs -ls /old_data/dataset/ | awk '$6 == "2016-04-02" {print $8}'); do echo "${f}"; hdfs dfs -mv "${f}" /old_data/dataset/TEST/; done

2) Once the above works, you can set up a crontab entry to run it on a schedule (every hour in your case).

Please try this out on a test folder in a non-production environment first.
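If "from April 2nd" means that date and everything after it, the one-liner above could be generalized roughly as follows. This is only a sketch: the function names select_since and move_since and the three positional arguments are my own illustrative choices, and it assumes the usual `hdfs dfs -ls` layout (date in column 6, time in column 7, path in column 8) and paths without spaces.

```shell
#!/usr/bin/env bash
# Sketch: move files modified at or after a cutoff into a staging directory.
# Function names and argument order are illustrative assumptions.

# Read `hdfs dfs -ls` output on stdin and print the paths of regular files
# whose modification timestamp ("YYYY-MM-DD HH:MM", columns 6 and 7) is at or
# after the cutoff. Directories and the "Found N items" line are skipped
# because their first column does not start with "-".
select_since() {
  local cutoff="$1"   # e.g. "2016-04-02 00:00"
  awk -v cutoff="$cutoff" '
    $1 ~ /^-/ && ($6 " " $7) >= cutoff { print $8 }
  '
}

# List the source directory, filter by cutoff, and move each match.
move_since() {
  local src="$1" dest="$2" cutoff="$3" path
  hdfs dfs -ls "$src" | select_since "$cutoff" | while IFS= read -r path; do
    echo "moving ${path}"
    hdfs dfs -mv "$path" "$dest"
  done
}

# Only act when called with: <source dir> <staging dir> <cutoff>
if [ "$#" -eq 3 ]; then
  move_since "$1" "$2" "$3"
fi
```

Saved under a hypothetical name such as move_since.sh, it could be called as: bash move_since.sh /old_data/dataset /old_data/dataset/TEST "2016-04-02 00:00". Because moved files leave the source folder, the same call with a fixed cutoff can be repeated from an hourly crontab entry (for example 0 * * * * followed by the command) and will only pick up files that arrived since the previous run.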
