Created 10-25-2017 12:07 PM
Hello,
I have 30 thousand files to move to another HDFS directory.
Is there a faster way to do this than "hdfs dfs -mv /mydirectory/* /targetdirectory"?
Average file size: 10 KB.
I can't merge the files into a bigger one beforehand.
Thanks for your feedback
Created 10-25-2017 12:12 PM
dfs -mv is the fastest option compared to -cp or distcp, because it only renames entries in the NameNode metadata and does not copy any data.
If possible, move /mydirectory itself into /targetdirectory instead of /mydirectory/*.
Created 10-25-2017 12:22 PM
Thanks
That's not possible, because the result is /targetdirectory/mydirectory, and I need all the files to end up directly under /targetdirectory/.
Created 10-25-2017 12:29 PM
If there are fewer files in /targetdirectory than in /mydirectory, you can do the following:
hdfs dfs -mv /targetdirectory /x
hdfs dfs -mv /mydirectory /targetdirectory
hdfs dfs -mv /x/* /targetdirectory
Thanks,
Aditya
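To check the condition above (which directory holds fewer files), something like the following could be used. It is only a sketch: the helper names file_count and smaller_dir are made up for illustration, and it assumes the usual "hdfs dfs -count" output of DIR_COUNT, FILE_COUNT, CONTENT_SIZE, PATHNAME.

```shell
#!/usr/bin/env bash
# Hypothetical helpers (not part of HDFS) to compare file counts.

# Print the number of files under an HDFS path.
# Assumes "hdfs dfs -count" prints: DIR_COUNT FILE_COUNT CONTENT_SIZE PATHNAME
file_count() {
  hdfs dfs -count "$1" | awk '{print $2}'
}

# Print whichever of the two paths contains fewer files
# (the second one if the counts are equal).
smaller_dir() {
  if [ "$(file_count "$1")" -lt "$(file_count "$2")" ]; then
    echo "$1"
  else
    echo "$2"
  fi
}
```

For example, "smaller_dir /targetdirectory /mydirectory" prints the directory that is cheaper to move file by file.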
Created 10-25-2017 01:04 PM
Thanks, but it doesn't work, for the same reason.
When you run "mv /mydirectory /targetdirectory", the result is always /targetdirectory/mydirectory.
Created 10-25-2017 01:17 PM
After running the first command, /targetdirectory will have been renamed to /x.
So "mv /mydirectory /targetdirectory" does not produce /targetdirectory/mydirectory. Since the destination directory no longer exists, it simply renames /mydirectory to /targetdirectory.
So if /targetdirectory has fewer files, this is an option. Instead of moving 30k files one by one, you only move the smaller set of files.
Thanks,
Aditya
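The three steps explained above could be wrapped up roughly like this. It is a sketch under some assumptions: the function name swap_and_merge is made up, the temporary path (/x here) must not exist beforehand, the old target must not be empty, and no file name may occur in both directories.

```shell
#!/usr/bin/env bash
# Hypothetical wrapper around the rename-and-swap steps from this thread.
# Usage: swap_and_merge <source dir> <target dir> <temporary path>
swap_and_merge() {
  local src="$1" dst="$2" tmp="$3"

  # 1. Move the (smaller) target directory out of the way.
  hdfs dfs -mv "$dst" "$tmp" || return 1

  # 2. The target no longer exists, so this renames src to dst
  #    instead of nesting it as dst/<src name>.
  hdfs dfs -mv "$src" "$dst" || return 1

  # 3. Move the old target files back in. The glob is quoted so that
  #    HDFS expands it, not the local shell.
  hdfs dfs -mv "$tmp/*" "$dst" || return 1

  # 4. Remove the now-empty temporary directory.
  hdfs dfs -rmdir "$tmp"
}
```

It would be called as "swap_and_merge /mydirectory /targetdirectory /x". Each step stops the sequence on failure, so a failed rename does not lead to a later move into the wrong place.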
Created 10-25-2017 01:54 PM
If you have more than 10 GB, I'd recommend using distcp instead of copy or move.
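A distcp call for this case might look like the sketch below. The function name copy_contents is made up for illustration. Note that distcp copies the data as a MapReduce job rather than renaming it, so the source would still have to be deleted separately afterwards, and an earlier reply in this thread points out that -mv is faster than distcp.

```shell
#!/usr/bin/env bash
# Hypothetical wrapper: copy the contents of one HDFS directory into another.
# Usage: copy_contents <source dir> <target dir>
copy_contents() {
  local src="$1" dst="$2"
  # With -update, DistCp copies the contents of src into dst
  # (not src itself as a subdirectory) and skips files that are
  # already present and unchanged in dst.
  hadoop distcp -update "$src" "$dst"
}
```

It would be called as "copy_contents /mydirectory /targetdirectory".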