06-19-2018 08:57 PM
Thanks, Pardeep. To make it up to roughly 500x faster, pass 500 files per call to the hadoop command instead of one, since every call starts a new JVM. By changing the second line above, we can do this instead:
$ hdfs fsck / | grep 'Under replicated' | awk -F':' '{print $1}' >> /tmp/under_replicated_files
# Now using xargs -n 500 (or --max-args 500)
$ cat /tmp/under_replicated_files | xargs -n 500 hdfs dfs -setrep 1
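If you want to see how xargs will group the files before running it for real, a dry run along these lines may help. This is only a sketch: batch_preview is a hypothetical helper name, the /data/... paths are made up, and a batch size of 2 is used just to keep the output short. Putting echo in front of the command prints each batch instead of calling HDFS.

```shell
# Hypothetical helper: preview the batches xargs would build.
# $1 = batch size; paths are read from stdin, one per line.
# "echo" makes this a dry run, so HDFS is never touched.
batch_preview() {
  xargs -n "$1" echo hdfs dfs -setrep 1
}

# Made-up example paths; prints one line per batch of at most 2 paths.
printf '%s\n' /data/a /data/b /data/c /data/d /data/e | batch_preview 2
```

Once the batches look right, drop the echo and use -n 500. Note that xargs splits on whitespace by default, so paths containing spaces would need extra handling.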