I have a Hadoop on-premise cluster located on AWS.
This cluster holds about 50 TB of exported HBase snapshots on HDFS.
I would like to copy the entire HDFS content to S3 every day.
I tried copying to S3 with DistCp, but there are a lot of configuration options to tune and the performance is not very good.
Is it possible to upload such an amount of data to S3?
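For scale, here is a rough back-of-the-envelope sketch of the sustained throughput this would need. It assumes the full 50 TB is re-copied every day, that the whole 24 hours are available for the copy, and decimal units (1 TB = 1,000,000 MB). The function name is only illustrative.

```python
# Rough estimate of the sustained rate needed to move a data set within a time window.
# Assumptions: the full data set is copied each time, decimal units (1 TB = 1_000_000 MB).
def required_throughput_mb_per_s(size_tb: float, window_hours: float) -> float:
    """Return the sustained transfer rate in MB/s."""
    size_mb = size_tb * 1_000_000
    return size_mb / (window_hours * 3600)


rate = required_throughput_mb_per_s(50, 24)
# Convert MB/s to Gbit/s: 8 bits per byte, 1000 Mbit per Gbit.
print(f"{rate:.0f} MB/s, about {rate * 8 / 1000:.1f} Gbit/s")
# prints "579 MB/s, about 4.6 Gbit/s"
```

So a full daily copy needs roughly 579 MB/s sustained across the whole cluster. If only new or changed files had to be copied each day, the required rate would be lower.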