Created 06-25-2015 07:48 AM
Hello,
I am trying to set up HDFS backups to S3 buckets on Cloudera. I searched the internet high and low for the syntax to do this via the CLI and had no luck!
After many hours I have figured out how to get this to work.
If you have your AWS credentials defined in the properties file, use:
sudo -u hdfs hdfs dfs -cp hdfs://nameservice/* s3n://BUCKET-NAME/
If the credentials are not defined there, put them in the URI:
sudo -u hdfs hdfs dfs -cp hdfs://nameservice/* s3n://ACCESS-KEY:SECRET-KEY@BUCKET-NAME/
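A third option that may be worth trying is to pass the keys as Hadoop properties with -D instead of embedding them in the URI. The sketch below assumes the standard s3n property names (fs.s3n.awsAccessKeyId and fs.s3n.awsSecretAccessKey). The function name backup_to_s3n, the RUN dry-run switch, and the two environment variables are made up for illustration. Note that keys passed this way are still visible in the process list.

```shell
#!/bin/sh
# Sketch only: backup_to_s3n, RUN, AWS_ACCESS_KEY_ID and AWS_SECRET_ACCESS_KEY
# are illustrative names, not part of Hadoop.
# Set RUN=echo to print the command instead of executing it.
RUN=${RUN:-}

backup_to_s3n() {
    # $1 = source HDFS URI (e.g. hdfs://nameservice/data)
    # $2 = destination bucket name
    # The -D generic options must come before the -cp subcommand.
    $RUN sudo -u hdfs hdfs dfs \
        -Dfs.s3n.awsAccessKeyId="$AWS_ACCESS_KEY_ID" \
        -Dfs.s3n.awsSecretAccessKey="$AWS_SECRET_ACCESS_KEY" \
        -cp "$1" "s3n://$2/"
}
```

With the two key variables exported, calling backup_to_s3n 'hdfs://nameservice/*' BUCKET-NAME should behave like the commands above. Running it once with RUN=echo shows the exact command line first.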
I hope that saves many users hours of guessing and configuration.
OK, so my issue is this:
What is the syntax for s3 versus s3n? s3n works flawlessly, but it limits files to less than 5 GB in size!
Since I am backing up HDFS data that is larger than 5 GB, I will need to use s3://.
What am I missing? Changing the above commands from s3n:// to s3:// gives me the following error:
cp: `s3://BUCKET-NAME/': No such file or directory
Created 06-26-2015 11:20 AM
You have to use DistCp. There's an example or two here:
http://lintool.github.io/Cloud9/docs/content/start-S3.html#step1
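For anyone who just wants the general shape of the command, a DistCp copy from HDFS to a bucket looks roughly like the sketch below. It is not taken from the linked page. It assumes the S3 credentials are already defined in the cluster configuration. The function name distcp_to_s3n, the RUN dry-run switch, and the paths are placeholders. Whether this gets around the 5 GB limit depends on the filesystem scheme and Hadoop version in use, so test with one large file first.

```shell
#!/bin/sh
# Sketch only: distcp_to_s3n and RUN are illustrative names.
# Set RUN=echo to print the command instead of executing it.
RUN=${RUN:-}

distcp_to_s3n() {
    # $1 = source HDFS URI (e.g. hdfs://nameservice/data)
    # $2 = destination bucket name
    # $3 = destination path inside the bucket
    # DistCp runs as a MapReduce job, so launch it as a user that can
    # read the source paths and is allowed to submit jobs.
    $RUN hadoop distcp "$1" "s3n://$2/$3"
}
```

For example, distcp_to_s3n hdfs://nameservice/data BUCKET-NAME backup/data copies the directory tree in parallel instead of through a single client process.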