
Cloudera Express - HDFS - Using CLI to push backups to S3 Bucket


Hello,

 

I am trying to set up HDFS backups to S3 buckets on Cloudera. I have searched the internet high and low for the CLI syntax to do this and have had no luck!

After many hours I have figured out how to get this to work.

 

If you have your AWS credentials defined in the properties file, use:

sudo -u hdfs hdfs dfs -cp hdfs://nameservice/* s3n://@BUCKET-NAME/

 

If the credentials are not defined there:

sudo -u hdfs hdfs dfs -cp hdfs://nameservice/* s3n://SECRET-KEY:PRIVATE-KEY@s3://s3-us-BUCKET-NAME/
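
A third option, sketched here as an assumption rather than something tested in this thread, is to pass the s3n credentials as generic `-D` options instead of embedding them in the URI. `fs.s3n.awsAccessKeyId` and `fs.s3n.awsSecretAccessKey` are the standard Hadoop property names for s3n; the key values and bucket name below are hypothetical placeholders. The function only prints the command so it can be reviewed before running.

```shell
#!/bin/sh
# Hypothetical placeholders: substitute your own keys and bucket name.
ACCESS_KEY="YOUR-ACCESS-KEY"
SECRET_KEY="YOUR-SECRET-KEY"
BUCKET="BUCKET-NAME"

# Print (do not run) an "hdfs dfs -ls" command that supplies the s3n
# credentials through -D properties rather than inside the s3n:// URI.
s3n_ls_cmd() {
    echo "hdfs dfs -Dfs.s3n.awsAccessKeyId=$1 -Dfs.s3n.awsSecretAccessKey=$2 -ls s3n://$3/"
}

s3n_ls_cmd "$ACCESS_KEY" "$SECRET_KEY" "$BUCKET"
```

Listing the bucket first is a cheap way to check that the credentials work before starting a large copy. Keeping the keys out of the URI may also help when the secret key contains characters such as `/` that are awkward inside a URI.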

 

I hope that saves other users hours of guessing and configuration.

 

OK, so my issue is:

What is the syntax for s3 vs. s3n? s3n works flawlessly, but it limits files to less than 5 GB in size!

Since the HDFS data I am backing up is larger than 5 GB, I will need to use s3://.

 

 

What am I missing? Changing the above commands from s3n:// to s3:// gives me the following error:

 

cp: `s3://BUCKET-NAME/': No such file or directory


Re: Cloudera Express - HDFS - Using CLI to push backups to S3 Bucket


You have to use DistCp. There's an example or two here:

 

http://lintool.github.io/Cloud9/docs/content/start-S3.html#step1
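
For reference, a minimal sketch of what a DistCp invocation for this backup might look like. It reuses the `hdfs://nameservice/` source and `s3n://BUCKET-NAME/` destination from the question as placeholders; the function name `distcp_cmd` is made up for illustration, and it only prints the command so it can be checked first. Whether this gets around the 5 GB limit depends on the S3 connector and the Hadoop version in your CDH release, which this thread does not confirm. On Hadoop releases that ship the `s3a` connector, the same command with an `s3a://` destination may be worth trying.

```shell
#!/bin/sh
# Sketch only: SRC and DST are placeholders taken from the question.
SRC="hdfs://nameservice/"
DST="s3n://BUCKET-NAME/"

# Print the DistCp command for review instead of running it directly.
# DistCp performs the copy as a MapReduce job rather than in a single
# client process, which is why it is usually preferred for bulk copies.
distcp_cmd() {
    echo "sudo -u hdfs hadoop distcp $1 $2"
}

distcp_cmd "$SRC" "$DST"
# To actually run it: eval "$(distcp_cmd "$SRC" "$DST")"
```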
