Falcon Mirroring: HDFS To S3 AWS Credentials

Guru

Where do you define your AWS access key and secret key credentials for mirroring data from a local Falcon cluster to S3? I have the job set up, but it is failing because those values are not defined.

1 ACCEPTED SOLUTION

Guru

It seems the issue is specific to the Falcon UI. The Falcon UI attempts to validate the S3 URI and enforces that it ends with amazonaws.com; however, that is not the format expected by Jets3tNativeFileSystemStore, which distcp ultimately invokes. The format needs to be s3n://BUCKET/PATH. This caused authentication to fail even with the proper credentials in place, since the wrong endpoint was being hit.

A workaround is to download the XML from the Falcon UI, edit the s3n URI manually, and then re-upload the XML file through the Falcon UI.

(screenshot attached)
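For illustration only, a sketch of the kind of edit involved; the element name and bucket/path values below are placeholders, and the exact structure of the downloaded entity XML depends on how the mirror was defined:

<!-- Before: endpoint-style URI accepted by the Falcon UI, rejected by Jets3tNativeFileSystemStore -->
<location type="data" path="s3n://my-bucket.s3.amazonaws.com/backups/falcon"/>

<!-- After: plain bucket/path form that distcp's s3n handler expects -->
<location type="data" path="s3n://my-bucket/backups/falcon"/>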


REPLIES

Cloudera Employee

Did you add the following properties to your core-site.xml?

fs.s3n.awsAccessKeyId

fs.s3n.awsSecretAccessKey
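For reference, a minimal sketch of what those entries look like in core-site.xml; the values shown are placeholders for your own credentials:

<property>
  <name>fs.s3n.awsAccessKeyId</name>
  <value>YOUR_AWS_ACCESS_KEY_ID</value>
</property>
<property>
  <name>fs.s3n.awsSecretAccessKey</name>
  <value>YOUR_AWS_SECRET_ACCESS_KEY</value>
</property>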


Expert Contributor

The issue raised by @Jeremy Dyer will be fixed by the Falcon team in the Dal-M20 release.

Explorer

Hi Team, I have tried the above, and I see the job status KILLED after running the workflow. After launching Oozie, I can see the workflow changing status from RUNNING to KILLED. Is there a way to troubleshoot? I can run hadoop fs -ls commands on my S3 bucket, so I definitely have access. I suspect it's the S3 URL. I tried downloading the XML, changing the URL, and uploading it again, with no luck. Any other suggestions? I appreciate all your help/support in advance. Regards,

Anil