05-03-2018
09:20 PM
You may not have all the permissions you need on the bucket. Make sure you have "s3:ListBucket" on resource "arn:aws:s3:::<bucket-name>", as well as "s3:PutObject", "s3:GetObject", "s3:DeleteObject", and "s3:PutObjectAcl" on resource "arn:aws:s3:::<bucket-name>/*". You will probably also want the permissions for creating and aborting multipart uploads, so I'd add privileges for the operations mentioned in the multipart upload docs as well.
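Putting those statements together, a minimal IAM policy might look something like the sketch below. The bucket name is a placeholder, and the two multipart actions ("s3:AbortMultipartUpload", "s3:ListMultipartUploadParts") are my suggestion for covering multipart uploads — check the S3 multipart upload docs for your exact needs:

```json
{
  "Version": "2012-10-17",
  "Statement": [
    {
      "Effect": "Allow",
      "Action": ["s3:ListBucket"],
      "Resource": "arn:aws:s3:::my-bucket"
    },
    {
      "Effect": "Allow",
      "Action": [
        "s3:PutObject",
        "s3:GetObject",
        "s3:DeleteObject",
        "s3:PutObjectAcl",
        "s3:AbortMultipartUpload",
        "s3:ListMultipartUploadParts"
      ],
      "Resource": "arn:aws:s3:::my-bucket/*"
    }
  ]
}
```

Note that "s3:ListBucket" applies to the bucket ARN itself, while the object-level actions apply to the "/*" resource; mixing those up is a common cause of AccessDenied errors.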
02-07-2018
03:59 PM
DistCp can take some time to complete depending on the size of your source data. One thing to try is listing a public bucket. I believe that if you have no credentials set you'll see an error, but with any valid credentials you should be able to list it:

hadoop fs -ls s3a://landsat-pds/

Also make sure you've deployed your client configurations in Cloudera Manager (CM).
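As a sketch, the whole check might run like this from a cluster gateway node. These commands need a working S3A connector and (for the copy) valid credentials; the HDFS path and destination bucket are placeholders:

```shell
# 1. Sanity-check S3A connectivity by listing a public bucket
#    (fails with no credentials set; any valid keys should work)
hadoop fs -ls s3a://landsat-pds/

# 2. Then try a small copy before the full DistCp job
#    (source dir and destination bucket below are placeholders)
hadoop distcp hdfs:///user/me/smalldir s3a://my-bucket/smalldir
```

Starting with a small directory keeps the feedback loop short: if the small copy works, a long-running full DistCp that seems "stuck" is more likely just still transferring data.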
02-01-2018
10:57 PM
STS should work. I would try (1) using s3a rather than s3n, and (2) building your Spark app against the same AWS SDK version that is used on the cluster.
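For (2), one way to keep the versions aligned in a Maven build is to depend on hadoop-aws and the AWS SDK with "provided" scope, pinned to whatever your cluster actually ships. The property names below are placeholders I've made up — set them to match the jar versions on your cluster (for example, under the CDH parcel's jars directory):

```xml
<!-- Placeholder versions: set these to match the jars your cluster ships -->
<dependency>
  <groupId>org.apache.hadoop</groupId>
  <artifactId>hadoop-aws</artifactId>
  <version>${cluster.hadoop.version}</version>
  <scope>provided</scope>
</dependency>
<dependency>
  <groupId>com.amazonaws</groupId>
  <artifactId>aws-java-sdk-bundle</artifactId>
  <version>${cluster.aws.sdk.version}</version>
  <scope>provided</scope>
</dependency>
```

Using "provided" scope avoids bundling a second, mismatched SDK copy into your application jar, which is a frequent source of NoSuchMethodError-style failures at runtime.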