Member since: 01-26-2016
Posts: 5
Kudos Received: 0
Solutions: 1
My Accepted Solutions
Title | Views | Posted |
---|---|---|
 | 2658 | 09-18-2017 04:11 PM |
09-18-2017 04:11 PM
[ SOLVED ] After posting this request for help, I started rethinking my Google search terms, which led me to a Hortonworks site about copying data between Hortonworks and S3 buckets. It didn't have an exact answer, but it did show the property for specifying an endpoint, which was exactly what I needed to add to my CLI command.

Here is the complete command I used to get distcp working from s3-govcloud to hdfs. The opposite direction, hdfs to s3-govcloud, also works.

    #AWS_SECRET=xxxxxxxxx
    #AWS_KEY_ID=xxxxxxxxx
    #AWS_BUCKET=xxxxxxxx <-- name of your govcloud bucket
    #hadoop distcp -D fs.s3a.bucket.$AWS_BUCKET.endpoint=s3-us-gov-west-1.amazonaws.com -D fs.s3a.awsAccessKeyId=$AWS_KEY_ID -D fs.s3a.awsSecretAccessKey=$AWS_SECRET s3a://$AWS_BUCKET/path/to/files/ /path/to/hdfs/files/

Links I used for reference:

AWS GovCloud Endpoints
http://docs.aws.amazon.com/govcloud-us/latest/UserGuide/using-govcloud-endpoints.html

Hortonworks Amazon S3 Bucket Configuration
https://hortonworks.github.io/hdp-aws/s3-copy-data/index.html
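For anyone who wants to script this, here is a minimal sketch of the same command wrapped in a shell function. It is not a tested production script: the function name `distcp_govcloud`, the `DRY_RUN` switch, and the default bucket name are my own assumptions for illustration. The property names are the ones from the post. Newer Hadoop documentation names the credential properties `fs.s3a.access.key` and `fs.s3a.secret.key`, so check which ones your distribution expects.

```shell
#!/bin/sh
# Sketch only: wraps the distcp command from the post in a function.
# distcp_govcloud and DRY_RUN are hypothetical names, not part of Hadoop.

# Placeholder values (assumptions); export real ones before running.
AWS_BUCKET="${AWS_BUCKET:-my-govcloud-bucket}"   # hypothetical bucket name
AWS_KEY_ID="${AWS_KEY_ID:-xxxxxxxxx}"
AWS_SECRET="${AWS_SECRET:-xxxxxxxxx}"
S3_ENDPOINT="s3-us-gov-west-1.amazonaws.com"     # GovCloud endpoint from the post

distcp_govcloud() {
  src=$1
  dest=$2
  # Build the argument list; note "$" (not "#") in front of AWS_BUCKET
  # inside the per-bucket endpoint property.
  set -- hadoop distcp \
    -D "fs.s3a.bucket.${AWS_BUCKET}.endpoint=${S3_ENDPOINT}" \
    -D "fs.s3a.awsAccessKeyId=${AWS_KEY_ID}" \
    -D "fs.s3a.awsSecretAccessKey=${AWS_SECRET}" \
    "$src" "$dest"
  if [ "${DRY_RUN:-1}" = "1" ]; then
    # Default: only print the command so it can be inspected first.
    printf '%s\n' "$*"
  else
    "$@"
  fi
}

# Dry run: prints the command instead of executing it.
distcp_govcloud "s3a://${AWS_BUCKET}/path/to/files/" /path/to/hdfs/files/
```

Setting `DRY_RUN=0` before calling the function runs the copy for real. Swapping the two path arguments gives the hdfs to s3-govcloud direction mentioned above. Keep in mind that the dry run prints the secret key, so avoid it on shared terminals or in logged sessions.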
09-18-2017 03:08 PM
Hello community. I was hoping someone might know of a way to point distcp at GovCloud.

Problem background. I'm having problems doing a distributed file transfer on GovCloud. Using a distcp process that I know works on my 50-node cluster on standard AWS, I've been attempting to do the same on GovCloud, but it doesn't work. I have verified that my keys are current and have the appropriate permissions. I am able to access the S3 files via an "aws s3 cp source dest" on my GovCloud systems. The failure occurs when I use distcp on those same files: it attempts to pull information on the GovCloud bucket from standard AWS. See screenshot below. In the circled part, you'll see the reference to standard AWS in the error. The descriptions of the errors are underlined. For reference, I have been successful using the natively installed s3-dist-cp on my EMR cluster in GovCloud. I'm able to access any file I need and transfer it to hdfs.

Possible solutions. Can anyone tell me of a way to give distcp the GovCloud endpoint? Alternatively, does anyone know of a method to install s3-dist-cp on a CDH cluster?
Labels:
- HDFS
04-21-2016 11:57 AM
Hi, just wanted to add that I had a similar problem with the service monitor and after moving the old directory it started. The only significant thing I have to add is that I did not see any "error" labels in the start up error log file. Thanks!!