New Contributor
Posts: 3
Registered: ‎12-04-2014

Distcp data backup to AWS S3 with AWS KMS encryption

We are backing up data from a CDH cluster to S3 with distcp, and it works fine.
However, we run into a problem when we try to use AWS KMS to encrypt the data on the AWS side.

Typically, encryption should be switched on with a command like the one below:


hadoop distcp \
-Dfs.s3a.access.key=<Access Key> \
-Dfs.s3a.secret.key=<Secret Key> \
-Dfs.s3a.server-side-encryption-algorithm=aws:kms \
-Dfs.s3a.server-side-encryption-key=<encryption-key> \
-Dcom.amazonaws.services.s3.disablePutObjectMD5Validation=true \
hdfs://<name-node>:8020/tmp/ \
s3a://<bucket-name>/temp1/

 

However, I keep getting an error related to a hash mismatch.

Has anybody had any luck with this?

New Contributor
Posts: 1
Registered: ‎02-22-2018

Re: Distcp data backup to AWS S3 with AWS KMS encryption

Hi Ankush,

Did you find a solution for this? I have a similar case: I want to migrate data from Hadoop to AWS S3 using s3-dist-cp with AWS KMS keys.

Please let me know if you have a solution.

Thanks in advance.

Krishna

Expert Contributor
Posts: 113
Registered: ‎02-15-2016

Re: Distcp data backup to AWS S3 with AWS KMS encryption

It looks like this is resolved in Hadoop 2.8.0 (not sure, though).

See: https://github.com/minio/minio/issues/2965

The only workaround I found is to first load the data without encryption and then manually enable encryption on the files copied to S3.
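
A rough sketch of the second step of that workaround with the AWS CLI, assuming the CLI is installed and has credentials that can use the KMS key. The bucket, prefix, and key ID are placeholders, and the `reencrypt_cmd` helper is a made-up name for illustration. The idea is to copy each object onto itself while asking S3 to encrypt it with SSE-KMS. By default the script only prints the command so it can be checked first; set RUN=1 to execute it.

```shell
#!/bin/sh
# Placeholders (assumptions, not values from this thread); override via environment.
BUCKET="${BUCKET:-my-bucket}"
PREFIX="${PREFIX:-temp1/}"
KMS_KEY_ID="${KMS_KEY_ID:-my-kms-key-id}"

# Hypothetical helper: prints the in-place re-encryption command.
# Arguments: bucket, prefix, KMS key ID.
reencrypt_cmd() {
  # Source and destination are the same, so each object is rewritten
  # in place with SSE-KMS requested for the new copy.
  echo "aws s3 cp s3://$1/$2 s3://$1/$2 --recursive --sse aws:kms --sse-kms-key-id $3"
}

CMD=$(reencrypt_cmd "$BUCKET" "$PREFIX" "$KMS_KEY_ID")

if [ "${RUN:-0}" = "1" ]; then
  # Real run: relies on simple values without spaces in the placeholders.
  $CMD
else
  # Dry run: only show what would be executed.
  echo "$CMD"
fi
```

Try it on a small test prefix first, since this rewrites every object under the prefix.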

 

By the way, I have a general question. SSE only protects the data while it is stored in S3. What if someone with an AWS admin role downloads the data to a local disk? At that point it is no longer encrypted.

 
