Created 06-13-2017 08:05 AM
We are currently backing up data from a CDH cluster to S3, and that works fine.
However, we now want to use AWS KMS to encrypt the data on the AWS side.
Typically this should just be a matter of switching encryption on with a command like the one below:
hadoop distcp \
-Dfs.s3a.access.key=<Access Key> \
-Dfs.s3a.secret.key=<Secret Key> \
-Dfs.s3a.server-side-encryption-algorithm=aws:kms \
-Dfs.s3a.server-side-encryption-key=<encryption-key> \
-Dcom.amazonaws.services.s3.disablePutObjectMD5Validation=true \
hdfs://<name-node>:8020/tmp/ \
s3a://<bucket-name>/temp1/
With this command, though, I keep getting an error related to a hash code mismatch.
Has anybody had any luck with this?
Created 02-22-2018 03:28 AM
Hi Ankush,
Have you found a solution for this? I have a similar case where I want to migrate data from Hadoop to AWS S3 using s3-dist-cp with AWS KMS keys.
Please let me know if you have a solution.
Thanks in advance.
Krishna
Created 03-11-2018 10:21 PM
It looks like this is resolved in Hadoop 2.8.0 (not sure, though).
Check this: https://github.com/minio/minio/issues/2965
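For anyone landing here later, here is a rough sketch of what the distcp call might look like on a release whose S3A connector supports SSE-KMS. It assumes the S3A property names `fs.s3a.server-side-encryption-algorithm` (value `SSE-KMS`) and `fs.s3a.server-side-encryption.key`, which may differ on your Hadoop/CDH version, so check the S3A documentation for your release. The helper name `build_distcp_cmd` and its arguments are made up for illustration, and it only prints the command so you can review it before running anything.

```shell
# Prints (does not run) a distcp command that asks S3A for SSE-KMS.
# Arguments (all placeholders): $1 = NameNode host, $2 = bucket name, $3 = KMS key ARN.
# Assumption: S3 credentials are already configured (core-site.xml, instance role, etc.).
build_distcp_cmd() {
  printf '%s ' \
    "hadoop distcp" \
    "-Dfs.s3a.server-side-encryption-algorithm=SSE-KMS" \
    "-Dfs.s3a.server-side-encryption.key=$3" \
    "hdfs://$1:8020/tmp/" \
    "s3a://$2/temp1/"
  printf '\n'
}

# Show the command with the same placeholders used in the original post.
build_distcp_cmd "<name-node>" "<bucket-name>" "<kms-key-arn>"
```

Copy the printed line, substitute real values, and run it by hand once it looks right for your cluster.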
The only workaround I found is to load the data without encryption first and then enable encryption (manually) on the files copied to S3.
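One way to do that second step, as a sketch only: copy the objects onto themselves with the AWS CLI and request SSE-KMS on the copy, using the `--sse aws:kms` and `--sse-kms-key-id` options of `aws s3 cp`. This assumes the AWS CLI is installed and allowed to use the key. The helper name `build_reencrypt_cmd` and its arguments are hypothetical, and it only prints the command rather than running it.

```shell
# Prints (does not run) an in-place S3 copy that rewrites objects with SSE-KMS.
# Arguments (all placeholders): $1 = bucket name, $2 = key prefix, $3 = KMS key ID or ARN.
build_reencrypt_cmd() {
  printf '%s ' \
    "aws s3 cp" \
    "s3://$1/$2" \
    "s3://$1/$2" \
    "--recursive" \
    "--sse aws:kms" \
    "--sse-kms-key-id $3"
  printf '\n'
}

# Show the command for the prefix used in the distcp example above.
build_reencrypt_cmd "<bucket-name>" "temp1/" "<kms-key-id>"
```

Setting default encryption on the bucket is another option for new uploads, but it does not re-encrypt objects that are already there.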
By the way, I have a general question. SSE only protects the data while it is stored in S3. What if someone with an AWS admin role downloads the data to a local disk? At that point it is no longer encrypted.