Encryption of data during sqoop load

Hi,

I work for a company that has a CDH 5.16 Express cluster. One of our analysts wants to take a MySQL table (CRMID, Email) and load it into HDFS. During the load we need to encrypt the email addresses with a known salt, so that the resulting Impala table has the columns CRMID and EmailHash. This is for GDPR.
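
To make the requirement concrete, here is a minimal Python sketch of the transformation we have in mind. It assumes a SHA-256 digest with the salt prepended and the address trimmed and lower-cased first; the algorithm, the normalisation, the function names and the salt value are all placeholders for illustration, and none of this is something Sqoop provides.

```python
import hashlib


def hash_email(email: str, salt: str) -> str:
    """Return the hex SHA-256 digest of salt + normalised email.

    Assumed scheme for illustration only: the real algorithm and
    normalisation rules would have to be agreed separately.
    """
    normalised = email.strip().lower()
    return hashlib.sha256((salt + normalised).encode("utf-8")).hexdigest()


def transform_row(row, salt):
    """Map a (CRMID, Email) row to a (CRMID, EmailHash) row."""
    crm_id, email = row
    return (crm_id, hash_email(email, salt))


if __name__ == "__main__":
    # Placeholder salt; in practice it would come from a secret store.
    salt = "example-salt"
    print(transform_row((1, "Tom@example.com"), salt))
```

The same salt always gives the same EmailHash for a given address, so the hashed column could still be used for joins and de-duplication.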

If we were to store the salt via the Hadoop credential provider API, would Sqoop be able to encrypt the email addresses during the import? If not, does the Hadoop ecosystem offer another way to do this?

If not, we're going to have to import the data via our SQL Server estate, which adds a dependency I'd rather avoid.

Thanks,

Tom